Re: fread multiple files
- Subject: Re: fread multiple files
- From: Tomasz Koziara <email@hidden>
- Date: Sat, 26 Sep 2009 19:06:30 +0100
Thanks again for all the replies.
In fact, one of my "optimization scenarios" was to open the N file
triplets (data, index, label) in N threads; I think I will try it. The
biggest bottleneck, though, seems to be that the XDR implementation
uses fread internally and does not appear to buffer the read data
well. All my reading and writing goes through XDR calls, hence I don't
have control over the internal handling of the file stream. I can still
see some room for improvement: an XDR stream can be attached to a
memory block, which I could first read in one go. This depends, though,
on whether the stream "positions" returned by xdr_getpos are actually
absolute byte positions in the memory block or something else.
Tomek
On 25 Sep 2009, at 22:21, Terry Lambert wrote:
You could definitely be doing your three I/Os concurrently instead
of serially, assuming the index data is not needed to drive the I/O.
If it is, you could still do your index I/O concurrently and then do
your data file I/O after that (semi-serially, in other words).
But if you arranged your code to do the first index read up front
before going into the loop, you could do your *next* index reads
concurrently with your data I/O for your current indices.
You could further increase concurrency, assuming the data is index
linked instead of data linked (i.e. you don't need each set of data
prior to the next set of data), by implementing a producer/consumer
model: issue your I/Os as fast as you can and queue the buffered data
records for processing (you'd want to rate limit the number of
outstanding I/Os to bound the data you have in hand at one time
before you ask for more).
If your data comes in more than about 1.5 times faster than you can
process it, and you have a sufficiently large pool of work item queue
elements, you could establish high and low water marks on the number
of elements in the queue, and then only go back for more data when
you hit the low water mark.
-- Terry
Darwin-dev mailing list (email@hidden)