Re: fread multiple files
- Subject: Re: fread multiple files
- From: Terry Lambert <email@hidden>
- Date: Fri, 25 Sep 2009 12:07:49 -0700
On Sep 25, 2009, at 10:27 AM, Tomasz Koziara
<email@hidden> wrote:
Hi
An MPI code I use outputs 3 * N files, where N is the number of
processors. Then I am using a serial code to post-process the
results. The post-processor is single-threaded and for every time
instant reads a chunk of data from each of 3 *N files. I noticed
that most of the time (nearly 50%) is spent in:
__spin_lock
pthread_mutex_lock
fread
pthread_mutex_unlock
flockfile
funlockfile
I am wondering whether those of you who know more about optimizing
IO could advise me on how to make this reading more efficient.
What you are seeing with fread is serialization of access to the
FILE/fd pair being read. The lock protects the FILE's stdio buffer
contents and the fd's file offset from simultaneous access, and
therefore from races. This is normal. Unless you are accessing the
same file from multiple threads simultaneously, the lock being held
is not contended, and you are misinterpreting your statistics: the
fread time has to be subtracted from the time the lock is held to
get the actual lock overhead.
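As a rough illustration of why the lock shows up next to fread in a
profile, here is a minimal sketch. It assumes the stdio implementation
takes the FILE lock once per fread call, which is what the
flockfile/funlockfile entries in the profile suggest. The function
names are hypothetical and the data layout (a run of doubles) is only
an example, not the poster's format.

```c
#include <stdio.h>
#include <string.h>

/* One fread() for the whole chunk: under the per-call locking
 * assumption, the FILE lock is taken once for `count` values. */
size_t read_doubles_bulk(FILE *fp, double *out, size_t count)
{
    return fread(out, sizeof *out, count, fp);
}

/* One fread() per value: same data, but the lock/unlock pair
 * runs once per value under the same assumption. */
size_t read_doubles_one_by_one(FILE *fp, double *out, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (fread(&out[i], sizeof out[i], 1, fp) != 1)
            break;
    }
    return i;
}

/* Hypothetical self-check: both paths must return identical data.
 * Returns 0 on success, -1 on failure. */
int demo_bulk_read(void)
{
    double in[4] = {1.0, 2.0, 3.0, 4.0};
    double a[4], b[4];
    size_t na, nb;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return -1;
    if (fwrite(in, sizeof in[0], 4, fp) != 4) {
        fclose(fp);
        return -1;
    }
    rewind(fp);
    na = read_doubles_bulk(fp, a, 4);
    rewind(fp);
    nb = read_doubles_one_by_one(fp, b, 4);
    fclose(fp);

    return (na == 4 && nb == 4 && memcmp(a, b, sizeof a) == 0) ? 0 : -1;
}
```

Both functions return the same data; the only difference is how many
times the uncontended lock is acquired and released.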
If the same file is being contended in multiple threads (the mutex
usage will normally not kick in unless your program is multithreaded
in the first place), then you need to stop using stdio and switch to
using aio_read or pread instead, and manage any block buffering,
like that normally provided by stdio, in your own code.
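A minimal sketch of the pread approach follows, assuming each reader
knows the absolute offset of the chunk it wants. pread takes the
offset as an argument and leaves the fd's file offset alone, so there
is no shared offset to protect and the caller owns the buffer. The
names read_chunk_at and demo_read_chunk are hypothetical, and the temp
file is only there to make the example self-checking.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread() and mkstemp() */
#endif
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `len` bytes starting at absolute `offset` into `buf`.
 * Loops over short reads and retries on EINTR. Returns the number
 * of bytes read (less than `len` only at end of file), or -1 on error. */
ssize_t read_chunk_at(int fd, void *buf, size_t len, off_t offset)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Hypothetical self-check: write ten bytes to a temp file, then read
 * four of them back from offset 3. Returns 0 on success, -1 on failure. */
int demo_read_chunk(void)
{
    char path[] = "/tmp/pread_demoXXXXXX";
    char buf[4];
    ssize_t n;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears when fd is closed */
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        return -1;
    }
    n = read_chunk_at(fd, buf, sizeof buf, 3);
    close(fd);

    return (n == 4 && memcmp(buf, "3456", 4) == 0) ? 0 : -1;
}
```

Any read-ahead or block buffering that stdio would normally do has to
be layered on top of this by the caller, for example by asking for a
larger chunk than is immediately needed.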
PS: It seems to me that if there isn't simultaneous access to the same
file happening, the main serialization overhead your application has
is in your post-processing code.
-- Terry