Re: Anything I should know about thread safety and fprintf?


  • Subject: Re: Anything I should know about thread safety and fprintf?
  • From: Terry Lambert <email@hidden>
  • Date: Thu, 24 Jan 2008 04:59:11 -0800



Ah.  I thought the standard way of system logging was just
fprintf(stderr, ... ).

I don't believe there is anything special about stdout and stdin. They're just FILE* that initially correspond to file descriptors 1 and 2. They can be changed with freopen(3).



OK, I guess some more clarification is in order, since the problem was "fixed", but not "really fixed"...


They (and stderr) are special in threaded programs in that threads libraries implementing stdio must serialize access to FILE *'s, the contents of which are process scope global data, not thread scoped data.

What they get connected to depends on what the parent process connects the fds to, or on whether freopen() or some other call moves them (another example would be explicitly closing and reopening the underlying fds, which are stored as dumb integers in the FILE *, which has no concept of what it is actually talking to).
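To make the "dumb integer" point concrete, here is a minimal sketch, assuming a POSIX system. The helper name retarget_stream() is made up for illustration; it swaps what a stream writes to by replacing the descriptor underneath it, and the FILE * never notices.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: point the descriptor underneath `stream` at whatever
 * `new_fd` refers to. Returns 0 on success, -1 on failure. */
int retarget_stream(FILE *stream, int new_fd)
{
    /* Push anything still buffered out to the old target first. */
    if (fflush(stream) == EOF)
        return -1;

    /* dup2() closes the old descriptor and reuses its number, so the
     * integer stored inside the FILE is unchanged; only its target moves. */
    if (dup2(new_fd, fileno(stream)) < 0)
        return -1;

    return 0;
}
```

Calling retarget_stream(stderr, fd) with an fd opened on a log file would send later fprintf(stderr, ...) output to that file. Note that this sketch takes no stdio lock, so it is only safe while no other thread is using the stream.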

If you implement your own operations on FILE *'s, then to be thread safe you also have to participate in the serialization protocol by using flockfile(3) (it even has a man page, and it's visible in the problem thread stack).
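As a rough sketch of what participating in that protocol looks like, assuming the POSIX calls flockfile(3), funlockfile(3) and putc_unlocked(3): the helper name write_tagged_line() and the "[tag] message" layout are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: emit "[tag] msg\n" as one unit, so that output from
 * other threads using the same FILE * cannot land in the middle of it. */
int write_tagged_line(FILE *fp, const char *tag, const char *msg)
{
    int rc = 0;
    const char *p;

    flockfile(fp);                      /* the same per-FILE lock stdio takes */

    if (putc_unlocked('[', fp) == EOF) rc = -1;
    for (p = tag; *p != '\0'; p++)
        if (putc_unlocked(*p, fp) == EOF) rc = -1;
    if (putc_unlocked(']', fp) == EOF) rc = -1;
    if (putc_unlocked(' ', fp) == EOF) rc = -1;
    for (p = msg; *p != '\0'; p++)
        if (putc_unlocked(*p, fp) == EOF) rc = -1;
    if (putc_unlocked('\n', fp) == EOF) rc = -1;

    funlockfile(fp);                    /* other threads may proceed now */
    return rc;
}
```

Every thread calling write_tagged_line(stderr, ...) now queues on the one lock, which is exactly the serialization point discussed in the rest of this message.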

Since stdio buffering happens in the stdio library in user space, the only way to do this is either to disable buffering (see below, though), which can have bad side effects for some underlying file types, or to hold the lock described above over any operation capable of changing buffer descriptor contents or metadata (i.e. the FILE *).

When this happens, a thread that blocks on output while holding the lock also blocks everyone else attempting to take the lock on the same FILE *; they all queue up behind it.

In normal pthreads programs, the programmer will usually avoid the use of FILE * altogether to avoid the serialization point (e.g. because the reason they were using threads in the first place was increased concurrency), OR call setbuf() to disable buffering (and hope/pray that the stdio library author took setbuf() into account, and made the microoptimization of avoiding locking in this one case: usually a vain hope), OR use stdio from a single thread (marshalling all data output to that thread to avoid contention for the FILE * lock).

The third one gets a small performance win, if the output data tends to be, on average, half or less of the default (or selected) stdio buffer size, since it avoids I/O at the cost of acquiring and releasing an uncontested lock on the buffer. The smaller the writes and the longer the time before an automatic or forced flush, the bigger the win, of course.
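For what the third option might look like, here is a minimal sketch assuming pthreads: a bounded queue drained by one writer thread, which is the only thread that ever touches the FILE *. The logq_* names, the slot count, and the message length are all made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define LOGQ_SLOTS  64      /* assumed queue depth */
#define LOGQ_MSGLEN 256     /* assumed maximum message length */

struct logq {
    pthread_mutex_t mu;
    pthread_cond_t  not_empty, not_full;
    char      msg[LOGQ_SLOTS][LOGQ_MSGLEN];
    size_t    head, count;
    int       closing;
    FILE     *out;          /* touched only by the writer thread */
    pthread_t writer;
};

static void *logq_writer(void *arg)
{
    struct logq *q = arg;
    char line[LOGQ_MSGLEN];

    for (;;) {
        pthread_mutex_lock(&q->mu);
        while (q->count == 0 && !q->closing)
            pthread_cond_wait(&q->not_empty, &q->mu);
        if (q->count == 0) {            /* closing, and the queue is drained */
            pthread_mutex_unlock(&q->mu);
            break;
        }
        memcpy(line, q->msg[q->head], LOGQ_MSGLEN);
        q->head = (q->head + 1) % LOGQ_SLOTS;
        q->count--;
        pthread_cond_signal(&q->not_full);
        pthread_mutex_unlock(&q->mu);

        fputs(line, q->out);            /* slow I/O happens outside the queue lock */
    }
    fflush(q->out);
    return NULL;
}

int logq_start(struct logq *q, FILE *out)
{
    memset(q, 0, sizeof *q);
    pthread_mutex_init(&q->mu, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
    q->out = out;
    return pthread_create(&q->writer, NULL, logq_writer, q);
}

/* Called from any thread; copies the text (truncating if too long). */
void logq_post(struct logq *q, const char *text)
{
    pthread_mutex_lock(&q->mu);
    while (q->count == LOGQ_SLOTS)      /* queue full: the producer waits */
        pthread_cond_wait(&q->not_full, &q->mu);
    snprintf(q->msg[(q->head + q->count) % LOGQ_SLOTS], LOGQ_MSGLEN, "%s", text);
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

void logq_stop(struct logq *q)
{
    pthread_mutex_lock(&q->mu);
    q->closing = 1;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
    pthread_join(q->writer, NULL);
    pthread_mutex_destroy(&q->mu);
    pthread_cond_destroy(&q->not_empty);
    pthread_cond_destroy(&q->not_full);
}
```

Producers only hold the queue mutex long enough to copy a message, never across the actual I/O. The design choice to note is that logq_post() still waits once the queue fills, so a sink that cannot keep up will eventually stall the producers anyway; the queue only buys slack.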

In general, it's more or less impossible to usefully use the same FILE * from multiple threads in a single process, if the data sink is unable to keep up with the output data rate, or is of a type which can have I/O stalls at all. Even if you don't stall (e.g. you are writing to a local disk drive that is spun up and responding to all I/O requests), the serialization overhead can result in up to a 2X slowdown. This is due to holding the FILE * lock over both the copy from the data source (and any formatting processing needed), and the flush of the FILE * buffer contents to the real underlying fd (copying from user to kernel space).

Note that ASL and syslog both have this same serialization for formatted data, and because both use sockets on a local connection, you get a transmit queue depth and a receive queue depth of 64K before you see a stall (pipes are 4K, remote fs's and sockets are local send queue + remote receive queue + TCP window size). A tty/pty is ~4K (this is why there is a paste limit in terminal windows if you use a shell that's not smart enough to rebuffer into its internal command line buffer). Other things that can be represented by fds and which provide reliable stream delivery have their own limits.
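If you want to see where such a limit sits on a particular system, one rough way is to fill an unread pipe with non-blocking writes and count the bytes accepted before the write would have stalled. This is only a sketch; pipe_bytes_before_stall() is a made-up name, and the number it reports depends on the OS and version, so don't expect it to match the figures quoted above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe: returns how many bytes a fresh, unread pipe accepted
 * before a non-blocking write reported it would block, or -1 on error. */
long pipe_bytes_before_stall(void)
{
    int fds[2];
    char chunk[512] = {0};
    long total = 0;
    int flags;

    if (pipe(fds) < 0)
        return -1;

    /* Make the write end non-blocking so a full pipe returns EAGAIN
     * instead of putting the caller to sleep. */
    flags = fcntl(fds[1], F_GETFL);
    if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;                 /* nobody reads, so the queue fills */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;                      /* this is where a blocking writer stalls */
        total = -1;                     /* unexpected failure */
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

A blocking writer holding the FILE * lock would go to sleep at exactly the point where this probe sees EAGAIN, taking every other thread waiting on that lock with it.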

The reason the original poster doesn't see the problem with syslog/ASL at this point is only because the other end is keeping up with the load (for now).

By the way: the multiple threads accessing the same serialized API problem is a design problem, and hasn't gone away, it's just been submerged until the load/thread count goes up high enough in the program doing the logging, or until a high enough priority process loads the system to the point it starts robbing cycles from the logging daemon on the other side of the connection, and it can no longer keep up. At that point, the write will stall again.

If this "can't happen" because more threads will never be created, and the system is dedicated to a single task, so it doesn't need to worry about competitive loading, then this warning can probably be safely ignored, as long as no future changes to the code or the system load ever invalidate these conditions.

-- Terry
_______________________________________________
Darwin-dev mailing list      (email@hidden)