On Jul 9, 2014, at 9:26 PM, Paul Smith <paul@mad-scientist.net> wrote:
On Wed, 2014-07-09 at 19:55 -0500, Stephen J. Butler wrote:
Can you distill this down to a self-contained test case?
Difficult. It only happens once in a while on our system, in our test suite, which runs 14,000 individual tests of our entire application. Every now and then we notice that the tests are hung, and when we investigate and get a core, this is what we find. It happens at different times in different areas (the fclose() hang was in a totally different function, reading different files).
Our use of stdio functions is literally as simple as the example I quoted in my original email: the FILE* is opened and assigned to a local variable, used, then closed before the function returns and not passed to any functions (except other stdio functions).
I could try to write a simple program that created lots of threads and randomly read/wrote files: not sure how effective that would be as a repro case.
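A minimal sketch of what such a stress program might look like, assuming POSIX threads and a writable /tmp. Everything here (run_stress, worker, struct worker_arg, MAX_THREADS, the temp file naming) is made up for illustration and is not taken from our application. It is also deterministic rather than random: each thread repeatedly writes and re-reads its own file, following the same open/use/close pattern with a local FILE*.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_THREADS 64 /* arbitrary cap for this sketch */

struct worker_arg {
    int id;         /* used to build a per-thread file name */
    int iterations; /* write/read cycles to perform */
    int failures;   /* stdio calls that did not behave as expected */
};

/* Each cycle opens a FILE* into a local variable, uses it, and closes it
 * again; the pointer is only ever passed to other stdio functions. */
static void *worker(void *p)
{
    struct worker_arg *arg = p;
    char path[64], out[256], in[256];

    snprintf(path, sizeof path, "/tmp/stdio_stress_%d_%d.tmp",
             (int)getpid(), arg->id);
    memset(out, 'a' + arg->id % 26, sizeof out);

    for (int i = 0; i < arg->iterations; i++) {
        FILE *fp = fopen(path, "wb");
        if (fp == NULL) { arg->failures++; continue; }
        if (fwrite(out, 1, sizeof out, fp) != sizeof out) arg->failures++;
        if (fclose(fp) != 0) arg->failures++;

        fp = fopen(path, "rb");
        if (fp == NULL) { arg->failures++; continue; }
        if (fread(in, 1, sizeof in, fp) != sizeof in ||
            memcmp(in, out, sizeof in) != 0) arg->failures++;
        if (fclose(fp) != 0) arg->failures++;
    }
    remove(path);
    return NULL;
}

/* Starts nthreads workers and returns the total failure count, or -1 if
 * the arguments are out of range or a thread could not be created.
 * If a worker hangs inside fread()/fclose(), pthread_join() never returns,
 * which is the state we would then want a core from. */
int run_stress(int nthreads, int iterations)
{
    pthread_t tids[MAX_THREADS];
    struct worker_arg args[MAX_THREADS];
    int started = 0, total = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS) return -1;

    for (int i = 0; i < nthreads; i++) {
        args[i] = (struct worker_arg){ .id = i, .iterations = iterations,
                                       .failures = 0 };
        if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0) break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].failures;
    }
    return started == nthreads ? total : -1;
}
```

Calling something like run_stress(32, 100000) from main() and leaving it running would be the idea. Since the hang is rare and may depend on something else in the process (signals, memory corruption), a clean run of this would not prove much either way.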
I'm wondering if someone has pointers on what we might investigate (and how) when we get a process in this state, that might help us narrow down where to look or what to concentrate on.
Some possibilities include:
* That thread is deadlocked against itself because it's trying to call fread() from a signal handler, and the signal handler interrupted another flockfile-ing call. What is the rest of that stack trace?
* The process is deadlocked because some other thread owns the lock and won't let go for some reason. What are the other threads' stack traces?
* The lock is broken because a memory error smashed it. What do the memory contents of the lock look like? (I don't know what the internals of the current pthread mutex look like, but the first four bytes should be something similar to 'MUTX'.)
-- 
Greg Parker     gparker@apple.com     Runtime Wrangler
Darwin-dev mailing list (Darwin-dev@lists.apple.com)