I am not sure whether this is the best group for my problem; if not, please accept my apologies and, er, tell me where to go ...

I have been working on a parallel Lisp implementation (actually, Scheme, which is a dialect of Lisp), in which separate instances of the *same* Macintosh application run at the same time, and share a good deal of memory, both for Scheme object storage and for interprocess communication. If N copies of my application are to be launched, the first one is launched in some normal way, e.g., by mousing on it, and it launches the other N-1 via, in essence,

    system("<path to my application>/Contents/MacOS/<my app's name> &");

(Actually, the call uses argv[0] to get the full path, and adds a few flags and operands that have to do with identifying which of the N processes is being started, setting up the mmap, and the like.)

To my complete and utter astonishment, this works like gangbusters -- I really do get N complete instances of my application, each with its own main window, menu bar, and so on. (And by the way, many thanks to Terry Lambert, a few months ago, for hints on why and how to use mmap.)

After a time chasing down deadlocks, critical-section violations, and other untoward consequences of parallel processing, I have gotten things to where I begin to see errors that are not obviously the consequence of my own obtuseness and inadequate coding skills. One of them is my subject for tonight.

It occurs rarely -- roughly, once in 10000 runs of my application (I am doing lots of regression testing) -- actually, that's once in about 2000 runs of 5 parallel copies of my application at a time; that's about once a day with my Mac running regression tests nearly full-time.

I get a crash with a crash log, and I won't bother you with too many details.
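
For anyone who wants to see the launch step from earlier in this message more concretely, here is a minimal sketch in C. It is only an illustration under stated assumptions, not the actual Wraith Scheme code: the flag names --instance and --count and the helper names build_launch_command and launch_siblings are hypothetical, since the message does not say which flags and operands are really passed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flags: the original message only says "a few flags and
   operands" identify which of the N processes is being started. */

/* Write a shell command that starts one sibling instance in the background.
   Returns 0 on success, -1 if the command does not fit in buf.
   The path is wrapped in double quotes so that spaces survive the shell;
   embedded quote characters are NOT escaped in this sketch. */
int build_launch_command(char *buf, size_t size, const char *exe_path,
                         int index, int total)
{
    int n = snprintf(buf, size, "\"%s\" --instance %d --count %d &",
                     exe_path, index, total);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Launch instances 1 .. total-1; instance 0 is the calling process.
   Returns the number of siblings for which system() reported success. */
int launch_siblings(const char *exe_path, int total)
{
    char cmd[4096];
    int started = 0;

    for (int i = 1; i < total; i++) {
        if (build_launch_command(cmd, sizeof cmd, exe_path, i, total) != 0)
            break;
        /* The trailing '&' makes the shell return at once; the status here
           is the shell's, not that of the launched application. */
        if (system(cmd) == 0)
            started++;
    }
    return started;
}
```

The first instance would call something like launch_siblings(argv[0], 5) from main to start the other four copies. Because system() returns as soon as the shell has backgrounded the job, a zero status says nothing about whether the sibling got as far as creating its window or its Mach port.
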

What I see is a failure in thread zero of my app -- that's the one where GUI I/O is done -- always deep inside a display or displayIfNeeded of my main window. The actual crash is an EXC_BAD_ACCESS with code KERN_PROTECTION_FAILURE in a function called szone_free, in libSystem.B.dylib. This is nested some fifteen or twenty function calls inside of anything I myself wrote. (And I should say that my GUI code is written in Cocoa -- my app is model/view/controller, with the view and controller in Cocoa and the model a separate thread of straight C++.) (I should also say that I am still working in Tiger, running Xcode 2.4.1, Mac OS X 10.4.10, on a 2006 model 13-inch MacBook with an Intel Core Duo.)

While contemplating this problem, I remembered something I had read in Dalrymple and Hillegass's "Advanced Mac OS X Programming": "... Mach ports are used for a lot of interprocess communications, particularly to the window server, and are very important to Cocoa." (p. 366)

I then noticed that every time I launched N parallel copies of my application, I would get N-1 error messages in the console log, each of the form

    "... CFLog (99): CFMessagePortCreateLocal(): failed to name Mach port ..."

Perusal of archives suggested that each instance of my application is trying to open a Mach port based on the CFBundleIdentifier in my app's Info.plist, and only the first of the N is able to do so -- the others all find the port name they are looking for in use. The archives I found suggested that the message was harmless, which perhaps blinded me to it for a while.

Furthermore, the "first of N" of my parallel processes is always the one that does the system calls to open the others, and the processes that have crashed have always been one of the other N-1; that is, one of the ones that did not get the Mach port it wanted.

So I am wondering if failure to open this Mach port (I don't know what it is actually for, by the way) is in some way causing this extremely rare failure.

If anyone is still reading, do you have a sense of whether I am on the right track? Any ideas for a fix? Any ideas for how to instrument and test to see if I can better understand what is going on?

(For the terminally curious, the application in question is "Wraith Scheme", described on my web site, whose URL is in my .sig below, but what's on the web site is *not* the parallel version; that last is still under development.)

Thanks much!

--  Jay Reynolds Freeman
---------------------
Jay_Reynolds_Freeman@mac.com
http://web.mac.com/jay_reynolds_freeman (personal web site)
_______________________________________________
Darwin-dev mailing list (Darwin-dev@lists.apple.com)