Re: XQuartz internal-behind-the-scenes question
- Subject: Re: XQuartz internal-behind-the-scenes question
- From: Ken Thomases <email@hidden>
- Date: Sat, 13 Sep 2014 00:14:28 -0500
On Sep 12, 2014, at 7:45 AM, René J.V. Bertin <email@hidden> wrote:
> Not really a question for a *users* list, but here goes.
Perhaps it should have been sent to the xquartz-dev list, then. <https://lists.macosforge.org/mailman/listinfo/xquartz-dev>
> I'm facing a situation in a KDE application where a crash occurs because I'm deleting a Qt object corresponding to a UI element that may still have Cocoa events pending (or be currently handling an event somewhere higher up in the call stack). Qt provides a `deleteLater` method which is intended for cases where the to-be-deleted object still has events pending and/or might be in use in a different thread; it defers the delete till after exit from the event loop (and/or until the owning thread exits?). Indeed, using that deferred delete systematically avoids the crash.
>
> I'm rather sure that there must be ObjC code and objects somewhere deep in XQuartz's OS X specific code,
There is. You can see it yourself here: http://cgit.freedesktop.org/xorg/xserver/tree/hw/xquartz
> and that there are situations where an X11 object corresponding to a UI element is being freed in similar conditions, for instance because the user clicks on a button that should close a dialog (rather than close the widget which ultimately leads to deleting the object).
No. This situation doesn't arise in the X server. The X server basically doesn't handle events. It just sends them to clients. Note that the window manager (often quartz-wm when using XQuartz) is just a client. So, the window close button is the same as any other button in a window from the perspective of the X server. The client would eventually call an X library function that submits a request to destroy the window and the X server would destroy it then.
Any in-flight events or calls for a window which has been destroyed are not a problem. Those things do not hold any direct reference (like a pointer) to the destroyed window. First, the events are likely to be in a different process where there can be no pointer to the X server's resources. But even for events within the X server, they hold window IDs which are just numbers. At the time the event or call is handled, the window is looked up from its ID. If the window has been destroyed, this lookup comes up empty, the handler recognizes that and returns an error. It doesn't crash.
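To make the ID-based lookup concrete, here is a minimal C++ sketch. It is not XQuartz or X server code; `WindowTable`, `Event`, and `Result::BadWindow` are hypothetical names chosen for illustration. The point is only that a queued event stores a number, so a destroyed window yields a failed lookup rather than a dangling pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical types for illustration; not taken from the X server sources.
using WindowId = std::uint32_t;

struct Window {
    std::string title;
    int eventsHandled;
};

// An event carries only the numeric ID, never a pointer to the window.
struct Event {
    WindowId target;
};

enum class Result { Handled, BadWindow };

class WindowTable {
public:
    // Registers a new window and returns the ID clients will refer to it by.
    WindowId create(std::string title) {
        WindowId id = nextId_++;
        windows_.emplace(id, Window{std::move(title), 0});
        return id;
    }

    // Destroys the window; any events still queued for it keep a stale ID.
    void destroy(WindowId id) { windows_.erase(id); }

    // Looks the window up at handling time; a stale ID is an error, not a crash.
    Result dispatch(const Event& ev) {
        auto it = windows_.find(ev.target);
        if (it == windows_.end()) {
            return Result::BadWindow;
        }
        ++it->second.eventsHandled;
        return Result::Handled;
    }

private:
    std::unordered_map<WindowId, Window> windows_;
    WindowId nextId_ = 1;
};
```

An event created before `destroy` and dispatched afterwards simply gets `Result::BadWindow` back.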
By the way, the X server is single-threaded. (XQuartz, as a Cocoa app, has multiple threads, but only one of those is the X server as such. Only that one deals with X resources.)
> The crash I mentioned occurs only on OS X, not on Linux, and the app's authors think I shouldn't follow advice found on forums (showing near-identical backtraces to the one I'm seeing) to prefer using the `deleteLater` method rather than a direct/immediate delete, at least on OS X. They're inclined to think there's a bug in Qt.
I'm not familiar with Qt's implementation, but it sounds to me like a fundamental design flaw in Qt. Qt should use a reference counting scheme (or something similar). Any code which requires an object to exist should ensure it does by taking (shared) ownership of it. It should increase its reference count. It should release it when it's done. Code should never delete objects directly; it should only release its ownership claim. The object will only be deleted when the last such claim is released. Therefore, except for bugs resulting in unbalanced retains vs. releases, it should not be possible for an object to be deleted while something needs it to continue to exist.
Hand-offs between threads should be carefully coordinated so that the thread receives ownership of any objects it receives references to.
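As a rough illustration of the retain/release model I mean, here is a sketch in standard C++ using `std::shared_ptr`. It says nothing about how Qt actually manages its objects; `Widget` and `PendingEvent` are made-up names. The pending event takes shared ownership, so the original owner dropping its claim cannot pull the object out from under it.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical UI object; stands in for whatever the event targets.
struct Widget {
    int clicks = 0;
};

// Code that needs the widget to exist takes a shared ownership claim.
class PendingEvent {
public:
    explicit PendingEvent(std::shared_ptr<Widget> w) : widget_(std::move(w)) {}

    // Safe even if every other owner has already released the widget.
    void deliver() { ++widget_->clicks; }

private:
    std::shared_ptr<Widget> widget_;  // holds the reference count up
};
```

If the creator calls `reset()` on its own `shared_ptr` while a `PendingEvent` is still alive, the widget survives until that event is destroyed, and only then is it deleted.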
Cocoa does implement autorelease pools, which are somewhat similar in concept to what you describe of Qt's deleteLater. However, those are only really necessary since Objective-C doesn't otherwise have a clean way for a callee to pass ownership of a returned object to the caller. C++ does support that, so Qt shouldn't require deleteLater.
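For what it's worth, this is the sort of thing I have in mind when I say C++ lets a callee pass ownership to the caller. It is a sketch in standard C++14, not Qt code, and `Document` and `makeDocument` are invented names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical object type for illustration.
struct Document {
    std::string name;
};

// Returning std::unique_ptr transfers ownership to the caller in the type
// system itself, so no deferred-release mechanism is needed for the hand-off.
std::unique_ptr<Document> makeDocument(std::string name) {
    auto doc = std::make_unique<Document>();
    doc->name = std::move(name);
    return doc;  // ownership moves out to the caller
}
```

The caller owns the result from the moment the call returns, and the object is destroyed when the caller's `unique_ptr` goes out of scope or is reset.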
> Myself, I think we're in a situation created by reimplementing a framework/API (ultimately designed for X11, I think) on top of OS X APIs. I know from personal experience in this domain that it's not at all difficult to delete an object before all its events have been handled completely on OS X. In my experience, the bug would be to do an immediate delete rather than a deferred delete, or in this case, trying to provide a 100% safe way to do an immediate delete.
Well, you can surely _release_ a Cocoa window while there are still events pending for it in the event queue. However, those events either don't have direct references to the window yet or, if they do, they're strong references (ownership claims increasing the reference count). In the former case, the window may be deallocated by the release if it was the last reference, but that won't cause a crash. When the event is dequeued, it will either be discarded because the window number it holds can't be translated to an extant window or it will be rerouted to a different window. In the latter case, the window won't be deallocated by the release because the event had a strong reference to it and there will be no problem routing the event to the window.
> I'm not familiar enough with Qt's internals to know exactly why the crash occurs and I'm not expecting that from anyone on here.
> What I'm hoping for is some feedback from developers who have experience building a foreign GUI framework (like X11...) on top of OS X and who know OS X and its frameworks much better than I do.
I have experience with that both from hacking on the XQuartz code and from implementing the Win32 API on top of Cocoa <http://source.winehq.org/git/wine.git/tree/HEAD:/dlls/winemac.drv>.
> Am I right that doing an immediate delete of an object in response to a UI event is a bad idea because you cannot know in that event handler how many more events could arrive for that object and/or how many events are being processed? Or should it indeed be safe to do a direct delete as long as it's not in use in (or owned by) another thread?
As I've said, any design involving direct "delete" as opposed to "retain" and "release" is almost impossible to deal with. It doesn't much matter whether it's immediate or deferred.
Complex software systems involve multiple subsystems that have to coordinate access to shared resources. In a "delete" model, only a single subsystem can have ownership of a resource at any given time. Either the resource is never shared outside of the subsystem, so the subsystem can be sure that all references are cleared before deleting the resource, or the resource must be strictly passed to another subsystem which assumes ownership. The subsystem passing the resource must clear all of its references to the resource upon passing it off and the receiving subsystem will have to follow similar rules if it ever passes it off.
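The strict hand-off discipline that a "delete" model demands can be sketched in standard C++ with `std::unique_ptr`. `Subsystem` and `Resource` are hypothetical names, and this is only an illustration of the rule, not a description of any real framework. Exactly one subsystem holds the resource at a time, and the giver is left with nothing once it passes it on.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical shared resource.
struct Resource {
    int id;
};

// A subsystem that may own at most one resource under a single-owner model.
class Subsystem {
public:
    // Assumes ownership of the resource being passed in.
    void accept(std::unique_ptr<Resource> r) { held_ = std::move(r); }

    // Passes the resource on; this subsystem's own reference is cleared.
    std::unique_ptr<Resource> handOff() { return std::move(held_); }

    bool holds() const { return held_ != nullptr; }

private:
    std::unique_ptr<Resource> held_;
};
```

After `b.accept(a.handOff())`, `a` no longer holds the resource and only `b` can delete it. Any raw pointer that `a` kept on the side would be exactly the kind of stale reference that makes this model so hard to get right.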
Probably none of this helps because you're not in a position to redesign Qt. It's hard to know whether you or the app's developers are right. Qt ought to spell out the design contract about object lifetimes: when it's legal to delete objects, when it's _required_, and when it's illegal. If Qt does spell this out, then that would determine whether there's a bug in the app because it doesn't adhere to the contract or a bug in Qt because it crashes even though the app does adhere to the contract. If Qt doesn't spell it out, then disagreement between its implementation and clients is inevitable, as are crashes.
Regards,
Ken
X11-users mailing list (email@hidden)