Re: C++ RTTI/dynamic_cast across shared module boundaries?

Subject: Re: C++ RTTI/dynamic_cast across shared module boundaries?
From: Zachary Pincus <email@hidden>
Date: Sun, 5 Mar 2006 15:59:30 -0800

Thanks for the information!

I've been asking about this on the python lists, and have just received a discouraging reply. Python extensions on OS X are bundles, not shared libraries, and are not opened with dlopen().

The python folk think that there's no way to get weak symbols bound across bundles. Is this really the case, or is there an analogous mechanism for getting symbols shared between different bundles?

Zach

On Mar 5, 2006, at 2:05 PM, Howard Hinnant wrote:

Nick Kledzik passed this information on to me. I don't know whether it will help or not, but I figured it couldn't hurt to pass it on:
The one other thing dyld requires can be shown with otool -hv.
Mach header
      magic cputype cpusubtype   filetype ncmds sizeofcmds      flags
   MH_MAGIC     PPC        ALL      DYLIB    11       1324   NOUNDEFS DYLDLINK TWOLEVEL WEAK_DEFINES BINDS_TO_WEAK

The key flags are WEAK_DEFINES and BINDS_TO_WEAK. If an image has either bit set, dyld will search it for weak external symbols and, for each one found, scan all images that have the WEAK_DEFINES bit set, looking for the first external symbol with the same name. I only mention the mach_header flags because there are some cases where ld does not set the WEAK_DEFINES bit, which causes confusion.
-Howard
On Mar 5, 2006, at 2:23 AM, Zachary Pincus wrote:
Hi folks,
I'm still trying to resolve my problem with typeinfo object resolution across DSO boundaries. I thought that I had fixed it, but now it's broken again. Worse, I still can't reproduce the problem with simpler test cases.

To recap, I have some shared objects (python extensions) which all use a particular template class. For dynamic_cast to work, the typeinfo symbols need to be exported from the modules and loaded globally by dyld. Somehow, I can't get this working.

My newest theory is that different libraries are seeing different definitions of the template class due to some macro issues (as suggested by Howard below). Is there any way to tell by looking at the symbols that nm -m produces whether two symbols will resolve to the same thing at load time?

e.g.:
zpincus% nm -m _ITKBasicFiltersAPython.so | c++filt | grep_for_stuff
00ba2280 (__DATA,__const_coal) weak external typeinfo for itk::Image<float, 2u>

zpincus% nm -m _ITKCommonAPython.so | c++filt | grep grep_for_stuff
005b3b60 (__DATA,__const_coal) weak external typeinfo for itk::Image<float, 2u>

Is there any way I can tell whether the loader will resolve these two symbols (or others) as the same (if RTLD_GLOBAL is passed to dlopen), or if they will be resolved differently because one library saw a slightly different definition of the class than the other?
Thanks everyone for the help,
Zach Pincus
On Feb 18, 2006, at 9:41 AM, Howard Hinnant wrote:
Templates (like the type_info in your example) should be implicitly declared weak, meaning they will be unique'd across DSO boundaries. So as long as the template class has the same definition, you should be ok with respect to the ODR. If different DSO's saw different definitions of your templates (say via different macro flags or whatever), only then would you run afoul of the ODR.

Know that we (Apple) are concerned about your problem and no need to apologize for bringing this up on the Xcode list, no matter where the problem ultimately lies.
-Howard
On Feb 18, 2006, at 12:14 PM, Zachary Pincus wrote:
Steve,
Thanks for your discussion of this problem.
One question: How does the One Definition Rule interact with templated classes? The issue that I'm running into is that the master "image" class that the image filters in my different modules all need to interact with is a template. So any module that needs to create a new image will implicitly implement that image class. I just can't see any way to *not* violate the One Definition Rule when you need to share templated classes across DSO boundaries.

Is this correct? I ask because if there is no good way to not violate the One Definition Rule with templated classes, that seems like a good argument for why the current GCC RTTI implementation is wrong.

Technically, I guess that I could ensure that no module ever actually constructs any instances of templated classes. Instead I could have an object factory, defined only in one place, that handles the construction. Though the One Definition Rule would still be violated in this case, it wouldn't matter, because everything would get the same typeinfo object. This is exceptionally nasty though, and would definitely void out any performance increases due to not having to do string comparisons in dynamic_cast operations (etc).

In reference to my specific problem, I'll try to verify whether python on OS X is (a) using dlopen() to load the modules (I rather think that it is, but never hurts to check) and (b) that the dlopen flags are getting set right. Maybe the best approach will be to write a little module that does run dlopen() with the right flags. If loading that in python before loading the rest of the modules fixes things, then it's a python problem and I'll have to apologize for bugging everyone on the XCode list!
Zach
On Feb 18, 2006, at 10:44 AM, Steve Baxter wrote:
Hi Zach,
The big problem here is the way that GCC implements RTTI - it is different to pretty much every other implementation. The GCC runtime compares type_info by pointer rather than by name(). This means that a class implemented in two different DSOs (dylibs) will not be considered the same by the RTTI in GCC, but will be considered the same by the RTTI on almost every other platform.

Personally I feel this is a mistake on the part of the GCC designers. Having the same class implemented in two different libs is technically a bug in your application, it violates the One Definition Rule:
http://en.wikipedia.org/wiki/One_Definition_Rule
However, in practice the one definition rule is very difficult (and sometimes impossible if you are a third-party plugin being loaded by an app over which you have no control) to get right. GCC requires it to be true or RTTI will fail. VC++ and Codewarrior do not require this to be true.

The first thing I would do is file a bug on radar against the GCC RTTI implementation. If Apple compiled libstdc++ with __GXX_MERGED_TYPEINFO_NAMES=0, type_info would be compared by name() rather than by address, and all your problems would just go away without any more work. I did file a bug and got it returned as "behaves correctly". If lots of people file a bug against this we can maybe change Apple's mind (or at least get them to provide an alternative version of libstdc++ that does have this option switched on). See the <typeinfo> header for information about this compile switch. My bug was 4424486 - please feel free to reference it.

Failing this, you can work around this problem. Here are the requirements:

(1) You must export all the symbols in your dylibs. This will prevent dead code stripping from working and increase the size of your plugins, but disk space is cheap right (right, but internet bandwidth is not).

(2) You must load the dylibs by calling dlopen() with RTLD_GLOBAL. You may also need to pass RTLD_NOW (the documentation says otherwise, but I have a feeling I couldn't make it work without this).

I found that CFBundle does not pass RTLD_GLOBAL to dlopen() - if you are using C++ and RTTI, you cannot use CFBundle to open your plugins (or rather you can, but you need to use dlopen() as well before you call CFBundleLoadExecutable()).

I have to say though that you *seem* to be jumping through all the hoops correctly. Are you sure that Python is definitely passing your flags on to dlopen()?

I wrote a much longer post about all of this a couple of weeks ago:
http://lists.apple.com/archives/xcode-users/2006/Feb/msg00234.html
Cheers,
Steve.
On 18 Feb 2006, at 12:52, Zachary Pincus wrote:
I've not tried this on OS X (my problems were on Linux, IRIX, and Solaris). However, I assume (maybe incorrectly) the problem exists in all GCC implementations. I also have not tried this in GCC 4, we were using GCC 3. Maybe the problem identified in the FAQ was fixed in GCC 4? If so, that would be fantastic.

The problem we had is that we were using dynamic_cast as a method for implementing a plug-in architecture. The dynamic_cast was used to access data types provided by each plug-in. The problem we ran into was that some of our plug-ins were also libraries that others linked to. We had tons of duplicate symbol errors when we exported all symbols.
This is definitely the same species of difficulty that I'm having: dynamic_cast used for data types across plugins. I'm sure there's some little OS X-specific twist with how gcc works that I'm just not understanding. Arrg.
Zach
On Feb 17, 2006, at 9:33 PM, Zachary Pincus wrote:
Michael,
Thanks for this information! That's exactly what I was looking for.

I assume that you're saying linking with "-Wl,-E" (as specified on the web page you referred to) isn't a good solution because it exports all global symbols. Out of curiosity, what about exporting all the global symbols is bad? Just that it increases the potential for symbol-name collisions?
Zach
On Feb 17, 2006, at 6:55 PM, Michael Rice wrote:
It sounds like you are running into the C++ ABI described in the GCC FAQ (http://gcc.gnu.org/faq.html#dso). I ran into this problem long ago and have still to find a good, generic solution for this problem (i.e., not having to export every symbol in the library). My best solution so far has been to implement my own, less efficient, RTTI system.
On Feb 17, 2006, at 8:43 AM, Zachary Pincus wrote:
Thanks Howard.
In the Code Generation build settings of all targets, uncheck "Symbols Hidden by Default".
Right now, I'm not actually using XCode (part of my debugging was to remove XCode from the mix and do all of the building and linking directly on the command line, so I could easily fix problem flags). There are absolutely no '-fvisibility=hidden' flags on the link or compile command lines I have been using, so I don't think symbols are being hidden. (Given that the man page for g++ says that the default is for public visibility.)
Is there any way I can verify this with, say, otool?
Also, a correction: telling Python to load with *either* dlopen flags of RTLD_LAZY|RTLD_GLOBAL *or* RTLD_NOW|RTLD_GLOBAL doesn't help.
Zach
On Feb 17, 2006, at 8:24 AM, Howard Hinnant wrote:
On Feb 17, 2006, at 9:01 AM, Zachary Pincus wrote:
Hi folks,
I've been trying for a while to get c++ RTTI and dynamic casting to work across the boundaries of several "bundle" shared modules. I've spent a day looking at man pages and online, to no avail.

In my case, instances of particular classes can be created in various modules, but need to work (and dynamically cast properly) when passed to other modules. (Before you ask: it's an image processing library, where different image filter types are defined in different modules, but they all need to be able to send and receive the same image types.)

I've linked the modules as follows: /usr/bin/c++ -bundle -o [output].so [object files] -L [link paths] -l[link libs]

Now, how do I need to set up my environment to get RTTI and dynamic_cast working across several such modules?

Right now, the module loader is Python, which I think uses dlopen to load the modules. I've set the dlopen flags (in python, sys.setdlopenflags()) to 0x9, which is RTLD_LAZY|RTLD_GLOBAL (as they are defined in /usr/include/dlfcn.h), but that really doesn't help. (Other permutations on the dlopen flags don't help.)

Is there anything else I need to do? Is there anything else I can try? Is this a hopeless project?
In the Code Generation build settings of all targets, uncheck "Symbols Hidden by Default".
-Howard
_______________________________________________ Do not post admin requests to the list. They will be ignored. Xcode-users mailing list (email@hidden) Help/Unsubscribe/Update your Subscription: @stanford.edu

This email sent to email@hidden
Steve Baxter
Software Development Manager
Improvision
+44-2476-692229


