Re: C++ RTTI/dynamic_cast across shared module boundaries?
- Subject: Re: C++ RTTI/dynamic_cast across shared module boundaries?
- From: Zachary Pincus <email@hidden>
- Date: Sun, 5 Mar 2006 15:59:30 -0800
Thanks for the information!
I've been asking about this on the python lists, and have just
received a discouraging reply. Python extensions on OS X are bundles,
not shared libraries, and are not opened with dlopen().
The python folk think that there's no way to get weak symbols bound
across bundles. Is this really the case, or is there an analogous
mechanism for getting symbols shared between different bundles?
Zach
On Mar 5, 2006, at 2:05 PM, Howard Hinnant wrote:
Nick Kledzik passed this information on to me. I don't know
whether it will help or not, but I figured it couldn't hurt to pass
it on:
The one other thing dyld requires can be shown with otool -hv.
Mach header
      magic cputype cpusubtype filetype ncmds sizeofcmds flags
   MH_MAGIC     PPC        ALL    DYLIB    11       1324 NOUNDEFS DYLDLINK TWOLEVEL WEAK_DEFINES BINDS_TO_WEAK
The key flags are: WEAK_DEFINES and BINDS_TO_WEAK. If an image
has either bit set, dyld will search it for weak external symbols
and for each found, scan all images with the WEAK_DEFINES set
looking for the first external symbol with the same name. I only
mention the mach_header flags because there are some cases where
ld is not setting the WEAK_DEFINES bit and causing confusion.
-Howard
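As a rough illustration of the check described above, the sketch below decodes a Mach-O header flags word and reports whether an image would take part in weak-symbol lookup. The flag values are copied from <mach-o/loader.h> so the snippet builds on any platform; the helper names (searched_for_weak_symbols, provides_weak_definitions) and the sample flags word are made up for illustration and are not part of any dyld API.

```cpp
#include <cassert>
#include <cstdint>

// Flag values as defined in <mach-o/loader.h>, repeated here so the
// sketch does not depend on Apple headers.
constexpr std::uint32_t kNoUndefs    = 0x1;      // MH_NOUNDEFS
constexpr std::uint32_t kDyldLink    = 0x4;      // MH_DYLDLINK
constexpr std::uint32_t kTwoLevel    = 0x80;     // MH_TWOLEVEL
constexpr std::uint32_t kWeakDefines = 0x8000;   // MH_WEAK_DEFINES
constexpr std::uint32_t kBindsToWeak = 0x10000;  // MH_BINDS_TO_WEAK

// Hypothetical helper: per the description above, an image is searched
// for weak external symbols if either bit is set.
constexpr bool searched_for_weak_symbols(std::uint32_t flags) {
    return (flags & (kWeakDefines | kBindsToWeak)) != 0;
}

// Hypothetical helper: only images with WEAK_DEFINES set are scanned as
// possible providers of the matching definition.
constexpr bool provides_weak_definitions(std::uint32_t flags) {
    return (flags & kWeakDefines) != 0;
}

// The flags shown in the otool -hv output above.
constexpr std::uint32_t kExampleFlags =
    kNoUndefs | kDyldLink | kTwoLevel | kWeakDefines | kBindsToWeak;

void example_usage() {
    assert(searched_for_weak_symbols(kExampleFlags));
    assert(provides_weak_definitions(kExampleFlags));
    // An image where ld failed to set WEAK_DEFINES would be skipped as a
    // provider even though it is still searched for weak references.
    assert(!provides_weak_definitions(kExampleFlags & ~kWeakDefines));
}
```

If a module's flags lack WEAK_DEFINES even though nm shows weak external symbols in it, that would match the "ld is not setting the WEAK_DEFINES bit" case mentioned above.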
On Mar 5, 2006, at 2:23 AM, Zachary Pincus wrote:
Hi folks,
I'm still trying to resolve my problem with typeinfo object
resolution across DSO boundaries. I thought that I had fixed it,
but now it's broken again. Worse, I still can't reproduce the
problem with simpler test cases.
To recap, I have some shared objects (python extensions) which all
use a particular template class. For dynamic_cast to work, the
typeinfo symbols need to be exported from the modules and loaded
globally by dyld. Somehow, I can't get this working.
My newest theory is that different libraries are seeing different
definitions of the template class due to some macro issues (as
suggested by Howard below). Is there any way to tell by looking at
the symbols that nm -m produces whether two symbols will resolve
to the same thing at load time?
e.g.:
zpincus% nm -m _ITKBasicFiltersAPython.so | c++filt | grep grep_for_stuff
00ba2280 (__DATA,__const_coal) weak external typeinfo for
itk::Image<float, 2u>
zpincus% nm -m _ITKCommonAPython.so | c++filt | grep grep_for_stuff
005b3b60 (__DATA,__const_coal) weak external typeinfo for
itk::Image<float, 2u>
Is there any way I can tell whether the loader will resolve these
two symbols (or others) as the same (if RTLD_GLOBAL is passed to
dlopen), or if they will be resolved differently because one
library saw a slightly different definition of the class than the
other?
Thanks everyone for the help,
Zach Pincus
On Feb 18, 2006, at 9:41 AM, Howard Hinnant wrote:
Templates (like the type_info in your example) should be
implicitly declared weak, meaning they will be unique'd across
DSO boundaries. So as long as the template class has the same
definition, you should be ok with respect to the ODR. If
different DSO's saw different definitions of your templates (say
via different macro flags or whatever), only then would you run
afoul of the ODR.
Know that we (Apple) are concerned about your problem and no need
to apologize for bringing this up on the Xcode list, no matter
where the problem ultimately lies.
-Howard
On Feb 18, 2006, at 12:14 PM, Zachary Pincus wrote:
Steve,
Thanks for your discussion of this problem.
One question: How does the One Definition Rule interact with
templated classes? The issue that I'm running into is that the
master "image" class that the image filters in my different
modules all need to interact with is a template. So any module
that needs to create a new image will implicitly implement that
image class. I just can't see any way to *not* violate the One
Definition Rule when you need to share templated classes across
DSO boundaries.
Is this correct? I ask because if there is no good way to not
violate the One Definition Rule with templated classes, that
seems like a good argument for why the current GCC RTTI
implementation is wrong.
Technically, I guess that I could ensure that no module ever
actually constructs any instances of templated classes. Instead
I could have an object factory, defined only in one place, that
handles the construction. Though the One Definition Rule would
still be violated in this case, it wouldn't matter, because
everything would get the same typeinfo object. This is
exceptionally nasty though, and would definitely void out any
performance increases due to not having to do string comparisons
in dynamic_cast operations (etc).
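A minimal sketch of the factory idea, under stated assumptions: DataObject, Image, make_float_image_2d and as_float_image_2d are hypothetical stand-ins, not the real library's classes. In a real build the two functions would be declared in a shared header and defined in exactly one library. The cast helper is kept next to the factory on purpose, because a dynamic_cast written in another module would reference that module's own copy of the template's typeinfo.

```cpp
#include <cassert>
#include <memory>

// Hypothetical polymorphic base that all modules agree on.
struct DataObject {
    virtual ~DataObject() = default;
};

// Hypothetical stand-in for the templated image class.
template <typename Pixel, unsigned Dim>
struct Image : DataObject {
    Pixel fill{};
};

// Assumed to be defined in one library only: every module asks this
// function for an image instead of constructing Image<float, 2> itself.
std::unique_ptr<DataObject> make_float_image_2d() {
    return std::make_unique<Image<float, 2>>();
}

// Assumed to live in the same library as the factory, so the cast and
// the object's vtable refer to the same typeinfo object.
Image<float, 2>* as_float_image_2d(DataObject* obj) {
    return dynamic_cast<Image<float, 2>*>(obj);
}

void example_usage() {
    std::unique_ptr<DataObject> img = make_float_image_2d();
    assert(as_float_image_2d(img.get()) != nullptr);

    DataObject plain;  // not an image, so the cast must fail
    assert(as_float_image_2d(&plain) == nullptr);
}
```

The cost is the one described above: every construction and every cast becomes an out-of-line call into a single library.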
In reference to my specific problem, I'll try to verify whether
python on OS X is (a) using dlopen() to load the modules (I
rather think that it is, but never hurts to check) and (b) that
the dlopen flags are getting set right. Maybe the best approach
will be to write a little module that does run dlopen() with the
right flags. If loading that in python before loading the rest
of the modules fixes things, then it's a python problem and I'll
have to apologize for bugging everyone on the XCode list!
Zach
On Feb 18, 2006, at 10:44 AM, Steve Baxter wrote:
Hi Zach,
The big problem here is the way that GCC implements RTTI - it
is different to pretty much every other implementation. The
GCC runtime compares type_info by pointer rather than by
name(). This means that a class implemented in two different DSOs
(dylibs) will not be considered the same by the RTTI in GCC,
but will be considered the same by the RTTI on almost every
other platform.
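To make the difference concrete, here is a small sketch of the two comparison strategies. The function names are illustrative only and this is not the libstdc++ source. Within a single binary the two normally agree; the trouble described here arises when two DSOs each carry their own type_info object for the same class, so the addresses differ while the mangled names match.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Comparison by address: two type_info objects are "the same type" only
// if they are literally the same object in memory.
bool same_type_by_address(const std::type_info& a, const std::type_info& b) {
    return &a == &b;
}

// Comparison by name: the (mangled) names are compared as strings, so
// duplicate type_info objects for one class still compare equal.
bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    return std::strcmp(a.name(), b.name()) == 0;
}

void example_usage() {
    assert(same_type_by_name(typeid(int), typeid(int)));
    assert(!same_type_by_name(typeid(int), typeid(float)));
    assert(!same_type_by_address(typeid(int), typeid(float)));
}
```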
Personally I feel this is a mistake on the part of the GCC
designers. Having the same class implemented in two different
libs is technically a bug in your application, it violates the
One Definition Rule:
http://en.wikipedia.org/wiki/One_Definition_Rule
However, in practice the one definition rule is very difficult
(and sometimes impossible if you are a third-party plugin being
loaded by an app over which you have no control) to get right.
GCC requires it to be true or RTTI will fail. VC++ and
Codewarrior do not require this to be true.
The first thing I would do is file a bug on radar against the
GCC RTTI implementation. If Apple compiled libstdc++ with
__GXX_MERGED_TYPEINFO_NAMES=0, type_info would be compared by
name() not address and all your problems will just go away
without any more work. I did file a bug and got it returned as
"behaves correctly". If lots of people file a bug against this
we can maybe change Apple's mind (or at least get them to
provide an alternative version of libstdc++ that does have this
option switched on). See type_info for information about this
compile switch. My bug was 4424486 - please feel free to
reference it.
Failing this, you can work around this problem. Here are the
requirements:
(1) You must export all the symbols in your dylibs. This will
prevent dead code stripping from working and increase the size
of your plugins, but disk space is cheap, right? (Right, but
internet bandwidth is not.)
(2) You must load the dylibs by calling dlopen() with
RTLD_GLOBAL. You may also need to pass RTLD_NOW (the
documentation says otherwise, but I have a feeling I couldn't
make it work without this).
I found that CFBundle does not pass RTLD_GLOBAL to dlopen() -
if you are using C++ and RTTI, you cannot use CFBundle to open
your plugins (or rather you can, but you need to use dlopen()
as well before you call CFBundleLoadExecutable()).
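A minimal sketch of requirement (2), assuming a POSIX dlopen() is available. LoadResult and load_globally are hypothetical names, and the path passed in would be your plugin's executable.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Hypothetical result type: the dlopen() handle plus any error text.
struct LoadResult {
    void* handle;
    std::string error;
};

// Opens the image with RTLD_GLOBAL so its symbols are visible to images
// loaded later, and RTLD_NOW so all symbols are bound immediately.
LoadResult load_globally(const char* path) {
    dlerror();  // clear any stale error state
    void* handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    LoadResult result{handle, {}};
    if (handle == nullptr) {
        const char* msg = dlerror();
        result.error = (msg != nullptr) ? msg : "unknown dlopen error";
    }
    return result;
}

void example_usage() {
    // A path that does not exist: the load fails and an error is reported.
    LoadResult r = load_globally("/nonexistent/plugin.so");
    assert(r.handle == nullptr);
    assert(!r.error.empty());
}
```

Following the suggestion above, a host using CFBundle would call something like load_globally() on the plugin's executable first and only then call CFBundleLoadExecutable().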
I have to say though that you *seem* to be jumping through all
the hoops correctly. Are you sure that Python is definitely
passing your flags on to dlopen()?
I wrote a much longer post about all of this a couple of weeks
ago:
http://lists.apple.com/archives/xcode-users/2006/Feb/msg00234.html
Cheers,
Steve.
On 18 Feb 2006, at 12:52, Zachary Pincus wrote:
I've not tried this on OS X (my problems were on Linux, IRIX,
and Solaris). However, I assume (maybe incorrectly) the
problem exists in all GCC implementations. I also have not
tried this in GCC 4, we were using GCC 3. Maybe the problem
identified in the FAQ was fixed in GCC 4? If so, that would
be fantastic.
The problem we had is that we were using dynamic_cast as a
method for implementing a plug-in architecture. The
dynamic_cast was used to access data types provided by each
plug-in. The problem we ran into was that some of our plug-
ins were also libraries that others linked to. We had tons of
duplicate symbol errors when we exported all symbols.
This is definitely the same species of difficulty that I'm
having: dynamic_cast used for data types across plugins. I'm
sure there's some little OS X-specific twist with how gcc
works that I'm just not understanding. Arrg.
Zach
On Feb 17, 2006, at 9:33 PM, Zachary Pincus wrote:
Michael,
Thanks for this information! That's exactly what I was
looking for.
I assume that you're saying linking with "-Wl,-E" (as
specified on the web page you referred) isn't a good
solution because it exports all global symbols. Out of
curiosity, what about exporting all the global symbols is
bad? Just that it increases the potential for symbol-name
collisions?
Zach
On Feb 17, 2006, at 6:55 PM, Michael Rice wrote:
It sounds like you are running into the C++ ABI issue described
in the GCC FAQ (http://gcc.gnu.org/faq.html#dso). I ran
into this problem long ago and have still to find a good,
generic solution for this problem (i.e., not having to
export every symbol in the library). My best solution so
far has been to implement my own, less efficient, RTTI system.
On Feb 17, 2006, at 8:43 AM, Zachary Pincus wrote:
Thanks Howard.
In the Code Generation build settings of all targets,
uncheck "Symbols Hidden by Default".
Right now, I'm not actually using XCode (part of my
debugging was to remove XCode from the mix and do all of
the building and linking directly on the command line, so
I could easily fix problem flags). There are absolutely no
'-fvisibility=hidden' flags on the link or compile command
lines I have been using, so I don't think symbols are
being hidden. (Given that the man page for g++ says that
the default is for public visibility.)
Is there any way I can verify this with, say, otool?
Also, a correction: telling Python to load with *either*
dyld flags of RTLD_LAZY|RTLD_GLOBAL *or* RTLD_NOW|
RTLD_GLOBAL doesn't help.
Zach
On Feb 17, 2006, at 8:24 AM, Howard Hinnant wrote:
On Feb 17, 2006, at 9:01 AM, Zachary Pincus wrote:
Hi folks,
I've been trying for a while to get c++ RTTI and dynamic
casting to work across the boundaries of several
"bundle" shared modules. I've spent a day looking at man
pages and online, to no avail.
In my case, instances of particular classes can be
created in various modules, but need to work (and
dynamically cast properly) when passed to other modules.
(Before you ask: it's an image processing library, where
different image filter types are defined in different
modules, but they all need to be able to send and
receive the same image types.)
I've linked the modules as follows:
/usr/bin/c++ -bundle -o [output].so [object files] -L
[link paths] -l[link libs]
Now, how do I need to set up my environment to get RTTI
and dynamic_cast working across several such modules?
Right now, the module loader is Python, which I think
uses dlopen to load the modules. I've set the dlopen
flags (in python, sys.setdlopenflags()) to 0x9, which is
RTLD_LAZY|RTLD_GLOBAL (as they are defined in
/usr/include/dlfcn.h), but that really doesn't help. (Other
permutations on the dlopen flags don't help.)
Is there anything else I need to do? Is there anything
else I can try? Is this a hopeless project?
In the Code Generation build settings of all targets,
uncheck "Symbols Hidden by Default".
-Howard
_______________________________________________
Do not post admin requests to the list. They will be
ignored.
Xcode-users mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
@stanford.edu
This email sent to email@hidden
Steve Baxter
Software Development Manager
Improvision
+44-2476-692229