I have a source file in my class library with several large static functions, for which I'm trying to get a decent profile with Shark. However, gcc insists on inlining all these functions, even with -fno-inline-functions and -fno-default-inline in the build settings (overriding the general optimization setting of -Os). Furthermore, Shark's mapping from source to assembly is all mixed up, indicating that 70% of my runtime is spent on a single SSE intrinsic in my source file, well outside the inner loop. (Shark maps this intrinsic to about 300 totally unrelated lines of assembly. I haven't figured out how to untangle this. Is the source<->assembly mapping something the compiler does, or does Shark parse the source file itself?)
So... I tried rebuilding my library with -fkeep-inline-functions also, which generates the following warning:
/Developer/SDKs/MacOSX10.4u.sdk/usr/include/c++/4.0.0/i686-apple-darwin8/bits/gthr-default.h:551: warning: control reaches end of non-void function
This is puzzling for several reasons. First, this looks like a legitimate bug in the headers. Second, I thought Xcode 2.3 used gcc 4.0.1 (not 4.0.0). Third, this option causes my tiny little file to generate over a megabyte of object code. Fourth, applications can no longer link against my rebuilt library, and fail with the error:
/usr/bin/ld: Undefined symbols:
vtable for CFoo
../../Core/build/Debug/libMyLib_d.a(CFoo.o) reference to undefined vtable for CFoo
collect2: ld returned 1 exit status
When I remove -fkeep-inline-functions from that one source file, everything links fine. But I still haven't figured out how to generate an optimized but non-inlined version of my code for profiling purposes. Any suggestions? (This is all happening with x86 code on a MacBook Pro, using Xcode 2.3 and Shark 4.3.3 (4).)
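To make the goal concrete, here is a minimal sketch of one per-function approach, assuming GCC's `noinline` function attribute is honored at -Os with this toolchain (untested here). The names `sum_squares` and `process` are hypothetical placeholders, not the real functions in my library.

```cpp
#include <cstddef>

// Hypothetical stand-in for one of the large static functions.
// GCC's noinline attribute asks the compiler to keep this as a real
// call even when optimizing, so it should get its own symbol in a profile.
__attribute__((noinline))
static float sum_squares(const float* v, std::size_t n)
{
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        acc += v[i] * v[i];   // inner loop we would like attributed correctly
    return acc;
}

// Hypothetical caller; without the attribute, the optimizer would be
// free to fold sum_squares into this function.
float process(const float* v, std::size_t n)
{
    return sum_squares(v, n) * 0.5f;
}
```

Unlike a file-wide flag, this only affects the functions that carry the attribute, so the rest of the file can keep its normal optimization settings.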
Thanks,
Ben