Re: GCC stack size
- Subject: Re: GCC stack size
- From: Ryan McGann <email@hidden>
- Date: Tue, 03 Mar 2009 00:08:45 -0800
Just to let everybody know (and for the sake of the archives), it
appears our kernel extension is running into a known limitation of
GCC. To recap: we have a function in a very portable, very lightweight
library that for some reason gets a 1600+ byte stack frame when
compiled on/for Darwin. No matter what compiler options and
optimization levels I tried, the frame was never smaller than 1400
bytes. Since this code runs in the kernel, that stack usage is much
too large, and it is causing a "machine check" panic in pretty
reproducible (but not frequent) situations.
After a lot of searching on the Web, learning some i386 assembly, and
talking to some people who are much better at it than I am, I have
learned that GCC is somewhat notorious for horrid stack allocation.
Sure enough, Visual C++ compiles the exact same code, with absolutely
no modifications or macro trickery, and gives the function a 96-byte
frame (which is exactly what we wrote it to have). Apparently this is
gcc's dirty little secret, except it's not much of a secret to some:
Linus Torvalds has complained several times on various lists about
gcc's stack allocation (search lkml.org for "gcc stack usage"). Once I
knew what to search for, there was plenty of griping about gcc's
subpar allocation of stack variables, and in particular its inability
to reuse stack space for variables in different scopes.
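To make that last point concrete, here is a minimal sketch (hypothetical
code, not from our library) of the pattern: the two buffers are never live
at the same time, so an ideal compiler would overlay them in one 256-byte
slot, but a compiler that does not share slots across scopes reserves 512
bytes, and that waste multiplies with every such block in a large function.

    // Hypothetical reduction of the problem: buf_a and buf_b live in
    // disjoint scopes and are never needed at the same time.
    int process(const unsigned char *in, int len, int mode)
    {
        int sum = 0;
        if (mode == 0) {
            unsigned char buf_a[256];              // live only in this block
            for (int i = 0; i < len && i < 256; ++i)
                buf_a[i] = (unsigned char)(in[i] ^ 0x5a);
            for (int i = 0; i < len && i < 256; ++i)
                sum += buf_a[i];
        } else {
            unsigned char buf_b[256];              // live only in this block
            for (int i = 0; i < len && i < 256; ++i)
                buf_b[i] = (unsigned char)(in[i] + 1);
            for (int i = 0; i < len && i < 256; ++i)
                sum += buf_b[i];
        }
        return sum;
    }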
Unfortunately none of this helps me find an actual solution. The
library is too portable, and too complex, to hand-code in assembly for
three different architectures. So for now we are shopping around for
different compilers. I am trying the Intel compiler tomorrow to see
what its output is like, but even if that works out, the PowerPC build
still has the same problem. Unless anybody knows of an alternative
PowerPC-capable compiler that runs on Mac OS X, our PowerPC users
might have to live with a much more limited feature set (I don't think
I can convince anybody to live with a "rare but reproducible panic").
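The one source-level workaround I can think of in the meantime, sketched
below with made-up names, is to split each heavy block of the big function
into its own helper marked with GCC's noinline attribute, so its
temporaries live in a short-lived callee frame instead of all being folded
into one giant frame:

    #define NOINLINE __attribute__((noinline))

    // Sketch only (names are made up): each heavy case gets its own helper,
    // and noinline keeps gcc from folding it back into the caller's frame.
    static NOINLINE void filter_block(unsigned char *data, int len)
    {
        unsigned char scratch[128];                // confined to this frame
        for (int i = 0; i < len && i < 128; ++i)
            scratch[i] = (unsigned char)(data[i] >> 1);
        for (int i = 0; i < len && i < 128; ++i)
            data[i] = scratch[i];
    }

    void big_compute(unsigned char *data, int len, int op)
    {
        switch (op) {
        case 0:
            filter_block(data, len);
            break;
        /* ...other cases would call their own NOINLINE helpers... */
        default:
            break;
        }
    }

That costs a few extra call/return pairs in compute-bound code, though, so
I'd much rather find a compiler-level fix.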
Thanks,
Ryan
A couple weeks ago I asked about a machine-check panic I was
getting. As it turns out, my suspicions were right about stack
corruption. I disassembled a function in our kext and saw that, for
some reason, GCC's function prologue was allocating around 1660
bytes of stack (in the release build; debug was slightly better at
1140 bytes). On other platforms compiled with GCC (FreeBSD and many
different Linux distros) the stack usage is near 110 bytes, but for
some reason gcc on Mac OS X is allocating almost 4K.
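For anyone who wants to check their own binary, the frame size is visible
right in the prologue of the disassembly; the path, symbol, and numbers
below are made up, but it looks roughly like this:

    # Disassemble the kext binary (path is made up):
    otool -tv MyKext.kext/Contents/MacOS/MyKext
    #
    # ...then find the function; the i386 prologue looks like:
    #     pushl  %ebp
    #     movl   %esp,%ebp
    #     subl   $0x680,%esp        <- 0x680 = 1664 bytes of frame
    #
    # On PowerPC the frame size is the offset in the stwu at the top:
    #     stwu   r1,-0x680(r1)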
We are currently using -O3 because the code is pretty compute-bound,
and on other platforms -O3 gives a nice 5% boost over -O2. But
changing it to -O2 doesn't even help; we have to go all the way down
to -O1 to get a usable frame of about 400 bytes (still 4x larger than
our Linux driver's).
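One stop-gap might be to leave the rest of the project at -O3 and compile
only the file containing the huge function at -O1, something like this
(the file name is made up):

    # Hypothetical stop-gap: only the offending translation unit drops to
    # -O1; everything else in the kext keeps building at -O3.
    g++ -O1 -c BigFunction.cpp -o BigFunction.o

but that still leaves a frame several times bigger than it should be.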
The code is vanilla C++ without anything fancy; no virtual functions,
even. There are no warnings about temporaries being used, so I have
no clue what is causing the stack usage. It's a huge function with a
lot of switch statements and for loops, but not a lot of function
calls; mostly it just computes on arrays of data. My best guess is
that GCC is keeping intermediate operations and temporary results on
the stack as part of its optimizations.
Anybody have ideas on how to show where GCC is allocating things in
the frame, and how to reduce the stack usage? It's hard to distill
this to a single issue because the function is so large, but I am
tempted to file a bug since GCC 4 optimizes things quite nicely on
other platforms.
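The closest thing to a map of the frame that I know of is gcc's annotated
assembly output; the file name below is made up, but something like this
shows which locals and temporaries end up at which offsets:

    # Dump annotated assembly for just the one file (file name is made up):
    g++ -O3 -S -fverbose-asm BigFunction.cpp -o BigFunction.s
    # The immediate subtracted from %esp in the prologue is the frame size,
    # and the -fverbose-asm comments label each memory operand with the
    # source variable or temporary it belongs to.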