To Answer My Own Question (was Re: Odd GCC Behavior)
On Jun 30, 2008 at 4:05 PM, Andy Wiese wrote:

> Thank you Brian. That's the most useful thing I've learned from this thread!

I'm not sure I can beat using -Wall, but after the lukewarm list response I figured I'd just dive into the GCC source and find out for myself exactly what was going on.

The short answer: this behavior is specific to Apple's GCC 4.0.1 build 5465 (well, some build after 5370; I don't know exactly when the change was made because Apple doesn't allow access to their GCC source control repository, just the current version). It is not specific to the PPC branch as I originally thought (the Intel compiler I had was just an older build). And the cause of it may actually impact the performance of code that is completely defined by the C specification.

The long answer follows:

Here's the code responsible for the behavior, copied from gimplify_modify_expr() in gcc/gimplify.c (see: http://www.opensource.apple.com/darwinsource/10.5/gcc-5465/gcc/gimplify.c ):

    /* APPLE LOCAL begin 4228828 */
    /* For stores to a pointer, keep computation of the address close to the
       pointer.  Later optimizations should do this; the 'sink' pass in 4.2
       does it, but that's not in 4.0.  Temporary.  */
    if (TREE_CODE (*to_p) != INDIRECT_REF)
      {
        ret = gimplify_expr (to_p, pre_p, post_p, is_gimple_lvalue, fb_lvalue);
        if (ret == GS_ERROR)
          return ret;
      }
    /* APPLE LOCAL end 4228828 */

    ret = gimplify_expr (from_p, pre_p, post_p,
                         rhs_predicate_for (*to_p), fb_rvalue);
    if (ret == GS_ERROR)
      return ret;

    /* Now see if the above changed *from_p to something we handle
       specially.  */
    ret = gimplify_modify_expr_rhs (expr_p, from_p, to_p, pre_p, post_p,
                                    want_value);
    if (ret != GS_UNHANDLED)
      return ret;

    /* APPLE LOCAL begin 4228828 */
    if (TREE_CODE (*to_p) == INDIRECT_REF)
      {
        ret = gimplify_expr (to_p, pre_p, post_p, is_gimple_lvalue, fb_lvalue);
        if (ret == GS_ERROR)
          return ret;
      }
    /* APPLE LOCAL end 4228828 */

gimplify_modify_expr() is the routine responsible for converting the language-dependent syntax tree for an assignment operation into the language-independent GIMPLE tree. (Side note: you can view the GIMPLE using the -fdump-tree-gimple option to GCC. This was very helpful when trying to understand exactly why this was happening.)

As you can see, Apple inserted a special check for whether the left-hand side (called to_p) is an INDIRECT_REF (i.e., a pointer dereference). If it is, then the right-hand side is gimplified before the left; otherwise (as is always the case in stock gcc) the left-hand side is gimplified first.

In general, gimplification tends to proceed from left to right, so why did Apple change it here? The intent, AFAICT, is to make register assignment a bit simpler (especially on a particular register-starved architecture) by ensuring that the storage location of the LHS does not need to be stored to memory while the RHS is being computed. The "sink" pass referred to in the comment is applied after the SSA transformation (which comes after gimplification), and like the other optimization passes it won't (or at least isn't supposed to) change the observable effects of its input. So mainline gcc, which does not include this trickery, computes the l-value first because the GIMPLE says to, and none of the other optimizations will violate that ordering (because of the effect of the embedded assignment).
A noteworthy bit about all this is that the INDIRECT_REF tree code does not cover array subscripts (unless, I believe, the array being subscripted is a formal parameter of the current subroutine, in which case all operations on it are converted to pointer ops by the C front end).

Here comes the piece that may have a performance impact. The effect is that the l-value on the LHS of an assignment to an array element is likely to be computed before the r-value to be assigned. This in turn may force that l-value to be written to memory during the computation of the RHS and then read back in when the assignment is ready to proceed. I have not finished experimenting with the optimizer to try to force this behavior (my test programs have largely been trivial enough that the other optimizers have essentially wiped out the entire computation), but if the code above is the only solution to this problem, then it is definitely missing array element updates.

I'm glad I took the time to track this down; understanding why things happen the way they do is always very rewarding. I had been meaning to tackle the GCC source code for some time, but had been somewhat daunted and didn't have any good questions to ask it.

That said, this was clearly the wrong list to ask the question I did. I was looking for more of a gcc-devel list, but had an inkling this was an Apple-only phenomenon (and I was right), and the mainline folks tend not to be thrilled about supporting Apple's bizarre forks.

-J. Aaron Pendergrass
participants (1)
-
J. Aaron Pendergrass