On Oct 18, 2005, at 15:57, Robert Dell wrote:
Let's see, the whole thing SHOULD take 17 PPC machine code instructions if the compiler is doing its job properly. At a 1 GHz processor speed, it'll take the CPU 0.000000017 seconds to perform that single line.
OK, but doesn't that assume every instruction takes one cycle? Maybe the instructions you have in mind all do take one cycle, or maybe every instruction on the PPC does; I don't know for certain, I'm just wondering.
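To make the arithmetic explicit, here is a rough sketch in C. The function names and the average cycles-per-instruction parameter are my own inventions for illustration, not measured PPC figures; the point is only that the 17 ns number depends entirely on the one-cycle assumption.

```c
#include <stdio.h>

/* Rough estimate: seconds = instructions * cycles per instruction / clock rate.
 * avg_cpi is an ASSUMED average, not a measured figure for any real PPC. */
double estimate_seconds(unsigned long instructions, double avg_cpi, double clock_hz)
{
    return (double)instructions * avg_cpi / clock_hz;
}

/* Print the estimate in nanoseconds for a given assumed CPI. */
void print_estimate(unsigned long instructions, double avg_cpi, double clock_hz)
{
    printf("%lu instructions at CPI %.1f: %.1f ns\n",
           instructions, avg_cpi,
           estimate_seconds(instructions, avg_cpi, clock_hz) * 1e9);
}
```

Calling estimate_seconds(17, 1.0, 1e9) reproduces the 0.000000017 s figure; doubling the assumed CPI doubles the estimate, and none of this accounts for cache misses or pipeline stalls.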
How then is performing functions going to make it run any faster? People who didn't grow up with 48k RAM, 1 MHz machines, where you had to hand-code for streamlined speed, don't know how to optimize properly. Functions do NOT increase speed, they REDUCE the speed.
You are correct, of course, but you are missing some points:
If speed is so vital, why don't you inline your functions? Or, if that doesn't work, you could <horror!> use preprocessor macros! It makes the code much more readable; isn't that worth something?
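Of course, you can also use Build->Preprocess to check that your macros expand the way you expect. To see whether inlining is actually happening, though, you'd have to look at the generated assembly, since the preprocessor output only shows macro expansion.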
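For what it's worth, here is a minimal sketch of the two options. The clamp helper is a made-up example of mine, not anything from Robert's code, and whether the compiler actually inlines the function depends on the compiler and the optimization settings.

```c
/* Hypothetical helper: clamp x into the range [lo, hi].
 * "static inline" is only a request; the compiler may still emit a call. */
static inline int clamp_int(int x, int lo, int hi)
{
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* Macro version: never a call, but each argument may be evaluated
 * more than once, so avoid passing expressions with side effects. */
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))
```

Both give the same answers for plain values, e.g. clamp_int(5, 0, 3) and CLAMP(5, 0, 3) are both 3. The inline function keeps type checking and single evaluation of its arguments, which is why I'd try it before reaching for the macro.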
I also remember reading that the PPC has a fairly impressive branch prediction unit, along with its own rather large cache. In fact, as I recall, the G5 has TWO branch prediction units, a large cache (I don't remember... 32k? 16k?), and some internal logic that chooses the prediction of whichever unit has been getting the highest accuracy. I also expect that the branch predictor is pretty damn accurate when it comes to function calls (compared to conditionals, e.g.), since a call site goes to the same function every time. I admit I don't know what the invocation overhead is to create a function's stack frame.
I also wrote some trivial code not too long ago with a goto statement. My friends laughed at me and told me I was saving nothing. Trying to prove them wrong, I re-wrote the function in a more structured fashion. The results favoured me, but only by about a hundredth of a second for extremely large data sets, and I was hard-pressed to show that the difference was statistically significant. However, the code with the goto was much more readable in C form (I thought) and generated assembly about two-thirds the size of the structured code.
Thus, I do see where you're coming from, Robert, but modern processors occasionally do some incredibly impressive things when we're not looking. You may well be right in your assertion, but I would encourage you to test it out empirically. You might end up pleasantly surprised.
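If you want a starting point for that kind of test, here is a small timing sketch using the standard clock() function. The step function and both loops are invented stand-ins for real work, so treat the shape of the test, not the numbers, as the useful part.

```c
#include <time.h>

/* Hypothetical tiny function standing in for "the work". */
unsigned long step(unsigned long acc, unsigned long i)
{
    return acc + (i & 3UL);
}

/* The sum computed through one function call per iteration. */
unsigned long sum_with_calls(unsigned long n)
{
    unsigned long acc = 0, i;
    for (i = 0; i < n; i++)
        acc = step(acc, i);
    return acc;
}

/* The same sum with the body written out by hand. */
unsigned long sum_by_hand(unsigned long n)
{
    unsigned long acc = 0, i;
    for (i = 0; i < n; i++)
        acc += (i & 3UL);
    return acc;
}

/* Run fn(n) once, store its result, and return the CPU time used in seconds. */
double time_run(unsigned long (*fn)(unsigned long), unsigned long n,
                unsigned long *result)
{
    clock_t start = clock();
    *result = fn(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_run on both versions with a large n, repeat a few times, and compare. Try it at more than one optimization level: with optimization on, the compiler may well inline step, or simplify both loops, in which case the two timings tell you about the compiler rather than about call overhead.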
As a related story, a number of years ago I worked with some code from an extremely popular game designer who must remain nameless because I think I signed an NDA. The code was for the AI of one of their bestselling flagship games. We were evaluating a subset of the AI. In order to "speed things up", the AI had been written as about 30 pages of code that had evolved from something that must have made sense at some point. There were few functions, lots of redundant bits, and lots of gotos. As an experiment we re-wrote the code, removing the redundancies and gotos and replacing them with functions. This ended up increasing the speed of the code by around (I think) 10%.
Ok, that's my $0.02!
Markian
----
When arguing with an idiot, be sure they aren't doing the same.