Mailing Lists: Apple Mailing Lists


Re: float to int (kinda OT)



> > Personally, I've always wanted to see a compiler which would do smart  
> > things with the eight condition registers that each PowerPC has at its  
> > disposal. Right now everybody just uses cr0 for everything. (I think  
> > I've seen some rare instances where gcc will use cr7 as well.)  
> > Originally the PowerPC was going to minimize on branching by allowing  
> > you to do things like:
> > cmplwi cr1, something
> > cmplwi cr2, something
> > cmplwi cr3, something
> > cmplwi cr4, something
> > cror/crand something
> > bc something
> >
> > That way you can stack a lot of comparisons together by or'ing and  
> > and'ing the condition bits. But no compiler actually generates it.  

XLC and XLF do. It's not a sure win to apply this optimization across a long
stretch of conditionals, though. If you use CR logic as above, every compare
executes no matter what, so all paths end up the same length/time. With
individual compare/branch pairs, you have an opportunity to speed up the
common case. (You are putting your most likely condition first, right? :)

Here's a test case that checks 3 conditions. XLC groups the first two
together, but it leaves the third condition on its own:

int foo(int a, int b, int c) {
        if (a<0 || b<0 || c<0) return 1;
        return 0;
}

00000000        cmpwi   r4,0x0
00000004        cmpwi   cr1,r3,0x0
00000008        cror    2,4,0
0000000c        beq     0x20
00000010        li      r3,0x0
00000014        cmpwi   r5,0x0
00000018        blt     0x20
0000001c        blr
00000020        li      r3,0x1
00000024        blr
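
For readers who don't speak PowerPC, here is a rough C rendering of the control flow in the listing above. It is only a sketch: the name foo_grouped is made up for illustration, and the comments map each statement to the instructions as I read them (cror 2,4,0 sets cr0's EQ bit to cr1's LT bit OR'ed with cr0's LT bit).

```c
#include <assert.h>

/* Hypothetical name; a hand-written mirror of the XLC output, not compiler output. */
int foo_grouped(int a, int b, int c) {
    /* cmpwi r4,0 / cmpwi cr1,r3,0 / cror 2,4,0:
       both compares always run, and their "less than" bits are OR'ed. */
    int first_two = (a < 0) | (b < 0);

    if (first_two) return 1;   /* beq 0x20 */
    if (c < 0) return 1;       /* cmpwi r5,0 / blt 0x20 */
    return 0;                  /* li r3,0 / blr */
}

/* Small self-check against the original short-circuit form. */
void check_foo_grouped(void) {
    assert(foo_grouped(-1, 5, 5) == ((-1 < 0) || (5 < 0) || (5 < 0)));
    assert(foo_grouped(5, 5, 5) == 0);
}
```

The non-short-circuit | on the first two tests is what corresponds to the CR logic; the third test stays a separate compare/branch.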

Note that you can 'optimize' this function by converting the logical ops into
arithmetic ops, which removes the branches entirely. (I didn't test the speed
of this code, so I'm not sure that it is actually faster.)

int foo(int a, int b, int c) {
        unsigned int boolSum = (a<0) + (b<0) + (c<0) ;
        if (boolSum>0) return 1;
        else return 0;
}

00000040        srwi    r0,r3,31
00000044        srwi    r3,r4,31
00000048        srwi    r2,r5,31
0000004c        add     r2,r3,r2
00000050        add     r0,r0,r2
00000054        addic   r2,r0,0xffff
00000058        subfe   r3,r2,r0
0000005c        blr
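
The addic/subfe pair at the end of that listing is the least obvious part, so here is a sketch of it in C. Assumptions: int is 32 bits, the function names are invented for illustration, and the carry (CA) bit is modelled by doing the add in 64 bits. The 0xffff shown by the disassembler is a sign-extended -1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "addic r2,r0,-1 ; subfe r3,r2,r0".
   Returns 1 if x is nonzero, else 0, with no branch. */
uint32_t nonzero_via_carry(uint32_t x) {
    uint64_t wide = (uint64_t)x + 0xFFFFFFFFu;   /* addic: x + (-1)            */
    uint32_t r2 = (uint32_t)wide;                /* low 32 bits of the sum     */
    uint32_t ca = (uint32_t)(wide >> 32);        /* CA = 1 exactly when x != 0 */
    return (uint32_t)(~r2 + x + ca);             /* subfe: ~r2 + r0 + CA == CA */
}

/* Hypothetical name; assumes 32-bit int so that >> 31 isolates the sign bit. */
int foo_branchless(int a, int b, int c) {
    uint32_t sum = ((uint32_t)a >> 31)           /* srwi r0,r3,31 */
                 + ((uint32_t)b >> 31)           /* srwi r3,r4,31 */
                 + ((uint32_t)c >> 31);          /* srwi r2,r5,31 */
    return (int)nonzero_via_carry(sum);
}

/* Small self-check of the carry trick at the edges. */
void check_nonzero_via_carry(void) {
    assert(nonzero_via_carry(0u) == 0u);
    assert(nonzero_via_carry(0xFFFFFFFFu) == 1u);
}
```

Since ~r2 + r0 is r0 - (r0 - 1) - 1 = 0, the subfe result is just the carry bit, which is how the compiler gets (boolSum > 0) without a compare.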


--Sanjay


		
PerfOptimization-dev mailing list      (email@hidden)

Copyright © 2007 Apple Inc. All rights reserved.