XCode 2.2.1 / gcc 4.0 Peephole Bug
- Subject: XCode 2.2.1 / gcc 4.0 Peephole Bug
- From: Ben Weiss <email@hidden>
- Date: Wed, 19 Apr 2006 01:09:11 -0700
Given the Altivec function:
vector unsigned short peepholebug(vector unsigned short a,
                                  vector unsigned short b) {
    vector unsigned short mask = (vector unsigned short)vec_cmplt(a, b);
    if (vec_all_ge(a, b)) return a;
    return mask;
}
XCode 2.2.1 / gcc 4.0 generates (with the optimizer set to -Os):
mfspr r0,256
stw r0,-8(r1)
oris r0,r0,0x8000
mtspr 256,r0
vcmpgtuh. v0,v3,v2
vcmpgtuh v0,v3,v2
beq cr6,L99
vor v2,v0,v0
lwz r12,-8(r1)
mtspr 256,r12
blr
Note the second "vcmpgtuh" instruction, which is completely
superfluous. The peephole optimizer should recognize this situation
and remove the instruction. (I've filed a bug with Apple; #4519214.)
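To spell out why the second compare is redundant, here is a rough scalar model in plain C (not Altivec code). It assumes the usual record-form behavior: "vcmpgtuh." writes the per-element mask to its destination register and also summarizes the result in CR6 as "all elements true" / "all elements false". The names vu16, cmp_result, cmpgt_u16_record and peepholebug_model are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 8

/* Hypothetical stand-in for "vector unsigned short" (8 x 16-bit lanes). */
typedef struct { uint16_t lane[LANES]; } vu16;

/* Models what a single record-form compare is assumed to produce:
   the mask register plus the two CR6 summary bits. */
typedef struct {
    vu16 mask;       /* 0xFFFF where x > y, 0 elsewhere */
    bool all_true;   /* every lane compared true        */
    bool all_false;  /* no lane compared true           */
} cmp_result;

/* Scalar model of "vcmpgtuh. vD,x,y": one pass yields mask and flags. */
static cmp_result cmpgt_u16_record(vu16 x, vu16 y) {
    cmp_result r;
    r.all_true = true;
    r.all_false = true;
    for (int i = 0; i < LANES; i++) {
        bool gt = x.lane[i] > y.lane[i];
        r.mask.lane[i] = gt ? 0xFFFF : 0x0000;
        if (gt) r.all_false = false;
        else    r.all_true = false;
    }
    return r;
}

/* Model of peepholebug(): vec_cmplt(a, b) is the mask of (b > a), and
   vec_all_ge(a, b) holds exactly when no lane has b > a, so both come
   from the same compare. */
static vu16 peepholebug_model(vu16 a, vu16 b) {
    cmp_result r = cmpgt_u16_record(b, a);  /* the one compare needed */
    if (r.all_false) return a;              /* corresponds to beq cr6 */
    return r.mask;                          /* corresponds to vor v2,v0,v0 */
}
```

In this model the mask and the branch condition both fall out of one call to cmpgt_u16_record, which is the situation in the listing: v0 from the dotted compare already holds what the undotted compare recomputes.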
Anyone know if more recent versions of gcc are able to do this? I
have some bottleneck code that could seriously benefit from this, and
I'd rather avoid assembly if I can...
Thanks,
Ben