Mailing Lists: Apple Mailing Lists


Re: Shark (was altivec/velocity)




On Sep 16, 2004, at 10:47 AM, Sanjay Patel wrote:


And no -O !!!! I think the configure script says that if I use my own
CFLAGS it doesn't set -O2. Well, back to building!

Here is the asm (followed by the C code):
	0x24c83c	lwz      r2,64(r30)	3:1	Stall=1	aset.c:544
	0x24c840	slwi     r0,r9,2	2:1	Stall=1	aset.c:544	
	0x24c844	add      r2,r0,r2	2:1	Stall=1	aset.c:544	
	0x24c848	lwz      r0,28(r2)	3:1	Stall=2	aset.c:544	
	0x24c84c	stw      r0,72(r30)	3:1		aset.c:544	
	0x24c850	lwz      r0,72(r30)	3:1	!Stall=2, Loop start[1], Unroll,

The last load above will also cause LSU rejects on the G5 (because the previous
store is to the same address). Well, this is what truly bad (but very
debuggable) -O0 codegen looks like.
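
The store-then-reload pattern described above can be reproduced on purpose, even in an optimized build, by keeping a value in a volatile local. The sketch below is only an illustration: the function name `sum_via_memory` is hypothetical and not from aset.c, and whether a particular CPU penalizes the reload depends on its load/store unit.

```c
/* Sum n ints while keeping the running total in a volatile local.
 * volatile forces a real store and a real reload on every iteration,
 * roughly the stw/lwz pair to the same stack slot seen in -O0 output. */
int sum_via_memory(const int *v, int n)
{
    volatile int acc = 0;       /* must live in memory, not only in a register */
    int i;

    for (i = 0; i < n; i++)
        acc = acc + v[i];       /* reload acc, add, store acc back */

    return acc;                 /* one final reload */
}
```

Dropping the volatile qualifier and building with -O2 lets the compiler keep the total in a register, which is the same change that removes the back-to-back store and load from the listing above.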



With -O2 things get a lot uglier, but the LSU rejects go away:

5.8%	0x1a70b0	slwi     r8,r11,2	2:1		aset.c:544
	0x1a70b4	li       r9,0	2:1		aset.c:543
	0x1a70b8	add      r2,r8,r26	2:1	Stall=1	aset.c:544
	0x1a70bc	lwz      r10,28(r2)	3:1	Stall=2	aset.c:544
6.3%	0x1a70c0	cmpwi    cr6,r10,0	3:1	Stall=2	aset.c:544
	0x1a70c4	beq      cr6,$+84 <AllocSetAlloc + 412>	2:1		aset.c:544
2.6%	0x1a70c8	lwz      r0,4(r10)	3:1	!Stall=2, Loop start[2], Unroll, Unaligned loop start	aset.c:546
	0x1a70cc	cmplw    cr7,r0,r25	3:1	Stall=2	aset.c:546
	0x1a70d0	bge      cr7,$+24 <AllocSetAlloc + 364>	2:1		aset.c:546
	0x1a70d4	mr       r9,r10	2:1		aset.c:548
	0x1a70d8	lwz      r10,0(r10)	3:1	Stall=2	aset.c:544
	0x1a70dc	cmpwi    cr6,r10,0	3:1	Stall=2	aset.c:544
	0x1a70e0	bne      cr6,$-24 <AllocSetAlloc + 332>	2:1	Loop end[2]	aset.c:544
	0x1a70e4	b        $+52 <AllocSetAlloc + 412>	2:1		aset.c:544
6.8%	0x1a70e8	beq      cr6,$+48 <AllocSetAlloc + 412>	2:1		aset.c:555

This is the C code it corresponds to (again):
12.1%	544	for (chunk = set->freelist[fidx]; chunk; chunk = (AllocChunk) chunk->aset)
	545	{
2.6%	546	    if (chunk->size >= size)    !Unroll, Unaligned loop start
	547	        break;
	548	    priorfree = chunk;
	549	}
	550
	551	/*
	552	 * If one is found, remove it from the free list, make it again a
	553	 * member of the alloc set and return its data address.
	554	 */
6.8%	555	if (chunk != NULL)
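
For anyone who wants to experiment with this loop outside the full source tree, here is a minimal stand-alone sketch of the same free-list walk. The `Chunk` type and the `find_chunk` name are assumptions made for illustration, not the real aset.c definitions. The field order (link first, size second) is chosen only to mirror the 0(r10) and 4(r10) loads in the listing.

```c
#include <stddef.h>

/* Hypothetical stand-in for the chunk header: a link field followed by
 * a size field. In the profiled code the link is stored in chunk->aset
 * while the chunk sits on a free list. */
typedef struct Chunk {
    struct Chunk *next;
    size_t        size;
} Chunk;

/* Walk one free list and return the first chunk whose size is at least
 * `wanted`, or NULL if none fits. *prior receives the chunk visited just
 * before the result: NULL if the head matched (or the list is empty),
 * the last chunk if nothing fit. */
Chunk *find_chunk(Chunk *head, size_t wanted, Chunk **prior)
{
    Chunk *chunk;

    *prior = NULL;
    for (chunk = head; chunk != NULL; chunk = chunk->next) {
        if (chunk->size >= wanted)
            break;
        *prior = chunk;
    }
    return chunk;
}
```

Each iteration has to load `chunk->next` before it can load the next chunk's size, so the loads form a dependent chain. That would be consistent with the Stall=2 marks on the lwz/cmp pairs in the -O2 listing.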


Marc


