Re: Gcc 333 and altivec optimizations

Subject: Re: Gcc 333 and altivec optimizations
From: Marc Van Olmen <email@hidden>
Date: Wed, 14 Apr 2004 13:00:55 -0400

Hi,

Thanks for your reply, it is really appreciated. Thanks for pointing me to this
doc; I can see how to add inline assembly now. Thx!

I added -mtune=G4 -fast to my release build.

To answer your question, I was expecting this:

  vmhraddshs    v11,v0,v6,v9
  vmhraddshs    v12,v0,v8,v9
  vmhraddshs    v13,v0,v7,v9


I expected that because my code is in a loop: c0 is a vector initialized outside
the loop, and tr0, r0, r1 and r2 are the results of previous vector operations
just above.
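
For reference, the per-element arithmetic behind vec_mradds (the vmhraddshs instruction) can be modelled in plain C. The sketch below is a scalar illustration only, not the code from this thread; the function names sat16 and mradds_lane are hypothetical, and it assumes the compiler implements >> on negative signed values as an arithmetic shift, as GCC does.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate to the signed 16-bit range (the "s" in vmhraddshs). */
int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Scalar model of one 16-bit lane of vec_mradds(a, b, c):
 * multiply, add the rounding constant 0x4000, keep the high part
 * of the product (shift right by 15), add c, then saturate.
 * Assumption: >> on a negative int32_t is an arithmetic shift. */
int16_t mradds_lane(int16_t a, int16_t b, int16_t c)
{
    int32_t prod = (int32_t)a * (int32_t)b + 0x4000;
    return sat16((prod >> 15) + (int32_t)c);
}
```

The real instruction applies this to all eight halfword lanes of its three vector operands at once, which is why a single vmhraddshs per statement is the ideal output once the operands are already in registers.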

But:

I added the register keyword to my variables, and now the code takes about
16 milliseconds to execute in release mode versus 25 milliseconds in debug
mode. So some optimization is being done.

Question:
One question still remains: how can I see the generated assembler code and use
it as a basis for optimizing?
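
One common way to do this with GCC is the -S option, which stops after compilation and writes the assembly to a file; -save-temps keeps the intermediate .s file from a normal build. The sketch below is a minimal illustration under assumptions: the file name kernel.c and the function sum_scaled are hypothetical, and any AltiVec-enabling flags needed by a real project are not shown because they depend on the toolchain.

```c
/* kernel.c (hypothetical file name)
 *
 * Write assembly instead of an object file:
 *     gcc -O2 -S kernel.c -o kernel.s
 *
 * Or build normally and keep the intermediate assembly file:
 *     gcc -O2 -save-temps -c kernel.c
 */
#include <stddef.h>

/* A small loop whose generated code is easy to locate in kernel.s
 * by searching for the function name. */
int sum_scaled(const short *src, size_t n, int scale)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; ++i)
        total += src[i] * scale;
    return total;
}
```

Comparing the .s output of the debug and release flag sets side by side shows which loads the optimizer has hoisted out of the loop.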

thx

mvo



> On Apr 13, 2004, at 10:39 PM, Marc Van Olmen wrote:
>
>> Hi
>>
>> I have some altivec code:
>>
>>
>>             t0 = vec_mradds( tr0, r0, c0 );
>>             t1 = vec_mradds( tr0, r1, c0 );
>>             t2 = vec_mradds( tr0, r2, c0 );
>>
>> It is generating the following assembler code:
>>
>> 0x0000b53c  <+0944>  addi    r2,r30,160
>> 0x0000b540  <+0948>  lvx    v12,r0,r2
>> 0x0000b544  <+0952>  addi    r2,r30,304
>> 0x0000b548  <+0956>  lvx    v9,r0,r2
>> 0x0000b54c  <+0960>  vmhraddshs    v11,v0,v12,v9
>> 0x0000b550  <+0964>  addi    r2,r30,176
>> 0x0000b554  <+0968>  lvx    v9,r0,r2
>> 0x0000b558  <+0972>  addi    r2,r30,304
>> 0x0000b55c  <+0976>  lvx    v6,r0,r2
>> 0x0000b560  <+0980>  vmhraddshs    v12,v0,v9,v6
>> 0x0000b564  <+0984>  addi    r2,r30,192
>> 0x0000b568  <+0988>  lvx    v6,r0,r2
>> 0x0000b56c  <+0992>  addi    r2,r30,304
>> 0x0000b570  <+0996>  lvx    v5,r0,r2
>> 0x0000b574  <+1000>  vmhraddshs    v9,v0,v6,v5
>>
>>
>> To me this looks lousy because there is no need for all of this extra
>> code...
>
> What extra instructions do you see? (It is not clear what tr0, r0, c0 and
> t0 are... are they stack vars?, etc.) The only extra stuff I see is the two
> additional addi r2,r30,304 instructions, which could be avoided if one
> wanted to utilize an extra register to cache the calculated value and the
> related caching of the value loaded.
>
> It is not clear what optimizer options or tune option you are using ...
> if it is the debug build then likely no optimization is taking place. So
> the compiler is being literal in what it is doing when working with your
> code.
>
> Anyway, assuming you have the developer tools installed (Xcode 1.1), then
> the GCC3 release notes have some helpful information, as does the
> supplied GCC documentation (which also covers how to write inline assembly).
>
> <file:///Developer/Documentation/ReleaseNotes/DeveloperTools/GCC3.html>
>
> -Shawn
_______________________________________________
xcode-users mailing list | email@hidden