Re: Should I "manually unfold" the "LoadUnaligned" function?




On Nov 1, 2005, at 3:04 AM, Rustam Muginov wrote:

Hello all.
I am using LoadUnaligned function described here:
http://developer.apple.com/hardware/ve/alignment.html
The function is pretty simple:


static vector unsigned char LoadUnaligned( unsigned char *target )
{
  vector unsigned char MSQ, LSQ, result;
  vector unsigned char mask;

  MSQ = vec_ld(0, target); // most significant quadword
  LSQ = vec_ld(15, target); // least significant quadword

  mask = vec_lvsl(0, target); // create the permute mask
  return vec_perm(MSQ, LSQ, mask); // align the data
}


When iterating through the data in 128-bit chunks, the third instruction in the function looks loop-invariant to me: it always creates the same permute mask when the loads are at offsets of 16 bytes * n.


Should I manually "unfold" this function in the loop, i.e. create the permute mask

  mask = vec_lvsl(0, target);

before the loop starts, and execute only three instructions inside the loop:

  MSQ = vec_ld(0, target);
  LSQ = vec_ld(15, target);
  return vec_perm(MSQ, LSQ, mask);


Or is it enough to declare this function as inline, so that the compiler hoists the invariant out of the loop by itself?
I am interested in the behaviour of both gcc 3.3 and 4.0.

Last time I checked, neither version of gcc had the logic to remove the "lvsl" instruction from the loop, since your "target" is not loop-invariant (most likely only its low 4 bits are invariant).
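
To make the "low 4 bits" point concrete, here is a minimal scalar sketch in portable C. It does not use AltiVec at all: vec16, model_lvsl, model_ld, model_perm, mask_is_invariant and copy_hoisted are hypothetical names that only model what vec_lvsl, vec_ld and vec_perm do, with buffer offsets standing in for addresses (assuming the buffer base is 16-byte aligned). It suggests that the mask can be computed once by hand when the pointer advances by 16 bytes per iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-in for a 16-byte AltiVec register. */
typedef struct { uint8_t b[16]; } vec16;

/* Models vec_lvsl(0, p): bytes sh, sh+1, ..., sh+15, where sh = address & 15. */
static vec16 model_lvsl(size_t addr)
{
    vec16 m;
    for (int i = 0; i < 16; i++)
        m.b[i] = (uint8_t)((addr & 15u) + (size_t)i);
    return m;
}

/* Models vec_ld: the low 4 address bits are ignored, so the load is aligned. */
static vec16 model_ld(const uint8_t *mem, size_t addr)
{
    vec16 v;
    memcpy(v.b, mem + (addr & ~(size_t)15), 16);
    return v;
}

/* Models vec_perm: each mask byte indexes the 32-byte concatenation a:b. */
static vec16 model_perm(vec16 a, vec16 b, vec16 mask)
{
    vec16 r;
    for (int i = 0; i < 16; i++) {
        unsigned idx = mask.b[i] & 31u;
        r.b[i] = (idx < 16) ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Returns 1 if the mask at addr + 16*k equals the mask at addr for all k < n. */
static int mask_is_invariant(size_t addr, size_t n)
{
    vec16 first = model_lvsl(addr);
    for (size_t k = 0; k < n; k++) {
        vec16 m = model_lvsl(addr + 16 * k);
        if (memcmp(m.b, first.b, 16) != 0)
            return 0;
    }
    return 1;
}

/* Copies nvec unaligned 16-byte chunks; the mask is computed once, before the loop. */
static void copy_hoisted(const uint8_t *mem, size_t start, size_t nvec, uint8_t *out)
{
    assert(mem != NULL && out != NULL);
    const vec16 mask = model_lvsl(start);          /* hoisted by hand */
    for (size_t k = 0; k < nvec; k++) {
        size_t addr = start + 16 * k;
        vec16 msq = model_ld(mem, addr);           /* most significant quadword */
        vec16 lsq = model_ld(mem, addr + 15);      /* least significant quadword */
        vec16 r = model_perm(msq, lsq, mask);
        memcpy(out + 16 * k, r.b, 16);
    }
}
```

In this model, copy_hoisted produces the same bytes as a plain copy starting at the unaligned offset, so hoisting the mask by hand does not change the result as long as the stride is a multiple of 16 bytes.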


Take a look at this page:
http://developer.apple.com/hardware/ve/code_optimization.html

In particular, see the section titled "Efficient Safe Unaligned Vector Loads."

Most likely, you'll want to use a loop like the one shown that starts with:
fixAlignment = vec_add( vec_lvsl( 15, ptr ), vec_splat_u8(1) );
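
The linked page is not reproduced here, so the following is only a scalar sketch of what that mask evaluates to and of one plausible loop shape around it, not the page's actual code. The names vec16, model_fix_alignment, model_ld, model_perm and copy_pipelined are hypothetical, and buffer offsets stand in for addresses (assuming a 16-byte-aligned buffer base). The assumption is that each iteration issues a single new load at offset 15 and reuses the previous LSQ as the next MSQ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-in for a 16-byte AltiVec register. */
typedef struct { uint8_t b[16]; } vec16;

/* Models vec_add(vec_lvsl(15, ptr), vec_splat_u8(1)).
 * For a misaligned ptr (sh = addr & 15 != 0) this gives sh .. sh+15,
 * for an aligned ptr it gives 16 .. 31, i.e. "take everything from LSQ". */
static vec16 model_fix_alignment(size_t addr)
{
    vec16 m;
    size_t sh = (addr + 15u) & 15u;                /* start value of lvsl(15, ptr) */
    for (int i = 0; i < 16; i++)
        m.b[i] = (uint8_t)(sh + (size_t)i + 1u);   /* add the splatted 1 */
    return m;
}

/* Models vec_ld: the low 4 address bits are ignored, so the load is aligned. */
static vec16 model_ld(const uint8_t *mem, size_t addr)
{
    vec16 v;
    memcpy(v.b, mem + (addr & ~(size_t)15), 16);
    return v;
}

/* Models vec_perm: each mask byte indexes the 32-byte concatenation a:b. */
static vec16 model_perm(vec16 a, vec16 b, vec16 mask)
{
    vec16 r;
    for (int i = 0; i < 16; i++) {
        unsigned idx = mask.b[i] & 31u;
        r.b[i] = (idx < 16) ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Assumed loop shape: one new aligned load per iteration, LSQ carried over as MSQ. */
static void copy_pipelined(const uint8_t *mem, size_t start, size_t nvec, uint8_t *out)
{
    assert(mem != NULL && out != NULL);
    const vec16 mask = model_fix_alignment(start); /* computed once, outside the loop */
    vec16 msq = model_ld(mem, start);
    for (size_t k = 0; k < nvec; k++) {
        vec16 lsq = model_ld(mem, start + 16 * k + 15);
        vec16 r = model_perm(msq, lsq, mask);
        memcpy(out + 16 * k, r.b, 16);
        msq = lsq;                                 /* reuse instead of reloading */
    }
}
```

In this model the mask selects only LSQ bytes when the pointer is already aligned, which is why the carried-over MSQ does no harm in that case, and the last load never reaches past the 16-byte block that holds the final requested byte.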



-- Sanjay Patel Architecture and Performance Group Apple Computer, Inc.

PerfOptimization-dev mailing list      (email@hidden)