Last time I checked, neither version of gcc had the logic to hoist the "lvsl" instruction out of the loop, since your "target" is not loop-invariant (most likely only its last 4 bits are).
Take a look at this page: http://developer.apple.com/hardware/ve/code_optimization.html
In particular, see the section titled "Efficient Safe Unaligned Vector Loads."
Most likely, you'll want to use a loop like the one shown that starts with:
fixAlignment = vec_add( vec_lvsl( 15, ptr ), vec_splat_u8(1) );
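
To see why that line uses an offset of 15 plus a splat of 1, here is a rough scalar model in portable C. It is only a sketch of my understanding of the technique, not the code from the Apple page and not real AltiVec: the vec16 type and the model_* functions are hypothetical stand-ins for vec_ld, vec_lvsl and vec_perm, and "mem" is assumed to start on a 16-byte boundary, with "addr" a byte offset into it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for "vector unsigned char". */
typedef struct { uint8_t b[16]; } vec16;

/* Models vec_ld(offset, ptr): the low 4 bits of the effective address are
   ignored, so the aligned 16-byte block containing that address is loaded. */
static vec16 model_vec_ld(const uint8_t *mem, size_t addr) {
    vec16 v;
    memcpy(v.b, mem + (addr & ~(size_t)15), 16);
    return v;
}

/* Models vec_lvsl(offset, ptr): bytes sh, sh+1, ..., sh+15, where sh is
   the low 4 bits of the effective address. */
static vec16 model_vec_lvsl(size_t addr) {
    vec16 v;
    uint8_t sh = (uint8_t)(addr & 15);
    for (int i = 0; i < 16; i++) v.b[i] = (uint8_t)(sh + i);
    return v;
}

/* Models vec_perm(a, b, sel): the low 5 bits of each selector byte index
   into the 32-byte concatenation of a and b. */
static vec16 model_vec_perm(vec16 a, vec16 b, vec16 sel) {
    vec16 r;
    for (int i = 0; i < 16; i++) {
        uint8_t idx = sel.b[i] & 31;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Unaligned 16-byte load that never touches a block past the one holding
   the last requested byte (addr + 15). */
static vec16 model_load_unaligned(const uint8_t *mem, size_t addr) {
    assert(mem != NULL);
    /* fixAlignment = vec_add( vec_lvsl( 15, ptr ), vec_splat_u8(1) ); */
    vec16 fix = model_vec_lvsl(addr + 15);
    for (int i = 0; i < 16; i++) fix.b[i] = (uint8_t)(fix.b[i] + 1);
    vec16 lo = model_vec_ld(mem, addr);      /* block holding the first byte */
    vec16 hi = model_vec_ld(mem, addr + 15); /* block holding the last byte  */
    return model_vec_perm(lo, hi, fix);
}
```

When the address is aligned, both loads hit the same block and the permute mask is 16..31, so the result comes entirely from the second load and nothing past the data is read. When it is misaligned by k, the mask is k..k+15, the same as a plain lvsl at offset 0. The mask depends only on the low 4 bits of the address, so if the pointer advances by a multiple of 16 per iteration it can be computed once by hand before the loop.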
PerfOptimization-dev mailing list (email@hidden)
References:
>Should I "manualy unfold" LoadUnaligned" function? (From: Rustam Muginov <email@hidden>)
>Re: Should I "manualy unfold" LoadUnaligned" function? (From: Sanjay Patel <email@hidden>)
Copyright © 2007 Apple Inc. All rights reserved.