Mailing Lists: Apple Mailing Lists
Re: SSE3.5, SSE4

On Sep 28, 2006, at 11:09 AM, Ian Ollmann wrote:

For the moment, you can turn the extension on in the compiler using -mmni, and #include <tmmintrin.h>. (The extension was originally slated to ship with the Tejas processor, which was cancelled. Thus, the "T".)

Since this is the first time that you've all had to support an optional vector ISA extension (for real) on the platform, I'll take a minute to point out some differences between SSSE3 and AltiVec.


The -faltivec flag (without -maltivec) turned on the AltiVec programming model but didn't give the compiler permission to stick AltiVec anywhere it wanted to. You could safely intermingle vector code and scalar code without causing trouble for the G3. SSSE3 support doesn't work that way. The -mmni flag gives the compiler carte blanche to stick SSSE3 anywhere it wants in the file, even in places where you didn't explicitly use it. If a Core Duo, which lacks SSSE3, hits that code, it'll crash. Now, for the moment, I don't think the compiler is aware enough of the merits of the new ISA extension to do that, but it could silently start doing so at any time. You can save yourself a world of trouble later if you do the right thing now.

The first thing you need to do is move any SSSE3 code off to its own file. Formally, SSE3 is also optional, so if you want to be pedantic, you ought to move that out too. The SSE3 code and the SSSE3 code should go in different files from each other.
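
Moving the SSSE3 code into its own file only pays off if the callers check for the extension at run time before jumping into it. Here is a minimal sketch of that check and dispatch, assuming a GCC or Clang toolchain that ships <cpuid.h>. The names cpu_has_ssse3, abs_i16_scalar and choose_abs_i16 are made up for illustration, and the SSSE3 routine itself is assumed to live in the separate -mmni file and is passed in as a pointer:

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* GCC/Clang CPUID helper; assumed to be available */
#endif

/* Returns 1 if the CPU reports SSSE3 (CPUID leaf 1, ECX bit 9), else 0. */
static int cpu_has_ssse3(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ecx >> 9) & 1u);
#else
    return 0;  /* PowerPC and anything else: no SSSE3 */
#endif
}

/* Portable fallback, built WITHOUT -mmni (hypothetical example routine). */
static void abs_i16_scalar(int16_t *dst, const int16_t *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = (int16_t)(src[i] < 0 ? -src[i] : src[i]);
}

typedef void (*abs_i16_fn)(int16_t *, const int16_t *, size_t);

/* ssse3_impl is the routine from the separate -mmni file, or NULL when
   that file contributes nothing on this architecture. */
static abs_i16_fn choose_abs_i16(abs_i16_fn ssse3_impl)
{
    if (ssse3_impl != NULL && cpu_has_ssse3())
        return ssse3_impl;
    return abs_i16_scalar;
}
```

Call choose_abs_i16 once at startup and keep the returned pointer. The file holding this code is built without -mmni, so the compiler has no license to emit SSSE3 instructions in the check itself.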

This causes another problem. If your PowerPC build sees -mmni, it will choke, so you can only pass -mmni on the Intel side. (I already filed this bug and it was returned as not to be fixed, by the way, so don't bother.) While Xcode provides a way to do per-file compilation flags and per-arch compilation flags, it doesn't provide a straightforward way to do per-file per-arch compilation flags in the GUI, so you have to do that with build settings. Here's how:

	USE_SSSE3			=	$(USE_SSSE3_$(CURRENT_ARCH))
	USE_SSSE3_		=
	USE_SSSE3_i386		=	-mmni
	USE_SSSE3_ppc		=
	USE_SSSE3_ppc64	=
	USE_SSSE3_x86_64	=	-mmni

...and then, as the per-file compile flag for your SSSE3 code, pass:

	$(USE_SSSE3)
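
With these settings the SSSE3 source file is still compiled for PowerPC, just without the flag, so the file itself has to compile down to nothing there. A rough sketch of such a file follows, assuming the compiler defines __SSSE3__ whenever the extension is switched on (GCC and Clang do this for -mssse3; treat it as an assumption for -mmni). The routine abs_i16_ssse3 and the macro HAVE_ABS_I16_SSSE3 are hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSSE3__)
#include <tmmintrin.h>
#define HAVE_ABS_I16_SSSE3 1

/* Absolute value of 16-bit integers, eight at a time, using the SSSE3
   pabsw instruction. n is assumed to be a multiple of 8; any remainder
   is left untouched. Unaligned loads and stores keep the sketch simple. */
void abs_i16_ssse3(int16_t *dst, const int16_t *src, size_t n)
{
    size_t i;
    for (i = 0; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_abs_epi16(v));
    }
}
#else
/* Built without the flag (ppc, ppc64): nothing SSSE3-specific is emitted. */
#define HAVE_ABS_I16_SSSE3 0
#endif
```

Callers still need a run-time CPU check before calling abs_i16_ssse3; the guard only keeps the non-Intel slices building.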

More on per-file compile flags:

http://developer.apple.com/documentation/DeveloperTools/Conceptual/XcodeUserGuide/Contents/Resources/en.lproj/05_04_bs_build_settings/chapter_32_section_8.html#//apple_ref/doc/uid/TP40002691-CHDBDDBI

You can also set up a separate target for each architecture and lipo them all together, but that's a lot of work in my opinion.

Ian
PerfOptimization-dev mailing list      (email@hidden)






Copyright © 2011 Apple Inc. All rights reserved.