You don't actually need the permute unit. This frees it up to
handle other things like misalignment:
//set up a constant -- do once
vector signed short mask = vec_mergeh( vec_splat_s16( 0 ),
                                       vec_splat_s16( -1 ) );
// mask = 0x0000, 0xFFFF, 0x0000, 0xFFFF, ...
//the two-cycle way to do it
v = vec_xor( v, mask );   // flip every bit in the 0xFFFF lanes
v = vec_subs( v, mask );  // subtract -1 there, saturating; 0 lanes unchanged
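
To see why the xor/subs pair negates a lane, here is a rough scalar
model of what happens to one 16-bit element. This is only a sketch in
portable C, not AltiVec code; sat_sub_s16 and negate_lane are names made
up for illustration, with sat_sub_s16 standing in for one lane of
vec_subs.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for one signed 16-bit lane of vec_subs:
   subtract in a wider type, then clamp to the int16_t range. */
static int16_t sat_sub_s16(int16_t a, int16_t b)
{
    int32_t r = (int32_t)a - (int32_t)b;
    if (r > INT16_MAX) r = INT16_MAX;
    if (r < INT16_MIN) r = INT16_MIN;
    return (int16_t)r;
}

/* One lane of: v = vec_xor(v, mask); v = vec_subs(v, mask);
   mask is 0 (lane passes through) or -1 (lane is negated). */
static int16_t negate_lane(int16_t v, int16_t mask)
{
    int16_t flipped = (int16_t)(v ^ mask);  /* ~v when mask is -1 */
    return sat_sub_s16(flipped, mask);      /* ~v + 1 == -v, clamped */
}
```

With mask == -1 this is the usual two's complement identity -v == ~v + 1,
except that the saturating subtract turns -32768 into 32767 instead of
letting it wrap.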
You can also do it in one instruction with vec_mladd, if you'd rather
use the VCIU:
v = vec_mladd( v,
               (vector signed short) (1, -1, 1, -1, 1, -1, 1, -1),
               (vector signed short) (0) );
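This, however, leaves the unfortunate problem of what to do with 0x8000:
vec_mladd is modulo arithmetic, so -32768 times -1 wraps right back to
0x8000 instead of becoming 32767.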
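
The 0x8000 case can be checked with a scalar model of one vec_mladd
lane. Again this is just a sketch in portable C; mladd_lane is a
hypothetical helper that assumes the documented behaviour of keeping the
low 16 bits of a*b + c.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for one 16-bit lane of vec_mladd:
   multiply, add, and keep only the low 16 bits (modulo arithmetic). */
static int16_t mladd_lane(int16_t a, int16_t b, int16_t c)
{
    int32_t wide = (int32_t)a * (int32_t)b + (int32_t)c;
    uint16_t low = (uint16_t)((uint32_t)wide & 0xFFFFu);

    /* Reinterpret the low 16 bits as a signed value without relying
       on implementation-defined narrowing conversions. */
    if (low >= 0x8000u)
        return (int16_t)((int32_t)low - 0x10000);
    return (int16_t)low;
}
```

Here mladd_lane(5, -1, 0) gives -5 as expected, but
mladd_lane(-32768, -1, 0) comes back as -32768, because +32768 does not
fit in 16 bits and the result wraps.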
The good news is that quite often you don't just want to change the
sign, but to do some multiplication as well. Sometimes that is
multiplication by a constant between 0 and 1. In that case, use
vec_mradds and flip the sign of the constant in the lanes you want
negated. vec_mradds also solves the overflow problem, since it is
saturating.
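
As a rough illustration of the vec_mradds approach, here is a scalar
model of one lane, assuming the usual description of the instruction:
multiply, add the rounding constant 0x4000, shift right by 15, add the
third operand, then saturate. The names mradds_lane and floor_shift15
are made up for this sketch, and the multiplier is read as a 1.15
fixed-point fraction (16384 is 0.5, -16384 is -0.5).

```c
#include <stdint.h>

/* Arithmetic shift right by 15 (floor division by 32768), written out
   so it does not depend on how the compiler shifts negative values. */
static int32_t floor_shift15(int32_t x)
{
    return x >= 0 ? (x >> 15) : -((-x + 0x7FFF) >> 15);
}

/* Hypothetical scalar stand-in for one signed 16-bit lane of
   vec_mradds(a, b, c): sat(((a * b + 0x4000) >> 15) + c). */
static int16_t mradds_lane(int16_t a, int16_t b, int16_t c)
{
    int32_t prod = (int32_t)a * (int32_t)b;
    int32_t r = floor_shift15(prod + 0x4000) + (int32_t)c;
    if (r > INT16_MAX) r = INT16_MAX;
    if (r < INT16_MIN) r = INT16_MIN;
    return (int16_t)r;
}
```

So scaling by 0.5 in the even lanes and -0.5 in the odd lanes would mean
a constant vector of 16384, -16384, 16384, -16384, ... and a zero
addend. In this model mradds_lane(1000, 16384, 0) is 500 and
mradds_lane(1000, -16384, 0) is -500, and any result that does not fit
is clamped rather than wrapped.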