Re: Efficient use of Altivec for CG vector operations

**Subject**: **Re: Efficient use of Altivec for CG vector operations**
- From: Eric Lengyel <email@hidden>
- Date: Tue, 24 Feb 2004 21:12:28 -0800

This is all just off the top of my head...

Try defining something like this:

    struct Vector4D
    {
        float x, y, z, w;

        // User-defined conversion
        operator vector float(void) const
        {
            return (*(vector float *) &x);
        }

        // Assignment
        Vector4D& operator =(vector float v)
        {
            *(vector float *) &x = v;
            return (*this);
        }

        Vector4D& operator +=(const Vector4D& v)
        {
            return (*this = vec_add(*this, v));
        }

        // ... etc.
    };

(You'll have to ensure that Vector4D structs are 16-byte aligned.)
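One way to get that alignment, as a minimal sketch: the example below assumes a C++11 compiler and uses `alignas`. The names `AlignedVector4D` and `is_aligned16` are hypothetical and only for illustration. With an older GCC, the `__attribute__((aligned(16)))` type attribute serves the same purpose.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical illustration type: same layout as Vector4D above,
// with the 16-byte alignment requested explicitly.
struct alignas(16) AlignedVector4D
{
    float x, y, z, w;
};

// Compile-time checks that the layout matches one 128-bit vector register.
static_assert(alignof(AlignedVector4D) == 16, "must be 16-byte aligned");
static_assert(sizeof(AlignedVector4D) == 16, "must be exactly four floats");
static_assert(offsetof(AlignedVector4D, x) == 0, "x must sit at the start");

// Hypothetical helper: runtime check that a pointer is 16-byte aligned.
inline bool is_aligned16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}
```

This covers objects with static or automatic storage. Heap allocations may need an aligned allocator as well, depending on the compiler and language version.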

As for a dot product, this is one of the major design oversights of
Altivec, and the only reason I'm not really using it much. You're
probably better off just using the ordinary FP instructions, but in
Altivec it can be done as follows. (Anybody have anything better?)

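    vector float DotProduct4D(vector float v1, vector float v2)
    {
        // AltiVec has no plain float multiply instruction, so multiply
        // with vec_madd and a zero addend.
        vector float zero = (vector float) vec_splat_u32(0);
        vector float m = vec_madd(v1, v2, zero);

        // Broadcast each product to all four lanes, then sum them.
        vector float x = vec_splat(m, 0);
        vector float y = vec_splat(m, 1);
        vector float z = vec_splat(m, 2);
        vector float w = vec_splat(m, 3);

        return (vec_add(vec_add(vec_add(x, y), z), w));
    }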
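One possible alternative, offered as an untimed sketch rather than a definitive answer: rotate the product vector with `vec_sld` and add it to itself twice, which replaces the four splats and three adds with two shifts and two adds. The function name `dot4` is hypothetical, both pointers are assumed to be 16-byte aligned, and the scalar branch is there so the same source builds on machines without AltiVec.

```cpp
#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

// Hypothetical helper: 4-component dot product of two float[4] arrays.
// In the AltiVec path both pointers must be 16-byte aligned.
float dot4(const float* a, const float* b)
{
#if defined(__ALTIVEC__)
    vector float zero = (vector float) vec_splat_u32(0);
    vector float m = vec_madd(vec_ld(0, a), vec_ld(0, b), zero);

    // Rotate by 8 bytes and add: lanes become (x+z, y+w, z+x, w+y).
    m = vec_add(m, vec_sld(m, m, 8));
    // Rotate by 4 bytes and add: every lane now holds x+y+z+w.
    m = vec_add(m, vec_sld(m, m, 4));

    alignas(16) float out[4];
    vec_st(m, 0, out);
    return out[0];
#else
    // Ordinary scalar floating point.
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
#endif
}
```

If the result feeds further vector math, the final store can be skipped and the summed vector used directly, since every lane holds the same value.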

A cross product would involve doing some permutes, then a multiply,
followed by a multiply-add.
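To make that lane arrangement concrete, here is a portable scalar sketch that mirrors the same three steps. It is not AltiVec code: the type `Vec4` and the helpers `yzx`, `zxy`, and `cross3` are hypothetical names, and each loop stands in for a single vector instruction (a `vec_perm` for the shuffles, and `vec_nmsub`, which computes c - a*b, for the last step).

```cpp
// Hypothetical plain 4-float type; lanes are x, y, z, w.
struct Vec4 { float f[4]; };

// Lane shuffles that a vec_perm would perform; w stays in place.
static Vec4 yzx(const Vec4& v) { Vec4 r = {{v.f[1], v.f[2], v.f[0], v.f[3]}}; return r; }
static Vec4 zxy(const Vec4& v) { Vec4 r = {{v.f[2], v.f[0], v.f[1], v.f[3]}}; return r; }

// Cross product of the xyz parts; the w lane comes out as zero
// for finite inputs (a.w*b.w - a.w*b.w).
Vec4 cross3(const Vec4& a, const Vec4& b)
{
    // Step 1: permutes.
    Vec4 a1 = yzx(a), b1 = zxy(b);
    Vec4 a2 = zxy(a), b2 = yzx(b);

    Vec4 r = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int i = 0; i < 4; ++i)
    {
        // Step 2: multiply.
        float m = a1.f[i] * b1.f[i];
        // Step 3: multiply-add style combine, m - a2*b2.
        r.f[i] = m - a2.f[i] * b2.f[i];
    }
    return r;
}
```

For example, crossing the x axis with the y axis yields the z axis.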

-- Eric Lengyel

On Feb 24, 2004, at 8:06 PM, Vareck Bostrom wrote:

Hello everyone. This isn't 100% OpenGL related but covers computer
graphics in general. I've been on a ray-tracing kick for quite some
time, as I was experimenting with lighting models and vertex/pixel
shaders 1.x were too limiting at the time - but that's really an
aside.

What it comes down to is for those of us still on G4s or even G3s, how
can we most effectively make use of the vector floating point
facilities available to us in the form of altivec?

We have the "vector float" class available to us, but I need to refer
to those vectors with altivec and non-altivec functions so I ended up
declaring a vector like:

    typedef union
    {
        float elements[4];
        vector float vFloat;
    } vec4;

That way I can use functions both with and without the Apple-defined
C interface extensions for altivec. A simple example:

    /* in place addition of two vectors, result in first argument */
    void vec4_iadd(vec4 * l_result, vec4 * l_b)
    {
    #ifdef AVEC
        l_result->vFloat = vec_add( l_result->vFloat, l_b->vFloat );
    #else
        int i;
        for(i=0;i<4;++i)
        {
            l_result->elements[i] += l_b->elements[i];
        }
    #endif
    }

I know I can get rid of the non-altivec loop and replace it directly
with four additions; I'm thinking more about altivec efficiency, though.
It seems to me that this method has developed into quite a bit of
pointer chasing, and that completely defeats the execution speed
advantage of altivec.

So my question is: how would you define a vector that can be accessed
by both the scalar per-element instructions and altivec instructions,
such that overhead for pointer chasing and referencing/dereferencing
is minimized?

What would the definition for a vector look like, and how would two
sample functions look, say a vector dot product and a vector cross
product?
