Re: AltiVec on OSX in Cocoa
- Subject: Re: AltiVec on OSX in Cocoa
- From: Stuart Bryson <email@hidden>
- Date: Wed, 14 Nov 2001 14:29:25 +1100
On Monday, November 12, 2001, at 06:06 AM, Brendan Younger wrote:
On Sunday, November 11, 2001, at 08:47 AM, Charles Jolley wrote:
MrC has really complete support for AltiVec; unfortunately, gcc is
not quite as tailored to it.
The C extensions and the "vector" keyword are both fully supported,
but the memory allocation is a little sketchy.
Stack-based vectors will be automatically aligned correctly, but
AFAIK, there is no vec_malloc() which will guarantee 16-byte aligned
memory blocks. So, you are stuck either rolling your own, or using
NewPtr() (from Carbon). However, I have had some difficulties with
NewPtr() (more specifically, DisposePtr()). Hence, I suggest using a
custom struct which will hold both a pointer to the actual memory
allocated and a pointer to the next 16-byte aligned memory address.
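A minimal sketch of such a struct, assuming plain malloc() underneath and a 16-byte boundary as the target. The names AlignedBlock, aligned_block_alloc and aligned_block_free are made up for illustration and are not part of any Apple API:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper type: keeps the pointer malloc() returned (needed
// later for free()) next to the 16-byte-aligned pointer that vector code
// should actually use.
struct AlignedBlock {
    void *raw;      // what malloc() returned; this is what gets freed
    void *aligned;  // first 16-byte boundary at or after raw
};

// Allocate at least `size` usable bytes starting on a 16-byte boundary.
// On failure both members are null.
inline AlignedBlock aligned_block_alloc(std::size_t size)
{
    AlignedBlock b = { nullptr, nullptr };
    b.raw = std::malloc(size + 15);  // 15 spare bytes cover the worst case
    if (b.raw != nullptr) {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(b.raw);
        addr = (addr + 15) & ~static_cast<std::uintptr_t>(15);  // round up
        b.aligned = reinterpret_cast<void *>(addr);
    }
    return b;
}

// Release the block through the original pointer and clear both members.
inline void aligned_block_free(AlignedBlock &b)
{
    std::free(b.raw);
    b.raw = nullptr;
    b.aligned = nullptr;
}
```

Vector loads and stores would go through b.aligned, while b.raw exists only so the block can be handed back to free().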
You know, the Performance documentation I was reading the other day
states that the standard malloc always allocates memory on the heap
aligned to 16-byte boundaries. It might be worth just trying with
standard malloc first.
See /Developer/Documentation/Essentials/Performance/Performace.pdf,
p22 (Understanding Malloc):
... The granularity of the blocks malloc returns is 16 bytes. So if
you ask for 4 bytes, malloc consumes 16 bytes, and if you ask for 24
bytes, malloc will consume 32 bytes. ...
Yeah, sorry about that, silly me for reading the man pages . . .
"Malloc and free provide a general-purpose memory allocation
package. Malloc returns a pointer to a block of at
least size bytes beginning on a long boundary."
But hey, if it works, so much the better.
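Since the man page and the Performance document seem to disagree, one way to see what a given system actually does is to test it. This is only a rough sketch: is_aligned16 and count_aligned_mallocs are hypothetical names, and a run that comes back fully aligned shows what this malloc happened to do, not what it promises.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// True when ptr sits on a 16-byte boundary (low four address bits clear).
inline bool is_aligned16(const void *ptr)
{
    return (reinterpret_cast<std::uintptr_t>(ptr) & 15u) == 0;
}

// Allocate `count` blocks of assorted small sizes (1 to 64 bytes) and
// return how many of them malloc() placed on a 16-byte boundary.
inline int count_aligned_mallocs(int count)
{
    int aligned = 0;
    for (int i = 0; i < count; ++i) {
        void *p = std::malloc(static_cast<std::size_t>(i % 64) + 1);
        if (p != nullptr && is_aligned16(p))
            ++aligned;
        std::free(p);
    }
    return aligned;
}
```

If count_aligned_mallocs(1000) returns 1000, that fits the 16-byte granularity the Performance document describes, but it is still an observation rather than a guarantee.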
What about alloc?
I have been playing around with Altivec just recently and it has been
the memory alignment that has been stuffing me up. If you say gcc and
malloc always align to 16 bytes then what about alloc in Obj-C?
Up till now I have been using Obj-C++. The code below shows how I have
been allocating memory. It has some example logs after it and it seems
that it is always aligned even before I align it myself. But is this
guaranteed?
void * AltiMem::align16(void *ptr)
{
    void **aligned_ptr = &ptr;
    (*(long *)aligned_ptr) &= 0xFFFFFFF0; //Strip lower 4 bits
    (*(long *)aligned_ptr) += 16; //Move pointer up by 16 bytes
    return *aligned_ptr;
}
float * AltiMem::allocateFloat(unsigned int n)
{
    //Allocate n * sizeof(float) + 16 bytes
    float *p_unaligned = (float *)::operator new(n*sizeof(float) + 16);
    //Align the pointer
    float *p_aligned = static_cast<float *>(align16(p_unaligned));
    //Store the difference in bytes between aligned and unaligned in the
    //byte at location (p_aligned - 1)
    unsigned char *p_offset = (unsigned char *)p_aligned - 1;
    *p_offset = (unsigned char *)p_aligned - (unsigned char *)p_unaligned;
#ifdef DEBUG
    NSLog(@"allocate: %x->%x (%x)\n", p_unaligned, p_aligned, p_offset);
#endif
    return p_aligned;
}
2001-11-14 14:23:39.933 AltiRay[346] allocate: 1b89470->1b89480 (1b8946f)
2001-11-14 14:23:39.934 AltiRay[346] allocate: 1b872d0->1b872e0 (1b872cf)
2001-11-14 14:23:39.934 AltiRay[346] allocate: 1b872f0->1b87300 (1b872ef)
2001-11-14 14:23:39.934 AltiRay[346] allocate: 1b89780->1b89790 (1b8977f)
2001-11-14 14:23:39.934 AltiRay[346] allocate: 1b897b0->1b897c0 (1b897af)
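For completeness, here is a sketch of how the allocation could be paired with a matching release that reads the stored offset back. The free-function names allocate_floats_aligned16 and free_floats_aligned16 are hypothetical and stand in for the AltiMem methods. It assumes the offset byte lives at (aligned - 1), and it uses uintptr_t for the masking so the arithmetic also holds where pointers are wider than 32 bits. It does not settle whether operator new is guaranteed to be aligned already.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical stand-in for AltiMem::allocateFloat: returns n floats that
// start on a 16-byte boundary, with the byte just before them recording
// how far the returned pointer is from the start of the real block.
inline float *allocate_floats_aligned16(std::size_t n)
{
    // 16 spare bytes: the aligned pointer ends up 1..16 bytes past raw,
    // so there is always room for the offset byte in front of it.
    unsigned char *raw =
        static_cast<unsigned char *>(::operator new(n * sizeof(float) + 16));
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw);
    std::uintptr_t aligned_addr =
        (addr & ~static_cast<std::uintptr_t>(15)) + 16;  // strip low bits, step up
    unsigned char *aligned = raw + (aligned_addr - addr);
    aligned[-1] = static_cast<unsigned char>(aligned - raw);  // offset in bytes
    return reinterpret_cast<float *>(aligned);
}

// Matching release: recover the original pointer from the stored offset.
inline void free_floats_aligned16(float *p)
{
    if (p == nullptr)
        return;
    unsigned char *aligned = reinterpret_cast<unsigned char *>(p);
    ::operator delete(aligned - aligned[-1]);
}
```

Because the offset is always at least 1, the byte at (aligned - 1) is guaranteed to lie inside the block that was requested, even when the raw pointer was already aligned.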
Stuart