Re: vImage ARGB frustrations
- Subject: Re: vImage ARGB frustrations
- From: Alastair Houghton <email@hidden>
- Date: Tue, 3 Feb 2004 23:15:08 +0000
On 3 Feb 2004, at 19:04, Jeff LaMarche wrote:
> My attempts to add a blur method to NSImage in a category have been
> completely unsuccessful. In order to be able to use the
> vImageConvolve_ARGB8888 method, I'm trying to create ARGB bitmap data from
> my bitmap image - I'm guessing that this might be a problem in an image
> with a premultiplied alpha, but in my case, my test image is a plain
> old RGB bitmap with no alpha channel, and it's still not working.
When you say it's not working, what is it actually doing? You aren't
very specific.
I did notice that you wrote
> int matrixDimension = (radius*2)+1;
> int16_t kernel[matrixDimension^2];
which is surely wrong because ^2 means XOR 2, not squared... I imagine
you meant
    int16_t kernel[matrixDimension * matrixDimension];
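
To see how far off the XOR version is, here is a small check in plain C. The helper names (kernel_dimension, kernel_len_xor, kernel_len_square) are made up for illustration; only the radius arithmetic comes from the quoted code.

```c
#include <stddef.h>

/* Side length of the convolution kernel for a given blur radius,
   as in the quoted code: (radius * 2) + 1. */
static int kernel_dimension(int radius)
{
    return (radius * 2) + 1;
}

/* What the original declaration computes: ^ is bitwise XOR in C. */
static size_t kernel_len_xor(int radius)
{
    return (size_t)(kernel_dimension(radius) ^ 2);
}

/* What was intended: the dimension squared. */
static size_t kernel_len_square(int radius)
{
    int d = kernel_dimension(radius);
    return (size_t)d * (size_t)d;
}
```

With a radius of 2 the dimension is 5, so the XOR version declares 5 ^ 2 = 7 elements where 25 are needed, and anything that fills the whole kernel writes past the end of the array.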
[snip]
> Probably not fast, but seems like it should generate the ARGB data I
> need, right?
If you want performance, you can use Altivec to do it quickly; RGBA to
ARGB is easiest... you can just use Altivec's vec_perm() operation,
something like this:

    vector unsigned char permute = (vector unsigned char) (3, 0, 1, 2,
                                                           7, 4, 5, 6,
                                                           11, 8, 9, 10,
                                                           15, 12, 13, 14);
    vector unsigned char *prgba = <pointer to RGBA data>;
    vector unsigned char *pargb = <pointer to ARGB data>;
    unsigned size = width * height / 4;

    /* Post-decrement, so the loop body runs exactly size times */
    while (size--) {
      /* Second parameter doesn't matter, may as well be permute */
      *pargb++ = vec_perm (*prgba++, permute, permute);
    }

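
If you don't have a G4 to hand, the permute table can be checked with a scalar model of vec_perm() in plain C. This is only a sketch: perm16, rgba_to_argb_table and rgba_to_argb are names invented here, and the model assumes the usual vec_perm behaviour of indexing into the 32-byte concatenation of the first two arguments using the low five bits of each control byte.

```c
#include <stdint.h>
#include <stddef.h>

/* Scalar model of vec_perm(a, b, c): output byte i is taken from the
   32-byte concatenation of a and b, selected by the low 5 bits of c[i].
   out must not overlap a or b. */
static void perm16(const uint8_t a[16], const uint8_t b[16],
                   const uint8_t c[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        uint8_t idx = c[i] & 0x1f;
        out[i] = (idx < 16) ? a[idx] : b[idx - 16];
    }
}

/* Same values as the permute vector: A moves to the front of each pixel. */
static const uint8_t rgba_to_argb_table[16] = {
     3,  0,  1,  2,   7,  4,  5,  6,
    11,  8,  9, 10,  15, 12, 13, 14
};

/* Converts pixel_count RGBA pixels to ARGB, four pixels per step.
   Assumes pixel_count is a multiple of 4, like the AltiVec loop. */
static void rgba_to_argb(const uint8_t *rgba, uint8_t *argb,
                         size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count / 4; i++) {
        /* Every index is below 16, so the second operand is never read. */
        perm16(rgba + 16 * i, rgba_to_argb_table, rgba_to_argb_table,
               argb + 16 * i);
    }
}
```

The scalar version is obviously no faster than a naive loop; its only purpose is to let you confirm the byte ordering before committing to the vector code.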
Plain RGB to ARGB is a bit trickier to do with Altivec, because Altivec
can only load 16-byte aligned data. I think it'd go something like
this:

    /* Inserts an A octet before each RGB triple; used for the first
       three groups of four pixels */
    vector unsigned char permute1 = (vector unsigned char)
      (0x10, 0x00, 0x01, 0x02,  0x10, 0x03, 0x04, 0x05,
       0x10, 0x06, 0x07, 0x08,  0x10, 0x09, 0x0a, 0x0b);

    /* Extracts the second group of four RGB pixels, which straddles
       rgb1 and rgb2 (the last four octets are don't-care) */
    vector unsigned char extract2 = (vector unsigned char)
      (0x0c, 0x0d, 0x0e, 0x0f,  0x10, 0x11, 0x12, 0x13,
       0x14, 0x15, 0x16, 0x17,  0x00, 0x00, 0x00, 0x00);

    /* Extracts the third group of four RGB pixels, which straddles
       rgb2 and rgb3 (the last four octets are don't-care) */
    vector unsigned char extract3 = (vector unsigned char)
      (0x08, 0x09, 0x0a, 0x0b,  0x0c, 0x0d, 0x0e, 0x0f,
       0x10, 0x11, 0x12, 0x13,  0x00, 0x00, 0x00, 0x00);

    /* Inserts A octets for the final group of four pixels */
    vector unsigned char permute4 = (vector unsigned char)
      (0x10, 0x04, 0x05, 0x06,  0x10, 0x07, 0x08, 0x09,
       0x10, 0x0a, 0x0b, 0x0c,  0x10, 0x0d, 0x0e, 0x0f);

    /* Alpha vector (only the first element actually matters) */
    vector unsigned char alpha = (vector unsigned char) (0xff);

    vector unsigned char *prgb = <pointer to RGB data>;
    vector unsigned char *pargb = <pointer to ARGB data>;
    unsigned size = width * height / 16;

    /* Post-decrement, so the loop body runs exactly size times */
    while (size--) {
      /* 16 pixels of RGB data in three 128-bit words */
      vector unsigned char rgb1 = prgb[0]; /* RGB RGB RGB RGB|RGB R */
      vector unsigned char rgb2 = prgb[1]; /* GB RGB RGB|RGB RGB RG */
      vector unsigned char rgb3 = prgb[2]; /* B RGB|RGB RGB RGB RGB */

      pargb[0] = vec_perm (rgb1, alpha, permute1);
      pargb[1] = vec_perm (vec_perm (rgb1, rgb2, extract2),
                           alpha, permute1);
      pargb[2] = vec_perm (vec_perm (rgb2, rgb3, extract3),
                           alpha, permute1);
      pargb[3] = vec_perm (rgb3, alpha, permute4);

      prgb += 3;
      pargb += 4;
    }

Obviously you need to make sure that the image size (in pixels) is a
multiple of 16 in the latter case and of 4 in the former case. You
could also unroll the loops a bit, if you wanted (PPC has plenty of
vector registers :-)).
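
If you can't guarantee the multiple-of-16 size, one option is to let the vector loop handle as many whole groups of 16 pixels as there are and mop up the remainder with a scalar loop. A rough sketch in plain C follows; rgb_to_argb_scalar and vector_pixel_count are names I've invented for the purpose, and the opaque 0xff alpha matches the alpha vector above.

```c
#include <stdint.h>
#include <stddef.h>

/* Converts pixel_count packed RGB pixels to ARGB with opaque alpha.
   Works for any pixel_count and any alignment; source and destination
   must not overlap. */
static void rgb_to_argb_scalar(const uint8_t *rgb, uint8_t *argb,
                               size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; i++) {
        argb[4 * i + 0] = 0xff;            /* A */
        argb[4 * i + 1] = rgb[3 * i + 0];  /* R */
        argb[4 * i + 2] = rgb[3 * i + 1];  /* G */
        argb[4 * i + 3] = rgb[3 * i + 2];  /* B */
    }
}

/* Number of leading pixels the 16-pixel vector loop can process;
   whatever is left over is the tail for the scalar loop. */
static size_t vector_pixel_count(size_t pixel_count)
{
    return pixel_count - (pixel_count % 16);
}
```

For n pixels you would run the vector loop over the first vector_pixel_count(n) of them, then call rgb_to_argb_scalar on the rest, starting 3 * vector_pixel_count(n) bytes into the source and 4 * vector_pixel_count(n) bytes into the destination. The scalar routine also makes a handy reference for checking the vector output.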
For greyscale, you'd probably load 16 pixels at a time (and therefore
write four 128-bit words for every one you read). The idea is very
similar, though. Equally, going the other way (converting from ARGB
back to RGB or RGBA) is pretty similar.
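
For the greyscale case, the data movement looks roughly like the following scalar model, which expands one 16-byte block of grey pixels into four 16-byte ARGB blocks. It's a sketch only: grey16_to_argb is an invented name, and I'm assuming you want opaque alpha with R, G and B all set to the grey value.

```c
#include <stdint.h>
#include <stddef.h>

/* Expands 16 greyscale pixels into 16 ARGB pixels (64 bytes), one
   16-byte output word at a time, mirroring the one-load, four-store
   shape of a vector version. */
static void grey16_to_argb(const uint8_t grey[16], uint8_t argb[64])
{
    for (int word = 0; word < 4; word++) {      /* four output words */
        for (int j = 0; j < 4; j++) {           /* four pixels per word */
            uint8_t g = grey[4 * word + j];
            uint8_t *p = argb + 16 * word + 4 * j;
            p[0] = 0xff;                        /* A: opaque */
            p[1] = g;                           /* R */
            p[2] = g;                           /* G */
            p[3] = g;                           /* B */
        }
    }
}
```

In a vector version each pass of the outer loop would become a single vec_perm() of the grey vector against the alpha vector, with a different permute table for each of the four output words.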
The usual warning applies... I haven't tested the code, and wrote it
just now in Mail.app, but you should be able to get the idea. If you
want more detail on Altivec, take a look here:
http://developer.apple.com/hardware/ve/index.html
There's a lot of pretty good stuff there. And trying to figure out the
best way to write SIMD code can be quite good fun (if you're into that
sort of thing ;->).
Kind regards,
Alastair.