I'd like to get a sanity check on the results I'm seeing. If this all
seems reasonable, fine. But if not, then I know I have more work to do.
Of course, suggestions for improvements are welcome.
The code I'm working on is doing a simple rectangular pixel transfer,
with possible scaling and RGB lookup table. I'm currently testing on a
dual 2GHz PowerMac with Radeon 9600 under 10.3.4. In the test case I'm
scaling a 2K x 1K image down to 700 x 350 pixels for display.
Our QuickDraw code can do this in 45 ms, or 46 ms if a 256-entry RGB LUT
is active. And that's doing a bicubic scale, not linear.
Using OpenGL, I can do this in 20 ms using a texture (and all the
Apple-recommended optimizations) or 17 ms by using glDrawPixels. Good, I'll
take that improvement. It works out to about 395 Mbytes/sec transfer rate.
But if I enable a LUT, then things go wonky. If I use GL_COLOR_TABLE with
a texture, the times go up by a factor of 20. If I use GL_COLOR_TABLE
with glDrawPixels, they go up by a factor of 100! And with glDrawPixels,
there are random horizontal lines through the image if any scaling is
being done.
Judging from the dramatic slowdown, I have to assume that the color table
is being implemented in software, and very poorly at that.
I tried using glPixelMap for the LUT, with similar results.
So am I just doing something stupid? Is it a 9600-specific issue? Or is
that just the way it is?

That, as far as I know, is just the way it is. I don't know of any cards
presently available for the Mac that implement the Imaging extensions in
hardware.
I have implemented texture LUTs in realtime with great success, but I
implemented the operation using a fragment program. I assume you are doing
something simple like re-mapping each channel individually. In this case,
you can create a 1D texture 256 pixels wide. For each pixel in the texture,
set each component to the value the existing LUT would have returned for
that index. The index of a pixel in this 1D texture then plays the role of
a component value in the source image.
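
As a rough sketch of the table-building step described above (not taken
from any shipping code), the following C helper packs three per-channel
256-entry tables into RGBA texels. The names build_lut_texels and LUT_SIZE,
the 8-bit RGBA layout, and the opaque alpha value are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LUT_SIZE = 256 };  /* one texel per possible 8-bit component value */

/*
 * Hypothetical helper: pack three per-channel lookup tables into RGBA
 * texels for a 256-wide 1D texture. Texel i holds (lut_r[i], lut_g[i],
 * lut_b[i], 255), so sampling the texture at a source component value
 * returns the remapped value in the matching channel.
 */
static void build_lut_texels(const uint8_t *lut_r,
                             const uint8_t *lut_g,
                             const uint8_t *lut_b,
                             uint8_t *texels /* LUT_SIZE * 4 bytes */)
{
    assert(lut_r != NULL && lut_g != NULL && lut_b != NULL && texels != NULL);

    for (int i = 0; i < LUT_SIZE; ++i) {
        texels[4 * i + 0] = lut_r[i];  /* red channel mapping   */
        texels[4 * i + 1] = lut_g[i];  /* green channel mapping */
        texels[4 * i + 2] = lut_b[i];  /* blue channel mapping  */
        texels[4 * i + 3] = 255;       /* alpha unused here (assumption) */
    }
}
```

The resulting buffer could then be uploaded with something like
glTexImage1D(GL_TEXTURE_1D, 0, GL_RGBA8, LUT_SIZE, 0, GL_RGBA,
GL_UNSIGNED_BYTE, texels). Using GL_NEAREST filtering and
GL_CLAMP_TO_EDGE wrapping on that texture should make an 8-bit source
value i select texel i exactly.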
Bind your look-up texture and your source texture simultaneously, on
separate texture units, when you are drawing. Draw the geometry using a
texture-mapped primitive. If you have your fragment program bound at the
same time, you should be able to re-map the image per-pixel. All this
happens on the GPU, so you should see almost no appreciable drop in
performance.
Your performance bottleneck in this application is doubtless texture upload
bandwidth.
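
For illustration, a fragment program along the lines described above might
look like the sketch below, written against ARB_fragment_program and kept
as a C string. It is not the program from my own code. It assumes the
source image is a 2D texture on unit 0 and the 1D look-up texture is on
unit 1, with the red, green and blue mappings stored in the matching
channels of each texel. The names kLutFragmentProgram and
lut_program_length are made up for this example.

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch of a per-channel LUT fragment program (ARB_fragment_program).
 * Assumptions: source image on texture unit 0 as a 2D texture, 1D LUT
 * texture on unit 1.
 */
static const char kLutFragmentProgram[] =
    "!!ARBfp1.0\n"
    "TEMP src, mapped;\n"
    "# Fetch the source pixel from unit 0.\n"
    "TEX src, fragment.texcoord[0], texture[0], 2D;\n"
    "# Use each channel as the coordinate into the LUT on unit 1.\n"
    "# The write mask keeps only the matching channel of the LUT texel.\n"
    "TEX mapped.x, src.x, texture[1], 1D;\n"
    "TEX mapped.y, src.y, texture[1], 1D;\n"
    "TEX mapped.z, src.z, texture[1], 1D;\n"
    "# Pass alpha through unchanged.\n"
    "MOV mapped.w, src.w;\n"
    "MOV result.color, mapped;\n"
    "END\n";

/* Length of the program text, as needed when handing it to the driver. */
static int lut_program_length(void)
{
    size_t n = strlen(kLutFragmentProgram);
    assert(n > 0);
    return (int)n;
}
```

Loading it would go through the usual glGenProgramsARB, glBindProgramARB
and glProgramStringARB calls (with GL_FRAGMENT_PROGRAM_ARB and
GL_PROGRAM_FORMAT_ASCII_ARB), followed by glEnable(GL_FRAGMENT_PROGRAM_ARB)
before drawing. If the source image lives in a rectangle texture, the first
TEX instruction would need the RECT target instead of 2D.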
--
Kind Regards,
James Milne