Mailing Lists: Apple Mailing Lists


Re: Checking shader optimizations



On Monday, 29 October 2007 22:48:17, Nick Burns wrote:
> That window was never supposed to have existed -- it was a bug that
> it was enabled at all (since fixed)
So I guess I should take what I see in there with a grain of salt :-)

> > I have heard that the code you see here is pre-optimization—an
> > optimization pass is applied to the code but what you see does not
> > reflect that pass. Also, I believe that there is also another
> > optimization phase where the vertex and fragment shader are fused
> > together into one big program, and that is also something that
> > developers don't get to see.
I guess the function inlining is done at link time.
 
> > AFAIK this is an undocumented, custom language. It's not too hard
> > to read if you know ARB, but it's definitely got its own set of
> > quirks. There's no public way to submit code to the OS in this
> > language either. (If there is a private way, I don't know it.)

> > Really the anomaly is that they present this information to us at
> > all, because there isn't much that we can do with it. The only
> > thing I can think of is, if you suspect the GLSL parser is
> > translating your program incorrectly, you might be able to use that
> > view to demonstrate the issue.
What I am trying to find out is how the optimizer deals with the intentional 
inefficiencies in my autogenerated GLSL. From my general experience, the 
optimizations are quite usable, just not for my use case (generating GLSL 
from Direct3D assembler).

My testing with the asm output indicates that there is some truth in it. I 
have a vs_1_1 shader which falls back to software rendering for some 
reason. Testing that shader against this assembler output gave quite sane 
results, and I think I now know how to get it to run in hardware (I have 
already managed to do so with some hacks based on that testing).

My main concerns are mova and constant indexing. For example, the mova 
implementation currently looks like this:

dst = int(floor(abs(src) + 0.5) * sign(src));

That is excessive for an instruction that exists natively on NVIDIA and ATI 
hardware. No, dst = int(floor(src + 0.5)) does not work, because -0.5 
should be rounded to -1.0, not 0.0, while 0.5 should be rounded to 1.0. For 
indexing I need some more insight, but it seems to blow up to 4 instructions 
instead of 1, eating 2 temp registers. I guess I'll open a support incident 
for this.

Mac-opengl mailing list      (email@hidden)



