Very interesting papers on large, not so hard FPS gain

tuan kuranes
OGRE Retired Moderator
Posts: 2653
Joined: Wed Sep 24, 2003 8:07 am
Location: Haute Garonne, France
x 4


Post by tuan kuranes »

Fast Triangle Order Optimization for Vertex Locality and Reduced Overdraw

The triangle reordering can give up to a 30% speedup and can be done at run time, so it can be optimized per GPU. (It does what ATI Tootle does, but at run-time speed.)

Accelerating Real-Time Shading with Reverse Reprojection Caching
Very simple to implement, yet it gives very nice FPS/quality improvements on complex shaders (soft shadows, motion blur, DOF, etc.).
Shadow007
Gremlin
Posts: 185
Joined: Sat May 07, 2005 3:27 pm

Re: Very interesting papers on large, not so hard FPS gain

Post by Shadow007 »

tuan kuranes wrote:Fast Triangle Order Optimization for Vertex Locality and Reduced Overdraw

The triangle reordering can give up to a 30% speedup and can be done at run time, so it can be optimized per GPU. (It does what ATI Tootle does, but at run-time speed.)

Accelerating Real-Time Shading with Reverse Reprojection Caching
Very simple to implement, yet it gives very nice FPS/quality improvements on complex shaders (soft shadows, motion blur, DOF, etc.).
I had seen the first one. It also relates to
Linear-Speed Vertex Cache Optimisation
http://home.comcast.net/~tom_forsyth/pa ... e_opt.html
which may be less optimized, but is also less hardware dependent.
I had thoughts (but absolutely NO time) of implementing one (or both) in MeshMagick ...
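
To give an idea of what the Forsyth-style approach involves: it greedily emits the triangle whose three vertices have the highest combined score, where a vertex scores well if it sits near the front of a modelled LRU cache and if few unemitted triangles still need it. Below is a minimal sketch of such a per-vertex score. The function name `vertexScore`, the constant names, and the constant values (including the modelled cache size of 32) are assumptions for illustration and should be treated as tunables, not as the article's exact code.

```cpp
#include <cassert>
#include <cmath>

// Tunables. The values are assumed starting points, not measured optima.
constexpr float kCacheDecayPower   = 1.5f;
constexpr float kLastTriScore      = 0.75f;
constexpr float kValenceBoostScale = 2.0f;
constexpr float kValenceBoostPower = 0.5f;
constexpr int   kModelCacheSize    = 32;   // size of the modelled LRU cache

// cachePosition: 0 = most recently used slot, negative = not in the cache.
// remainingTris: triangles using this vertex that have not been emitted yet.
float vertexScore(int cachePosition, int remainingTris)
{
    assert(cachePosition < kModelCacheSize);
    if (remainingTris <= 0)
        return -1.0f;  // no remaining triangle needs this vertex

    float score = 0.0f;
    if (cachePosition >= 0) {
        if (cachePosition < 3) {
            // Vertex was used by the most recent triangle: fixed score, so the
            // optimizer does not just keep extending one long strip.
            score = kLastTriScore;
        } else {
            // Score decays from 1 towards 0 the deeper the vertex sits.
            const float scaler = 1.0f / (kModelCacheSize - 3);
            score = std::pow(1.0f - (cachePosition - 3) * scaler,
                             kCacheDecayPower);
        }
    }
    // Bonus for vertices with few triangles left, so isolated leftover
    // triangles get cleaned up early instead of at the very end.
    score += kValenceBoostScale *
             std::pow(static_cast<float>(remainingTris), -kValenceBoostPower);
    return score;
}
```

Because the score only models a generic LRU cache rather than a specific GPU, the resulting order is not tuned to any one card, which is the "less hardware dependent" trade-off mentioned above.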

Post by tuan kuranes »

You would miss two important points of the first paper:

- It is not meant as an offline tool; it runs at run time, optimizing for the end user's video card (it has to find the card's vertex cache size). So it has to be run once on the end user's PC, and again each time they change their GPU.

- On top of producing a vertex-cache-aware index buffer, it adds another optimization that aims at reducing overdraw, which means faster rasterization, especially when using slow, complex pixel shaders: it computes an index order that minimizes overdraw. (No other vertex cache optimizer does that IMHO, except ATI Tootle, which is dead slow.)
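
To illustrate why the cache size matters, here is a minimal sketch that replays an index buffer through a simulated vertex cache and reports the average number of cache misses per triangle (often called ACMR). The function name `computeACMR`, the FIFO replacement policy, and the `cacheSize` parameter are assumptions for illustration; real GPUs may behave differently, and this is not the paper's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Average cache miss ratio: simulated vertex-cache misses per triangle.
// indices   : triangle list (three indices per triangle)
// cacheSize : assumed number of entries in the post-transform cache
double computeACMR(const std::vector<std::uint32_t>& indices,
                   std::size_t cacheSize)
{
    assert(cacheSize > 0);
    assert(indices.size() % 3 == 0);
    if (indices.empty())
        return 0.0;

    std::deque<std::uint32_t> fifo;  // oldest entry at the front
    std::size_t misses = 0;
    for (std::uint32_t idx : indices) {
        if (std::find(fifo.begin(), fifo.end(), idx) != fifo.end())
            continue;                // hit: a FIFO cache does not reorder
        ++misses;                    // miss: the vertex must be transformed
        fifo.push_back(idx);
        if (fifo.size() > cacheSize)
            fifo.pop_front();        // evict the oldest entry
    }
    return static_cast<double>(misses) / (indices.size() / 3);
}
```

The same index order can score very differently depending on `cacheSize`, which is why an order tuned for one card is not necessarily good on another. Comparing the value before and after reordering gives a quick check that the reordering actually helped for the assumed cache size.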

Post by Shadow007 »

The other killer app would be to optimize for a "composite" mesh (for example when adding a sword/helm to a character).

I thought of using the "forsyth" one in meshmagick at the production stage.
But you're right, it could also be computed as part of the "install part" or even (but that may be too lengthy) at load time ...

For the second part, the "Reduced Overdraw" package reordering could be added as a second part of the "Forsyth" approach.

I also noted that Nehab talked about an availability of the source ... but there is still none on the radar ... is there?

Edit: Finally, I'm not sure the "Reduced Overdraw" part would work well with low-poly models ...

Post by tuan kuranes »

The source is noted to be available "soon". On the website of another of the paper's authors, there is a PPT presentation that has screenshots of the overdraw code (along with nice perf graphs), and the tipsify code is nearly complete in the paper already.
Shadow007 wrote: I'm not sure the "Reduced Overdraw" part would work well with low-poly models
Well, except for models under 1k triangles, I think it's worth a try, as it comes with only a small loading cost... And 1k-triangle models are really pretty common now, I think.
Shadow007 wrote: The other killer app would be to optimize for a "composite" mesh (for example when adding a sword/helm to a character).
What do you mean? Like static geometry / instancing?

Post by Shadow007 »

tuan kuranes wrote: What do you mean? Like static geometry / instancing?
Not exactly sure myself but something like that yes ...

It may also be applicable to procedural content that could be optimized "on the fly".