It's perfect for a community contribution indeed.
PSOs basically contain everything condensed into one big object. Macroblock & Blendblock information, stencil test params, vertex input layout, even render target information (including MRT count, MSAA settings, formats of each MRT, depth buffer format).
From an engine design point of view, this means PSOs should be created at load time. From an API perspective, PSOs open up a lot of performance optimizations for the driver: it can see everything that can and will be used (and how each piece relates to the others), perform heavy optimizations, and encapsulate the optimized state in the PSO.
However, for immediate-mode style rendering (very common in most GUIs out there), this paradigm sucks (unless the GUI has been designed with PSOs in mind).
Ogre (and therefore the Hlms) separates the PSO into two segments: PSO data that rarely changes (such as RTT formats, depth buffer format, stencil test settings), and PSO data that may change often (macroblock, blendblock, vertex layout, and shaders).
For that, Ogre has HlmsPso and HlmsPassPso (HlmsPassPso holds the rarely-changing data and is part of HlmsPso).
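To make the split concrete, here is a rough sketch of the two segments. PassPartSketch and PsoSketch are stand-in types I made up for illustration; the real HlmsPassPso and HlmsPso have more fields and different names, and the limit of 8 colour targets is just an assumption.

```cpp
#include <array>
#include <cstdint>

// Stand-in for the rarely-changing segment (render target and stencil setup).
// Hypothetical layout, not the real HlmsPassPso.
struct PassPartSketch
{
    std::array<int, 8> colourFormats{};  // one format id per MRT slot (8 is assumed)
    int depthFormat = 0;
    std::uint8_t numColourTargets = 0;
    std::uint8_t msaaSamples = 1;
    std::uint32_t stencilRefValue = 0;

    bool operator==( const PassPartSketch &other ) const
    {
        return colourFormats == other.colourFormats &&
               depthFormat == other.depthFormat &&
               numColourTargets == other.numColourTargets &&
               msaaSamples == other.msaaSamples &&
               stencilRefValue == other.stencilRefValue;
    }
};

// Stand-in for the full PSO description: the frequently-changing state
// plus the embedded pass part. Hypothetical layout, not the real HlmsPso.
struct PsoSketch
{
    const void *macroblock = nullptr;
    const void *blendblock = nullptr;
    const void *vertexShader = nullptr;
    const void *pixelShader = nullptr;
    int vertexLayout = 0;  // hypothetical id for a vertex declaration
    PassPartSketch pass;
};

// True when two PSO descriptions only differ in their per-object state.
inline bool samePass( const PsoSketch &a, const PsoSketch &b )
{
    return a.pass == b.pass;
}
```

The point is that everything in the pass part can be filled once per render target, while only the outer fields change from object to object.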
My idea is that a cache class should work something like this:
PsoCache cache; //This is persistent, not a local variable

void render()
{
    cache.clear();
    cache.setRenderTarget( renderTarget );
    cache.setStencilSettings( stencilParams );

    foreach( object_to_render )
    {
        cache.setVertexFormat( vertexElements, operationType, enablePrimitiveRestart );
        cache.setMacroblock( macroblock );
        cache.setBlendblock( blendblock );
        cache.setShaders( ... );

        HlmsPso *pso = cache.getPso();
        renderSystem->_setPipelineStateObject( pso );
    }
}
getPso() would check whether the cache state is dirty. If it's not, it just returns the same PSO as before. If it is, it looks for an already-created HlmsPso in its cache; if none matches, it creates a new one (by calling _hlmsPipelineStateObjectCreated; when destroying the entire cache, don't forget to call _hlmsPipelineStateObjectDestroyed).
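The getPso() logic could be sketched roughly like this. It's a minimal, self-contained version under heavy assumptions: PsoKey stands in for HlmsPso and only has three fields, PsoCacheSketch and all of its members are hypothetical names, and the lookup is a plain linear search. A real cache would call the RenderSystem at the spot marked in the comment.

```cpp
#include <cstddef>
#include <deque>

// Stand-in for HlmsPso, reduced to three fields for illustration.
struct PsoKey
{
    const void *macroblock = nullptr;
    const void *blendblock = nullptr;
    int vertexFormat = 0;  // hypothetical id for a vertex layout

    bool operator==( const PsoKey &other ) const
    {
        return macroblock == other.macroblock &&
               blendblock == other.blendblock &&
               vertexFormat == other.vertexFormat;
    }
};

class PsoCacheSketch
{
public:
    void setMacroblock( const void *macroblock )
    {
        if( mCurrent.macroblock != macroblock )
        {
            mCurrent.macroblock = macroblock;
            mDirty = true;
        }
    }

    void setBlendblock( const void *blendblock )
    {
        if( mCurrent.blendblock != blendblock )
        {
            mCurrent.blendblock = blendblock;
            mDirty = true;
        }
    }

    const PsoKey *getPso()
    {
        if( !mDirty && mLast )
            return mLast;  // nothing changed since the previous call

        mDirty = false;

        // Dirty: look for an entry that was already created.
        for( const PsoKey &entry : mEntries )
        {
            if( entry == mCurrent )
                return mLast = &entry;
        }

        // Not cached yet: create it. A real cache would call
        // _hlmsPipelineStateObjectCreated here.
        mEntries.push_back( mCurrent );
        ++mNumCreated;
        mLast = &mEntries.back();
        return mLast;
    }

    std::size_t getNumCreated() const { return mNumCreated; }

private:
    PsoKey mCurrent;
    std::deque<PsoKey> mEntries;  // deque: push_back keeps old pointers valid
    const PsoKey *mLast = nullptr;
    bool mDirty = true;
    std::size_t mNumCreated = 0;
};
```

I used std::deque so pointers handed out earlier stay valid when new entries are appended. With many PSOs, a hash map keyed on the state would probably be a better fit than the linear search.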
Naturally, the dev using the cache should optimize as much as possible (e.g. if the vertex format is always the same, call cache.setVertexFormat outside the loop).
Also, if the user provides macroblocks and blendblocks as pointers obtained from HlmsManager, checking whether they're different is just a pointer compare, e.g. if( oldBlendblock != newBlendblock ) mDirty = true;
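The pointer compare only works if identical blocks always come back as the same pointer, which is how I understand HlmsManager to behave. Here is a small sketch of that idea; BlendDesc, BlockPool and BlendState are made-up stand-ins, not Ogre classes.

```cpp
#include <deque>

// Stand-in for a blendblock description (hypothetical fields).
struct BlendDesc
{
    int srcFactor = 0;
    int dstFactor = 0;

    bool operator==( const BlendDesc &other ) const
    {
        return srcFactor == other.srcFactor && dstFactor == other.dstFactor;
    }
};

// Hypothetical pool that hands out one shared pointer per unique description.
class BlockPool
{
public:
    const BlendDesc *get( const BlendDesc &desc )
    {
        for( const BlendDesc &existing : mBlocks )
        {
            if( existing == desc )
                return &existing;  // same params, same pointer
        }
        mBlocks.push_back( desc );
        return &mBlocks.back();
    }

private:
    std::deque<BlendDesc> mBlocks;  // deque: push_back keeps old pointers valid
};

// The cache side: no field-by-field comparison, only a pointer compare.
struct BlendState
{
    const BlendDesc *blendblock = nullptr;
    bool dirty = false;

    void set( const BlendDesc *newBlendblock )
    {
        if( blendblock != newBlendblock )
        {
            blendblock = newBlendblock;
            dirty = true;
        }
    }
};
```

The expensive comparison happens once, when the block is requested from the pool; after that, the per-object check in the render loop is a single pointer compare.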
Overall it's simple, and it would greatly simplify porting GUI tools (Gorilla, CEGUI, etc.) to Ogre 2.1-pso.