From a 10,000 km view, auto instancing reduces to this:
Code:
// We have this:
for( int i = 0; i < numObjects; ++i )
{
    drawCall( renderable[i] );
}
// We want to turn it into this:
drawCalls( renderable, numObjects );
The way we achieve this is conceptually very simple:
Code:
lastDrawnObjStart = 0;
numObjsToDraw = 0;
for( int i = 0; i < numObjects; ++i )
{
    if( needsMoreCommands[i] )
    {
        // Draw everything accumulated so far
        drawCalls( &renderable[lastDrawnObjStart], numObjsToDraw );
        // The next batch starts at the current object
        lastDrawnObjStart = i;
        numObjsToDraw = 0;
        // Issue the command that broke the batch (e.g. set a new shader)
        setShader( renderable[i] );
        // Keep accumulating
        ++numObjsToDraw;
    }
    else
    {
        // Keep accumulating
        ++numObjsToDraw;
    }
}
// Draw everything accumulated so far
drawCalls( &renderable[lastDrawnObjStart], numObjsToDraw );
That is, we accumulate objects to render until we find something that breaks the instancing; then we submit what we have so far, issue the breaking command, and start accumulating again.
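To make the accumulate-and-flush loop concrete, here is a minimal, self-contained sketch. It is not Ogre code: Renderable, shaderId and buildBatches are hypothetical names, and a single integer stands in for all the state that could break a batch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a renderable (not Ogre's class).
struct Renderable
{
    int shaderId; // assumption: one integer summarises all non-draw state
};

// Returns one (start, count) pair per batch, i.e. per drawCalls() submission.
std::vector<std::pair<std::size_t, std::size_t>> buildBatches(
    const std::vector<Renderable> &renderables )
{
    std::vector<std::pair<std::size_t, std::size_t>> batches;
    std::size_t start = 0;
    for( std::size_t i = 1; i < renderables.size(); ++i )
    {
        // A state change plays the role of needsMoreCommands[i]
        if( renderables[i].shaderId != renderables[i - 1].shaderId )
        {
            assert( i > start ); // every batch holds at least one object
            batches.emplace_back( start, i - start );
            start = i; // the next batch starts at the breaking object
        }
    }
    // Flush whatever is still accumulated
    if( !renderables.empty() )
        batches.emplace_back( start, renderables.size() - start );
    return batches;
}
```

With shader ids 0, 0, 1, 1, 1, 0 this yields three batches: two objects, then three, then one.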
Now, what you are asking is: what causes needsMoreCommands[i] to be true?
Well... anything that isn't a draw call:
- Any change in HlmsBlendblock
- Any change in HlmsMacroblock
- Any change in HlmsSamplerblock
- Needing a different shader
- Any change in vertex format
- Binding new const / texture buffers
- Any change in texture pools
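The list above can be read as a predicate over the state of two consecutive renderables. The sketch below is only an illustration: BoundState and its integer ids are hypothetical and do not match Ogre's real members. Note that the mesh id is deliberately left out of the comparison.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical snapshot of the state one renderable needs bound.
// Field names mirror the list above; they are not Ogre's real members.
struct BoundState
{
    std::uint32_t blendblockId;
    std::uint32_t macroblockId;
    std::uint32_t samplerblockId;
    std::uint32_t shaderId;
    std::uint32_t vertexFormatId;
    std::uint32_t constBufferId;
    std::uint32_t texturePoolId;
    std::uint32_t meshId; // deliberately ignored by the comparison below
};

// True if going from prev to next needs a command other than a draw.
bool needsMoreCommands( const BoundState &prev, const BoundState &next )
{
    auto key = []( const BoundState &s ) {
        return std::tie( s.blendblockId, s.macroblockId, s.samplerblockId,
                         s.shaderId, s.vertexFormatId, s.constBufferId,
                         s.texturePoolId );
    };
    return key( prev ) != key( next );
}
```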
Thanks to AZDO techniques such as MultiDraw, we don't usually need to break auto instancing for different meshes or vertex buffers.
As for textures, we aggressively use AutomaticBatching, where we put lots of textures together into one pool (i.e. externally we pretend each TextureGpu is a standalone Type2D, when internally it is actually a slice of a Type2DArray), so that we don't have to issue setTexture() calls frequently, which increases the chances of auto instancing.
If two Items use different textures that live in the same pool, auto instancing won't be broken, because internally the same texture is still bound.
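As a rough illustration of why pooling helps, here is a sketch with a hypothetical PooledTexture struct (not Ogre's TextureGpu). It assumes the slice index travels with the draw as plain data, so only a change of pool forces a new bind.

```cpp
#include <cstdint>

// Hypothetical, simplified view of a pooled texture.
struct PooledTexture
{
    std::uint32_t poolId;     // which array texture is actually bound
    std::uint32_t sliceIndex; // which slice of that array this texture is
};

// Only a pool change requires a new texture bind, and therefore a break.
// A different slice within the same pool is assumed to be per-draw data.
bool needsTextureRebind( const PooledTexture &prev, const PooledTexture &next )
{
    return prev.poolId != next.poolId;
}
```

Two textures in pool 3 at slices 0 and 7 compare as "no rebind", while moving from pool 3 to pool 4 does require one.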
In short, every time we need to call commandBuffer->addCommand<T>(), the instancing will be broken up.
If you are writing your own HlmsPbs modification, you need to be careful not to call commandBuffer->addCommand<T>() on every fillBuffersFor call.
We use key sorting in OgreRenderQueue.cpp (as long as the sort mode for that particular queue isn't DisableSort) to group everything by state as much as possible and thus maximize the chance of auto instancing (e.g. all Items using the same shader should end up grouped together). Transparents are the hardest case, because sorting back to front takes priority over grouping by state.
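The effect of sorting by state can be sketched as follows. This is not the key layout used in OgreRenderQueue.cpp: QueuedItem, stateKey, sortByState and countStateChanges are hypothetical names, and a single 64-bit integer stands in for the whole sort key.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical queue entry; stateKey stands in for the packed sort key.
struct QueuedItem
{
    std::uint64_t stateKey;
    int id;
};

// Counts how many times the state changes between consecutive items,
// i.e. how many times auto instancing would be broken.
std::size_t countStateChanges( const std::vector<QueuedItem> &items )
{
    std::size_t changes = 0;
    for( std::size_t i = 1; i < items.size(); ++i )
    {
        if( items[i].stateKey != items[i - 1].stateKey )
            ++changes;
    }
    return changes;
}

// Groups items with equal state next to each other.
// stable_sort keeps the submission order within each group.
void sortByState( std::vector<QueuedItem> &items )
{
    std::stable_sort( items.begin(), items.end(),
                      []( const QueuedItem &a, const QueuedItem &b ) {
                          return a.stateKey < b.stateKey;
                      } );
}
```

For keys 2, 1, 2, 1, 2 the unsorted queue changes state four times; after sortByState it changes only once.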