Renderdoc made me ponder my custom HLMS...

Post by Hrenli » Tue Nov 27, 2018 6:18 pm


This is a bit of a silly/complicated question, sorry. :) While looking at the RenderDoc output of my scenes I noticed something I don't understand. Maybe someone can explain it to me.

I have a scene with two very similar sets of generated meshes (one for ground and one for water). Each set is assigned a single material. I use two different custom HLMS implementations for them, and what I noticed is that for the ground all draws are grouped under one glMultiDrawElementsIndirect(93), while for the water there is a long list of separate glMultiDrawElementsIndirect calls. If I change the water material to something else (PBS or my other "ground" HLMS), the water draws become grouped under one common call too...

So it's really something on the material/datablock side... Why does Ogre send one set of items sharing a material as a single group, and the same set with another material as separate calls? What does it depend on?

I am not even sure it affects performance, but it bothered me enough to spend some extra time in the debugger, and I still don't get it. And I want to understand! :)

Re: Renderdoc made me ponder my custom HLMS...

Post by Hrenli » Wed Nov 28, 2018 11:22 am

OK, just by blind experimenting I think I have narrowed it down to the use of reflections and related features.

But I still don't feel like I really understand how Ogre groups materials and renderables etc... Is there a description of Ogre's 2+ pipeline apart from the porting manual?

OGRE Expert User

Re: Renderdoc made me ponder my custom HLMS...

Post by al2950 » Wed Nov 28, 2018 2:00 pm

There is not much documentation on this, but the code in OgreRenderQueue.cpp is fairly good and clear:

        uint64 hash;
        if( !transparent )
            //Opaque objects are first sorted by material, then by mesh, then by depth front to back.
            hash =
            OGRE_RQ_HASH( subId,            RqBits::SubRqIdBits,        RqBits::SubRqIdShift )      |
            OGRE_RQ_HASH( transparent,      RqBits::TransparencyBits,   RqBits::TransparencyShift ) |
            OGRE_RQ_HASH( macroblock,       RqBits::MacroblockBits,     RqBits::MacroblockShift )   |
            OGRE_RQ_HASH( hlmsHash,         RqBits::ShaderBits,         RqBits::ShaderShift )       |
            OGRE_RQ_HASH( meshHash,         RqBits::MeshBits,           RqBits::MeshShift )         |
            OGRE_RQ_HASH( texturehash,      RqBits::TextureBits,        RqBits::TextureShift )      |
            OGRE_RQ_HASH( quantizedDepth,   RqBits::DepthBits,          RqBits::DepthShift );

In an ideal world, for optimal performance, everything would use the same shader and the same buffers (mesh and textures). This is what Ogre tries to achieve by putting multiple meshes into a single vertex buffer and textures into a texture array. In practice this rarely happens completely, but changing GPU pipeline state (e.g. pixel shaders or bound textures) can be expensive and can cause GPU stalls, so Ogre orders its draw calls to minimise such changes. As the code comment above says, opaque objects are sorted by material, then by mesh, then by depth. The old system just sorted by depth to reduce pixel overdraw, but it's now more efficient to reduce pipeline state changes and let the GPU crunch numbers. Also, if overdraw is an issue, you can just use a depth pre-pass.

The code is actually a bit more detailed than that comment, and orders by:
- macroblock (material-specific values)
- hlmsHash (the shader: vertex, pixel, etc.)
- meshHash (the vertex buffer)
- texturehash (the texture buffer state)
- quantizedDepth (depth order)

I am pretty sure the above is correct, but it may need dark_sylinc's stamp of approval!
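
To illustrate why a single packed 64-bit key gives the "material, then mesh, then depth" order, here is a minimal standalone sketch. It is not Ogre's code: the bit widths in SketchBits, the DrawKey struct and the helper names are all hypothetical, and Ogre's real values live in RqBits. The only point is that fields placed in higher bits dominate the comparison, so a plain integer sort groups draws by the most expensive state first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bit layout (assumption for this sketch, NOT Ogre's RqBits values).
// Higher shift == higher sorting priority.
namespace SketchBits
{
    constexpr uint32_t DepthBits = 16, DepthShift = 0;
    constexpr uint32_t TextureBits = 12, TextureShift = DepthShift + DepthBits;
    constexpr uint32_t MeshBits = 12, MeshShift = TextureShift + TextureBits;
    constexpr uint32_t ShaderBits = 12, ShaderShift = MeshShift + MeshBits;
    constexpr uint32_t MacroblockBits = 10, MacroblockShift = ShaderShift + ShaderBits;
}

// Masks 'value' to 'bits' bits and moves it to its slot in the key.
inline uint64_t packField( uint64_t value, uint32_t bits, uint32_t shift )
{
    assert( bits < 64u && shift + bits <= 64u );
    return ( value & ( ( uint64_t( 1 ) << bits ) - 1u ) ) << shift;
}

// Hypothetical per-draw identifiers; each stands in for one of the hashes above.
struct DrawKey
{
    uint32_t macroblock, shader, mesh, texture, depth;
};

inline uint64_t makeSortKey( const DrawKey &d )
{
    using namespace SketchBits;
    return packField( d.macroblock, MacroblockBits, MacroblockShift ) |
           packField( d.shader, ShaderBits, ShaderShift ) |
           packField( d.mesh, MeshBits, MeshShift ) |
           packField( d.texture, TextureBits, TextureShift ) |
           packField( d.depth, DepthBits, DepthShift );
}

// Sorting by the packed key orders by macroblock, then shader, mesh, texture, depth.
inline void sortByKey( std::vector<DrawKey> &draws )
{
    std::sort( draws.begin(), draws.end(), []( const DrawKey &a, const DrawKey &b ) {
        return makeSortKey( a ) < makeSortKey( b );
    } );
}
```

With this layout, two draws that differ only in depth end up next to each other, while a draw with a different macroblock is moved to a different part of the queue no matter how close it is to the camera.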

OGRE Team Member

Re: Renderdoc made me ponder my custom HLMS...

Post by dark_sylinc » Wed Nov 28, 2018 5:39 pm


al2950's post is a high-level overview, and it is correct. I would point out, though, that shader switching is not always bad, especially if we're switching from an expensive shader to a cheap one.

From a much lower level perspective:
For us to put two draws in the same glMultiDraw call, ALL of the following conditions must be met:
  1. The same shader and PSO settings must be used. That means Hlms::getMaterial returns the same pointer as the previous call
  2. They live in the same VAO. Most of the time this means both draws have the exact same vertex definition. However, two meshes may end up with different VAOs because you're using so much memory that they land in different pools of vertex buffers
  3. mCommandBuffer->addCommand() was not called while inside Hlms::fillBuffersForVX
    • Most of the time this happens because the materials have different texture sets. We try to keep everything within the same set via Texture 2D arrays. But if you're constantly using textures with different resolutions or formats (or for some reason they end up in a different 2D array pool), the arrays' effectiveness diminishes. See the "Watching out for memory consumption" section of the Porting manual on how to debug textures. The more homogeneous your textures, the better
    • The material lives in a different ConstBufferPool. This is only relevant if you have more than 256 materials. Should be rare
    • The last draw used a different type of Hlms (e.g. it used Unlit instead of Pbs)
    • We ran out of space in the buffers, and need to bind a new one
    • More reasons... just search OgreHlmsPbs.cpp for addCommand calls
However this is not chaotic. What al2950 said about RenderQueue means that we group together objects that will use the same texture sets, the same shaders, the same VAOs, the same types of Hlms.
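
The three conditions above can be sketched as a small standalone simulation. This is only an illustration under stated assumptions, not Ogre's implementation: QueuedDraw and groupIntoBatches are hypothetical names, the pso pointer stands in for what Hlms::getMaterial returns, vaoId stands in for the VAO, and addedCommand stands in for "addCommand() was called while filling buffers for this draw".

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one queued draw (assumption for this sketch).
struct QueuedDraw
{
    const void *pso;          // stand-in for the shader + PSO pointer
    uint32_t    vaoId;        // stand-in for the VAO the mesh lives in
    bool        addedCommand; // true if an extra command (e.g. a texture bind) was needed
};

// Walks the already-sorted draws and returns how many draws land in each
// multi-draw call. A draw joins the previous call only if it has the same PSO,
// the same VAO, and no command had to be inserted before it.
inline std::vector<size_t> groupIntoBatches( const std::vector<QueuedDraw> &draws )
{
    std::vector<size_t> batches;
    const QueuedDraw *prev = nullptr;
    for( const QueuedDraw &d : draws )
    {
        const bool canMerge =
            prev && d.pso == prev->pso && d.vaoId == prev->vaoId && !d.addedCommand;
        if( canMerge )
            ++batches.back();
        else
            batches.push_back( 1u );
        prev = &d;
    }
    return batches;
}
```

In this model, a material whose draws each need a new texture binding (addedCommand set on every draw) yields one batch per draw, which would look like the long list of separate glMultiDrawElementsIndirect calls described in the first post.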

If you suspect the reflections are the problem, then it is likely a problem with the textures, which is the most common reason to break up the draws.