Can someone explain ConstBufferPool / the new uniforms to me

white_waluigi
Goblin
Posts: 253
Joined: Sat Sep 28, 2013 3:46 pm
x 10

Post by white_waluigi »

I'm still working on the DeferredShading HLMS, and I'm starting to make progress (after 3 months, I know). I'm currently kind of stuck on uploading new uniforms or shader parameters to the shaders.
Currently I have 3 different shader types: GBuffer (for stuff that wants to be lit by DS), LightMaterial (for the light geometry) and Forward (for stuff that cannot be lit by DS: transparent objects, forward-rendered stuff from other HLMS, custom shaders, etc.). Each of these has its own class where its own properties and shader parameters are set.
And that's where I'm kind of stuck. I don't exactly understand how the new system works. In the old version it was a simple case of calling setNamedConstant.
I know that it is all uploaded into a float map, which is a list of uniform parameters that is then uploaded to the GPU.
What I don't get is how the shader can find the right value for the right variable, since the uniform buffers have no names.
Also, how are datablock values (roughness etc.) uploaded? They are never addressed in OgreHLMSPBS.cpp.

So how can I upload values to the Shader and find them again?

dark_sylinc
OGRE Team Member
Posts: 4211
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 802

Re: Can someone explain ConstBufferPool / the new uniforms t

Post by dark_sylinc »

See this thread where somebody asked a similar question: http://www.ogre3d.org/forums/viewtopic.php?f=25&t=84066
Basically ConstBufferPools are used to store material properties that do not change very often (e.g. diffuse colour, specular colour, fresnel settings, alpha transparency)

Then see this thread http://www.ogre3d.org/forums/viewtopic.php?f=25&t=83763. Particularly my replies. They explain how to use a per-pass ConstBufferPacked (not -Pools) to store data that changes every frame.

It also explains the drawID: I suspect this is what you're missing. The drawID is used to find the material properties & transform in the constant buffers for the individual object we're drawing:
The value of drawID is based on the return value of fillBuffersFor (which is why that return value matters so much), and you need to keep it contiguous: 0, 1, 2, 3, 4, ... 4095. For every API draw call, it starts from the return value of fillBuffersFor and is then incremented by 1 for every instance the auto-instancing system can batch together, irrespective of what fillBuffersFor returns for those auto-instanced Renderables. If you need fillBuffersFor's return value to always be respected in drawId, you must insert a dummy command to trick Ogre into breaking the auto-instancing.
TL;DR: From a 1000-mile view, we basically have three buffers: one with data that changes infrequently and two with data that changes every frame. Plus a drawID supplied by Ogre.
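
The drawID rule quoted above can be sketched on the CPU like this. It is only an illustration: drawIdsForCall and kMaxDrawIds are made-up names, not Ogre API, and the 4096 bound is an assumption taken from the "0 ... 4095" range in the quote.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumption: upper bound taken from the "0 ... 4095" range quoted above.
constexpr std::uint32_t kMaxDrawIds = 4096u;

// Hypothetical helper (not Ogre API): lists the drawId each instance of one
// API draw call would see. The first instance gets the base value (the
// return value of fillBuffersFor), every further auto-instanced Renderable
// gets the previous value + 1.
inline std::vector<std::uint32_t> drawIdsForCall( std::uint32_t baseDrawId,
                                                  std::uint32_t numInstances )
{
    assert( baseDrawId + numInstances <= kMaxDrawIds );
    std::vector<std::uint32_t> ids;
    ids.reserve( numInstances );
    for( std::uint32_t i = 0; i < numInstances; ++i )
        ids.push_back( baseDrawId + i );
    return ids;
}
```

So a draw call whose first Renderable returned 7 and which batches three instances would see drawIds 7, 8 and 9, regardless of what fillBuffersFor returned for the second and third Renderable.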

Inside every shader, we grab the data like this:

Code: Select all

void main()
{
    // Per-draw transform, indexed directly by the drawId that Ogre supplies.
    float4x4 worldViewProjMatrix = changesEveryFrameBuffer0[in.drawId];
    // Per-draw packed value; its low 9 bits hold the material ID.
    uint materialIdx = changesEveryFrameBuffer1[in.drawId] & 0x1FFu;
    // Material properties live in the buffer that is updated infrequently.
    float4 diffuseColour = changesOccasionally[materialIdx].diffuse;
}
Note we do '& 0x1FF' on the value fetched with in.drawId because that's just a convention used by the PBS Hlms implementation. The first 9 bits are used to store the material ID. The rest of the bits are used to store an offset to the skeleton bones in the const buffers (only used when skeleton animation is enabled).
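
The bit layout described above (low 9 bits for the material ID, remaining bits for the bone offset) can be sketched on the CPU side roughly like this. The function and constant names are made up for illustration and are not part of Ogre.

```cpp
#include <cassert>
#include <cstdint>

// 9 bits for the material ID gives the 0x1FF mask used in the shader.
constexpr std::uint32_t kMaterialBits = 9u;
constexpr std::uint32_t kMaterialMask = ( 1u << kMaterialBits ) - 1u; // 0x1FF

// Hypothetical helper: packs a material ID and a skeleton bone offset
// into one 32-bit value, material ID in the low bits.
inline std::uint32_t packIndex( std::uint32_t materialId, std::uint32_t boneOffset )
{
    assert( materialId <= kMaterialMask );              // must fit in 9 bits
    assert( boneOffset <= ( 0xFFFFFFFFu >> kMaterialBits ) ); // must fit in 23 bits
    return ( boneOffset << kMaterialBits ) | materialId;
}

// Mirrors the shader-side '& 0x1FFu'.
inline std::uint32_t unpackMaterialId( std::uint32_t packed )
{
    return packed & kMaterialMask;
}

// Everything above the low 9 bits is the bone offset.
inline std::uint32_t unpackBoneOffset( std::uint32_t packed )
{
    return packed >> kMaterialBits;
}
```

With this layout the material ID can never exceed 511, which is where the 512-material ceiling comes from.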

That means we can only address up to 512 unique materials per batch. If you need to address material #513, it will need a different const buffer (so material #513 becomes #0).
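
To illustrate the "material #513 becomes #0" wrap-around, here is a minimal sketch that splits a zero-based global material index into a buffer index and a slot inside that buffer. This is not how ConstBufferPool assigns slots internally (it manages that itself); PoolSlot and locateMaterial are hypothetical names used only to show the arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: which const buffer, and which slot inside it.
struct PoolSlot
{
    std::size_t poolIdx;
    std::size_t slotIdx;
};

// Splits a zero-based global material index into (buffer, slot), assuming
// every buffer holds exactly materialsPerPool entries.
inline PoolSlot locateMaterial( std::size_t globalIdx, std::size_t materialsPerPool )
{
    assert( materialsPerPool > 0 );
    return PoolSlot{ globalIdx / materialsPerPool, globalIdx % materialsPerPool };
}
```

With 512 materials per buffer, the 513th material (zero-based index 512) lands in buffer 1 at slot 0, so the shader still only sees indices in the 0 to 511 range.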
In C++ code, the Hlms implementation ensures that happens correctly with this snippet:

Code: Select all

if( mLastBoundPool != datablock->getAssignedPool() )
{
    //layout(binding = 1) uniform MaterialBuf {} materialArray
    const ConstBufferPool::BufferPool *newPool = datablock->getAssignedPool();
    *commandBuffer->addCommand<CbShaderBuffer>() =
        CbShaderBuffer( PixelShader, 1, newPool->materialBuffer, 0,
                        newPool->materialBuffer->getTotalSizeBytes() );
    mLastBoundPool = newPool;
}
Note that in practice the PBS implementation allows up to 273 materials per batch (not 512), since otherwise it exceeds the 64kb limit common to many GPUs and APIs. This limit is calculated by the ConstBufferPool based on the size of each element in the array:

Code: Select all

HlmsPbs::HlmsPbs( Archive *dataFolder, ArchiveVec *libraryFolders ) :
	HlmsBufferManager( HLMS_PBS, "pbs", dataFolder, libraryFolders ),
	ConstBufferPool( HlmsPbsDatablock::MaterialSizeInGpuAligned,
					 ConstBufferPool::ExtraBufferParams() ),
Internally it will perform something like floor( 65536 / HlmsPbsDatablock::MaterialSizeInGpuAligned ). Note: Don't assume the limit is always 64kb.
The ConstBufferPool calculates the limit via this code:

Code: Select all

mBufferSize = std::min<size_t>( _mVaoManager->getConstBufferMaxSize(), 64 * 1024 );
In other words, it can be less than 64kb, but it won't be bigger with the current hardcoded limits (note that getConstBufferMaxSize may return 2GB!!!)
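
Putting the clamp and the division together, the per-buffer material count can be sketched as below. materialsPerBuffer is a made-up name, and the 240-byte element size used in the example afterwards is an assumption picked only because it reproduces the 273 figure mentioned above; check HlmsPbsDatablock::MaterialSizeInGpuAligned in your Ogre version for the real value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many fixed-size material entries fit in one
// const buffer, given the size the API reports and the hardcoded 64 KB cap.
inline std::size_t materialsPerBuffer( std::size_t constBufferMaxSize,
                                       std::size_t materialSizeAligned )
{
    assert( materialSizeAligned > 0 );
    // Same clamp as the ConstBufferPool line quoted above.
    const std::size_t bufferSize =
        std::min<std::size_t>( constBufferMaxSize, 64 * 1024 );
    // Integer division is the floor() from the text.
    return bufferSize / materialSizeAligned;
}
```

For example, with the assumed 240-byte material, a driver reporting 2GB still yields 65536 / 240 = 273 materials per buffer, while a driver reporting only 16 KB would yield 68.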

I hope this sheds some light, since you seem confused. This change wasn't made just "in the name of performance". It's how modern GPUs work, and how graphics rendering is being performed by other modern engines.

Remember, from the above: think of it as just sending a couple of pointers, plus an index to know where to read.
On the C++ side, "sending a couple of pointers" means binding a ConstBufferPacked or TexBufferPacked with a specific range (e.g. we may want to bind byte offsets 128 through 512 instead of the whole buffer). Most of the confusing parts are figuring out which buffer we need to bind, plus a few extra hoops to make sure the code can run on all GPUs (e.g. watching out for buffer limits and alignment requirements).
If we were to target GCN hardware alone on consoles, this would be super easy (just send one mega huge buffer with byte offsets!); but instead we need to categorize between ConstBufferPacked & TexBufferPacked (see my blog post), and make sure the indexes are in the right format (e.g. to address a float4 array, send the byte offset divided by 16 and make sure it's always zero-based relative to the start of the buffer... that was a giant PITA, but it's taken care of automatically by ConstBufferPool).
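
The "byte offset divided by 16, zero-based" conversion mentioned above could look roughly like this. It is a sketch of the arithmetic only; float4IndexFromByteOffset is a hypothetical name, and ConstBufferPool does the equivalent work for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A float4 is 4 floats of 4 bytes each.
constexpr std::size_t kFloat4Size = 16;

// Hypothetical helper: converts an absolute byte offset inside a buffer into
// a float4 array index relative to the start of the bound range.
inline std::uint32_t float4IndexFromByteOffset( std::size_t byteOffset,
                                                std::size_t boundRangeStart )
{
    assert( byteOffset >= boundRangeStart );  // must lie inside the bound range
    const std::size_t relative = byteOffset - boundRangeStart;
    assert( relative % kFloat4Size == 0 );    // must sit on a float4 boundary
    return static_cast<std::uint32_t>( relative / kFloat4Size );
}
```

For a range bound at byte 128, data at byte 128 is element 0 and data at byte 512 is element 24, which is the index the shader would use.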

white_waluigi
Goblin
Posts: 253
Joined: Sat Sep 28, 2013 3:46 pm
x 10

Re: Can someone explain ConstBufferPool / the new uniforms t

Post by white_waluigi »

Thanks, this helps a lot.
