See the Ogre 2.1+ FAQ for resources on implementing custom Hlms.
The data that is sent every frame for every object can be split into multiple parts:
Every frame
// mat4 worldViewProj
Matrix4 tmp =
    mPreparedPass.viewProjMatrix[mUsingInstancedStereo ? 4u : useIdentityProjection] * worldMat;
memcpy( currentMappedTexBuffer, &tmp, sizeof( Matrix4 ) );
currentMappedTexBuffer += 16;
That fills currentMappedTexBuffer with 16 floats (the 4x4) and advances the pointer.
You can see that HlmsPbs sends both worldViewProj and worldView, so it does this twice.
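To make the "write 16 floats, advance the pointer, do it twice" pattern concrete, here is a minimal standalone sketch. It does not use Ogre types: Mat4, writeMat4 and writeObjectData are hypothetical stand-ins made up for illustration, not functions from HlmsPbs or HlmsUnlit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for Ogre::Matrix4: 16 tightly packed floats.
struct Mat4
{
    float m[16];
};
static_assert( sizeof( Mat4 ) == 16 * sizeof( float ), "Mat4 must be 16 packed floats" );

// Copies one 4x4 matrix into the mapped buffer and returns the advanced write pointer.
float *writeMat4( float *dst, const Mat4 &mat )
{
    std::memcpy( dst, &mat, sizeof( Mat4 ) );
    return dst + 16;  // 16 floats were consumed
}

// Writes worldViewProj followed by worldView, i.e. 32 floats per object.
float *writeObjectData( float *dst, const Mat4 &worldViewProj, const Mat4 &worldView )
{
    dst = writeMat4( dst, worldViewProj );
    dst = writeMat4( dst, worldView );
    return dst;
}
```

After one call to writeObjectData the write pointer has moved 32 floats forward, which is the amount the buffer-size check has to account for.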
But earlier you can spot this code:
const size_t minimumTexBufferSize = 16;
bool exceedsTexBuffer = static_cast<size_t>( currentMappedTexBuffer - mStartMappedTexBuffer ) +
                            minimumTexBufferSize >=
                        mCurrentTexBufferSize;
This anticipates how much data we are about to write. If the buffer is not big enough, it creates a new one (or recycles a discarded one) and maps it.
In your case, since you want to send two matrices instead of 1, you want minimumTexBufferSize = 32 instead of 16.
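As a rough sketch of why 32 matters, the check can be pulled out into a free function and tried with both values. This assumes all sizes are counted in floats; the function name and parameters are hypothetical, only the comparison itself mirrors the snippet quoted above.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when writing `minimumSize` more floats would reach or pass the
// end of the buffer, using the same >= comparison as the quoted code.
// Hypothetical helper for illustration; sizes are assumed to be in floats.
bool exceedsTexBuffer( const float *current, const float *start, size_t minimumSize,
                       size_t bufferSize )
{
    assert( current >= start );
    return static_cast<size_t>( current - start ) + minimumSize >= bufferSize;
}
```

With a 64-float buffer that already holds 32 floats, a minimum of 16 reports there is still room, while a minimum of 32 reports the buffer as exceeded. If you write 32 floats per object but keep the minimum at 16, the check can pass and the second matrix ends up written past the end of the mapped region.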
Binding
This is handled by rebindTexBuffer( commandBuffer ); or mapNextTexBuffer( ... ).
The binding slot is hardcoded to slot R0 (t0 in HLSL, SSBO #0 in Vulkan and modern GL, TBO #0 in very old OpenGL HW).
Shader Code
See declarations:
ReadOnlyBufferF( 0, float4, worldMatBuf ); // GLSL
ReadOnlyBuffer( 0, float4, worldMatBuf ); // HLSL
device const float4 *worldMatBuf [[buffer(TEX_SLOT_START+0)]] // Metal
The macro ReadOnlyBufferF deals with declaring the variable for different HW support. Ideally we want to use readonly SSBOs, but when that's not supported, we fall back to TBOs. Additionally, the macro UNPACK_MAT4 uses different code for reading SSBOs and TBOs because the shader syntax is different.
The data is unpacked with a macro:
float4x4 worldViewProj = UNPACK_MAT4( worldMatBuf, finalDrawId );
This is the equivalent of writing float4x4 worldViewProj = worldMatBuf[finalDrawId] in C++.
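Since worldMatBuf is declared as a buffer of float4 while each object writes 16 floats, one way to picture the unpacking is "read four consecutive float4 entries starting at index * 4". The sketch below is a CPU-side analogue under that assumption; Float4, Float4x4 and unpackMat4 are hypothetical names, and this is not the actual macro body, which differs between SSBOs and TBOs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side mirrors of the shader types.
struct Float4
{
    float x, y, z, w;
};

struct Float4x4
{
    Float4 row[4];
};

// Reads the matrixIdx-th matrix from a flat float4 buffer, assuming every
// matrix occupies exactly four consecutive float4 entries.
Float4x4 unpackMat4( const std::vector<Float4> &buf, size_t matrixIdx )
{
    const size_t base = matrixIdx * 4u;
    assert( base + 3u < buf.size() );

    Float4x4 out;
    for( size_t i = 0; i < 4u; ++i )
        out.row[i] = buf[base + i];
    return out;
}
```

If you send two matrices per object, the same idea applies with a stride of eight float4 entries per object instead of four, so the index passed to the unpacking step has to be adjusted accordingly.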
Note that Unlit doesn't support skeletal animation, while PBS does. Skeletal animations complicate things because indexing worldMatBuf[idx] is much more intricate. Basically, idx needs to be sent in another buffer so that the shader ends up doing worldMatBuf[perInstanceOffsets[finalDrawId]]. Conceptually it's simple, but from the C++ side it becomes ridiculously complex. Unlit is simple because we can assume every object consumes exactly 16 floats, while with skeletons each object can consume an arbitrary number of floats.
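To show only the conceptual part of that indirection, here is a small sketch that builds an offsets table when objects consume different amounts of data. It is not how HlmsPbs lays out its buffers; buildPerInstanceOffsets and its parameter are hypothetical, and the table is simply a running sum of the per-object sizes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// For each draw, computes the index of its first float4 in the shared buffer,
// given how many float4 entries each object consumes.
// Hypothetical helper for illustration only.
std::vector<uint32_t> buildPerInstanceOffsets( const std::vector<uint32_t> &float4sPerObject )
{
    std::vector<uint32_t> offsets;
    offsets.reserve( float4sPerObject.size() );

    uint32_t running = 0;
    for( uint32_t count : float4sPerObject )
    {
        offsets.push_back( running );  // this object starts where the previous one ended
        running += count;
    }
    return offsets;
}
```

The shader side would then look up offsets[finalDrawId] first and use the result to index worldMatBuf, instead of relying on a fixed stride per object.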
Misc
In HlmsUnlit::createShaderCacheEntry, OpenGL needs to assign the slot # to the variable, which is named worldMatBuf.
In Vulkan, HlmsUnlit::setupRootLayout needs to tell the RootLayout that DescBindingTypes::ReadOnlyBuffer has at least 1 slot.