I'm trying to understand how this bit of code relates to the C++ code:
Code: Select all
struct Material
{
    /* kD is already divided by PI to make it energy conserving.
       (formula is finalDiffuse = NdotL * surfaceDiffuse / PI) */
    vec4 kD; //kD.w is alpha_test_threshold
    vec4 kS; //kS.w is roughness
    //Fresnel coefficient, may be per colour component (vec3) or scalar (float)
    //F0.w is transparency
    vec4 F0;
    vec4 normalWeights;
    vec4 cDetailWeights;
    vec4 detailOffsetScaleD[4];
    vec4 detailOffsetScaleN[4];
    uvec4 indices0_3;
    //uintBitsToFloat( indices4_7.w ) contains mNormalMapWeight.
    uvec4 indices4_7;
};
I'm trying to adapt it to my needs and would like to reorganize, add, and remove stuff, but I don't understand how this part relates to its C++ counterpart.
The only thing I managed to understand is that the needed space is defined here:
Code: Select all
const size_t HlmsPbsDatablock::MaterialSizeInGpu = 52 * 4 + NUM_PBSM_TEXTURE_TYPES * 2 + 4;
const size_t HlmsPbsDatablock::MaterialSizeInGpuAligned =
        alignToNextMultiple( HlmsPbsDatablock::MaterialSizeInGpu, 4 * 4 );
52 * 4... where does that come from?
If my new struct were like this:
Code: Select all
struct Material
{
    vec4 param1;
    vec4 param2;
    uvec4 param3;
};
how would I define MaterialSizeInGpu, and how do I send the data? Thank you.
Ok, this is much easier than you think. The data is literally getting memcpy'd from C++'s HlmsPbsDatablock. If you look at its definition:
Code: Select all
class _OgreHlmsPbsExport HlmsPbsDatablock
{
    //...
    float   mkDr, mkDg, mkDb;                   //kD
    float   _padding0;
    float   mkSr, mkSg, mkSb;                   //kS
    float   mRoughness;
    float   mFresnelR, mFresnelG, mFresnelB;    //F0
    float   mTransparencyValue;
    float   mDetailNormalWeight[4];
    float   mDetailWeight[4];
    Vector4 mDetailsOffsetScale[8];
    uint16  mTexIndices[NUM_PBSM_TEXTURE_TYPES];
    float   mNormalMapWeight;
    //...
};
But it's essentially the same. The data gets memcpy'd to the GPU in HlmsPbsDatablock::uploadToConstBuffer. You can see we do a couple of adjustments there that are mere optimizations (e.g. we fill the 'padding' with the alpha test threshold, since that value lives in our base class and hence isn't contiguous for the memcpy; when transparency is enabled we premultiply the fresnel and kD by mTransparencyValue to avoid doing it in the shader, etc.).

Although to be honest, the std140 packing rules are so complicated and confusing that even driver implementations get them wrong. That's why I prefer using vec4 as much as possible rather than, say, mixing a vec3 with a float: one driver may pack them together, while another adds padding between them, and the shader will be broken on that vendor until it fixes its driver bug. No, thanks.
You control what you send to the shader via uploadToConstBuffer. Whenever it needs to be called again (i.e. a parameter changed), you call scheduleConstBufferUpdate() so that an upload is scheduled for when the time is appropriate. Multiple calls to scheduleConstBufferUpdate will be merged into a single uploadToConstBuffer call.
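For your simplified struct, a minimal sketch of that idea could look like the following. This is not actual Ogre code: MyMaterialCpu and its members are made-up names, and the free function stands in for the uploadToConstBuffer member your datablock would override (its exact signature differs between Ogre versions); your setters would call scheduleConstBufferUpdate() after changing a value.
Code: Select all
#include <cstdint>
#include <cstring>

// Hypothetical CPU-side mirror of the simplified GLSL struct
// { vec4 param1; vec4 param2; uvec4 param3; }. Members are kept
// contiguous and vec4-sized so a single memcpy fills the GPU copy.
struct MyMaterialCpu
{
    float         param1[4]; // -> vec4  param1
    float         param2[4]; // -> vec4  param2
    std::uint32_t param3[4]; // -> uvec4 param3
};

// Sketch of what a custom uploadToConstBuffer does with this layout:
// one straight memcpy of the contiguous members into the const buffer.
void uploadToConstBuffer( const MyMaterialCpu &mat, char *dstPtr )
{
    std::memcpy( dstPtr, &mat, sizeof( MyMaterialCpu ) ); // 48 bytes
}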
As for the value of MaterialSizeInGpu:
We have 13 vec4s (from kD through detailOffsetScaleN[3]). Multiplied by 4 components each, that gives 52 floats; multiplied by 4 again (the sizeof a float), we get the size in bytes.
Then we add the size of the texture indices (NUM_PBSM_TEXTURE_TYPES uint16s, i.e. 2 bytes each) and finally the last float (mNormalMapWeight), which is the trailing + 4. As simple as that.
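Applying the same arithmetic to the simplified struct from the question gives something like this (a sketch; the My* names are hypothetical, and alignToNextMultiple is the same helper used in the snippet you quoted):
Code: Select all
// 3 vec4-sized members * 4 components * 4 bytes each; there are no
// uint16 texture indices and no trailing float in this struct.
const size_t MyMaterialSizeInGpu = 3 * 4 * 4; // 48 bytes

// Size of one material slot in the GPU buffer, rounded up to a whole
// vec4 (4 * 4 bytes); 48 is already a multiple of 16, so nothing changes.
const size_t MyMaterialSizeInGpuAligned =
        alignToNextMultiple( MyMaterialSizeInGpu, 4 * 4 );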
Note that you need to account for any padding between those variables according to the std140 rules. That's why we work only with vec4s; it makes the maths easier. (Two vec3s should take the same space as two vec4s; two vec2s together can end up as either one vec4 or two vec4s depending on what came before them, which can be tricky, and it gets worse when drivers get those rules wrong too.)
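To make that concrete, here's a sketch of the approach (simplified, made-up member names; not the actual Ogre declaration): keep the C++ mirror in explicit 16-byte rows so its layout cannot depend on how a particular driver packs vec3/vec2/float members.
Code: Select all
// Each row is exactly 16 bytes, matching one vec4 of the GLSL Material
// struct, with the 'spare' fourth float put to use instead of wasted.
struct MaterialCpuMirror
{
    float kD[3]; float alphaTestThreshold; // -> vec4 kD (kD.w reused for alpha test)
    float kS[3]; float roughness;          // -> vec4 kS (kS.w = roughness)
    float F0[3]; float transparency;       // -> vec4 F0 (F0.w = transparency)
};
static_assert( sizeof( MaterialCpuMirror ) % 16 == 0,
               "the mirror must stay a whole number of vec4s" );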
The difference between MaterialSizeInGpu & MaterialSizeInGpuAligned is just the difference between how many bytes get memcpy'd from C++ to the GPU (kept exact so we don't read out of bounds in system memory) and the final, padded size each material occupies in the GPU buffer.
e.g.:
Code: Select all
for( all_materials_that_changed )
{
    memcpy( dataInGPU, datablock->dataInCPU, MaterialSizeInGpu );
    dataInGPU += MaterialSizeInGpuAligned;
    ++datablock;
}
Hope this clarifies everything.