Why does this Hlms change not work on Windows? Topic is solved

Discussion area about developing with Ogre-Next (2.1, 2.2 and beyond)


jwwalker
Goblin
Posts: 269
Joined: Thu Aug 12, 2021 10:06 pm
Location: San Diego, CA, USA
x 19

Why does this Hlms change not work on Windows?

Post by jwwalker »

I was trying to add two 32-bit unsigned integers to the Hlms Pbs material structure in my fork of Ogre-Next. The change doesn't seem to cause any trouble on macOS rendering with Metal, but on 64-bit Windows, using either Vulkan or Direct3D11, the rendering is wrong. Here's Sample_Postprocessing after the change, with Vulkan:

Image

and here is how it should look:

Image

There is no difference in Ogre.log between the two cases, except for a small difference in the number of bytes saved to the pipeline cache.

Here is the diff of my changes:

--- a/Components/Hlms/Pbs/include/OgreHlmsPbsDatablock.h
+++ b/Components/Hlms/Pbs/include/OgreHlmsPbsDatablock.h
@@ -272,6 +272,7 @@ namespace Ogre
         float mClearCoatRoughness;
         float _padding1;
         float mUserValue[3][4];  // can be used in custom pieces
+        uint32 mUserInt[2];
         // uint16  mTexIndices[NUM_PBSM_TEXTURE_TYPES];
 
         CubemapProbe *mCubemapProbe;
@@ -675,6 +676,13 @@ namespace Ogre
         void    setUserValue( uint8 userValueIdx, const Vector4 &value );
         Vector4 getUserValue( uint8 userValueIdx ) const;
 
+        /** Sets the value of the userInt, this can be used in a custom piece
+        @param userIntIdx
+            Which userInt to modify, 0 or 1
+        */
+        void   setUserInt( uint8 userIntIdx, uint32 value );
+        uint32 getUserInt( uint8 userIntIdx ) const;
+
         /** When set, it treats the emissive map as a lightmap; which means it will
             be multiplied against the diffuse component.
         @remarks
diff --git a/Components/Hlms/Pbs/src/OgreHlmsPbsDatablock.cpp b/Components/Hlms/Pbs/src/OgreHlmsPbsDatablock.cpp
index 73819e7a319..afdebe07546 100644
--- a/Components/Hlms/Pbs/src/OgreHlmsPbsDatablock.cpp
+++ b/Components/Hlms/Pbs/src/OgreHlmsPbsDatablock.cpp
@@ -61,7 +61,7 @@ namespace Ogre
                                        "Lighten",         "Darken",       "GrainExtract", "GrainMerge",
                                        "Difference" };
 
-    const uint32 HlmsPbsDatablock::MaterialSizeInGpu = 60u * 4u + NUM_PBSM_TEXTURE_TYPES * 2u;
+    const uint32 HlmsPbsDatablock::MaterialSizeInGpu = 62u * 4u + NUM_PBSM_TEXTURE_TYPES * 2u;
     const uint32 HlmsPbsDatablock::MaterialSizeInGpuAligned =
         alignToNextMultiple<uint32>( HlmsPbsDatablock::MaterialSizeInGpu, 4 * 4 );
 
@@ -96,6 +96,7 @@ namespace Ogre
         mClearCoat( 0.0f ),
         mClearCoatRoughness( 0.0f ),
         _padding1( 0 ),
+        mUserInt{ 0, 0 },
         mCubemapProbe( 0 ),
         mBrdf( creator->getDefaultBrdfWithDiffuseFresnel() ? PbsBrdf::DefaultHasDiffuseFresnel
                                                            : PbsBrdf::Default )
@@ -962,6 +963,18 @@ namespace Ogre
                         mUserValue[userValueIdx][2], mUserValue[userValueIdx][3] );
     }
     //-----------------------------------------------------------------------------------
+    void   HlmsPbsDatablock::setUserInt( uint8 userIntIdx, uint32 value )
+    {
+        assert( userIntIdx < 2 );
+        mUserInt[userIntIdx] = value;
+        scheduleConstBufferUpdate();
+    }
+    uint32 HlmsPbsDatablock::getUserInt( uint8 userIntIdx ) const
+    {
+        assert( userIntIdx < 2 );
+        return mUserInt[userIntIdx];
+    }
+    //-----------------------------------------------------------------------------------
     void HlmsPbsDatablock::setCubemapProbe( CubemapProbe *probe )
     {
         if( mCubemapProbe != probe )
diff --git a/Samples/Media/Hlms/Pbs/Any/Main/500.Structs_piece_vs_piece_ps.any b/Samples/Media/Hlms/Pbs/Any/Main/500.Structs_piece_vs_piece_ps.any
index 3d32eaf304c..02c03a9430d 100644
--- a/Samples/Media/Hlms/Pbs/Any/Main/500.Structs_piece_vs_piece_ps.any
+++ b/Samples/Media/Hlms/Pbs/Any/Main/500.Structs_piece_vs_piece_ps.any
@@ -301,6 +301,7 @@ struct Material
 	float clearCoatRoughness;
 	float _padding1;
 	float4 userValue[3];
+	uint userInt[2];
 
 	@property( syntax != metal )
 		uint4 indices0_3;
bishopnator
Gnome
Posts: 334
Joined: Thu Apr 26, 2007 11:43 am
Location: Slovakia / Switzerland
x 16

Re: Why does this Hlms change not work on Windows?

Post by bishopnator »

Just a guess: there may be some trouble with padding, or with conversion between float and uint. Try declaring those 2 uints as a float4 (zw won't be used), cast each uint to float before you upload the data, and cast back to uint in the shader. If the values are under 16777216 (2^24), the cast is exact. If you expect higher values, you can reinterpret the uint bits as a float and reinterpret them back in the shader (there are functions for such conversions).
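To illustrate the two options on the C++ side, here is a minimal sketch. The helper names (survivesFloatCast, bitsToFloat, bitsToUint) are hypothetical and not part of Ogre. On the shader side the matching reinterpretation functions would be asuint/asfloat in HLSL and floatBitsToUint/uintBitsToFloat in GLSL.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these are Ogre API.

// True if 'v' comes back unchanged after a value conversion uint -> float -> uint.
// A 32-bit float has a 24-bit significand, so every integer up to 2^24 is exact.
inline bool survivesFloatCast( uint32_t v )
{
    const float asFloat = static_cast<float>( v );
    return static_cast<uint32_t>( asFloat ) == v;
}

// Copies the 32 bits of 'v' into a float without changing them.
inline float bitsToFloat( uint32_t v )
{
    float f;
    std::memcpy( &f, &v, sizeof( f ) );
    return f;
}

// Inverse of bitsToFloat: recovers the original 32-bit pattern.
inline uint32_t bitsToUint( float f )
{
    uint32_t v;
    std::memcpy( &v, &f, sizeof( v ) );
    return v;
}

// Small self-check showing where the plain cast stops being safe.
inline void demoUintThroughFloat()
{
    assert( survivesFloatCast( 16777216u ) );   // 2^24 is still exact
    assert( !survivesFloatCast( 16777217u ) );  // 2^24 + 1 gets rounded
    // Bit reinterpretation keeps large values intact.
    assert( bitsToUint( bitsToFloat( 0xDEADBEEFu ) ) == 0xDEADBEEFu );
}
```

The memcpy form is used instead of a pointer cast because it avoids undefined behaviour from type punning; compilers typically turn it into a plain register move.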


Re: Why does this Hlms change not work on Windows?

Post by jwwalker »

@bishopnator I don't get it. Why would there be any conversion between uint and float if the field is declared uint in both the shader code and the C++ code?


Re: Why does this Hlms change not work on Windows?

Post by bishopnator »

I know that what you write seems reasonable, but I would give it a try. As far as I know, GPUs generally work better if only floats are sent between CPU and GPU, and there can be specific optimizations in the drivers (NVIDIA, AMD, Intel, etc., and of course OpenGL vs. D3D implementations by the vendor). One also has to be careful with padding (byte alignment). As a really quick fix, try adding two 4-byte values after your 2 uints (so that 16 bytes are added in total). I am just giving you some hints on how to proceed with the problem, not stating as fact that this is 100% the cause.


Re: Why does this Hlms change not work on Windows?

Post by jwwalker »

When I changed the shader code from uint userInt[2] to uint2 userInt, the rendering problem went away!

dark_sylinc
OGRE Team Member
Posts: 5490
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1361

Re: Why does this Hlms change not work on Windows?

Post by dark_sylinc »

jwwalker wrote: Mon Mar 17, 2025 11:56 pm

When I changed the shader code from uint userInt[2] to uint2 userInt, the rendering problem went away!

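In a constant buffer, uint userInt[2] is laid out as if it were uint4 userInt[2]: each array element is padded to 16 bytes, so userInt[1] no longer sits right after userInt[0] the way it does in the C++ struct.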

As for why that is, I explain it in my blog post "Uniform buffers vs texture buffers", particularly where it says "Old days: Shader constant waterfalling", about rule #3 from std140.

HLSL doesn't follow std140, but its packing rules are very similar and exist for the same reason.
Metal came out much later, so it never had to care.
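
As a rough sketch of the mismatch, the offsets can be worked out at compile time. The helper names below are hypothetical (not Ogre API), and the 16-byte stride is the std140-style rule for arrays of scalars, which HLSL constant buffers also apply to array elements as far as I know.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are Ogre API.

// Rounds 'value' up to the next multiple of 'alignment'.
constexpr uint32_t alignToNext( uint32_t value, uint32_t alignment )
{
    return ( value + alignment - 1u ) / alignment * alignment;
}

// Byte offset of element 'idx' in a tightly packed C++ array of 4-byte scalars,
// which is how 'uint32 mUserInt[2]' is written into the buffer.
constexpr uint32_t cpuElementOffset( uint32_t idx )
{
    return idx * 4u;
}

// Byte offset of element 'idx' when the shader declares 'uint userInt[2]':
// the array stride is rounded up to 16 bytes.
constexpr uint32_t constBufferElementOffset( uint32_t idx )
{
    return idx * alignToNext( 4u, 16u );
}

// The CPU writes mUserInt[1] 4 bytes in, but the shader reads userInt[1] 16 bytes in.
static_assert( cpuElementOffset( 1u ) == 4u, "tight C++ packing" );
static_assert( constBufferElementOffset( 1u ) == 16u, "padded array element" );
```

With uint2 userInt the two components are a single 8-byte vector at offsets 0 and 4, which matches both the C++ layout and the 62u * 4u used for MaterialSizeInGpu in the diff. With the array form, every member declared after userInt is read from a shifted position, which would be consistent with the broken rendering.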