[Solved][2.1] Trying to create a new HLMS



Kinslore
Gnoblar
Posts: 18
Joined: Sun Jul 26, 2015 8:55 pm
x 5

Post by Kinslore »

Hello !

I've been trying to play with the HLMS system and create a new HLMS implementation from scratch. Even after investigating the sources, a lot of things remain a little hard to understand for me, so I hope I'm not asking something too trivial !

I am trying to create a random terrain within a program. So far, I've generated the geometry, and now is the time to mess with its material ! So, as I wanted to use the new functionalities, I tried to do this by creating a new HLMS from scratch (I want to understand how it works).

So, here is what I've done :

- Create a custom implementation (The HLMS and a simple Datablock)
(Headers)

Code: Select all

class HlmsGroundDatablock : public Ogre::HlmsDatablock
	{
		private :

			// Attributes

		public :

			// Constructor, destructor
			HlmsGroundDatablock (Ogre::IdString name, Ogre::Hlms* creator, const Ogre::HlmsMacroblock* macroblock, const Ogre::HlmsBlendblock* blendblock, const Ogre::HlmsParamVec& params)
			:	HlmsDatablock(name, creator, macroblock, blendblock, params)
			{
				// Nothing to do
			}

			~HlmsGroundDatablock ()
			{
				// Nothing to do
			}
	} ;

class HlmsGround : public Ogre::Hlms
	{
		private :
		
			// Attributes
			// VAO manager
			Ogre::VaoManager* _vaoManager ;
			// Mapped region for the program
			Ogre::ConstBufferPacked* _entityUniforms ;
		
		public :
		
			// Constructor, destructor
			HlmsGround (Ogre::Archive* dataFolder, Ogre::ArchiveVec* libraryFolders, Ogre::VaoManager* vaoManager) ;
			~HlmsGround () ;

			// The pure virtual HLMS functions
			virtual const Ogre::HlmsCache* createShaderCacheEntry (Ogre::uint32 renderableHash, const Ogre::HlmsCache& passCache, Ogre::uint32 finalHash, const Ogre::QueuedRenderable& queuedRenderable) ;
			virtual void calculateHashFor (Ogre::Renderable* renderable, Ogre::uint32& outHash, Ogre::uint32& outCasterHash) ;
			virtual Ogre::HlmsCache preparePassHash (const Ogre::CompositorShadowNode* shadowNode, bool casterPass, bool dualParaboloid, Ogre::SceneManager* sceneManager) ;

			virtual Ogre::uint32 fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::uint32 lastTextureHash) ;
			virtual Ogre::uint32 fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer) ;
			virtual Ogre::uint32 fillBuffersForV1 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer) ;
			virtual Ogre::uint32 fillBuffersForV2 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer) ;

			virtual HlmsGroundDatablock* createDatablockImpl (Ogre::IdString datablockName, const Ogre::HlmsMacroblock* macroblock, const Ogre::HlmsBlendblock* blendblock, const Ogre::HlmsParamVec& paramVec) ;
	} ;
(Sources)

Code: Select all

/// Constructor, destructor ------------------

HlmsGround::HlmsGround (Ogre::Archive* dataFolder, Ogre::ArchiveVec* libraryFolders,  Ogre::VaoManager* vaoManager)
:	Ogre::Hlms (Ogre::HLMS_USER0, "Ground", dataFolder, libraryFolders),
	_vaoManager (vaoManager),
	_entityUniforms (vaoManager->createConstBuffer(sizeof(float) * 16, Ogre::BT_DYNAMIC_PERSISTENT, NULL, false))
{
	// Nothing to do
}

HlmsGround::~HlmsGround ()
{
	// The buffers are no longer needed
	_vaoManager->destroyConstBuffer(_entityUniforms) ;
}

/// HLMS functions ---------------------------

const Ogre::HlmsCache* HlmsGround::createShaderCacheEntry (Ogre::uint32 renderableHash, const Ogre::HlmsCache& passCache, Ogre::uint32 finalHash, const Ogre::QueuedRenderable& queuedRenderable)
{
	std::cout << "Shader Cache Entry" << std::endl ;

	const Ogre::HlmsCache* retVal = Ogre::Hlms::createShaderCacheEntry(renderableHash, passCache, finalHash, queuedRenderable) ;

	return retVal ;
}

void HlmsGround::calculateHashFor (Ogre::Renderable* renderable, Ogre::uint32& outHash, Ogre::uint32& outCasterHash)
{
	std::cout << "Hash For" << std::endl ;
	Hlms::calculateHashFor(renderable, outHash, outCasterHash) ;
}

Ogre::HlmsCache HlmsGround::preparePassHash (const Ogre::CompositorShadowNode* shadowNode, bool casterPass, bool dualParaboloid, Ogre::SceneManager* sceneManager)
{
	Ogre::HlmsCache retVal = Hlms::preparePassHash (shadowNode, casterPass, dualParaboloid, sceneManager) ;

	return retVal ;
}

Ogre::uint32 HlmsGround::fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::uint32 lastTextureHash)
{
	std::cout << "fillBuffersFor Tex" << std::endl ;

	return 0 ;
}

Ogre::uint32 HlmsGround::fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer)
{
	// Let's try going through the commands
	const Ogre::Matrix4& worldMat = queuedRenderable.movableObject->_getParentNodeFullTransform() ;

	// Map the buffer
	float* mappedBuffer = reinterpret_cast<float*>(_entityUniforms->map(0, sizeof(float) * 16)) ;

	memcpy(mappedBuffer, &worldMat, 16 * sizeof(float)) ;

	_entityUniforms->unmap(Ogre::UO_KEEP_PERSISTENT) ;

	// And feed it to the lions (i.e. bind it through the command buffer)
	*commandBuffer->addCommand<Ogre::CbShaderBuffer>() = Ogre::CbShaderBuffer (Ogre::VertexShader, 0, _entityUniforms, 0, _entityUniforms->getTotalSizeBytes()) ;

	return 0 ;
}

Ogre::uint32 HlmsGround::fillBuffersForV1 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer)
{
	return fillBuffersFor(cache, queuedRenderable, casterPass, lastCacheHash, commandBuffer) ;
}

Ogre::uint32 HlmsGround::fillBuffersForV2 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer)
{
	return fillBuffersFor(cache, queuedRenderable, casterPass, lastCacheHash, commandBuffer) ;
}

HlmsGroundDatablock* HlmsGround::createDatablockImpl (Ogre::IdString datablockName, const Ogre::HlmsMacroblock* macroblock, const Ogre::HlmsBlendblock* blendblock, const Ogre::HlmsParamVec& paramVec)
{
	return OGRE_NEW HlmsGroundDatablock (datablockName, this, macroblock, blendblock, paramVec) ;
}
- Do some simple shaders, not even using the templates for now :
(Vertex)

Code: Select all

#version 330

layout (std140) uniform ;

// Inputs
in vec4 vertex ;

layout(binding = 0) uniform entityBuffer
{
	mat4 mvpMat ;
} entityVars ;

// Output
out gl_PerVertex
{
	vec4 gl_Position ;
} ;

void main ()
{
	gl_Position = entityVars.mvpMat * vertex ;
}
The Pixel shader is just setting the color of the fragment, nothing brutal.

My problem as it is now, is that I want to give the famous worldViewProj matrix to the vertex shader. So far, it's not really working, because I get the exception :

Code: Select all

"Mapping the buffer twice within the same frame detected! This is not allowed."
I think this is because I map the ConstBufferPacked every time fillBuffersFor is called (I want it right for every Item of the terrain. Each Item is a tile, the terrain is actually a grid of Items). So, during a frame, it will be changed more than once :/

I don't really know how to do this. From what I've understood in HlmsPbs, you are doing this with a texture. Is it possible to give a Mat4 directly, using the CommandBuffer ?

By the way, while getting mistreated by the HLMS, I ran across something I think is weird (maybe not).
In your shaders, you are always adding this line :

Code: Select all

layout (std140) uniform ;
I am not saying I understand this line completely, but because it has nothing behind it (I mean, no name, no structure definition), it is triggering a break in :

Code: Select all

void GLSLProgram::extractLayoutQualifiers(void) L.309
If this line is not the first uniform 'declaration', then the function tries to get the semantic for the other uniforms (it tried with my "mvpMat"), possibly triggering an assert (missing attribute in OgreGLSLProgram).
I am wondering whether this is intended behaviour ?

That's a big post, sorry. The last thing I want to add is that I really like the new Ogre, you're doing a great job ! Thanks !
Last edited by Kinslore on Mon Aug 10, 2015 9:19 pm, edited 1 time in total.
dark_sylinc
OGRE Team Member
Posts: 5433
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1341

Re: [2.1] Trying to create a new HLMS

Post by dark_sylinc »

Hi!

Kudos for trying! I like your approach. Take the component to the bare minimum to render something basic.
Writing your Hlms implementation can be scary at first, but it is very rewarding because of the flexibility and power it gives you.
Kinslore wrote: I think this is because I map the ConstBufferPacked every time fillBuffersFor is called (I want it right for every Item of the terrain. Each Item is a tile, the terrain is actually a grid of Items). So, during a frame, it will be changed more than once :/
Yes, that is correct.
fillBuffersFor is called per Renderable. And you can't map the same dynamic buffer more than once per frame.

With regular D3D11 and GL, the API incurs a whole lot of overhead to provide a "dynamic buffer" mechanism: you think you're dealing with the same buffer every time you map, but the API is actually giving you a different buffer pointer so that data can be queued up.
We do not do this implicitly because it is very inefficient. We do it explicitly.

Instead, the data from a buffer must remain consistent across the entire frame so the GPU can access it correctly. It is your responsibility to ensure the shader will access the correct region of memory, instead of relying on the API to tell it to "write to the same memory region" while behind the scenes the API must write to different regions of memory.

In simple words, if you try to force that code by removing the debug check, you will see that all your rendered objects will be rendered in the same location, or will jump between random locations because of race conditions (you're still reading that data from GPU while the CPU is writing to it).
You need to ensure each instance reads from its own region of memory (and you avoid the race condition by not mapping more than once per frame).

For that, you'll have to do what the default implementations do: Have a pool of ConstBufferPacked with a lot of preallocated memory, and keep track of the regions of memory you write to.
The HlmsBufferManager provides a lot of utility functionality here.
HlmsBufferManager::mapNextConstBuffer will return you a pointer to a mapped const buffer that can hold up to 64kb of data (actual size varies depending on GPU HW capabilities).
When it's full, you call it again. All buffers that remain mapped after all Renderables have gone through fillBuffersFor will be unmapped in preCommandBufferExecution (which should be called automatically if you don't overload the virtual call).

In your case the relevant code snippet is:

Code: Select all

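uint32 * RESTRICT_ALIAS currentMappedConstBuffer    = mCurrentMappedConstBuffer;
bool exceedsConstBuffer = (size_t)((currentMappedConstBuffer - mStartMappedConstBuffer) + 16) >
                                                                        mCurrentConstBufferSize;

if( exceedsConstBuffer )
{
    //The current buffer cannot hold another mat4: map the next one from the pool.
    currentMappedConstBuffer = mapNextConstBuffer( commandBuffer );
}

//Write this Renderable's matrix into its own region, then advance the write pointer.
memcpy( currentMappedConstBuffer, &worldMat, 16 * sizeof(float) );
currentMappedConstBuffer += 16;
mCurrentMappedConstBuffer = currentMappedConstBuffer;

//Return value will be important. See later drawID
//(pointer difference / 16 uint32 per matrix) - 1 = index of the matrix just written.
return ((mCurrentMappedConstBuffer - mStartMappedConstBuffer) >> 4) - 1;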
Note that:
"+16" in exceedsConstBuffer is the minimum number of 32-bit values you need (mat4 mvpMat => 64 bytes / sizeof(uint32) = 16)
mapNextConstBuffer will also bind the const buffer for you to slot 2 in the vertex & pixel shaders.
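
The bookkeeping described above can be sketched in isolation. This is only a minimal illustration and not Ogre's HlmsBufferManager: `Slot`, `SlotAllocator` and `slotOfInstance` are hypothetical names, no GPU buffer is actually mapped, and it assumes 64kb buffers holding one mat4 (64 bytes) per instance.

```cpp
#include <cstddef>
#include <cstdint>

// Where one instance's data lives: which pooled buffer, and which index inside it.
struct Slot
{
    std::size_t   bufferIndex;
    std::uint32_t drawId;
};

// Hypothetical bookkeeping only; a real implementation would map/unmap GPU buffers here.
class SlotAllocator
{
public:
    constexpr SlotAllocator( std::size_t bufferBytes, std::size_t bytesPerInstance ) :
        mCapacity( bufferBytes / bytesPerInstance ), mBuffer( 0 ), mUsed( 0 ) {}

    // Called once per Renderable: hands out the next free region.
    constexpr Slot allocate()
    {
        if( mUsed == mCapacity )
        {
            // Current buffer is full: move on to the next buffer of the pool
            // (this is the point where mapNextConstBuffer would be called).
            ++mBuffer;
            mUsed = 0;
        }
        return Slot{ mBuffer, static_cast<std::uint32_t>( mUsed++ ) };
    }

    // Called once per frame: every buffer may be written from the start again.
    constexpr void frameEnded() { mBuffer = 0; mUsed = 0; }

private:
    std::size_t mCapacity; // instances that fit in one buffer
    std::size_t mBuffer;   // index of the buffer currently being filled
    std::size_t mUsed;     // instances already written to that buffer
};

// Slot handed to the n-th Renderable of a frame (n starts at 0).
constexpr Slot slotOfInstance( std::size_t n, std::size_t bufferBytes,
                               std::size_t bytesPerInstance )
{
    SlotAllocator allocator( bufferBytes, bytesPerInstance );
    Slot slot = allocator.allocate();
    for( std::size_t i = 0; i < n; ++i )
        slot = allocator.allocate();
    return slot;
}
```

Under these assumptions, the first 1024 objects of a frame land in buffer 0 with drawId 0 to 1023, and the 1025th object starts buffer 1 at drawId 0. No region is written twice within the same frame.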

Now you've filled the data for each instance. But you need a way to tell the shader to use its individual offset.

There are two ways:

1. Extremely low CPU overhead, slightly higher GPU overhead: Index the const buffer in the shader. This is what the default implementations do.
Declare your shader as:

Code: Select all

layout(binding = 2) uniform entityBuffer
{
   mat4 mvpMat[1024];
} entityVars ;
Then use the draw ID which is filled in automatically by Ogre:

Code: Select all

in uint drawId;
void main ()
{
   gl_Position = entityVars.mvpMat[drawId] * vertex ;
}
The value of drawId is based on the return value of fillBuffersFor, which is why that return value was so important above: you need to keep it contiguous (0, 1, 2, 3, 4, ... 4095). For every API draw call, drawId starts from the return value of fillBuffersFor and is then incremented by 1 for every instance the auto-instancing system can batch together, irrespective of what fillBuffersFor returns for those auto-instanced Renderables. If you need fillBuffersFor's return value to always be respected in drawId, you must insert a dummy command to fool Ogre into breaking the auto instancing.


2. Relatively high CPU overhead, optimum GPU usage. This method is undesirable because it prevents auto instancing from working, but it could be used if you're extremely GPU shader bound and have few objects. Note that this method only works in OpenGL and D3D11.1 running on Windows 8.1. It will not work on D3D11 or in Windows 7 or below.
Instead of indexing in the shader with drawId, set the const buffer for every object:

Code: Select all

if( exceedsConstBuffer )
{
    currentMappedConstBuffer = mapNextConstBuffer( commandBuffer );
}

memcpy( currentMappedConstBuffer, &worldMat, 16 * sizeof(float) );
currentMappedConstBuffer += 16;
mCurrentMappedConstBuffer = currentMappedConstBuffer;

//Rebind the buffer holding the MVP matrix for every object, at a different offset each time.
//The offset is the start of the matrix just written (16 uint32 before the write pointer).
*commandBuffer->addCommand<CbShaderBuffer>() = CbShaderBuffer(
                    VertexShader, 2, mConstBuffers[mCurrentConstBuffer],
                    (mCurrentMappedConstBuffer - mStartMappedConstBuffer - 16) * sizeof(uint32),
                    16 * sizeof(float) );
You will be setting the const buffer for every object, thus performing many API calls, and instancing cannot be used.
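
For the first approach (indexing with drawId), the contiguity requirement can be checked with a small sketch. It only models the rule stated above (drawId = first return value + instance number within the batch); `batchReadsOwnSlots` is a hypothetical helper and the input array stands in for the values fillBuffersFor returned for the Renderables of one auto-instanced draw call.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// 'returned' holds the values fillBuffersFor returned for the Renderables that were
// batched into ONE draw call, in submission order (hypothetical input for this sketch).
template <std::size_t N>
constexpr bool batchReadsOwnSlots( const std::array<std::uint32_t, N> &returned )
{
    for( std::size_t k = 0; k < N; ++k )
    {
        // The shader sees: first return value + instance number within the batch.
        const std::uint32_t drawIdSeenByShader = returned[0] + static_cast<std::uint32_t>( k );
        if( drawIdSeenByShader != returned[k] )
            return false; // instance k would read another instance's matrix
    }
    return true;
}
```

A batch whose return values were 4, 5, 6 passes the check; a batch with 4, 6, 7 fails it, because the second instance would be given drawId 5 while its matrix was written at index 6.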
Kinslore wrote: By the way, while getting mistreated by the HLMS, I ran across something I think is weird (maybe not).
In your shaders, you are always adding this line :

Code: Select all

layout (std140) uniform ;
No, I'm always adding the line "layout(std140) uniform;" (note the lack of spaces between layout and the first parenthesis). :)
There is a GLSL parser that was contributed eons ago to Ogre, and your spacing between layout and the '(' must be confusing the parser (at least that's what I think is probably happening because the bit of code you mention is about that parser).
Kinslore wrote: I am not saying I understand this line completely, but because it has nothing behind it (I mean, no name, no structure definition)
It instructs OpenGL to use the std140 layout definition as a default for all structures in constant buffers; std140 has a strict set of rules about alignment and padding (you can also specify it per struct).
You can read the 9 rules in page 123 of the GL specs.
Although to be honest, these rules are so complicated and confusing that even driver implementations get it wrong, which is why I prefer using vec4 as much as possible rather than e.g. mixing vec3 with a float, because one driver may pack them together, while another adds padding between them. And the shader will be broken on that vendor until it fixes its driver bug. No, thanks.
Direct3D's HLSL packing rules are much more friendly and easier to understand.
We use std140 because otherwise the offsets for each member variable have to be queried from the GL driver for each shader, since each driver will try to do "its best" to pack and pad variables to the best of what their GPU can do, and thus the offsets are not known beforehand (btw almost always they do a crappy job anyway, so I just stick to std140).
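
To make the vec3 + float case concrete, here is a rough sketch of how member offsets come out under std140 for scalars and vectors only (arrays, matrices and nested structs have additional rules that are not modelled). `Std140Type`, `alignUp` and `std140Offset` are hypothetical helpers written for this illustration; the sizes and alignments are the ones the specification gives for float, vec3 and vec4.

```cpp
#include <array>
#include <cstddef>

// Size and base alignment, in bytes, of a few GLSL types under std140.
struct Std140Type
{
    std::size_t size;
    std::size_t alignment;
};

constexpr Std140Type kFloat{ 4, 4 };
constexpr Std140Type kVec3{ 12, 16 }; // a vec3 is aligned like a vec4
constexpr Std140Type kVec4{ 16, 16 };

// Rounds offset up to the next multiple of alignment.
constexpr std::size_t alignUp( std::size_t offset, std::size_t alignment )
{
    return ( offset + alignment - 1 ) / alignment * alignment;
}

// Byte offset of members[index] when the members are declared in that order.
template <std::size_t N>
constexpr std::size_t std140Offset( const std::array<Std140Type, N> &members,
                                    std::size_t index )
{
    std::size_t offset = 0;
    for( std::size_t i = 0; i < index; ++i )
        offset = alignUp( offset, members[i].alignment ) + members[i].size;
    return alignUp( offset, members[index].alignment );
}
```

By these rules a float declared right after a vec3 sits at offset 12 (packed into the vec3's unused fourth component), while a vec3 declared after a float is pushed to offset 16. A block made only of vec4 members has no such cases, which is the appeal of sticking to vec4.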

Hopefully this should have clarified your doubts.
Cheers
Matias
dark_sylinc
OGRE Team Member
Posts: 5433
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1341

Re: [2.1] Trying to create a new HLMS

Post by dark_sylinc »

Kinslore wrote: I don't really know how to do this. From what I've understood in HlmsPbs, you are doing this with a texture. Is it possible to give a Mat4 directly, using the CommandBuffer ?
Just to clarify, the PBS implementation uses a TextureBuffer instead of a ConstBuffer because TextureBuffers don't have the 64kb limitation.
You can see from the code I suggested that "entityVars" cannot hold more than 1024 matrices (because 1024 * 16 floats * 4 bytes = 64kb), whereas Ogre can auto-instance up to 4096 instances together in the same draw call.
Add just a second variable (e.g. a float) and the limit goes down to 819 (64 bytes + 4 bytes + 12 bytes of padding = 80 bytes. 65536 / 80 = 819.2).
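
The arithmetic above fits in a one-line helper. This is a sketch that assumes a 64kb const buffer (the real limit depends on the GPU) and that each array element is padded up to a multiple of 16 bytes; `maxInstancesPerConstBuffer` is a hypothetical name.

```cpp
#include <cstddef>

// Number of array elements that fit in one const buffer, assuming each element's
// size is rounded up to a multiple of 16 bytes.
constexpr std::size_t maxInstancesPerConstBuffer( std::size_t bytesPerInstance,
                                                  std::size_t bufferBytes = 65536 )
{
    return bufferBytes / ( ( bytesPerInstance + 15 ) / 16 * 16 );
}
```

With a lone mat4 (64 bytes) this gives 1024 instances; with a mat4 plus a float (68 bytes, padded to 80) it gives 819.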

Also HW skinning (skeletal animation) can use like 50 matrices per Renderable in some cases. That will not instance well when using ConstBuffers.
So we use a TextureBuffer instead, and store the address/offset to that TextureBuffer in the ConstBuffer (in other words, we add an extra level of indirection).

In pseudo-code, we're basically doing (no HW skinning / skeletal animation):

Code: Select all

textureBuffer[ drawId ].mvpMatrix
And when doing HW skinning / skeletal animation:

Code: Select all

textureBuffer[ constBuffer[drawId].x ].mvpMatrix
And that's it. We use a TextureBuffer due to the size limitations of the ConstBuffer. But not because of the amount of times we map the buffer per frame.
The rest of the mechanism is the same.
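
As a rough illustration of the extra level of indirection, the sketch below computes the value that would be stored in constBuffer[drawId].x in the pseudo-code above. It assumes each object's matrices are stored back to back in the TextureBuffer, which is a guess for illustration and not necessarily how HlmsPbs lays them out; `firstMatrixIndex` is a hypothetical helper.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// matrixCounts[i] = number of matrices object i stores in the TextureBuffer
// (assumed here: 1 for a non-skinned object, one per bone for a skinned one).
// Returns the TextureBuffer index of the first matrix of object 'drawId'.
template <std::size_t N>
constexpr std::uint32_t firstMatrixIndex( const std::array<std::uint32_t, N> &matrixCounts,
                                          std::size_t drawId )
{
    std::uint32_t offset = 0;
    for( std::size_t i = 0; i < drawId; ++i )
        offset += matrixCounts[i];
    return offset;
}
```

For three objects using 1, 50 and 1 matrices, the offsets are 0, 1 and 51: the ConstBuffer only stores one small index per object, while the bulky matrices live in the TextureBuffer.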
Kinslore
Gnoblar
Posts: 18
Joined: Sun Jul 26, 2015 8:55 pm
x 5

Re: [2.1] Trying to create a new HLMS

Post by Kinslore »

Hello !

Thanks a lot, this really has helped me to understand a lot of things (even if HLMS are still a little scary). I will try this whenever I have time (hope this will be soon !).

But I thought about it today, and have some questions again !

To begin, I am wondering, how is it possible to have more than 1024 Items using the same HLMS, with these buffers (and in this simple case). From your second reply, I can understand that it can be done by using the TextureBuffer. But maybe it has some limits (we have to know how many objects are using the HLMS I guess), or maybe we could use some commands to break the instancing and cheat a little ?
Edit : after investigating the BufferManager, here is what I understand : you fill the active buffer, for X movables using the HLMS (depending on the size of the data you have to fill to get to 64kb). When this buffer is full, you ask for another one (by asking for it or creating it if needed). By pushing the new command, you break the auto instancing done by Ogre, and thus the drawID can begin again, on the new buffer you set. And you can write that for as many movables as you want ! At least this is what I am guessing ?

Also, I had the same idea about the parser : was it the nasty space messing with it ? After some debugging, I think it is not a problem. I launched it with and without the space on this shader (I only erased the magic line) :

Code: Select all

#version 330

// Inputs
in vec4 vertex ;

layout (binding = 0) uniform entityBuffer	// <-- the line in question
{
	mat4 mvpMat ;
} entityVars ;

// Output
out gl_PerVertex
{
	vec4 gl_Position ;
} ;

void main ()
{
	gl_Position = entityVars.mvpMat * vertex ;
}
Notice the marked line with the binding; the second time it was :

Code: Select all

layout(binding = 0) uniform entityBuffer
The result is the same in this case. The parser is splitting the line in 3 parts, line 303 :

Code: Select all

StringVector parts = StringUtil::split(line, " ");
// With these results
parts[0] = "uniform" ;
parts[1] = "entityBuffer{mat4" ; // With the lines ending
parts[2] = "mvpMat" ;
Then, with the parts[2]'s value it tries to get the semantic of "mvpMat", and it seems that's not something it likes.
This is during this debugging that I saw that this line (note the lack of space :p)

Code: Select all

layout(std140) uniform ;
Was saving the other uniforms, because it gets split into only one part, "uniform", triggering the break (the parser expects an array of 3 parts). This is why I was wondering whether it was intended :)
Seeing the rules however, I will always write it !

Anyway, I am going to try everything as soon as possible. Thanks again, that was really helpful ! :D
Kinslore
Gnoblar
Posts: 18
Joined: Sun Jul 26, 2015 8:55 pm
x 5

Re: [2.1] Trying to create a new HLMS

Post by Kinslore »

Hi !

I got some time to play with the HLMS and I must say, I am starting to enjoy it ! It's a powerful toy !

I successfully got the matrix buffer populated, used in the shader, and I could go further and add a pass binding. I am now trying to put some textures, and think I understood how to do it :D

I want to post how I did it here, as a 'minimalistic example' full of comments. However, before that, I have some questions (want it perfect :mrgreen: )

First, when I want to inherit from HLMS, I add the following line :

Code: Select all

#include <OgreHlms.h>
If before it, I do not include this :

Code: Select all

#include <OgreArchive.h>
#include <OgreHardwareVertexBuffer.h>
I get some errors coming from OgreHlms.h :

Code: Select all

error C2061: syntax error : identifier 'ArchiveVec'   OgreMain\include\OgreHlms.h	292
error C2061: syntax error : identifier 'VertexElementSemantic'	OgreMain\include\OgreHlms.h	274
I don't really understand why these errors occur. It seems the includes are missing in the HLMS base class header but are always included first by other classes, thus never triggering the problem ?

Last question, about the HLMS_USER0's enum. Can we register multiple Hlms systems by the same index, or are we really limited to 4 custom Hlms implementations ?

Thanks again for your help !
dark_sylinc
OGRE Team Member
Posts: 5433
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1341

Re: [2.1] Trying to create a new HLMS

Post by dark_sylinc »

Kinslore wrote:Last question, about the HLMS_USER0's enum. Can we register multiple Hlms systems by the same index, or are we really limited to 4 custom Hlms implementations ?
You can't register multiple Hlms with the same index at the same time. You have to unregister the previous one.

Yes, you're limited to 4 custom implementations; but note that you can use HLMS_PBS, HLMS_TOON & HLMS_UNLIT if you never register the default ones we provide.

4 custom implementations should be more than enough. Normally you want 1 to 3 implementations that do one specific thing with high performance; while you can use 1 or 2 that are more focused on features rather than performance (e.g. an Hlms type that has subtypes, and evaluates what subtype/shader to use while iterating per object) if you really need that many types.
For example HLMS_LOW_LEVEL loads the shaders from the old material system instead of creating one from the template.

The limit is not fully arbitrary. The RenderQueue sorts all renderables by state, and it assigns 3 bits in the 64-bit sorting key. Eight different Hlms implementations should be enough for most needs. Increasing it would mean editing the RenderQueue to shift all assigned bits and take bits from some other element.
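
To see why 3 bits cap the count at eight, here is a sketch of packing an Hlms type into a 64-bit sort key. Only the "3 bits of a 64-bit key" part comes from the explanation above; the bit position (`kHlmsTypeShift`) and the helper names are made up for illustration and do not reflect the RenderQueue's actual layout.

```cpp
#include <cstdint>

constexpr unsigned      kHlmsTypeBits  = 3;  // from the post: 3 bits of the 64-bit key
constexpr unsigned      kHlmsTypeShift = 61; // hypothetical position, not Ogre's real layout
constexpr std::uint64_t kHlmsTypeMask  = ( std::uint64_t( 1 ) << kHlmsTypeBits ) - 1;
constexpr unsigned      kMaxHlmsTypes  = 1u << kHlmsTypeBits;

// Stores hlmsType in its bit field, leaving every other bit of the key untouched.
constexpr std::uint64_t packHlmsType( std::uint64_t key, std::uint64_t hlmsType )
{
    return ( key & ~( kHlmsTypeMask << kHlmsTypeShift ) ) |
           ( ( hlmsType & kHlmsTypeMask ) << kHlmsTypeShift );
}

// Reads the Hlms type back out of a sort key.
constexpr std::uint64_t extractHlmsType( std::uint64_t key )
{
    return ( key >> kHlmsTypeShift ) & kHlmsTypeMask;
}
```

Values 0 to 7 round-trip through the key, but a ninth type (value 8) does not fit in the field, so supporting it would mean widening the field and taking bits from something else in the key.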
Kinslore
Gnoblar
Posts: 18
Joined: Sun Jul 26, 2015 8:55 pm
x 5

Re: [2.1] Trying to create a new HLMS

Post by Kinslore »

Hello !

Thanks again for the answer !

So, here is what I did with the HLMS. It is really a small code, but I hope it will be useful to other users, if they want to understand everything.
Feel free to correct everything you wish ! :)

First, let's see what are the shaders we will be working with. I didn't use templates for this... And this is GLSL.

Vertex shader :

Code: Select all

#version 330

// Do not forget this line first. I had problems with the parser if it wasn't before the other layout vars. Moreover, it indicates some infos to the compiler (see Dark_Sylinc's answer)
layout (std140) uniform ;

// Inputs : these names are keywords for Ogre, I believe. You can't change them
in vec4 vertex ;
in vec3 normal ;
in vec2 uv0 ;

// The drawId is derived from what you return in the HLMS (fillBuffersFor) and tracks which Ogre::Item is being drawn.
in uint drawId ;

// We will only use the binding 2 here
layout (binding = 2) uniform entityBuffer
{
	// The array is here because our buffer will contain the data for (max) 1024 Ogre::Item. The drawId is here to help you get the right index.
	// As a ConstBuffer has a fixed maximum size, if you add members to this uniform's structure, the array will have to be shorter (again, see Dark_Sylinc's answer). Be careful with the memory padding too.
	mat4 mvpMat [1024] ;
} entityVars ;

// Output
out gl_PerVertex
{
	vec4 gl_Position ;
} ;

// Outputs passed on to the fragment shader
out block
{
	vec3 normal ;
	vec2 uv ;
} outPs ;

void main ()
{
	// We simply compute the real position
	gl_Position = entityVars.mvpMat[drawId] * vertex ;
	
	// And some Fragment shaders infos
	outPs.normal = normal ;
	outPs.uv = uv0 ;
}
And then its Fragment shader :

Code: Select all

#version 330

// Again, do not forget this line
layout (std140) uniform ;

// Input
layout (binding = 0) uniform passBuffer
{
	// vec3 LightDir + float (as memory padding will pack it with a vec4, even if we give a vec3 only)
	vec4 lightDir ;
	vec4 diffuseColour ;
} passVars ;

// Some texture
uniform sampler2D baseTex ;

// Data from the vertex shader
in block
{
	vec3 normal ;
	vec2 uv ;
} inPs ;

// Output
out vec3 fragColor ;

void main ()
{
	// Diffuse component and a texture part
	// This is a directional light
	fragColor = max(passVars.diffuseColour.rgb * dot(-passVars.lightDir.xyz, inPs.normal), 0.0) * texture(baseTex, inPs.uv).rgb ;
}
Then, here is a simple HlmsBufferManager that will prepare only ConstBuffers (no TexBuffer I mean). It will be something shared among the HLMS (having to prepare the buffers), so every HLMS will inherit from it.
Note that you can find the complete versions within the PBS and Unlit HLMS in Ogre sources.

Here is the header (HlmsBufferManager.h)

Code: Select all

//////////////////////////////////////////////
//			HlmsBufferManager.h				//
//										//
//										//
//////////////////////////////////////////////

#ifndef __HlmsBufferManager
	#define __HlmsBufferManager

	// Forward declarations
	namespace Ogre
	{
		class VaoManager ;
		class SceneManager ;

		class Renderable ;
		class CompositorShadowNode ;

		class CommandBuffer ;

		struct QueuedRenderable ;
	}

	#include <OgrePrerequisites.h>
	#include <OgreArchive.h>
	#include <OgreHardwareVertexBuffer.h>
	#include <OgreHlms.h>

	#include <vector>
	
	// This one will only be an HLMS
	class HlmsBufferManager : public Ogre::Hlms
	{
		protected :
		
			// Attributes
			// The VAO Manager (useful for the buffer's creation)
			Ogre::VaoManager* _vaoManager ;
			// The buffers in an array
			std::vector<Ogre::ConstBufferPacked*> _constBuffers ;
			// The active buffer
			Ogre::uint32 _currentConstBuffer ;
			// Some pointers to those buffers
			Ogre::uint32* _bufferStartIndex ;
			Ogre::uint32* _bufferCurrentIndex ;
			// A buffer size
			Ogre::uint32 _bufferSize ;
		
		public :
		
			// Constructor, destructor
			HlmsBufferManager (Ogre::Archive* dataFolder, Ogre::ArchiveVec* libraryFolders, Ogre::VaoManager* vaoManager, std::string name, Ogre::HlmsTypes type) ;
			~HlmsBufferManager () ;

			// HLMS functions
			virtual const Ogre::HlmsCache* createShaderCacheEntry (Ogre::uint32 renderableHash, const Ogre::HlmsCache& passCache, Ogre::uint32 finalHash, const Ogre::QueuedRenderable& queuedRenderable) ;
			virtual void calculateHashFor (Ogre::Renderable* renderable, Ogre::uint32& outHash, Ogre::uint32& outCasterHash) ;
			virtual Ogre::HlmsCache preparePassHash (const Ogre::CompositorShadowNode* shadowNode, bool casterPass, bool dualParaboloid, Ogre::SceneManager* sceneManager) ;

			virtual Ogre::uint32 fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer) = 0 ;
			virtual Ogre::uint32 fillBuffersForV1 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer) ;
			virtual Ogre::uint32 fillBuffersForV2 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer) ;

			virtual void preCommandBufferExecution (Ogre::CommandBuffer* command) ;
			virtual void postCommandBufferExecution (Ogre::CommandBuffer* command) ;
			virtual void frameEnded () ;

			// Buffers functions
			Ogre::uint32* mapNextConstBuffer (Ogre::CommandBuffer* command) ;
			void unmapConstBuffer () ;
	} ;
	
#endif
The cpp code :

Code: Select all

//////////////////////////////////////////////
//			HlmsBufferManager.cpp			//
//											//
//											//
//////////////////////////////////////////////

/// Includes ---------------------------------

// Locals
#include "HlmsBufferManager.h"

// Natives
#include <cstddef>

#include <OgreRenderQueue.h>
#include <OgreRenderable.h>

#include <CommandBuffer/OgreCbShaderBuffer.h>
#include <CommandBuffer/OgreCommandBuffer.h>

#include <Vao/OgreConstBufferPacked.h>
#include <Vao/OgreVaoManager.h>

/// Constructor, destructor ------------------

HlmsBufferManager::HlmsBufferManager (Ogre::Archive* dataFolder, Ogre::ArchiveVec* libraryFolders, Ogre::VaoManager* vaoManager, std::string name, Ogre::HlmsTypes type)
:	Ogre::Hlms (type, name, dataFolder, libraryFolders),
	_vaoManager (vaoManager),
	_constBuffers (),
	_currentConstBuffer (0),
	_bufferStartIndex (NULL),
	_bufferCurrentIndex (NULL),
	_bufferSize (0)
{
	// We register it with the type given
	// You can register it with every id available, as long as it does not overlap another registered HLMS' id (see Dark_Sylinc's answer).
	// Concerning the dataFolder and the libraryFolder, the shaders in those will be parsed (this is where you have your templates).
}

HlmsBufferManager::~HlmsBufferManager ()
{
	// Buffer delete
	for (unsigned int i = 0 ; i < _constBuffers.size() ; i++)
	{
		// Check whether it is still mapped
		if (_constBuffers[i]->getMappingState() != Ogre::MS_UNMAPPED)
			_constBuffers[i]->unmap(Ogre::UO_UNMAP_ALL) ;

		// We can destroy it
		_vaoManager->destroyConstBuffer(_constBuffers[i]) ;
	}
}

/// Hlms Functions ---------------------------

const Ogre::HlmsCache* HlmsBufferManager::createShaderCacheEntry (Ogre::uint32 renderableHash, const Ogre::HlmsCache& passCache, Ogre::uint32 finalHash, const Ogre::QueuedRenderable& queuedRenderable)
{
	// We won't be going further than the base Hlms implementation
	const Ogre::HlmsCache* retVal = Ogre::Hlms::createShaderCacheEntry(renderableHash, passCache, finalHash, queuedRenderable) ;

	return retVal ;
}

void HlmsBufferManager::calculateHashFor (Ogre::Renderable* renderable, Ogre::uint32& outHash, Ogre::uint32& outCasterHash)
{
	// Same here
	Hlms::calculateHashFor(renderable, outHash, outCasterHash) ;
}

Ogre::HlmsCache HlmsBufferManager::preparePassHash (const Ogre::CompositorShadowNode* shadowNode, bool casterPass, bool dualParaboloid, Ogre::SceneManager* sceneManager)
{
	// Again, the base function only
	Ogre::HlmsCache retVal = Hlms::preparePassHash (shadowNode, casterPass, dualParaboloid, sceneManager) ;

	return retVal ;
}

Ogre::uint32 HlmsBufferManager::fillBuffersForV1 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer)
{
	// I make no distinction between V1 and V2 objects (I only tested with V2 Items)
	return fillBuffersFor(cache, queuedRenderable, casterPass, lastCacheHash, commandBuffer) ;
}

Ogre::uint32 HlmsBufferManager::fillBuffersForV2 (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer)
{
	// See V1's comment
	return fillBuffersFor(cache, queuedRenderable, casterPass, lastCacheHash, commandBuffer) ;
}

void HlmsBufferManager::preCommandBufferExecution (Ogre::CommandBuffer* command)
{
	// This is called after the CommandBuffer has been populated, but before its execution. We unmap the buffer that is still mapped, to be sure everything is fine
	unmapConstBuffer() ;
}

void HlmsBufferManager::postCommandBufferExecution (Ogre::CommandBuffer* command)
{
	// Nothing to do for this simple example (called after the CommandBuffer execution)
}

void HlmsBufferManager::frameEnded ()
{
	// Called after the frame is done. Here we reset the active buffer index
	_currentConstBuffer = 0 ;
}

/// Buffers Functions ------------------------

Ogre::uint32* HlmsBufferManager::mapNextConstBuffer (Ogre::CommandBuffer* command)
{
	// The previous buffer has to be unmapped first
	unmapConstBuffer() ;

	// Let's see what we have to map
	if (_currentConstBuffer >= _constBuffers.size())
	{
		// We have to create a new buffer, because all the existing ones are already populated.
		// We cap a ConstBuffer at 64 KB, or less if the limit reported by the VaoManager is lower (see Dark_Sylinc's answer)
		size_t bufferSize = std::min<size_t>(65536, _vaoManager->getConstBufferMaxSize()) ;

		// This is where the VaoManager comes in
		Ogre::ConstBufferPacked* newBuff = _vaoManager->createConstBuffer(bufferSize, Ogre::BT_DYNAMIC_PERSISTENT, NULL, false) ;

		// Keep it in memory
		_constBuffers.push_back(newBuff) ;
	}

	// We can get the next buffer then
	Ogre::ConstBufferPacked* buffer = _constBuffers[_currentConstBuffer] ;

	// Map it
	_bufferStartIndex = reinterpret_cast<Ogre::uint32*>(buffer->map(0, buffer->getNumElements())) ;

	// Update some vars
	_bufferCurrentIndex = _bufferStartIndex ;
	// Number of elements divided by the size of a float in bytes (4)
	_bufferSize = buffer->getNumElements() >> 2 ;

	// Push some commands, as the buffer is changed
	// The buffer will be bound to the binding 2 ("layout (binding = 2)"), to both shaders
	Ogre::CbShaderBuffer* commandBuffer = command->addCommand<Ogre::CbShaderBuffer>() ;
	*commandBuffer = Ogre::CbShaderBuffer(Ogre::VertexShader, 2, buffer, 0, buffer->getTotalSizeBytes()) ;

	commandBuffer = command->addCommand<Ogre::CbShaderBuffer>() ;
	*commandBuffer = Ogre::CbShaderBuffer(Ogre::PixelShader, 2, buffer, 0, buffer->getTotalSizeBytes()) ;

	// Can return the buffer's pointer
	return _bufferCurrentIndex ;
}

void HlmsBufferManager::unmapConstBuffer ()
{
	// If we had a buffer, we can unmap it
	if (_bufferStartIndex)
	{
		// Unmap the current one
		Ogre::ConstBufferPacked* buffer = _constBuffers[_currentConstBuffer] ;

		buffer->unmap(Ogre::UO_KEEP_PERSISTENT, 0, (_bufferCurrentIndex - _bufferStartIndex) * sizeof(float))  ;

		// Get the next index
		_currentConstBuffer++ ;

		// Reset
		_bufferStartIndex = NULL ;
		_bufferCurrentIndex = NULL ;
		_bufferSize = 0 ;
	}
}
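To make the buffer recycling above easier to follow, here is a minimal, Ogre-free sketch of the same idea. It assumes a GPU buffer can be modelled as a plain std::vector<float>; BufferPool, mapNext, frameEnded and createdBuffers are hypothetical names picked for the illustration, not Ogre API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the const buffer pool of HlmsBufferManager.
// A "buffer" is just a std::vector<float> here; no GPU is involved.
class BufferPool
{
public:
    explicit BufferPool(std::size_t floatsPerBuffer)
        : _floatsPerBuffer(floatsPerBuffer), _next(0)
    {
        assert(floatsPerBuffer > 0);
    }

    // Returns the next buffer to fill, creating one only if every
    // existing buffer was already handed out during this frame.
    // The reference is only valid until the next call to mapNext().
    std::vector<float>& mapNext()
    {
        if (_next >= _buffers.size())
            _buffers.push_back(std::vector<float>(_floatsPerBuffer, 0.0f));
        return _buffers[_next++];
    }

    // Called once the frame is done: the buffers are kept, only the
    // cursor is reset, so the next frame reuses them from the start.
    void frameEnded() { _next = 0; }

    std::size_t createdBuffers() const { return _buffers.size(); }

private:
    std::size_t _floatsPerBuffer;
    std::size_t _next;
    std::vector<std::vector<float> > _buffers;
};
```

If two frames in a row each call mapNext() twice (with frameEnded() in between), createdBuffers() stays at 2: the second frame reuses what the first one created.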
Then, the simple HLMS inheriting from it. This time, it's the real one !

Header

Code: Select all

//////////////////////////////////////////////
//			HlmsGround.h					//
//											//
//											//
//////////////////////////////////////////////

#ifndef __HlmsGround
	#define __HlmsGround

	// Forward declarations
	namespace Ogre
	{
		class VaoManager ;
		class ConstBufferPacked ;
	}

	class OgreInitializer ;

	#include "HlmsBufferManager.h"

	#include <OgreHlmsDatablock.h>
	#include <OgreMatrix4.h>

	// This is a simple datablock. I am not sure what to put in there for now, but in this simple example it is not needed
	class HlmsGroundDatablock : public Ogre::HlmsDatablock
	{
		private :

			// Attributes

		public :

			// Constructor, destructor
			HlmsGroundDatablock (Ogre::IdString name, Ogre::Hlms* creator, const Ogre::HlmsMacroblock* macroblock, const Ogre::HlmsBlendblock* blendblock, const Ogre::HlmsParamVec& params)
			:	HlmsDatablock(name, creator, macroblock, blendblock, params)
			{
				// Nothing to do
			}

			~HlmsGroundDatablock ()
			{
				// Nothing to do
			}
	} ;
	
	// Here is the real class, inheriting from the BufferManager
	class HlmsGround : public HlmsBufferManager
	{
		private :
		
			// Attributes
			// The buffers used for the passes (one for every pass, we can't map the same one across different frames)
			unsigned int _currentPassBuffer ;
			std::vector<Ogre::ConstBufferPacked*> _passBuffers ;

			// The view-projection matrix. It is computed only once per pass, so we keep it here
			Ogre::Matrix4 _passViewProj ;

			// Sampler block we will use for the texture
			const Ogre::HlmsSamplerblock* _samplerBlock ;
		
		public :
		
			// Constructor, destructor
			HlmsGround (Ogre::Archive* dataFolder, Ogre::ArchiveVec* libraryFolders, Ogre::VaoManager* vaoManager) ;
			~HlmsGround () ;

			// HLMS functions
			virtual const Ogre::HlmsCache* createShaderCacheEntry (Ogre::uint32 renderableHash, const Ogre::HlmsCache& passCache, Ogre::uint32 finalHash, const Ogre::QueuedRenderable& queuedRenderable) ;
			virtual Ogre::HlmsCache preparePassHash (const Ogre::CompositorShadowNode* shadowNode, bool casterPass, bool dualParaboloid, Ogre::SceneManager* sceneManager) ;
			virtual Ogre::uint32 fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::uint32 lastTextureHash) ;
			virtual Ogre::uint32 fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer) ;
			virtual void frameEnded () ;

			virtual HlmsGroundDatablock* createDatablockImpl (Ogre::IdString datablockName, const Ogre::HlmsMacroblock* macroblock, const Ogre::HlmsBlendblock* blendblock, const Ogre::HlmsParamVec& paramVec) ;
	} ;
	
#endif
And the source file attached to it

Code: Select all

//////////////////////////////////////////////
//			HlmsGround.cpp					//
//											//
//											//
//////////////////////////////////////////////

/// Defines ----------------------------------

// Buffer : Light direction + light's colour
// I put the total pass' size here, in values
#define GROUND_HLMS_PASS_BUFFER_SIZE 8

/// Includes ---------------------------------

// Locals
#include "HlmsGround.h"

// Natives
#include <cstddef>
#include <cstring>

#include <iostream>

#include <OgreHlmsCommon.h>
#include <OgreHlmsManager.h>
#include <OgreRoot.h>

#include <OgreLight.h>
#include <OgreCamera.h>
#include <OgreRenderable.h>
#include <OgreRenderQueue.h>
#include <OgreTextureManager.h>

#include <CommandBuffer/OgreCbShaderBuffer.h>
#include <CommandBuffer/OgreCbTexture.h>
#include <CommandBuffer/OgreCommandBuffer.h>

#include <Vao/OgreConstBufferPacked.h>
#include <Vao/OgreVaoManager.h>

/// Constructor, destructor ------------------

// Here you can see we give the name we want it to have, and its type
HlmsGround::HlmsGround (Ogre::Archive* dataFolder, Ogre::ArchiveVec* libraryFolders,  Ogre::VaoManager* vaoManager)
:	HlmsBufferManager (dataFolder, libraryFolders, vaoManager, "Ground", Ogre::HLMS_USER0),
	_currentPassBuffer (0),
	_passBuffers (),
	_passViewProj (Ogre::Matrix4::IDENTITY),
	_samplerBlock (NULL)
{
	// We register this HLMS class with the Ogre::HLMS_USER0 id. For another one, you could use HLMS_USER1, and so on

	// Initialize our sampler block for the texture
	// We fill in a reference sampler block and ask Ogre for the matching one it keeps internally. Identical blocks are cached, which avoids having several blocks doing the same thing
	// That means less API overhead, as we won't switch state for nothing
	Ogre::HlmsSamplerblock samplerBlockRef ;
	// You can change every parameter here. This is defining the sampler block you want from Ogre
	// Let's put WRAP on U and V, it will repeat the texture (you can MIRROR, CLAMP...)
	samplerBlockRef.mU = Ogre::TAM_WRAP ;
	samplerBlockRef.mV = Ogre::TAM_WRAP ;

	// Then, let's ask Ogre for its internal block. We will keep it and use it !
	Ogre::HlmsManager* hlmsMan = Ogre::Root::getSingleton().getHlmsManager() ;
	_samplerBlock = hlmsMan->getSamplerblock(samplerBlockRef) ;
}

HlmsGround::~HlmsGround ()
{
	// We have to delete our pass buffer
	for (unsigned int i = 0 ; i < _passBuffers.size() ; i++)
		_vaoManager->destroyConstBuffer(_passBuffers[i]) ;
}

/// HLMS functions ---------------------------

const Ogre::HlmsCache* HlmsGround::createShaderCacheEntry (Ogre::uint32 renderableHash, const Ogre::HlmsCache& passCache, Ogre::uint32 finalHash, const Ogre::QueuedRenderable& queuedRenderable)
{
	// First the parent one
	const Ogre::HlmsCache* retVal = HlmsBufferManager::createShaderCacheEntry(renderableHash, passCache, finalHash, queuedRenderable) ;

	// Here is where you can put the default values for the shaders.
	// So we set the samplerId for our texture (to 0, let's be fair)
	retVal->pixelShader->getDefaultParameters()->setNamedConstant("baseTex", 0) ;

	// We have to switch the values in the program
	mRenderSystem->_setProgramsFromHlms(retVal) ;

	// Doing it only with the pixel shader as it is the only one needing it
	Ogre::GpuProgramParametersSharedPtr psParams = retVal->pixelShader->getDefaultParameters() ;
	mRenderSystem->bindGpuProgramParameters(Ogre::GPT_FRAGMENT_PROGRAM, psParams, Ogre::GPV_ALL) ;
	
	// It would look like this for the vertex shader
	/*Ogre::GpuProgramParametersSharedPtr vsParams = retVal->vertexShader->getDefaultParameters() ;
	mRenderSystem->bindGpuProgramParameters(Ogre::GPT_VERTEX_PROGRAM, vsParams, Ogre::GPV_ALL) ;*/

	// Done
	return retVal ;
}

Ogre::HlmsCache HlmsGround::preparePassHash (const Ogre::CompositorShadowNode* shadowNode, bool casterPass, bool dualParaboloid, Ogre::SceneManager* sceneManager)
{
	// Let's call the parent again
	Ogre::HlmsCache retVal = Hlms::preparePassHash (shadowNode, casterPass, dualParaboloid, sceneManager) ;

	// Let's prepare our viewProj matrix (before the pass)
	// You can get the camera from wherever you want, but we will take the current active camera here
	Ogre::Camera* cam = sceneManager->getCameraInProgress() ;

	Ogre::Matrix4 projection = cam->getProjectionMatrixWithRSDepth() ;

	Ogre::RenderTarget* rt = sceneManager->getCurrentViewport()->getTarget() ;

	// Maybe we need to flip it (RTT at least)
	if (rt->requiresTextureFlipping())
	{
		projection[1][0] = -projection[1][0] ;
		projection[1][1] = -projection[1][1] ;
		projection[1][2] = -projection[1][2] ;
		projection[1][3] = -projection[1][3] ;
	}

	Ogre::Matrix4 viewProj = projection * cam->getViewMatrix(true) ;
	_passViewProj = viewProj ;

	// This is what saves us from mapping errors
	// Every pass takes a fresh buffer from the array (one is added if needed)
	// Then, when the frame has ended (we know it via the frameEnded() callback), we reset our counter
	if (_passBuffers.size() <= _currentPassBuffer)
	{
		// We have to create a new buffer
		_passBuffers.push_back(_vaoManager->createConstBuffer(GROUND_HLMS_PASS_BUFFER_SIZE << 2, Ogre::BT_DYNAMIC_PERSISTENT, NULL, false)) ;
	}

	// Get the buffer to map. The next call will belong to the next pass, so we add 1 to the current buffer index
	Ogre::ConstBufferPacked* passBuffer = _passBuffers[_currentPassBuffer++] ;

	// Here is where we fill the pass buffer.
	// In this simple one, only the light is set
	// Map the buffer
	float* passBufferPtr = reinterpret_cast<float*>(passBuffer->map(0, passBuffer->getNumElements())) ;

	// Let's put the light direction (directional light)
	// Fill it with the light you wish ("someDirectionalLight" stands for your own Ogre::Light*)
	Ogre::Vector3 lightParam = someDirectionalLight->getDirection() ;

	// Fill it
	*passBufferPtr++ = lightParam.x ;
	*passBufferPtr++ = lightParam.y ;
	*passBufferPtr++ = lightParam.z ;
	// Because of the padding, we set another float to be sure
	*passBufferPtr++ = 0.f ;

	// Now its colour (same light as for the direction)
	const Ogre::ColourValue& colour = someDirectionalLight->getDiffuseColour() ;

	// Simply
	*passBufferPtr++ = colour.r ;
	*passBufferPtr++ = colour.g ;
	*passBufferPtr++ = colour.b ;
	*passBufferPtr++ = colour.a ;

	// Done, we can unmap the buffer
	passBuffer->unmap(Ogre::UO_KEEP_PERSISTENT) ;

	return retVal ;
}

Ogre::uint32 HlmsGround::fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::uint32 lastTextureHash)
{
	// I didn't get any call for this one, but at least the console will tell us if it is so
	std::cout << "fillBuffersFor Tex" << std::endl ;

	return 0 ;
}

Ogre::uint32 HlmsGround::fillBuffersFor (const Ogre::HlmsCache* cache, const Ogre::QueuedRenderable& queuedRenderable, bool casterPass, Ogre::uint32 lastCacheHash, Ogre::CommandBuffer* commandBuffer)
{
	// This is the big one
	// First let's see if this HLMS was the last one used
	if (OGRE_EXTRACT_HLMS_TYPE_FROM_CACHE_HASH(lastCacheHash) != Ogre::HLMS_USER0)
	{
		// If not, it means the textures and the pass buffer are not bound to our shader (or at least, those are not from this HLMS)
		// Put the pass buffer (from the current index minus one, as we incremented it in the preparePassHash) to the slot 0 of both shaders
		// Remember the "layout (binding = 0)" ? That's why the index is 0
		Ogre::ConstBufferPacked* passBuffer = _passBuffers[_currentPassBuffer - 1] ;
		*commandBuffer->addCommand<Ogre::CbShaderBuffer>() = Ogre::CbShaderBuffer(Ogre::VertexShader, 0, passBuffer, 0, passBuffer->getTotalSizeBytes()) ;
		*commandBuffer->addCommand<Ogre::CbShaderBuffer>() = Ogre::CbShaderBuffer(Ogre::PixelShader, 0, passBuffer, 0, passBuffer->getTotalSizeBytes()) ;

		// Also bind the texture
		// For this example, I took this texture from the Ogre's media folder.
		Ogre::TexturePtr tex = Ogre::TextureManager::getSingletonPtr()->getByName("floor_diffuse.PNG") ;

		// Well, it is supposed to be loaded from somewhere else, but...
		if (tex.isNull())
			tex = Ogre::TextureManager::getSingleton().load("floor_diffuse.PNG", Ogre::ResourceGroupManager::DEFAULT_RESOURCE_GROUP_NAME) ;

		// We add the command for the texture. Link it to the sampler 0 (we set the baseTex index to 0 earlier) and use our sampler block
		*commandBuffer->addCommand<Ogre::CbTexture>() = Ogre::CbTexture(0, true, tex.getPointer(), _samplerBlock) ;

		// The commandBuffer seems to be a... Buffer containing every "state change" you wish. Before the draw, with the HLMS, we then change the textures, the buffers, and so on
	}

	// We now have to fill the data for an instance (and not a pass)
	// We keep the size of what we write, to track where we are in the buffer
	// We write only a mat4, so 16 values
	Ogre::uint32 valuesToWrite = 16 ;

	// Let's check if the buffer is big enough
	bool bufferTooSmall = (_bufferCurrentIndex - _bufferStartIndex + valuesToWrite) > _bufferSize ;

	if (bufferTooSmall)
	{
		// We have to ask for a new buffer, from the HlmsBufferManager parent
		mapNextConstBuffer(commandBuffer) ;
	}

	// Fill the buffer then.
	// Get the World mat
	const Ogre::Matrix4& worldMat = queuedRenderable.movableObject->_getParentNodeFullTransform() ;

	// We can then compute the final matrix
	Ogre::Matrix4 mvpStandard = _passViewProj * worldMat ;
	// The transpose is needed because Ogre::Matrix4 is stored row by row, while the GLSL mat4 is read column by column (see hyyou's note further down)
	Ogre::Matrix4 mvpMat = mvpStandard.transpose() ;

	// We can push the matrix into our buffer
	memcpy(_bufferCurrentIndex, &mvpMat, 16 * sizeof(float)) ;

	// We have to update the offset for the next Item calling this function before its draw
	_bufferCurrentIndex += valuesToWrite ;

	// This drawId is important (see Dark_Sylinc's answer)
	// Keep in mind that only the first one has to be correct, as Ogre's auto instancing will increment the base value automatically.
	Ogre::uint32 drawId = ((_bufferCurrentIndex - _bufferStartIndex) / valuesToWrite) - 1 ;
	return drawId ;
}

void HlmsGround::frameEnded ()
{
	// We have to call this one, it is resetting the constBuffer vars for us here
	HlmsBufferManager::frameEnded() ;

	// And as we added a pass component, we have to reset it ourselves
	_currentPassBuffer = 0 ;
}

HlmsGroundDatablock* HlmsGround::createDatablockImpl (Ogre::IdString datablockName, const Ogre::HlmsMacroblock* macroblock, const Ogre::HlmsBlendblock* blendblock, const Ogre::HlmsParamVec& paramVec)
{
	// Only return the datablock
	return OGRE_NEW HlmsGroundDatablock (datablockName, this, macroblock, blendblock, paramVec) ;
}
I think everything is here; do not hesitate to tell me if something is missing or if you notice mistakes. Anyway, I hope it will be useful !

My only doubt is the matrix transpose. OpenGL expects column-major matrices while Ogre::Matrix4 is row-major, which is why I thought the transpose was needed, but the PBS doesn't seem to make any difference between Direct3D and OpenGL, from what I saw.
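Here is a small sketch of what the transpose does in memory. It assumes, as with Ogre::Matrix4, that the CPU-side matrix is stored row by row, and that the shader reads the 16 floats column by column (the GLSL default). Mat4 and toColumnMajor are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 4x4 matrix stored row by row: m[row][col].
struct Mat4
{
    float m[4][4];
};

// Writes the 16 values of 'src' into 'dst' column by column, which is
// the order a GLSL mat4 reads them in by default. This is equivalent
// to transposing the matrix and then copying its raw memory.
inline void toColumnMajor(const Mat4& src, float* dst)
{
    assert(dst != 0);
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 4; ++row)
            dst[col * 4 + row] = src.m[row][col];
}
```

For example, an X translation stored at m[0][3] ends up at index 12 of the output, which is where a column-major mat4 expects it. A raw memcpy without the transpose would leave it at index 3.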

But, your killer answer really helped me, thanks Matias ! I can get going.

EDIT : After some head banging against the wall, I added some corrections. The buffers had wrong sizes (it only worked by some magic), I didn't flip the projection (so RTT was messed up), and the parameters for the textures weren't really bound, so you couldn't have more than 1 texture. It is corrected now, sorry !
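Since the buffer sizes were the tricky part, the bookkeeping can be checked with a few lines of arithmetic. This sketch assumes 4-byte floats and one 4x4 matrix (16 values) per instance, as in the code above; bytesForValues, instancesPerBuffer and drawIdAfterWrite are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Size in bytes of a buffer holding 'values' floats (the same thing
// as the "<< 2" used when creating the pass buffer, for 4-byte floats).
inline std::size_t bytesForValues(std::size_t values)
{
    return values * sizeof(float);
}

// How many instances fit in a buffer of 'bufferBytes' bytes when each
// instance writes 'valuesPerInstance' floats.
inline std::size_t instancesPerBuffer(std::size_t bufferBytes,
                                      std::size_t valuesPerInstance)
{
    assert(valuesPerInstance > 0);
    return bufferBytes / bytesForValues(valuesPerInstance);
}

// drawId of the instance that was just written: 'valuesWritten' is the
// distance from the start of the buffer AFTER writing that instance.
inline std::size_t drawIdAfterWrite(std::size_t valuesWritten,
                                    std::size_t valuesPerInstance)
{
    assert(valuesPerInstance > 0 && valuesWritten >= valuesPerInstance);
    return valuesWritten / valuesPerInstance - 1;
}
```

With these assumptions, the 8-value pass buffer is 32 bytes, a 64 KB buffer holds 1024 matrices, and the first instance written gets drawId 0.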

EDIT2: Added the sampler block support and the multi pass buffer stuff, and corrected the Ogre::HlmsTypes that was given in the HlmsBufferManager (it is now given by the HlmsGround).
Last edited by Kinslore on Wed Apr 27, 2016 5:38 am, edited 4 times in total.
xrgo
OGRE Expert User
OGRE Expert User
Posts: 1148
Joined: Sat Jul 06, 2013 10:59 pm
Location: Chile
x 169

Re: [Solved][2.1] Trying to create a new HLMS

Post by xrgo »

Thank you very much for this!
I am trying to use this code but I get a crash when I create the datablock:

Code: Select all

    Ogre::RenderSystem *renderSystem = Ogre::Root::getSingleton().getRenderSystem();
    Ogre::VaoManager *vaoManager = renderSystem->getVaoManager();

    Ogre::ArchiveVec library;

    //Ogre::Archive *archiveLibrary = Ogre::ArchiveManager::getSingletonPtr()->load("../../media/Hlms/Common/GLSL", "FileSystem", true );
    //library.push_back( archiveLibrary );

    Ogre::Archive *archivePbs = Ogre::ArchiveManager::getSingletonPtr()->load( "../../media/CustomHlms/Hlms", "FileSystem", true ); //here are those shaders
    hlmsGround = OGRE_NEW HlmsGround( archivePbs, &library, vaoManager );
    Ogre::Root::getSingleton().getHlmsManager()->registerHlms( hlmsGround );

    Ogre::String datablockName = "myGroundDatablock";
    HlmsGroundDatablock *groundDatablock = static_cast<HlmsGroundDatablock*>( hlmsGround->createDatablock( datablockName,
                                                  datablockName,
                                                  Ogre::HlmsMacroblock(),
                                                  Ogre::HlmsBlendblock(),
                                                  Ogre::HlmsParamVec() ) ); //CRASH HERE
is this code ok?

Post by Kinslore »

Hello !

I'm glad it was helpful ! :D

About the crash, I am not sure. What is it telling you ?
If I remember correctly, I only register the HLMS class and then call setDatablock("GroundRenderer"), with a material file containing :

Code: Select all

hlms GroundRenderer Ground
{
    // Will put later some values to parse, like textures layers or such
}
Another thing I can think of is that I separated my shader sources from the PBS ones: I created a GroundHLMS shader folder and gave it to the HlmsGround. I don't know your folder hierarchy (I only saw "archivePbs"), but maybe it is trying to set some constants in the PBS shaders (they got parsed because they were there), and those uniforms do not exist.

I will check the sources and give more details when I'm back home. But I noticed another mistake : I use the BufferManager to register the HLMS as USER0. It should be the Ground giving this parameter. Do you have multiple HLMS using the BufferManager ? I will correct and complete this tonight (well, I don't know if it will be tonight for you).

Hope this will at least help a little !

Post by Kinslore »

Hey !

I checked and indeed, here is how I do it :

When I initialize the HLMS :

Code: Select all

// As seen in the samples
Ogre::String dataFolder = cfg.getSetting("DoNotUseAsResource", "Hlms", "") ;

...

// Get the new folder. Not giving the shader syntax as I know I only use GLSL
Ogre::Archive* archiveGround = Ogre::ArchiveManager::getSingletonPtr()->load(dataFolder + "Hlms/Ground/", "FileSystem", true) ;
HlmsGround* hlmsGround = OGRE_NEW HlmsGround (archiveGround, &lib, _ogreRoot->getRenderSystem()->getVaoManager()) ;
_ogreRoot->getHlmsManager()->registerHlms(hlmsGround) ;
Then, in a .material file, I declare the material using this HLMS :

Code: Select all

hlms WorldBuilder/GroundTest Ground
{
    // Nothing for now
}
And when creating the mesh, what I need is :

Code: Select all

...
// Create the item
Ogre::Item* tileItem = _sceneManager->createItem(newTile, Ogre::SCENE_STATIC) ;

// Attach it to a node
tileNode->attachObject(tileItem) ;

// Material setting !
tileItem->setDatablock("WorldBuilder/GroundTest") ;
...
And I get no crash. My shader folder is only for this HLMS, and contains only GLSL (because I use OpenGL).

Let me know if this can help you !

Post by xrgo »

Thank you very much!!!
It turns out everything was ok...
I was doing something stupid: I was creating the datablock before the HlmsGround, lol.
But then I got this error:
0(20) : error C7532: layout qualifier 'binding' requires "#version 420" or later
0(20) : error C0000: ... or #extension GL_ARB_shading_language_420pack : enable
which I solved by replacing version 330 with 430

now it seems to be working ok =)
Thank you for this awesome contribution!! :P

Post by xrgo »

I have another problem...
I am porting my sky system to hlms using this code. before I was using low level materials, and I had this:

Code: Select all

texture_unit
			{
				texture SkyTones.png gamma
				filtering			trilinear
				tex_address_mode	wrap clamp
			}
And it is absolutely necessary for me to control the tex_address_mode, so I tried

Code: Select all

      // Also bind the texture
      // For this example, I took this texture from the Ogre's media folder.
      Ogre::TexturePtr tex = Ogre::TextureManager::getSingletonPtr()->getByName("SkyTones.png") ;

      // Well, it is supposed to be loaded from somewhere else, but...
      if (tex.isNull())
         tex = Ogre::TextureManager::getSingleton().load("SkyTones.png", Ogre::ResourceGroupManager::DEFAULT_RESOURCE_GROUP_NAME) ;

           Ogre::HlmsSamplerblock* samplerblock = OGRE_NEW Ogre::HlmsSamplerblock();
           samplerblock->mU = Ogre::TextureAddressingMode::TAM_WRAP;
           samplerblock->mV = Ogre::TextureAddressingMode::TAM_CLAMP;
           samplerblock->setFiltering( Ogre::TFO_TRILINEAR );

      // We add the command for the texture. Link it to the sampler 0 (we set the baseTex index to 0 earlier)
      *commandBuffer->addCommand<Ogre::CbTexture>() = Ogre::CbTexture(0, true, tex.getPointer(), samplerblock) ;
But with no luck: the moment I pass a samplerblock as the 4th argument of CbTexture(), I get banding, no matter what filtering I use.

I also have another texture that needs to be wrap wrap, for tiling... When I don't set a samplerblock I get tiling, but only at certain camera angles: when I move the camera, the tiling flickers on and off. Then, if I set a samplerblock using wrap wrap, it doesn't wrap at all =(

Maybe that's not the place to configure the samplerblock?
User avatar
dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 5433
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1341

Re: [Solved][2.1] Trying to create a new HLMS

Post by dark_sylinc »

You need to create your samplerblocks via the HlmsManager. I'm surprised it didn't crash.

Code: Select all

const HlmsSamplerblock *finalSamplerblock = 0;
//During creation
void init()
{
    HlmsSamplerblock samplerblock;
    samplerblock.mU             = TAM_WRAP;
    samplerblock.mV             = TAM_CLAMP;
    samplerblock.mW             = TAM_CLAMP;
    samplerblock.mBorderColour  = ColourValue::White;

    //Assign the global pointer (declaring a new local here would shadow it and leave the global null)
    finalSamplerblock = mHlmsManager->getSamplerblock( samplerblock );
}

void everyFrame()
{
    *commandBuffer->addCommand<Ogre::CbTexture>() = Ogre::CbTexture(0, true, tex.getPointer(), finalSamplerblock) ;
}

void atShutdown()
{
    if( finalSamplerblock )
    {
        mHlmsManager->destroySamplerblock( finalSamplerblock );
        finalSamplerblock = 0;
    }
}
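The point of going through the manager is that identical blocks get shared instead of duplicated. The sketch below shows that idea in isolation. It is not Ogre's implementation: SamplerDesc, SamplerCache, acquire and release are hypothetical names, and a sampler is reduced to its U and V addressing modes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical description of a sampler: two addressing modes only.
enum AddressMode { Wrap, Mirror, Clamp };
typedef std::pair<AddressMode, AddressMode> SamplerDesc; // (U, V)

// Hypothetical cache: one shared entry per distinct description, with
// a reference count so the entry can be freed once nobody uses it.
class SamplerCache
{
public:
    // Returns the shared entry for 'desc', creating it on first use.
    const SamplerDesc* acquire(const SamplerDesc& desc)
    {
        std::map<SamplerDesc, std::size_t>::iterator it =
            _refs.insert(std::make_pair(desc, std::size_t(0))).first;
        ++it->second;
        return &it->first; // std::map nodes keep a stable address
    }

    // Drops one reference; the entry disappears with the last one.
    void release(const SamplerDesc* shared)
    {
        assert(shared != 0);
        std::map<SamplerDesc, std::size_t>::iterator it = _refs.find(*shared);
        assert(it != _refs.end() && it->second > 0);
        if (--it->second == 0)
            _refs.erase(it);
    }

    std::size_t liveEntries() const { return _refs.size(); }

private:
    std::map<SamplerDesc, std::size_t> _refs;
};
```

Asking twice for (Wrap, Clamp) returns the same pointer both times, so code that compares pointers can tell that no state change is needed. A block allocated by hand is never part of such a cache.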

Post by xrgo »

thanks! it works now =)

Post by Kinslore »

Hey !

Glad everything works as intended !

I updated the post with some little things I added (the more I work with it, the more I understand it and find limits in what I did :mrgreen:). If there is any problem (wrong comments or code), do not hesitate to tell me !

Xrgo, if you are thinking about doing multi pass scene rendering, I added support for multiple pass buffers, so that you won't have problems with this. You only have to check the HlmsGround files (the parts linked to the old _passBuffer variable, which is now an array).

Thanks for all the info !

Post by xrgo »

Thank you!
This code should be integrated into the repo as an example, it's very useful

Another thing to note is that you should replace this:

Code: Select all

   // Let's prepare our viewProj matrix (before the pass)
   // You can get the camera from wherever you want
   Ogre::Camera* cam = something->getCam() ;
with this:

Code: Select all

   // Let's prepare our viewProj matrix (before the pass)
   // You can get the camera from wherever you want
   Ogre::Camera* cam = sceneManager->getCameraInProgress();
This way you can render your object from any camera. I was having trouble because I was rendering my scene from different cameras (VR, cubemap, mirrors) and it looked wrong for every camera except "something->getCam()", so "sceneManager->getCameraInProgress()" solved this issue =)

Post by Kinslore »

Hello !

I updated the line, thanks !
As for integrating in the repos, I don't know how to do this :?
But if it can be useful to other people, that would be great !

Post by xrgo »

A little related question.
I am working with hlms and shader templates, and I have this quick question... Performance-wise, which is better?
A) lots of @property( condition ), so it will generate many different shaders
B) lots of uniforms in struct Material, and lots of if( condition ), so it will generate few shaders with lots of ifs

Thanks in advance!!!

Post by dark_sylinc »

'A' will generate the optimum shader. However note that the first time the variation needs to be used it needs to be compiled, which can be seen as a minor (or major) FPS spike. This problem can be fixed by doing a warm up before starting to render (e.g. render all variations during a "Loading" screen) and/or using the microcode cache.

'B' trades that little annoyance for slower runtime performance inside the shaders (and easier maintenance of code).

In very rare cases, 'B' can outperform 'A' if the number of permutations is insane, which means lots of state switching for 'A'; and only as long as the if( condition ) from 'B' are non-divergent. But this is normally not the case, as the number of permutations isn't usually that high.

Think of 'A' as inlining in C/C++. The inline will produce the optimum code for each specific case. But if you have lots and lots of inlines, you can blow the icache.
Similarly in shaders, too many specializations needed at the same time will slow you down (but this usually isn't the case)
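The inlining analogy can be made concrete in C++. This is only an illustration of the trade-off, not shader code: a template parameter plays the role of an @property (one compiled variant per value), and a runtime flag plays the role of a uniform tested with if( condition ). The names shadeSpecialized, shadeBranching and useFog are hypothetical.

```cpp
#include <cassert>

// Option 'A': the condition is a compile-time parameter. One function
// is generated per value, and the compiler can drop the dead branch
// in each of them.
template <bool UseFog>
inline float shadeSpecialized(float colour, float fog)
{
    if (UseFog)
        return colour * (1.0f - fog);
    return colour;
}

// Option 'B': a single function. The condition is evaluated at runtime
// on every call, like an if on a uniform inside one shader.
inline float shadeBranching(float colour, float fog, bool useFog)
{
    assert(fog >= 0.0f && fog <= 1.0f);
    if (useFog)
        return colour * (1.0f - fog);
    return colour;
}
```

Both versions return the same values; what changes is how many variants exist (two for 'A', one for 'B') and whether the test is paid on every call.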

Post by xrgo »

Thank you so much for the detailed response
hyyou
Gremlin
Posts: 173
Joined: Wed Feb 03, 2016 2:24 am
x 17

Re: [Solved][2.1] Trying to create a new HLMS

Post by hyyou »

Kinslore, it is a great starting point for a dummy like me, thanks!
I can't wait for your next episode of this epic tutorial series! (many lights?, XD)

There are a few (very minor) issues in the source code:
In HlmsGround.cpp, in the destructor,

Code: Select all

_vaoManager->destroyConstBuffer(_passBuffers) ;
has to be changed to

Code: Select all

for(size_t n=0;n<_passBuffers.size();n++){
	_vaoManager->destroyConstBuffer(_passBuffers[n]) ;
}
-- The following notes may be useful for some readers --
According to the "Ogre 2.1 Porting Manual":
  • Both .glsl files in Kinslore's tutorial have to be named exactly VertexShader_vs.glsl and PixelShader_ps.glsl
    (I thought that just the suffix _vs.glsl / _ps.glsl would be enough, but the full exact filename is required),
    AND they have to be in the_DIRECTORY given to:

Code: Select all

Ogre::ArchiveManager::getSingletonPtr()->load( the_DIRECTORY )
  • In each timestep, the main functions are called in this order:

Code: Select all

preparePassHash -> fillBuffersFor (once per item) -> preCommandBufferExecution -> frameEnded
    That is why preparePassHash should do the work that is shared by all items (e.g. calculating viewProj),
    while preCommandBufferExecution should do the clean-up.
  • The transpose is necessary because Ogre::Matrix4 [0][0 to 3] is the first row, while OpenGL's matrix[0 to 3] is the first column.
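The call order can be mimicked with a tiny mock, which may help to see which work belongs where. Nothing here is Ogre API: FrameLog and renderOneFrame are hypothetical, and the sketch assumes a single pass per frame.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical recorder: each callback only logs its own name.
struct FrameLog
{
    std::vector<std::string> calls;

    void preparePassHash()           { calls.push_back("preparePassHash"); }
    void fillBuffersFor()            { calls.push_back("fillBuffersFor"); }
    void preCommandBufferExecution() { calls.push_back("preCommandBufferExecution"); }
    void frameEnded()                { calls.push_back("frameEnded"); }
};

// Replays the order described above for one pass with 'itemCount' items.
// Expects an empty log, so that the recorded order is for this frame only.
inline void renderOneFrame(FrameLog& log, std::size_t itemCount)
{
    assert(log.calls.empty());
    log.preparePassHash();               // once: data shared by all items
    for (std::size_t i = 0; i < itemCount; ++i)
        log.fillBuffersFor();            // once per item: per-instance data
    log.preCommandBufferExecution();     // once: clean-up before execution
    log.frameEnded();                    // once: reset the buffer indices
}
```

With three items the log holds six entries: one preparePassHash, three fillBuffersFor, then preCommandBufferExecution and frameEnded.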

Post by Kinslore »

Hello !

I am happy to see this is useful for everyone ! I corrected what you pointed out, feel free to tell me if there are other things to change. Like I said I wanted it perfect and... It is not ! :?

Also, sorry for the late answer. I am currently working on other projects and have been less active with Ogre, even if I follow the news when I can !
It would be a little hard for me to go further right now, but who knows, maybe one day :)

Thanks again for the tip :P