[2.1+] custom renderable HLMS properties

Post by **al2950** » Wed Apr 24, 2019 9:04 pm

OK, this is something that has come up a few times, but I keep ending up creating an entire custom HLMS for a minor difference.

The issue is that I want some custom or additional functionality per renderable; that is, I want to insert or manipulate shader code pieces based on HLMS property values.

Now, this is a performance-sensitive area: we do NOT want the average user trying to set different properties for many different renderables, and we do NOT want any virtual functions that are called on any renderable.

But as an advanced user, it would be great if we could set a number of HLMS properties for specific renderables instead of having to create an entirely new HLMS system. So what I propose is a simple protected map, or even a list, inside Ogre::Renderable that can be manipulated if you are creating your own renderable type; calculateHashForPreCreate then just copies that map/list into mSetProperties. Alternatively, the list/map could be private and only populated via the constructor.

This way there are no performance issues (well, it allows for more shader permutations, but someone creating their own Renderable should be well aware of this), and advanced users get custom functionality per renderable without having to create a new HLMS type.
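To make the proposal concrete, here is a minimal sketch of the idea using mock classes. Nothing here is real Ogre code: MockRenderable, MockHlms, DecalRenderable, getExtraHlmsProperties and the property name "my_decal_mode" are all made up for illustration, and the std::map used for mSetProperties is only a stand-in for whatever container the engine actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Mock stand-in for Ogre::Renderable (hypothetical, NOT the real class).
class MockRenderable
{
public:
    typedef std::vector<std::pair<std::string, int> > PropertyList;

    // Plain, non-virtual accessor: reading the list costs no virtual call.
    const PropertyList &getExtraHlmsProperties() const { return mExtraHlmsProperties; }

protected:
    // Only derived renderable types can populate this list.
    PropertyList mExtraHlmsProperties;
};

// Hypothetical custom renderable that requests one extra property.
class DecalRenderable : public MockRenderable
{
public:
    DecalRenderable()
    {
        mExtraHlmsProperties.push_back( std::make_pair( std::string( "my_decal_mode" ), 2 ) );
    }
};

// Mock stand-in for the Hlms side (hypothetical, NOT the real class).
class MockHlms
{
public:
    void calculateHashForPreCreate( const MockRenderable &renderable )
    {
        // Copy the per-renderable properties into the set of properties
        // that drive shader generation.
        const MockRenderable::PropertyList &extra = renderable.getExtraHlmsProperties();
        for( size_t i = 0; i < extra.size(); ++i )
        {
            assert( !extra[i].first.empty() && "property names must not be empty" );
            mSetProperties[extra[i].first] = extra[i].second;
        }
    }

    // Returns defaultVal when the property was never set.
    int getProperty( const std::string &key, int defaultVal = 0 ) const
    {
        std::map<std::string, int>::const_iterator itor = mSetProperties.find( key );
        return itor != mSetProperties.end() ? itor->second : defaultVal;
    }

private:
    std::map<std::string, int> mSetProperties;
};
```

The point of the sketch is only that the renderable exposes plain data and the Hlms copies it, so nothing virtual is called on the renderable itself.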

Thoughts please...

Should I just create a PR that shows what I am trying to explain?

Post by **dark_sylinc** » Thu Apr 25, 2019 12:20 am

Create a custom Hlms that derives from HlmsPbs, overrides HlmsPbs::calculateHashForPreCreate and performs the following:

Code: Select all

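class MyHlmsPbs : public HlmsPbs
{
    // Constructor omitted for brevity; it needs to forward its arguments to HlmsPbs.

    void calculateHashForPreCreate( Renderable *renderable, PiecesMap *inOutPieces )
    {
        // Let HlmsPbs set its usual properties first.
        HlmsPbs::calculateHashForPreCreate( renderable, inOutPieces );

        // magic1234 is a placeholder for whatever custom parameter key you choose.
        const Renderable::CustomParameterMap &paramMap = renderable->getCustomParameters();
        Renderable::CustomParameterMap::const_iterator itor = paramMap.find( magic1234 );
        if( itor != paramMap.end() )
        {
            // Expose the x component of the custom parameter as an Hlms property.
            setProperty( "BlaBlah", (int)itor->second.x );
        }
    }
};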

No need to get fancy. That should be enough. Remember you may want to override calculateHashForPreCaster too.
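As a rough illustration of overriding both hooks, the sketch below routes the create and caster paths through one shared helper. It uses mock types so it compiles on its own: MockHlmsPbs, MockRenderable, Vec4, buildCreateHash, buildCasterHash, applyCustomProperty and kMyCustomParamKey are all hypothetical names, and the hook signatures are simplified (the real ones also take a PiecesMap pointer).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Mock stand-ins for illustration only; NOT the real Ogre types.
struct Vec4 { float x, y, z, w; };

struct MockRenderable
{
    std::map<size_t, Vec4> customParameters;
};

class MockHlmsPbs
{
public:
    virtual ~MockHlmsPbs() {}

    // Stand-ins for the engine invoking the hooks.
    void buildCreateHash( const MockRenderable &r ) { mSetProperties.clear(); calculateHashForPreCreate( r ); }
    void buildCasterHash( const MockRenderable &r ) { mSetProperties.clear(); calculateHashForPreCaster( r ); }

    // Returns 0 when the property was never set.
    int getProperty( const std::string &key ) const
    {
        std::map<std::string, int>::const_iterator itor = mSetProperties.find( key );
        return itor != mSetProperties.end() ? itor->second : 0;
    }

protected:
    virtual void calculateHashForPreCreate( const MockRenderable & ) { setProperty( "base_create", 1 ); }
    virtual void calculateHashForPreCaster( const MockRenderable & ) { setProperty( "base_caster", 1 ); }

    void setProperty( const std::string &key, int value )
    {
        assert( !key.empty() && "property names must not be empty" );
        mSetProperties[key] = value;
    }

private:
    std::map<std::string, int> mSetProperties;
};

// Hypothetical key; plays the role of magic1234 in the snippet above.
const size_t kMyCustomParamKey = 1234;

class MyHlmsPbs : public MockHlmsPbs
{
protected:
    virtual void calculateHashForPreCreate( const MockRenderable &r )
    {
        MockHlmsPbs::calculateHashForPreCreate( r );
        applyCustomProperty( r );
    }

    virtual void calculateHashForPreCaster( const MockRenderable &r )
    {
        MockHlmsPbs::calculateHashForPreCaster( r );
        applyCustomProperty( r );
    }

private:
    // Shared by both hooks so the two passes see the same property.
    void applyCustomProperty( const MockRenderable &r )
    {
        std::map<size_t, Vec4>::const_iterator itor = r.customParameters.find( kMyCustomParamKey );
        if( itor != r.customParameters.end() )
            setProperty( "BlaBlah", static_cast<int>( itor->second.x ) );
    }
};
```

Renderables that never had the custom parameter set are left with the base behaviour, because the lookup simply finds nothing.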

Cheers
Matias

Post by **al2950** » Sun Apr 28, 2019 1:24 pm

It's irritatingly simple when you put it like that, although I might try to save face and say it's not a super clean solution and you still need to create a new HLMS.

Post by **hyyou** » Tue May 14, 2019 3:02 am

Code: Select all

void calculateHashForPreCreate( Renderable *renderable, PiecesMap *inOutPieces )

In dark_sylinc's snippet, the virtual function is called per renderable, right?

However, the cost of the virtual call here is still very small, because there is one unique virtual function per Hlms (not per renderable).
Thus branch prediction is mostly correct and cache misses are mostly avoided. Correct?

I just want to check my understanding.

Post by **dark_sylinc** » Tue May 14, 2019 3:19 am

There are several things mixed in that question that I want to clarify.

Yes, that's correct: calculateHashForPreCreate gets called per renderable, and belongs to the vtable of the Hlms class.

However, it is "cheap" because it only gets called when associating a material with a Renderable, which should ideally happen once during the lifetime of a Renderable (i.e. when creating it), or at a low frequency.
If you change a lot of materials on a lot of Renderables every frame, it's going to be expensive (and not only because of this call: we do a lot more work when associating materials).

Now as to the vtable and cache misses evaluation:

The vtable is indeed per Hlms, so almost all Renderables will end up calling the same function override. The vtable entry should be "hot" in the cache and thus "cheap", which makes it much cheaper than calling a virtual function that varies a lot per Renderable, blowing the caches and the branch predictor.
However, virtual calls are more expensive than regular calls. This is especially true for CPUs with no or poor speculative execution and branch prediction, such as Atom CPUs, the previous generation of consoles (PS3, Xbox 360), the current generation of consoles (AMD Jaguar does have speculative execution and branch prediction, but it is a slow CPU overall), and several mobile phones (it depends on brand, model and quality; generally speaking, it boils down to how expensive and new the phone is).

However, like I said, none of this matters too much because calculateHashForPreCreate is not meant to be called every frame for a lot of objects (and if you do, there will be other, much more expensive routines to worry about as well). It should only be called when changing/assigning materials, or whenever HlmsDatablock::flushRenderable gets called.
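The call pattern described above can be sketched with a small mock that counts how often the hook runs. None of this is Ogre code: MockHlms, MockRenderable, assignMaterial, renderFrame and the placeholder hash formula are assumptions made purely to illustrate that the virtual hook runs at material assignment while the per-frame path only reads a cached value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mock stand-ins for illustration only; NOT the real Ogre types.
struct MockRenderable
{
    unsigned hash;  // cached result of the last material assignment
    MockRenderable() : hash( 0 ) {}
};

class MockHlms
{
public:
    MockHlms() : mPreCreateCalls( 0 ) {}
    virtual ~MockHlms() {}

    // Stands in for assigning a material/datablock to a renderable.
    // This is the only place the virtual hook is invoked.
    void assignMaterial( MockRenderable &r, unsigned materialId )
    {
        calculateHashForPreCreate( r );
        r.hash = materialId * 31u + 7u;  // placeholder hash, not a real formula
    }

    // Stands in for the per-frame path: it only reads the cached hash.
    unsigned renderFrame( const std::vector<MockRenderable> &queue ) const
    {
        unsigned sum = 0;
        for( size_t i = 0; i < queue.size(); ++i )
        {
            assert( queue[i].hash != 0 && "assign a material before rendering" );
            sum += queue[i].hash;
        }
        return sum;
    }

    size_t getPreCreateCalls() const { return mPreCreateCalls; }

protected:
    virtual void calculateHashForPreCreate( MockRenderable & ) { ++mPreCreateCalls; }

private:
    size_t mPreCreateCalls;
};
```

With three renderables rendered for a hundred frames, the counter in this mock stays at three until a material is reassigned, which is why the cost of the virtual call is negligible in practice.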

Cheers

Ogre Forums
