[2.1+] custom renderable HLMS properties
-
- OGRE Expert User
- Posts: 1227
- Joined: Thu Dec 11, 2008 7:56 pm
- Location: Bristol, UK
- x 157
[2.1+] custom renderable HLMS properties
OK, this is something that has come up a few times, but I keep ending up creating an entire custom HLMS for a minor difference.
The issue is that I want to be able to have some custom or additional functionality per renderable; that is, to insert or manipulate shader code pieces based on HLMS property values.
Now, this is a performance-sensitive area: we do NOT want the average user trying to set different properties for many different renderables, and we do NOT want any virtual functions that are called on any renderable.
But as an advanced user, it would be great if we could set a number of HLMS properties for specific renderables instead of having to create an entirely new HLMS implementation. So what I propose is a simple protected map, or even a list, inside Ogre::Renderable that you can manipulate if you are creating your own renderable type; calculateHashForPreCreate then just copies that map/list into mSetProperties. Alternatively, the list/map could be private and only populated via the constructor.
This way there are no performance issues (well, it allows for more shader permutations, but someone creating their own Renderable should be well aware of this), yet advanced users get custom functionality per renderable without having to create a new HLMS type.
Thoughts, please...
Should I just create a PR that shows what I am trying to explain?
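To make the proposal above concrete, here is a minimal sketch of the idea. It is not Ogre code: SketchRenderable, DecalRenderable, SketchHlms, getExtraHlmsProperties and the property names are all hypothetical stand-ins, and the real mSetProperties is not a std::map. The only point is the data flow: a derived renderable fills a protected list, and the Hlms copies it with a plain loop, without calling any virtual function on the renderable.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in types; none of these are the real Ogre classes.
typedef std::vector<std::pair<std::string, int> > PropertyList;

class SketchRenderable
{
public:
    const PropertyList &getExtraHlmsProperties() const { return mExtraHlmsProperties; }

protected:
    // Only code that derives from the renderable can fill this list.
    PropertyList mExtraHlmsProperties;
};

// A custom renderable type that requests one extra property.
class DecalRenderable : public SketchRenderable
{
public:
    DecalRenderable()
    {
        mExtraHlmsProperties.push_back( std::make_pair( std::string( "my_decal_mode" ), 2 ) );
    }
};

class SketchHlms
{
public:
    // Stand-in for calculateHashForPreCreate: sets the built-in properties,
    // then copies the renderable's extra ones. No virtual call on the renderable.
    void calculateHashForPreCreate( const SketchRenderable &renderable )
    {
        mSetProperties.clear();
        mSetProperties["builtin_example"] = 1;  // placeholder for the Hlms' own properties

        const PropertyList &extra = renderable.getExtraHlmsProperties();
        for( PropertyList::const_iterator it = extra.begin(); it != extra.end(); ++it )
            mSetProperties[it->first] = it->second;
    }

    // Returns 0 when the property was never set.
    int getProperty( const std::string &key ) const
    {
        std::map<std::string, int>::const_iterator it = mSetProperties.find( key );
        return it != mSetProperties.end() ? it->second : 0;
    }

private:
    std::map<std::string, int> mSetProperties;
};
```

In this sketch a plain SketchRenderable leaves the list empty, so it only pays for iterating an empty vector.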
- dark_sylinc
- OGRE Team Member
- Posts: 5299
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1280
Re: [2.1+] custom renderable HLMS properties
Create a custom Hlms that derives from HlmsPbs, overrides HlmsPbs::calculateHashForPreCreate and performs the following:
Code:
class MyHlmsPbs : public HlmsPbs
{
    // Constructor forwarding to HlmsPbs omitted for brevity.
    void calculateHashForPreCreate( Renderable *renderable, PiecesMap *inOutPieces )
    {
        // Let HlmsPbs set its own properties first.
        HlmsPbs::calculateHashForPreCreate( renderable, inOutPieces );
        // magic1234 is a placeholder for whatever custom parameter key you choose.
        const Renderable::CustomParameterMap &paramMap = renderable->getCustomParameters();
        Renderable::CustomParameterMap::const_iterator itor = paramMap.find( magic1234 );
        if( itor != paramMap.end() )
        {
            // Expose the parameter's x component as an Hlms property.
            setProperty( "BlaBlah", (int)itor->second.x );
        }
    }
};
No need to get fancy. That should be enough. Remember you may want to override calculateHashForPreCaster too.
Cheers
Matias
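For readers who want to see the whole round trip without an Ogre build, the following is a self-contained sketch of the same pattern using stand-in types. Vec4, SketchRenderable, SketchHlmsBase, MySketchHlms and kBlaBlahKey are assumptions made up for illustration; only the shape of the flow (tag the renderable with a custom parameter, read it in the override, turn it into a property) mirrors the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in for a 4-component vector; not Ogre::Vector4.
struct Vec4
{
    float x, y, z, w;
};

// Stand-in renderable that only stores custom parameters.
class SketchRenderable
{
public:
    typedef std::map<std::size_t, Vec4> CustomParameterMap;

    void setCustomParameter( std::size_t index, const Vec4 &value ) { mCustomParameters[index] = value; }
    const CustomParameterMap &getCustomParameters() const { return mCustomParameters; }

private:
    CustomParameterMap mCustomParameters;
};

// Stand-in for the stock Hlms implementation being derived from.
class SketchHlmsBase
{
public:
    virtual ~SketchHlmsBase() {}

    virtual void calculateHashForPreCreate( const SketchRenderable & ) { mSetProperties.clear(); }

    // Returns 0 when the property was never set.
    int getProperty( const std::string &key ) const
    {
        std::map<std::string, int>::const_iterator it = mSetProperties.find( key );
        return it != mSetProperties.end() ? it->second : 0;
    }

protected:
    void setProperty( const std::string &key, int value ) { mSetProperties[key] = value; }

    std::map<std::string, int> mSetProperties;
};

// Hypothetical key shared by the code that tags renderables and the Hlms.
const std::size_t kBlaBlahKey = 1234;

class MySketchHlms : public SketchHlmsBase
{
public:
    virtual void calculateHashForPreCreate( const SketchRenderable &renderable )
    {
        SketchHlmsBase::calculateHashForPreCreate( renderable );

        const SketchRenderable::CustomParameterMap &paramMap = renderable.getCustomParameters();
        SketchRenderable::CustomParameterMap::const_iterator itor = paramMap.find( kBlaBlahKey );
        if( itor != paramMap.end() )
            setProperty( "BlaBlah", static_cast<int>( itor->second.x ) );
    }
};
```

A renderable that never receives the parameter leaves "BlaBlah" unset, so in this sketch it would keep using the unmodified shader variant.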
-
- OGRE Expert User
- Posts: 1227
- Joined: Thu Dec 11, 2008 7:56 pm
- Location: Bristol, UK
- x 157
Re: [2.1+] custom renderable HLMS properties
It's irritatingly simple when you put it like that, although I might try to save face and say it's not a super clean solution and you still need to create a new HLMS.
-
- Gremlin
- Posts: 173
- Joined: Wed Feb 03, 2016 2:24 am
- x 17
Re: [2.1+] custom renderable HLMS properties
Code:
void calculateHashForPreCreate( Renderable *renderable, PiecesMap *inOutPieces )
However, the cost of the virtual function here is still very small, because there is one unique virtual function per Hlms (not per renderable).
Thus branch prediction is mostly correct and cache misses are largely avoided. Correct?
I just want to check my understanding.
- dark_sylinc
- OGRE Team Member
- Posts: 5299
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1280
Re: [2.1+] custom renderable HLMS properties
There are several things mixed in that question that I want to clarify.
Yes, that's correct: calculateHashForPreCreate gets called per renderable, and belongs to the vtable of the Hlms class.
However, it is "cheap" because it only gets called when associating a material with a Renderable, which should ideally happen once during the lifetime of a Renderable (i.e. when creating it), or at a low frequency.
If you change a lot of materials on a lot of Renderables every frame, it's going to be expensive (by the way, that is also because we do a lot more when associating materials).
Now, as to the vtable and cache miss evaluation:
- The vtable is indeed per Hlms, thus almost all Renderables will end up calling the same function override. The vtable entry should be "hot" in the cache and thus "cheap", which makes it much cheaper than calling a virtual function that would vary a lot per Renderable, thus blowing the caches and the branch predictor.
- However, virtual calls are more expensive than regular calls. This is especially true for CPUs with no or poor speculative execution and branch prediction, such as Atom CPUs, the previous gen of consoles (PS3, Xbox 360), the current gen of consoles (AMD Jaguar does have speculative execution and branch prediction, but it is overall a slow CPU), and several mobile phones (it depends on brand, model and quality; generally speaking it boils down to how expensive and new the phone is).
Cheers
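The "only called when associating a material" point can be illustrated with a small sketch. Everything here is a hypothetical stand-in (SketchHlms, SketchRenderable, setMaterial, renderFrame and the toy hash are not Ogre code): the virtual hash calculation runs once in setMaterial, its result is cached in the renderable, and the per-frame loop only reads the cached value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in Hlms: the virtual call lives here, one vtable per Hlms type.
class SketchHlms
{
public:
    SketchHlms() : mHashCalls( 0 ) {}
    virtual ~SketchHlms() {}

    // Pretend this is the costly step (property setup, shader lookup...).
    virtual unsigned calculateHash( int materialId )
    {
        ++mHashCalls;
        return 0x9E3779B9u * static_cast<unsigned>( materialId + 1 );
    }

    int mHashCalls;  // counts how often the virtual function actually ran
};

class SketchRenderable
{
public:
    SketchRenderable() : mCachedHash( 0 ) {}

    // Material assignment: the only place that triggers the virtual call.
    void setMaterial( SketchHlms &hlms, int materialId ) { mCachedHash = hlms.calculateHash( materialId ); }

    // Hot path: plain member read, no virtual call.
    unsigned getHash() const { return mCachedHash; }

private:
    unsigned mCachedHash;
};

// Per-frame work only touches the cached hashes.
unsigned renderFrame( const std::vector<SketchRenderable> &scene )
{
    unsigned accumulated = 0;
    for( std::size_t i = 0; i < scene.size(); ++i )
        accumulated ^= scene[i].getHash();
    return accumulated;
}
```

With this layout, rendering any number of frames leaves mHashCalls equal to the number of setMaterial calls, which is why changing many materials every frame is the case that gets expensive.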