HLMS vs RTSS

Post by **paroj** » Fri Dec 22, 2017 6:02 pm

I am currently looking into hooking up PBS in the 1.10 branch.
Well technically it is already there via the HLMS backport by wolfmanfx - just not connected to the material scripts.

However I am wondering whether I should rather bring it to the RTSS instead.

To me the RTSS is a strict superset of HLMS in the sense that it can do everything HLMS can and more.
At its core the HLMS shader templates are just another variation of an uber-shader with all of its drawbacks like the
#ifdef explosion accompanied by the hard to maintain paths through the shader. Bad encapsulation basically.

The RTSS on the other hand shines in this regard as it automatically handles the connection of the library functions and one
only looks at one function at a time.

Furthermore it is more flexible by providing custom shader stages - similarly to the custom render queues.
Certainly the current RTSS implementation is rather arcane and badly documented, but this can be solved
if it is the better approach on the conceptual level.

I guess there are some users here that have used both and thus can provide some first hand experience.
And this is what I am looking for.

Post by **dark_sylinc** » Sat Dec 23, 2017 3:52 am

Do you want to start a war? You crazy guy!

paroj wrote: ↑Fri Dec 22, 2017 6:02 pm To me the RTSS is a strict superset of HLMS in the sense that it can do everything HLMS can and more.

I could say the opposite and we'd scream at each other the same ad infinitum.

paroj wrote: ↑Fri Dec 22, 2017 6:02 pm At its core the HLMS shader templates are just another variation of an uber-shader with all of its drawbacks like the
#ifdef explosion accompanied by the hard to maintain paths through the shader. Bad encapsulation basically.

That is really left to the maintainer (the one writing the templates). So you could say I am the one to blame for the poor encapsulation, but not the technology itself.
In that sense Hlms is very much like C++: very flexible, can be very fast, and a billion ways of doing the same thing. And when you shoot yourself in the foot, you blow your entire leg off.

TBH at the time I wrote the Hlms, there were three reasons I didn't properly encapsulate things and implement them as functions like the RTSS does (which is indeed cleaner):

For functions, I did not trust GL drivers in particular to optimize them, so I ended up going straight for code chunk insertion. The parser wasn't complex, so it couldn't handle functions itself (i.e. implement function call parsing inside the Hlms, rather than leaving it to the shader compiler). Android mobile in particular was (and still is) the main concern (ironically, it ended up as the platform with the poorest support). The @foreach statement was literally born out of necessity: an Adreno driver bug forced hardcoding and manually unrolling a regular for loop.
I did not trust GL shader compilers to handle macros correctly either. At that time I did not know we actually ran our own GLSL macro preprocessor (literally because the GL shader compilers had proven untrustworthy in that area!) rather than leaving it to the driver. Had I known that, I'd have made more use of macro-like functions. In fact I'm slowly porting the current code to this style, which results in cleaner, more modular and more understandable code.
Absolutely no C++ integration (more on this later).

paroj wrote: ↑Fri Dec 22, 2017 6:02 pm Certainly the current RTSS implementation is rather arcane and badly documented, but this can be solved

That is another of the reasons I didn't bother with the RTSS and preferred to start anew.

About C++ integration:
The Hlms does more than just gluing shader code together.
Because there's a C++ part, each implementation can upload const and texture buffers as efficiently and flexibly as possible. The RTSS inherited the "shader parameter" paradigm of old Ogre, which doesn't map well to modern APIs and gets in the way of high performance.
Another important part of the Hlms is handling the PSO (Pipeline State Object). A PSO is one big blob of everything:

Shaders to use (vertex, pixel, geometry, hull, domain shaders)
Vertex format (position, normals, uvs, etc)
Raster state (aka 2.1's macroblocks), such as:
- Depth writes
- Depth comparison function
- Alpha to coverage setting
- Culling mode
- Polygon fill mode
- Scissor setup
- etc
Blending state (alpha blending setup per color attachment, aka 2.1's blendblocks)
Information about the render target:
- Pixel format of each color target
- Pixel format of depth and stencil buffers
- MSAA setting

As a result there are 3 different moments we can distinguish:

Material bind time. At that point we know information about the model (such as normals, UVs), and the intrinsic material information (such as whether there's a normal map, the blending modes).
Pass time. We do not know about an individual object (unless the one writing the Hlms implementation can make certain assumptions, e.g. terrain rendering), but we do know about global stuff: MSAA settings, depth setting overrides, whether shadow mapping is supported, etc. We can also upload pass-invariant data such as view and projection matrices to const buffers.
Draw time. That's where the material and pass are merged together and we know all the information to properly generate a shader and the PSO.

In terms of the Hlms, that's literally two hashes that are merged to form a final hash; and that is used to retrieve the PSO from a cache, and generate it if it's not there.
This algorithm solves the PSO problem, allows for efficient caching and decouples information evaluation. It's not without its problems, though: we no longer have a way to predict the parameter combinations / permutations, which makes shader precaching a lot harder.
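
A minimal sketch of that lookup, not the actual Ogre code: the names `Pso`, `PsoCache` and `mergeHashes` are made up for illustration, and concatenating two 32-bit hashes into one 64-bit key is just one possible way of merging them. The builder callback stands in for the expensive shader generation and PSO creation step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a compiled pipeline state object.
struct Pso {
    std::uint32_t materialHash;  // what was known at material bind time
    std::uint32_t passHash;      // what was known at pass time
};

// Merge the two hashes into the final key (assumption: simple concatenation).
inline std::uint64_t mergeHashes(std::uint32_t materialHash, std::uint32_t passHash) {
    return (static_cast<std::uint64_t>(materialHash) << 32) | passHash;
}

class PsoCache {
public:
    using Builder = std::function<Pso(std::uint32_t, std::uint32_t)>;

    explicit PsoCache(Builder builder) : mBuilder(std::move(builder)) {
        assert(mBuilder && "a builder callback is required");
    }

    // Draw time: retrieve the PSO for this material/pass pair,
    // generating it only if it is not in the cache yet.
    const Pso& getOrCreate(std::uint32_t materialHash, std::uint32_t passHash) {
        const std::uint64_t key = mergeHashes(materialHash, passHash);
        auto it = mCache.find(key);
        if (it == mCache.end()) {
            it = mCache.emplace(key, mBuilder(materialHash, passHash)).first;
            ++mBuildCount;
        }
        return it->second;
    }

    std::size_t buildCount() const { return mBuildCount; }

private:
    Builder mBuilder;
    std::unordered_map<std::uint64_t, Pso> mCache;
    std::size_t mBuildCount = 0;
};
```

Since a key is only known once a material and a pass actually meet at draw time, the cache cannot be filled ahead of time without knowing which pairs will occur, which is the precaching difficulty mentioned above.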

The RTSS did none of that. It simply handled the GPU side of things, and surrendered itself to the old material system, evaluating everything either at draw time or at bind time. The Hlms was instead in charge (in fact, old materials are now implemented as an Hlms implementation).
And the RTSS had really poor documentation.

Comparing the Hlms to RTSS at this point is more like comparing apples to cows really. Unless you stick to the generated shader side.

Now let's see what the users say...

Post by **Hotshot5000** » Sat Dec 23, 2017 11:07 am

Personally I never used RTSS in Ogre 1.8. It seemed too complex at the time, so I wrote my own shaders. With Hlms, while not trivial, I found my way around much faster. I ported the common Hlms shaders to GLES3 and off I went. It can get a bit confusing with all those #ifdefs, but I got the hang of it much faster than I thought I would. RTSS made me not even try. I think it was the lack of documentation that made me avoid it completely and just roll my own stuff (if I remember correctly).

Just my 2 cents.

Post by **paroj** » Sat Dec 23, 2017 2:33 pm

dark_sylinc wrote: ↑Sat Dec 23, 2017 3:52 am Do you want to start a war? you crazy guy!

No, I actually want to find out to what degree the deprecation of the RTSS happened due to conceptual issues, and to what degree due to the coding style used and the lack of documentation.

To me the RTSS is very hard to read as well, but reading the code is the only option as there is no other documentation. However, once I wrapped my head around it, I found the concept quite intriguing.

I started summarizing my findings here:
https://github.com/paroj/ogre/blob/65d1 ... s_overview

Basically, it can fully auto-generate the main function, taking care of (local) variable naming. I think this would not be possible with the HLMS, even when using functions.

Furthermore, the RTSS/GLSLES implementation contains a function dependency parser that automatically copy-pastes the referenced functions (and their dependencies) into your main shader file, as if each of them was surrounded by @piece.
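
To illustrate the idea only (this is not the RTSS code, and all names here are hypothetical): assuming the dependencies of each library function have already been parsed out, pulling in a referenced function plus everything it needs is a depth-first walk that emits dependencies first.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical library entry: the function text plus the names it calls.
struct LibraryFunction {
    std::string source;
    std::vector<std::string> dependencies;
};

using FunctionLibrary = std::map<std::string, LibraryFunction>;

// Depth-first walk that records a function after all of its dependencies.
static void collect(const FunctionLibrary& lib, const std::string& name,
                    std::set<std::string>& visited, std::vector<std::string>& order) {
    if (!visited.insert(name).second) return;  // already handled
    auto it = lib.find(name);
    assert(it != lib.end() && "referenced function missing from library");
    for (const auto& dep : it->second.dependencies) collect(lib, dep, visited, order);
    order.push_back(name);
}

// Names of all functions needed by `referenced`, dependencies first.
std::vector<std::string> resolveOrder(const FunctionLibrary& lib,
                                      const std::vector<std::string>& referenced) {
    std::set<std::string> visited;
    std::vector<std::string> order;
    for (const auto& name : referenced) collect(lib, name, visited, order);
    return order;
}

// Paste the needed functions in front of the main shader body.
std::string assembleShader(const FunctionLibrary& lib,
                           const std::vector<std::string>& referenced,
                           const std::string& mainBody) {
    std::string out;
    for (const auto& name : resolveOrder(lib, referenced)) out += lib.at(name).source + "\n";
    return out + mainBody;
}
```

Functions that are never referenced do not end up in the output, so the final shader only contains what it actually uses.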

Post by **xrgo** » Sat Dec 23, 2017 10:07 pm

Never used RTSS, but I can say HLMS is very powerful, so I really doubt RTSS can do more. So let's call it a tie? xD
I do think that HLMS is hard to understand at first, and it's a little hard to maintain too... when adding new features you have to change various things to make them work without glitches. But I think it's a really good basis for making something more user/noob friendly. I imagine something where you can just add modules (mate) (via JSON or code) and automatically generate everything in the right order. I think I am capable of that task, but I am nowhere near having that much free time =(

Post by **paroj** » Sun Dec 31, 2017 2:33 am

ok, the first conclusion I drew from the replies was to improve the RTSS documentation:
https://ogrecave.github.io/ogre/api/1.1 ... s_overview

take a look at the System Overview section for an idea how the RTSS works.

Then, after seeing this:
https://github.com/OGRECave/ogre/blob/e ... sl#L23-L34

I do not think that the HLMS is "easier" than the RTSS. Both are about the same level of WTF. For comparison, here is some RTSS:
https://github.com/OGRECave/ogre/blob/c ... #L242-L269

Post by **al2950** » Wed Jan 10, 2018 9:55 pm

I have used RTSS considerably in the past and use HLMS heavily now.

A lot of the RTSS complexity came not from the concept itself but from how it was hammered into the old material system. However, what makes HLMS more powerful than RTSS, in my mind, is its pre-processor syntax. Yes, HLMS has all the other benefits mentioned, but a lot of that is down to not being tied down to the Ogre 1.x way of rendering, i.e. the old material system.

Having said that, HLMS has an issue of complexity running out of control. RTSS was nice in that it basically applied a bunch of functions in a certain order: you just had to specify the inputs, the outputs and the function. This was great, but it had a fairly severe impact on un-optimised mobile platforms, which may or may not be an issue now.
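
As a rough picture of that "functions in a certain order" model (a sketch under my own assumptions, not how the RTSS is actually written): each stage names a function, its inputs and its output, and a generator declares any missing locals and emits the calls. `Invocation`, `generateMain` and the convention that the output is passed as the last argument are all hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one stage: function, inputs, output.
struct Invocation {
    std::string function;             // library function to call
    std::vector<std::string> inputs;  // variables read by the call
    std::string output;               // variable written (passed as last argument)
    std::string outputType;           // GLSL type used if a local must be declared
};

// Emit a main() that declares the needed locals and calls the stages in order.
std::string generateMain(const std::vector<Invocation>& stages,
                         const std::set<std::string>& predeclared) {
    std::set<std::string> declared = predeclared;  // uniforms, attributes, built-ins
    std::string decls;
    std::string calls;
    for (const auto& stage : stages) {
        for (const auto& in : stage.inputs) {
            assert(declared.count(in) && "input must exist before it is read");
            (void)in;  // silence unused warning when asserts are disabled
        }
        // Declare the output as a local only if nothing provides it yet.
        if (declared.insert(stage.output).second)
            decls += "    " + stage.outputType + " " + stage.output + ";\n";
        calls += "    " + stage.function + "(";
        for (const auto& in : stage.inputs) calls += in + ", ";
        calls += stage.output + ");\n";
    }
    return "void main()\n{\n" + decls + calls + "}\n";
}
```

The person writing a stage only ever looks at their own function; the naming and wiring of intermediate variables is left to the generator.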

It would be really nice to have a similar system as a layer on top of HLMS. Take the BRDFs, for example: they are in their own functions, but they are difficult to use in other HLMS implementations because you have to make sure certain variables are declared, and filled correctly, in your own shader. If they were some sort of HLMS_FUNC that could be shared really easily, that would be great.
