Texture Streaming Refactor proposal (w/ SLIDES)

Design / architecture / roadmap discussions related to future of Ogre3D (version 2.0 and above)
al2950
OGRE Expert User
Posts: 1140
Joined: Thu Dec 11, 2008 7:56 pm
Location: Bristol, UK
x 55

Post by al2950 » Thu Sep 22, 2016 7:59 am

This is looking very promising :D

I am not so worried about streaming as I can load everything in on start up, but everything else is very high on my list of dreams!

I would also agree that we should merge the PSO branch & metal branch into 2.1 and do an official release, and then start on a 2.2. I am literally delivering a system as I type this using Ogre 2.1, and although I know it is very stable, convincing my QA department was a struggle!

dark_sylinc
OGRE Team Member
Posts: 3810
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 139

Post by dark_sylinc » Thu Sep 22, 2016 3:51 pm

I agree with merging PSO into 2.1.

My main blocker is that we're not providing any sort of utility to cache PSOs for those who need to write low level shaders (mostly porting 3rd party GUI solutions).

It isn't hard to write one, it's just time consuming. It would do something similar to what Hlms does. Basically build an HlmsPso from the needed parameters, find in a map if it's already been created; if so, return that PSO. Otherwise create one.

Post by al2950 » Fri Sep 23, 2016 1:43 am

dark_sylinc wrote: It isn't hard to write one, it's just time consuming. It would do something similar to what Hlms does. Basically build an HlmsPso from the needed parameters, find in a map if it's already been created; if so, return that PSO. Otherwise create one.
Sounds perfect for a community contribution! I have not got round to looking at PSOs in Ogre yet, so I'm not really sure how they function, but I am sure I'll look at it in the near future. But please share any other details you can, so someone here can help implement it and give you more time elsewhere.

Post by dark_sylinc » Fri Sep 23, 2016 2:29 am

It's perfect for a community contribution indeed.

PSOs basically contain everything condensed into one big object. Macroblock & Blendblock information, stencil test params, vertex input layout, even render target information (including MRT count, MSAA settings, formats of each MRT, depth buffer format).

From an engine design point of view, this means PSOs should be created at load time. From an API perspective, PSOs give the driver a lot of room for performance optimization: the driver can see everything that can and will be used (and how each piece relates to the others), perform heavy optimizations, and encapsulate the optimized state into the PSO.

However, for immediate-mode style rendering (very common in most GUIs out there), this paradigm sucks (unless the GUI has been designed with PSOs in mind).

Ogre (and therefore the Hlms) separates the PSO into two segments: PSO data that rarely changes (such as RTT formats, depth buffer format, stencil test settings), and PSO data that may change often (macroblock, blendblock, vertex layout, and shaders).
For that, Ogre has HlmsPso & HlmsPassPso (HlmsPassPso is part of HlmsPso).

My idea is that a cache class should work something like this:

Code:

PsoCache cache; // Persistent (e.g. a member variable), not a local variable

void render()
{
     // State that rarely changes: set once, before the loop
     cache.clear();
     cache.setRenderTarget( renderTarget );
     cache.setStencilSettings( stencilParams );

     foreach( object_to_render ) // Pseudocode: iterate over every object to draw
     {
          // State that may change from one object to the next
          cache.setVertexFormat( vertexElements, operationType, enablePrimitiveRestart );
          cache.setMacroblock( macroblock );
          cache.setBlendblock( blendblock );
          cache.setShaders( ... );
          HlmsPso *pso = cache.getPso(); // Reuses a cached PSO or creates a new one
          renderSystem->_setPipelineStateObject( pso );
     }
}

getPso() would check whether the cache state is dirty. If it's not, it just returns the same PSO as before. If it is, it looks for an already created HlmsPso in its cache; if none is found, it creates a new one (by calling _hlmsPipelineStateObjectCreated; when destroying the entire cache, don't forget to call _hlmsPipelineStateObjectDestroyed).

Naturally, the dev using the cache should optimize as much as possible (e.g. if the vertex format is always the same, then call cache.setVertexFormat outside the loop).
Also, if the user provides macroblocks & blendblocks by pointer created from HlmsManager, checking whether they're different is just a pointer compare, i.e. if( oldBlendblock != newBlendblock ) mDirty = true;

Overall it's simple, and it would greatly simplify porting GUI tools (Gorilla, CEGUI, etc.) to Ogre 2.1-pso.
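
The dirty-flag-plus-map lookup described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not the cache that ended up in Ogre: SketchPsoCache, PsoKey and Pso are hypothetical stand-ins, the vertex format and shaders are reduced to assumed precomputed hashes, and the rarely changing pass state (render target, stencil) is left out entirely.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical stand-ins: none of these are Ogre types.
struct Pso { int id; };

struct PsoKey
{
    std::uintptr_t macroblock = 0;       // block address, so comparing is a pointer compare
    std::uintptr_t blendblock = 0;
    std::uint32_t vertexFormatHash = 0;  // assumed: precomputed hash of the vertex layout
    std::uint32_t shaderHash = 0;        // assumed: precomputed hash of the shader set

    bool operator<( const PsoKey &o ) const
    {
        return std::tie( macroblock, blendblock, vertexFormatHash, shaderHash ) <
               std::tie( o.macroblock, o.blendblock, o.vertexFormatHash, o.shaderHash );
    }
};

class SketchPsoCache
{
public:
    void setMacroblock( const void *block )
    {
        update( mKey.macroblock, reinterpret_cast<std::uintptr_t>( block ) );
    }
    void setBlendblock( const void *block )
    {
        update( mKey.blendblock, reinterpret_cast<std::uintptr_t>( block ) );
    }
    void setVertexFormatHash( std::uint32_t hash ) { update( mKey.vertexFormatHash, hash ); }
    void setShaderHash( std::uint32_t hash ) { update( mKey.shaderHash, hash ); }

    const Pso *getPso()
    {
        if( !mDirty && mCurrent )
            return mCurrent;  // nothing changed since the last call

        auto it = mCache.find( mKey );
        if( it == mCache.end() )
        {
            // A real cache would create the API-level PSO at this point.
            it = mCache.emplace( mKey, std::make_unique<Pso>( Pso{ mNextId++ } ) ).first;
        }
        mCurrent = it->second.get();
        mDirty = false;
        return mCurrent;
    }

    std::size_t numCreated() const { return mCache.size(); }

private:
    // Only marks the cache dirty when the value actually changes.
    template <typename T>
    void update( T &field, T value )
    {
        if( field != value )
        {
            field = value;
            mDirty = true;
        }
    }

    PsoKey mKey;
    bool mDirty = true;
    const Pso *mCurrent = nullptr;
    int mNextId = 0;
    std::map<PsoKey, std::unique_ptr<Pso>> mCache;
};
```

Setting the same block twice in a row leaves the cache clean, so getPso() returns immediately; switching back to a previously seen combination costs one map lookup and creates nothing.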

Post by al2950 » Fri Oct 14, 2016 10:43 am

I follow 75% of what you have explained, but I think I should tackle updating one of the GUIs to get a better idea, maybe MyGUI. As I currently understand it, though, this PsoCache is not required to get the GUIs to work, but they would greatly benefit from it?

Post by dark_sylinc » Fri Oct 14, 2016 3:26 pm

Managing PSOs is required. What's optional would be using a PsoCache implementation we'll provide to make this management easy.

I had started writing a PSO cache (turns out writing these requirements down helped a lot), but I have to re-tune it for better performance, then commit and push.

Post by al2950 » Fri Oct 14, 2016 3:45 pm

dark_sylinc wrote: Managing PSOs is required. What's optional would be using a PsoCache implementation we'll provide to make this management easy.

I had started writing a PSO cache (turns out writing these requirements down helped a lot), but I have to re-tune it for better performance, then commit and push.
Fair enough, I'll wait for it and see if it matches what I had in mind! :P

skhoroshavin
Gnoblar
Posts: 15
Joined: Sat May 21, 2016 5:07 pm

Post by skhoroshavin » Fri Apr 14, 2017 4:10 pm

Slides look great, any info on when this refactor is going to be started?

Post by skhoroshavin » Sun Apr 16, 2017 11:37 pm

Oops, it seems it has already started: 22 hours ago, Bitbucket says :)

spookyboo
Silver Sponsor
Posts: 1139
Joined: Tue Jul 06, 2004 5:57 am
x 15

Post by spookyboo » Mon Apr 17, 2017 11:08 am

It's a 2.2 branch. Does that mean a release candidate for 2.1 is on its way? In that case, shouldn't the license text "Copyright (c) 2000-2014 Torus Knot Software Ltd" be updated (or a supplemental 2015-2017 copyright added)?

xrgo
OGRE Expert User
Posts: 913
Joined: Sat Jul 06, 2013 10:59 pm
Location: Chile
x 51

Post by xrgo » Sat Jul 29, 2017 5:20 pm

Hello! I see a lot of movement in the branch, I hope it's going great! You are an effing god, Matias!
I have a couple of requests regarding this new system:
1) Would it be possible to easily load only up to n mips into VRAM? That way I could have an option to use lower quality versions of the textures that actually use less VRAM.
2) I would also like an easy way to load a texture from a specific (relative or absolute) path, without using the resource manager.
Thanks!

Post by dark_sylinc » Sat Jul 29, 2017 8:57 pm

xrgo wrote: 1) Would it be possible to easily load only up to n mips into VRAM? That way I could have an option to use lower quality versions of the textures that actually use less VRAM.
I can't comment on this because there are a lot of factors involved that make this complex and hard. Something like this is within the goals, but right now we're too far from that.

The short story is that if you have a material with three 2048x2048 textures and they're all in the same texture array, that's all well and good. But if you change only one of those textures from 2048x2048 to 1024x1024, it's going to be in a different texture array and generate a new shader. And new shader = hiccup while compiling it.

If you downsize all 3 textures and if they end up in the same array again, then this can proceed without hiccups. But to be hiccup-free we have to guarantee that:
  • The texture you want is downsized
  • The other textures used by that material are also downsized
  • The downsized textures are put in the same arrays (to be able to use the same shader).
  • If one of these textures is also used by another material (which uses other textures), it may cause a domino effect
  • Ensuring that all of these conditions are met has its own overhead, which could outweigh just recompiling the shader.
But if memory consumption is your top priority, you may just want to ignore that and pay the price of recompiling the shader. Btw, the shader may already be in the microcode cache, in which case the price will be very small. But because the number of texture permutations could be huge, we have to make a big effort to keep it from exploding, or else the shader is likely not going to be in the cache.

That's the short version. There are more details at play.
xrgo wrote: 2) I would also like an easy way to load a texture from a specific (relative or absolute) path, without using the resource manager.
Thanks!
Yes, absolutely.

Post by xrgo » Sat Jul 29, 2017 11:48 pm

Thank you so much!

1) But what if I set that at the moment I load the texture? In other words, as far as the shader is concerned the texture was never 2048, so no hiccup. I actually do this right now with something like this: http://www.ogre3d.org/forums/viewtopic. ... 31#p518003 and you actually commented:
dark_sylinc wrote:The idea is quite clever btw. I was thinking of something similar, but I like yours better.
=D

2) Fantastique!!!

Post by dark_sylinc » Sun Jul 30, 2017 12:06 am

xrgo wrote:1) but what if I set that at the moment that I load the texture, in other words, for the shader was never 2048, so no hiccup.
If this data is known beforehand like you're suggesting, then there should be no problem at all.

Post by xrgo » Sun Jul 30, 2017 1:09 am

dark_sylinc wrote: If this data is known beforehand like you're suggesting, then there should be no problem at all.
Yes! That would be very useful. I actually set a quality setting to "Low", and then every texture is loaded with a non-zero minLod and just stays there.
Thank you!
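
To put rough numbers on the saving from never loading the top mips, here is a small standalone sketch. It assumes an uncompressed format with a fixed number of bytes per pixel and a full mip chain down to 1x1; Extent, topLevelAfterSkipping and mipChainBytes are hypothetical helpers, not Ogre API.

```cpp
#include <algorithm>
#include <cstdint>

struct Extent
{
    std::uint32_t width;
    std::uint32_t height;
};

// Resolution of the top level that remains after dropping `skipMips` levels.
inline Extent topLevelAfterSkipping( Extent full, std::uint32_t skipMips )
{
    const std::uint32_t shift = std::min<std::uint32_t>( skipMips, 31u );  // avoid oversized shifts
    return { std::max<std::uint32_t>( 1u, full.width >> shift ),
             std::max<std::uint32_t>( 1u, full.height >> shift ) };
}

// Bytes used by the remaining mip chain (down to 1x1) for an uncompressed format.
inline std::uint64_t mipChainBytes( Extent full, std::uint32_t bytesPerPixel,
                                    std::uint32_t skipMips )
{
    Extent e = topLevelAfterSkipping( full, skipMips );
    std::uint64_t total = 0;
    for( ;; )
    {
        total += std::uint64_t( e.width ) * e.height * bytesPerPixel;
        if( e.width == 1u && e.height == 1u )
            break;
        e.width = std::max<std::uint32_t>( 1u, e.width >> 1 );
        e.height = std::max<std::uint32_t>( 1u, e.height >> 1 );
    }
    return total;
}
```

For a 2048x2048 texture at 4 bytes per pixel, the full chain takes 22,369,620 bytes, while skipping just the top mip leaves 5,592,404 bytes, roughly a quarter.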

Post by dark_sylinc » Fri Sep 28, 2018 7:51 pm

After a long time, it's finally done: Yesterday I pushed a set of several commits that introduced the Texture Metadata Cache.

I opted to use JSON instead of a binary format because the metadata cache file on disk is easier to inspect that way. Not to mention the metadata can also be used to manipulate the new feature of texture pool IDs, which can be very important for some engines that take advantage of it.
Pool IDs are basically a way to ensure textures with the same pool ID get grouped together (as long as they have the same format & resolution), or rather... a way to prevent completely different textures from accidentally ending up grouped together.
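
One way to picture the grouping rule is a map keyed on pool ID, format and resolution. The sketch below is a simplified model only: TextureInfo, PoolKey and groupIntoPools are hypothetical names, formats are plain strings, and nothing here reflects how Ogre actually stores its pools.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical description of a texture, for illustration only.
struct TextureInfo
{
    std::string name;
    std::uint32_t poolId;
    std::uint32_t width;
    std::uint32_t height;
    std::string format;  // e.g. "RGBA8"
};

// Textures share a pool only if every element of this key matches.
using PoolKey = std::tuple<std::uint32_t, std::string, std::uint32_t, std::uint32_t>;

inline std::map<PoolKey, std::vector<std::string>> groupIntoPools(
    const std::vector<TextureInfo> &textures )
{
    std::map<PoolKey, std::vector<std::string>> pools;
    for( const TextureInfo &t : textures )
        pools[PoolKey( t.poolId, t.format, t.width, t.height )].push_back( t.name );
    return pools;
}
```

Two 1024x1024 RGBA8 textures with the same pool ID land in one group, while a third texture with identical format and resolution but a different pool ID is kept apart.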

The metadata cache still needs testing, but I can already notice fewer FPS hitches in OpenGL when the textures finish streaming and appear on screen. I have yet to test D3D11, though; I'd expect D3D11 to benefit much more from the metadata cache.

The code also handles the case where the metadata is out of date (or just intentionally lied...). If the cache is out of date, loading times will be higher because we have to retry loading a few things related to that texture from scratch. To keep thread safety, the cache-missed texture needs to go back to the main thread and then back again to the worker thread.
While optimizing this corner case could be possible, it would only complicate the code and design, and we have to work under the assumption that the cache will be correct 99% of the time, because it's rare to modify the width/height/pixel format/texture type of a texture, even during development. And when that happens, the performance hit is definitely acceptable (it's a small 'hitch').
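
The decision being described (trust the cached metadata, and fall back to a slower retry when it turns out to be wrong) can be modelled roughly as below. This is a single-threaded sketch under assumptions: MetadataCacheSketch, TextureMetadata and LoadPath are hypothetical names, the main-thread/worker-thread round trip is reduced to a return value, and refreshing the stale entry on a mismatch is an assumption rather than something stated in the post.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// The fields the post lists as rarely changing: width/height/pixel format/texture type.
struct TextureMetadata
{
    std::uint32_t width;
    std::uint32_t height;
    std::string pixelFormat;
    std::string textureType;

    bool operator==( const TextureMetadata &o ) const
    {
        return std::tie( width, height, pixelFormat, textureType ) ==
               std::tie( o.width, o.height, o.pixelFormat, o.textureType );
    }
};

enum class LoadPath
{
    Fast,                  // cached metadata matched the file
    RetryAfterStaleCache,  // mismatch: the slower retry path is needed
    NoCacheEntry           // nothing cached for this texture yet
};

class MetadataCacheSketch
{
public:
    // Called once the real metadata has been read from the file.
    LoadPath onHeaderRead( const std::string &name, const TextureMetadata &actual )
    {
        auto it = mEntries.find( name );
        if( it == mEntries.end() )
        {
            mEntries.emplace( name, actual );
            return LoadPath::NoCacheEntry;
        }
        if( it->second == actual )
            return LoadPath::Fast;

        it->second = actual;  // assumption: refresh the entry so the next load is fast
        return LoadPath::RetryAfterStaleCache;
    }

private:
    std::map<std::string, TextureMetadata> mEntries;
};
```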

rujialiu
Greenskin
Posts: 138
Joined: Mon May 09, 2016 8:21 am
x 11

Post by rujialiu » Mon Oct 08, 2018 3:48 am

dark_sylinc wrote (Fri Sep 28, 2018 7:51 pm): After a long time, it's finally done: Yesterday I pushed a set of several commits that introduced the Texture Metadata Cache.
By "it's finally done" do you mean "The texture refactor is finally done"?
So when will you remove the "WIP" suffix from the branch name? 8-)

Post by dark_sylinc » Mon Oct 08, 2018 4:42 am

I've actually been thinking about that.

Before dropping the WIP label, the following is needed:
  • Test TextureGpu::scheduleTransitionTo (3x, one for each GpuPageOutStrategy option):
    • Resident -> OnSystemRam
    • Resident -> OnStorage
    • OnSystemRam -> OnStorage
    • OnSystemRam -> Resident
    I don't think any of these work too well, if at all. The only well tested path is OnStorage -> Resident, and maybe Resident -> OnStorage.
  • Implement going Resident with AlwaysKeepSystemRamCopy. Right now TextureGpu demands that the sysram copy be provided with _transitionTo when going Resident; otherwise exceptions/asserts are triggered. Now that I've had time to think, this makes little sense. The pointer must be provided before or while TextureGpu::notifyDataIsReady gets called; there is no need to require it while going Resident. Back when I started, I had the notion that a TextureGpu being Resident meant it was ready to display, which is not the same thing. Hence it asks for a memory pointer when using AlwaysKeepSystemRamCopy. This is wrong.
  • Better error handling. Right now, if there is an exception in the worker thread, the thread terminates abruptly and textures stop streaming, and the main thread will likely deadlock or livelock.
Once that's done, that's basically it... the WIP label could officially be dropped and it's 2.2. There could be rough edges to polish, but the big ones are these. And it's not that much work actually.
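
The first item in the list above amounts to a 3 x 4 test matrix, which can be enumerated mechanically. The sketch below uses local stand-in enums rather than Ogre's headers; only AlwaysKeepSystemRamCopy is named in the post, so the other two strategy names are assumptions, and actually driving TextureGpu::scheduleTransitionTo for each case is left out.

```cpp
#include <utility>
#include <vector>

// Local stand-ins for illustration; these are not the Ogre definitions.
enum class Residency { OnStorage, OnSystemRam, Resident };
// Only AlwaysKeepSystemRamCopy appears in the post; the other two names are assumed.
enum class PageOutStrategy { SaveToSystemRam, Discard, AlwaysKeepSystemRamCopy };

struct TransitionCase
{
    PageOutStrategy strategy;
    Residency from;
    Residency to;
};

// Builds every (strategy, from -> to) combination listed as needing a test.
inline std::vector<TransitionCase> buildTransitionMatrix()
{
    const PageOutStrategy strategies[] = { PageOutStrategy::SaveToSystemRam,
                                           PageOutStrategy::Discard,
                                           PageOutStrategy::AlwaysKeepSystemRamCopy };
    const std::pair<Residency, Residency> transitions[] = {
        { Residency::Resident, Residency::OnSystemRam },
        { Residency::Resident, Residency::OnStorage },
        { Residency::OnSystemRam, Residency::OnStorage },
        { Residency::OnSystemRam, Residency::Resident } };

    std::vector<TransitionCase> cases;
    for( PageOutStrategy strategy : strategies )
        for( const auto &t : transitions )
            cases.push_back( { strategy, t.first, t.second } );
    return cases;
}
```

A test harness could loop over the returned cases and check the final residency state after each transition.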