HlmsDiskCache purpose


Post by TaaTT4 » Wed Nov 28, 2018 7:02 pm

What is the goal of the HLMS disk cache? From what I've understood reading OgreHlmsDiskCache.h, the microcode cache stores the result of shader compilation (to avoid D3DCompile calls on subsequent runs). The HLMS disk cache, by contrast, stores the properties used to generate shaders, the output of the HLMS parser, and a mapping between PSOs and the shader compiler output.
Before the introduction of the HLMS disk cache, I was able to eliminate all runtime stalls using the microcode cache. Now, apart from initialization taking 5 to 7 seconds longer (caused by HlmsDiskCache::applyTo invocations), which I could live with, I can no longer avoid runtime stalls. I've tried every cache configuration (microcode cache only, HLMS cache only, and both together), but nothing helps.

I guess the stalls are related to what is written in the comments of OgreHlmsDiskCache.h:


Pipeline State Object       This is a huge amalgamation of all the information required to
       (PSO)                draw a triangle on screen. See HlmsPso for what's in it.
                            The driver will internally merge the compiled microcode and PSO info
                            and translate it into an ISA (Instruction Set Architecture) which is
                            specific to the GPU & Driver the user currently has installed;
                            and store the ISA into the PSO.
                            Under Vulkan & D3D12 this ISA can be saved to disk.
                            However for the rest of the APIs, HlmsDiskCache saves all the info
                            required to rebuild the PSO/ISA from scratch again.
                            HlmsDiskCache stores this info in HlmsDiskCache::Cache::pso
                            Depending on the API and Driver, building the PSO can be very fast
                            or take significant time.

                            Note that due to a technical issue, this information is currently
                            being saved to disk but the PSO is not rebuilt (i.e. the information
                            is not used). Because of it, certain platforms may still experience
                            some stalls at runtime, due to the driver translating the Microcode
                            to the internal ISA.
Runtime stalls are a sore point in racing games, since every time one happens you are almost sure to have an accident (with walls or other players). I can afford stalls the first time you race on a track (because the shaders have yet to be compiled), but not on subsequent runs. Any advice on this topic?
Senior game programmer at Vae Victis
Working on Racecraft

Re: HlmsDiskCache purpose

Post by dark_sylinc » Wed Nov 28, 2018 9:10 pm

First, make sure your disk cache is newer than this commit, because that was a critical bug that seriously hampered its effectiveness.

To answer your question:
What is the goal of the HLMS disk cache?
1. To speed up runtime generation of shaders, as the Hlms parser no longer needs to be run for cached shaders. With the microcode cache alone, this step is still needed. In Debug mode the impact is much bigger.
2. To reduce load time spent in Hlms::addRenderableCache when creating the Items for the first time.
3. To have a cache on platforms where the microcode cache is not supported (Android, Apple ecosystem).
4. To cache/regenerate the PSOs (TODO/pending).
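
To illustrate point 1, here is a minimal toy model of the two cache layers. It is a sketch only and does not use Ogre's API: `ToyShaderPipeline`, its members, `demoWarmStart` and the property string are all hypothetical names. It assumes each layer is a simple lookup keyed by the previous stage's output, and shows that restoring only the microcode layer skips the compiler but still runs the parser.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical toy model, NOT Ogre's API: two cache layers in front of
// two expensive steps (template parsing and shader compilation).
struct ToyShaderPipeline
{
    // Property set -> generated source (the role played by the Hlms disk cache).
    std::map<std::string, std::string> sourceCache;
    // Generated source -> compiled blob (the role played by the microcode cache).
    std::map<std::string, std::string> microcodeCache;
    std::size_t parserRuns = 0;   // number of times the "parser" actually ran
    std::size_t compilerRuns = 0; // number of times the "compiler" actually ran

    std::string generateSource( const std::string &properties )
    {
        auto it = sourceCache.find( properties );
        if( it != sourceCache.end() )
            return it->second; // hit: the parser is skipped
        ++parserRuns;
        const std::string source = "// generated from: " + properties;
        sourceCache[properties] = source;
        return source;
    }

    std::string compile( const std::string &source )
    {
        auto it = microcodeCache.find( source );
        if( it != microcodeCache.end() )
            return it->second; // hit: the compiler is skipped
        ++compilerRuns;
        const std::string blob = "blob(" + source + ")";
        microcodeCache[source] = blob;
        return blob;
    }

    std::string getShader( const std::string &properties )
    {
        return compile( generateSource( properties ) );
    }
};

// Simulates a second run where only the microcode layer was restored from disk.
inline void demoWarmStart()
{
    ToyShaderPipeline cold;
    cold.getShader( "example_property=1" ); // hypothetical property string
    assert( cold.parserRuns == 1 && cold.compilerRuns == 1 );

    ToyShaderPipeline microcodeOnly;
    microcodeOnly.microcodeCache = cold.microcodeCache;
    microcodeOnly.getShader( "example_property=1" );
    // The compiler is skipped, but the parser still had to run.
    assert( microcodeOnly.parserRuns == 1 && microcodeOnly.compilerRuns == 0 );
}
```

Restoring `sourceCache` as well brings both counters to zero, which is the situation the Hlms disk cache is meant to produce. PSO creation (point 4) is not modelled here.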
TaaTT4 wrote:
Wed Nov 28, 2018 7:02 pm
Before the introduction of the HLMS disk cache, I was able to eliminate all runtime stalls using the microcode cache. Now, apart from initialization taking 5 to 7 seconds longer (caused by HlmsDiskCache::applyTo invocations), which I could live with, I can no longer avoid runtime stalls. I've tried every cache configuration (microcode cache only, HLMS cache only, and both together), but nothing helps.
THAT makes no sense. The HlmsDiskCache, as can be seen, is external to the Hlms class.
We did make a few modifications to Hlms in order to allow more reuse of shaders, as we were pointlessly generating too many. But that should not affect the microcode cache.

So you're getting >250ms stalls again despite the caches (both microcode and hlmsdiskcache) being active, as if there were no cache active?
Does the problem also occur on a revision from before the disk cache branch was merged?

When inside Hlms::createShaderCacheEntry, does it hit or miss this:


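// Miss: no cached shader code for this property set, so it is compiled now.
if( itCodeCache == mShaderCodeCache.end() )
    compileShaderCode( codeCache );
else
{
    // Hit: reuse the shaders already stored in the cached entry.
    for( size_t i=0; i<NumShaderTypes; ++i )
        codeCache.shaders[i] = itCodeCache->shaders[i];
    codeCache.mergedCache.setProperties.swap( mSetProperties );
}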
When inside D3D11HLSLProgram::loadFromSource, does it hit or miss the isMicrocodeAvailableInCache call?
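
If stepping through with a debugger is impractical, one way to answer both questions is to count hits and misses at each check and time the surrounding call. The following is a minimal sketch under that assumption: `CacheStats` and `ScopedStallTimer` are hypothetical helpers, not part of Ogre, and you would have to add the `record()` calls next to the two checks yourself.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper, not part of Ogre: counts cache hits and misses.
struct CacheStats
{
    std::size_t hits = 0;
    std::size_t misses = 0;

    void record( bool hit )
    {
        if( hit )
            ++hits;
        else
            ++misses;
    }

    // Fraction of lookups that were hits; 0.0 if nothing was recorded yet.
    double hitRate() const
    {
        const std::size_t total = hits + misses;
        return total ? static_cast<double>( hits ) / static_cast<double>( total ) : 0.0;
    }
};

// Hypothetical helper, not part of Ogre: increments a counter when the
// enclosing scope took at least `threshold` to execute.
class ScopedStallTimer
{
public:
    ScopedStallTimer( std::chrono::milliseconds threshold, std::size_t &stallCounter ) :
        mThreshold( threshold ),
        mStallCounter( stallCounter ),
        mStart( std::chrono::steady_clock::now() )
    {
        assert( threshold.count() >= 0 );
    }

    ~ScopedStallTimer()
    {
        const auto elapsed = std::chrono::steady_clock::now() - mStart;
        if( elapsed >= mThreshold )
            ++mStallCounter;
    }

    ScopedStallTimer( const ScopedStallTimer & ) = delete;
    ScopedStallTimer &operator=( const ScopedStallTimer & ) = delete;

private:
    std::chrono::milliseconds mThreshold;
    std::size_t &mStallCounter;
    std::chrono::steady_clock::time_point mStart;
};
```

For example, a `ScopedStallTimer` with a 250 ms threshold placed around the shader creation path, combined with one `CacheStats` per cache, would show whether the long frames coincide with misses in either cache.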
