Before the introduction of the HLMS disk cache, I was able to eliminate all runtime stalls using the microcode cache. Currently, apart from initialization taking 5 to 7 seconds longer (caused by HlmsDiskCache::applyTo invocations), which would be tolerable, I can no longer avoid runtime stalls. I've tried every cache configuration (microcode cache only, HLMS cache only, and both together), but nothing helps.
I guess the stalls are related to what is written in the OgreHlmsDiskCache.h comments:
Pipeline State Object (PSO): This is a huge amalgamation of all the information required to draw a triangle on screen. See HlmsPso for what's in it.
The driver will internally merge the compiled microcode and PSO info and translate it into an ISA (Instruction Set Architecture) which is specific to the GPU & Driver the user currently has installed; and store the ISA into the PSO.
Under Vulkan & D3D12 this ISA can be saved to disk. However for the rest of the APIs, HlmsDiskCache saves all the info required to rebuild the PSO/ISA from scratch again.
HlmsDiskCache stores this info in HlmsDiskCache::Cache::pso
Depending on the API and Driver, building the PSO can be very fast or take significant time.

Note that due to a technical issue, this information is currently being saved to disk but the PSO is not rebuilt (i.e. the information is not used).
Because of it, certain platforms may still experience some stalls at runtime, due to the driver translating the Microcode to the internal ISA.
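To check that I'm reading that comment correctly, here is a toy model of the difference between rebuilding the saved PSO info at startup and leaving the build to the first draw. This is not Ogre code: PsoDesc, CompiledPso, ToyPsoCache and runtimeBuilds are all names I made up for illustration, and the "build" is just a counter standing in for the driver's Microcode-to-ISA translation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the info needed to rebuild a PSO (not Ogre's HlmsPso).
struct PsoDesc {
    std::uint32_t shaderHash;
    std::uint32_t blendMode;
};

// Hypothetical stand-in for the driver-specific result (the "ISA").
struct CompiledPso {
    std::uint64_t isa;
};

class ToyPsoCache {
public:
    // Returns the compiled PSO, building it on first use. The build is the
    // step that would stall if it happened in the middle of a frame.
    const CompiledPso& getOrBuild(const PsoDesc& desc) {
        const std::uint64_t key = makeKey(desc);
        auto it = compiled_.find(key);
        if (it == compiled_.end()) {
            ++buildCount_;  // a real driver would do the expensive translation here
            it = compiled_.emplace(key, CompiledPso{key * 31u + 7u}).first;
        }
        return it->second;
    }

    // Rebuilds every saved description up front. According to the header
    // comment, this is the step that is currently not done with the saved info.
    void prewarm(const std::vector<PsoDesc>& saved) {
        for (const PsoDesc& desc : saved) getOrBuild(desc);
    }

    std::size_t buildCount() const { return buildCount_; }

private:
    static std::uint64_t makeKey(const PsoDesc& d) {
        return (static_cast<std::uint64_t>(d.shaderHash) << 32) | d.blendMode;
    }

    std::unordered_map<std::uint64_t, CompiledPso> compiled_;
    std::size_t buildCount_ = 0;
};

// Counts how many builds happen while "rendering", i.e. after startup.
std::size_t runtimeBuilds(bool prewarmAtStartup) {
    const std::vector<PsoDesc> saved = {{0xA1u, 0u}, {0xB2u, 1u}};
    ToyPsoCache cache;
    if (prewarmAtStartup) cache.prewarm(saved);
    const std::size_t buildsAtStartup = cache.buildCount();

    // Two frames drawing the same two objects.
    for (int frame = 0; frame < 2; ++frame)
        for (const PsoDesc& desc : saved) cache.getOrBuild(desc);

    assert(cache.buildCount() >= buildsAtStartup);
    return cache.buildCount() - buildsAtStartup;
}
```

In this model, runtimeBuilds(false) reports two builds during rendering (one per distinct PSO, both on the first frame), while runtimeBuilds(true) reports none because the cost moved to startup. If the real cache behaves like the first case for the PSO part, that would match the stalls I'm seeing.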