Dealing with Load/Store Semantics

psysu
Halfling
Posts: 74
Joined: Tue Jun 01, 2021 7:47 am
x 6

Post by psysu »

Ogre Version: 3.0
Operating System: Win 10
Render System: Vulkan

Hi,

With the compositor node below on the Vulkan render system, the colour output is not stored properly (some pixels are discarded). I attached an image of the final output below. I think I'm doing something wrong with the load/store semantics, since I'm using multiple passes for Colibri GUI. It would be really helpful if someone could explain what I'm doing wrong here.

I also tried to take a RenderDoc capture of this, but the application crashes when I do.

Thanks

final image :

Image

compositor node :

Code: Select all

compositor_node TestNode
{
    in 0 rt_renderwindow
    
    texture rtt target_width target_height PFG_RGBA8_UNORM_SRGB msaa_auto
    texture rtt_depthbuffer target_width target_height PFG_D32_FLOAT_S8X24_UINT msaa_auto

    rtv rtt
    {
        depth_stencil rtt_depthbuffer
    }

    target rtt_depthbuffer
    {
        pass stencil
        {
            check true

            mask 0xFF
            read_mask 0xFF

            ref_value 1

            both
            {
                pass_op invert
                depth_fail_op invert
                fail_op invert

                comp_func always_pass
            }

            profiling_id "Stencil buffer Pass"
        }

        pass render_scene
        {
            load
            {
                all clear
                clear_colour 1.0 1.0 1.0 1.0
            }

            rq_first 10
            rq_last 11

            profiling_id "Main Object Closed Parts Queue Depth/Stencil Pass"
        }
    }

    target rtt
    {
        pass stencil
        {
            check false
        }

        pass render_scene
        {
            load
            {
                depth clear
            }

            rq_first 6
            rq_last 7

            profiling_id "Bg Pass"
        }

        pass custom colibri_gui
        {
            skip_load_store_semantics false

            identifier 123
            profiling_id "Back Colibri GUI"
            aspect_ratio_mode keep_width
        }

        pass render_scene
        {
            rq_first 7
            rq_last 12

            profiling_id "Main Object Closed & Non-Closed Queue Colour Pass"
        }

        pass stencil
        {
            check true

            mask 0xFF
            read_mask 0xFF

            ref_value 0

            both
            {
                pass_op keep
                depth_fail_op zero
                fail_op keep
                comp_func not_equal
            }
        }

        pass render_scene
        {
            rq_first 12
            rq_last 13

            profiling_id "Cap Quad Pass"
        }

        pass stencil
        {
            check false
        }

        pass render_scene
        {
            rq_first 13
            rq_last 200

            profiling_id "Other Objects Colour Pass"
        }

        pass render_scene
        {
            rq_first 200
            rq_last 225
            camera OrthoCamera

            profiling_id "Ortho2D Perspective Pass"
        }

        pass custom colibri_gui
        {
            skip_load_store_semantics false

            identifier 456
            profiling_id "Front Colibri GUI"
            aspect_ratio_mode keep_width
        }
    }

    target rt_renderwindow
    {
        pass render_quad
        {
            load
            {
                all dont_care
            }
            store
            {
                depth dont_care
                stencil dont_care
            }

            material Ogre/Copy/4xFP32
            input 0 rtt

            profiling_id "CopyBack rtt to MainWindow RenderTexture"
        }
    }
}
dark_sylinc
OGRE Team Member
Posts: 5433
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1341

Post by dark_sylinc »

Hi!

psysu wrote:

Also i tried to do a renderdoc capture of this, but the application crashes if I do that.

RenderDoc has an option "Debugger Delay" which freezes the app for N seconds to give you time to go to Visual Studio Debug -> Attach to Running Process.

I suspect the problem may be here:

Code: Select all

pass render_scene
{
    load
    {
        depth clear
    }

    rq_first 6
    rq_last 7

    profiling_id "Bg Pass"
}

Because you're not clearing colour. So if that bg pass fails to fill the whole screen (maybe it's not showing up like you think it should?), or fails to draw anything at all, it could end up glitching the screen.

Btw, unrelated to your bug: ideally all of the following would be part of a single pass (I'm not sure if what I'm about to suggest works if you use stencil passes though). You can't have a single pass because you're using multipass Colibri.
However, the next best thing is to use as few passes as possible.
OgreNext will try to do this for you automatically, but if you set it explicitly you can be certain of it:

Code: Select all

compositor_node TestNode
{
    in 0 rt_renderwindow

    texture rtt target_width target_height PFG_RGBA8_UNORM_SRGB msaa_auto
    texture rtt_depthbuffer target_width target_height PFG_D32_FLOAT_S8X24_UINT msaa_auto

    rtv rtt
    {
        depth_stencil rtt_depthbuffer
    }

    target rtt_depthbuffer
    {
        pass stencil
        {
            load
            {
                all clear
                clear_colour 1.0 1.0 1.0 1.0
            }

            check true

            mask 0xFF
            read_mask 0xFF

            ref_value 1

            both
            {
                pass_op invert
                depth_fail_op invert
                fail_op invert

                comp_func always_pass
            }

            profiling_id "Stencil buffer Pass"
        }

        pass render_scene
        {
            skip_load_store_semantics true

            rq_first 10
            rq_last 11

            profiling_id "Main Object Closed Parts Queue Depth/Stencil Pass"
        }
    }

    target rtt
    {
        pass stencil
        {
            load
            {
                depth clear
            }

            check false
        }

        pass render_scene
        {
            skip_load_store_semantics true

            rq_first 6
            rq_last 7

            profiling_id "Bg Pass"
        }

        pass custom colibri_gui
        {
            skip_load_store_semantics false // Unfortunately we must break the pass here because of multipass in Colibri

            identifier 123
            profiling_id "Back Colibri GUI"
            aspect_ratio_mode keep_width
        }

        pass render_scene
        {
            skip_load_store_semantics true // Continue using the same opened pass as the previous Colibri pass

            rq_first 7
            rq_last 12

            profiling_id "Main Object Closed & Non-Closed Queue Colour Pass"
        }

        pass stencil
        {
            skip_load_store_semantics true

            check true

            mask 0xFF
            read_mask 0xFF

            ref_value 0

            both
            {
                pass_op keep
                depth_fail_op zero
                fail_op keep
                comp_func not_equal
            }
        }

        pass render_scene
        {
            skip_load_store_semantics true

            rq_first 12
            rq_last 13

            profiling_id "Cap Quad Pass"
        }

        pass stencil
        {
            skip_load_store_semantics true

            check false
        }

        pass render_scene
        {
            skip_load_store_semantics true

            rq_first 13
            rq_last 200

            profiling_id "Other Objects Colour Pass"
        }

        pass render_scene
        {
            skip_load_store_semantics true

            rq_first 200
            rq_last 225
            camera OrthoCamera

            profiling_id "Ortho2D Perspective Pass"
        }

        pass custom colibri_gui
        {
            skip_load_store_semantics false // Unfortunately we must break the pass here because of multipass in Colibri

            identifier 456
            profiling_id "Front Colibri GUI"
            aspect_ratio_mode keep_width
        }
    }
}

But IMO your priority should be to understand why RenderDoc is crashing (you have to hook a debugger for that).

When RenderDoc crashes, it's often because of wrong API usage or wrong alignment, or because of some use-after-free that went unnoticed during a normal run, etc. Most of those errors could also explain why you're getting a wrong render.

Cheers

Post by psysu »

OK, I'll look into the things you mentioned for that compositor node.

But here's a basic compositor node which has the same issue as the previous one, except that I managed to capture a frame for this node; I attached the capture below.

Here's the compositor node I tested it on. I commented out several passes for debugging purposes, but normally I do run those passes and I get the same colour discarding issue regardless.

Code: Select all

compositor_node DefaultNode
{
    in 0 rt_renderwindow
    
    target rt_renderwindow
    {
        pass clear
        {
            colour_value 1.0 1.0 1.0 1.0

            profiling_id "Clear Pass"
        }

        pass render_scene
        {
            rq_first 6
            rq_last 7

            profiling_id "Bg Pass"
        }

        // pass custom colibri_gui
        // {
        //     skip_load_store_semantics false
        //     identifier 123
        //     profiling_id "Back Colibri GUI"
        //     aspect_ratio_mode keep_width
        // }

        pass render_scene
        {
            rq_first 7
            rq_last 200

            profiling_id "Main Objects Pass"
        }

        //pass render_scene
        //{
        //    rq_first 200
        //    rq_last 225
        //    camera OrthoCamera
        //
        //    profiling_id "Ortho2D Perspective Pass"
        //}

        //pass custom colibri_gui
        //{
        //    skip_load_store_semantics false
        //    identifier 456
        //    profiling_id "Front Colibri GUI"
        //    aspect_ratio_mode keep_width
        //}
    }
}
load_store_semantic_issue.rar

In this capture, the colour output is discarded at EID 28. Why is that? Ogre-Next should handle this automatically, right?

Post by psysu »

Okay, for this one I removed all the commented-out parts; this is how I normally render the scene.

Code: Select all

compositor_node DefaultNode
{
    in 0 rt_renderwindow
    
    target rt_renderwindow
    {
        pass clear
        {
            colour_value 1.0 1.0 1.0 1.0

            profiling_id "Clear Pass"
        }

        pass render_scene
        {
            rq_first 6
            rq_last 7

            profiling_id "Bg Pass"
        }

        pass custom colibri_gui
        {
            skip_load_store_semantics false

            identifier 123
            profiling_id "Back Colibri GUI"
            aspect_ratio_mode keep_width
        }

        pass render_scene
        {
            rq_first 7
            rq_last 200

            profiling_id "Main Objects Pass"
        }

        pass render_scene
        {
            rq_first 200
            rq_last 225
            camera OrthoCamera

            profiling_id "Ortho2D Perspective Pass"
        }

        pass custom colibri_gui
        {
            skip_load_store_semantics false

            identifier 456
            profiling_id "Front Colibri GUI"
            aspect_ratio_mode keep_width
        }
    }
}
load_store_semantic_with_colibri_issue.part01.rar
load_store_semantic_with_colibri_issue.part02.rar
load_store_semantic_with_colibri_issue.part03.rar

For this compositor node, the colour is discarded at EID 61, at the end of a render pass.

Post by dark_sylinc »

OK, I was able to take a look (I couldn't before because Vulkan RenderDoc captures can't really be replayed on other GPUs, so I just recompiled RenderDoc and forced it anyway).

Maybe the fact that you're rendering to the swapchain in multiple VkRenderPasses AND using MSAA is what's causing this (swapchains are a bit of a special case, and MSAA complicates things).

The first thing I see is that you're using MSAA. The default option should be store_or_resolve (however, it's acting as if it were "dont_care" for some reason). Even if it were working correctly, this is not what you want.

  • BgPass + Back Colibri GUI + Main Object Pass + Ortho2D Perspective Pass should be set to "store" (not store_or_resolve). Because you want to continue rendering to MSAA later.
  • The last pass, Front Colibri GUI, you want it to be set to "store_or_resolve" so that it loads the MSAA from the previous pass, and then finally resolve to the swapchain.
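
To make the rule in the two bullets above concrete, here is a minimal sketch in plain C++. It is not OgreNext code: `chooseColourStoreActions` is a made-up helper name, and the returned strings are simply the keywords you would write by hand in each pass's `store { colour ... }` block. It assumes all the passes render to the same MSAA target, one after another.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of OgreNext). For a chain of passes that all
// render to the same MSAA target, returns the colour store action each pass
// should declare, in order.
std::vector<std::string> chooseColourStoreActions( std::size_t numPasses )
{
    // Intermediate passes keep the MSAA samples so the next pass can continue on them.
    std::vector<std::string> actions( numPasses, "store" );

    // Only the last pass resolves (when MSAA is on), since the result is ready for display.
    if( !actions.empty() )
        actions.back() = "store_or_resolve";

    return actions;
}
```

For the five passes listed above, `chooseColourStoreActions( 5 )` gives "store" for the first four entries and "store_or_resolve" for the last one.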

Now, as to why it's using dont_care when you're using default values, I don't know. It sounds like a potential bug. Without debugging into VulkanRenderPassDescriptor::setupColourAttachment to see what is being set, and tracing it backwards to CompositorPassDef::mStoreActionColour, it'd be difficult to tell. Nothing of what you posted hints at why it's behaving like that.

Post by dark_sylinc »

If you're able to create a small repro by tweaking some of the samples (including Colibri sample if you wish) perhaps I could take a look.

Post by psysu »

dark_sylinc wrote: Thu Dec 07, 2023 6:47 pm

If you're able to create a small repro by tweaking some of the samples (including Colibri sample if you wish) perhaps I could take a look.

I need to set up a new workspace to repro this; when I reported this issue I was working with a different organization. I'll try to repro it once the workspace is set up.

Post by psysu »

Okay, I was able to repro the bug in the Colibri sample.

For this, FSAA has to be enabled. With FSAA x16 (which is what I was using in my actual application), the application stalled and I got a plain white screen, so I used FSAA x2 for the repro, where the colour output is discarded.

Here's how I set up the scene and compositor node to repro this:

ColibriGuiGameState.cpp :

Code: Select all

  
// This listener to imitate Colibri widgets that renders on the back and front of Ogre Scene Renderables
class MultiPassWorkspaceListener : public Ogre::CompositorWorkspaceListener
{
public:
    MultiPassWorkspaceListener( Colibri::Window *pBackWindow, Colibri::Window *pFrontWindow ) :
        m_pBackWindow{ pBackWindow },
        m_pFrontWindow{ pFrontWindow }
    {
    }

    void passEarlyPreExecute( Ogre::CompositorPass *pPass ) override
    {
        const uint32_t identifier = pPass->getDefinition()->mIdentifier;

        if( identifier == 123 )
        {
            m_pBackWindow->setHidden( false );
            m_pFrontWindow->setHidden( true );
        }
        else if( identifier == 456 )
        {
            m_pBackWindow->setHidden( true );
            m_pFrontWindow->setHidden( false );
        }
    }

    Colibri::Window *m_pBackWindow;
    Colibri::Window *m_pFrontWindow;
};

...

void ColibriGuiGameState::createScene01(void)
{
    ...

    // after colibri widgets are created
    // Here i consider vertWindow widgets as back window and mainWindow widgets as front window
    m_pWorkspaceListener = static_cast<Ogre::CompositorWorkspaceListener *>(
        OGRE_NEW MultiPassWorkspaceListener( vertWindow, mainWindow ) );
    mGraphicsSystem->getCompositorWorkspace()->addListener( m_pWorkspaceListener );

    TutorialGameState::createScene01();
}

Main.compositor :

Code: Select all

compositor_node RenderingNode
{
    in 0 renderWindow

    target renderWindow
    {
        pass clear
        {
            all clear
            colour_value 0.2 0.4 0.6 1
        }

        pass render_scene
        {
            rq_first 0
            rq_last 10
            profiling_id "BG Render Pass"
        }

        pass custom colibri_gui
        {
            identifier 123
            // True is the default value since 99% of the time
            // we want to append ourselves to the previous pass.
            skip_load_store_semantics false

            profiling_id "Colibri GUI Back Pass"
        }

        pass render_scene
        {
            rq_first 10
            rq_last max
        }

        pass custom colibri_gui
        {
            identifier 456
            // True is the default value since 99% of the time
            // we want to append ourselves to the previous pass.
            skip_load_store_semantics false

            profiling_id "Colibri GUI Front Pass"
        }
    }
}

For me, the color got discarded at the end of "BG Render Pass"

Post by dark_sylinc »

OK I took a look at your sample (great repro btw! easy, simple, straight to the point. Thanks!) and I had a "D'oh!" moment.

I remember thinking, when I was writing this code around 4 years ago, that a certain decision would come back to bite me.

The problem is the following:

  1. "BG Render Pass"

    • Default is store_or_resolve (non-MSAA = store, MSAA = resolve). The idea is that (when ideally using one pass) we resolve once, MSAA is never flushed to memory; and later the resolved results are used by postprocessing (unless certain algorithms like MSAA specifically access the MSAA content). But in this case we're far from the ideal because the pass must be split in multiple passes.

    • This means when using MSAA, it ends up using resolve. The MSAA contents are discarded.

  2. "Colibri GUI Back Pass"

    • We are continuing rendering to MSAA. (Which was discarded on the previous pass!)

    • Default is store_or_resolve

    • Ends up using resolve. The MSAA contents are discarded again.

  3. Unnamed render_scene

    • We are continuing rendering to MSAA. (Which was discarded on the previous pass!)

    • Default is store_or_resolve

    • Ends up using resolve. The MSAA contents are discarded again.

  4. "Colibri GUI Front Pass"

    • We are continuing rendering to MSAA. (Which was discarded on the previous pass!)

    • Default is store_or_resolve

    • Ends up using resolve. The MSAA contents are discarded. This is the only time this is what we want. Since we're ready for display.
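
The chain of events above can be sketched as a toy model in plain C++. This is not OgreNext code and it does not touch any real API: `StoreAction`, `PassDesc` and `findPassesStartingFromDiscardedMsaa` are names invented for the illustration. The only assumption baked in is the behaviour described in the list: with MSAA, store_or_resolve behaves as "resolve" and the MSAA contents are discarded afterwards, while "store" keeps them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model, NOT OgreNext code: it only mirrors the behaviour described above.
enum class StoreAction
{
    Store,
    StoreOrResolve
};

struct PassDesc
{
    std::string name;
    StoreAction colourStore;
};

// Returns the names of the passes that start rendering on top of MSAA contents
// which the previous pass discarded.
std::vector<std::string> findPassesStartingFromDiscardedMsaa( const std::vector<PassDesc> &passes,
                                                              bool msaa )
{
    std::vector<std::string> affected;
    bool msaaContentsValid = true;  // the first pass clears, so it has nothing to lose

    for( const PassDesc &pass : passes )
    {
        if( !msaaContentsValid )
            affected.push_back( pass.name );

        // With MSAA, store_or_resolve acts as "resolve": the samples are resolved
        // and then discarded. Without MSAA it acts as a plain "store".
        const bool resolves = msaa && pass.colourStore == StoreAction::StoreOrResolve;
        msaaContentsValid = !resolves;
    }

    return affected;
}
```

Feeding it the four passes with the default store_or_resolve and `msaa = true` flags passes 2, 3 and 4. Setting the first three passes to `StoreAction::Store` (or turning MSAA off) flags none.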

Unfortunately, OgreNext is not smart enough to figure out that you will continue rendering afterwards and to transform the store_or_resolve into "store" instead of "resolve".

The solution is to override the defaults, because the default is not what you want here:

Code: Select all

compositor_node RenderingNode
{
    in 0 renderWindow

    target renderWindow
    {
        pass render_scene
        {
            load
            {
                all clear
                clear_colour 0.2 0.4 0.6 1
            }
            store
            {
                // We don't want to resolve yet, as we want to continue rendering into it.
                colour store
            }

            rq_first 0
            rq_last 10
            profiling_id "BG Render Pass"
        }

        pass custom colibri_gui
        {
            store
            {
                // We don't want to resolve yet, as we want to continue rendering into it.
                colour store
            }

            identifier 123
            // True is the default value since 99% of the time
            // we want to append ourselves to the previous pass.
            skip_load_store_semantics false

            profiling_id "Colibri GUI Back Pass"
        }

        pass render_scene
        {
            store
            {
                // We don't want to resolve yet, as we want to continue rendering into it.
                colour store
            }
            rq_first 10
            rq_last max
        }

        pass custom colibri_gui
        {
            store
            {
                // This is the final pass. We want to resolve iff using MSAA.
                colour store_or_resolve
            }

            identifier 456
            // True is the default value since 99% of the time
            // we want to append ourselves to the previous pass.
            skip_load_store_semantics false

            profiling_id "Colibri GUI Front Pass"
        }
    }
}

I also merged your clear pass into the render_scene for better efficiency.

This behavior should definitely be documented, to make it very clear what's going on.

I'm sorry it caused you this much trouble.

Cheers
Matias