[2.1] Likely GL3+ RenderSystem Memory Bug (with fix) Topic is solved

Discussion area about developing with Ogre-Next (2.1, 2.2 and beyond)


Post Reply
zxz
Gremlin
Posts: 184
Joined: Sat Apr 16, 2016 9:25 pm
x 19

[2.1] Likely GL3+ RenderSystem Memory Bug (with fix)

Post by zxz »

I have tracked down what seems to be a memory bug in the OpenGL rendersystem. I was having issues with my compositing workspaces only working every second time they were recreated.

Running with Address sanitizer turned up a heap-use-after-free issue when releasing a renderbuffer. I was able to recreate the issue in Sample_Hdr, by binding a key to call mGraphicsSystem->restartCompositor(). Pressing the key twice results in the following memory error:

Code: Select all

==25308==ERROR: AddressSanitizer: heap-use-after-free on address 0x6110000e43c0 at pc 0x7f455647cc79 bp 0x7ffe5e9a2c60 sp 0x7ffe5e9a2c58
READ of size 4 at 0x6110000e43c0 thread T0
    #0 0x7f455647cc78 in Ogre::v1::HardwarePixelBuffer::getHeight() const OgreMain/include/OgreHardwarePixelBuffer.h:191
    #1 0x7f455648dbb4 in Ogre::GL3PlusFBOManager::releaseRenderBuffer(Ogre::GL3PlusSurfaceDesc const&) RenderSystems/GL3Plus/src/OgreGL3PlusFBORenderTexture.cpp:557
    #2 0x7f45564a6617 in Ogre::GL3PlusFrameBufferObject::~GL3PlusFrameBufferObject() RenderSystems/GL3Plus/src/OgreGL3PlusFrameBufferObject.cpp:77
    #3 0x7f4556493c5a in Ogre::GL3PlusFBORenderTexture::~GL3PlusFBORenderTexture() RenderSystems/GL3Plus/include/OgreGL3PlusFBORenderTexture.h:40
    #4 0x7f4556493c95 in Ogre::GL3PlusFBORenderTexture::~GL3PlusFBORenderTexture() RenderSystems/GL3Plus/include/OgreGL3PlusFBORenderTexture.h:40
    #5 0x7f45750a831d in Ogre::RenderSystem::destroyRenderTarget(std::string const&) OgreMain/src/OgreRenderSystem.cpp:228
    #6 0x7f45564956aa in Ogre::v1::GL3PlusTextureBuffer::~GL3PlusTextureBuffer() RenderSystems/GL3Plus/src/OgreGL3PlusTextureBuffer.cpp:136
    #7 0x7f4556495769 in Ogre::v1::GL3PlusTextureBuffer::~GL3PlusTextureBuffer() RenderSystems/GL3Plus/src/OgreGL3PlusTextureBuffer.cpp:139
    #8 0x7f4574df1ab3 in Ogre::SharedPtrInfoDelete<Ogre::v1::HardwarePixelBuffer>::~SharedPtrInfoDelete() OgreMain/include/OgreSharedPtr.h:78
    #9 0x7f45564a2b0b in Ogre::SharedPtr<Ogre::v1::HardwarePixelBuffer>::destroy() OgreMain/include/OgreSharedPtr.h:343
    #10 0x7f45564a27cc in Ogre::SharedPtr<Ogre::v1::HardwarePixelBuffer>::release() OgreMain/include/OgreSharedPtr.h:329
    #11 0x7f45564a265b in Ogre::SharedPtr<Ogre::v1::HardwarePixelBuffer>::~SharedPtr() OgreMain/include/OgreSharedPtr.h:240
    #12 0x7f45564a263f in Ogre::v1::HardwarePixelBufferSharedPtr::~HardwarePixelBufferSharedPtr() OgreMain/include/OgreHardwarePixelBuffer.h:199
    #13 0x7f45564a3373 in Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> >::destroy(Ogre::v1::HardwarePixelBufferSharedPtr*) OgreMain/include/OgreMemorySTLAllocator.h:177
    #14 0x7f45564a31c7 in std::enable_if<std::__and_<std::allocator_traits<Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::__destroy_helper<Ogre::v1::HardwarePixelBufferSharedPtr>::type>::value, void>::type std::allocator_traits<Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::_S_destroy<Ogre::v1::HardwarePixelBufferSharedPtr>(Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> >&, Ogre::v1::HardwarePixelBufferSharedPtr*) /opt/gcc-5.3.0-3-0.el7/include/c++/5.3.0/bits/alloc_traits.h:285
    #15 0x7f45564a2f46 in void std::allocator_traits<Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::destroy<Ogre::v1::HardwarePixelBufferSharedPtr>(Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> >&, Ogre::v1::HardwarePixelBufferSharedPtr*) /opt/gcc-5.3.0-3-0.el7/include/c++/5.3.0/bits/alloc_traits.h:414
    #16 0x7f45564a2b8d in void std::_Destroy<Ogre::v1::HardwarePixelBufferSharedPtr*, Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >(Ogre::v1::HardwarePixelBufferSharedPtr*, Ogre::v1::HardwarePixelBufferSharedPtr*, Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> >&) /opt/gcc-5.3.0-3-0.el7/include/c++/5.3.0/bits/stl_construct.h:143
    #17 0x7f45564a288b in std::vector<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::_M_erase_at_end(Ogre::v1::HardwarePixelBufferSharedPtr*) /opt/gcc-5.3.0-3-0.el7/include/c++/5.3.0/bits/stl_vector.h:1438
    #18 0x7f45564a26a7 in std::vector<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::STLAllocator<Ogre::v1::HardwarePixelBufferSharedPtr, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::clear() /opt/gcc-5.3.0-3-0.el7/include/c++/5.3.0/bits/stl_vector.h:1212
    #19 0x7f45564b4a87 in Ogre::GL3PlusTexture::freeInternalResourcesImpl() RenderSystems/GL3Plus/src/OgreGL3PlusTexture.cpp:485
    #20 0x7f4575154622 in Ogre::Texture::freeInternalResources() OgreMain/src/OgreTexture.cpp:361
    #21 0x7f45751546d1 in Ogre::Texture::unloadImpl() OgreMain/src/OgreTexture.cpp:368
    #22 0x7f457515c173 in Ogre::Resource::unload() OgreMain/src/OgreResource.cpp:321
    #23 0x7f45564b1a3e in Ogre::GL3PlusTexture::~GL3PlusTexture() RenderSystems/GL3Plus/src/OgreGL3PlusTexture.cpp:74
    #24 0x7f45564b1ab1 in Ogre::GL3PlusTexture::~GL3PlusTexture() RenderSystems/GL3Plus/src/OgreGL3PlusTexture.cpp:80
    #25 0x7f4574d99709 in Ogre::SharedPtrInfoDelete<Ogre::Resource>::~SharedPtrInfoDelete() OgreMain/include/OgreSharedPtr.h:78
    #26 0x7f4574e6ae45 in Ogre::SharedPtr<Ogre::Texture>::destroy() OgreMain/include/OgreSharedPtr.h:343
    #27 0x7f4574e68fbe in Ogre::SharedPtr<Ogre::Texture>::release() OgreMain/include/OgreSharedPtr.h:329
    #28 0x7f4574e66eff in Ogre::SharedPtr<Ogre::Texture>::~SharedPtr() OgreMain/include/OgreSharedPtr.h:240
    #29 0x7f4574e67523 in Ogre::SharedPtr<Ogre::Texture>::operator=(Ogre::SharedPtr<Ogre::Texture> const&) OgreMain/include/OgreSharedPtr.h:192
    #30 0x7f45752ed098 in Ogre::TextureUnitState::setTexture(Ogre::SharedPtr<Ogre::Texture> const&) OgreMain/src/OgreTextureUnitState.cpp:240
    #31 0x7f4575521c02 in Ogre::CompositorPassQuad::execute(Ogre::Camera const*) OgreMain/src/Compositor/Pass/PassQuad/OgreCompositorPassQuad.cpp:171
    #32 0x7f45754d795a in Ogre::CompositorNode::_update(Ogre::Camera const*, Ogre::SceneManager*) OgreMain/src/Compositor/OgreCompositorNode.cpp:1020
    #33 0x7f45754bdf01 in Ogre::CompositorWorkspace::_update() OgreMain/src/Compositor/OgreCompositorWorkspace.cpp:744
    #34 0x7f45754f3134 in Ogre::CompositorManager2::_update(Ogre::SceneManagerEnumerator&, Ogre::HlmsManager*) OgreMain/src/Compositor/OgreCompositorManager2.cpp:699
    #35 0x7f45750921d4 in Ogre::Root::_updateAllRenderTargets() OgreMain/src/OgreRoot.cpp:1517
    #36 0x7f457508de53 in Ogre::Root::renderOneFrame() OgreMain/src/OgreRoot.cpp:1041
    #37 0x444102 in Demo::GraphicsSystem::update(float) Samples/2.0/Common/src/GraphicsSystem.cpp:298
    #38 0x458d53 in Demo::MainEntryPoints::mainAppSingleThreaded(int, char const**) Samples/2.0/Common/src/System/Desktop/MainLoopSingleThreaded.cpp:117
    #39 0x432b61 in mainApp(int, char const**) Samples/2.0/Showcase/Hdr/Hdr.cpp:23
    #40 0x432a5f in main Samples/2.0/Common/include/MainEntryPointHelper.h:34
    #41 0x7f4571e44b34 in __libc_start_main (/lib64/libc.so.6+0x21b34)
    #42 0x432968  (bin/Sample_Hdr-2.1.0+0x432968)

0x6110000e43c0 is located 64 bytes inside of 200-byte region [0x6110000e4380,0x6110000e4448)
freed by thread T0 here:
    #0 0x7f4576c47dda in __interceptor_free ../../../../build/libsanitizer/asan/asan_malloc_linux.cc:28
    #1 0x7f456c6418b0  (/usr/lib64/nvidia/libGLX_nvidia.so.0+0xb48b0)

previously allocated by thread T0 here:
    #0 0x7f4576c481e1 in __interceptor_calloc ../../../../build/libsanitizer/asan/asan_malloc_linux.cc:54
    #1 0x7f456c640ae2  (/usr/lib64/nvidia/libGLX_nvidia.so.0+0xb3ae2)

SUMMARY: AddressSanitizer: heap-use-after-free OgreMain/include/OgreHardwarePixelBuffer.h:191 Ogre::v1::HardwarePixelBuffer::getHeight() const
The trace itself doesn't really explain what is going on, but it pointed me in the right direction. The problem appears to be caused by line 506 in RenderSystems/GL3Plus/src/OgreGL3PlusFBORenderTexture.cpp:

Code: Select all

    GL3PlusSurfaceDesc GL3PlusFBOManager::requestRenderBuffer(GLenum format, uint32 width, uint32 height, uint fsaa)
    {
        GL3PlusSurfaceDesc retval;
        retval.buffer = 0; // Return 0 buffer if GL_NONE is requested
        if(format != GL_NONE)
        {
            RBFormat key(format, width, height, fsaa);
            RenderBufferMap::iterator it = mRenderBufferMap.find(key);
            if(it != mRenderBufferMap.end() && (it->second.refcount == 0))  // <--- HERE
            {
                retval.buffer = it->second.buffer;
                retval.zoffset = 0;
                retval.numSamples = fsaa;
                // Increase refcount
                ++it->second.refcount;
            }
            else
            {
                // New one
                v1::GL3PlusRenderBuffer *rb = new v1::GL3PlusRenderBuffer(format, width, height, fsaa);
                mRenderBufferMap[key] = RBRef(rb);
                retval.buffer = rb;
                retval.zoffset = 0;
                retval.numSamples = fsaa;
            }
        }
        //        std::cerr << "Requested renderbuffer with format " << std::hex << format << std::dec << " of " << width << "x" << height << " :" << retval.buffer << std::endl;
        return retval;
    }
This function has a map in order to reuse renderbuffers with the same format. However, the second clause in the condition

Code: Select all

if(it != mRenderBufferMap.end() && (it->second.refcount == 0))
causes existing renderbuffers with the same format to not be reused. Instead, a new renderbuffer is created with the same format, which overwrites the element in the map that might have been there prior to the call. Thus, when the original renderbuffer with the same format is supposed to be released, the new renderbuffer is found instead, and deleted. With the wrong renderbuffer deleted, the memory error occurs when a texture is deleted which refers to the incorrectly deleted renderbuffer.

I don't know why "it->second.refcount == 0" is there, because it seems to run counter to the purpose of the function (looking up formats and reusing existing buffers). Removing it solves the heap-use-after-free, and restores expected behaviour to my application.
User avatar
dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 5299
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1279
Contact:

Re: [2.1] Likely GL3+ RenderSystem Memory Bug (with fix)

Post by dark_sylinc »

Grrrr I hated that routine when I saw it the first time, I now hate it even more.

I assume this problem happens when MSAA is enabled?

MSAA rendering requires of an MSAA surface and a regular texture where the MSAA contents are resolved.
This routine was written a long, long, long time ago back when direct access to an MSAA surface was not possible or made little sense (as a texture fetch in a shader), and the idea was that MSAA surfaces get shared and reused among different Ogre::Texture pointers, while each Ogre::Texture would have a unique resolve texture, but would share the same MSAA surface.

So this routine implements a simple reference count scheme to destroy the MSAA surfaces that are no longer needed. But from what you're saying, it's somehow broken.

I'll try to take a look at what's actually going on, but it's not too high priority for me as I'm focusing on the texture refactor (which is going to be released 2.2) and this refactor gets rid of this sharing routines. But if I get some free time I'll try to take a look.

Thanks for the thorough examination
zxz
Gremlin
Posts: 184
Joined: Sat Apr 16, 2016 9:25 pm
x 19

Re: [2.1] Likely GL3+ RenderSystem Memory Bug (with fix)

Post by zxz »

Yes, I think the problem only occurs with multisampling enabled.

From a local perspective (releaseRenderBuffer and requestRenderBuffer), the condition for refcount to be zero to reuse a buffer makes no sense to me. When the refcount reaches zero, releaseRenderBuffer removes the entry from the map. The condition breaks the actual case where the same format is requested again. If the condition is removed, it just looks like a common refcounting scheme. So it seems like the proper solution to me, if refcounting and reusing is the purpose of the scheme.
zxz
Gremlin
Posts: 184
Joined: Sat Apr 16, 2016 9:25 pm
x 19

Re: [2.1] Likely GL3+ RenderSystem Memory Bug (with fix)

Post by zxz »

Hello!

I'm going through some local Ogre changes for possible upstreaming :)

Below is a patch to fix the issue described in this thread. We have been using this fix successfully since it was reported.

Code: Select all

diff --git a/RenderSystems/GL3Plus/src/OgreGL3PlusFBORenderTexture.cpp b/RenderSystems/GL3Plus/src/OgreGL3PlusFBORenderTexture.cpp
index 67d2599..6ca911b 100644
--- a/RenderSystems/GL3Plus/src/OgreGL3PlusFBORenderTexture.cpp
+++ b/RenderSystems/GL3Plus/src/OgreGL3PlusFBORenderTexture.cpp
@@ -503,7 +503,7 @@ namespace Ogre {
         {
             RBFormat key(format, width, height, fsaa);
             RenderBufferMap::iterator it = mRenderBufferMap.find(key);
-            if(it != mRenderBufferMap.end() && (it->second.refcount == 0))
+            if(it != mRenderBufferMap.end())
             {
                 retval.buffer = it->second.buffer;
                 retval.zoffset = 0;
I am still pretty confident in my analysis of the issue, and that the solution is good. The refcounting and reuse scheme is crippled by the extra condition.

New entries into the map are created with a refcount of 1, and releaseRenderBuffer removes any buffer that reaches 0 refcount from the map in question. Thus, reusing a buffer only when the refcount is 0 makes no sense. The extra condition only makes the code fail to reuse buffers, and instead overwrites existing entries causing the memory error as described earlier.
paroj
OGRE Team Member
OGRE Team Member
Posts: 1995
Joined: Sun Mar 30, 2014 2:51 pm
x 1075
Contact:

Re: [2.1] Likely GL3+ RenderSystem Memory Bug (with fix)

Post by paroj »

yes, I have been shipping this since 1.10.7 without any regressions, see:
https://github.com/OGRECave/ogre/pull/429
zxz
Gremlin
Posts: 184
Joined: Sat Apr 16, 2016 9:25 pm
x 19

Re: [2.1] Likely GL3+ RenderSystem Memory Bug (with fix)

Post by zxz »

Yeah, I noticed that commit today.

I think this is another example of the extremely unfortunate situation that there are no merges taking place between the two branches.

I understand that such a merge would be a huge undertaking at this point, with the divergence that has taken place.

In place of a proper merge, I think there should be some kind of policy in place whereby bugfixes are atleast applied in both branches wherever applicable. For example, this bug was initially reported on 2.1, but the fix only ended up on 1.10 (until now). It could at least have been applied to both.

I don't mean this as personal criticism, it would just be nice if some sort of policy existed to keep both branches updated instead of either branch randomly getting some fixes, but missing others.

Ps. Thanks for applying the fix to 2.1, dark_sylinc!
Post Reply