[2.3] Vulkan Progress

Discussion area about developing with Ogre-Next (2.1, 2.2 and beyond)


dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

[2.3] Vulkan Progress

Post by dark_sylinc »

This is a continuation of the News Announcement post.

That post doesn't have much visibility so I'm continuing here.

It's beginning to work!
Beginning.png
Stereo isn't working yet. From this angle one cannot notice that there is no depth buffer on Linux yet (Windows does have it). But it's working!!!

Many samples that could be working (I think) are broken due to (at least) a bug in V1's buffer management. I hope to fix that tomorrow.

Any sample which renders to RenderTextures (i.e. has shadow maps, doesn't directly render to screen) isn't working yet either.
xrgo
OGRE Expert User
Posts: 1148
Joined: Sat Jul 06, 2013 10:59 pm
Location: Chile
x 168

Re: [2.3] Vulkan Progress

Post by xrgo »

Awesomeeee!!
How's performance looking compared to dx11?
Will this "just work" on Android?
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

xrgo wrote: Mon Aug 03, 2020 5:12 am How's performance looking compared to dx11?
Too early to tell. I barely got that sample to work. Without depth buffers it's not even an apples-to-apples comparison. We haven't integrated a SPIRV shader optimizer (and we don't know whether it affects performance on Desktop), and DescriptorSetTextures are being emulated to cut some corners.

Ask again in a couple of weeks once we have more stuff up, or in a couple of months once it has fully caught up with the rest of the backends :)
xrgo wrote: Mon Aug 03, 2020 5:12 am Will this "just work" on Android?
That's the idea. Though from other engines' experience, Android 7.x is really bad when it comes to driver bugs, and a blacklist of known bad driver versions needs to be maintained.

But other than that, on a recent, Vulkan-capable Android version it should just work. And with some luck, on not-so-recent ones too.
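
The blacklist idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `DriverInfo`, `BlacklistEntry` and `isDriverBlacklisted` are hypothetical names (not Ogre API), and a real list would have to be filled with driver versions that are actually known to be bad.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of the driver reported by the device.
struct DriverInfo
{
    uint32_t vendorId;
    uint32_t driverVersion;
};

// Hypothetical blacklist entry: an inclusive range of bad versions for one vendor.
struct BlacklistEntry
{
    uint32_t vendorId;
    uint32_t firstBadVersion;
    uint32_t lastBadVersion;
};

// Returns true if the driver falls inside any known-bad range.
inline bool isDriverBlacklisted( const DriverInfo &driver,
                                 const std::vector<BlacklistEntry> &blacklist )
{
    for( const BlacklistEntry &entry : blacklist )
    {
        if( driver.vendorId == entry.vendorId &&
            driver.driverVersion >= entry.firstBadVersion &&
            driver.driverVersion <= entry.lastBadVersion )
        {
            return true;
        }
    }
    return false;
}
```

An application could run such a check once at startup and fall back to another RenderSystem when it returns true.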
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

For some reason, in commit 'Added ability to programmatically specify RootLayouts' I get an error in OgreRootLayout saying that I forgot to include OgreStableHeaders. After including it, it compiles. I think it's something related to precompiled headers?
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

I have fixed an issue with Win32Window where we weren't using the already provided window and created another one instead. You might want to cherry pick it.
paroj
OGRE Team Member
Posts: 1994
Joined: Sun Mar 30, 2014 2:51 pm
x 1074
Contact:

Re: [2.3] Vulkan Progress

Post by paroj »

dark_sylinc wrote: Mon Aug 03, 2020 5:25 am But other than that, on a recent, Vulkan-capable Android version it should just work. And with some luck, on not-so-recent ones too.
Note that starting with Android 11, you can also use ANGLE for OpenGL > Vulkan translation:
https://developer.android.com/preview/features/angle
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

paroj wrote: Mon Aug 03, 2020 11:57 am Note that starting with Android 11, you can also use ANGLE for OpenGL > Vulkan translation:
https://developer.android.com/preview/features/angle
Wow, I was not aware of it. Thanks! I wish they'd come up with this sooner. Android 11 is way too new.
Though it looks like it's only ES? If so, it's a shame. Full GL is where the good stuff is.
Hotshot5000 wrote: Mon Aug 03, 2020 10:23 am I have fixed an issue with Win32Window where we weren't using the already provided window and created another one instead. You might want to cherry pick it.
I haven't integrated any of your latest changes to that file, so I will most likely replace mine with yours.

Also my Linux version has the same problem as yours (there are 2 windows because it doesn't support external handle yet)

Btw I created a ticket with your tasks. I added two more.
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

Possible fix for autobatching for textures?

They seem to be uploading fine but they are not actually used in the custom renderable sample so I don't know if everything is OK. At least it doesn't crash anymore.

Now I have another round of issues. I see the cube but texture drawing seems problematic:

Code:

Ogre:  v1 * _render: CbRenderOp 
Ogre:  v1 * _render: CbDrawCallStrip 
Ogre:  v1 * _render: CbRenderOp 
Ogre:  v1 * _render: CbDrawCallStrip 
Ogre: VulkanQueue::commitAndNextCommandBuffer
Ogre: ERROR: [Validation] Code 0 :  [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object: 0x1975bb8dd90 (Type = 6) | Submitted command buffer expects VkImage 0xb3ec550000000090[DebugFont/Texture] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL.
Ogre: ERROR: [Validation] Code 0 :  [ UNASSIGNED-CoreValidation-DrawState-QueueForwardProgress ] Object: VK_NULL_HANDLE (Type = 6) | VkQueue 0x1973acb96e0[] is waiting on VkSemaphore 0x2d8f4b00000000ce[] that has no way to be signaled.
Ogre: ERROR: [Validation] Code 0 :  [ VUID-VkPresentInfoKHR-pImageIndices-01296 ] Object: 0x1973acb96e0 (Type = 4) | Images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout at the time the operation is executed on a VkDevice (https://github.com/KhronosGroup/Vulkan-Docs/search?q=VUID-VkPresentInfoKHR-pImageIndices-01296)
Ogre: [VulkanWindow::swapBuffers] vkQueuePresentKHR: error presenting VkResult = VK_ERROR_VALIDATION_FAILED_EXT
EDIT: I didn't actually fix anything, since with autobatching mNextLayout remains undefined. It must be properly initialized.
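
The first validation error above says the draw expects the font texture in VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL while it is still in VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL. A minimal sketch of per-texture layout tracking that avoids this kind of mismatch, assuming a heavily simplified model: `ImageLayout`, `TrackedTexture` and `transitionIfNeeded` are hypothetical stand-ins (not Ogre's classes), and the real barrier call is only indicated in a comment.

```cpp
// Local stand-ins for the VkImageLayout values that appear in the log above.
enum class ImageLayout
{
    Undefined,
    TransferDstOptimal,
    ShaderReadOnlyOptimal
};

// Hypothetical texture wrapper. The tracked layout is always initialized,
// so it never holds garbage.
struct TrackedTexture
{
    ImageLayout layout = ImageLayout::Undefined;
};

// Called once the upload commands have been recorded: the image is now a copy destination.
inline void notifyUploadRecorded( TrackedTexture &tex )
{
    tex.layout = ImageLayout::TransferDstOptimal;
}

// Makes sure the texture is in 'required' before it is used.
// Returns true if a barrier had to be issued.
inline bool transitionIfNeeded( TrackedTexture &tex, ImageLayout required )
{
    if( tex.layout == required )
        return false;
    // A real implementation would record a vkCmdPipelineBarrier here with
    // oldLayout = tex.layout and newLayout = required.
    tex.layout = required;
    return true;
}
```

With this model, sampling a freshly uploaded texture triggers exactly one transition to the shader-read layout, and later draws trigger none.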
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

Pushed a fix for AutomaticBatching.

Aside from your change, like you said mNextLayout was not being set, mDisplayTextureName was not properly set when dummies are used(*), and _isDataReadyImpl was wrong.

Also vkCmdEndRenderPass was not being called when it should be.

From what I can see the only missing function to implement is VulkanTextureGpu::copyTo. Could you handle that one?

(*) A dummy is a blank 4x4 texture we display on screen while we wait for the real texture to finish streaming in the background. We do not own it and cannot write to it; it's only used for display.
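
The dummy mechanism can be pictured with a small sketch. Everything here is an assumption made for illustration: `StreamedTexture`, `dummyTextureName` and `getDisplayTextureName` are hypothetical names, and the real bookkeeping behind mDisplayTextureName is certainly more involved.

```cpp
#include <string>

// Hypothetical texture record: the real texture plus a flag that is
// set once background streaming has finished.
struct StreamedTexture
{
    std::string realName;
    bool dataReady = false;
};

// Name of the shared blank placeholder (made-up name).
inline const std::string &dummyTextureName()
{
    static const std::string name = "Dummy4x4";
    return name;
}

// While streaming is in flight we display the dummy; afterwards the real texture.
inline const std::string &getDisplayTextureName( const StreamedTexture &tex )
{
    return tex.dataReady ? tex.realName : dummyTextureName();
}
```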
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

Ok I'll work on VulkanTextureGpu::copyTo().

EDIT: Any hints on how to test this? I see in HlmsPbs something in OgreParallaxCorrectedCubemapAuto that calls into copyTo and in OgreVctMaterial which I have no idea what it is. I assume that there are samples that test this functionality but they probably need other features before they work.

EDIT2: Basic VulkanTextureGpu::copyTo() implementation. Not sure if it handles all the cases correctly.

In particular this piece is not clear to me:

Code:

//Do not perform the sync if notifyDataIsReady hasn't been called yet (i.e. we're
//still building the HW mipmaps, and the texture will never be ready)
if( dst->_isDataReadyImpl() &&
    dst->getGpuPageOutStrategy() == GpuPageOutStrategy::AlwaysKeepSystemRamCopy )
{
    dst->_syncGpuResidentToSystemRam();
}
EDIT3: Now the sample cube is also working fine for me. It was the change in OgreCompositorManager2 that, for some reason, Visual Studio decided to ignore until now.

Now I will start exploring the HLSL shader path for Vulkan to see if it would be easier to use HLSL instead of GLSL. Unless you have something more important/urgent on your mind.
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

Hotshot5000 wrote: Mon Aug 03, 2020 5:43 pm EDIT: Any hints on how to test this? I see in HlmsPbs something in OgreParallaxCorrectedCubemapAuto that calls into copyTo and in OgreVctMaterial which I have no idea what it is. I assume that there are samples that test this functionality but they probably need other features before they work.
Create two AutomaticBatching textures chicken.png and cow.png, call waitForStreamingCompletion, then copy one over the other:

Code:

// textureManager is assumed to be the Ogre::TextureGpuManager in use
Ogre::TextureGpu *cow = textureManager->createOrRetrieveTexture(
    "cow.png", Ogre::GpuPageOutStrategy::Discard,
    Ogre::CommonTextureTypes::Diffuse,
    Ogre::ResourceGroupManager::AUTODETECT_RESOURCE_GROUP_NAME );
Ogre::TextureGpu *chicken = textureManager->createOrRetrieveTexture(
    "chicken.png", Ogre::GpuPageOutStrategy::Discard,
    Ogre::CommonTextureTypes::Diffuse,
    Ogre::ResourceGroupManager::AUTODETECT_RESOURCE_GROUP_NAME );

// Ask for both textures to be streamed in
cow->scheduleTransitionTo( Ogre::GpuResidency::Resident );
chicken->scheduleTransitionTo( Ogre::GpuResidency::Resident );

// Block until background streaming has finished
textureManager->waitForStreamingCompletion();

// Copy one texture over the other
cow->copyTo( chicken );
Thanks!
Hotshot5000 wrote: Mon Aug 03, 2020 5:43 pm In particular this piece is not clear to me:

Code:

//Do not perform the sync if notifyDataIsReady hasn't been called yet (i.e. we're
//still building the HW mipmaps, and the texture will never be ready)
if( dst->_isDataReadyImpl() &&
    dst->getGpuPageOutStrategy() == GpuPageOutStrategy::AlwaysKeepSystemRamCopy )
{
    dst->_syncGpuResidentToSystemRam();
}
The comment makes it hard to understand because it documents an edge case I can't remember right now, one in which we should not perform the sync; that is why we're calling _isDataReadyImpl instead of isDataReady.

What that piece of code does is simple: "AlwaysKeepSystemRamCopy" means that there are two copies, one on the GPU for use and one on the CPU for immediate read access (and for immediate re-upload in case of a device-lost event).

Normally CPU -> GPU modifications are easy to keep in sync (we do a memcpy CPU -> CPU, then CPU -> GPU), but when a GPU -> GPU copy is performed, we need to download the data from the GPU to keep the CPU version up to date.
That is what that block of code is doing.
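
A toy model of that behaviour, for illustration only. `ToyTexture`, `PageOutStrategy` and `toyCopyTo` are hypothetical; GPU memory is faked with a std::vector, so this shows only the bookkeeping, not an actual transfer.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for the page-out strategies named above.
enum class PageOutStrategy
{
    Discard,
    AlwaysKeepSystemRamCopy
};

// Hypothetical texture: 'gpuData' fakes GPU memory, 'systemRam' is the CPU mirror.
struct ToyTexture
{
    PageOutStrategy strategy = PageOutStrategy::Discard;
    bool dataReady = false;
    std::vector<uint8_t> gpuData;
    std::vector<uint8_t> systemRam;
};

// GPU -> GPU copy, followed by a download when the destination keeps a CPU mirror.
inline void toyCopyTo( const ToyTexture &src, ToyTexture &dst )
{
    dst.gpuData = src.gpuData;

    // Same condition as the snippet quoted above: only sync if the data is
    // ready and the destination is supposed to keep a system RAM copy.
    if( dst.dataReady && dst.strategy == PageOutStrategy::AlwaysKeepSystemRamCopy )
        dst.systemRam = dst.gpuData;
}
```

A destination using Discard has no mirror to maintain, so its `systemRam` is left untouched.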
Hotshot5000 wrote: Mon Aug 03, 2020 5:43 pm Now I will start exploring the HLSL shader path for Vulkan to see if it would be easier to use HLSL instead of GLSL. Unless you have something more important/urgent on your mind.
Great! copyTo was the most urgent one, the rest is up to you.

From the ticket (added today), "Add options to RenderSystem config like all the other RS do (like Resolution, VSync, device to use, whether to force-use debug layers, etc)" is not urgent, but if left unattended for too long I will have to do it at some point if it gets in my way (but hopefully by that time almost everything will be working... yay!).
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

More samples are working!
Ogre.png
Although not the most complex one, the TagPoints sample does some advanced memory manipulation and I'm shocked it's working.
Ogre2.png
V1 Interface also working (shadows disabled)
Ogre3.png
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

Great work! Any reason why shadows aren't working? Is it the compositor or...

EDIT: The new laptop with 4800H has arrived. I knew the Ryzen platform was better than what Intel can do when it comes to compilation times but now that I have compiled Ogre with it... It is shocking to see how fast it is compared to my previous 7700HQ which wasn't exactly garbage either.

What I bought. The site is in Romanian (I am in Romania) but I think you can make out the specs from the title. And the price is 1000 euros... around what I would have paid to fix the old laptop :)

EDIT2: Interestingly, the CustomSampleRenderable dies after a few seconds with the error VK_ERROR_OUT_OF_DEVICE_MEMORY in VulkanVaoManager::allocateVbo()... For the few seconds that it runs, performance is around 120 fps, while GL3Plus goes over 260 fps.
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

PbsMaterials is running! (sort of)
PbsMaterials.jpg
  • Shadows are still disabled
  • The "corruption" is actually expected: _autogenerateMipmaps is not yet implemented (Vulkan provides no such facility, we have to implement it ourselves by hand)
  • I had to comment out some vkDestroyImageView & vkDestroyImage calls (letting them leak) because we destroy them too early (known issue)
We can work around the missing _autogenerateMipmaps by setting TextureGpuManager::mDefaultMipmapGen to SW, though. If we do that, there is still some corruption I didn't look into:
PbsMaterials2.jpg
Update: The corruption appears to be an incorrect stride when uploading the higher mips.

Update 2: Corruption when using SW mipmaps fixed
Pbs1.jpg
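
For readers wondering what a stride problem in software mipmaps looks like, here is a minimal sketch. It is not the code from the fix: `downsampleMip` is a hypothetical helper that box-filters one 8-bit single-channel mip into the next. The point is that source rows are addressed through their own row stride (which may include padding), while the output uses the smaller mip's tightly packed width.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Produces the next mip level with a 2x2 box filter.
// srcRowStride is the distance in bytes between the starts of two source rows.
// The returned mip is tightly packed (its row stride equals its width).
inline std::vector<uint8_t> downsampleMip( const uint8_t *src, uint32_t srcWidth,
                                           uint32_t srcHeight, std::size_t srcRowStride )
{
    assert( srcWidth > 0u && srcHeight > 0u && srcRowStride >= srcWidth );

    const uint32_t dstWidth = std::max<uint32_t>( 1u, srcWidth / 2u );
    const uint32_t dstHeight = std::max<uint32_t>( 1u, srcHeight / 2u );
    std::vector<uint8_t> dst( std::size_t( dstWidth ) * dstHeight );

    for( uint32_t y = 0u; y < dstHeight; ++y )
    {
        // Clamp so that 1-pixel-wide or 1-pixel-tall mips stay in bounds
        const std::size_t y0 = std::min<uint32_t>( 2u * y, srcHeight - 1u );
        const std::size_t y1 = std::min<uint32_t>( 2u * y + 1u, srcHeight - 1u );
        for( uint32_t x = 0u; x < dstWidth; ++x )
        {
            const std::size_t x0 = std::min<uint32_t>( 2u * x, srcWidth - 1u );
            const std::size_t x1 = std::min<uint32_t>( 2u * x + 1u, srcWidth - 1u );
            const uint32_t sum = src[y0 * srcRowStride + x0] + src[y0 * srcRowStride + x1] +
                                 src[y1 * srcRowStride + x0] + src[y1 * srcRowStride + x1];
            // Rounded average of the four source texels
            dst[std::size_t( y ) * dstWidth + x] = static_cast<uint8_t>( ( sum + 2u ) / 4u );
        }
    }
    return dst;
}
```

Reusing the stride of mip 0 when uploading mip 1 and higher would skew every row, which is the kind of corruption one would expect from an incorrect stride.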
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

Hotshot5000 wrote: Tue Aug 04, 2020 7:46 am Great work! Any reason why shadows aren't working? Is it the compositor or...
Too many minor issues. I think it'd take me more time to explain them than to fix them.
Hotshot5000 wrote: Tue Aug 04, 2020 7:46 am EDIT: The new laptop with 4800H has arrived. I knew the Ryzen platform was better than what Intel can do when it comes to compilation times but now that I have compiled Ogre with it... It is shocking to see how fast it is compared to my previous 7700HQ which wasn't exactly garbage either.
Hotshot5000 wrote: Tue Aug 04, 2020 7:46 am What I bought. The site is in Romanian (I am in Romania) but I think you can make out the specs from the title. And the price is 1000 euros... around what I would have paid to fix the old laptop :)
Nice!! Congrats on the new rig.
Hotshot5000 wrote: Tue Aug 04, 2020 7:46 am EDIT2: Interestingly, the CustomSampleRenderable dies after a few seconds with the error VK_ERROR_OUT_OF_DEVICE_MEMORY in VulkanVaoManager::allocateVbo()... For the few seconds that it runs, performance is around 120 fps, while GL3Plus goes over 260 fps.
I cannot repro. Mine runs seemingly indefinitely (although there might be a race condition because I think 2 times I got a blank screen).

I get several calls to allocateVbos caused by TextAreaOverlayElement::checkMemoryAllocation (it looks like the FPS counter requires more characters as the framerate varies), but afterwards the calls to allocateVbos stop.
Hotshot5000 wrote: Tue Aug 04, 2020 7:46 am For the few seconds that it runs the performance is around 120 fps while GL3Plus goes over 260 fps.
A thing about measuring framerate (aside from the fact that this is waaaaay too early to compare):
  • A single bad barrier can destroy performance. We'll fix those once everything is running and we can run Vulkan profilers on it showing barrier usage
  • Currently our code prefers VK_PRESENT_MODE_FIFO_KHR when available. This means VSync On. Maybe your monitor's refresh rate is 120 Hz? (VK_PRESENT_MODE_IMMEDIATE_KHR = VSync off)
  • SPIRV-Opt may help (or not)
  • Many engines are very CPU inefficient, but not Ogre. Ogre uses AZDO (Approaching Zero Driver Overhead). That means we have little to gain from Vulkan in some areas where others gain a lot, while D3D11/GL drivers have lots of optimizations (like deferring work to a worker thread). I don't expect our Vulkan implementation to beat D3D11/GL (though that'd be sweet); in fact it could be a bit slower unless we take advantage of async compute. The main reason to get Vulkan is to move forward in API support and to support Android and other OSes (Nintendo Switch... I'm looking at you)
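
The present-mode point can be sketched as below. This is not Ogre's code: `PresentMode` is a local stand-in for the two Vulkan present modes named above, and `choosePresentMode` is a hypothetical helper. The fallback relies on the fact that FIFO is the one mode the Vulkan spec requires every implementation to support.

```cpp
#include <algorithm>
#include <vector>

// Local stand-ins for VK_PRESENT_MODE_IMMEDIATE_KHR and VK_PRESENT_MODE_FIFO_KHR.
enum class PresentMode
{
    Immediate,  // VSync off
    Fifo        // VSync on
};

// Picks the mode matching the VSync setting if the surface supports it,
// otherwise falls back to FIFO.
inline PresentMode choosePresentMode( const std::vector<PresentMode> &supported, bool vsync )
{
    const PresentMode wanted = vsync ? PresentMode::Fifo : PresentMode::Immediate;
    if( std::find( supported.begin(), supported.end(), wanted ) != supported.end() )
        return wanted;
    return PresentMode::Fifo;
}
```

With FIFO selected on a 120 Hz monitor, the framerate caps at about 120 fps no matter how fast the renderer is, so comparisons against another backend only make sense with VSync off on both.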
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

Yes, you are right. I have a 120 Hz screen. Forgot about that.
dermont
Bugbear
Posts: 812
Joined: Thu Dec 09, 2004 2:51 am
x 42

Re: [2.3] Vulkan Progress

Post by dermont »

Thanks to both of you (dark_sylinc and Hotshot5000) for all your effort on the Vulkan renderer. I tried the 3 samples mentioned by dark_sylinc and they all worked; the PBS sample crashed right away. Keep up the fantastic work, really appreciated.
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

dermont wrote: Wed Aug 05, 2020 9:32 am Thanks to both you guys (dark_sylinc and Hotshot5000) for all your effort on the Vulkan renderer.
:)
dermont wrote: Wed Aug 05, 2020 9:32 am the PBS sample crashed right away
Yeah, for now you need to disable shadows for it to work (comment out setupCompositor() in PbsMaterials.cpp)
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

There is definitely a problem in VulkanQueue::releaseFence: I get a "cannot dereference value-initialized map/set iterator" at line 864. Pushed a fix. Now it seems stable, but it eats 1 GB of RAM. Probably because everything is preallocated.
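
That message comes from MSVC's debug iterators, which catch dereferencing an iterator that does not point at an element. A generic sketch of the defensive pattern, which is not the code from the pushed fix: `FenceHandle` and `releaseFenceRef` are hypothetical, and the map is assumed to hold only reference counts of at least 1.

```cpp
#include <map>

// Hypothetical stand-in for a fence handle.
using FenceHandle = int;

// Drops one reference to 'fence' and erases the entry when the count reaches zero.
// Returns false if the fence is not tracked at all.
inline bool releaseFenceRef( std::map<FenceHandle, unsigned> &refCounts, FenceHandle fence )
{
    std::map<FenceHandle, unsigned>::iterator itor = refCounts.find( fence );
    if( itor == refCounts.end() )
        return false;  // Dereferencing itor here would be undefined behaviour

    if( --itor->second == 0u )
        refCounts.erase( itor );
    return true;
}
```

Release builds usually do not check this, so the bug can stay silent until a debug-iterator build trips over it.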
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

Oops! Nice catch!
I should turn on the debug iterators in GCC.

Btw your git username in this new machine is "DESKTOP-MQ6E5NQ\sebas"
Hotshot5000 wrote: Wed Aug 05, 2020 7:18 pm Now it seems stable but eats 1 GB RAM. Probably because everything is preallocated.
That seems excessive, but I wouldn't rule out that our preallocation is spiraling out of control for some reason.
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

Yeah I saw my new username... I changed it to my name :)

EDIT: At first glance it looks like a lot of the heap is taken by the Vulkan validation layer objects.
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

Btw how are you measuring GPU consumption?
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

Holy moly!
ShadowMaps.jpg
I did NOT expect this sample to start working so fast! Why? Because for this one to work, layout transitions have to work.
For that we have the compositor, which follows a special path if RSC_EXPLICIT_API is defined. However, this path had never been tested because we never supported D3D12 or Vulkan.

The best we had was Mesa's Radeon driver, which would show blatant race conditions when we missed a barrier when launching compute shaders (GL's barriers are much simpler than Vulkan's); that was our only means of "debugging" this system.

I basically maintained the RSC_EXPLICIT_API path blind. When I started writing it, the Vulkan spec was a draft and I didn't understand barriers too well. But it turns out it works quite closely to how Vulkan actually works.

And after fixing a couple design issues with this blind-coded barrier system, it worked!

To get this sample to work, edit ShadowMapDebugging.compositor and comment out "shadow_map_target_type point", because Ogre/DPSM/CubeToDpsm hasn't been ported yet.
Also the sample will try to access some compute shaders at the beginning for ESM. That needs to be commented out too.
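
To give an idea of what resolving layout transitions ahead of time means, here is a minimal sketch under strong assumptions. It is not how the compositor's RSC_EXPLICIT_API path is implemented: `Layout`, `Transition` and `resolveTransitions` are hypothetical, and only a single texture is followed through a linear list of passes.

```cpp
#include <cstddef>
#include <vector>

// Simplified layouts a texture can be in.
enum class Layout
{
    Undefined,
    RenderTarget,
    ShaderRead
};

// One barrier that has to be issued right before pass 'passIdx'.
struct Transition
{
    std::size_t passIdx;
    Layout oldLayout;
    Layout newLayout;
};

// Walks the passes in order and emits a transition whenever a pass
// needs the texture in a layout other than the one it is currently in.
inline std::vector<Transition> resolveTransitions( Layout initial,
                                                   const std::vector<Layout> &requiredPerPass )
{
    std::vector<Transition> transitions;
    Layout current = initial;
    for( std::size_t i = 0u; i < requiredPerPass.size(); ++i )
    {
        if( requiredPerPass[i] != current )
        {
            transitions.push_back( Transition{ i, current, requiredPerPass[i] } );
            current = requiredPerPass[i];
        }
    }
    return transitions;
}
```

For a shadow map that is rendered in two passes and then sampled in a third, this yields two transitions: Undefined to RenderTarget before the first pass, and RenderTarget to ShaderRead before the third.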
Hotshot5000
OGRE Contributor
Posts: 226
Joined: Thu Oct 14, 2010 12:30 pm
x 56

Re: [2.3] Vulkan Progress

Post by Hotshot5000 »

dark_sylinc wrote: Wed Aug 05, 2020 8:34 pm Btw how are you measuring GPU consumption?
I don't. That 1 GB was system RAM as reported by Visual Studio's Diagnostic Tools. I just looked at the Process Memory graph.
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.3] Vulkan Progress

Post by dark_sylinc »

PBS Materials now works and looks as it should!
PbsMaterials.jpg