[2.3] Vulkan Progress
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
[2.3] Vulkan Progress
This is a continuation of the News Announcement post.
That post doesn't have much visibility so I'm continuing here.
It's beginning to work!
Stereo isn't working yet. From this angle one cannot notice that there is no depth buffer on Linux yet (Windows does have it). But it's working!!!
Many samples that could be working (I think) are broken due to (at least) a bug in V1's buffer management. I hope to fix that tomorrow.
Any sample which renders to RenderTextures (i.e. has shadow maps, doesn't directly render to screen) isn't working yet either.
-
- OGRE Expert User
- Posts: 1148
- Joined: Sat Jul 06, 2013 10:59 pm
- Location: Chile
- x 169
Re: [2.3] Vulkan Progress
Awesomeeee!!
How's performance looking compared to dx11?
Will this "just works" on Android?
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Too early to tell. I barely got that sample to work. Without depth buffers it's not even an apples-to-apples comparison. We haven't integrated a SPIRV shader optimizer (and we don't know whether it affects performance on Desktop), and DescriptorSetTextures are being emulated to cut some corners.
Ask that question again in a couple of weeks, once we have more stuff up, or in a couple of months, once it fully catches up to the rest of the backends.
That's the idea. Though from other engines' experience, Android 7.x is really bad when it comes to driver bugs, and a blacklist of known bad driver versions needs to be maintained.
But other than that, on a recent, Vulkan-capable Android version it should just work. And with some luck on not-so-recent ones too.
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
For some reason, in commit 'Added ability to programmatically specify RootLayouts' I get an error in OgreRootLayout saying that I forgot to include OgreStableHeaders. After including it, it compiles. I think it's something related to precompiled headers?
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
I have fixed an issue with Win32Window where we weren't using the already provided window and created another one instead. You might want to cherry pick it.
-
- OGRE Team Member
- Posts: 2126
- Joined: Sun Mar 30, 2014 2:51 pm
- x 1140
Re: [2.3] Vulkan Progress
dark_sylinc wrote: ↑Mon Aug 03, 2020 5:25 am But other on a recent, Vulkan-capable Android version it should just work. And with some luck on not-so-recent too.
Note that starting with Android 11, you can also use ANGLE for OpenGL > Vulkan translation:
https://developer.android.com/preview/features/angle
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
paroj wrote: ↑Mon Aug 03, 2020 11:57 am Note, that starting with Android 11, you can also use ANGLE for OpenGL > Vulkan translation:
https://developer.android.com/preview/features/angle
Wow, I was not aware of it. Thanks! I wish they had come up with this sooner. Android 11 is way too new.
Though it looks like it's only ES? If so, then it's a shame. Full GL is where the good stuff is.
Hotshot5000 wrote: ↑Mon Aug 03, 2020 10:23 am I have fixed an issue with Win32Window where we weren't using the already provided window and created another one instead. You might want to cherry pick it.
I haven't integrated any of your latest changes to that file, so I will most likely replace mine with yours.
Also my Linux version has the same problem as yours (there are 2 windows because it doesn't support external handle yet)
Btw I created a ticket with your tasks. I added two more.
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
Possible fix for autobatching for textures?
They seem to be uploading fine, but they are not actually used in the custom renderable sample, so I don't know if everything is OK. At least it doesn't crash anymore.
EDIT: My change didn't actually fix anything: with autobatching, mNextLayout remains undefined. It must be properly initialized.
Now I have another round of issues. I see the cube, but texture drawing seems problematic:
Code:
Ogre: v1 * _render: CbRenderOp
Ogre: v1 * _render: CbDrawCallStrip
Ogre: v1 * _render: CbRenderOp
Ogre: v1 * _render: CbDrawCallStrip
Ogre: VulkanQueue::commitAndNextCommandBuffer
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object: 0x1975bb8dd90 (Type = 6) | Submitted command buffer expects VkImage 0xb3ec550000000090[DebugFont/Texture] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL.
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-QueueForwardProgress ] Object: VK_NULL_HANDLE (Type = 6) | VkQueue 0x1973acb96e0[] is waiting on VkSemaphore 0x2d8f4b00000000ce[] that has no way to be signaled.
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkPresentInfoKHR-pImageIndices-01296 ] Object: 0x1973acb96e0 (Type = 4) | Images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout at the time the operation is executed on a VkDevice (https://github.com/KhronosGroup/Vulkan-Docs/search?q=VUID-VkPresentInfoKHR-pImageIndices-01296)
Ogre: [VulkanWindow::swapBuffers] vkQueuePresentKHR: error presenting VkResult = VK_ERROR_VALIDATION_FAILED_EXT
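The first validation error in the log says the command buffer expected DebugFont/Texture in SHADER_READ_ONLY_OPTIMAL while it was still in TRANSFER_DST_OPTIMAL, i.e. the transition after the upload was never recorded. Below is a minimal sketch of the kind of bookkeeping that avoids this. It is not Ogre's code: `Layout`, `Barrier` and `TrackedImage` are hypothetical stand-ins (not the real Vulkan types) so the example compiles without the Vulkan SDK.

```cpp
#include <cassert>
#include <vector>

// Stand-ins for the VkImageLayout values named in the log (hypothetical, not the Vulkan enum).
enum class Layout { Undefined, TransferDst, ShaderReadOnly, PresentSrc };

// What a recorded layout transition would need to know.
struct Barrier
{
    Layout oldLayout;
    Layout newLayout;
};

// Hypothetical tracker: remembers which layout an image is currently in.
struct TrackedImage
{
    Layout current = Layout::Undefined;

    // Records a transition only when the image is not already in the wanted layout.
    void transitionTo( Layout wanted, std::vector<Barrier> &outBarriers )
    {
        // Vulkan does not allow transitioning *to* the undefined layout.
        assert( wanted != Layout::Undefined );
        if( current == wanted )
            return;
        outBarriers.push_back( { current, wanted } );
        current = wanted;
    }
};
```

With a tracker like this, the upload path would call `transitionTo( Layout::TransferDst, ... )` before copying and `transitionTo( Layout::ShaderReadOnly, ... )` before the texture is sampled. If the second call is skipped, or the wanted layout is never set, the image stays in the transfer layout, which matches what the validation layer reports.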
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Pushed a fix for AutomaticBatching.
Aside from your change, like you said mNextLayout was not being set, mDisplayTextureName was not properly set when dummies are used(*), and _isDataReadyImpl was wrong.
Also vkCmdEndRenderPass was not being called when it should be.
From what I can see the only missing function to implement is VulkanTextureGpu::copyTo. Could you handle that one?
(*) A dummy is a blank 4x4 texture we display on screen while we wait for the real texture to finish streaming in the background. We do not own it and cannot write to it; it's only used for display.
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
Ok I'll work on VulkanTextureGpu::copyTo().
EDIT: Any hints on how to test this? I see in HlmsPbs something in OgreParallaxCorrectedCubemapAuto that calls into copyTo, and in OgreVctMaterial, which I have no idea what it is. I assume that there are samples that test this functionality but they probably need other features before they work.
EDIT2: Basic VulkanTextureGpu::copyTo() implementation. Not sure if it handles all the cases correctly.
In particular this piece is not clear to me:
Code:
//Do not perform the sync if notifyDataIsReady hasn't been called yet (i.e. we're
//still building the HW mipmaps, and the texture will never be ready)
if( dst->_isDataReadyImpl() &&
    dst->getGpuPageOutStrategy() == GpuPageOutStrategy::AlwaysKeepSystemRamCopy )
{
    dst->_syncGpuResidentToSystemRam();
}
EDIT3: Now the sample cube is working fine for me too. It was the change in OgreCompositorManager2 that, for some reason, Visual Studio decided to ignore until now.
Now I will start exploring the HLSL shader path for Vulkan to see if it would be easier to use HLSL instead of GLSL. Unless you have something more important/urgent on your mind.
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Hotshot5000 wrote: ↑Mon Aug 03, 2020 5:43 pm EDIT: Any hints on how to test this? I see in HlmsPbs something in OgreParallaxCorrectedCubemapAuto that calls into copyTo and in OgreVctMaterial which I have no idea what it is. I assume that there are samples that test this functionality but they probably need other features before they work.
Create two AutomaticBatching textures, chicken.png and cow.png, call waitForStreamingCompletion, then copy one over the other:
Code:
cow = textureManager->createOrRetrieveTexture(
    "cow.png", Ogre::GpuPageOutStrategy::Discard,
    Ogre::CommonTextureTypes::Diffuse,
    Ogre::ResourceGroupManager::AUTODETECT_RESOURCE_GROUP_NAME );
chicken = textureManager->createOrRetrieveTexture(
    "chicken.png", Ogre::GpuPageOutStrategy::Discard,
    Ogre::CommonTextureTypes::Diffuse,
    Ogre::ResourceGroupManager::AUTODETECT_RESOURCE_GROUP_NAME );
cow->scheduleTransitionTo( Ogre::GpuResidency::Resident );
chicken->scheduleTransitionTo( Ogre::GpuResidency::Resident );
textureManager->waitForStreamingCompletion();
cow->copyTo( chicken );
Hotshot5000 wrote: ↑Mon Aug 03, 2020 5:43 pm EDIT2: Basic VulkanTextureGpu::copyTo() implementation. Not sure if it handles all the cases correctly.
Thanks!
Hotshot5000 wrote: ↑Mon Aug 03, 2020 5:43 pm In particular this piece is not clear to me:
Code:
//Do not perform the sync if notifyDataIsReady hasn't been called yet (i.e. we're
//still building the HW mipmaps, and the texture will never be ready)
if( dst->_isDataReadyImpl() &&
    dst->getGpuPageOutStrategy() == GpuPageOutStrategy::AlwaysKeepSystemRamCopy )
{
    dst->_syncGpuResidentToSystemRam();
}
The comment makes it hard to understand because it was documenting an edge case I can't remember right now, one where we should not call isDataReady, which is why we're calling _isDataReadyImpl instead.
Put simply, "AlwaysKeepSystemRamCopy" means that there are two copies: one on the GPU for use, and one on the CPU for immediate read access (and for immediate re-upload in case of a device lost event).
Normally CPU -> GPU modifications are easy to keep in sync (we do memcpy CPU -> CPU then CPU -> GPU), but when a GPU -> GPU copy is performed, we need to download the data from GPU to keep the CPU version up to date.
That is what that block of code is doing.
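To make the idea concrete, here is a toy model of that rule. It is only a sketch under stated assumptions: `ToyTexture`, `PageOutStrategy` and the free function `copyTo` are hypothetical stand-ins, and plain `std::vector`s play the role of GPU memory and the system RAM mirror. The real Ogre classes are far more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Ogre's GpuPageOutStrategy (only the two values discussed here).
enum class PageOutStrategy { Discard, AlwaysKeepSystemRamCopy };

struct ToyTexture
{
    PageOutStrategy strategy = PageOutStrategy::Discard;
    bool dataReady = false;
    std::vector<uint8_t> gpuData;   // pretend this lives in VRAM
    std::vector<uint8_t> systemRam; // CPU mirror, only meaningful with AlwaysKeepSystemRamCopy

    // "Downloads" the GPU contents so the CPU mirror matches again.
    void syncGpuResidentToSystemRam() { systemRam = gpuData; }
};

// GPU -> GPU copy, followed by the conditional download described above.
inline void copyTo( const ToyTexture &src, ToyTexture &dst )
{
    assert( src.gpuData.size() == dst.gpuData.size() );
    dst.gpuData = src.gpuData;
    // The CPU never saw this copy, so its mirror is stale unless we sync it.
    if( dst.dataReady && dst.strategy == PageOutStrategy::AlwaysKeepSystemRamCopy )
        dst.syncGpuResidentToSystemRam();
}
```

A texture using the Discard strategy has no mirror to keep up to date, so the download is skipped for it.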
Hotshot5000 wrote: ↑Mon Aug 03, 2020 5:43 pm Now I will start exploring the HLSL shader path for Vulkan o see if it would be easier to use HLSL instead of GLSL. Unless you have something more important/urgent on your mind.
Great! copyTo was the most urgent one, the rest is up to you.
From the ticket, the task added today, "Add options to RenderSystem config like all the other RS do (like Resolution, VSync, device to use, whether to force-use debug layers, etc)", is not urgent. But if it is left unattended for too long, at some point I will have to do it myself if it gets in my way (hopefully by that time almost everything will be working... yay!).
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
More samples are working!
Although not the most complex one, the TagPoints sample does some advanced memory manipulation and I'm shocked it's working.
V1 Interface also working (shadows disabled)
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
Great work! Any reason why shadows aren't working? Is it the compositor or...
EDIT: The new laptop with 4800H has arrived. I knew the Ryzen platform was better than what Intel can do when it comes to compilation times but now that I have compiled Ogre with it... It is shocking to see how fast it is compared to my previous 7700HQ which wasn't exactly garbage either.
What I bought. The site is in Romanian (I am in Romania) but I think you can make out the specs from the title. And the price is 1000 euros, around what I would have paid to fix the old laptop.
EDIT2: Interestingly, the CustomSampleRenderable dies after a few seconds with the error VK_ERROR_OUT_OF_DEVICE_MEMORY in VulkanVaoManager::allocateVbo()... For the few seconds that it runs, the performance is around 120 fps, while GL3Plus goes over 260 fps.
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
PbsMaterials is running! (sort of)
- Shadows are still disabled
- The "corruption" is actually expected: _autogenerateMipmaps is not yet implemented (Vulkan provides no such facility, we have to implement it ourselves by hand)
- I had to comment some vkDestroyImageView & vkDestroyImage calls (letting them leak) because we destroy them too early (known issue)
Update 2: Corruption when using SW mipmaps fixed
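Since Vulkan has no built-in mipmap generation, a hand-written version has to walk the mip chain itself, producing each level from the previous one (for instance with one blit per level). The sketch below only shows the bookkeeping part, computing the extent of every level. `MipExtent` and `computeMipChain` are hypothetical names, and this is not necessarily how Ogre's _autogenerateMipmaps will end up being implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct MipExtent
{
    uint32_t width;
    uint32_t height;
};

// Returns the extent of every mip level, from the full-size level 0 down to 1x1.
// Each dimension is halved (rounding down) and clamped so it never drops below 1.
inline std::vector<MipExtent> computeMipChain( uint32_t width, uint32_t height )
{
    assert( width > 0u && height > 0u );
    std::vector<MipExtent> chain;
    chain.push_back( { width, height } );
    while( width > 1u || height > 1u )
    {
        width = std::max<uint32_t>( 1u, width >> 1u );
        height = std::max<uint32_t>( 1u, height >> 1u );
        chain.push_back( { width, height } );
    }
    return chain;
}
```

A generator would then loop over consecutive pairs of this chain, reading level N-1 and writing level N, with a layout transition between each step.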
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Hotshot5000 wrote: ↑Tue Aug 04, 2020 7:46 am Great work! Any reason why shadows aren't working? Is it the compositor or...
Too many minor issues. I think it would take me more time to explain them than to fix them.
Hotshot5000 wrote: ↑Tue Aug 04, 2020 7:46 am EDIT: The new laptop with 4800H has arrived. I knew the Ryzen platform was better than what Intel can do when it comes to compilation times but now that I have compiled Ogre with it... It is shocking to see how fast it is compared to my previous 7700HQ which wasn't exactly garbage either.
Hotshot5000 wrote: ↑Tue Aug 04, 2020 7:46 am What I bought. The site is in romanian (I am in Romania) but I think you can make out the specs from the title. And the price is 1000 euros.. around what I would have paid to fix the old laptop
Nice!! Congrats on the new rig.
Hotshot5000 wrote: ↑Tue Aug 04, 2020 7:46 am EDIT2: Interesting thing that it the CustomSampleRenderable dies in a few seconds with error: VK_ERROR_OUT_OF_DEVICE_MEMORY in VulkanVaoManager::allocateVbo()... For the few seconds that it runs the performance is around 120 fps while GL3Plus goes over 260 fps.
I cannot repro. Mine runs seemingly indefinitely (although there might be a race condition, because I think I got a blank screen twice).
I get several calls to allocateVbos caused by TextAreaOverlayElement::checkMemoryAllocation (looks like the FPS counter requiring more characters as the framerate varies), but afterwards the calls to allocateVbos stop.
Hotshot5000 wrote: ↑Tue Aug 04, 2020 7:46 am For the few seconds that it runs the performance is around 120 fps while GL3Plus goes over 260 fps.
A thing about measuring framerate (aside from this being waaaaay too early to compare):
- A single bad barrier can destroy performance. We'll fix those once everything is running and we can run Vulkan profilers on it showing barrier usage
- Currently our code is preferring VK_PRESENT_MODE_FIFO_KHR when available. This means VSync On. Maybe your monitor's refresh rate is 120hz? (VK_PRESENT_MODE_IMMEDIATE_KHR = VSync off)
- SPIRV-Opt may help (or not)
- Many engines are very CPU inefficient, but not Ogre. Ogre uses AZDO (Approaching Zero Driver Overhead). That means we have little to gain from Vulkan in some areas where others gain a lot, while D3D11/GL drivers have lots of optimizations (like deferring work to a worker thread). I don't expect our Vulkan implementation to beat D3D11/GL (though that'd be sweet); in fact it could be a bit slower, unless we take advantage of async compute. The main reason to get Vulkan is to move forward in API support and to support Android and other OSes (Nintendo Switch... I'm looking at you)
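The present mode point in the list above can be sketched as a small selection helper. This is an illustration, not Ogre's code: `PresentMode` mirrors a subset of VkPresentModeKHR (with the same numeric values) so the example compiles without the Vulkan SDK, and `choosePresentMode` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Mirrors a subset of VkPresentModeKHR so the sketch builds without the Vulkan headers.
enum PresentMode
{
    PRESENT_MODE_IMMEDIATE = 0, // VK_PRESENT_MODE_IMMEDIATE_KHR: VSync off, may tear
    PRESENT_MODE_MAILBOX = 1,   // VK_PRESENT_MODE_MAILBOX_KHR
    PRESENT_MODE_FIFO = 2       // VK_PRESENT_MODE_FIFO_KHR: VSync on
};

// Picks FIFO when VSync is requested and IMMEDIATE otherwise, if the surface supports it.
inline PresentMode choosePresentMode( const std::vector<PresentMode> &supported, bool vsync )
{
    const PresentMode wanted = vsync ? PRESENT_MODE_FIFO : PRESENT_MODE_IMMEDIATE;
    if( std::find( supported.begin(), supported.end(), wanted ) != supported.end() )
        return wanted;
    // FIFO is the one mode the Vulkan spec requires every implementation to support.
    return PRESENT_MODE_FIFO;
}
```

When FIFO is selected, the framerate is capped at the monitor's refresh rate, so an FPS comparison against another backend is only meaningful if both run with VSync off.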
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
Yes you are right. I have a 120hz screen. Forgot about that
-
- Bugbear
- Posts: 812
- Joined: Thu Dec 09, 2004 2:51 am
- x 42
Re: [2.3] Vulkan Progress
Thanks to both of you (dark_sylinc and Hotshot5000) for all your effort on the Vulkan renderer. I tried the 3 samples mentioned by dark_sylinc and they all worked, but the PBS sample crashed right away. Keep up the fantastic work, really appreciated.
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Yeah, for now you need to disable shadows for it to work (comment out setupCompositor() in PbsMaterials.cpp)
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
There is definitely a problem in VulkanQueue::releaseFence: I get a "cannot dereference value-initialized map/set iterator" at line 864. Pushed a fix. Now it seems stable but eats 1 GB RAM. Probably because everything is preallocated.
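For readers unfamiliar with that message: it comes from MSVC's checked iterators when an iterator that does not refer to an element is dereferenced. The thread does not show the actual fix, so the following is only a generic sketch of the defensive pattern, with hypothetical names (`FenceHandle`, `FenceRefMap`, `releaseFenceRef`) that are not taken from Ogre.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for a fence handle and its reference count.
using FenceHandle = int;
using FenceRefMap = std::map<FenceHandle, unsigned>;

// Drops one reference to the fence. Returns false when the fence is not tracked.
inline bool releaseFenceRef( FenceRefMap &refs, FenceHandle fence )
{
    FenceRefMap::iterator itor = refs.find( fence );
    if( itor == refs.end() )
        return false; // never dereference an iterator that does not point at an element
    assert( itor->second > 0u );
    if( --itor->second == 0u )
        refs.erase( itor ); // last reference gone, stop tracking it
    return true;
}
```

On GCC, building with -D_GLIBCXX_DEBUG enables libstdc++'s debug mode, which catches similar iterator misuse at runtime.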
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Oops! Nice catch!
I should turn on the debug iterators in GCC.
Btw your git username in this new machine is "DESKTOP-MQ6E5NQ\sebas"
Hotshot5000 wrote: ↑Wed Aug 05, 2020 7:18 pm Now it seems stable but eats 1 GB RAM. Probably because everything is preallocated.
That seems excessive, but I wouldn't rule out that our preallocation spirals out of control for some reason.
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
Yeah I saw my new username... I changed it to my name
EDIT: At first glance it looks like a lot of the heap is taken by the vulkan validation layer objects.
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Btw how are you measuring GPU consumption?
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
Holy moly!
I did NOT expect this sample to start working so fast! Why? Because for this one to work layout transitions would have to work.
But for that we have the compositor which follows a special path if RSC_EXPLICIT_API is defined. However this path has never been tested because we never supported D3D12 nor Vulkan.
The best I had was Mesa's Radeon driver, which would show blatant race conditions when we missed a barrier when launching compute shaders (GL's barriers are much simpler than Vulkan's), and that was the best thing we had for "debugging" this system.
I basically maintained the RSC_EXPLICIT_API path blind. When I started writing it, the Vulkan spec was a draft and I didn't understand barriers too well. But it turns out it works quite closely to how Vulkan actually works.
And after fixing a couple design issues with this blind-coded barrier system, it worked!
To get this sample to work, you edit ShadowMapDebugging.compositor and comment out "shadow_map_target_type point", because Ogre/DPSM/CubeToDpsm hasn't been ported yet.
Also the sample will try to access some compute shaders at the beginning for ESM. That needs to be commented out too.
-
- OGRE Contributor
- Posts: 237
- Joined: Thu Oct 14, 2010 12:30 pm
- x 58
Re: [2.3] Vulkan Progress
I don't. That 1 GB was system RAM, as reported by Visual Studio's Diagnostic Tools. I just looked at the Process Memory graph.
-
- OGRE Team Member
- Posts: 5446
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1348
Re: [2.3] Vulkan Progress
PBS Materials now works and looks as it should!