[News] Vulkan Progress Report
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Vulkan resembles D3D11 and Metal more than it does OpenGL. Don't let its use of the GLSL language fool you.
Ogre, Vulkan, D3D11 and Metal use the same coordinate convention for viewports. TextureGpu::requiresTextureFlipping should always return false in Vulkan.
- Hotshot5000
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report

Finally it's rendering the box and the text correctly.
Next steps:
1. Fix all the memory leaks: VBOs get allocated each frame for staging buffers instead of being reused, and BufferViews don't get destroyed. This generates around 1-2GB of leaks per second.

2. Add pipeline barriers as right now it's probably full of race conditions in the command buffers.
I remember 10+ years ago, when I got interested in graphics, I imagined the graphics API was something like Vulkan, since multi-core CPUs were getting popular, and I was rather unimpressed with OpenGL and its state machine. I found OpenGL awkward. But over the next years I got used to it, and now, coming to Vulkan, even though it's considerably better from a design point of view, I find Vulkan awkward.

Regarding point 1. I need to find a way to get notified when a BufferView has been used by the GPU so that it's safe to destroy (return to a pool?).
Another question would be: are there any plans for making the Ogre renderer more parallel in the way it's handling Cb* internal command buffers? From my limited understanding, in order to truly benefit from Vulkan you would need command buffers generated from multiple threads and per-thread queues. With the current design it looks like I am submitting work from only one thread. But this is something for the future, as right now getting everything working with a single thread looks like a challenge (at least to me) for Vulkan.

- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Nice! Great job!
Hotshot5000 wrote: ↑Thu Jul 23, 2020 6:34 pm
Next steps:
1. Fix all the memory leaks since vbos get allocated each frame for staging buffers instead of reusing. BufferViews also don't get destroyed. This means that it generates around 1-2GB of leaks per second
Remember that VaoManager already does this?

Hotshot5000 wrote: ↑Thu Jul 23, 2020 6:34 pm
Regarding point 1. I need to find a way to get notified when a BufferView has been used by the GPU so that it's safe to destroy (return to a pool?).
What kind of BufferView? (used for what?). It is a red herring that BufferViews need to be destroyed since only dynamic buffers (and buffers being destroyed) would need that.

Hotshot5000 wrote: ↑Thu Jul 23, 2020 6:34 pm
Another question would be: are there any plans for making the Ogre renderer more parallel in the way it's handling Cb* internal command buffers?
Not really, given that we're usually GPU bound. Also OpenGL compatibility really makes this hard.
To make it more parallel, we'd need to preallocate memory and somehow fall back if the preallocated memory is not enough (restart rendering with bigger buffers? fall back to a single thread?).
Then HlmsPbs::fillBuffersFor would also need a few tweaks to make it parallel.
And last but not least Hlms::getMaterial right now has only one cache where all PSOs are stored, but it would need to have two caches: a read-only cache with PSOs that all threads can access concurrently; and if the entry is not located in the read-only cache, then a second R/W cache protected by a mutex needs to be accessed to create the entry and share it with the other threads.
Once rendering is over, the R/W cache is moved to the Read-only cache
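The two-cache idea described above could be sketched roughly as follows. This is only a minimal illustration, not Ogre's actual code: `Pso`, `TwoLevelPsoCache`, `getOrCreate` and `endFrame` are hypothetical names, and the "PSO" is just a struct holding its hash. It assumes the read-only map is never modified while rendering threads are running.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a compiled pipeline state object.
struct Pso
{
    uint32_t hash;
};

class TwoLevelPsoCache
{
    // Never written while rendering, so threads may read it without locking.
    std::unordered_map<uint32_t, Pso> mReadOnly;
    // Entries created during the current frame; guarded by mMutex.
    std::unordered_map<uint32_t, Pso> mReadWrite;
    std::mutex mMutex;

public:
    // May be called from several rendering threads at once.
    Pso getOrCreate( uint32_t hash )
    {
        auto itor = mReadOnly.find( hash );
        if( itor != mReadOnly.end() )
            return itor->second;  // fast path: no lock taken

        std::lock_guard<std::mutex> lock( mMutex );
        // emplace() leaves the map untouched if another thread already added the entry.
        auto inserted = mReadWrite.emplace( hash, Pso{ hash } );
        assert( inserted.first->second.hash == hash );
        return inserted.first->second;
    }

    // Must be called from a single thread once all rendering threads are done.
    void endFrame()
    {
        mReadOnly.insert( mReadWrite.begin(), mReadWrite.end() );
        mReadWrite.clear();
    }

    std::size_t numReadOnly() const { return mReadOnly.size(); }
    std::size_t numReadWrite() const { return mReadWrite.size(); }
};
```

Entries are returned by value here so that moving them between the two maps in `endFrame()` cannot invalidate anything a caller is still holding.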
TL;DR: It's possible but a lot of work with little benefit right now.
Btw do you have your repo online? I'm interested in taking a look at what you're doing.
- Hotshot5000
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
Sure! It's no state secret
https://github.com/Hotshot5000/ogre-nex ... otshot5000
I have disabled VaoManager's approach of preallocating a big buffer and using offsets into it because I have no synchronization yet and I wanted to make sure the allocated buffers are clean. BufferViews are used in OgreVulkanTexBufferPacked for making worldMatBuf and the like visible to the GPU.
In the end I just hacked away some things in order to get something to the screen. Another example would be the static CommandPool used for the secondary command buffer created for OgreVulkanStagingTexture. There is still a lot of work to be done here.
EDIT: Looks like RenderDoc 1.9 released with shader debugging for Vulkan. Yaaayy!


- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Do I have write access to your repo? Are you ok with me pushing to your repo?
There are a few improvements and fixes I wish to push. Your work has renewed my interest

- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
It seems you forgot to push OgreVulkanHlmsPso.cpp because VulkanHlmsPso's constructor and destructors are not defined anywhere.
I tried giving it the obvious one:
Code: Select all
VulkanHlmsPso( VkPipeline _pso, VulkanProgram *_vertexShader, VulkanProgram *_pixelShader,
               const DescriptorSetLayoutBindingArray &_descriptorLayoutBindingSets,
               const DescriptorSetLayoutArray &_descriptorSets, VkPipelineLayout _layout ) :
    pso( _pso ),
    vertexShader( _vertexShader ),
    pixelShader( _pixelShader ),
    descriptorLayoutBindingSets( _descriptorLayoutBindingSets ),
    descriptorSets( _descriptorSets ),
    pipelineLayout( _layout )
{
}
~VulkanHlmsPso() {}
But it just crashes in
Code: Select all
1 std::vector<VkDescriptorSetLayoutBinding>::begin stl_vector.h 573 0x7fffa9702bbf
2 Ogre::VulkanDescriptorPool::VulkanDescriptorPool OgreVulkanDescriptorPool.cpp 42 0x7fffa9701eaa
3 Ogre::VulkanRenderSystem::bindDescriptorSet OgreVulkanRenderSystem.cpp 1232 0x7fffa974482f
4 Ogre::VulkanRenderSystem::_renderEmulated OgreVulkanRenderSystem.cpp 1277 0x7fffa9744dc9
5 Ogre::CommandBuffer::execute_drawCallIndexedEmulated OgreCbDrawCall.cpp 80 0x7ffff6cf7db4
6 Ogre::CommandBuffer::execute OgreCommandBuffer.cpp 103 0x7ffff6cf9393
7 Ogre::RenderQueue::render OgreRenderQueue.cpp 464 0x7ffff6abcdf7
8 Ogre::SceneManager::_renderPhase02 OgreSceneManager.cpp 1454 0x7ffff6b35c59
9 Ogre::Camera::_renderScenePhase02 OgreCamera.cpp 404 0x7ffff68113f0
10 Ogre::Viewport::_updateRenderPhase02 OgreViewport.cpp 272 0x7ffff6cb166f
11 Ogre::CompositorPassScene::execute OgreCompositorPassScene.cpp 222 0x7ffff6d51186
12 Ogre::CompositorNode::_update OgreCompositorNode.cpp 890 0x7ffff6d0de0f
13 Ogre::CompositorWorkspace::_update OgreCompositorWorkspace.cpp 791 0x7ffff6d298f0
14 Ogre::CompositorManager2::_updateImplementation OgreCompositorManager2.cpp 724 0x7ffff6cfcba5
15 Ogre::RenderSystem::updateCompositorManager OgreRenderSystem.cpp 1149 0x7ffff6ac8a2a
16 Ogre::CompositorManager2::_update OgreCompositorManager2.cpp 652 0x7ffff6cfc81c
17 Ogre::Root::_updateAllRenderTargets OgreRoot.cpp 1528 0x7ffff6b20a50
18 Ogre::Root::renderOneFrame OgreRoot.cpp 1104 0x7ffff6b1e670
19 Demo::GraphicsSystem::update GraphicsSystem.cpp 362 0x55555558d701
20 Demo::MainEntryPoints::mainAppSingleThreaded MainLoopSingleThreaded.cpp 118 0x55555559ca48
... <More>
pso->descriptorLayoutSets is empty, so pso->descriptorLayoutSets[0] leads to a bad path
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Btw I see that often you're directly talking to Vulkan:
Code: Select all
// VulkanBufferInterface * mBufferInterface;
VkDeviceMemory mDeviceMemory;
VkBuffer mVboName;
I believe that's temporary just to get things going? Because in Vulkan there's a limited number of buffers you can create, and there are restrictions on how data can be mapped.
The idea was to share buffers, and have VulkanBufferInterface map the whole buffer while automatically managing mapped subregions.
- Hotshot5000
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
dark_sylinc wrote: ↑Fri Jul 24, 2020 6:28 am
Btw I see that often you're directly talking to Vulkan:
Code: Select all
// VulkanBufferInterface * mBufferInterface;
VkDeviceMemory mDeviceMemory;
VkBuffer mVboName;
I believe that's temporary just to get things going? Because in Vulkan there's a limited number of buffers you can create, and there are restrictions on how data can be mapped.
The idea was to share buffers, and have VulkanBufferInterface map the whole buffer while automatically managing mapped subregions.

Yes it's temporary. There will be few allocated buffers and all 'allocations' will be offsets. Also I will add VulkanHlmsPso to the repo.
- Hotshot5000
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
I added VulkanHlmsPso.cpp to the online repo. You are more than welcome to push changes on this repo. I will check if I need to give you some sort of access rights.
EDIT: I've sent you an invitation as collaborator. I think you have direct access to the repo now.
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Thanks!!!
It works! I've pushed a couple of fixes to get it running on my setup, plus a fix for a barrier bug.
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
I'm trying to move your changes back into the official repo. I'm basically keeping what I like, and throwing away what I dislike or think can be improved (and some stuff is not yet transferred due to complexity).
There's also the thing about buffer barriers which I thought was worth mentioning:
In VulkanStagingBuffer::unmapImpl you issue a per buffer barrier. This is not wrong but it can be improved (I'm already writing the code):
- We don't actually need to fill a VkBufferMemoryBarrier per buffer (i.e. we can issue one barrier for all buffer transfers, such as a global VkMemoryBarrier, instead of one per buffer) unless we are doing cross-queue or cross-device transactions (e.g. SLI/Crossfire/mixed multi-GPU)
- To achieve the above, I like Metal's approach: Metal divides its tasks into 3 encoders:
- Graphics
- Blitting (any copy operation, other misc stuff)
- Compute
But if we do it explicitly like Metal does (i.e. mimic Metal's behavior), then closing the blitting encoder is the time to perform a single buffer barrier. There is no need to perform one barrier every time VulkanStagingBuffer::unmapImpl gets called; we only perform one when the blitting encoder is closed (i.e. to open up the graphics rendering encoder).
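To make the "one barrier when the blit encoder closes" idea concrete, here is a toy model. It is an assumption-laden sketch rather than the real backend: `CopyEncoderModel` and its methods are hypothetical, and it only counts barriers where a real implementation would record vkCmdCopyBuffer and vkCmdPipelineBarrier.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a queue that batches copy barriers.
class CopyEncoderModel
{
    bool mCopyEncoderOpen = false;
    uint32_t mNumBarriersIssued = 0u;

public:
    // Every copy implicitly opens the copy ("blit") encoder; no barrier is issued here.
    void copyBuffer()
    {
        mCopyEncoderOpen = true;  // a real backend would record vkCmdCopyBuffer here
    }

    // Closing the copy encoder issues exactly one barrier covering all previous copies.
    void endCopyEncoder()
    {
        if( !mCopyEncoderOpen )
            return;  // nothing was copied, so nothing to synchronize
        ++mNumBarriersIssued;  // a real backend would record vkCmdPipelineBarrier here
        mCopyEncoderOpen = false;
    }

    // Switching to rendering must first close any open copy encoder.
    void beginRenderEncoder() { endCopyEncoder(); }

    uint32_t getNumBarriersIssued() const { return mNumBarriersIssued; }
};
```

With this shape, three calls to `copyBuffer()` followed by `beginRenderEncoder()` result in a single barrier, and opening a render encoder with no pending copies results in none.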
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
So I pushed several commits to the Vulkan branch incorporating your changes. I'm not done yet, but I wanted to explain them.
You'd probably want to take the main commit and copy-paste it over a clean checkout of your branch to see the differences (the next commits are just merges with master).
Your work is helping me A TON. Vulkan is very complex. My approach on Ogre is to think for weeks about what I'm going to do, lay out a mental plan of how the components are going to interact and the repercussions on existing code, and then start writing the code.
But Vulkan is so complex I kept getting overwhelmed. Too many things to track down that branch into many potential use case/scenarios we have to consider.
I rewrote VulkanQueue::getCopyEncoder like 3 times because I kept missing use cases and leaving potential hazards. What if someone downloads data from GPU, uploads data to that region, and downloads it again? (without even using it). It would be a stupid usage pattern but it's not impossible. And if it happens, it has to work. Our code doesn't do that (that I'm aware of) but our users could.
And all that just to handle potential race conditions when copying memory! I didn't even touch code related to actual rendering yet.
So I rewrote the code to take that into account and then realized it was a good thing I did, because my old version would initiate uploads before the GPU was done using that data. Now I think I finally accounted for everything. And if I didn't, then we'll fix those edge cases when we get them. The current code should be robust enough.
Your approach was pretty much the opposite of mine: try to get it running ASAP, even if that means leaking 1GB of RAM per second and iterating through all of our buffers every frame to rebuild the descriptors.
Yeah that sucks. But you're coding on Windows with an NVIDIA GPU (correct me if I'm wrong), and I got the same image as you, while working on Linux with AMD's RADV driver. When it comes to Vulkan, that kind of cross-OS, cross-vendor compatibility is a major achievement! Congratulations! (seriously. No sarcasm)
Thus, while we obviously can't deploy that kind of brute force (leaks, and iterating through all of our buffers every frame) in the final version, having a working version is of tremendous help: I can see what worked and what did not. I can also test other stuff to see if the validation layers will complain.
I no longer get mentally blocked by all the potential paths because I can try them out and probe which approaches look more promising.
Now onto the code itself:
I left out most of the descriptor stuff (shaders/bindings are not yet working in my branch) because the way you're doing it is too brute force. But it is giving me some ideas.
I think that it is clear that I should start with a similar approach of rebuilding everything at first, and later phase it out for a version that can reuse descriptors.
I didn't keep findDepthFormat, so the Windows version probably won't compile, but you may notice that I did check in a different version of findSupportedFormat, which avoids the VkFormat -> PixelFormatGpu conversion. Thus writing a new findDepthFormat should be a piece of cake.
As I explained in a GitHub comment, in my experience the original version you wrote is a bad pattern because, as new code gets added, we eventually fall into the trap of code doing PixelFormatGpu -> VkFormat -> PixelFormatGpu, and this eventually breaks. Either because the conversion isn't lossless, or because we end up with two pointers having different PixelFormatGpu but the same VkFormat, or vice versa. Or the user caches getPixelFormat() only to see it changed later, because the Vulkan backend silently changed the Ogre format. This can lead to all sorts of bugs (I asked for a lemon pie, I was handed the fork and then I got an ice cream instead).
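The round-trip hazard can be shown with a deliberately tiny example. The enums below are hypothetical stand-ins (they are not Ogre's PixelFormatGpu or Vulkan's VkFormat values); the only assumption is that two engine-side formats map to the same backend format, which is what makes the reverse conversion lossy.

```cpp
#include <cassert>

// Hypothetical engine-side and backend-side formats, for illustration only.
enum class EngineFormat
{
    Rgba8,
    Rgbx8  // same bit layout as Rgba8, but alpha is meant to be ignored
};
enum class BackendFormat
{
    R8G8B8A8
};

// Many-to-one: both engine formats use the same backend format.
inline BackendFormat toBackend( EngineFormat /*format*/ )
{
    return BackendFormat::R8G8B8A8;
}

// The reverse mapping has to pick one candidate, so information is lost.
inline EngineFormat toEngine( BackendFormat /*format*/ )
{
    return EngineFormat::Rgba8;
}

// Safer pattern: the engine-side format stays the single source of truth and the
// backend format is always derived from it, never converted back.
struct Texture
{
    EngineFormat format;
    BackendFormat backendFormat() const { return toBackend( format ); }
};
```

In this sketch `toEngine( toBackend( EngineFormat::Rgbx8 ) )` yields `Rgba8`, i.e. the format the user asked for silently changes, whereas `Texture` never performs the reverse conversion.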
I fixed the VaoManager's memory family selection. If the GPU doesn't support non-coherent memory but the user requests BT_DYNAMIC_PERSISTENT, we use coherent memory and pretend it is non-coherent. Same vice versa. That's how it was supposed to work.
The VaoManager is preferring non-cached (i.e. write-combined) memory for StagingBuffers even if we request them for downloading, which is wrong. Reading from write-combined memory is slow. We'll have to fix this (by adding a new VulkanVaoManager::VboFlag for cached reads and using that one when StagingBuffers for downloads are requested).
I also moved a lot of the map, flush, invalidate and unmap code to VulkanDynamicBuffer. It greatly simplified the other classes which were talking to Vulkan directly. Now VulkanDynamicBuffer takes care of rounding up to the next multiple of nonCoherentAtomSize.
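For readers unfamiliar with the nonCoherentAtomSize rule: when flushing or invalidating non-coherent memory, Vulkan requires the range's offset to be a multiple of nonCoherentAtomSize, and its size to be a multiple as well unless the range ends exactly at the end of the allocation. A minimal sketch of that rounding follows; `alignToNextMultiple`, `MappedRange` and `expandToAtomSize` are names made up for this example, not necessarily what VulkanDynamicBuffer uses.

```cpp
#include <cassert>
#include <cstddef>

// Rounds 'offset' up to the next multiple of 'alignment' (works for any non-zero alignment).
inline std::size_t alignToNextMultiple( std::size_t offset, std::size_t alignment )
{
    assert( alignment > 0u );
    return ( ( offset + alignment - 1u ) / alignment ) * alignment;
}

struct MappedRange
{
    std::size_t offset;
    std::size_t size;
};

// Expands [offset; offset + size) so that the start is rounded down and the end is rounded
// up to a multiple of atomSize. The end is clamped to the allocation size, which is the one
// case where the size is allowed not to be a multiple of atomSize.
inline MappedRange expandToAtomSize( std::size_t offset, std::size_t size, std::size_t atomSize,
                                     std::size_t allocationSize )
{
    assert( atomSize > 0u );
    assert( offset + size <= allocationSize );
    const std::size_t start = ( offset / atomSize ) * atomSize;
    std::size_t end = alignToNextMultiple( offset + size, atomSize );
    if( end > allocationSize )
        end = allocationSize;
    return MappedRange{ start, end - start };
}
```

For example, with an atom size of 64, a request to flush 10 bytes at offset 70 becomes a flush of 64 bytes at offset 64.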
I also added VulkanQueue::getCopyEncoder, although we never yet call any of the end*Encoder() functions (see Metal, look for endBlitEncoder calls and beginRenderEncoder/ComputeEncoder ones).
The beauty of getCopyEncoder is that it issues between 0 and a few memory barriers when getCopyEncoder gets called, and then one final memory barrier at the end when endCopyEncoder() is called.
Your approach was to issue two memory barriers for every vkCmdCopy* call we do. This works, but it is very suboptimal.
I wrapped some bits that I intend to be temporary under "VULKAN_HOTSHOT_WILL_REMOVE" ifdefs, while under "VULKAN_HOTSHOT_DISABLED" I've wrapped code that is not compiled in but that I wanted to keep to see how it works.
There are other classes such as VulkanTextureGpu that I didn't yet fully review.
Tomorrow I will keep working on this and see where that leads us.
Keep up doing the good stuff.
Cheers
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Quick Update:
You implemented VulkanTextureGpu like this:
Code: Select all
VkImageView mDisplayTextureName;
VkImage mFinalTextureName;
VkImageView mFinalTextureNameView;
You confused views with the image. If we look at D3D11's implementation, where ID3D11ShaderResourceView = VkImageView and ID3D11Resource = VkImage it becomes obvious:
Code: Select all
ComPtr<ID3D11ShaderResourceView> mDefaultDisplaySrv;
ID3D11Resource *mDisplayTextureName; // It's not wrapped by ComPtr<> because it's a weak reference
ComPtr<ID3D11Resource> mFinalTextureName;
- Hotshot5000
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
Yeah, I went through something similar at the start: I was reading and testing samples, trying to understand how Vulkan works and how it can be added to Ogre. I spent around one month doing that and realized at some point that I was not actually any closer to getting anything done, so I said "To hell with it" and just started implementing the Vulkan backend in Ogre, totally disregarding practices Vulkan requires, like barriers and all of that. The only thing that I wanted was to get the simplest sample working (which I kind of did) and then go back to fix and plan and actually be thoughtful about what I was doing. I just wanted to get out of analysis paralysis.
That is why most of the code is rather garbage, both from a Vulkan standpoint and from a general coding practices standpoint. So you will find duplicate code, and code copy-pasted from samples + Stack Overflow that doesn't follow the Ogre coding guidelines, etc.
So I will port your changes from 'Incorporated many of Sebastian's improvements' to the vulkan branch and start looking into the VulkanTextureGpu issue with image and imageview.
EDIT: I am looking into the diff between the vulkan branch and the Improvements commit and I see that in OgreHlmsPbs.cpp you removed the
Code: Select all
if( mShaderProfile == "hlsl" || mShaderProfile == "metal" || mShaderProfile == "glsl-vulkan" )
from
Code: Select all
HlmsCache* HlmsPbs::createShaderCacheEntry
I don't think it's gonna work without that. The shader constant names haven't been parsed yet from the shader, so you will get an exception when setting the worldMatBuf.
EDIT2: It looks like there might be some problem with the encoder in VulkanQueue.
EDIT3: I have pushed the changes containing the improvements to my forked repo.
Validation layer output for EDIT2:
Code: Select all
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkDescriptorImageInfo-imageLayout-00344 ] Object: 0x1a37fc5bba0 (Type = 6) | vkCmdDraw(): Cannot use VkImage 0x4bad500000000075[DebugFont/Texture] (layer=0 mip=0) with specific layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL that doesn't match the previously used layout VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL. The Vulkan spec states: imageLayout must match the actual VkImageLayout of each subresource accessible from imageView at the time this descriptor is accessed as defined by the image layout matching rules (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-VkDescriptorImageInfo-imageLayout-00344)
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-DescriptorSetNotUpdated ] Object: 0xd6062500000000a0 (Type = 23) | VkDescriptorSet 0xd6062500000000a0[] bound as set #0 encountered the following validation error at vkCmdDraw() time: Image layout specified at vkUpdateDescriptorSet* or vkCmdPushDescriptorSet* time doesn't match actual image layout at time descriptor is used. See previous error callback for specific details.
Ogre: v1 * _render: CbRenderOp
Ogre: v1 * _render: CbDrawCallStrip
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkDescriptorImageInfo-imageLayout-00344 ] Object: 0x1a37fc5bba0 (Type = 6) | vkCmdDraw(): Cannot use VkImage 0x4bad500000000075[DebugFont/Texture] (layer=0 mip=0) with specific layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL that doesn't match the previously used layout VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL. The Vulkan spec states: imageLayout must match the actual VkImageLayout of each subresource accessible from imageView at the time this descriptor is accessed as defined by the image layout matching rules (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-VkDescriptorImageInfo-imageLayout-00344)
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-DescriptorSetNotUpdated ] Object: 0xd6062500000000a0 (Type = 23) | VkDescriptorSet 0xd6062500000000a0[] bound as set #0 encountered the following validation error at vkCmdDraw() time: Image layout specified at vkUpdateDescriptorSet* or vkCmdPushDescriptorSet* time doesn't match actual image layout at time descriptor is used. See previous error callback for specific details.
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkImageMemoryBarrier-newLayout-01198 ] Object: 0x1a37fc5bba0 (Type = 6) | vkCmdPipelineBarrier(): Image Layout cannot be transitioned to UNDEFINED or PREINITIALIZED. The Vulkan spec states: newLayout must not be VK_IMAGE_LAYOUT_UNDEFINED or VK_IMAGE_LAYOUT_PREINITIALIZED (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-VkImageMemoryBarrier-newLayout-01198)
Ogre: VulkanQueue::commitAndNextCommandBuffer
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object: 0x1a37fc5bba0 (Type = 6) | Submitted command buffer expects VkImage 0x4bad500000000075[DebugFont/Texture] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_UNDEFINED.
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-QueueForwardProgress ] Object: VK_NULL_HANDLE (Type = 6) | VkQueue 0x1a37f9af980[] is waiting on VkSemaphore 0x610d0c00000000a9[] that has no way to be signaled.
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkPresentInfoKHR-pImageIndices-01296 ] Object: 0x1a37f9af980 (Type = 4) | Images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout at the time the operation is executed on a VkDevice (https://github.com/KhronosGroup/Vulkan-Docs/search?q=VUID-VkPresentInfoKHR-pImageIndices-01296)
Ogre: [VulkanWindow::swapBuffers] vkQueuePresentKHR: error presenting VkResult = VK_ERROR_VALIDATION_FAILED_EXT
Ogre: flushUAVs
Ogre: * _setPipelineStateObject: pso
Ogre: * _setVertexArrayObject: vaoName 1
Ogre: * _setIndirectBuffer: 5
Ogre: * _renderEmulated: CbDrawCallIndexed 1
Ogre: * _setPipelineStateObject: pso
Ogre: _setTextures DescriptorSetTexture
Ogre: _setSamplers
Ogre: v1 * _render: CbRenderOp
Ogre: v1 * _render: CbDrawCallStrip
Ogre: v1 * _render: CbRenderOp
Ogre: v1 * _render: CbDrawCallStrip
Ogre: VulkanQueue::commitAndNextCommandBuffer
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object: 0x1a3140a3a50 (Type = 6) | Submitted command buffer expects VkImage 0x4bad500000000075[DebugFont/Texture] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_UNDEFINED.
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-QueueForwardProgress ] Object: VK_NULL_HANDLE (Type = 6) | VkQueue 0x1a37f9af980[] is waiting on VkSemaphore 0x811ca600000000c3[] that has no way to be signaled.
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkPresentInfoKHR-pImageIndices-01296 ] Object: 0x1a37f9af980 (Type = 4) | Images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout at the time the operation is executed on a VkDevice (https://github.com/KhronosGroup/Vulkan-Docs/search?q=VUID-VkPresentInfoKHR-pImageIndices-01296)
Ogre: [VulkanWindow::swapBuffers] vkQueuePresentKHR: error presenting VkResult = VK_ERROR_VALIDATION_FAILED_EXT
Ogre: flushUAVs
Ogre: * _setPipelineStateObject: pso
Ogre: * _setVertexArrayObject: vaoName 1
Ogre: * _setIndirectBuffer: 5
Ogre: * _renderEmulated: CbDrawCallIndexed 1
Ogre: * _setPipelineStateObject: pso
Ogre: _setTextures DescriptorSetTexture
Ogre: _setSamplers
Ogre: v1 * _render: CbRenderOp
Ogre: v1 * _render: CbDrawCallStrip
Ogre: v1 * _render: CbRenderOp
Ogre: v1 * _render: CbDrawCallStrip
Ogre: VulkanQueue::commitAndNextCommandBuffer
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-QueueForwardProgress ] Object: 0x40fd720000000017 (Type = 5) | VkQueue 0x1a37f9af980[] is signaling VkSemaphore 0x40fd720000000017[] that was previously signaled by VkQueue 0x0[] but has not since been waited on by any queue.
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object: 0x1a3140a4730 (Type = 6) | Submitted command buffer expects VkImage 0x4bad500000000075[DebugFont/Texture] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_UNDEFINED.
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkPresentInfoKHR-pImageIndices-01296 ] Object: 0x1a37f9af980 (Type = 4) | Images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout at the time the operation is executed on a VkDevice (https://github.com/KhronosGroup/Vulkan-Docs/search?q=VUID-VkPresentInfoKHR-pImageIndices-01296)
Ogre: [VulkanWindow::swapBuffers] vkQueuePresentKHR: error presenting VkResult = VK_ERROR_VALIDATION_FAILED_EXT
Ogre: ERROR: [Validation] Code 0 : [ VUID-vkAcquireNextImageKHR-swapchain-01802 ] Object: 0xed70170000000012 (Type = 27) | vkAcquireNextImageKHR: Application has already previously acquired 3 images from swapchain. Only 3 are available to be acquired using a timeout of UINT64_MAX (given the swapchain has 4, and VkSurfaceCapabilitiesKHR::minImageCount is 2). The Vulkan spec states: If the number of currently acquired images is greater than the difference between the number of images in swapchain and the value of VkSurfaceCapabilitiesKHR::minImageCount as returned by a call to vkGetPhysicalDeviceSurfaceCapabilities2KHR with the surface used to create swapchain, timeout must not be UINT64_MAX (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-vkAcquireNextImageKHR-swapchain-01802)
Ogre: [VulkanWindow::acquireNextSwapchain] vkAcquireNextImageKHR failed VkResult = VK_ERROR_VALIDATION_FAILED_EXT
EDIT3: I have pushed the changes containing the improvements to my forked repo.
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Hotshot5000 wrote: ↑Sun Jul 26, 2020 8:11 pm "To hell with it" and just started implementing the Vulkan backend in Ogre totally disregarding Vulkan needed practices like barriers and all of that.
Very well summarized.

Hotshot5000 wrote: ↑Sun Jul 26, 2020 8:11 pm So I will port your changes from 'Incorporated many of Sebastian's improvements' to the vulkan branch and start looking into the VulkanTextureGpu issue with image and imageview.
Note that today I pushed a fix for that. (I can't fully test it yet because I can't bind it to shaders yet.)
Hotshot5000 wrote: ↑Sun Jul 26, 2020 8:11 pm EDIT: I am looking into the diff between the vulkan branch and the Improvements commit and I see that in OgreHlmsPbs.cpp you removed the
Code: Select all
if( mShaderProfile == "hlsl" || mShaderProfile == "metal" || mShaderProfile == "glsl-vulkan" )
from
Code: Select all
HlmsCache* HlmsPbs::createShaderCacheEntry
It's not that I removed it, but rather I never added it.

I only touched RenderSystem/Vulkan files, so I did not review or incorporate anything outside them.
I am thinking about what we should do about GLSL variants. The final version should probably be the same for both OpenGL and Vulkan, with a series of macros that let a GLSL shader compile in both backends without modifications.
That depends on the cost/benefit of each approach (having two duplicates, one for OpenGL and another for Vulkan, vs sharing the same source with minor ifdefs).
Hotshot5000 wrote: ↑Sun Jul 26, 2020 8:11 pm EDIT2:
Code: Select all
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkDescriptorImageInfo-imageLayout-00344 ] Object: 0x1a37fc5bba0 (Type = 6) | vkCmdDraw(): Cannot use VkImage 0x4bad500000000075[DebugFont/Texture] (layer=0 mip=0) with specific layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL that doesn't match the previously used layout VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL. The Vulkan spec states: imageLayout must match the actual VkImageLayout of each subresource accessible from imageView at the time this descriptor is accessed as defined by the image layout matching rules (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-VkDescriptorImageInfo-imageLayout-00344) ..
There was a bug where imgBarrierCount was always 0; that's probably what you encountered. The fix has been pushed.
Btw I got render_quad passes working (v1 buffers submit the input vertices to the shader) and I am very happy about that!
The following vertex shader actually works:
Code: Select all
#version 450
layout(location = 0) in vec3 position;
layout(location = 7) in vec2 inUv0;
layout(location = 0) out vec2 uv0;
void main()
{
    gl_Position = vec4( position.xy, 0.0f, 1.0f );
    uv0 = inUv0.xy;
}
-
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
Hmm. I tried your fix from here and I still get
Code: Select all
Ogre: VulkanQueue::commitAndNextCommandBuffer
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-QueueForwardProgress ] Object: 0x2eeb6f00000000aa (Type = 5) | VkQueue 0x2436999b370[] is signaling VkSemaphore 0x2eeb6f00000000aa[] that was previously signaled by VkQueue 0x0[] but has not since been waited on by any queue.
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object: 0x24369b944f0 (Type = 6) | Submitted command buffer expects VkImage 0x4bad500000000075[DebugFont/Texture] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL.
Ogre: ERROR: [Validation] Code 0 : [ VUID-VkPresentInfoKHR-pImageIndices-01296 ] Object: 0x2436999b370 (Type = 4) | Images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR or VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR but is in VK_IMAGE_LAYOUT_UNDEFINED. The Vulkan spec states: Each element of pImageIndices must be the index of a presentable image acquired from the swapchain specified by the corresponding element of the pSwapchains array, and the presented image subresource must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout at the time the operation is executed on a VkDevice (https://github.com/KhronosGroup/Vulkan-Docs/search?q=VUID-VkPresentInfoKHR-pImageIndices-01296)
Ogre: [VulkanWindow::swapBuffers] vkQueuePresentKHR: error presenting VkResult = VK_ERROR_VALIDATION_FAILED_EXT
Ogre: ERROR: [Validation] Code 0 : [ VUID-vkAcquireNextImageKHR-swapchain-01802 ] Object: 0xed70170000000012 (Type = 27) | vkAcquireNextImageKHR: Application has already previously acquired 3 images from swapchain. Only 3 are available to be acquired using a timeout of UINT64_MAX (given the swapchain has 4, and VkSurfaceCapabilitiesKHR::minImageCount is 2). The Vulkan spec states: If the number of currently acquired images is greater than the difference between the number of images in swapchain and the value of VkSurfaceCapabilitiesKHR::minImageCount as returned by a call to vkGetPhysicalDeviceSurfaceCapabilities2KHR with the surface used to create swapchain, timeout must not be UINT64_MAX (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-vkAcquireNextImageKHR-swapchain-01802)
Ogre: [VulkanWindow::acquireNextSwapchain] vkAcquireNextImageKHR failed VkResult = VK_ERROR_VALIDATION_FAILED_EXT
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Hotshot5000 wrote: ↑Mon Jul 27, 2020 7:01 pm
Code: Select all
Ogre: ERROR: [Validation] Code 0 : [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object: 0x24369b944f0 (Type = 6) | Submitted command buffer expects VkImage 0x4bad500000000075[DebugFont/Texture] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL.
That error means endCopyEncoder was called too late. That makes sense, because in Metal executeRenderPassDescriptorDelayedActions ends all active encoders to make way for a new render encoder (see wherever renderCommandEncoderWithDescriptor gets called). Vulkan should be doing the same but currently is not.
Wherever Metal is doing renderCommandEncoderWithDescriptor, we should likely call getRenderEncoder.
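To illustrate the "end all active encoders before starting a render encoder" rule, here is a minimal C++ sketch. It is not Ogre's actual code: EncoderTracker, EncoderState, getCopyEncoder and the log strings are hypothetical names used only for illustration, and the real backend would record Vulkan commands where this sketch only appends to a log.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the queue's encoder bookkeeping (not Ogre's real class).
enum class EncoderState { Closed, Copy, Compute, Render };

class EncoderTracker
{
    EncoderState mState = EncoderState::Closed;

public:
    // Records, in order, what would be emitted into the command buffer.
    std::vector<std::string> log;

    EncoderState state() const { return mState; }

    // Closes whichever encoder is currently open (if any).
    void endAllEncoders()
    {
        switch( mState )
        {
        case EncoderState::Copy:
            // Assumption: this is the point where pending layout transitions
            // (e.g. TRANSFER_DST -> SHADER_READ_ONLY) would be recorded.
            log.push_back( "endCopyEncoder" );
            break;
        case EncoderState::Compute:
            log.push_back( "endComputeEncoder" );
            break;
        case EncoderState::Render:
            log.push_back( "endRenderEncoder" );
            break;
        case EncoderState::Closed:
            break;
        }
        mState = EncoderState::Closed;
    }

    void getCopyEncoder()
    {
        if( mState != EncoderState::Copy )
        {
            endAllEncoders();
            log.push_back( "beginCopy" );
            mState = EncoderState::Copy;
        }
    }

    // Starting a render encoder first closes any copy/compute encoder,
    // so copy work is finished before the render pass reads the texture.
    void getRenderEncoder()
    {
        if( mState != EncoderState::Render )
        {
            endAllEncoders();
            log.push_back( "beginRender" );
            mState = EncoderState::Render;
        }
        assert( mState == EncoderState::Render );
    }
};
```

With this shape, a copy followed by a draw always records the copy encoder's end before the render pass begins, which is the ordering the validation layer is complaining about.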
-
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
I merged the 2-2-vulkan branch (which I saw you had merged from master) into my Ogre fork to get as up to date as possible. Quite a few things changed, so it looks like I may need to rerun CMake, as I don't have the UnitTests project in the VS solution.
EDIT: Fixed the linker issues. Now I am back in business.

Tell me what to focus on next so we don't end up working on the same things in the vulkan codebase, please.
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Hotshot5000 wrote: ↑Mon Jul 27, 2020 10:38 pm Tell me on what to focus next so we don't end up working on the same things in the vulkan codebase please.
I had something for you but I forgot! Argh!!!
Well, when it comes back I'll type it out here.
Things you can focus on (please try to isolate these changes in their own commits so I can easily cherry-pick them):

- createStagingBuffer & createStagingTexture call allocateVbo(), however we never call deallocateVbo when these staging buffers/textures get destroyed. We didn't have to before, because all the other APIs just relied on "delete stagingBuffer" doing its thing.
However the destructor may not be the appropriate place because VaoManager::~VaoManager deletes staging buffers. And we can't have ~VulkanStagingBuffer call VulkanVaoManager::deallocateVbo while we're already inside the base class' destructor.
A solution could be to move the destruction of staging buffers in ~VaoManager to its own function, and have the derived class call it instead (so that ~StagingBuffer can call VulkanVaoManager::deallocateVbo)
- Look in Metal, and see calls for getBlitEncoder/endBlitEncoder (blitEncoder == copyEncoder in Vulkan), getComputeEncoder/endComputeEncoder and endAllEncoders calls, and make sure Vulkan has similar calls
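For the staging buffer item above, here is a minimal C++ sketch of the proposed pattern: the base class exposes a function that deletes the staging buffers, and the derived class calls it from its own destructor, while virtual calls still reach the derived deallocateVbo. All class names (VaoManagerBase, VulkanLikeVaoManager, StagingBuffer as written here) are simplified, hypothetical stand-ins, not Ogre's real classes or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class VaoManagerBase;

// Simplified staging buffer that hands its VBO pool slot back on destruction.
class StagingBuffer
{
    VaoManagerBase *mOwner;
    std::size_t mVboPoolIdx;

public:
    StagingBuffer( VaoManagerBase *owner, std::size_t vboPoolIdx ) :
        mOwner( owner ), mVboPoolIdx( vboPoolIdx ) {}
    ~StagingBuffer();
};

class VaoManagerBase
{
protected:
    std::vector<StagingBuffer *> mStagingBuffers;

    // Must be called by the derived destructor, while the derived object
    // (and therefore its deallocateVbo override) is still alive.
    void deleteStagingBuffers()
    {
        for( StagingBuffer *buf : mStagingBuffers )
            delete buf;
        mStagingBuffers.clear();
    }

public:
    virtual ~VaoManagerBase()
    {
        // Deleting here would be too late: the derived part is already gone.
        assert( mStagingBuffers.empty() && "derived class forgot deleteStagingBuffers()" );
    }
    virtual void deallocateVbo( std::size_t vboPoolIdx ) = 0;
};

inline StagingBuffer::~StagingBuffer() { mOwner->deallocateVbo( mVboPoolIdx ); }

class VulkanLikeVaoManager final : public VaoManagerBase
{
    std::vector<std::size_t> &mFreed;  // records freed pool indices for the demo

public:
    explicit VulkanLikeVaoManager( std::vector<std::size_t> &freed ) : mFreed( freed ) {}
    ~VulkanLikeVaoManager() override { deleteStagingBuffers(); }

    void createStagingBuffer( std::size_t vboPoolIdx )
    {
        mStagingBuffers.push_back( new StagingBuffer( this, vboPoolIdx ) );
    }
    void deallocateVbo( std::size_t vboPoolIdx ) override { mFreed.push_back( vboPoolIdx ); }
};
```

The assert in the base destructor is only a guard for this sketch: it catches a derived class that forgets to call deleteStagingBuffers() before the base destructor runs.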
Our approach to binding resources to shaders (textures, buffers, etc) is flawed.
VulkanDescriptors::generateAndMergeDescriptorSets is something that I probably shouldn't have written. But it was probably necessary for me to understand it.
Watching your code and reading Writing an efficient Vulkan renderer made me realize we got it all backwards.
Our current approach looks at what the shader is using, generates a descriptor layout out of it; and tries to merge all the shaders' descriptor layouts.
This is ok... but unless two PSOs have the exact same descriptor layout, we can't reuse descriptor pools. Thus we have to generate a lot of pools, waste a lot of memory, and swap lots of descriptors in and out every time something tiny changes.
My new approach is the reverse: rather than following what the shaders are using, tell the shaders what they should use.
Like zeux' post says, we could overestimate and create a layout for worst case: "one can allocate all pools with maxSets=1024, and pool sizes 16*1024 for texture descriptors and 8*1024 for buffer descriptors".
However like he says, that can cause a lot of waste.
But we can group shaders into classes and that's what we're going to do:
- Parse the shader (which is not valid GLSL yet) and see what slots are being used (how many texture slots, how many const buffer slots, how many tex buffer slots, etc)
- Round up the slots for each type to the nearest multiple of N (N is arbitrary, probably 2. Could be power of 2 instead of multiple?)
- Put all bindings in each set contiguously, e.g. all const buffers come in first, then all texture buffers, then all samplers, then all textures. Basically we don't interleave (i.e. const buffer at binding 0 then tex buffer at binding 1 then const buffer again at binding 3).
- Compile the shaders telling the shader the actual binding slots via macros
Internally VulkanRenderSystem keeps a table of the bound const/tex/uav buffers and textures, samplers and uav textures; and thus we know how to generate the descriptor layout by following these simple rules.
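The rounding and contiguous-placement rules above can be sketched in C++ as follows. This is only an illustration under stated assumptions: ResourceType, SetBindings, roundUpToMultiple and computeBindings are hypothetical names, the type order follows the "const buffers, texture buffers, samplers, textures" order from the list, and N is passed in as `multiple`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Order matters: bindings of each type are laid out contiguously in this order.
enum ResourceType : std::size_t
{
    ConstBuffer,
    TexBuffer,
    Sampler,
    Texture,
    NumResourceTypes
};

// Rounds value up to the nearest multiple of 'multiple' (0 stays 0).
inline std::uint32_t roundUpToMultiple( std::uint32_t value, std::uint32_t multiple )
{
    assert( multiple > 0 );
    return ( ( value + multiple - 1u ) / multiple ) * multiple;
}

struct SetBindings
{
    std::array<std::uint32_t, NumResourceTypes> firstBinding;  // first binding of each type
    std::array<std::uint32_t, NumResourceTypes> count;         // rounded-up slot count
    std::uint32_t total;                                       // bindings in the whole set
};

// slotsUsed[i] is how many slots of type i the parsed shader actually uses.
inline SetBindings computeBindings(
    const std::array<std::uint32_t, NumResourceTypes> &slotsUsed, std::uint32_t multiple )
{
    SetBindings result{};
    std::uint32_t next = 0;
    for( std::size_t i = 0; i < NumResourceTypes; ++i )
    {
        result.firstBinding[i] = next;
        result.count[i] = roundUpToMultiple( slotsUsed[i], multiple );
        next += result.count[i];
    }
    result.total = next;
    return result;
}
```

For example, a shader using 2 const buffers, 1 texture buffer, 0 samplers and 3 textures with N = 2 would get const buffers at bindings 0-1, texture buffers at 2-3, and textures at 4-7. Two shaders whose rounded counts match end up with the same layout, which is what makes sharing descriptor pools possible.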
Our GLSL derivative takes this concept from HLSL by grouping binding slots into letters. Thus before parsing it would look like this:
Code: Select all
layout( std140, ogre_set0, ogre_B0 ) uniform GlobalUniform {} globalUniform;
layout( ogre_set0, ogre_T0 ) uniform samplerBuffer texelBuffer; // T0 means texture buffer slot 0
layout( ogre_set0, ogre_t1 ) uniform sampler2D myTexture0; // t1 means texture slot 1
layout( ogre_set0, ogre_t2 ) uniform sampler2D myTexture1;
layout( ogre_set0, ogre_u0 ) uniform image2D myTexture1;
layout( std430, ogre_set0, ogre_U0 ) buffer MySsbo {};
layout( std140, ogre_set0, ogre_B1 ) uniform AnotherUnif {} anotherUnif;
// Set 1. Note that 't2' does not reset to 0
layout( ogre_set1, ogre_t2 ) uniform sampler2D anotherTex2;
After parsing it would look like this:
Code: Select all
layout( std140, 0, 0 ) uniform GlobalUniform {} globalUniform;
layout( 0, 2 ) uniform samplerBuffer texelBuffer;
layout( 0, 3 ) uniform sampler2D myTexture0;
layout( 0, 4 ) uniform sampler2D myTexture1;
layout( 0, 5 ) uniform image2D myTexture1;
layout( std430, 0, 6 ) buffer MySsbo {};
layout( std140, 0, 1 ) uniform AnotherUnif {} anotherUnif; // Note that const buffers are contiguous so this one comes first
// Set 1. Note that 't2' got remapped to 0 (we internally track this)
layout( 1, 0 ) uniform sampler2D anotherTex2;
Code: Select all
// You can have gaps. However you can't go back, e.g. if you have:
layout( ogre_set0, ogre_t0 ) uniform sampler2D myTex0;
layout( ogre_set0, ogre_t4 ) uniform sampler2D myTex4;
layout( ogre_set1, ogre_t3 ) uniform sampler2D myTex3; // Invalid, t3 must be in set0
This applies to all buffers, and textures bound individually via setTexture.
For textures bound via setTextures it gets easier, because we can put these textures into their own set and thus we never have to worry about rounding up or keeping multiple descriptors (??? needs further research).
I'm sure there will be changes because I could be wrong about some things, but I'm basically trying to arrive at a binding model that is actually useful for shipping.
TL;DR I will be reworking how we bind resources to shaders by forcing our shaders to follow certain rules, putting order into the flexibility of chaos that Vulkan allows. Which also helps us keep harmony with the other API backends in terms of binding slots since like zeux said, Vulkan's binding model doesn't really match the other APIs
-
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
Sounds good. I've read the "Writing an efficient vulkan renderer" article (actually I bought the whole GPU Zen 2 book for weekend reading).
As I see here t2 gets remapped, but does this mean that in other cases it might get mapped to say 2? Meaning we end up recompiling the same shader but with different layouts?

Code: Select all
// Set 1. Note that 't2' got remapped to 0 (we internally track this)
layout( 1, 0 ) uniform sampler2D anotherTex2;
Asking so that instead of a lot of descriptor pools we don't end up with a lot of similar shaders.
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Hotshot5000 wrote: ↑Tue Jul 28, 2020 7:02 pm
As I see here t2 gets remapped, but does this mean that in other cases it might get mapped to say 2? Meaning we end up recompiling the same shader but with different layouts?
Code: Select all
// Set 1. Note that 't2' got remapped to 0 (we internally track this)
layout( 1, 0 ) uniform sampler2D anotherTex2;
Asking so that instead of a lot of descriptor pools we don't end up with a lot of similar shaders.
This is solved. I slightly changed my approach today after I realized this approach gives trouble when linking vertex with pixel shaders. For example, if the vertex shader uses textures in range [0; 2) and the pixel shader uses range [2; 4), then when compiling shaders we need to be aware of this and compile both shaders using range [0; 4).
To solve this I... rediscovered D3D12's root layouts. We will basically force shaders to specify their root layouts. For simplicity for us I will use JSON. Thus if two shaders want to be linked together then they need to have matching root layouts:
Code: Select all
{
    "0" :
    {
        "samplers" : [4, 16],
        "textures" : [0, 16],
        "const_buffers" : [0, 16],
        "tex_buffers" : [0, 16],
        "uav_buffers" : [0, 16]
    }
}
I'm thinking root layouts can be specified inside the shader, or externally via code to speed up processing i.e. myShader->setRootLayout( alreadyExistingRootLayoutHandle );
It may be possible to relax compatibility rules for convenience (e.g. if vertex shader uses one const buffer and the pixel shader uses one const buffer and 4 textures; then they should be compatible; main use case is postprocessing effects which all share the same vertex shader)
So this should answer your question: Yes, if you need the same shader with two different (incompatible) root layouts, then they have to be compiled separately (or make the root layout bigger so it can be used in both places while being compiled just once).
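To make the compatibility idea concrete, here is a minimal C++ sketch of a strict check and one possible relaxed check. It is an assumption-laden illustration, not the engine's implementation: SlotType, SlotRange, RootLayoutSet, strictlyCompatible and fitsInside are hypothetical names, each JSON pair is assumed to mean a half-open range [begin, end), and the relaxed rule shown (every non-empty range of one layout lies inside the other's range) is just one way the relaxation described above could be defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

enum SlotType : std::size_t
{
    Samplers,
    Textures,
    ConstBuffers,
    TexBuffers,
    UavBuffers,
    NumSlotTypes
};

// Half-open slot range [begin, end); begin == end means "unused".
struct SlotRange
{
    std::uint16_t begin;
    std::uint16_t end;

    SlotRange() : begin( 0 ), end( 0 ) {}
    SlotRange( std::uint16_t b, std::uint16_t e ) : begin( b ), end( e ) { assert( b <= e ); }

    bool empty() const { return begin == end; }
    bool operator==( const SlotRange &o ) const { return begin == o.begin && end == o.end; }
};

// One descriptor set of a root layout (the "0" object in the JSON).
struct RootLayoutSet
{
    std::array<SlotRange, NumSlotTypes> ranges;
};

// Strict rule: both shaders declare exactly the same ranges.
inline bool strictlyCompatible( const RootLayoutSet &a, const RootLayoutSet &b )
{
    return a.ranges == b.ranges;
}

// Relaxed rule (assumption): 'smaller' can be linked against 'larger' if every
// range it uses is contained in the corresponding range of 'larger'.
inline bool fitsInside( const RootLayoutSet &smaller, const RootLayoutSet &larger )
{
    for( std::size_t i = 0; i < NumSlotTypes; ++i )
    {
        const SlotRange &s = smaller.ranges[i];
        const SlotRange &l = larger.ranges[i];
        if( s.empty() )
            continue;
        if( s.begin < l.begin || s.end > l.end )
            return false;
    }
    return true;
}
```

Under this relaxed rule, the postprocessing case mentioned above (a vertex shader with one const buffer, a pixel shader with one const buffer and 4 textures) would pass fitsInside even though the two layouts are not strictly equal.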
-
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
Regarding the staging buffers and textures deallocateVbo problem would this work?
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Yes Yes Yes!!! Thank you!!!
I already integrated it into the main branch. Looking at your changes something felt off and then I realized we already had GL3PlusVaoManager::destroyStagingTexture, thus I followed what we do there.
Thanks to that I also noticed StagingTexture had multiple bugs as a side effect from coming from NULL RenderSystem. Pushed those fixes.
Also renamed mVboIdx to mVboPoolIdx for consistency with the rest of the code
-
- OGRE Contributor
- Posts: 226
- Joined: Thu Oct 14, 2010 12:30 pm
- x 56
Re: [News] Vulkan Progress Report
Glad I could help! Now I will be moving to the encoder stuff
- dark_sylinc
- OGRE Team Member
- Posts: 5261
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1262
- Contact:
Re: [News] Vulkan Progress Report
Wooot!
I still need to fix the leak on vaoManager->getDescriptorPool (reuse those pools). But everything is finally starting to get pieced together!
The shader used:
VS
Code: Select all
#version 450
/*
## ROOT LAYOUT BEGIN
{
    "0" :
    {
        "samplers" : [0, 1],
        "textures" : [0, 1]
    }
}
## ROOT LAYOUT END
*/
layout(location = 0) in vec3 position;
layout(location = 7) in vec2 inUv0;
layout(location = 0) out vec2 uv0;
void main()
{
    gl_Position = vec4( position.xy, 0.0f, 1.0f );
    uv0 = inUv0.xy;
}
PS
Code: Select all
#version 450
/*
## ROOT LAYOUT BEGIN
{
    "0" :
    {
        "samplers" : [0, 1],
        "textures" : [0, 1]
    }
}
## ROOT LAYOUT END
*/
layout(ogre_t0) uniform texture2D tex;
layout(ogre_s0) uniform sampler my_sampler;
layout(location = 0) in vec2 uv0;
layout(location = 0) out vec4 outColour;
void main()
{
    //outColour = vec4( uv0.xy, 0, 1 );
    outColour = texture( sampler2D(tex, my_sampler), uv0.xy );
}
I still have to code an alternate descriptor strategy for DescriptorSetTextures (since that can be made more efficient than our D3D11/Metal style emulation) which is ideal for material textures.
However I'm thinking that perhaps DescriptorSetUav may work better emulated due to its highly dynamic nature.
I hope progress skyrockets soon (i.e. samples start to work)
Btw on how it works:
Ogre will parse the root layout in the comments, and define the following macros:
Code: Select all
#define ogre_t0 set = 0, binding = 0
#define ogre_s0 set = 0, binding = 1
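As a rough illustration of how macros like these could be generated from a root layout, here is a small C++ sketch. SlotGroup and generateBindingMacros are hypothetical names, each range is assumed to be half-open [begin, end), and the binding order simply follows the order in which the groups are passed in (textures before samplers here, to match the two macros shown above). The actual parser and ordering rules in Ogre may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One resource type inside a descriptor set of the root layout.
struct SlotGroup
{
    std::string prefix;   // e.g. "t" for textures, "s" for samplers
    std::uint32_t begin;  // first slot, inclusive
    std::uint32_t end;    // last slot, exclusive
};

// Emits one "#define ogre_<prefix><slot> set = S, binding = B" line per slot,
// assigning bindings contiguously in the order the groups are given.
inline std::string generateBindingMacros( std::uint32_t setIdx,
                                          const std::vector<SlotGroup> &groups )
{
    std::string out;
    std::uint32_t binding = 0;
    for( const SlotGroup &group : groups )
    {
        assert( group.begin <= group.end );
        for( std::uint32_t slot = group.begin; slot < group.end; ++slot )
        {
            out += "#define ogre_" + group.prefix + std::to_string( slot ) +
                   " set = " + std::to_string( setIdx ) +
                   ", binding = " + std::to_string( binding ) + "\n";
            ++binding;
        }
    }
    return out;
}
```

Calling generateBindingMacros( 0, { { "t", 0, 1 }, { "s", 0, 1 } } ) yields the two define lines shown above, which would then be prepended to the shader source before compilation.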