
[2.1+] Bind vertex and index buffer to compute shader?

Posted: Mon Apr 08, 2019 2:38 pm
by al2950
As the title says, it's not obvious how to bind a vertex or index buffer directly to a compute shader. At the moment they need to be of type TexBufferPacked (confusingly named, as it's not a texture), which is not compatible with VertexBufferPacked or IndexBufferPacked.

I noticed some of the new VCT code extracts the data and puts it into its own UAV, but that seems a little unnecessary to me.

How can I make this work!?

Re: [2.1+] Bind vertex and index buffer to compute shader?

Posted: Mon Apr 08, 2019 5:27 pm
by dark_sylinc

As you have seen, I added BufferPacked::copyTo in order to copy GPU -> GPU vertex/index buffers to a UAV buffer.
However, the VCT code downloads the vertex buffers and creates a new copy because it converts the vertex data into a single structure our compute code can understand (rather than having to deal with individual vertex format specifications and emulate vertex attributes in compute); thus we only copy the index buffers "raw".
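The conversion described above can be sketched in isolation. This is a hypothetical illustration, not the actual VCT code: the struct layout, function and parameter names are all made up, but it shows the idea of deinterleaving an arbitrary vertex layout into one fixed structure a compute shader can index without knowing the original vertex format.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed layout for the compute shader to index; the real
// VCT code uses its own structure, this one is just for illustration.
struct UnifiedVertex
{
    float px, py, pz;  // position
    float nx, ny, nz;  // normal
};

// Deinterleave: copy position/normal out of an arbitrary interleaved
// vertex buffer, given byte offsets and stride from the vertex declaration.
std::vector<UnifiedVertex> convertVertices( const uint8_t *src, size_t numVertices,
                                            size_t stride, size_t posOffset,
                                            size_t normalOffset )
{
    std::vector<UnifiedVertex> out( numVertices );
    for( size_t i = 0; i < numVertices; ++i )
    {
        const uint8_t *vertex = src + i * stride;
        memcpy( &out[i].px, vertex + posOffset, 3u * sizeof( float ) );
        memcpy( &out[i].nx, vertex + normalOffset, 3u * sizeof( float ) );
    }
    return out;
}
```

The converted array can then be uploaded to a single UAV buffer, so the compute shader sees one known layout regardless of each mesh's original vertex declaration.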

Now, you want to go further: read directly the buffers, no copies.

The VaoManager supports creating Uav buffers with multiple binding points:

Code:

UavBufferPacked* createUavBuffer( size_t numElements, uint32 bytesPerElement, uint32 bindFlags,
                                  void *initialData, bool keepAsShadow );

If you pass "bindFlags = BB_FLAG_INDEX" you can create a UavBufferPacked that can also be bound as an index buffer.
However we do not yet have a UavBufferPacked::getAsIndexBufferView, simply because no one has written one. We only have UavBufferPacked::getAsTexBufferView, but there's nothing preventing you from writing one :)
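Conceptually such a view is thin: no data is copied, it only records how a byte range of the UAV buffer should be reinterpreted as indices. A standalone sketch of that idea (every name here is hypothetical, not actual Ogre API):

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical mirror of what an index-buffer view over a UAV buffer
// would carry: only the reinterpretation parameters, no buffer data.
enum class IndexType { Index16, Index32 };

struct IndexBufferView
{
    size_t    byteOffset;  // where the indices start inside the UAV buffer
    size_t    numIndices;
    IndexType indexType;
};

// Validate the range against the UAV buffer's size and build the view.
IndexBufferView makeIndexBufferView( size_t uavSizeBytes, size_t byteOffset,
                                     size_t numIndices, IndexType type )
{
    const size_t bytesPerIndex = ( type == IndexType::Index16 ) ? 2u : 4u;
    if( byteOffset + numIndices * bytesPerIndex > uavSizeBytes )
        throw std::out_of_range( "index view exceeds UAV buffer" );
    return IndexBufferView{ byteOffset, numIndices, type };
}
```

A real getAsIndexBufferView would additionally have to produce whatever handle the rendersystem's binding code expects, which is where the D3D11 friction below comes in.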

However I'd expect some friction:
  1. D3D11 is extremely annoying, and the main reason you see all this shit. I'd expect having a buffer as UAV to hit some unknown, pointless limitations; the API will complain that certain operations are not possible. Off the top of my head, I remember that tex buffers did not allow NO_OVERWRITE when mapping, which we use for vertex and index buffers. But this is abstracted via D3D11DynamicBuffer vs D3D11CompatBufferInterface, so it may just work. However, frequently mapping a dynamic vertex/index buffer that is secretly a UAV buffer will have reduced performance.
  2. In D3D11 each UAV buffer is in its own pool, so do not expect the same degree of low API overhead when rendering many meshes, as we cannot batch them together (GL and Metal do share the pools).
  3. We currently offer no way of telling Ogre to load meshes with their buffers secretly created as UAV buffers; extra code would have to be added to do this, and to ensure Ogre does not try to delete the vertex buffer directly but rather its UAV representation. Unless you create your own meshes by hand, of course.
Long story short, it is possible. The main offender is D3D11's stupid rules (seriously: they're unnecessary and no HW required them), and a lot of code would have to be written to support what you ask. If we only supported D3D12/Vulkan/Metal this pointless nonsense would not exist (the keyword you're looking for is "resource aliasing").
Which is why I decided that copying the buffers via copyTo was just easier.
al2950 wrote:
Mon Apr 08, 2019 2:38 pm
TexBufferPacked (confusingly named as it's not a texture)
The name is historical and comes from how D3D11 and GL name it. Originally these buffers went through the same HW that fetches textures (which can perform format conversion), thus the name "texture buffers".
In fact, some not-so-old Adreno drivers on Android will incorrectly fetch from texture buffers if bilinear filtering is turned on (which is both depressing and laughable at the same time).

For modern GPUs, the term "texture buffers" is a bit inaccurate; however, they still use instructions that are meant for textures, due to the need to dynamically convert formats when loading (i.e. if the data contains "255" and the specified pixel format from C++ is R8_UNORM, then the GPU must read 1.0f).
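That conversion is trivial in isolation; a minimal sketch of what the texture path does when a shader loads an R8_UNORM texel (the function name is just for illustration):

```cpp
#include <cstdint>

// UNORM read: the raw byte is normalized into [0.0, 1.0],
// so a stored 255 reads back in the shader as 1.0f.
float readR8Unorm( uint8_t rawByte )
{
    return static_cast<float>( rawByte ) / 255.0f;
}
```

Each pixel format (SNORM, UINT, sRGB, ...) implies a different conversion, which is why these loads still go through texture-style instructions rather than raw buffer reads.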

Re: [2.1+] Bind vertex and index buffer to compute shader?

Posted: Mon Apr 08, 2019 6:15 pm
by al2950
Urgggh! I actually want to use a standard Ogre Mesh, get its Vao, and pass the buffers directly to the compute shader. So I would need something like IndexBufferPacked::getAsTexBufferView. But given Ogre's auto-batching of meshes and the points you made above, I can see a naive implementation of this causing irritating issues for people.

I actually only need to worry about one mesh buffer at the moment, to use as a GPU particle emitter mesh. But given what you have said, I think I will follow your approach, although I might keep the format the same so I can re-use some HLMS pieces, especially the skeleton shader code.

Thanks for the post, most informative as always, even if it's not quite what I want to hear!