Best way to implement hardwareBuffer vertices and indices?

Posted: Sun Jan 25, 2015 4:12 pm
by kuma
I am unsure what the best approach is for implementing hardware buffers for a large mesh (terrain) in practice.

I have built my own LOD terrain system. As the camera passes through terrain segments, I remove the hardwareBuffers that need to be changed and create new ones with the updated data. So essentially I push indices, vertices, normals and UVs regularly as required, depending on the position of the camera. Building the segments and buffers is a bit slow, but it works... However, looking at this post:

http://stackoverflow.com/questions/4424 ... with-depth

It suggests that I should not do it like this:
"you should not try fiddling around with buffer up-/downloads. Just upload the vertex data one time and then just adjust the index arrays and the order in which those are submitted."
I tried pushing all the vertices at once but I have two issues with this for large terrain (8192x8192).

- I run out of memory and my application is killed by the OS :?
- The max number of vertices is 256^2, so you need to create a new hardware buffer for each segment. Each one is essentially a new render object, which for 8192x8192 means 32^2 (1024) of them, which is really bad for performance :!:

In the original way I was doing it, I batch the segments together. Since most segments are low LOD I can fit many of them into a single hardwareBuffer.

Any thoughts on this? If you have some experience or suggestions I'd really appreciate your feedback!

Thanks! :)

edit: btw I'm using Ogre::HardwareBuffer::HBU_DYNAMIC and Ogre::HardwareBuffer::HBL_DISCARD for both vertices and indices.

Re: Best way to implement hardwareBuffer vertices and indice

Posted: Sun Jan 25, 2015 5:03 pm
by Kojack
kuma wrote:I tried pushing all the vertices at once but I have two issues with this for large terrain (8192x8192).

- I run out of memory and my application is killed by the OS :?
- The max number of vertices is 256^2, so you need to create a new hardware buffer for each segment. Each one is essentially a new render object, which for 8192x8192 means 32^2 (1024) of them, which is really bad for performance :!:
8192x8192 vertices with position, normal and uv (8 floats, so 32 bytes per vertex) take up 2GB of RAM. That's a rather huge amount for a graphics card to handle in one buffer, when there's other stuff in there taking up space (front buffer, back buffer, textures, other meshes, etc).
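
The 2GB figure can be checked with a few lines of arithmetic. This is only a sketch: the `TerrainVertex` struct and the `vertexBufferBytes` helper are hypothetical names, and the layout assumes tightly packed 32-bit floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical vertex layout for "position, normal and uv":
// 3 + 3 + 2 floats = 8 floats = 32 bytes per vertex.
struct TerrainVertex {
    float position[3];
    float normal[3];
    float uv[2];
};

static_assert(sizeof(TerrainVertex) == 32, "layout assumed to be 8 packed floats");

// Bytes needed for one vertex buffer covering a side x side grid of vertices.
inline std::uint64_t vertexBufferBytes(std::uint64_t side, std::uint64_t bytesPerVertex) {
    assert(side > 0 && bytesPerVertex > 0);
    return side * side * bytesPerVertex;
}
```

With `side = 8192` this gives 2,147,483,648 bytes (2 GiB) for the full layout. A vertex holding only a height and a normal (4 floats) would need half of that.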

You can cut down on memory usage by creative shader use. If you are in a higher shader model (4 or above) you can use SV_VertexID (the per-vertex system value; SV_PrimitiveID counts primitives and isn't available in the vertex shader) to mathematically generate the x, z and uv values. The vertices only need a height and normal.
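
As a rough illustration of the math a shader would do, here is a CPU-side C++ sketch that derives x, z and uv from a running vertex index. The names `GridCoord` and `gridCoordFromIndex` are made up for this example, and it assumes a square, row-major grid with uniform spacing. A real version would live in the vertex shader rather than on the CPU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: horizontal position plus texture coordinates.
struct GridCoord {
    float x;
    float z;
    float u;
    float v;
};

// Derives grid position and uv from a vertex index, assuming vertices are
// laid out row by row in a side x side grid with uniform spacing.
inline GridCoord gridCoordFromIndex(std::uint32_t index, std::uint32_t side, float spacing) {
    assert(side > 1);
    assert(index < side * side);
    const std::uint32_t col = index % side;  // position along x
    const std::uint32_t row = index / side;  // position along z
    GridCoord c;
    c.x = static_cast<float>(col) * spacing;
    c.z = static_cast<float>(row) * spacing;
    c.u = static_cast<float>(col) / static_cast<float>(side - 1);
    c.v = static_cast<float>(row) / static_cast<float>(side - 1);
    return c;
}
```

Since everything horizontal is computed from the index, the vertex buffer itself only has to store the height and the normal.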
kuma wrote:The max number of vertices is 256^2
16bit index buffers are limited to 65536 (256^2) vertices. 32bit index buffers are limited to (theoretical) 4 billion vertices.
When you create the index buffer, give it HardwareIndexBuffer::IT_32BIT (first parameter of createIndexBuffer).
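
To show where the 16-bit limit bites, here is a small self-contained sketch (no Ogre calls) that builds triangle-list indices for a square grid using 32-bit values. `fitsIn16BitIndices` and `buildGridIndices` are hypothetical helper names, and the winding order is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// True when every vertex can be addressed by a 16-bit index (0..65535).
inline bool fitsIn16BitIndices(std::uint64_t vertexCount) {
    return vertexCount <= 65536;
}

// Builds a triangle list (two triangles per grid cell) for a side x side
// vertex grid stored row by row. Indices are 32-bit so grids above
// 256 x 256 vertices still work.
inline std::vector<std::uint32_t> buildGridIndices(std::uint32_t side) {
    assert(side >= 2);
    std::vector<std::uint32_t> indices;
    indices.reserve(static_cast<std::size_t>(side - 1) * (side - 1) * 6);
    for (std::uint32_t row = 0; row + 1 < side; ++row) {
        for (std::uint32_t col = 0; col + 1 < side; ++col) {
            const std::uint32_t i0 = row * side + col;  // top-left
            const std::uint32_t i1 = i0 + 1;            // top-right
            const std::uint32_t i2 = i0 + side;         // bottom-left
            const std::uint32_t i3 = i2 + 1;            // bottom-right
            indices.push_back(i0); indices.push_back(i2); indices.push_back(i1);
            indices.push_back(i1); indices.push_back(i2); indices.push_back(i3);
        }
    }
    return indices;
}
```

A 256x256 grid still fits in 16 bits, while a 257x257 grid already produces indices above 65535 and so needs the 32-bit index type.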
kuma wrote:I remove the hardwareBuffers that need to be changed and create new ones with the updated data
Allocations and deallocations generally have a performance hit. It would be better to keep a pool of buffers and just write over the top of them as needed, rather than making new ones.
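
A minimal sketch of the pooling idea, with plain byte vectors standing in for hardware buffers. `BufferPool` and its methods are hypothetical, not part of Ogre. In a real renderer the storage would be pre-created hardware buffers, and `overwrite` would lock the buffer and write into it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fixed-size pool: all buffers are allocated once up front and then reused.
class BufferPool {
public:
    BufferPool(std::size_t count, std::size_t bytesEach) : buffers_(count) {
        for (Slot& slot : buffers_) slot.storage.resize(bytesEach);
    }

    // Returns the id of a free buffer, or -1 if the pool is exhausted.
    int acquire() {
        for (std::size_t i = 0; i < buffers_.size(); ++i) {
            if (!buffers_[i].inUse) {
                buffers_[i].inUse = true;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Marks a buffer as free again; its memory is kept for reuse.
    void release(int id) {
        assert(id >= 0 && static_cast<std::size_t>(id) < buffers_.size());
        assert(buffers_[id].inUse);
        buffers_[id].inUse = false;
    }

    // Writes new data over the existing contents without reallocating.
    void overwrite(int id, const void* data, std::size_t bytes) {
        assert(id >= 0 && static_cast<std::size_t>(id) < buffers_.size());
        assert(buffers_[id].inUse && bytes <= buffers_[id].storage.size());
        std::memcpy(buffers_[id].storage.data(), data, bytes);
    }

    const std::uint8_t* data(int id) const { return buffers_[id].storage.data(); }

private:
    struct Slot {
        std::vector<std::uint8_t> storage;
        bool inUse = false;
    };
    std::vector<Slot> buffers_;
};
```

When a terrain segment goes out of range, its buffer is released and the next segment that comes into range takes it over, so no allocation happens while the camera moves.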

Re: Best way to implement hardwareBuffer vertices and indice

Posted: Sun Jan 25, 2015 6:53 pm
by kuma
That's a great answer, thanks!!!
I'll probably keep the segment sizes at 256x256 but just batch them differently using 32-bit index buffers.
Yes, having static buffers would be nice. At the moment I have my own manager to handle creation and removal, which is a bit tricky..

btw. Are 32bit index buffers common ? For example mobile say? Android .. :?:

Re: Best way to implement hardwareBuffer vertices and indice

Posted: Sun Jan 25, 2015 7:49 pm
by Kojack
kuma wrote:btw. Are 32bit index buffers common ? For example mobile say? Android .. :?:
Should be common on PC. Intel GPUs from around 2007 seem to be limited to 16 bit, but those are pretty crap in general.
Apparently OpenGL ES doesn't support 32 bit indices. I don't know if ES 2.0 does. So mobile may not support it well, or at all.
(I don't do mobile dev, so not sure)

Re: Best way to implement hardwareBuffer vertices and indice

Posted: Sun Jan 25, 2015 9:15 pm
by dark_sylinc
Hi!

First, reality check. At 8192x8192 a lot of mobile devices will scream. Furthermore, if you're aiming for mobile, 32-bit indices weren't supported until GLES3, which also added instancing. Since a lot of devices are still GLES2, you're stuck with 65535 as max limit.

Second, if you're aiming at desktop (or GLES3?) you're looking at it the wrong way. Most or all of this computation should happen GPU side, not CPU.
For example, I've written terrain systems before that would create a single 256x256 patch and send the heightmaps to the GPU once, at loading time, as textures to be read with VTF (Vertex Texture Fetch).
They would then render this patch multiple times (with an offset on each iteration), using VTF to read the heightmap at the given location. You can also use instancing to reduce the number of draw calls.
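
The per-patch offsets for that kind of setup could be generated along these lines. This is only a sketch: `PatchOffset` and `patchOffsets` are invented names, the terrain is assumed to divide evenly into patches, and edge sharing between neighbouring patches is ignored. The shader side (adding the offset and sampling the heightmap) is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// World-space offset applied to one copy (instance) of the shared patch.
struct PatchOffset {
    float x;
    float z;
};

// One offset per patch, row by row, for a terrainSide x terrainSide terrain
// covered by patchSide x patchSide patches.
inline std::vector<PatchOffset> patchOffsets(std::uint32_t terrainSide,
                                             std::uint32_t patchSide,
                                             float spacing) {
    assert(patchSide > 0 && terrainSide % patchSide == 0);
    const std::uint32_t perSide = terrainSide / patchSide;
    std::vector<PatchOffset> offsets;
    offsets.reserve(static_cast<std::size_t>(perSide) * perSide);
    for (std::uint32_t row = 0; row < perSide; ++row) {
        for (std::uint32_t col = 0; col < perSide; ++col) {
            PatchOffset o;
            o.x = static_cast<float>(col * patchSide) * spacing;
            o.z = static_cast<float>(row * patchSide) * spacing;
            offsets.push_back(o);
        }
    }
    return offsets;
}
```

For an 8192x8192 terrain with 256x256 patches this yields 1024 offsets, all drawn from the same patch geometry. With instancing the offsets can be supplied as per-instance data instead of issuing one draw call per patch.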

All lodding happens inside the shaders. If you're fortunate in your platform of choice, you can also use compute shaders.

The approach you describe is how it was done in the pre-2005 era, when GPUs were very limited in what they could do and a lot of terrain data was being sent from the CPU every frame.