Hi, I was thinking about implementing skeleton LOD for Ogre3D, but I would like to discuss it first.
Problem: I ran some benchmarks and found that HW skinning is much faster than SW skinning. However, I need to pass an array of bone states to the vertex shader, which requires a globally hard-coded size (we're not using RTShaderSystem). We would like to have animations for things like fingers, but that would require a huge array in the shader for all meshes, and all bones would have to be transformed every frame. However, we don't need finger animations in the far distance. If you combine this with mesh LOD, then the GPU will cull the unused vertices, which would otherwise be processed on the CPU, so in the far distance HW skinning is a double win.
Solution: We would like to have SW skinning at close range for detailed animations, and HW skinning with a reduced skeleton in the far distance for performance. It should be a fully automatic system without artist intervention. It could be implemented as a user-created mesh LOD level generated by MeshUpgrader.
Algorithm: My first suggestion is something like this: We calculate a bounding sphere for each bone (the sphere containing all vertices with the given boneID). We reduce the bones with small bounding spheres first. When reducing a bone, we replace the old boneID with the parent boneID in the hierarchy and recalculate the weights. This would work out for fingers/hands. The nice thing is that you already have a set of animations and the expected vertex positions for each frame, so you can calculate the error of the reduction accurately.
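The reduction step described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not Ogre code: `Vec3`, `Vertex`, `boneBoundingRadius` and `collapseBoneIntoParent` are hypothetical names, each vertex is assumed to carry at most four bone influences, and the "bounding sphere" is approximated by a sphere centred on the centroid of the influenced vertices.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal types; a real implementation would read these
// from the mesh's vertex buffers and bone assignments.
struct Vec3 { float x, y, z; };

struct Vertex {
    Vec3  pos;
    int   boneId[4];   // bone index per influence slot
    float weight[4];   // unused slots have weight 0
};

// True if `bone` influences `v` with a non-zero weight.
static bool influences(const Vertex& v, int bone) {
    for (int i = 0; i < 4; ++i)
        if (v.weight[i] > 0.0f && v.boneId[i] == bone) return true;
    return false;
}

// Radius of a sphere centred on the centroid of all vertices influenced
// by `bone` (an approximation, not a minimal bounding sphere).
// Returns 0 if the bone influences no vertex.
float boneBoundingRadius(const std::vector<Vertex>& verts, int bone) {
    Vec3 c{0.0f, 0.0f, 0.0f};
    std::size_t n = 0;
    for (const Vertex& v : verts)
        if (influences(v, bone)) {
            c.x += v.pos.x; c.y += v.pos.y; c.z += v.pos.z;
            ++n;
        }
    if (n == 0) return 0.0f;
    const float inv = 1.0f / static_cast<float>(n);
    c.x *= inv; c.y *= inv; c.z *= inv;

    float maxDist2 = 0.0f;
    for (const Vertex& v : verts)
        if (influences(v, bone)) {
            const float dx = v.pos.x - c.x, dy = v.pos.y - c.y, dz = v.pos.z - c.z;
            maxDist2 = std::max(maxDist2, dx * dx + dy * dy + dz * dz);
        }
    return std::sqrt(maxDist2);
}

// Reassign every influence of `bone` to its parent, then merge slots that
// now reference the same bone so the per-vertex weight sum is unchanged.
// `parent[b]` is the parent index of bone b; `bone` must not be the root.
void collapseBoneIntoParent(std::vector<Vertex>& verts,
                            const std::vector<int>& parent, int bone) {
    const int target = parent[bone];
    for (Vertex& v : verts) {
        for (int i = 0; i < 4; ++i)
            if (v.weight[i] > 0.0f && v.boneId[i] == bone) v.boneId[i] = target;

        for (int i = 0; i < 4; ++i)
            for (int j = i + 1; j < 4; ++j)
                if (v.weight[i] > 0.0f && v.weight[j] > 0.0f &&
                    v.boneId[i] == v.boneId[j]) {
                    v.weight[i] += v.weight[j];  // fold duplicate into slot i
                    v.weight[j] = 0.0f;          // free slot j
                    v.boneId[j] = 0;
                }
    }
}
```

A reducer built on this would repeatedly pick the leaf bone with the smallest radius, call `collapseBoneIntoParent`, and stop once the bone count fits the shader's array size or the measured animation error gets too large. Collapsing only leaf bones this way leaves the remaining animation tracks untouched.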
If you only reduce from the end of the hierarchy (like hands), then you don't need to recalculate animations, but if you reduce from the middle of the skeleton, you do.
Alternatives: An artist could create the reduced skeleton in a modeling tool and load it as a manual mesh LOD, but this requires artist work, the artist would do it the same way as the above algorithm, and it requires double maintenance. Another way would be to split the body (including bones/animations) into multiple meshes, but this would be slower and requires artist work too.
What do you think of this idea?
Do you think it's worth working on?
What kind of problems/disadvantages can you think of?
Skeleton LOD idea
-
- Google Summer of Code Student
- Posts: 47
- Joined: Tue Sep 27, 2011 9:26 am
- x 50
Google Summer of Code 2013 Student
Topic: "Progressive mesh improvements"
Project links: Project thread, WIKI page, Code fork for the project
Mentor: Murat Sari
-
- OGRE Team Member
- Posts: 5503
- Joined: Sat Jul 21, 2007 4:55 pm
- Location: Buenos Aires, Argentina
- x 1370
Re: Skeleton LOD idea
I don't want to be the killjoy but
sajty wrote: However I need to pass an array of bone states to the vertex shader, which requires a hard-coded size globally (We're not using RTShaderSystem).
You didn't say why you need to pass an array of bone states to the VS. Are you doing custom skeletal animation? Or are you talking about the world_matrix_array_3x4 param?
Anyway, you can pass big data to the Shaders. In DX11 & GL3+ you can use constant/uniform buffers or texture buffers (depending on locality of data and size).
On DX9 you can use VTF (Vertex Texture Fetch). The HW VTF instancing technique in fact does this. In fact it works faster than passing a large array because DX9 & DX10 generation hardware was plagued by something called "constant waterfalling", but VTF is immune to that. GCN architecture doesn't suffer const. waterfalling at all.
And in Ogre 2.0, passing large data will be standard.
Also keep in mind that in Ogre 2.0 the Skeleton system has been rewritten from scratch. And it's VERY different. Any work you may do for 1.x won't be possible to port to 2.0 unless there's a lot of effort involved.
-
- Google Summer of Code Student
- Posts: 47
- Joined: Tue Sep 27, 2011 9:26 am
- x 50
Re: Skeleton LOD idea
Quote: I don't want to be the killjoy but
Hehe, I asked for criticism, since it makes it easier to decide whether to spend time on this or on something else. So I'm open to negative criticism too.

We support the GL, DirectX9 and (soon) GLES2 render systems, and we use the skinning sample from Ogre.
Yes, I'm talking about worldMatrix3x4Array as you said.
GLES2 doesn't seem to support VTF. Relying on VTF would also limit GL compatibility with old HW.
Quote: Also keep in mind that in Ogre 2.0 the Skeleton system has been rewritten from scratch. And it's VERY different. Any work you may do for 1.x won't be possible to port to 2.0 unless there's a lot of effort involved.
Oh, then I think I will wait until we switch to 2.x before deciding whether to implement it.

Thanks for the feedback.