Rendering A LOT of things
-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
I'm wanting to render scenes with lots of entities in them, like thousands of trees. Obviously I can't have thousands of individual entities, even if I use LOD to reduce the distant ones to just a couple of triangles.
Currently I'm thinking that I can use my map editor to place trees (rocks, etc) individually and then "compile" groups of trees into a much smaller number of bigger meshes. And also create a low LOD version in the same way.
However, some preliminary tests show that I'm hitting the performance limit at only a couple of hundred entities. Which is kind of OK, since I can just use fewer and fewer entities and make them bigger and bigger. Taken to the extreme, I could even just have a single massive "foliage" mesh!
But I assume Ogre culls stuff and applies LODs at an entity level, meaning that my massive foliage mesh would always get rendered in its entirety.
So, what that leads me to is that I'm going to have to compile my groups of trees into a very small number of big entities, "bake in" the world positions, and then switch entities on and off myself manually to perform culling and LOD switching.
Does that make sense, or is there an easier way that I've missed?
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
I can only suggest that you group your trees in a way they are "packed" in octree nodes (I assume you're using an octree to describe your scene).
Another interesting performance boost is to make them static geometry (trees don't tend to change...)
Finally, replacing your meshes with billboards at a far distance usually works...
Just my ideas/suggestions. I don't know if they're good enough...

-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
Thanks for the suggestions;
Quote: group your trees in a way they are "packed" in octree nodes
Yep, that's what I'm currently doing. But I guess the overhead of having to maintain (i.e. transform, cull, apply LODs) a couple of hundred entities, even if they're not eventually visible, is what's killing it. In my test app there are only 2 small triangles per entity, so I don't think it's running out of tri-power or fill-rate.
Quote: Another interesting performance boost is to make them static geometry
That's kind of what I'm proposing; "compiling" a number of individual tree meshes into a single vertex and index array, so whole chunks of the forest become single entities. But as I say, if I make those chunks very big (as it's looking like I'll have to) then they won't get culled as vigorously as they could and I would probably have to apply the LODs manually too.
Quote: replacing your meshes with billboards at a far distance
Yep, again. That would be part of the "baking" process; at the same time as I'm compiling the chunks into single entities I'd build a low-poly version for use in the distance.
I guess what I'm asking is: is there a way to "freeze" the position of an entity, create it and transform it but then lock it so that it gets culled and LOD'd but doesn't incur all the transformation overhead every frame?
I would have thought the octree would do this; if forest chunks A through G are in octree node Z, and I can't see that octree node, then I don't need to bother doing anything with any of the chunks in that node. But I guess that every frame it's transforming the chunks to make sure they're in the right nodes?
-
- Gnoblar
- Posts: 21
- Joined: Mon Aug 18, 2003 6:06 pm
- Location: Houston, Tx
This may sound silly, but why don't you group a series of trees (like 50) together as one mesh and have them all share a material using UV mapping? This will increase performance since you won't have a TON of tiny objects; Ogre (and any modern engine) has more trouble with tons of nodes than with a few nodes with lots of polygons. If you are using optimal modeling techniques to minimize polygons, this should be a good performance booster, especially for an octree. Obviously, this will be a bit tricky if you want to animate these objects, but it could be done.
Last edited by Nek on Tue Aug 03, 2004 12:48 am, edited 1 time in total.
-
- Greenskin
- Posts: 128
- Joined: Mon Jun 21, 2004 2:34 am
- Location: Victoria, Australia
I'm just wondering if it's that you've only got 2 triangles per entity, so it's 2 triangles per render-op, with all the associated overhead of a render-op?
Something similar to the forum thread regarding rendering 100 boxes on the screen dragging the framerate down, because each box was a separate render-op.
-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
Thanks for the suggestions chaps.
Quote: I'm just wondering if it's that you've only got 2 triangles per entity, so it's 2 triangles per render-op, with all the associated overhead of a render-op?
Yeah, that's currently just a test. At the moment it's just rendering 2 triangles where eventually I'm hoping to have a whole group of trees. I doubt it's going to get quicker when I start adding more polys to each group of trees!
Quote: This may sound silly, but why don't you group a series of trees (like 50) together as one mesh and have them all share a material using UV mapping?
Not silly at all. That's exactly my plan!
With 256 entities (i.e. 512 tris, 2 each) I get around 95fps; if I bump that up to 512 entities then I get around 50fps. If I notch it up to 1024 entities then the whole thing grinds to a halt, like 0.04fps! That's despite the fact that the vast majority of those aren't visible from my camera position. Obviously there's some overhead involved in checking whether nodes are visible before they're rendered, and that overhead doesn't seem to scale linearly.
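Converting those frame rates into frame times makes the non-linearity easier to see: roughly 10.5 ms per frame at 256 entities, 20 ms at 512, and 25,000 ms at 1024. Doubling from 256 to 512 about doubles the frame time, but doubling again multiplies it by about 1250. A small helper for that arithmetic follows; the function names are made up for illustration and nothing here is Ogre code.

```cpp
#include <cassert>

// Milliseconds spent on one frame at a given frame rate.
double frameMs(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Average share of the frame time per entity, in milliseconds.
double msPerEntity(double fps, int entities)
{
    assert(entities > 0);
    return frameMs(fps) / entities;
}
```

With the figures above, the per-entity cost stays near 0.04 ms for 256 and 512 entities and jumps to about 24 ms for 1024, which points at something other than plain per-entity work.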
-
- Platinum Sponsor
- Posts: 243
- Joined: Fri Oct 03, 2003 5:57 am
- Location: Sweden
- x 2
Interesting topic. Sounds like a lot of time is being spent on the CPU? 1024 render ops might not be optimal performance-wise for the GPU, but it sure shouldn't bring your fps down to 0.04.
Have you tried profiling the code to see exactly where all this time is spent? Assuming CPU is the bottleneck of course...

-
- Gold Sponsor
- Posts: 446
- Joined: Fri May 02, 2003 10:05 am
- Location: UK
-
- OGRE Retired Team Member
- Posts: 19269
- Joined: Sun Oct 06, 2002 11:19 pm
- Location: Guernsey, Channel Islands
- x 66
Because they don't render them as several separate objects. 
Just like Monster is suggesting, they pack the objects into as small a number of massive meshes as they can get away with, whilst still getting reasonable culling. Then they use vertex programs to make them sway - a couple of sum-of-sin waves based on the world positions and heights is enough to get a pretty convincing sway.
@Borundin: 1024 render ops is excessive. GPUs hate lots of render ops because proportionately far too much time is spent in set-up, both in the CPU as part of the GL/D3D implementation and in the CPU element of the video driver. GPUs are heavily pipelined, and are best cruising at high speed. Chopping and changing render ops is like changing them down into 1st gear at every corner.
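A rough sketch of the sum-of-sines sway mentioned above, written as plain C++ rather than as an actual vertex program. The `Vec3` type, the `swayOffset` name and every constant are assumptions chosen for illustration; a real vertex program would do the same arithmetic per vertex on the GPU.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Horizontal sway offset for one vertex: the sum of two sine waves whose
// phase depends on the world position, scaled by height above the tree
// base so that the roots stay put. All constants are illustrative guesses.
Vec3 swayOffset(const Vec3& worldPos, float heightAboveBase, float timeSeconds)
{
    assert(heightAboveBase >= 0.0f);
    const float slow = std::sin(0.8f * timeSeconds + 0.05f * worldPos.x);
    const float fast = std::sin(2.3f * timeSeconds + 0.11f * worldPos.z);
    const float amplitude = 0.02f * heightAboveBase;
    Vec3 offset = { amplitude * (slow + 0.5f * fast),
                    0.0f,
                    amplitude * (0.5f * slow + fast) };
    return offset;
}
```

Because the phase comes from the baked world position, neighbouring trees in the same merged mesh move slightly out of step without needing a transform of their own.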


-
- Halfling
- Posts: 74
- Joined: Tue Dec 17, 2002 11:57 am
- Location: Somerset, England
-
- Greenskin
- Posts: 145
- Joined: Wed Apr 28, 2004 12:10 pm
- Location: Hungary
-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
Quote: before you make them static world geom, remember you prolly gonna wanna make them sway and stuff.
I'd just like to get them displayed first!

Current (very) preliminary tests seem to indicate that I'll be able to maintain something like 25,000 trees at a time without paging or any of that sort of nonsense, but we'll see how we go.
Then I'll start worrying about getting them swaying!
I was also concerned about the amount of VRAM that I was chewing up, but I think I've found a way around that.
Quote: What if you just merged the trees and other stuff with the terrain?
Yes. And the code that I can cut'n'paste to do that would be...?

Hopefully what I'm aiming for will do something similar, but in a bolt-on way, so I don't have to muck up the terrain scene managers and so it should be possible to use it elsewhere.
Thanks for all the suggestions chaps.
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
I think that having large geometries that aren't easily culled is much better than culling each tree every time. I was quite surprised when I saw for myself how smoothly Ogre handled a large number of vertices compared with a large number of entities. My suggestion is to merge the trees in packs that fit perfectly in the octree nodes.
BTW: if you're using two tris per tree, you can use BillboardSets instead, to do fast tests.

-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
Quote: I think that having large geometries that aren't easily culled is much better than culling each tree every time ... My suggestion is to merge the trees in packs that fit perfectly in the octree nodes.
I think you're right. You can easily throw one 100,000 polygon mesh at the card, but if you use 1000 entities of 100 polygons each then it all grinds to a halt!
In fact, I'm ignoring the octree culling entirely, since that just requires too many entities (one per node, over a thousand in my scene) and I'm using a much simpler (i.e. faster) specialised culling mechanism.
Quote: if you're using two tris per tree, you can use BillboardSets instead
Actually, at the low LOD, it's 4 tris per tree, two intersecting quads. I think that using billboards might incur an overhead since each one needs to be oriented to face the screen every time the camera moves. With the intersecting quads I can just build them once in a static position and never touch them again.
Unless I wanted them to waft about in the breeze!
(which, actually, using a vertex program probably isn't that hard. Famous last words!)
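The "two intersecting quads" low LOD can be sketched as below: a minimal, engine-independent example that appends one tree's 8 vertices and 12 indices (4 triangles) to shared arrays at a fixed world position. The `Vertex` layout and the `appendCrossedQuads` name are assumptions for illustration, not the poster's code.

```cpp
#include <cassert>
#include <vector>

struct Vertex { float x, y, z, u, v; };

// Appends two vertical quads crossing at right angles, centred on
// (px, py, pz), to shared vertex and index arrays.
void appendCrossedQuads(std::vector<Vertex>& vertices,
                        std::vector<unsigned int>& indices,
                        float px, float py, float pz,
                        float width, float height)
{
    assert(width > 0.0f && height > 0.0f);
    const float half = width * 0.5f;
    // Horizontal direction of each quad: first along X, then along Z.
    const float dirs[2][2] = { { 1.0f, 0.0f }, { 0.0f, 1.0f } };
    const unsigned int quad[6] = { 0, 1, 2, 0, 2, 3 };
    for (int q = 0; q < 2; ++q)
    {
        const unsigned int base = static_cast<unsigned int>(vertices.size());
        const float dx = dirs[q][0] * half;
        const float dz = dirs[q][1] * half;
        Vertex bottomLeft  = { px - dx, py,          pz - dz, 0.0f, 1.0f };
        Vertex bottomRight = { px + dx, py,          pz + dz, 1.0f, 1.0f };
        Vertex topRight    = { px + dx, py + height, pz + dz, 1.0f, 0.0f };
        Vertex topLeft     = { px - dx, py + height, pz - dz, 0.0f, 0.0f };
        vertices.push_back(bottomLeft);
        vertices.push_back(bottomRight);
        vertices.push_back(topRight);
        vertices.push_back(topLeft);
        for (int i = 0; i < 6; ++i)
            indices.push_back(base + quad[i]);
    }
}
```

Since the quads never turn to face the camera, the material would need back-face culling disabled for them to be visible from every side.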

-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
And, just to prove that I'm not talking complete cobblers, here's a shot of me driving (my buggy, simulated with OgreOde, natch!) around in a 20,000 tree forest;

That's actually far, far more trees than I'd really need, and with a much further far clip (for the forest, independent of any other clipping planes, Octrees, etc) than necessary. But I think it proves the concept.
OK, so the low poly tree model is a bit dodgy at the moment, but that's nothing that can't be fixed. And, although it doesn't show here, it works just as well with different meshes, all your trees don't have to be the same, in fact they don't even need to be trees!
I still need to swap in the high-detail meshes for close up stuff, but all the hooks are there to do that, and that's the code I'm going to write. Right about now.

-
- Gold Sponsor
- Posts: 446
- Joined: Fri May 02, 2003 10:05 am
- Location: UK
-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
Yep, I haven't been lounging around enjoying the sunshine (actually it's raining) this afternoon. Here's the LOD swapping stuff, with a slightly more sensible 10,000 tree forest. Most of the frame-rate drop here is due to the fact that texture shadows are switched on, although you can't actually see that very well in this shot;

As you can see, there's quite a difference between the high and low LOD trees so the popping's quite evident. Obviously that can be fixed, but my art skills aren't quite up to the job!
In this shot, we're up a mountain quite near to the coast, which is why you can't see very far. Trust me, the draw distance is just the same as in the previous shot!

Last edited by monster on Thu Aug 05, 2004 10:02 am, edited 2 times in total.
-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
Currently the forest is built up with the code below.
The vector parameter to the constructor is the size of the world that you want to cover; the next two parameters are the number of ground cover "blocks" in the X and Z directions.

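Code:
// Cover a 2052 x 2052 world (height 100) with 32 x 32 ground cover blocks.
GroundCover* _ground_cover = new GroundCover(_scene_manager, Vector3(2052, 100, 2052), 32, 32);

// Scatter 10,000 trees at random positions, snapped to the terrain height.
for (int t = 0; t < 10000; t++)
{
    Vector3 vec(rand() % 2052, 100.0, rand() % 2052);
    vec.y = heightAt(vec);
    // High-poly mesh first, then the low-poly version.
    _ground_cover->add(vec, "Oak_Tree.mesh", "Oak_Bill.mesh");
}

// Merge the low-poly meshes into one big ground cover mesh.
_ground_cover->compile();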
You call "add" a load of times to add meshes to the scene, the first one is the high-poly version, the second should be low-poly (but doesn't have to be!).
Then, when you call "compile" it creates a mesh to hold all the ground cover. Then it takes the low-poly versions of all the meshes you added and merges them into loads of submeshes (one for each block of ground cover) and attaches those submeshes to the big ground cover mesh, which in turn is attached to an entity and a scene node.
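The merge step can be pictured with a small engine-independent sketch: copy a source mesh into a merged buffer, add the item's world position to every vertex, and rebase the indices. `MeshData` and `appendBaked` are hypothetical names, and the sketch ignores rotation, scale, normals and texture coordinates, which a real compile step would also have to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

struct MeshData
{
    std::vector<Vec3> positions;        // one entry per vertex
    std::vector<unsigned int> indices;  // three entries per triangle
};

// Copies `source` into `merged`, translating every vertex by `worldPos` so
// the placement is baked into the vertex data, and offsetting the indices
// so they still point at the copied vertices.
void appendBaked(MeshData& merged, const MeshData& source, const Vec3& worldPos)
{
    assert(source.indices.size() % 3 == 0);
    const unsigned int base = static_cast<unsigned int>(merged.positions.size());
    for (std::size_t i = 0; i < source.positions.size(); ++i)
    {
        Vec3 p = { source.positions[i].x + worldPos.x,
                   source.positions[i].y + worldPos.y,
                   source.positions[i].z + worldPos.z };
        merged.positions.push_back(p);
    }
    for (std::size_t i = 0; i < source.indices.size(); ++i)
    {
        assert(source.indices[i] < source.positions.size());
        merged.indices.push_back(base + source.indices[i]);
    }
}
```

Calling this once per placed tree, with one `MeshData` per block, gives buffers that never need a per-tree transform at render time.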
When culling the ground cover I use very simple visibility and distance tests and simply switch off sub-entities of the big mesh when that block can't be seen.
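A minimal sketch of what such a per-block test might look like, assuming a regular grid of blocks on the ground plane and a normalised 2D view direction. The real tests are not shown in the post, so `Vec2`, `visibleBlocks` and the "not behind the camera" check are all assumptions; in Ogre the resulting flags could then be applied to each block's sub-entity, for example with `SubEntity::setVisible`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, z; };

// Returns one flag per block (row-major, blocksX * blocksZ entries).
// A block is kept when its centre is within farDistance (plus the block's
// bounding radius) of the camera and it is not entirely behind the camera.
// camDir is assumed to be a normalised view direction on the ground plane.
std::vector<bool> visibleBlocks(float worldX, float worldZ,
                                int blocksX, int blocksZ,
                                Vec2 camPos, Vec2 camDir, float farDistance)
{
    assert(blocksX > 0 && blocksZ > 0 && farDistance > 0.0f);
    const float blockW = worldX / blocksX;
    const float blockD = worldZ / blocksZ;
    const float radius = 0.5f * std::sqrt(blockW * blockW + blockD * blockD);
    const float reach = farDistance + radius;

    std::vector<bool> visible(static_cast<std::size_t>(blocksX) * blocksZ, false);
    for (int bz = 0; bz < blocksZ; ++bz)
    {
        for (int bx = 0; bx < blocksX; ++bx)
        {
            // Vector from the camera to the centre of this block.
            const float dx = (bx + 0.5f) * blockW - camPos.x;
            const float dz = (bz + 0.5f) * blockD - camPos.z;
            const bool inRange = dx * dx + dz * dz <= reach * reach;
            const bool notBehind = dx * camDir.x + dz * camDir.z >= -radius;
            visible[static_cast<std::size_t>(bz) * blocksX + bx] = inRange && notBehind;
        }
    }
    return visible;
}
```

For the 2052 x 2052 world with 32 x 32 blocks this is 1024 cheap tests per frame, with no scene nodes involved.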
If a block is very close to the camera I switch it off and replace it with another scene node that's configured to have high-poly versions of the meshes in the same place as the low-poly versions. I keep an automatically expanding pool of these that get reused by different blocks as the camera moves.
These high-poly versions of the blocks have a single scene node that's moved to be the center of the block that's using it, and the child scene nodes under that are moved around and switched on and off according to the meshes that are supposed to be in the block. There's a high-water mark number of nodes maintained according to the maximum number of each different mesh in each block, so it's not continually creating and deleting nodes. Each block just switches on and repositions the ones it needs and switches off the rest.
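The reuse scheme described above is essentially an object pool that only ever grows. A minimal sketch, with `DetailNode` standing in for the high-poly scene node of a block and `DetailNodePool` as a hypothetical name (this is not the poster's class):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a scene node that shows the high-poly meshes of one block.
struct DetailNode { bool inUse; int blockIndex; };

class DetailNodePool
{
public:
    // Hands out a free node for `blockIndex`, creating a new one only when
    // every existing node is busy.
    std::size_t acquire(int blockIndex)
    {
        for (std::size_t i = 0; i < nodes_.size(); ++i)
        {
            if (!nodes_[i].inUse)
            {
                nodes_[i].inUse = true;
                nodes_[i].blockIndex = blockIndex;
                return i;
            }
        }
        DetailNode node = { true, blockIndex };
        nodes_.push_back(node);
        return nodes_.size() - 1;
    }

    // Marks a node as free for reuse; nodes are never destroyed.
    void release(std::size_t handle)
    {
        assert(handle < nodes_.size());
        nodes_[handle].inUse = false;
        nodes_[handle].blockIndex = -1;
    }

    // The largest number of nodes that were ever needed at once.
    std::size_t highWaterMark() const { return nodes_.size(); }

    int blockOf(std::size_t handle) const
    {
        assert(handle < nodes_.size());
        return nodes_[handle].blockIndex;
    }

private:
    std::vector<DetailNode> nodes_;
};
```

As the camera moves, a block leaving the near range would call `release` and a block entering it would call `acquire`, so allocation stops once the pool has reached its high-water mark.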
There's no reason why you couldn't have different GroundCover objects for different things, with different block parameters and clipping distances. Grass, for example, might never have a high-poly version and could get culled much closer to the camera than stuff in a separate tree object.
Not quite Far Cry just yet. But I'm reasonably happy with it for a couple of days' work.
Now, if someone wants to give me some decent tree and other ground cover artwork...

-
- OGRE Expert User
- Posts: 286
- Joined: Fri Nov 01, 2002 3:54 pm
- Location: Berlin & Nuremberg, Germany
- x 1
-
- OGRE Retired Team Member
- Posts: 1263
- Joined: Wed Sep 24, 2003 4:00 pm
- Location: Halifax, Nova Scotia, Canada
-
- OGRE Retired Team Member
- Posts: 19269
- Joined: Sun Oct 06, 2002 11:19 pm
- Location: Guernsey, Channel Islands
- x 66
-
- Halfling
- Posts: 55
- Joined: Thu Mar 04, 2004 7:08 pm
- Location: Belgium
Interesting. A couple of questions regarding this, to get some insight:
Is it mainly the scene node count that is limiting, or the number of mesh entities? I.e. is it each object having its own transformation matrix, or is it more the render state switches that limit it?
Is there much of a limiting overhead if you add the same entity mesh multiple times to one scene node (this is what you do here, right?), or would it improve even further when it's one big mesh (that gets culled somehow)? The meshes would have the same render state switches after all.
As far as I understand, the main switch is because of different textures; are there others? Would it help if you separate the trunks into one batch and the leaves into another, given that those have different materials? Or do the render queues group these per submesh automatically?
As a side note, Unreal Engine 3 is said to handle 1000 - 5000 objects... [edit]... total count of objects in a map; in view they keep it around 300-1000[/edit]
-
- OGRE Community Helper
- Posts: 1098
- Joined: Mon Sep 22, 2003 2:40 am
- Location: Melbourne, Australia
Quote: Is it mainly the scene node count that is limiting, or the number of mesh entities?
Seems to be, yes.
Quote: Is there much of a limiting overhead if you add the same entity mesh multiple times to one scene node
No, but you can only move it around if it's attached to a child scene node. If you just attach a number of entities, all using the same mesh, to a single scene node then they'll all appear in the same place, won't they?
Quote: this is what you do here, right?
Not quite. I have one scene node, with one attached entity, for all the low-poly trees. The mesh that that entity uses is built up out of lots of submeshes, each of which represents a smaller "block" of ground cover. The "compiler" takes the low-poly mesh (that would get reused if you just used it in different entities) and copies it multiple times into the submeshes for each block, but changes the actual vertex positions to hardwire in the item's position. The reason that there are lots of submeshes is so I can switch them on and off (via the associated subentity) and replace them with the high-poly versions when the camera's close.
The compiler creates one submesh for each material that's needed in each block, so if you've got 4 different types of tree you'll need a maximum of 4 submeshes per block. But it'll only create the ones it needs. So, if I merge together the textures I use for my low-poly trees I'll minimise the amount of submeshes I need. But first I'll worry about getting them looking decent!
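The "one submesh per material per block" rule amounts to grouping the placed items by a (block, material) key. A small sketch of that grouping, with `Placement` and `submeshBuckets` as hypothetical names; each key in the result stands for one submesh that would have to be created.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One placed item: the block it falls in and the material its mesh uses.
struct Placement { int block; std::string material; };

typedef std::map<std::pair<int, std::string>, std::size_t> BucketMap;

// Counts the items in each (block, material) pair. Only pairs that are
// actually used appear in the map, so unused combinations cost nothing.
BucketMap submeshBuckets(const std::vector<Placement>& items)
{
    BucketMap buckets;
    for (std::size_t i = 0; i < items.size(); ++i)
    {
        assert(items[i].block >= 0);
        ++buckets[std::make_pair(items[i].block, items[i].material)];
    }
    return buckets;
}
```

If all the low-poly trees shared one merged texture (and so one material), every block would collapse to a single bucket, which is the saving being discussed.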
Quote: Would it help if you separate the trunks into one batch and the leaves into another, given that those have different materials?
Actually, the branches and the trunk are all one texture, so there's only a switch when rendering different types of tree. As with the low LOD geometry, I may well try and collapse all my ground cover textures into a single texture. But I'm not sure the performance improvement would be worth it. Ogre optimises texture (and other state) switches itself, so I believe.
Quote: in view they keep it around 300-1000
Pah! I laugh in the face of your 1000 objects!

-
- OGRE Expert User
- Posts: 286
- Joined: Fri Nov 01, 2002 3:54 pm
- Location: Berlin & Nuremberg, Germany
- x 1