Rendering A LOT of things

monster
OGRE Community Helper
Posts: 1098
Joined: Mon Sep 22, 2003 2:40 am
Location: Melbourne, Australia

Post by monster »

I'm wanting to render scenes with lots of entities in them, like thousands of trees. Obviously I can't have thousands of individual entities, even if I use LOD to reduce the distant ones to just a couple of triangles.

Currently I'm thinking that I can use my map editor to place trees (rocks, etc) individually and then "compile" groups of trees into a much smaller number of bigger meshes. And also create a low LOD version in the same way.

However, some preliminary tests show that I'm hitting the performance limit at only a couple of hundred entities. That's kind of OK, since I can just use fewer and fewer entities and make them bigger and bigger. Taken to the extreme, I could even just have a single massive "foliage" mesh!

But I assume Ogre culls stuff and applies LODs at an entity level, meaning that my massive foliage mesh would always get rendered in its entirety.

So, what that leads me to is that I'm going to have to compile my groups of trees into a very small number of big entities, "bake in" the world positions, and then switch entities on and off myself manually to perform culling and LOD switching.

Does that make sense, or is there an easier way that I've missed?
Kencho
OGRE Retired Moderator
Posts: 4011
Joined: Fri Sep 19, 2003 6:28 pm
Location: Burgos, Spain

Post by Kencho »

I can only suggest that you group your trees in such a way that they are "packed" into octree nodes (I assume you're using an octree to describe your scene).
Another worthwhile performance option is to make them static geometry (trees don't usually change...)
Lastly, replacing your meshes with billboards at a far distance usually works...

Just my ideas/suggestions. I don't know if they're good enough... :?

Post by monster »

Thanks for the suggestions.
> group your trees in a way they are "packed" in octree nodes
Yep, that's what I'm currently doing. But I guess the overhead of having to maintain (i.e. transform, cull, apply LODs) a couple of hundred entities, even if they're not eventually visible, is what's killing it. In my test app there are only 2 small triangles per entity, so I don't think it's running out of tri-power or fill-rate.
> Another worthwhile performance option is to make them static geometry
That's kind of what I'm proposing: "compiling" a number of individual tree meshes into a single vertex and index array, so whole chunks of the forest become single entities. But as I say, if I make those chunks very big (as it's looking like I'll have to) then they won't get culled as vigorously as they could and I would probably have to apply the LODs manually too.
> replacing your meshes with billboards at a far distance
Yep, again. That would be part of the "baking" process: at the same time as I'm compiling the chunks into single entities I'd build a low-poly version for the distance.

I guess what I'm asking is: is there a way to "freeze" the position of an entity, i.e. create it and transform it but then lock it, so that it gets culled and LOD'd but doesn't incur all the transformation overhead every frame?

I would have thought the octree would do this; if forest chunks A through G are in octree node Z, and I can't see that octree node, then I don't need to bother doing anything with any of the chunks in that node. But I guess that every frame it's transforming the chunks to make sure they're in the right nodes?
Nek
Gnoblar
Posts: 21
Joined: Mon Aug 18, 2003 6:06 pm
Location: Houston, Tx

Post by Nek »

This may sound silly, but why don't you group a series of trees (like 50) together as one mesh and have them all share a material using UV mapping? This will increase performance, since you won't have a TON of tiny objects; Ogre (and any modern engine) has more trouble with tons of nodes than with a few nodes that have lots of polygons. If you are using optimal modeling techniques to minimize polygons, this should be a good performance booster, especially for an octree. Obviously, this will be a bit tricky if you want to animate these objects, but it could be done.
Last edited by Nek on Tue Aug 03, 2004 12:48 am, edited 1 time in total.
PeterNewman
Greenskin
Posts: 128
Joined: Mon Jun 21, 2004 2:34 am
Location: Victoria, Australia

Post by PeterNewman »

I'm just wondering if it's that you've only got 2 triangles per entity, so it's 2 triangles per render-op, with all the associated overhead of a render-op?

Something similar to the forum thread about rendering 100 boxes on the screen dragging the framerate down, because each box was a separate render-op.

Post by monster »

Thanks for the suggestions chaps.
> I'm just wondering if it's that you've only got 2 triangles per entity, so it's 2 triangles per render-op, with all the associated overhead of a render-op?
Yeah, that's currently just a test. At the moment it's just rendering 2 triangles where eventually I'm hoping to have a whole group of trees. I doubt it's going to get quicker when I start adding more polys to each group of trees!
> This may sound silly, but why don't you group a series of trees (like 50) together as one mesh and have them all share a material using UV mapping.
Not silly at all. That's exactly my plan!

With 256 entities (i.e. 512 tris, 2 each) I get around 95fps; if I bump that up to 512 entities then I get around 50fps. If I notch it up to 1024 entities then the whole thing grinds to a halt, like 0.04fps! That's despite the fact that the vast majority of those aren't visible from my camera position. Obviously there's some overhead involved in checking whether nodes are visible before they're rendered, and that overhead doesn't seem to scale linearly.
Borundin
Platinum Sponsor
Posts: 243
Joined: Fri Oct 03, 2003 5:57 am
Location: Sweden

Post by Borundin »

Interesting topic. Sounds like a lot of time is being spent on the CPU? 1024 render ops might not be optimal performance-wise for the GPU, but it sure shouldn't bring your fps down to 0.04 :shock:
Have you tried profiling the code to see exactly where all this time is spent? Assuming CPU is the bottleneck of course...
SpannerMan
Gold Sponsor
Posts: 446
Joined: Fri May 02, 2003 10:05 am
Location: UK

Post by SpannerMan »

I've always wondered how the hell other apps manage to do so many graphical objects at the same time. Like Far Cry. Or SpeedTree, which has loads of animated trees. Crazy.
sinbad
OGRE Retired Team Member
Posts: 19269
Joined: Sun Oct 06, 2002 11:19 pm
Location: Guernsey, Channel Islands

Post by sinbad »

Because they don't render them as several separate objects. :)

Just like Monster is suggesting, they pack the objects into as small a number of massive meshes as they can get away with, whilst still getting reasonable culling. Then they use vertex programs to make them sway - a couple of sum-of-sin waves based on the world positions and heights is enough to get a pretty convincing sway.
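
To make the sum-of-sines idea a bit more concrete, here is a minimal CPU-side sketch in C++ of the kind of offset a vertex program might compute per vertex. The names `Sway` and `swayOffset`, and every frequency and amplitude in it, are made up for illustration; none of this comes from Ogre or from any shipped game.

```cpp
#include <cmath>

// Hypothetical result type: horizontal displacement to add to a vertex.
struct Sway { float dx; float dz; };

// Hypothetical helper, not an Ogre API. All constants are arbitrary.
Sway swayOffset(float worldX, float worldZ, float heightAboveBase, float timeSeconds)
{
    // Derive a phase from the world position so that neighbouring
    // trees do not all move in lockstep.
    const float phase = worldX * 0.13f + worldZ * 0.07f;

    // Sum of two sine waves with different frequencies; the weights
    // add up to 1, so the result stays within [-1, 1].
    const float wave = 0.6f * std::sin(1.1f * timeSeconds + phase)
                     + 0.4f * std::sin(2.7f * timeSeconds + 1.7f * phase);

    // Scale by height so the base of the tree stays planted and the
    // top moves the most.
    const float amplitude = 0.05f * heightAboveBase;
    return Sway{ amplitude * wave, 0.5f * amplitude * wave };
}
```

In a real setup the same arithmetic would presumably live in the vertex program, with time passed in as a shader parameter, so the packed mesh itself never has to be touched on the CPU.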

@Borundin: 1024 render ops is excessive. GPUs hate lots of render ops because way too much time is proportionately spent in set-up, both in CPU as part of the GL/D3D implementation and in the CPU element of the video driver. GPUs are heavily pipelined, and are best cruising at high-speed. Chopping and changing render ops is like changing them down into 1st gear at every corner ;)
bad_camel
Halfling
Posts: 74
Joined: Tue Dec 17, 2002 11:57 am
Location: Somerset, England

Post by bad_camel »

I was also gonna say, before you make them static world geom, remember you prolly gonna wanna make them sway and stuff. Looks to me like you're gonna have to use combinations of these techniques.
AssiDragon
Greenskin
Posts: 145
Joined: Wed Apr 28, 2004 12:10 pm
Location: Hungary

Post by AssiDragon »

What if you just merged the trees and other stuff with the terrain? Then they'd get culled too (it would be fast, wouldn't it?) :?

Post by monster »

> before you make them static world geom, remember you prolly gonna wanna make them sway and stuff.
I'd just like to get them displayed first!
:D

Current (very) preliminary tests seem to indicate that I'll be able to maintain something like 25,000 trees at a time without paging or any of that sort of nonsense, but we'll see how we go.

Then I'll start worrying about getting them swaying!

I was also concerned about the amount of VRAM that I was chewing up, but I think I've found a way around that.
> What if you just merged the trees and other stuff with the terrain?
Yes. And the code that I can cut'n'paste to do that would be...?
;)

Hopefully what I'm aiming for will do something similar, but in a bolt-on way, so I don't have to muck up the terrain scene managers and so it should be possible to use it elsewhere.

Thanks for all the suggestions chaps.

Post by Kencho »

I think that having large, not easily culled geometries is much better than culling each tree every time. I was quite surprised when I saw for myself how smoothly Ogre handles a large number of vertices compared with a large number of entities. My suggestion is to merge the trees in packs that fit perfectly in the octree nodes.

BTW: if you're using two tris per tree, you can use BillboardSets instead, to do fast tests ;)

Post by monster »

> I think that having large not-culled (easily) geometries is much better than culling each tree every time ... My suggestion is to merge the trees in packs that fit perfectly in the octree nodes.
I think you're right. You can easily throw one 100,000 polygon mesh at the card, but if you use 1000 entities of 100 polygons each then it all grinds to a halt!

In fact, I'm ignoring the octree culling entirely, since that just requires too many entities (one per node, over a thousand in my scene) and I'm using a much simpler (i.e. faster) specialised culling mechanism.
> if you're using two tris per tree, you can use BillboardSets instead
Actually, at the low LOD, it's 4 tris per tree, two intersecting quads. I think that using billboards might incur an overhead since each one needs to be oriented to face the screen every time the camera moves. With the intersecting quads I can just build them once in a static position and never touch them again.
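
A rough sketch of the two-intersecting-quads layout described above, in plain C++ with no Ogre types. `Vec3`, `CrossedQuads` and `makeCrossedQuads` are hypothetical names chosen for this example; a real version would also need texture coordinates and, presumably, a material that draws both faces of each quad.

```cpp
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical container: 8 vertices and 12 indices (4 triangles).
struct CrossedQuads {
    std::vector<Vec3> vertices;
    std::vector<std::uint16_t> indices;
};

// Builds two upright quads that cross at right angles over `base`.
// The world position is baked straight into the vertices, so nothing
// has to be re-oriented when the camera moves.
CrossedQuads makeCrossedQuads(const Vec3& base, float halfWidth, float height)
{
    CrossedQuads q;
    q.vertices = {
        // First quad, lying along the X axis.
        Vec3{base.x - halfWidth, base.y,          base.z},
        Vec3{base.x + halfWidth, base.y,          base.z},
        Vec3{base.x + halfWidth, base.y + height, base.z},
        Vec3{base.x - halfWidth, base.y + height, base.z},
        // Second quad, lying along the Z axis.
        Vec3{base.x, base.y,          base.z - halfWidth},
        Vec3{base.x, base.y,          base.z + halfWidth},
        Vec3{base.x, base.y + height, base.z + halfWidth},
        Vec3{base.x, base.y + height, base.z - halfWidth},
    };
    // Two triangles per quad.
    q.indices = {0, 1, 2,  0, 2, 3,  4, 5, 6,  4, 6, 7};
    return q;
}
```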

Unless I wanted them to waft about in the breeze!
(which, actually, using a vertex program probably isn't that hard. Famous last words!)
;)

Post by monster »

And, just to prove that I'm not talking complete cobblers, here's a shot of me driving (my buggy, simulated with OgreOde, natch!) around in a 20,000 tree forest;
Image
That's actually far, far more trees than I'd really need, and with a much further far clip (for the forest, independent of any other clipping planes, Octrees, etc) than necessary. But I think it proves the concept.

OK, so the low poly tree model is a bit dodgy at the moment, but that's nothing that can't be fixed. And, although it doesn't show here, it works just as well with different meshes, all your trees don't have to be the same, in fact they don't even need to be trees!

I still need to swap in the high-detail meshes for close up stuff, but all the hooks are there to do that, and that's the code I'm going to write. Right about now.

Post by SpannerMan »

Whoaa, that's a forest and a half.

Monster, can you elaborate more on the nitty gritty of how you achieved this please?

Post by monster »

Yep, I haven't been lounging around enjoying the sunshine (actually it's raining) this afternoon. Here's the LOD swapping stuff, with a slightly more sensible 10,000 tree forest. Most of the frame-rate drop here is due to the fact that texture shadows are switched on, although you can't actually see that very well in this shot;
Image
As you can see, there's quite a difference between the high and low LOD trees so the popping's quite evident. Obviously that can be fixed, but my art skills aren't quite up to the job!

In this shot, we're up a mountain quite near to the coast, which is why you can't see very far. Trust me, the draw distance is just the same as in the previous shot!
Last edited by monster on Thu Aug 05, 2004 10:02 am, edited 2 times in total.

Post by monster »

Currently the forest is built up with;

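Code:

// Cover a 2052 x 2052 world with a 32 x 32 grid of ground cover blocks.
GroundCover* _ground_cover = new GroundCover(_scene_manager, Vector3(2052, 100, 2052), 32, 32);
for (int t = 0; t < 10000; t++)
{
	// Pick a random spot, then set its height from heightAt().
	Vector3 vec(rand() % 2052, 100.0, rand() % 2052);
	vec.y = heightAt(vec);
	// High-poly mesh first, low-poly mesh second.
	_ground_cover->add(vec, "Oak_Tree.mesh", "Oak_Bill.mesh");
}
// Merge the low-poly meshes into the per-block submeshes.
_ground_cover->compile();

The vector parameter to the constructor is the size of the world that you want to cover, the next 2 parameters are the number of ground cover "blocks" in the X and Z directions.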

You call "add" a load of times to add meshes to the scene, the first one is the high-poly version, the second should be low-poly (but doesn't have to be!).

Then, when you call "compile" it creates a mesh to hold all the ground cover. Then it takes the low-poly versions of all the meshes you added and merges them into loads of submeshes (one for each block of ground cover) and attaches those submeshes to the big ground cover mesh, which in turn is attached to an entity and a scene node.
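
The "compile" step above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual GroundCover source: `Vec3`, `Geometry`, `blockIndexFor` and `appendInstance` are hypothetical names, the world is assumed to start at the origin, and normals, texture coordinates and materials are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical stand-in for one block's merged vertex and index data.
struct Geometry {
    std::vector<Vec3> vertices;
    std::vector<std::uint32_t> indices;
};

// Maps a world position to a block in a blocksX x blocksZ grid that
// covers [0, worldSizeX) x [0, worldSizeZ).
std::size_t blockIndexFor(const Vec3& p, float worldSizeX, float worldSizeZ,
                          std::size_t blocksX, std::size_t blocksZ)
{
    assert(p.x >= 0.0f && p.x < worldSizeX && p.z >= 0.0f && p.z < worldSizeZ);
    const std::size_t bx = std::min(blocksX - 1,
        static_cast<std::size_t>(p.x / worldSizeX * static_cast<float>(blocksX)));
    const std::size_t bz = std::min(blocksZ - 1,
        static_cast<std::size_t>(p.z / worldSizeZ * static_cast<float>(blocksZ)));
    return bz * blocksX + bx;
}

// Copies `mesh` into `block`, baking `position` into every vertex and
// shifting the indices so they point at the newly appended vertices.
void appendInstance(Geometry& block, const Geometry& mesh, const Vec3& position)
{
    const std::uint32_t base = static_cast<std::uint32_t>(block.vertices.size());
    for (const Vec3& v : mesh.vertices)
        block.vertices.push_back(Vec3{v.x + position.x, v.y + position.y, v.z + position.z});
    for (std::uint32_t i : mesh.indices)
        block.indices.push_back(base + i);
}
```

With a 2052-unit world and a 32 x 32 grid, a tree at (2051, y, 2051) lands in the last block (index 1023), and each block's `Geometry` would then become one submesh.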

When culling the ground cover I use very simple visibility and distance tests and simply switch off sub-entities of the big mesh when that block can't be seen.

If a block is very close to the camera I switch it off and replace it with another scene node that's configured to have high-poly versions of the meshes in the same place as the low-poly versions. I keep an automatically expanding pool of these that get reused by different blocks as the camera moves.
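
The exact tests aren't spelled out here, so the following is only one plausible shape for the per-block decision: a far-clip distance test, a crude "is it entirely behind the camera" test, and a near distance at which the high-poly version takes over. `BlockState`, `classifyBlock` and all the parameters are hypothetical names for this sketch.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

enum class BlockState { Hidden, LowDetail, HighDetail };

// Hypothetical classifier. `cameraDir` is assumed to be a unit vector
// and `blockRadius` a bounding-sphere radius for the block.
BlockState classifyBlock(const Vec3& blockCentre, float blockRadius,
                         const Vec3& cameraPos, const Vec3& cameraDir,
                         float highDetailDistance, float farClipDistance)
{
    const float dx = blockCentre.x - cameraPos.x;
    const float dy = blockCentre.y - cameraPos.y;
    const float dz = blockCentre.z - cameraPos.z;
    const float distance = std::sqrt(dx * dx + dy * dy + dz * dz);

    // Distance test: the whole block lies beyond the ground cover far clip.
    if (distance - blockRadius > farClipDistance)
        return BlockState::Hidden;

    // Visibility test: the whole bounding sphere is behind the camera.
    const float along = dx * cameraDir.x + dy * cameraDir.y + dz * cameraDir.z;
    if (along < -blockRadius)
        return BlockState::Hidden;

    // Close blocks are swapped for the high-poly version.
    if (distance <= highDetailDistance)
        return BlockState::HighDetail;

    return BlockState::LowDetail;
}
```

A `Hidden` result would correspond to switching the block's sub-entity off, `LowDetail` to leaving it on, and `HighDetail` to switching it off and borrowing a high-poly node from the pool.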

These high-poly versions of the blocks have a single scene node that's moved to be the center of the block that's using it, and the child scene nodes under that are moved around and switched on and off according to the meshes that are supposed to be in the block. There's a high-water mark number of nodes maintained according to the maximum number of each different mesh in each block, so it's not continually creating and deleting nodes. Each block just switches on and repositions the ones it needs and switches off the rest.
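
The reuse pattern can be sketched as a small pool that only ever grows to its high-water mark. `NodePool` is a hypothetical class for illustration, and the plain integer ids stand in for whatever scene node handles the real code keeps.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pool: ids stand in for pooled high-poly scene nodes.
class NodePool {
public:
    // Hands out a previously released id if one exists; otherwise
    // "creates" a new one, raising the high-water mark.
    std::size_t acquire()
    {
        if (!free_.empty()) {
            const std::size_t id = free_.back();
            free_.pop_back();
            return id;
        }
        return created_++;
    }

    // Returns an id to the pool; nothing is destroyed.
    void release(std::size_t id) { free_.push_back(id); }

    // Number of nodes ever created (the high-water mark).
    std::size_t created() const { return created_; }

private:
    std::vector<std::size_t> free_;
    std::size_t created_ = 0;
};
```

As the camera moves, a block that drops out of high-detail range would call `release`, and the next block to come into range gets the same node back from `acquire` instead of creating a new one.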

There's no reason why you couldn't have different GroundCover objects for different things, with different block parameters and clipping distances. For example, grass might never have a high-poly version and could get culled much closer to the camera than the stuff in a separate tree object.

Not quite Far Cry just yet. But I'm reasonably happy with it for a couple of days' work.

Now, if someone wants to give me some decent tree and other ground cover artwork...
;)
psyclonist
OGRE Expert User
Posts: 286
Joined: Fri Nov 01, 2002 3:54 pm
Location: Berlin & Nuremberg, Germany

Post by psyclonist »

Great work! I'd love to see this in action.

If you need any help just shout.

I have a tree modelling package. The trees look quite nice. I'll see if I can export a few for you to use ;)

-psy
nfz
OGRE Retired Team Member
Posts: 1263
Joined: Wed Sep 24, 2003 4:00 pm
Location: Halifax, Nova Scotia, Canada

Post by nfz »

WOW :!: That method is definitely the way to go for lots of static objects. Very impressive screenshots, Monster.

Post by sinbad »

Nice! :)
Cyberdigitus
Halfling
Posts: 55
Joined: Thu Mar 04, 2004 7:08 pm
Location: Belgium

Post by Cyberdigitus »

Interesting. A couple of questions regarding this, to get some insights:

Is it mainly the scene node count that is limiting, or the number of mesh entities? I.e. is it each object having its own transformation matrix, or is it more the render state switches that limit it?

Is there much of a limiting overhead if you add the same entity mesh multiple times to one scene node (this is what you do here, right?), or would it improve even further if it were one big mesh (that gets culled somehow)? The meshes would have the same render state switches after all.

As far as I understand, the main switch is because of different textures; are there others? Would it help if you separate the trunks into one batch and the leaves into another, given that those have different materials? Or do the render queues group these per submesh automatically?

As a side note, Unreal Engine 3 is said to handle 1000 - 5000 objects... [edit]... total count of objects in a map; in view they keep it around 300-1000[/edit]
. . .

Post by monster »

> Is it mainly the scenenode count that is limiting or the amount of mesh entities?
Seems to be, yes.
> is there much of a limiting overhead if you add the same entity mesh multiple times to one scenenode
No, but you can only move it around if it's attached to a child scene node. If you just attach a number of entities, all using the same mesh, to a single scene node then they'll all appear in the same place, won't they?
> this is what you do here right?
Not quite. I have one scene node, with one attached entity, for all the low-poly trees. The mesh that that entity uses is built up out of lots of submeshes, each of which represents a smaller "block" of ground cover. The "compiler" takes the low-poly mesh (that would get reused if you just used it in different entities) and copies it multiple times into the submeshes for each block, changing the actual vertex positions to hard-wire in the item's position. The reason that there are lots of submeshes is so I can switch them on and off (via the associated subentity) and replace them with the high-poly versions when the camera's close.

The compiler creates one submesh for each material that's needed in each block, so if you've got 4 different types of tree you'll need a maximum of 4 submeshes per block. But it'll only create the ones it needs. So, if I merge together the textures I use for my low-poly trees I'll minimise the amount of submeshes I need. But first I'll worry about getting them looking decent!
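
One way to picture the "one submesh per material per block, but only the ones it needs" rule is a table keyed by (block, material). This is a hypothetical bookkeeping sketch, not the real compiler; `SubmeshTable` and its methods are invented names, and the per-key count stands in for the merged geometry.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical table: one entry per (block index, material name) pair
// that actually has something in it.
class SubmeshTable {
public:
    // Records one item; an entry is created only on first use.
    void addItem(std::size_t block, const std::string& material)
    {
        ++counts_[std::make_pair(block, material)];
    }

    // Number of submeshes that would be created.
    std::size_t submeshCount() const { return counts_.size(); }

    // Number of items merged into one particular submesh.
    std::size_t itemsIn(std::size_t block, const std::string& material) const
    {
        const auto it = counts_.find(std::make_pair(block, material));
        return it == counts_.end() ? 0 : it->second;
    }

private:
    std::map<std::pair<std::size_t, std::string>, std::size_t> counts_;
};
```

Merging the low-poly textures into a single material would collapse every block's entries into one key, which is the saving described above.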
> would it help if you separate the trunks in one batch and the leaves in another, given that those have different materials
Actually, the branches and the trunk are all one texture, so there's only a switch when rendering different types of tree. As with the low LOD geometry, I may well try and collapse all my ground cover textures into a single texture. But I'm not sure the performance improvement would be worth it. Ogre optimises texture (and other state) switches itself, so I believe.
> in view they keep it around 300-1000
Pah! I laugh in the face of your 1000 objects!
:D

Post by psyclonist »

Now if you could make the trees sway... 8)
( Maybe that's possible with a vertex program and a bit of additional math on the GPU ... *thinks* )

-psy
Robomaniac
Hobgoblin
Posts: 508
Joined: Tue Feb 03, 2004 6:39 am

Post by Robomaniac »

:!: So Awesome, Want It :P

Is this a scene manager thing, or one of your own classes?