Project idea: Instancing & crowds

Threads related to Google Summer of Code
Crashy
Google Summer of Code Student
Posts: 997
Joined: Wed Jan 08, 2003 9:15 pm
Location: Lyon, France

Post by Crashy »

InstancedGeometry would allow you to add entities/scene nodes, and it would cache the info needed to instance them out. Maybe even make a high-level interface for both that has addEntity and addSceneNode defined, so the two can be paired in some conceptual sense. It's also nice because the two have similar goals (not identical, but close), and it would be a familiar system/interface to use.
In fact, an InstancedGeometry would contain only one type of mesh, not several as in StaticGeometry. Supporting several is possible, but instancing is really meant to draw the same single mesh many times.

I wouldn't worry about submeshes right now.
Later on, you could use BatchInstance to implement a more generic InstancedGeometry class (which would handle multiple submeshes by assigning one BatchInstance per submesh).
I'd prefer to do that before starting on animated meshes. :wink:

@syedhs: yes, there is a limit. With vertex shader 2.0 you only get 256 float4 constant registers. I use an array of 200 float4 for the instances, which leaves 56 float4 for the "eye-candy" part of the shader.
Note that for the moment I only displace vertices, I don't rotate them.
If I need to rotate vertices, I must use a float3x3 matrix per instance, and the maximum will be 65 matrices (because 65*3 = 195).
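
A minimal sketch of the register arithmetic above. The helper name `maxInstances` is made up for illustration; the 256/56/200 split is the one quoted in the post. Plain integer division gives 66 instances at 3 registers each, so the 65 quoted above leaves a few registers spare.

```cpp
#include <cassert>

// Vertex shader 2.0 constant budget as described in the post above.
constexpr int kTotalFloat4Registers = 256;
constexpr int kReservedForShading   = 56;  // the "eye-candy" part of the shader
constexpr int kInstanceBudget = kTotalFloat4Registers - kReservedForShading;

// Hypothetical helper: how many instances fit when each instance
// consumes `float4PerInstance` constant registers.
constexpr int maxInstances(int budget, int float4PerInstance) {
    return budget / float4PerInstance;
}

static_assert(kInstanceBudget == 200, "200 float4 left for per-instance data");
```

With one float4 per instance (translation only) the budget holds 200 instances; with three float4 per instance it holds at most 66.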


@pix: Thanks for the bug report, I hadn't seen it :)
Follow la Moustache on Twitter or on Facebook
Image

Praetor
OGRE Retired Team Member
Posts: 3335
Joined: Tue Jun 21, 2005 8:26 pm
Location: Rochester, New York, US

Post by Praetor »

So are you thinking of something completely divorced from StaticGeometry? It might be similar conceptually, but not share an interface at all? Perhaps it just has a setEntity call to set the entity to use, then addInstance to tell it to draw another copy of the already-set entity with its own specific info (position, rotation, etc.).

Down the road, how many objects do you think we'll be able to see with this system in the end? 50 tops? 60? I suppose having the fifty closest soldiers instanced and the rest, farther away, as impostors might be a decent tradeoff.

Project5
Goblin
Posts: 245
Joined: Mon Nov 22, 2004 11:56 pm
Location: New York, NY, USA

Post by Project5 »

I may be showing my lack of shader knowledge here, but why not use quaternions instead of rotation matrices to pass rotation information in?

Looking at your calculations, it would up the number of instanced objects a good deal.

--Ben

klauss
Hobgoblin
Posts: 559
Joined: Wed Oct 19, 2005 4:57 pm
Location: LS87, Buenos Aires, República Argentina.

Post by klauss »

Not really.
Matrices pack all transformation information (a 3x4 matrix, at least). Quaternions, though, only pack rotation. You'd still need scaling and translation to equal a matrix's power, which would wind up using 3 float4s. A 3x4 matrix also uses 3 float4s, so... no gain there.
Unless you don't use scaling, in which case you'd use only 2 float4s, and that would be somewhat better than a matrix. Still... quaternions aren't used on the hardware side because of lack of support (they introduce a lot of complexity since application has to be done using many more shader instructions - a 3x4 matrix is easy to apply: 3 dot()s, but applying a quaternion isn't that easy).
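
To make the instruction-count comparison concrete, here is a CPU-side sketch of both options, written the way a shader would compute them. All type and function names are hypothetical; the quaternion path assumes a unit quaternion stored as (x, y, z, w) and no scaling.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

inline float dot4(const Vec4& a, const Vec4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// 3x4 matrix stored as three float4 rows: one dot() per output component.
inline Vec3 applyMatrix3x4(const Vec4 rows[3], const Vec3& p) {
    const Vec4 h{p.x, p.y, p.z, 1.0f};
    return {dot4(rows[0], h), dot4(rows[1], h), dot4(rows[2], h)};
}

// Unit quaternion plus translation (two float4s): two cross products,
// a scale and several adds, i.e. noticeably more work than three dot()s.
inline Vec3 applyQuatTranslate(const Vec4& q, const Vec3& t, const Vec3& p) {
    const Vec3 u{q.x, q.y, q.z};
    const Vec3 c = cross(u, p);
    const Vec3 twice{2.0f * c.x, 2.0f * c.y, 2.0f * c.z};
    const Vec3 c2 = cross(u, twice);
    return {p.x + q.w * twice.x + c2.x + t.x,
            p.y + q.w * twice.y + c2.y + t.y,
            p.z + q.w * twice.z + c2.z + t.z};
}
```

Both functions give the same result for a rigid transform; the trade is one fewer float4 per instance against the extra arithmetic per vertex.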
Oíd mortales, el grito sagrado...
Hey! What is it with that that?
Wing Commander Universe

tuan kuranes
OGRE Retired Moderator
Posts: 2653
Joined: Wed Sep 24, 2003 8:07 am
Location: Haute Garonne, France

Post by tuan kuranes »

As a note: Nvidia proposed an NVX_instanced_arrays OpenGL extension, which might go into the next ARB extension and which does allow hardware instancing on OpenGL. It's available in the 91.28 beta drivers. They also made a paper/presentation on that particular topic (along with some SLI performance and usage considerations). They also state that their "pseudo instancing" is faster; it passes the per-instance transformation matrix through texcoord attributes instead of shader constants... might be worth investigating that point.

klauss
Hobgoblin
Posts: 559
Joined: Wed Oct 19, 2005 4:57 pm
Location: LS87, Buenos Aires, República Argentina.

Post by klauss »

Hey... that one seems nice:


for (int i = 0; i < nobjects; i++) {
    // send transformation as texture coordinates
    glMultiTexCoord4fv(GL_TEXTURE0, &transform_data[0][i*4]);
    glMultiTexCoord4fv(GL_TEXTURE1, &transform_data[1][i*4]);
    glMultiTexCoord4fv(GL_TEXTURE2, &transform_data[2][i*4]);

    // draw instance
    glDrawElements(GL_TRIANGLES, nindices, GL_UNSIGNED_SHORT, indices);
}
(though it would need RenderSystem support in Ogre)

What I like is that it doesn't have an instance limit.
And it doesn't need a fattened vertex buffer.
Very cool, I'd say.
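
A sketch of how `transform_data` in the loop above might be filled on the CPU side. The `Matrix3x4` type and the `packTransformRows` function are hypothetical names; only the `[row][i*4]` layout is taken from the snippet.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major 3x4 affine transform for one instance (hypothetical type).
using Matrix3x4 = std::array<std::array<float, 4>, 3>;

// One float array per matrix row; instance i occupies [i*4, i*4 + 3],
// which matches &transform_data[row][i*4] in the loop above.
using RowArrays = std::array<std::vector<float>, 3>;

RowArrays packTransformRows(const std::vector<Matrix3x4>& instances) {
    RowArrays rows;
    for (std::vector<float>& r : rows) {
        r.reserve(instances.size() * 4);
    }
    for (const Matrix3x4& m : instances) {
        for (std::size_t row = 0; row < 3; ++row) {
            rows[row].insert(rows[row].end(), m[row].begin(), m[row].end());
        }
    }
    return rows;
}
```

Because the per-instance data travels as vertex attributes rather than shader constants, the instance count is bounded only by the size of these arrays, not by the constant-register budget.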

Crashy... are you doing that? If not... shouldn't you?

EDIT: More on pseudo instancing for your reading pleasure.

Be careful, if you try/are using this, as I'm not sure it will perform that well on other drivers (I'm thinking ATI), as it depends on a driver optimization (they have fast implementations of glMultiTexCoord and glDrawElements).

Project5
Goblin
Posts: 245
Joined: Mon Nov 22, 2004 11:56 pm
Location: New York, NY, USA

Post by Project5 »

klauss wrote: Not really.
Matrices pack all transformation information (a 3x4 matrix, at least). Quaternions, though, only pack rotation. You'd still need scaling and translation to equal a matrix's power, which would wind up using 3 float4s. A 3x4 matrix also uses 3 float4s, so... no gain there.
Unless you don't use scaling, in which case you'd use only 2 float4s, and that would be somewhat better than a matrix. Still... quaternions aren't used on the hardware side because of lack of support (they introduce a lot of complexity since application has to be done using many more shader instructions - a 3x4 matrix is easy to apply: 3 dot()s, but applying a quaternion isn't that easy).
Unit quats don't contain scaling information, but I believe non-unit quats do:
http://www.ogre3d.org/phpBB2/viewtopic. ... highlight=
I'd need to brush up on my math to tell you how though :-)

Then you'd be down from 3 float4s (a 3x4 matrix) to 2 (a quat and x,y,z).
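
For reference, the math behind the non-unit quaternion idea: rotating with q v q* (without normalizing q) scales the result by |q|^2, so a quaternion of length sqrt(s) carries a uniform scale s. It cannot carry a different scale per axis. The sketch below uses hypothetical type names and the standard Hamilton product.

```cpp
#include <cassert>
#include <cmath>

struct Quat { float w, x, y, z; };
struct Vec3 { float x, y, z; };

// Hamilton product of two quaternions.
inline Quat mul(const Quat& a, const Quat& b) {
    return {a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
            a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
            a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
            a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w};
}

// Computes q * v * conj(q). For a non-unit q this rotates v and
// scales it uniformly by the squared length of q.
inline Vec3 rotateAndScale(const Quat& q, const Vec3& v) {
    const Quat conj{q.w, -q.x, -q.y, -q.z};
    const Quat r = mul(mul(q, Quat{0.0f, v.x, v.y, v.z}), conj);
    return {r.x, r.y, r.z};
}
```

For example, a 90 degree rotation about Z stored with length 2 maps (1, 0, 0) to (0, 4, 0): rotated, then scaled by 4.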

This is more food for thought than anything else, because I don't know enough about shaders to say if the added number of instanced objects would outweigh the complexity of using a quat.

--Ben

Vectrex
Ogre Magi
Posts: 1266
Joined: Tue Aug 12, 2003 1:53 am
Location: Melbourne, Australia

Post by Vectrex »

well I'd guess most projects probably wouldn't use scaling anyway to be honest. Quality modellers would cringe at the thought... unless you're talking asteroids ;)

klauss
Hobgoblin
Posts: 559
Joined: Wed Oct 19, 2005 4:57 pm
Location: LS87, Buenos Aires, República Argentina.

Post by klauss »

Why do you think scaling is so bad?
We use scaling all the time, since modellers don't always work in the same units as the game does. We encourage them to, but enforcing it is simply not worth the burden, since nowadays normalization is pretty much free.

(nVidia does say it's actually free on their hardware - they can, without performance impact, perform a half-precision normalization each GPU cycle, since they have a dedicated unit for that).

Besides, we use scaling while rendering far stellar bodies to reduce the dynamic range of objects rendered to screen. Otherwise, we'd have to cope with dynamic ranges that barely fit in double-precision floats. We divide the scene into three parts (close, medium, far) and use a different scaling for each, each with its own z-clear. Some would say that's messy, but it actually turned out pretty well, mostly on modern hardware with hierarchical z-buffers.

jacmoe
OGRE Retired Moderator
Posts: 20570
Joined: Thu Jan 22, 2004 10:13 am
Location: Denmark

Post by jacmoe »

I think scaling makes sense for some things, especially instanced *things*. :)
Imagine an asteroid field..
/* Less noise. More signal. */
Ogitor Scenebuilder - powered by Ogre, presented by Qt, fueled by Passion.
OgreAddons - the Ogre code suppository.

Vectrex
Ogre Magi
Posts: 1266
Joined: Tue Aug 12, 2003 1:53 am
Location: Melbourne, Australia

Post by Vectrex »

yeah an asteroid field of course.. but not a lot else ;) Sure, scaling is important sometimes, but I'm just throwing out there the idea that a choice of technique might be nice. If you aren't using scaling then you get a bonus of extra instances, but if you need scaling then cool, you just can't have as many *things* :)

ps I make my modeller model to game scale because I tend to have a lot of problems with coded scale and physics etc. Better to have the model right from the start. But of course I'm not dealing with interstellar scales :)

Kencho
OGRE Retired Moderator
Posts: 4011
Joined: Fri Sep 19, 2003 6:28 pm
Location: Burgos, Spain

Post by Kencho »

Well, I can think of some more examples of mandatory scaling:
- Forests
- Planets/Stars
- Population...
Scaling is a must in my opinion. It's not always needed, but assuming people won't need it is assuming too much :)
Image

Vectrex
Ogre Magi
Posts: 1266
Joined: Tue Aug 12, 2003 1:53 am
Location: Melbourne, Australia

Post by Vectrex »

hehe ok I confess.. *I* don't need it ;)

pix
Kobold
Posts: 28
Joined: Thu Sep 02, 2004 7:13 pm

Post by pix »

hey, the bigger bounding box problem is that it's only calculated based on the centres of the objects, and batchInstance doesn't know the size of the instanced object. it would be easier to get this starting bounding box if the createBatchRenderOp was a method of the batchInstance class somehow. although perhaps just a bounding radius would be more useful since it would work even if you were rotating or scaling the objects.
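
a rough sketch of the bounding-radius idea, assuming the batch knows the model-space bounding radius of its single mesh. `batchBounds`, `meshRadius` and `maxScale` are hypothetical names, not part of the posted code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min, max; };

// Box around the instance centres, padded by the mesh's bounding radius
// (times the largest uniform scale in use). Because a sphere is padded
// in, the box stays valid however the instances are rotated.
Aabb batchBounds(const std::vector<Vec3>& centres, float meshRadius,
                 float maxScale = 1.0f) {
    assert(!centres.empty());
    Aabb box{centres[0], centres[0]};
    for (const Vec3& c : centres) {
        box.min.x = std::min(box.min.x, c.x);
        box.min.y = std::min(box.min.y, c.y);
        box.min.z = std::min(box.min.z, c.z);
        box.max.x = std::max(box.max.x, c.x);
        box.max.y = std::max(box.max.y, c.y);
        box.max.z = std::max(box.max.z, c.z);
    }
    const float r = meshRadius * maxScale;
    box.min.x -= r; box.min.y -= r; box.min.z -= r;
    box.max.x += r; box.max.y += r; box.max.z += r;
    return box;
}
```

the box is looser than an exact per-instance fit, but it never clips an instance that has been rotated or uniformly scaled.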

btw, i'm still trying to get this working in GLSL but encountering numerous problems, mostly to do with the different way that parameters work in GLSL (no indexed parameters it seems). briefly i got fed-up with that aspect of GLSL (or at least the way it's implemented in OGRE) and went to CG but before i got it completely working i realised that CG has a really tight limit on the number of parameters you can pass in, so i didn't persist with it.

anyhow, if i haven't posted something by the time you need to start working on a GLSL version, let me know and i'll tell you how far i am.

[edit]
here is the code i have so far, in case someone is interested in telling me what i am doing wrong. it's made to be unpacked in the Samples dir, like Crashy's. look out, it overwrites the Makefile.am file there to add instancing to the list of subdirs.

http://pix.test.at/ogre/instancing_gl.tgz

this is the output i get at the moment:

Image

each instance seems to be just getting the last value (the camera is looking at 0,0,0 here). the polycount seems right, and if i set their positions mathematically based on their index (as mentioned in an earlier post) i can see that they are all there, just that they are all being drawn on top of each other.

[/edit]

pix.

pix
Kobold
Posts: 28
Joined: Thu Sep 02, 2004 7:13 pm

Post by pix »

wow, i'd just decided to abandon this if i couldn't get it working tonight, and whaddaya know? bang, 800 ninjas.

http://pix.test.at/ogre/instancing_gl.w ... 060612.tgz

Image

(to put the fps in context, it's running on a 2ghz pentium-m with a radeon mobility 9700)

either i'm going about setting up the parameters in a very strange way, or the way glsl parameters are handled in ogre doesn't make sense. i only ended up working out what to do by putting tracewrites in the ogre code and then coming up with a situation which would fool the code into doing the right thing at the opengl-level. i might make a separate post about that side of it to avoid hijacking this thread.

[edit] http://www.ogre3d.org/phpBB2/viewtopic. ... 208#155208 [/edit]

strange behaviour: if you turn on setDisplaySceneNodes, instead of the axes mesh, you get the batch-mesh (ie, 200 ninjas at 0,0,0) rendered with the emissive axes material.

pix.

Crashy
Google Summer of Code Student
Posts: 997
Joined: Wed Jan 08, 2003 9:15 pm
Location: Lyon, France

Post by Crashy »

As suggested by Kuranes, I'm trying to build instancing from the static geometry code; the only modification would be the addition of the index in the vertex buffer.

But I have some difficulty seeing exactly what the difference is between material buckets and geometry buckets.

Some other questions for the design:
- The static geometry can have multiple regions. I think that for instancing, only 1 region will be needed for one batch.

- Use GeometryBucket::getRenderOperation() to reuse the same render operation several times, once for each batch instance.


I think the only modifications I have to make in the static geometry code to turn it into instancing code are in the GeometryBucket::build() method and in the constructor of the geometry bucket, to add the index to the vertex declaration.

But now, where to add it :lol: Texcoord 1 is a good idea, sure. But I know that tangent vectors are not stored in the Tangent register, but in texcoord 1.
If the mesh has tangent vectors, I must offset the register used.
(I could also suggest using the Tangent register to store the index, but that is really strange, isn't it :lol: ).

Note that offsetting the register is doable, and not so hard I assume.

Vectrex
Ogre Magi
Posts: 1266
Joined: Tue Aug 12, 2003 1:53 am
Location: Melbourne, Australia

Post by Vectrex »

what do you think positioning instanced geo will look like in code? Since static needs to be unpacked to reposition things and then repacked, but I assume instanced would have a more direct approach similar to a normal entity?

tuan kuranes
OGRE Retired Moderator
Posts: 2653
Joined: Wed Sep 24, 2003 8:07 am
Location: Haute Garonne, France

Post by tuan kuranes »

Crashy wrote: -The static geometry can have multiple regions. I think that for instancing, only 1 region will be needed for one batch.
I would say it could still be of some use, especially considering users blindly dropping all their meshes into it without taking care of anything, but it would need a call to the spatial partitioning each time a mesh inside the instance batch is moved. It can be done afterwards as an option, no?
Crashy wrote: -use the GeometryBucket::getRenderOperation() to use multiple times the same render operation, for each batch instance.
The geometry bucket is indeed the "batch" instance. Buckets are just the containers. Each bucket has to have the same material, the same geometry format, the same LOD levels...
A material bucket would be a group of geometry buckets using the same material. (If they have different geometry configurations (vertex buffers) or different LODs, there can be more than 1 geometry bucket.)
Crashy wrote: where to add it
I would do both...
if no tangents in that geometry bucket => texcoord 1
if tangents in that geometry bucket => texcoord 2
no?
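
A tiny sketch of that rule, generalized to "use the first free texcoord set". The struct and function names are hypothetical, and it assumes tangents occupy one texcoord set placed right after the mesh's own UV sets, as described earlier in the thread.

```cpp
#include <cassert>

// Hypothetical summary of what a geometry bucket's vertex format already uses.
struct VertexFormatInfo {
    unsigned uvSets;    // texture coordinate sets used by the mesh itself
    bool hasTangents;   // tangents stored in the next texcoord set
};

// Returns the texcoord set where the per-vertex instance index would go:
// the first set not taken by UVs or tangents.
inline unsigned instanceIndexTexCoordSet(const VertexFormatInfo& fmt) {
    return fmt.uvSets + (fmt.hasTangents ? 1u : 0u);
}
```

For a mesh with a single UV set this gives texcoord 1 without tangents and texcoord 2 with them, matching the two cases listed above.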

Crashy
Google Summer of Code Student
Posts: 997
Joined: Wed Jan 08, 2003 9:15 pm
Location: Lyon, France

Post by Crashy »

tuan kuranes wrote: Can be done afterward as an option, no ?
Yes, why not.
tuan kuranes wrote: the Geometry bucket is indeed the "batch" instance. Buckets are just the containers. Each bucket has to have the same material, the same geometry format, the same LOD levels...
Mmhhhhh, but in the code, 1 bucket = 1 vertex buffer. So that's not really our goal, because all the instances of the same batch use the same vertex buffer.
But I can do it that way, that's not a heavy modification I think.

Just to be sure: if I add the same mesh 10 times (using only one material) to the static geometry, is this only 1 bucket, and not 10 buckets?

tuan kuranes wrote: Would do both...
if no tangent in that geometry bucket => texcoord 1
if tangents in that geometry bucket => texcoord 2
no ?
Mmh, yes, but the end user must know whether or not the mesh uses tangents, in order to change the register used in the shader. That was one of the problems I had before.

tuan kuranes
OGRE Retired Moderator
Posts: 2653
Joined: Wed Sep 24, 2003 8:07 am
Location: Haute Garonne, France

Post by tuan kuranes »

- Indeed, one geometry bucket (if there is no submesh with a different geometry format; think of pose-animated meshes).

- If the user doesn't want to touch shaders, he will use one of the 2 premade shaders (texcoord 1 or 2). If he handles shaders himself, he could/should be able to set the material per material bucket, and will therefore know which shader to apply: he calls MaterialBucket::getIndexTexcoordIndex(), and after that is able to call MaterialBucket::getMaterial() in order to apply his own shader...

Crashy
Google Summer of Code Student
Posts: 997
Joined: Wed Jan 08, 2003 9:15 pm
Location: Lyon, France

Post by Crashy »

Okay, interesting design. And it will be easier like that, because the index will be added at the end of the vertex declaration.
Thanks.

klauss
Hobgoblin
Posts: 559
Joined: Wed Oct 19, 2005 4:57 pm
Location: LS87, Buenos Aires, República Argentina.

Post by klauss »

Can't all the shaders and stuff from skeletal animation be used?
"Instance index" sounds a lot like "bone" index to me. It's like 1 weight shaders.

In fact, when I was thinking about it, I was actually thinking about implementing it using skeletal animation stuff - just make each instance be assigned to its own bone, and then manipulate the bone.

Just passing ideas.

Crashy: that would save you the work of having to mess with index buffers, no? Only multiply the buffer, and assign bones. Much easier than handling it all yourself. And it would centralize that too, so improvements in one automatically carry to the other. And, with some consideration, you could also support animated meshes like that - you only have to work carefully to synchronize multiple bones per instance. Any skinning-capable material would be instance-capable, and that would be cool too.

Crashy
Google Summer of Code Student
Posts: 997
Joined: Wed Jan 08, 2003 9:15 pm
Location: Lyon, France

Post by Crashy »

First batch created with the new code (modified from static geometry)
Image

For the moment it works for the rock only, but it will work with all the meshes soon.

Some design considerations:
-I suggest that batch creation automatically creates one batch instance, with all of the geometry buckets.
-After that, the user can call an "add batch instance" method that takes the transformation matrix array as a parameter and creates new buckets using the same vertex buffers as the first buckets.

The region has a list of LOD buckets->ok!

The LOD Bucket has a map of material bucket->ok! I understand.

I've seen that the material bucket has a list of geometry buckets->ok! I understand at this point.


Here is how I see the creation of a new batch instance:

-First create a new LOD Bucket
-Read all the material buckets of the original batch instance lod bucket
-create new material buckets.
-Read all the geometry buckets from the original batch instance material buckets
-Use a clone() method for each geometry bucket, which returns a new bucket using the same buffers.
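
The steps above can be sketched as follows. These are stand-in types, not Ogre's StaticGeometry classes: the struct names only mirror the bucket hierarchy described in the post, and a `std::shared_ptr` stands in for the shared hardware buffers.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in for the vertex/index buffers of one geometry bucket.
struct VertexData { std::vector<float> vertices; };

struct GeometryBucket {
    std::shared_ptr<const VertexData> buffers;  // shared between clones
    // Returns a new bucket that references the same buffers (no copy).
    GeometryBucket clone() const { return GeometryBucket{buffers}; }
};

struct MaterialBucket { std::vector<GeometryBucket> geometry; };

// Material buckets keyed by material name.
struct LodBucket { std::map<std::string, MaterialBucket> materials; };

// New LOD bucket -> new material buckets -> cloned geometry buckets.
LodBucket cloneBatchInstance(const LodBucket& original) {
    LodBucket copy;
    for (const auto& entry : original.materials) {
        MaterialBucket& dst = copy.materials[entry.first];
        for (const GeometryBucket& g : entry.second.geometry) {
            dst.geometry.push_back(g.clone());
        }
    }
    return copy;
}
```

Each new batch instance then costs only the bucket bookkeeping plus its own transforms, since the geometry itself is never duplicated.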

One other thing: I've seen that the static geometry code removes bone information from the vertex declaration. With a view to implementing crowd rendering, it would be useful not to delete it, no?

tuan kuranes
OGRE Retired Moderator
Posts: 2653
Joined: Wed Sep 24, 2003 8:07 am
Location: Haute Garonne, France

Post by tuan kuranes »

@klauss: the current skeletal animation code in Ogre is indeed quite complex, due to all the particular cases it has to handle. Not sure it would be a gain to mess with that code. But using weight shaders as the instance index could be interesting (though a lot more complex for crowd animation, and perhaps it would even need too much matrix computation on the CPU, multiplying the "bones instance" matrices by the "mesh instance" matrix).
Crashy wrote: In the perspective of implementing crowd rendering, it would be usefull not to delete them, no?
Could it be left as an option?
In some cases, the user could need a very fast instanced "army" that just stands still, and just use another set of instance batches with bones when moving (slower)?

Crashy
Google Summer of Code Student
Posts: 997
Joined: Wed Jan 08, 2003 9:15 pm
Location: Lyon, France

Post by Crashy »

Now full support of meshes with multiple submeshes

http://crashy.cartman.free.fr/SOC/headArmy.jpg


Some little problems to resolve:
-meshes with multiple buffers have problems with the vertex declaration; that should be easy to fix, I hope.
-with complex meshes like the athena, I'm limited by the original index size, which is often 16 bits. So I must use the same trick that I used in my original code, I think.
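
To illustrate the 16-bit constraint only (the "trick" from the original code is not shown in this thread): when N copies of a mesh share one vertex buffer, every index must stay below 65536, which caps the copies per batch. Function names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest number of mesh copies whose vertices are all addressable
// with 16-bit indices (65536 distinct values).
constexpr std::size_t maxInstancesPerBatch16(std::size_t verticesPerMesh) {
    return verticesPerMesh == 0 ? 0 : 65536 / verticesPerMesh;
}

// Repeats the mesh's index list once per instance, offsetting each copy
// by that instance's first vertex in the shared vertex buffer.
std::vector<std::uint16_t> replicateIndices(
        const std::vector<std::uint16_t>& meshIndices,
        std::size_t verticesPerMesh, std::size_t instanceCount) {
    assert(instanceCount <= maxInstancesPerBatch16(verticesPerMesh));
    std::vector<std::uint16_t> out;
    out.reserve(meshIndices.size() * instanceCount);
    for (std::size_t i = 0; i < instanceCount; ++i) {
        const std::size_t base = i * verticesPerMesh;
        for (std::uint16_t idx : meshIndices) {
            out.push_back(static_cast<std::uint16_t>(base + idx));
        }
    }
    return out;
}
```

A 1000-vertex mesh therefore allows at most 65 copies per batch with 16-bit indices; beyond that, either the batch is split or 32-bit indices are needed.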

But nothing really problematic for the moment. I plan to release code that is compatible with all meshes, and with the ability to add batch instances, this weekend (but more Saturday evening than tomorrow ;) )
