[GSoC 2013 - accepted] Ogre 2.0

Threads related to Google Summer of Code
Klaim
Old One
Posts: 2565
Joined: Sun Sep 11, 2005 1:04 am
Location: Paris, France
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by Klaim » Wed Jul 17, 2013 10:23 am

I prefer the Ogre CMake-generated project to use the default of the Visual Studio version in use. That way, I can easily choose whether I want this optimization or not. Ogre should not impose it.
Also, recent VC versions make /GS faster than it used to be, and not everybody wants to remove this kind of check.

In my project using Ogre, I prefer those checks to stay active until the end of the first big release. At that point I'll be optimizing anyway, and I plan to add some compilation flags to speed up the whole project, including all dependencies like Ogre. I doubt I will actually remove /GS, because it prevents some potential hacking issues that I don't want to occur in my case. I'll remove it if it really makes the overall experience better, but I can't verify that yet.

FrameFever
Platinum Sponsor
Platinum Sponsor
Posts: 414
Joined: Fri Apr 27, 2007 10:05 am

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by FrameFever » Wed Jul 17, 2013 6:53 pm

The guy doing this project should just test it and give us the results,
then the team can decide whether it's worth it or not.

The prebuilt SDK should offer the best-optimized and fastest version, that's my opinion.

PS: This buffer overflow check is mostly used for server applications.
It's nonsense for graphics applications. Btw, does gcc have such a check?

Klaim
Old One
Posts: 2565
Joined: Sun Sep 11, 2005 1:04 am
Location: Paris, France
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by Klaim » Wed Jul 17, 2013 7:54 pm

FrameFever wrote:The guy doing this project should just test it and give us the results,
then the team can decide whether it's worth it or not.
I would prefer that guy to focus on improving performance at the architecture level, as he is supposed to, instead of losing time on this.
The prebuilt SDK should offer the best-optimized and fastest version, that's my opinion.
I don't disagree with that. I was talking about the source distribution, which should not change the default /GS setting provided by the different Visual Studio versions.
PS: This buffer overflow check is mostly used for server applications.
It's nonsense for graphics applications.
Totally wrong: tons of games have had problems caused by the issues this option checks for. These checks are not vital for games, but exploiting buffer issues to cheat in multiplayer does hurt the experience. It's a general problem anyway, so having the check on by default and removing it if you want is the safest way, I think.

spacegaier
OGRE Team Member
OGRE Team Member
Posts: 4293
Joined: Mon Feb 04, 2008 2:02 pm
Location: Germany
x 2
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by spacegaier » Fri Jul 19, 2013 3:46 pm

Some new insights from Mathias on his blog. This time, the new bounding box handling:

http://yosoygames.com.ar/wp/2013/07/goo ... hello-aabb
Ogre Admin [Admin, Dev, PR, Finance, Wiki, etc.] | BasicOgreFramework | AdvancedOgreFramework
Don't know what to do in your spare time? Help the Ogre wiki grow! Or squash a bug...

dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 4144
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 268
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by dark_sylinc » Mon Jul 22, 2013 12:32 am

The guy on this blog did thorough research on the "secure/debug" functions that are still enabled in Release mode:
I don't agree with all of his conclusions, but he raises a few good points. My take is:

The problem is that we usually work with just "Debug" and "Release" builds, which is actually very limiting.
During development I use what I call "Hybrid". Hybrid builds use the release CRT, enable _SECURE_SCL, enable /GS, include debugging symbols (pdbs), and define neither NDEBUG (asserts now work!) nor _DEBUG (the build doesn't get painfully slow).
They still optimize inlines and use /O2 (except maybe for one or two cpp files when I'm debugging a very hard-to-track bug, so I can see more accurately what the variables contain).
This is a more tuned version of what CMake wants to do with "RelWithDebInfo", which is basically just Release mode with pdbs (what I miss the most there is asserts).
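To make the macro side of such a "Hybrid" configuration concrete, here is a minimal sketch that reports which of the macros mentioned above are in effect for the translation unit it is compiled in. The function names `assertsEnabled` and `describeBuild` are made up for this illustration and are not part of Ogre; `_DEBUG` and `_SECURE_SCL` are only meaningful on Visual C++ and will simply show up as undefined/0 elsewhere.

```cpp
#include <cassert>
#include <string>

// True when assert() is active, i.e. NDEBUG was NOT defined at compile time.
// (Hypothetical helper, for illustration only.)
bool assertsEnabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Builds a short description of the build flavour from the preprocessor state.
std::string describeBuild() {
    std::string out = assertsEnabled() ? "asserts=on" : "asserts=off";
#ifdef _DEBUG
    out += " _DEBUG=defined";    // debug CRT flavour on Visual C++
#else
    out += " _DEBUG=undefined";  // what a "Hybrid" or Release build would show
#endif
#if defined(_SECURE_SCL) && _SECURE_SCL
    out += " _SECURE_SCL=1";     // checked iterators requested
#else
    out += " _SECURE_SCL=0";
#endif
    return out;
}
```

Under the Hybrid settings described above one would expect something like `asserts=on _DEBUG=undefined _SECURE_SCL=1`, whereas a stock RelWithDebInfo build would normally report `asserts=off`, since CMake's default flags for that configuration define NDEBUG.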

This way Hybrid achieves decent performance while remaining debuggable.
Now, when deploying your final project, it depends. I usually prefer leaving _SECURE_SCL & /GS enabled because they make the build crash at the exact moment the bug happens, which makes it a lot easier to track down when a client/customer/gamer finds your bug.
Without them, a buffer overflow can show its symptoms way too late, with no easy way to trace them back.

Nonetheless, "it depends". If your game is running at a steady 60fps all the time even on old hardware, leave them on. If your game is struggling (CPU bottleneck) to reach 15-20fps, I'd consider turning them off.

syedhs
Silver Sponsor
Silver Sponsor
Posts: 2702
Joined: Mon Aug 29, 2005 3:24 pm
Location: Kuala Lumpur, Malaysia
x 3

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by syedhs » Mon Jul 22, 2013 12:50 am

Yes, Hybrid is the way to go, especially in 3D programming environments, because we want sane speed while still being able to debug the program. In my case, where I rarely need to examine the OgreMain and plugin source code, all of OgreMain is in release mode but my application is in hybrid mode. Hybrid also links much faster, about 3-4x, so one can make changes, compile, and test faster too. Release mode is only built when you want to give your app to your client, or better, once a week, just to make sure no bugs have been introduced in the 'production exe'.
A willow deeply scarred, somebody's broken heart
And a washed-out dream
They follow the pattern of the wind, ya' see
Cause they got no place to be
That's why I'm starting with me

drwbns
Orc Shaman
Posts: 777
Joined: Mon Jan 18, 2010 6:06 pm
Location: Costa Mesa, California

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by drwbns » Mon Jul 22, 2013 1:38 pm

Great to see your initial profiling work. Looks like you've bettered a lot of the core scene work, and ahead of schedule! Awesome :) Can't wait to see what you got next :)

spacegaier
OGRE Team Member
OGRE Team Member
Posts: 4293
Joined: Mon Feb 04, 2008 2:02 pm
Location: Germany
x 2
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by spacegaier » Thu Aug 15, 2013 10:24 am

New non-scientific benchmark on the HW instancing improvements: http://yosoygames.com.ar/wp/2013/08/hw- ... rk-on-2-0/ Awesome results!
Ogre Admin [Admin, Dev, PR, Finance, Wiki, etc.] | BasicOgreFramework | AdvancedOgreFramework
Don't know what to do in your spare time? Help the Ogre wiki grow! Or squash a bug...

Zonder
Ogre Magi
Posts: 1133
Joined: Mon Aug 04, 2008 7:51 pm
Location: Manchester - England
x 22

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by Zonder » Thu Aug 15, 2013 11:39 am

Impressive results so far!!
There are 10 types of people in the world: Those who understand binary, and those who don't...

drwbns
Orc Shaman
Posts: 777
Joined: Mon Jan 18, 2010 6:06 pm
Location: Costa Mesa, California

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by drwbns » Thu Aug 15, 2013 12:57 pm

Wow, just wow. Are you currently working on stage 6 from your wiki?

Stage 6 (2 Aug – 8 Aug)
Restore broken stuff. RibbonTrails, Billboards, InstanceManager, ManualObject & SimpleRenderable.

Just wondering where you're at as far as the schedule.

PhilipLB
Google Summer of Code Student
Google Summer of Code Student
Posts: 550
Joined: Thu Jun 04, 2009 5:07 pm
Location: Berlin

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by PhilipLB » Thu Aug 15, 2013 2:17 pm

Does this mean that you can set a "Static"-flag on any entity and it automatically gets treated as static geometry and the batch count goes down when there are many static entities sharing the same material?
If so, then the Volume Terrain will get a massive performance gain on Ogre 2.0. :D
Google Summer of Code 2012 Student
Topic: "Volume Rendering with LOD aimed at terrain"
Project links: Project thread, WIKI page, Code fork for the project
Mentor: Mattan Furst


Volume GFX, accepting donations.

dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 4144
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 268
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by dark_sylinc » Thu Aug 15, 2013 9:01 pm

PhilipLB wrote:Does this mean that you can set a "Static"-flag on any entity and it automatically gets treated as static geometry and the batch count goes down when there are many static entities sharing the same material?
If so, then the Volume Terrain will get a massive performance gain on Ogre 2.0. :D
Umm no and yes. Read the doxygen doc.

On normal entities, static allows Ogre to avoid updating the SceneNode transformation every frame (because it doesn't change) and the AABB bounds from the Entity (because they don't change either).
This yields a massive performance bump.

When using Instancing, however, we're already batching together everything that has the same material, so it is indeed like Static Geometry, except that we cull on a per-instance basis (which puts a bit more strain on the CPU, but allows for very fine-grained frustum culling, giving the GPU less work), and 2.0's culling code is several times faster than 1.9's.

When using normal entities, the batch count won't go down from using the static flag. However, it will improve performance compared to 1.9 (because we're skipping the scene node transform & AABB update phases, and those take a lot of CPU time).
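The saving described for static objects can be pictured with a toy model. This is only a sketch of the idea (skip the per-frame derived-transform update for anything flagged static); `ToyNode`, `updateDynamicNodes` and `refreshStaticNodes` are hypothetical names, not Ogre classes or functions, and the real 2.0 code may organise this quite differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scene node; NOT Ogre's SceneNode.
struct ToyNode {
    float localX;    // position relative to the parent
    float derivedX;  // cached world-space position
    bool  isStatic;  // set by the user; promises the node will not move
};

// Per-frame update: only dynamic nodes pay for recomputing derived data.
// Returns the number of nodes that were actually updated.
std::size_t updateDynamicNodes(std::vector<ToyNode>& nodes, float parentX) {
    std::size_t updated = 0;
    for (ToyNode& node : nodes) {
        if (node.isStatic)
            continue;  // static: transform (and bounds) are left untouched
        node.derivedX = parentX + node.localX;
        ++updated;
    }
    return updated;
}

// Explicit, occasional refresh for static nodes, e.g. right after placing them.
void refreshStaticNodes(std::vector<ToyNode>& nodes, float parentX) {
    for (ToyNode& node : nodes) {
        if (node.isStatic)
            node.derivedX = parentX + node.localX;
    }
}
```

In this model the cost of a frame scales with the number of dynamic nodes only, which is the effect described above. The number of draw calls is untouched, matching the point that the static flag alone does not reduce the batch count.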

spacegaier
OGRE Team Member
OGRE Team Member
Posts: 4293
Joined: Mon Feb 04, 2008 2:02 pm
Location: Germany
x 2
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by spacegaier » Thu Aug 15, 2013 9:04 pm

dark_sylinc wrote:
PhilipLB wrote:Does this mean that you can set a "Static"-flag on any entity and it automatically gets treated as static geometry and the batch count goes down when there are many static entities sharing the same material?
If so, then the Volume Terrain will get a massive performance gain on Ogre 2.0. :D
Umm no and yes. Read the doxygen doc.

On normal entities, static allows Ogre to avoid updating the SceneNode transformation every frame (because it doesn't change) and the AABB bounds from the Entity (because it doesn't change either).
This yields massive performance bump.

When using Instancing however, we're already batching everything together that has the same material, so it is indeed like Static Geometry, except that we cull per instance basis (which puts a bit more strain on CPU, but allows for very fine grained frustum culling on the GPU, giving it less work), and 2.0's culling code is several times faster than 1.9's

When using normal entities, batch count won't go down for using static flag. However it will improve performance when compared to 1.9 (because we're skipping the scene node transform & aabb update phases; and that takes a lot of cpu time)
Mathias, can you please add this information to your wiki page as well (perhaps in a Q&A format)? These are important bits that should be collected in one central place, so that people can more easily benefit from your great changes later.
Ogre Admin [Admin, Dev, PR, Finance, Wiki, etc.] | BasicOgreFramework | AdvancedOgreFramework
Don't know what to do in your spare time? Help the Ogre wiki grow! Or squash a bug...

dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 4144
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 268
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by dark_sylinc » Thu Aug 15, 2013 9:09 pm

drwbns wrote:Wow, just wow. Are you currently working on stage 6 from your wiki?

Stage 6 (2 Aug – 8 Aug)
Restore broken stuff. RibbonTrails, Billboards, InstanceManager, ManualObject & SimpleRenderable.

Just wondering where you're at as far as the schedule.
Yeah, restoring stuff is taking longer than planned. InstanceManager is fully (or almost fully) restored (I haven't tried the ShaderBased & VTF techniques, but the HW Basic, HW VTF & HW VTF Dual Quaternion techniques are tested & working).

RibbonTrails, ManualObject & SimpleRenderable should be easier to port, but I want to put them on hold because shadow maps are broken and they're more important (who uses an engine without shadow mapping today!?), so we're reevaluating the schedule a bit.

PhilipLB
Google Summer of Code Student
Google Summer of Code Student
Posts: 550
Joined: Thu Jun 04, 2009 5:07 pm
Location: Berlin

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by PhilipLB » Thu Aug 15, 2013 9:11 pm

Interesting, thanks for the answer. :) So I still get more performance for free, yay. :)
Google Summer of Code 2012 Student
Topic: "Volume Rendering with LOD aimed at terrain"
Project links: Project thread, WIKI page, Code fork for the project
Mentor: Mattan Furst


Volume GFX, accepting donations.

dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 4144
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 268
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by dark_sylinc » Thu Aug 15, 2013 9:16 pm

spacegaier wrote:Mathias, can you please add those information (perhaps in a Q&A format on your wiki page as well) since those are important bits that should be collected in one central place to make it later easier for people to benefit from your great changes.
Sure. I'll add them to the GSoC wiki. Wiki maintainers can probably give them better formatting (I suck at wiki formatting).

spacegaier
OGRE Team Member
OGRE Team Member
Posts: 4293
Joined: Mon Feb 04, 2008 2:02 pm
Location: Germany
x 2
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by spacegaier » Fri Aug 16, 2013 12:34 pm

dark_sylinc wrote:Sure. I'll add them to the GSoC wiki. Wiki maintainers can give them probably better format (I suck at wiki formatting)
Done, complete with links to OgreLexicon :) .
Ogre Admin [Admin, Dev, PR, Finance, Wiki, etc.] | BasicOgreFramework | AdvancedOgreFramework
Don't know what to do in your spare time? Help the Ogre wiki grow! Or squash a bug...

drwbns
Orc Shaman
Posts: 777
Joined: Mon Jan 18, 2010 6:06 pm
Location: Costa Mesa, California

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by drwbns » Tue Sep 03, 2013 1:36 pm

I know you've been pushing a lot of commits, but can you re-evaluate where you're at again?

AgentC
Kobold
Posts: 33
Joined: Tue Apr 24, 2012 11:24 am

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by AgentC » Wed Sep 04, 2013 11:53 pm

I haven't been following this thread for a while, and to my surprise, when I come back I find that Ogre 2.0's instanced 250x250 object test finally beats my engine hands down, in both static & animated modes, forcing me to optimize heavily if I want to keep up :) It also beats others, like Unity, by an even wider margin. Congrats dark_sylinc!

al2950
OGRE Expert User
OGRE Expert User
Posts: 1212
Joined: Thu Dec 11, 2008 7:56 pm
Location: Bristol, UK
x 80

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by al2950 » Fri Sep 06, 2013 11:48 am

@dark_sylinc
I don't want to downplay any of the work people have done for Ogre in previous GSoCs or any 'free time' contributions, but I would just like to say I think this is some of the most important work done in Ogre for a long time :). I have been following your commits and blog posts quite closely; your performance achievements are already very impressive. I am really looking forward to you finishing the new compositor and scene rendering framework. I have just one question about something that is confusing me, though.

You have split up culling and rendering to deal with some shadow issues you were having. Firstly, I think it is a good idea to split those up and allow sharing of culling data between passes, although I think that was in your original design...? My naive question is: why do you have to share culling data between the main scene and directional shadow maps? Admittedly, I have never got round to working out how to set up a directional shadow camera (it just works in Ogre :oops:!). As far as I am concerned, you have to do your own culling for any shadow camera render anyway. Would your approach be flexible enough for other forms of rendering, e.g. Deferred?

Anyway keep up the excellent work :D

dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 4144
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 268
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by dark_sylinc » Fri Sep 06, 2013 11:12 pm

drwbns wrote:I know you've been pushing a lot of commits, but can you re-evaluate where you're at again?
I'm restoring shadow mapping. It's taken way longer than anticipated. However, the features needed for shadow mapping to work caused this "minimalistic" compositor to turn into the full-blown one (so much for "not part of this GSoC"), so by the end of the GSoC we should have an almost fully featured compositor to replace the old one.
I think threading will make it in time; it's a no-brainer because everything else is ready for it.
Particle FXs and billboards won't make it in time.
The mesh partitioner/splitter won't make it either; however, I'm evaluating an alternative that may mean we don't need it so much.
AgentC wrote:I haven't been following this thread for a while, and to my surprise when I come back I find that Ogre 2.0's instanced 250x250 object test finally beats my engine hands down, both in static & animated modes, forcing me to optimize heavily if I want to keep up :) As well as beating others, like Unity, by even wider margin. Congrats dark_sylinc!
We... beat Unity? :shock: :shock: :shock: :) :o :D
At last! Something we can compete on! Lately all the other engines have been kicking our butts. We're back in the game! (one of the big reasons I went into 2.0)
al2950 wrote:My naive question is why do you have to share culling data between main scene and direction shadow maps, admittedly I have never got round to working out how to setup direction shadow camera (it just works in Ogre :oops:!). As far as I am concerned you have to do your own culling with any shadow cam renders anyway.
TBH I forgot about sharing the low level cull lists between identical passes :lol: But it would be trivial to implement.
As for shadow cameras, we can't share the low level cull lists because... it's a different list per camera.
However, shadow mapping needs data from the normal pass (basically for Focused camera setups and their derived classes: LispSM & PSSM) to know the bounds of the receiver objects & the bounds of the caster objects, and then build an optimal setting for the shadow camera (position, near plane, far plane, and skewing of matrices in the case of LispSM & PSSM).
al2950 wrote: Would your approach be flexible enough for another forms of rendering, eg Deferred?
Yes. I'm writing it with that in mind.
However, for the time being I had to disable support for something that was added in a previous GSoC: "paused rendering", which was used to apply the shadow map pass right before rendering the light object. With this method, it was possible to draw an unlimited number of lights with just one texture; whereas for now, an unlimited number of lights would require an unlimited number of textures (one per caster light, that's a lot of VRAM).

We'll solve that somehow along the way, as the compositor is very flexible. Besides, I hardly think there is a need for this feature, since it's uncommon to see more than 8 shadow-casting lights (at 1024x1024 R32 that's 32 MB of VRAM; at 2048x2048 that's still only 128 MB), and paused rendering hinders the possibility of instancing those light volumes (rendering all of them in one pass).
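The VRAM figures quoted above can be reproduced with a few lines of arithmetic. This sketch assumes that "R32" means 4 bytes per texel, that each light gets its own square texture, and that there are no mipmaps or driver padding; `shadowMapBytes` and `toMiB` are names invented for the example.

```cpp
#include <cstdint>

// Total bytes for `numLights` square shadow maps of `resolution` x `resolution`
// texels at `bytesPerTexel` bytes each (assumed: no mipmaps, no padding).
std::uint64_t shadowMapBytes(std::uint64_t numLights,
                             std::uint64_t resolution,
                             std::uint64_t bytesPerTexel) {
    return numLights * resolution * resolution * bytesPerTexel;
}

// Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
double toMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

With these assumptions, `toMiB(shadowMapBytes(8, 1024, 4))` gives 32 and `toMiB(shadowMapBytes(8, 2048, 4))` gives 128, matching the numbers in the post.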

On the other hand, I added support for shadow map atlases! I'm not sure if it will be working by the end of the GSoC, but the infrastructure is there!
In case someone is unsure what I mean by "shadow map atlas": it means, for example, having one 2048x2048 texture and rendering 4 shadow maps into it, each in a 1024x1024 region (or whatever resolution/number of shadow maps you want), instead of having 4 textures of 1024x1024 each.
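As a rough illustration of the atlas idea, the sketch below computes which region of a square atlas each shadow map would render into, using a plain row-major grid of equally sized tiles. It is not the layout code from the 2.0 branch; `AtlasRect` and `layoutShadowAtlas` are hypothetical names, and a real implementation could well allow tiles of different sizes.

```cpp
#include <vector>

// Pixel region inside the atlas texture (hypothetical type for this example).
struct AtlasRect {
    int x;
    int y;
    int width;
    int height;
};

// Places up to `numMaps` square tiles of `tileSize` pixels into a square atlas
// of `atlasSize` pixels, row by row. Maps that do not fit are left out.
std::vector<AtlasRect> layoutShadowAtlas(int atlasSize, int tileSize, int numMaps) {
    std::vector<AtlasRect> rects;
    if (tileSize <= 0 || atlasSize < tileSize)
        return rects;  // nothing can be placed

    const int tilesPerRow = atlasSize / tileSize;
    const int capacity = tilesPerRow * tilesPerRow;
    for (int i = 0; i < numMaps && i < capacity; ++i) {
        AtlasRect rect;
        rect.x = (i % tilesPerRow) * tileSize;  // column within the grid
        rect.y = (i / tilesPerRow) * tileSize;  // row within the grid
        rect.width = tileSize;
        rect.height = tileSize;
        rects.push_back(rect);
    }
    return rects;
}
```

For the example in the post, `layoutShadowAtlas(2048, 1024, 4)` yields four 1024x1024 regions at (0,0), (1024,0), (0,1024) and (1024,1024). Each region would then serve as the viewport for one shadow map pass, and the shader would remap its shadow lookups into that region.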

Mako_energy
Greenskin
Posts: 125
Joined: Mon Feb 22, 2010 7:48 pm
x 4

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by Mako_energy » Sat Sep 07, 2013 2:28 am

dark_sylinc wrote:I think threading will make it in time since that's a no brainer because everything else is ready for it.
Particle FXs and billboards won't make it in time.
What threading solution are you thinking of going with, since it is under the scope of this GSOC?
If particles and billboards won't make it within the scope of this GSOC, do you intend to keep on working on this refactor after the GSOC has ended?

dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 4144
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 268
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by dark_sylinc » Sat Sep 07, 2013 4:25 am

Mako_energy wrote:What threading solution are you thinking of going with, since it is under the scope of this GSOC?
The one that is on the slides: sync and wait.
The code is not only highly parallel, it also takes the same execution time on all threads. If someone thinks a thread pool could give a boost, they can try, but I don't think it's going to make much difference, so my efforts will go into the simplest solution.

As for which sync mechanisms (e.g. boost, tbb): just custom made. A sync barrier (which is native on PThreads and emulated on Windows; I've already written that code but not committed it) and native thread launching. It's not really a big deal.
TBH it's easier than using a lib.
Mako_energy wrote:If particles and billboards won't make it within the scope of this GSOC, do you intend to keep on working on this refactor after the GSOC has ended?
Yes. I'll start with particles because of personal interest. Furthermore I have something in mind (new) for the particles. If that pans out, it may work with billboards too.
The idea is that functionally it remains the same (position, velocity, emitters, etc. behave the same or very similarly to how they did in 1.9).

Mako_energy
Greenskin
Posts: 125
Joined: Mon Feb 22, 2010 7:48 pm
x 4

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by Mako_energy » Sat Sep 07, 2013 8:02 am

dark_sylinc wrote:As for what sync mechanisms (eg. boost, tbb) just custom made: A sync barrier (which is native on PThreads, and emulated in Windows, I've already wrote that code but not committed) and native thread launching. It's not really a big deal.
TBH it's easier than using a lib.
Sounds reasonable. But I have to ask, is a solution as simple as that adequately future-proof?
Although the main reason I ask is the off chance that my studio's custom threading solution is appropriate.
For the curious: https://github.com/BlackToppStudios/DAGFrameScheduler

Klaim
Old One
Posts: 2565
Joined: Sun Sep 11, 2005 1:04 am
Location: Paris, France
Contact:

Re: [GSoC 2013 - accepted] Ogre 2.0

Post by Klaim » Sat Sep 07, 2013 10:46 am

Disclaimer: My following comments shouldn't impact current development, I'm just thinking about how the short-term and long-term future is apparently going, both for C++ landscape and my own (long-term) project.
dark_sylinc wrote: The one that is one the slides: sync and wait.
The code is not only highly parallel, it also takes the same time of execution for all threads. If someone thinks a thread pool could give a boost, he can try; but I don't think it's going to make much difference; so my efforts will be on the simplest solution.
Just to clarify, here you are talking exclusively about rendering/updating animations/particles, not resource loading, right?
If you use custom threads, then I see potential oversubscription problems in the near future (or with my current code).
I think it should be easily fixed, but I guess we'll see later after you stabilize your version and make it public.
As for what sync mechanisms (eg. boost, tbb) just custom made: A sync barrier (which is native on PThreads, and emulated in Windows, I've already wrote that code but not committed) and native thread launching. It's not really a big deal.
TBH it's easier than using a lib.
I don't agree, because I have no confidence (from experience) in custom code doing low-level synchronization. It's not you I lack confidence in, it's low-level synchronization code.
I guess it's OK for now if you don't find any problems, but I also suspect library code to be of higher quality and certainly easier to both maintain and debug
than anything custom we can put in Ogre code. Just for the sake of keeping Ogre only about graphics, that's not a good idea.
I understand that adding libraries right now is not really ideal, but I suspect there are enough constructs in the C++11 standard library to easily
write the same semantics. I suggest using the standard library instead (for the long term), NOT NOW, but as soon as the OGRE version is stable (sometime next year?)
These things frankly scare me a lot (all my current code is concurrent).
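For readers who want to picture the "sync and wait" scheme with standard-library primitives, here is a minimal sketch of a reusable barrier built only from C++11's `std::mutex` and `std::condition_variable`, plus a tiny driver in which every worker does its share of a frame and then waits for the others. This is not the uncommitted Ogre code mentioned in the thread; `SyncBarrier` and `runSyncAndWait` are names invented for this example, and the per-thread "work" is a placeholder.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Reusable barrier: wait() blocks until `numThreads` threads have called it.
class SyncBarrier {
public:
    explicit SyncBarrier(std::size_t numThreads)
        : mThreshold(numThreads), mWaiting(0), mGeneration(0) {}

    void wait() {
        std::unique_lock<std::mutex> lock(mMutex);
        const std::size_t generation = mGeneration;
        if (++mWaiting == mThreshold) {
            // Last thread to arrive: reset and release everyone.
            mWaiting = 0;
            ++mGeneration;
            mCond.notify_all();
        } else {
            // The generation counter guards against spurious wakeups and
            // lets the same barrier be reused for the next phase.
            mCond.wait(lock, [this, generation] { return generation != mGeneration; });
        }
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    const std::size_t mThreshold;
    std::size_t mWaiting;
    std::size_t mGeneration;
};

// Each worker writes its own slot, syncs, reads every slot, then syncs again
// so nobody starts the next frame while others are still reading.
std::vector<int> runSyncAndWait(std::size_t numThreads, int numFrames) {
    SyncBarrier barrier(numThreads);
    std::vector<int> partial(numThreads, 0);
    std::vector<int> totals(numThreads, 0);
    std::vector<std::thread> workers;

    for (std::size_t i = 0; i < numThreads; ++i) {
        workers.emplace_back([&barrier, &partial, &totals, i, numFrames] {
            for (int frame = 0; frame < numFrames; ++frame) {
                partial[i] = static_cast<int>(i + 1) * (frame + 1);  // placeholder work
                barrier.wait();  // all partial results are now written
                int sum = 0;
                for (int value : partial)
                    sum += value;
                totals[i] = sum;
                barrier.wait();  // all reads done; safe to overwrite next frame
            }
        });
    }
    for (std::thread& worker : workers)
        worker.join();
    return totals;
}
```

Because all workers are expected to take about the same time per phase, as described earlier in the thread, little time should be lost waiting at the barrier. On POSIX systems `pthread_barrier_t` offers the same behaviour natively.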
