Separating Ogre Logic & Ogre Rendering to different threads



al2950
OGRE Expert User
Posts: 1227
Joined: Thu Dec 11, 2008 7:56 pm
Location: Bristol, UK
x 157

Separating Ogre Logic & Ogre Rendering to different threads

Post by al2950 »

Recently I have been looking at trying to completely separate Ogre's rendering loop from my logic loop with no locking. In short, it ranges from very difficult to impossible (see here for one particular problem).

Anyway, I believe this should be built in functionality of Ogre so I am starting this thread to discuss at what, if any, level the separation should be.

As far as I am concerned, from a high-level perspective this should not be too difficult, and all you really need on the rendering side is the following:

Renderables
  • World transform matrix
  • Material
  • Skeleton Bone matrices and weights
  • Misc stuff like cast shadows, etc...
Lights
  • World transform matrix
  • Type
  • Colour
  • Light properties...
SceneSetup
  • Shadow settings
  • Forward 3D settings
  • Compositor setup
  • etc
The idea of the above is to allow everything in CompositorManager2::_update to be called in a separate thread. I appreciate this is very high level and it's guaranteed I have missed stuff (please fill in the missing bits!), however it would be good to get some feedback and work out if this is or will ever be possible with Ogre. :D
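To make the list above a bit more concrete, here is a rough C++ sketch of what a per-frame snapshot handed to the render side could look like. None of these types exist in Ogre: `RenderableSnapshot`, `LightSnapshot`, `SceneSetupSnapshot` and `FrameData` are hypothetical names, and the matrix is a plain array rather than an Ogre type so the sketch stands on its own. Bone weights and most light properties are left out for brevity.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only; none of these are Ogre classes.
using Mat4 = std::array<float, 16>;  // row-major 4x4 matrix

struct RenderableSnapshot {
    Mat4 worldTransform{};
    std::string materialName;        // resolved to a material by the render side
    std::vector<Mat4> boneMatrices;  // empty for non-skinned objects
    bool castShadows = true;
};

enum class LightType : std::uint8_t { Directional, Point, Spot };

struct LightSnapshot {
    Mat4 worldTransform{};
    LightType type = LightType::Point;
    std::array<float, 3> colour{1.0f, 1.0f, 1.0f};
};

struct SceneSetupSnapshot {
    bool shadowsEnabled = true;
    bool forward3DEnabled = false;
    std::string compositorWorkspace;
};

// Everything the render thread would need for one frame, holding no
// pointers back into logic-side state.
struct FrameData {
    std::vector<RenderableSnapshot> renderables;
    std::vector<LightSnapshot> lights;
    SceneSetupSnapshot setup;
};

// Builds an identity matrix.
inline Mat4 identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}
```

The point of the layout is that the logic thread can fill a `FrameData` and hand it over whole, so the render thread never touches live logic objects.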
crancran
Greenskin
Posts: 138
Joined: Wed May 05, 2010 3:36 pm
x 6

Re: Separating Ogre Logic & Ogre Rendering to different threads

Post by crancran »

You mentioned the following in the other thread:
al2950 wrote: Mon Nov 13, 2017 5:26 pm I honestly believe it's a rendering engine's job and should be done at a lower level for both ease and memory use. Anyway I would appreciate any thoughts/opinions you have
We cannot lose sight of the fact that OGRE is a rendering engine and not a game engine.

When I see you mention "my logic loop", I interpret that to include a plethora of logical step actions such as gathering input, updating the game object state, stepping physics, and a variety of other deterministic steps that eventually lead to the frame state that is to be pushed to OGRE. I just want to be careful with the wording here because I would not want OGRE to mandate infrastructure for these things. That should be something left up to the individual game or the game engine to do.

While I understand your concern with memory use, I do want to stress that we cannot forget about performance.

There are steps in the logic-loop that may influence data storage layouts or access patterns that are not conducive to an efficient render pipeline and vice versa. In order to maximize performance here, it often makes sense to replicate the necessary bits of information in two different data structures that are efficient for the thread/task at hand and use a sync-point. So rather than looking at splitting, it may make more sense to look at a way to more easily influence/push data to OGRE for its needed steps when/where it makes sense.
al2950
OGRE Expert User
Posts: 1227
Joined: Thu Dec 11, 2008 7:56 pm
Location: Bristol, UK
x 157

Re: Separating Ogre Logic & Ogre Rendering to different threads

Post by al2950 »

Thanks for your reply :)

Before going into some more detail of my current way of thinking, I am happy to be convinced the way Ogre is currently structured, and what I have done to date is correct, as it will mean less work! :D
crancran wrote: Tue Nov 14, 2017 6:50 am We cannot lose sight of the fact that OGRE is a rendering engine and not a game engine.

When I see you mention "my logic loop", I interpret that to include a plethora of logical step actions such as gathering input, updating the game object state, stepping physics, and a variety of other deterministic steps that eventually lead to the frame state that is to be pushed to OGRE. I just want to be careful with the wording here because I would not want OGRE to mandate infrastructure for these things. That should be something left up to the individual game or the game engine to do.
Good point, when I say 'my logic loop' I am thinking of my end goal, but I 120% agree with your 'Ogre is a rendering engine' statement!

I shall try and clarify;
Currently I am wasting quite a few CPU cycles waiting for the GPU to finish rendering. This is compounded by Ogre's rendering pipeline still being single threaded, so many cores are lying dormant. I should think this is a problem faced by many Ogre devs.

So we could, as you say, have Ogre in its own thread and our engine's logic in another and sync the two, and I assume this is what most people do. But there are issues.
  • You have to duplicate all data, which is not easy, especially with Ogre Singletons everywhere! I have ended up rolling my own scene nodes implementation and shortly my own skeleton, animation, particle....etc. In fact if I am being honest, I am getting to the point of questioning what I am using in Ogre.... Answer = Compositor & HLMS :)
  • Ideally you want to create a 'data frame' for the graphics engine to render. That is, you give it a collection of objects that are not persistent. However, Ogre's memory managers are designed to be fast at iterating through collections, but relatively slow at adding to them. So if you were, for example, creating the entire scene graph every frame it would run like a dog.
  • So you need to create a persistent copy and update it at a specific sync point. This has a number of issues. Firstly, with scene nodes for example, you need to store and sync create, delete, attach, detach, etc. events, not just state such as position. Secondly, the sync point implies some form of locking, which will have a performance impact.
So when I talk about logic and rendering, I see Ogre as having an, albeit limited, logic side, e.g. scene graph updates, skeleton, animation and particle updates. It would be much easier, and more importantly much more performant, to allow devs to update the scene graph, animations etc. in their own logic thread, and have Ogre deal with submitting 'frame data' to the rendering thread.
crancran wrote: Tue Nov 14, 2017 6:50 am While I understand your concern with memory use, I do want to stress that we cannot forget about performance.
This thread is all about performance :). It would be fairly easy to configure Ogre to have the current behavior or my suggested behavior depending on devs needs.

This is my current view, and with CPUs finally becoming many-core it's more important than ever, but it is a very difficult problem to solve in software. I have done a lot of reading, but I have been mostly inspired by Naughty Dog's fiber presentation
Presentation
PDF
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278

Re: Separating Ogre Logic & Ogre Rendering to different threads

Post by dark_sylinc »

al2950 wrote: Tue Nov 14, 2017 11:38 am Currently I am wasting quite a few CPU cycles waiting for the GPU to finish rendering. This is compounded by Ogre's rendering pipeline still being single threaded. So many cores are lying dormant. I should think this is a problem faced by many Ogre devs.
Analyze your bottleneck first. If you're idle because the CPU is waiting for the GPU, adding more cores won't fix any problem.
If you're idle because one CPU core is busy with Ogre processing, then adding more cores can alleviate it.
However the render split model in this case works just fine.
al2950 wrote: Tue Nov 14, 2017 11:38 am
  • You have to duplicate all data, which is not easy, especially with Ogre Singletons everywhere! I have ended up rolling my own scene nodes implementation and shortly my own skeleton, animation, particle....etc. In fact if I am being honest, I am getting to the point of questioning what I am using in Ogre.... Answer = Compositor & HLMS :)
No matter what threading paradigm you use, you'll never be saved from data duplication. It's a threading trade-off:
  1. If data is read-only, no need to duplicate.
  2. If data needs write access, you can lock access. Doesn't scale well if contention is high.
  3. If data needs write access, you can duplicate. Each thread gets its own copy it can work with.
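As a minimal illustration of options 2 and 3, the sketch below stores the same positions two ways. `LockedPositions` and `DuplicatedPositions` are hypothetical names made up for this example, not part of Ogre or any other library, and `snapshot()` is assumed to be called at a point where the writer is not mid-update.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

struct Position { float x, y, z; };

// Option 2: one shared copy; every access takes the lock, so threads
// contend with each other whenever they touch the data.
class LockedPositions {
public:
    void set(std::size_t i, Position p) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (i >= data_.size()) data_.resize(i + 1);
        data_[i] = p;
    }
    Position get(std::size_t i) const {
        std::lock_guard<std::mutex> guard(mutex_);
        return data_.at(i);
    }
private:
    mutable std::mutex mutex_;
    std::vector<Position> data_;
};

// Option 3: the writer owns its data and hands out a full copy. The reader
// then works on that copy with no lock, at the cost of extra memory.
class DuplicatedPositions {
public:
    void set(std::size_t i, Position p) {  // writer thread only
        if (i >= data_.size()) data_.resize(i + 1);
        data_[i] = p;
    }
    std::vector<Position> snapshot() const { return data_; }
private:
    std::vector<Position> data_;
};
```

Once a snapshot has been taken, later writes do not affect it, which is exactly the property that lets the two threads run independently between sync points.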
I don't get why you're duplicating everything and seemingly using Ogre directly or Ogre-like concepts in your logic.
Back when I was writing Distant Souls, Ogre was fully contained to the rendering thread, except for skeletons which also had a copy simulated on the logic thread for important characters that needed deterministic attachments (e.g. weapons).

There were no nodes, just two flat arrays of objects in the scene (one for objects that require constant sync like characters and attacks, and another for rare sync like trees, rocks and rendering-only entities). The logic thread copied position & orientation from Havok to a container, and every frame the rendering engine copied that data to the Ogre node.
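A rough sketch of that two-array layout might look like the following. It is not the Distant Souls code: `Transform`, `FakeNode`, `SyncContainer` and the `rareDirty` flag are hypothetical stand-ins, with `FakeNode` taking the place of an Ogre scene node, and the copy is assumed to run at a sync point where the logic thread is not writing to the container.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; a real build would use the physics engine's and
// Ogre's own vector and quaternion types.
struct Transform {
    float position[3] = {0.0f, 0.0f, 0.0f};
    float orientation[4] = {1.0f, 0.0f, 0.0f, 0.0f};  // w, x, y, z
};

// Stand-in for a render-side scene node.
struct FakeNode {
    Transform current;
    void apply(const Transform& t) { current = t; }
};

// Flat, index-addressed storage: slot i in each array matches node i.
struct SyncContainer {
    std::vector<Transform> constant;  // characters, attacks: copied every frame
    std::vector<Transform> rare;      // trees, rocks: copied only when flagged
    bool rareDirty = false;
};

// Render side: copy the container into the nodes. Returns how many nodes
// were written, which shows the rare list being skipped when it is clean.
inline std::size_t applyToNodes(SyncContainer& c,
                                std::vector<FakeNode>& constantNodes,
                                std::vector<FakeNode>& rareNodes) {
    std::size_t written = 0;
    for (std::size_t i = 0; i < c.constant.size() && i < constantNodes.size(); ++i) {
        constantNodes[i].apply(c.constant[i]);
        ++written;
    }
    if (c.rareDirty) {
        for (std::size_t i = 0; i < c.rare.size() && i < rareNodes.size(); ++i) {
            rareNodes[i].apply(c.rare[i]);
            ++written;
        }
        c.rareDirty = false;
    }
    return written;
}
```

With real Ogre nodes, `apply` would presumably become calls to the node's `setPosition` and `setOrientation`.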

For animations that ran on both logic and rendering, logic would periodically send messages basically saying "I am here" (running animations A, B, C, at time X and weights Y), and rendering just ensured it wouldn't run too far ahead (basically rendering would act on its own, but within a few constraints to prevent it deviating too much).
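The "don't run too far ahead" constraint could be sketched roughly as below. The names `AnimationReport`, `RenderAnimation` and the `maxLead` tolerance are assumptions made for illustration; the original implementation is not shown in this thread, and weights are left out.

```cpp
#include <algorithm>

// Hypothetical "I am here" message sent from logic to rendering.
struct AnimationReport {
    float logicTime;  // animation time on the logic thread when the message was sent
};

// Render-side playback state for one animation.
struct RenderAnimation {
    float time = 0.0f;

    // Rendering advances on its own using the render frame delta...
    void advance(float dt) { time += dt; }

    // ...but is pulled back if it has drifted more than maxLead seconds
    // past the time logic last reported.
    void constrain(const AnimationReport& report, float maxLead) {
        time = std::min(time, report.logicTime + maxLead);
    }
};
```

A hard clamp like this can cause a visible pop; smoothing the correction over a few frames is one possible refinement.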

Particle FXs were fired from the logic thread, which passed messages for the render thread to spawn, play and stop.
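One simple way to pass such messages is a command queue that logic appends to and rendering drains once per frame. The sketch below is an assumption about how this could be done, not the original code: `FxCommand`, `FxAction` and `FxCommandQueue` are hypothetical names, and it uses a mutex held only for the push and the swap (a lock-free queue would be an alternative).

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

enum class FxAction { Spawn, Play, Stop };

// Hypothetical message describing what the render thread should do.
struct FxCommand {
    FxAction action;
    int effectId;              // handle chosen by the logic thread
    std::string templateName;  // only meaningful for Spawn
};

class FxCommandQueue {
public:
    // Logic thread: enqueue a command. The lock is held only for the push.
    void post(FxCommand cmd) {
        std::lock_guard<std::mutex> guard(mutex_);
        pending_.push_back(std::move(cmd));
    }

    // Render thread: take everything queued so far in a single swap, then
    // process the returned commands without holding the lock.
    std::vector<FxCommand> drain() {
        std::vector<FxCommand> out;
        std::lock_guard<std::mutex> guard(mutex_);
        out.swap(pending_);
        return out;
    }

private:
    std::mutex mutex_;
    std::vector<FxCommand> pending_;
};
```

Because commands are applied in the order they were posted, a Spawn followed by a Play for the same `effectId` arrives in a sensible order even if both were queued in one logic frame.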
al2950 wrote: Tue Nov 14, 2017 11:38 am
  • So you need to create a persistent copy and update it at a specific sync point. This has a number of issues, firstly, for example with scene nodes, you need to store and sync create, delete, attach, detach, etc events, not just state, eg position. Secondly the sync point implies some form of locking which will have a performance impact.
Those are all rather trivial to maintain (create, attach, detach) as the logic has no knowledge of that. That's something the render thread takes care of.
Sync points aren't a problem either because you don't need to lock to sync except in some rare cases (usually if the logic thread needs to know something from render thread, and it can't be delayed for the next frame). If you couldn't pass something this frame to/from render thread, then you'll do that on the next frame.
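The "if it didn't go through this frame, do it next frame" idea can be sketched with a mailbox that never blocks. This is only an illustration under assumed names (`Mailbox`, `tryPublish`, `tryConsume`); it uses a `std::atomic_flag` as a try-only guard, so neither thread ever waits on the other.

```cpp
#include <atomic>
#include <vector>

struct Position { float x, y, z; };

class Mailbox {
public:
    // Logic thread: try to publish the latest state. Returns false if the
    // render thread is using the slot right now; the caller just tries
    // again on its next frame.
    bool tryPublish(const std::vector<Position>& latest) {
        if (busy_.test_and_set(std::memory_order_acquire)) return false;
        slot_ = latest;
        fresh_ = true;
        busy_.clear(std::memory_order_release);
        return true;
    }

    // Render thread: try to pick up new state. Returns false if nothing new
    // has arrived or the slot is busy; the caller keeps rendering with the
    // data it already has.
    bool tryConsume(std::vector<Position>& out) {
        if (busy_.test_and_set(std::memory_order_acquire)) return false;
        const bool hadData = fresh_;
        if (hadData) {
            out.swap(slot_);
            fresh_ = false;
        }
        busy_.clear(std::memory_order_release);
        return hadData;
    }

private:
    std::atomic_flag busy_ = ATOMIC_FLAG_INIT;
    std::vector<Position> slot_;
    bool fresh_ = false;
};
```

The trade-off is the one described above: a missed exchange costs a frame of latency rather than a stall.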

Logic doesn't require Graphics to be initialized at all, as it normally shouldn't be reading data back from Graphics. This means an Entity could spawn and stay invisible for a few frames because sync intervals were missed (though rarely with more than a frame of latency).

In other words, maintaining a render split between Logic and Graphics is very similar to CPU and GPU; usually we issue commands CPU -> GPU, and when we need something back we either buffer the result and query later whether it is ready, or we just stall (block the main thread). Though GPU -> CPU usually involves 2 or 3 frames of latency, whereas when you're threading, times tend to be much faster (it really depends on how long it takes for the rendering thread to get to the sync point).
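For the "buffer the result and query later" case, a standard `std::future` is one way to sketch it: the render side fulfils a `std::promise`, and logic polls once per frame without blocking. `pollResult` is a hypothetical helper written for this example.

```cpp
#include <chrono>
#include <future>

// Logic side: check a pending result without blocking. Returns true and
// fills `out` once the render side has answered; otherwise returns false
// and the caller checks again on a later frame.
template <typename T>
bool pollResult(std::future<T>& pending, T& out) {
    if (!pending.valid()) return false;
    if (pending.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
        return false;
    out = pending.get();  // ready, so this does not block
    return true;
}
```

Calling `pending.get()` directly instead of polling is the "just stall" variant: it blocks the logic thread until the render thread reaches the point where it sets the value.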