Re: [GSoC 2013 - accepted] Ogre 2.0
Posted: Sat Sep 07, 2013 12:51 pm
For clarification, are you planning on completely omitting Boost and other threading options from your version, or something similar to what Klaim said?
Klaim wrote: Just to clarify, here you are talking exclusively about rendering/updating animations/particles, not resource loading, right?
Yes.
Klaim wrote: If you use custom threads, then I see potential oversubscription problems in the coming future (or with my current code).
You must tell Ogre how many threads you want it to create when creating the scene manager. That is in your control.
Klaim wrote: I don't agree, because I have no confidence (by experience) in custom code doing low-level synchronization. It's not you I have no confidence in; it's low-level synchronization code. I guess it's OK for now if you don't find any problems, but I also suspect library code is of higher quality, and certainly both easier to maintain and debug, than anything custom we can put in Ogre's code. Just for the sake of keeping Ogre only about graphics, custom code is not a good idea. I understand that adding libraries right now is not really ideal, but I suspect there are enough constructs in the C++11 standard library to easily express the same semantics. I suggest using the standard library instead (for the long term), NOT NOW, but as soon as the Ogre version is stable (somewhere next year?). These things frankly scare me a lot (all my current code is concurrent).
Why does everyone put so much fuss on threading libraries?
drwbns wrote: For clarification, are you planning on completely omitting Boost and other threading options from your version, or something similar to what Klaim said?
Omitting Boost and other threading options. In pthreads terms, all I'll be using is a pthread_barrier_t and pthread_create (and pthread_join on shutdown).
Klaim wrote: If you use custom threads, then I see potential oversubscription problems in the coming future (or with my current code).
dark_sylinc wrote: You must tell Ogre how many threads you want it to create when creating the scene manager. That is in your control.
Perfect.
dark_sylinc wrote: Why does everyone put so much fuss on threading libraries?
Because on the correctness, performance and maintenance side, their implementations are a moving target at this time. Their interfaces are not.
Again, I know you know what you're doing at the implementation level. What I'm saying is that this is not a good long-term strategy, in the same way that choosing standard library containers was a good long-term strategy for Ogre in general.
dark_sylinc wrote: I'm going to sound arrogant on this one: it feels like I'm trying to do 2+2 and everyone is suggesting I should use a bignum library.
A thread barrier is one of the most basic synchronization primitives. We could even use OpenMP if it weren't for poor compiler support.
I've done a lot of very hard multithreaded programming, and this job is the simplest threading example I can think of; it is usually done best using native synchronization primitives.
It's synchronous multithreading, not asynchronous. Threading libraries are meant to deal with asynchronous multithreading problems. For synchronous multithreading, all libraries just wrap up the basic synchronization primitives for a given platform and call it a day.
Yes, exactly. That's the point, actually. Even std::future::get() is currently considered synchronous multithreading (with a simple and readable interface).
PhilipLB wrote: And if it's nicely abstracted, there is no obstacle to overriding the Ogre solution with Boost. And it would be possible to later exchange the implementation with Boost, without breaking the API, in a later Ogre version if needed. -> Sounds good to me.
I agree; as long as it's abstracted, it's OK for me too.
PhilipLB wrote: And if it's nicely abstracted, there is no obstacle to overriding the Ogre solution with Boost. [...]
+1
PhilipLB wrote: And if it's nicely abstracted, there is no obstacle to overriding the Ogre solution with Boost. [...]
+2!
Klaim wrote: 1. ... "2.2 How to debug MovableObjects' (and Nodes') data": I'm lost starting from "Nevertheless, debugging SSE builds directly isn't that difficult." I'm not sure if it's because I'm not an expert on the subject or because of the way you explain it. I get a glimpse of understanding after a lot of re-reading. Am I correct that you're saying SSE builds imply different arrays for different sizes of vector values, and then mIndex tells you which array is to be used? Or something like that...
Yes. When SSE2 is used, ARRAY_PACKED_REALS = 4, and mOwner is declared as:
Code: Select all
Node **mOwner[ARRAY_PACKED_REALS];
Klaim wrote: 2. "2.2.1 Interpreting ArrayVector3": am I correct that m_chunkBase (in the image) is a transform matrix? It was not immediately clear to me (but maybe because I'm not used to looking at them directly). If I'm correct, saying it directly (maybe in the variable name?) would help with interpreting the data when debugging.
No. ArrayVector3 packs 4 Vector3s together (in an SSE2 build, because ARRAY_PACKED_REALS = 4).
Klaim wrote: 4. Could you add a quick clarification (just after "names are optional", maybe) about the SceneMemoryMgrTypes argument in the entity creation function? So that the change in how static/dynamic is managed is clear. Since you later say "don't: Create objects that never move, rotate or scale as SCENE_DYNAMIC" without saying what it's about, I think it's worth having a short section explaining it. Maybe just copy/paste what you already wrote in this discussion.
Mmmm... OK, I'll re-read it to understand what's not clear.
Klaim wrote: 5. About threading, did I understand correctly that threads are associated with scene managers, so if you have several scenes to be rendered at the same time, you need to keep in mind that each scene will have its own set of threads (or not)?
Yes.
Klaim wrote: 6. Am I correct that if numThreads (in the scene/thread initialization example) is 0, then there is just no additional worker thread created?
Yes, though I should clarify that there should be one worker thread (numThreads must be > 0). One CPU = one worker thread.
Klaim wrote: 8. "At the time of writing only HW Basic supports passing the custom parameters. All other techniques will ignore it." Did you mean it's not possible, or just not implemented? (I'm not a specialist in instancing nor in shader-related subjects in general, so I have no idea.)
Oops, sorry. Implementing it in HW VTF is possible (though I have a couple of internal disagreements with myself on how to implement it: put it in the VTF or pass it as extra texcoords).
Klaim wrote: 9. Is CompositorManager2 a final name? Is it supposed to replace CompositorManager later? Or did you mean it's an alternative way of doing it and you think the older way should be kept?
**Sigh** There was a name clash with the old CompositorManager. It has since been removed, but at the time the name clash was there. In other words, CompositorManager2 completely replaces CompositorManager.
spacegaier wrote: Haven't read the whole guide yet, but on the very last page in the "Don't" section: are the first two points correct? In the first one, you mention both STATIC_SCENE and SCENE_STATIC. Is that a typo?
Oops. If you see STATIC_SCENE, I mean SCENE_STATIC (it's a typo).
spacegaier wrote: In the second one, you say objects that never move should be in SCENE_DYNAMIC? Shouldn't it be the other way around? Or maybe one has to combine those statements with the "Don't" header to get the right meaning. In that case, please change the don't points to already include that "don't". As an example: the last bullet point would become: "Do not specify more threads than the number of available cores."
It's in the "Don't" section. You're supposed to read it as "Don't create objects that never move, rotate or scale as SCENE_DYNAMIC".
dark_sylinc wrote: It's in the "Don't" section. You're supposed to read it as "Don't create objects that never move, rotate or scale as SCENE_DYNAMIC".
I almost assumed so, but it is highly misleading. So, as I wrote earlier, I would urge you to rephrase it so that all statements in the "Don't" section also include the "do not" part within them. Otherwise people will get confused!
dark_sylinc wrote: No. ArrayVector3 packs 4 Vector3s together (in an SSE2 build, because ARRAY_PACKED_REALS = 4).
OK, then what I don't understand is this:
So you're looking at a class that contains 4 vectors in one. If the vectors were named A, B, C, D; the picture says:
m_chunkBase[0] = { D.x, C.x, B.x, A.x }
m_chunkBase[1] = { D.y, C.y, B.y, A.y }
m_chunkBase[2] = { D.z, C.z, B.z, A.z }
dark_sylinc wrote: Mmmm... OK, I'll re-read it to understand what's not clear.
What I meant exactly is that in this specific document you never explain the meaning of SCENE_DYNAMIC and SCENE_STATIC, yet you mention these values, even though they are a new thing replacing a previous thing. So I think a short explanation of this at the beginning would be helpful. I know what these are because I followed this thread; it's just the document that is missing the info.
dark_sylinc wrote: Yes, though I should clarify that there should be one worker thread (numThreads must be > 0). One CPU = one worker thread.
Let me rephrase the question: if I want Ogre to never spawn any other thread, and just use the thread calling renderOneFrame() to do everything, should I set 1? Or do you mean there is always at least one thread that will be associated with each scene?
Don't worry, Ogre will throw, telling you that numThreads must be greater than 0.
Klaim wrote: What I meant exactly is that in this specific document you never explain the meaning of SCENE_DYNAMIC and SCENE_STATIC, yet you mention these values [...]
OMG, I totally forgot that I didn't write about that and thought I did!
dark_sylinc wrote: 1. No Node is used for "management". C and D are other nodes' vectors, like you said. A is also another node's position.
dark_sylinc wrote: 2. By looking at the value of mIndex. If mIndex = 0, then I should look at the 4th column. If mIndex = 1, I should look at the 3rd column. If mIndex = 3, I should be looking at the 1st column.
Ooookaaaayyy, now I get it! From the reading I thought that mIndex was the index of the parent only, not of all the related data.
This line of the doc:
Here the debugger is telling us that the center of the Aabb in local space is at (100; 100; 100)
might be improved by mentioning why we take this value and not another (yes, it's repetition, but it's necessary in a human-readable doc ;_; ). For example:
Here the debugger is telling us that the center of the Aabb in local space is at (100; 100; 100), which is the vector made of the values at index mIndex, in reverse order.
Or something similar.
This passage of the doc:
Note that if, for example, the 4th element (in this case it reads (0, 0, 0)) is an empty slot (i.e. there are only 3 entities in the entire scene), it could contain complete garbage, even NaNs. This is OK. We fill the memory with valid values after releasing memory slots to prevent NaNs & INFs, as some architectures are slowed down when such special floating point values are present; but it is possible that some of them slip through (or it is also possible that a neighbour Entity actually contains an infinite extent, for example).
The image shows (0, 0, 0) for both the 4th and the 0th element, so I mixed the two up. It's a bit confusing, but I'm not sure how to improve this part.
Klaim wrote: Let me rephrase the question: if I want Ogre to never spawn any other thread, and just use the thread calling renderOneFrame() to do everything, should I set 1? Or do you mean there is always at least one thread that will be associated with each scene?
dark_sylinc wrote: Ogre always has to spawn another thread. But you don't have to worry about these two threads (Ogre's main thread and its worker thread) running concurrently, because they won't. When Ogre needs the worker thread to run, the main thread will pause. And yes, there is some unnecessary OS overhead (thread context switching) in doing that, which I was worried about; but when I benchmarked the old code vs. the new, the performance drop was less than 1% on a single core (which could easily be attributed to noise), and the amount of effort needed to maintain both a single-threaded path and a multithreaded path was too big for such a minimal margin.
I didn't question the threading impact yet, because I want to focus on helping you clarify what is intended (and happening) in the current setup.
Klaim wrote: If I know exactly what the cost of each scene is, at least I can do some prevention (even if that setup will be a bit expensive for me, I think; but I'll check and report after trying Ogre 2.0).
Though it may be worth saying that we're aiming towards updating the SceneManager through the CompositorManager, and SceneManagers would be updated sequentially. I don't even know if that can be done in parallel.
dark_sylinc wrote: Though it may be worth saying that we're aiming towards updating the SceneManager through the CompositorManager, and SceneManagers would be updated sequentially. I don't even know if that can be done in parallel.
In my specific case, I will sometimes need to prepare a scene before it's visible, so I'll need to create a scene; once it's filled with objects I'll begin rendering it into a texture, and at some point
Since they're updated sequentially, you should provide numThreads with the total number of cores, regardless of how many SceneManagers you create.
Yes, I know, that can spawn a lot of threads unnecessarily (3 SceneManagers on a 6-core machine = 18 threads). Perhaps we can revise this later (not now; there are more important things to do) so that they can share worker threads, as long as you promise not to update those managers at the same time.
Indeed.
Mako_energy wrote: I have what may be an unusual (or ignorant) gripe regarding the attachment and visibility changes made to movable objects. I think forcing objects to always be in the scene may get in the way of some operations in some setups, such as when you want to prepare a scene in advance so you don't need to create objects during gameplay. I understand that there isn't much difference between toggling visibility and removing from the scene, but one critical difference is what shows up in ray/spatial scene queries that devs may want to perform. It is also counter-intuitive to have an accessible setting (set/getVisible) be clobbered based on its attached state, especially when coming from a system that made a distinction between in-scene and visible. Personally I value that distinction, but if there is a technical reason that cannot be, I will trust your judgement. I hope I articulated this well enough; I feel it's not as clear as it could be.
Actually, that was one of the few little-big details that came up organically during development. I didn't plan it, nor foresee it.
Mako_energy wrote: I understand that there isn't much difference between toggling visibility and removing from the scene, but one critical difference is what shows up in ray/spatial scene queries that devs may want to perform.
I cannot stress enough the importance of visibility flags. If you want objects that are invisible during rendering but are still part of the scene (aka ghosts/phantoms), and as such should show up in ray/spatial scene queries, make the objects visible but put them in a special layer that no viewport will use. Hence ray casts will see them, but the objects won't be rendered.
dark_sylinc wrote: By the way, you made me remember that the raycasting system is completely broken and needs to be added to the TODO list. I didn't have time to look into that (I'm sorry!)
Hurray! I'm useful!
dark_sylinc wrote: I cannot stress enough the importance of visibility flags. If you want objects that are invisible during rendering but are still part of the scene (aka ghosts/phantoms) [...] make the objects visible but put them in a special layer that no viewport will use.
Sounds good, but what if I want objects to be removed from all interactions with the scene? Do you plan to have the updated ray scene query skip something if it isn't visible?
dark_sylinc wrote: There are a lot of problems that can be easily solved by playing with visibility flags, which people forget about; they start doing complex modifications to the code to get their specific job done, and then this code they wrote implementing a "new feature" gets integrated into the main branch.
I cannot and will not defend all the different "features" that have appeared over time because of that, but sometimes it does say something about the ease of use of those features if someone is willing to work around the existing way the system is presented. It may be a petty gripe, but between tracking a bunch of bitmasks in different places, and calling "detach" and knowing I've covered my bases... I'd prefer the latter (which is how it works in 1.x, as far as I can tell).
dark_sylinc wrote: I know it's a mentality to change, but with a bit of time you'll see the new system is not just faster, but more flexible and easier to use.
No contest on the first two there, but the third... I hope so.