First of all, regarding the threading/tasks issue: I quickly wrote an addendum to make a few things clearer. I'm updating the original post so it can be viewed there too.
I didn't include references & such because these are more like notes, and I didn't have the time.
masterfalcon wrote:My understanding of some of the plans is a little sketchy. I haven't been keeping up with it as much as I should have. But I believe one of the main goals is speed up scene graph updates via threads.
Discussion in the forum topic went straight to threading. But as for the slides, the topics were:
- Eliminate cache misses & cache pollution caused by chasing pointers to relevant data (see the sketch after this list)
- Smarter behavior about culling & traversing the scene
- Data-level parallelization (SIMD)
- Execution-level parallelization (threading)
- Other engine design changes.
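To illustrate the first and third points, here's a minimal sketch (illustrative names, not actual Ogre code) of what the data-oriented idea means in practice: transforms stored as structure-of-arrays, so the update loop streams linearly through memory instead of chasing Node pointers, and the compiler can vectorize it.

#include <cstddef>
#include <vector>

// One array per component instead of an array of heap-allocated Nodes.
struct TransformsSoA
{
    std::vector<float> localX, localY, localZ;
    std::vector<float> worldX, worldY, worldZ;
};

// Branchless, pointer-free loop: prefetch- and SIMD-friendly.
// A real implementation would concatenate full matrices per node;
// a plain translation keeps the sketch short.
void updateWorldPositions( TransformsSoA &t, const float parentPos[3] )
{
    const size_t numNodes = t.localX.size();
    for( size_t i = 0; i < numNodes; ++i )
    {
        t.worldX[i] = parentPos[0] + t.localX[i];
        t.worldY[i] = parentPos[1] + t.localY[i];
        t.worldZ[i] = parentPos[2] + t.localZ[i];
    }
}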
Sqeaky wrote:Sqeaky's whole posts
Sqeaky seems to have a good understanding of my intentions. He also seems to have a good grasp of the hazards of the parallel world. I agree with everything he said.
Sqeaky wrote:It seems every body agrees Transforms -> Cull -> Complex Visuals -> Render, but which stages depend on each other for data and which connections are essential? How many of these become into 1 Task/WorkUnit and how many become some data driven amount of Tasks/WorkUnits?
I'm going to assume "Complex Visuals = Sorting & Material preparation".
There's no ultimate answer; "it depends". For example, if the game consists of a simple scene with one pass and no shadows, then cull -> transform is preferred, because you only update what's visible. This was your typical 1999 game.
But if you have shadows, environment maps, and/or multiple passes, then cull -> transform may be counterproductive, because you may need to transform the same object in each pass where it appears (i.e. no reuse). Transforming before culling may iterate over a few objects that end up culled in every pass, but it tends to be much more scalable, because the number of non-culled objects is statistically much higher.
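A rough sketch of the trade-off (stub types with made-up names, not Ogre's API): with one pass, culling first touches fewer objects; with N passes, transforming first does the work once instead of once per pass.

#include <vector>

struct Camera {};
struct Object
{
    bool transformDirty = true;
    void updateWorldTransform() { transformDirty = false; }
};
struct Scene { std::vector<Object*> objects; };

// Stub: returns the subset of objects visible from cam.
std::vector<Object*> cull( Scene &scene, const Camera &cam )
{
    return scene.objects; // real code tests bounds against the frustum
}

// One pass, no shadows: cull first, transform only what survived.
void cullThenTransform( Scene &scene, const Camera &cam )
{
    for( Object *obj : cull( scene, cam ) )
        obj->updateWorldTransform();
}

// Many passes (shadow maps, env. maps): transform everything once,
// then each pass only culls; no object is ever transformed twice.
void transformThenCull( Scene &scene, const std::vector<Camera> &passes )
{
    for( Object *obj : scene.objects )
        obj->updateWorldTransform();
    for( const Camera &cam : passes )
        std::vector<Object*> visible = cull( scene, cam ); // feeds this pass' render queue
}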
tuan kuranes wrote:
For Each scene: const shared Nodes Buffer -> Transformed Nodes Buffer
For Each Viewport: const shared Transformed Nodes Buffer -> Culled Transformed Node Buffer (threadable)
For Each renderTarget: const shared Culled Transformed Node Buffer -> RenderQueue (threadable)
For Each RenderQueue: const shared Render Queue -> Command Buffer
Transforming the nodes is threadable. Baking the command buffer is threadable in D3D11 (apparently not in GL?).
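A minimal sketch of how the "const shared Nodes Buffer -> Transformed Nodes Buffer" stage can be threaded, assuming C++11 threads (names are illustrative): the input is read-only and each thread writes a disjoint slice of the output, so no locks are needed.

#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

struct Node { float local[3]; float world[3]; };

// Each worker transforms its own half-open range [begin, end).
void transformRange( std::vector<Node> &nodes, size_t begin, size_t end )
{
    for( size_t i = begin; i < end; ++i )
        for( int c = 0; c < 3; ++c )
            nodes[i].world[c] = nodes[i].local[c]; // real code: concatenate parent matrices
}

void transformNodesThreaded( std::vector<Node> &nodes, size_t numThreads )
{
    std::vector<std::thread> workers;
    const size_t perThread = ( nodes.size() + numThreads - 1 ) / numThreads;
    for( size_t t = 0; t < numThreads; ++t )
    {
        const size_t begin = std::min( t * perThread, nodes.size() );
        const size_t end   = std::min( begin + perThread, nodes.size() );
        workers.emplace_back( transformRange, std::ref( nodes ), begin, end );
    }
    for( std::thread &w : workers )
        w.join(); // stage barrier: culling starts only once every slice is done
}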
I'd like to start thinking of "compositor output" rather than Viewport, since it's easier to visualize and work with. Viewports are still useful for setting up render area & some old school tricks; and ultimately each compositor outputs to a viewport.
sparkprime wrote: sparkprime's whole post
I agree with Sparkprime. And his rant is hilarious.
Although I don't agree with his view on cell phones. The next generation of phones is going to be a beast (the Samsung Galaxy SIII's computing power is quite impressive) and GL ES 3.0 is a real step forward (maybe the Khronos Group is doing something good with GL for once?).
As for the PC vs Phone vs Tablet "who will win?" debate, that's a fairly subjective point. The media tends to hype "the PC is dying" because that's investor lingo for "sales increasing more slowly than before".
If we look at history, live theater took quite a hit when movies appeared. And radio took quite a hit when TV appeared. And TV started "dying" when broadband internet became mainstream. People now spend more time on the PC than on the TV.
And the movie industry was predicted to be "dead by now" because of online piracy, yet a few movies have sold quite well for a dead medium.
Radio wasn't replaced by TV; it had to share a similar market.
Most likely PC vs Phone/Tablet will follow the same pattern. Phones will grow too big to be ignored and will hurt the PC, but the PC won't disappear. I could be wrong though; exceptions happen.
Therefore, maintaining two different APIs is a tremendous amount of work we're not able to cope with. Of course, at a higher level the Ogre user can't aim for plug 'n play ports between PC & phones unless the computing power his game requires is modest. And he still needs to work out the shader quirks (as we don't provide uber shaders like UDK does).
As for the static render systems, that's an interesting approach. I don't know how bad the virtual call overhead will be after we separate states from SceneManager & RenderSystem; but certainly phones & consoles are the platforms most affected.
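An illustrative sketch of the idea (hypothetical names, not Ogre's real interfaces): picking the backend as a template parameter instead of going through a virtual interface lets the compiler inline the per-draw calls, which matters most on phones & consoles where only one API exists anyway.

#include <cstdio>

// The dynamic version: one virtual call per state change or draw.
class RenderSystem
{
public:
    virtual ~RenderSystem() {}
    virtual void setDepthCheck( bool enabled ) = 0;
};

// The static version: a concrete backend with no virtuals.
class GLES2RenderSystem
{
public:
    void setDepthCheck( bool enabled )
        { std::printf( "glDepthFunc, enabled = %i\n", (int)enabled ); }
};

// Statically dispatched render loop; setDepthCheck can be inlined.
template <typename RS>
void renderOneFrame( RS &renderSystem )
{
    renderSystem.setDepthCheck( true );
    // ... issue draw calls ...
}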
Fallback support for DX11 & DX9 isn't hard, really. Just look at all the other games: ship 2 executables + 1 launcher. The launcher detects DX support (or a GL option) and launches the right program.
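A minimal Win32 sketch of the launcher idea (the exe names are made up): probe for the D3D11 runtime and start the matching build.

#include <windows.h>

int main()
{
    // If d3d11.dll loads, the D3D11 runtime is installed. Note: a real
    // launcher would also verify hardware feature levels (e.g. via
    // D3D11CreateDevice) rather than trusting the DLL's mere presence.
    HMODULE d3d11 = LoadLibraryA( "d3d11.dll" );
    const char *exe = d3d11 ? "Game_DX11.exe" : "Game_DX9.exe";
    if( d3d11 )
        FreeLibrary( d3d11 );

    STARTUPINFOA si = { sizeof( si ) };
    PROCESS_INFORMATION pi;
    if( CreateProcessA( exe, NULL, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi ) )
    {
        CloseHandle( pi.hThread );
        CloseHandle( pi.hProcess );
    }
    return 0;
}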
sparkprime wrote:What I would propose is doing the work from back to front. First fixing the rendersystem API (...)
Well, you propose working in the reverse order from mine. I'm writing the series of tasks regarding scene management, but it's a bit lacking on the RenderSystem side. You could help a little there by adding those tasks.
They can be done at the same time. I prefer focusing on the tasks I propose first, then the render APIs; but there's no reason they can't be done concurrently if enough volunteers are found. I know what I'll be working on, but I can't speak for the rest of the team and contributors.
sparkprime wrote:(...) It needs cleaning up too, with the removal of fixed function functionality. It seems we're coming close to this with the GL4 and D3D11 work nearing completion.
Well, one of the reasons I came up with the idea of States (which isn't new, btw) is not only performance and usability (since RTSS should be able to build shaders for newbies based on their states), but also the fear that FF makes a comeback. You never really know. A new architecture. A new technique. Who knows.
A year ago Forward Rendering was dying. Today, Deferred Shading using G-Buffers has its days numbered. In 2006 FF was dying, but Nintendo chose to launch the Wii with FF. And in 2010 FF appeared to be a dead end, but it turned out to be a good fit for the tessellator (since Geometry Shaders sucked at that).
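To give an idea of what I mean by States, here's a hedged sketch (field names are made up, not a final design): plain structs of render settings that live outside both SceneManager & RenderSystem, so the render system can consume them directly and a shader generator like RTSS can read them to emit matching shaders.

// Plain data, no virtuals: easy to sort, hash, compare and serialize.
struct DepthState
{
    bool depthCheck;
    bool depthWrite;
    int  depthFunc;     // e.g. CMPF_LESS_EQUAL
};

struct BlendState
{
    bool blendingEnabled;
    int  sourceFactor;  // e.g. SBF_SOURCE_ALPHA
    int  destFactor;    // e.g. SBF_ONE_MINUS_SOURCE_ALPHA
};

struct RenderState
{
    DepthState depth;
    BlendState blend;
    int        cullMode;
    // A shader generator (RTSS) can walk these fields to build a
    // shader matching the fixed-function behaviour they describe.
};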