crancran wrote:It is my understanding that GL3+ with GLX treats the first render window as a main/primary context of sorts. Any subsequent windows that are created are therefore treated as child contexts, so they would essentially inherit the shared state their parent maintains to minimize the memory footprint needed. This design is exactly how the DX11 render system is designed too.
That is correct. It's a D3D9 remnant, and when it comes to GL it's tricky. D3D11 doesn't have this issue at all; it just behaves the same way for consistency (and probably because of how the code is structured, too).
crancran wrote:
It's my understanding there are a number of data items that can be shared among contexts, notably VBOs and GPU programs, but one thing I've read isn't supported is the sharing of VAOs. This seems to contradict the current design in OGRE, where at the time of loading a renderable a VAO is constructed and therefore bound to whatever the "current" GL context happens to be. No other VAO is constructed for the additional contexts, so when the scene pass is executed on the drawable surfaces, one will render correctly while the others will appear black (or will show whatever background viewport color you've specified for your clear pass).
That is half spot on.
For v1 objects, we have one global VAO created at init time, and it works the way you described.
For v2 objects, though, we keep one VAO per vertex format (strictly speaking, one per buffer pool per vertex format; the majority of use cases end up using a single pool, thus one VAO per vertex format, but it's not impossible to have more than one VAO per vertex format). The VaoManager is responsible for handling this.
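To make the idea concrete, here is a minimal sketch of VAOs keyed by (buffer pool, vertex format). This is not Ogre's actual VaoManager code; the class, struct and parameter names are hypothetical, and the GL loader header is whatever your project already uses.

```cpp
// Minimal sketch (not Ogre's VaoManager): one VAO per (buffer pool, vertex format)
// pair, so all renderables sharing a layout and pool reuse the same VAO.
#include <GL/gl3w.h>   // assumption: any GL loader works here
#include <cstdint>
#include <map>
#include <vector>

struct VertexElement { GLuint location; GLint size; GLenum type; GLsizei offset; };
using VertexFormat = std::vector<VertexElement>;

struct VaoKey
{
    uint32_t poolId;        // which buffer pool the vertex data lives in
    uint32_t formatHash;    // hash of the vertex format
    bool operator<( const VaoKey &o ) const
    { return poolId != o.poolId ? poolId < o.poolId : formatHash < o.formatHash; }
};

class SimpleVaoManager
{
    std::map<VaoKey, GLuint> mVaos;
public:
    // Returns the existing VAO for this pool/format pair, or creates one.
    GLuint getOrCreateVao( uint32_t poolId, uint32_t formatHash,
                           GLuint vbo, GLuint ibo,
                           const VertexFormat &fmt, GLsizei stride )
    {
        const VaoKey key = { poolId, formatHash };
        auto it = mVaos.find( key );
        if( it != mVaos.end() )
            return it->second;

        GLuint vao = 0;
        glGenVertexArrays( 1, &vao );
        glBindVertexArray( vao );
        glBindBuffer( GL_ARRAY_BUFFER, vbo );
        glBindBuffer( GL_ELEMENT_ARRAY_BUFFER, ibo );
        for( const VertexElement &e : fmt )
        {
            glEnableVertexAttribArray( e.location );
            glVertexAttribPointer( e.location, e.size, e.type, GL_FALSE, stride,
                                   reinterpret_cast<void*>( static_cast<intptr_t>( e.offset ) ) );
        }
        glBindVertexArray( 0 );
        mVaos[key] = vao;
        return vao;
    }
};
```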
crancran wrote:Is the notion of using a VAO like this an intentional design choice?
Perhaps there were some performance implications that led to this decision?
The global VAO was just for simplicity. The old v1 code long predates VAOs, and GL3 requires a VAO to be bound, so using a single mutable VAO was a quick and simple solution (one that many engines use).
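For illustration, the pattern looks roughly like this (a hedged sketch, not the actual v1 code; attribute locations and function names are made up):

```cpp
// "One global mutable VAO" pattern: GL3 core refuses to draw without a VAO
// bound, so a single VAO is created at init and the attribute pointers are
// simply re-specified before each draw.
GLuint g_globalVao = 0;

void initGlobalVao()
{
    glGenVertexArrays( 1, &g_globalVao );
    glBindVertexArray( g_globalVao );   // stays bound for the lifetime of the context
}

void drawLegacyMesh( GLuint vbo, GLuint ibo, GLsizei stride, GLsizei indexCount )
{
    // Rebind buffers and re-declare the layout every draw; the VAO is just a
    // container that keeps the core profile happy.
    glBindBuffer( GL_ARRAY_BUFFER, vbo );
    glBindBuffer( GL_ELEMENT_ARRAY_BUFFER, ibo );
    glEnableVertexAttribArray( 0 );   // position; location 0 is illustrative
    glVertexAttribPointer( 0, 3, GL_FLOAT, GL_FALSE, stride, nullptr );
    glDrawElements( GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, nullptr );
}
```

Note that a VAO created like this belongs to the context it was created in; VAOs are not shareable between contexts, which is part of why multiple contexts get painful.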
But for v2 we have multiple VAOs, and it would be extremely complicated and unreasonable to multiply the VAOs based on the number of RenderWindows currently live.
Even if we were to adjust our VAOs to fit context sharing, the real issue is that relying on multiple GL contexts sharing objects is a no-go. The OpenGL specs define sharing loosely and allow a very wide range of behaviors, which makes it impossible to get it running reliably across multiple drivers and platforms. Crashes, glitches, nothing rendering, deadlocks, inexplicable errors, BSODs... context sharing is a Pandora's box.
In practice, GL context sharing is only useful for background work like data streaming and shader compilation (and even then, dealing with driver glitches and driver crashes is a PITA; but if you target one platform you can enable it for the one or two major vendors on which you've tested it).
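For reference, a hedged GLX sketch of that background-work use case: a second context is created with the main one as its share list and made current on a tiny pbuffer inside a worker thread. Function and variable names are illustrative, and error handling is omitted.

```cpp
#include <GL/glx.h>
#include <GL/glxext.h>   // GLX_CONTEXT_*_ARB, PFNGLXCREATECONTEXTATTRIBSARBPROC

// Create a second context that shares buffers/textures/programs (but NOT VAOs)
// with mainCtx; it is only ever made current on a worker thread.
GLXContext createLoaderContext( Display *dpy, GLXFBConfig fbConfig, GLXContext mainCtx )
{
    PFNGLXCREATECONTEXTATTRIBSARBPROC glXCreateContextAttribsARB =
        (PFNGLXCREATECONTEXTATTRIBSARBPROC)glXGetProcAddressARB(
            (const GLubyte*)"glXCreateContextAttribsARB" );

    const int ctxAttribs[] = {
        GLX_CONTEXT_MAJOR_VERSION_ARB, 3,
        GLX_CONTEXT_MINOR_VERSION_ARB, 3,
        None
    };
    // Third argument is the share list.
    return glXCreateContextAttribsARB( dpy, fbConfig, mainCtx, True, ctxAttribs );
}

// Worker thread: bind the shared context to a 1x1 pbuffer and do the
// background uploads/compiles there.
void loaderThreadBody( Display *dpy, GLXFBConfig fbConfig, GLXContext loaderCtx )
{
    const int pbAttribs[] = { GLX_PBUFFER_WIDTH, 1, GLX_PBUFFER_HEIGHT, 1, None };
    GLXPbuffer pbuffer = glXCreatePbuffer( dpy, fbConfig, pbAttribs );
    glXMakeContextCurrent( dpy, pbuffer, pbuffer, loaderCtx );
    // ... glBufferData / glCompileShader work goes here; signal the main thread when done ...
}
```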
Using one GL context with multiple drawables is a far more stable approach in general.
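As a sketch of what that looks like with GLX (illustrative names, no error handling): the same GLXContext is simply re-pointed at whichever window is being rendered.

```cpp
#include <GL/glx.h>

// One context, several drawables: all VAOs/VBOs/programs remain valid for
// every window because the context itself never changes.
void renderAllWindows( Display *dpy, GLXContext ctx, GLXWindow winA, GLXWindow winB )
{
    glXMakeContextCurrent( dpy, winA, winA, ctx );
    // ... issue draw calls for window A ...
    glXSwapBuffers( dpy, winA );

    glXMakeContextCurrent( dpy, winB, winB, ctx );
    // ... issue draw calls for window B ...
    glXSwapBuffers( dpy, winB );
}
```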
crancran wrote:In order to use the "currentGLContext=true" option, it really appears that it boils down to a compositor manager change where the final render target swap needs to happen before moving onto the next surface to be drawn. I've tried to locate documentation that explains why that is necessary, but maybe it is just that Mesa must do it this way when sharing a single context with multiple drawables.
Yeah, it's not necessary for Ogre, but it was necessary for certain RenderSystems. I remember D3D9 (which we no longer support) was the pickiest about it; but in general the order in which RenderWindows are swapped is mostly just about getting drivers to work, and swapping at the end used to work with all the combinations we had tried so far (obviously not the case here). Since it's not a hard requirement, this can be adapted.
If Mesa requires swapping buffers before switching the GL context to a different drawable, it might just be a Mesa bug; nonetheless, we could detect whether Mesa is being used and do a swap before switching (or do it for every GL implementation). It may also be that calling glFlush() would work (likely glFinish() too, but that would be slow).
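Something along these lines is what I have in mind (a hedged sketch, not committed code; the function names and the swapNow flag are hypothetical):

```cpp
#include <GL/glx.h>
#include <cstring>

// Mesa advertises itself in the GL_VERSION string.
bool isMesaDriver()
{
    const char *version = reinterpret_cast<const char*>( glGetString( GL_VERSION ) );
    return version && std::strstr( version, "Mesa" ) != nullptr;
}

// Before re-pointing the context at a different drawable, either present the
// current one or at least flush the pending commands.
void switchDrawable( Display *dpy, GLXContext ctx,
                     GLXWindow current, GLXWindow next, bool swapNow )
{
    if( swapNow )
        glXSwapBuffers( dpy, current ); // swap before leaving this drawable
    else if( isMesaDriver() )
        glFlush();                      // cheaper fallback; glFinish() would stall
    glXMakeContextCurrent( dpy, next, next, ctx );
}
```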