Hotshot5000 wrote: ↑Wed Nov 15, 2017 11:55 am
What are the (planned) additions/changes for 2.2? Is it (semi-)stable? I'm asking because in December I will have some free time to experiment and I'm thinking maybe I should make GLES3 work on the 2.2 branch if you say there could be fewer bugs with Android. I'm already behind in that GLES3 works with a 2.1 version from March 2017 so I'll need to become up to date with the 2.1 branch first, but while I'm there, maybe I can also do something for 2.2.
Semi-stable is probably the right word for it.
I wonder how hard it would be to make the GL3+ code also serve GLES3 with a couple of ifdefs (so we can share common code more easily).
I'm compiling a cheat sheet on the changes; this is what I have so far:
The following will be soon removed from our codebase:
- Texture
- HardwarePixelBuffer
- RenderTarget / RenderTexture / RenderWindow
The following are the equivalences:
- Texture -> TextureGpu
- Rendering: RenderTarget / RenderTexture / RenderWindow -> TextureGpu
- Writing to a texture:
Old:
- Map a HardwarePixelBuffer (synchronous)
New:
- Map a StagingTexture
- Upload from StagingTexture to TextureGpu (asynchronous)
- Reading from a texture:
Old:
- Map a HardwarePixelBuffer (synchronous)
New:
- Download to an AsyncTextureGpu (asynchronous)
- Map the AsyncTextureGpu
Things to watch out for when porting to 2.2:
- Previously numMipmaps = 0 meant the texture only had the main mip. Now numMipmaps = 0 is impossible, since the main mip is also counted. This means you need to watch out for getNumMipmaps() vs getNumMipmaps() + 1; similarly, for( i=0; i<=numMipmaps; ++i ) needs to be changed to for( i=0; i<numMipmaps; ++i )
- TexturePtr was default-initialized to 0 because it's a SharedPtr. TextureGpu is not; it's a raw pointer. This is also a common problem with arrays of textures, i.e. TexturePtr myTextures[5];
- Compositor textures are non-msaa by default. Use msaa <number of samples> or msaa_auto. In Ogre 2.1, all compositor textures defaulted to being msaa.
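For compositor scripts that relied on the old default, the fix looks roughly like this (texture name and pixel format here are placeholders; the exact tokens may differ in your version):

```
compositor_node MyNodeName
{
	// Ogre 2.2: textures are non-msaa unless requested explicitly
	texture rt0 target_width target_height PFG_RGBA8_UNORM msaa_auto
}
```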
So that's basically the change. We collapsed Texture, RenderTarget & HardwarePixelBuffer into one (TextureGpu), and added StagingTexture and AsyncTextureGpu to upload / download data to/from these textures. TextureGpu cannot be mapped directly; there is no TextureGpu::map / unmap.
In GL lingo, StagingTexture and AsyncTextureGpu are just PBOs.
TextureGpu also incorporates internally the notion of being part of a pool (i.e. what HlmsTextureManager does), which means a TextureGpu created with TextureFlags::AutomaticBatching may internally be created as a Texture2DArray and occupy a slice of it (see TextureGpu::getTexturePool and TextureGpu::getInternalSliceStart). Externally, Ogre acts as if the TextureGpu were just a single texture, while internally it is handled as a slice of a Texture2DArray.
Another issue that was fixed is that the old texture system tended to give automipmapping to everything (which meant calling glGenerateMipmaps, wasting memory and CPU cycles on the driver allocating temporary storage for mipmap generation), whereas the new system rarely gives automipmapping, and automipmapping requires setting the flag TextureFlags::RenderTexture (which makes the performance penalty explicit).
Ogre 2.2 does have HW mipmap generation, but the default implementation creates a temporary texture with the RenderTexture and Automipmap flags, copies the mipmaps to the actual texture, and then destroys the temporary texture.
Something internally for porting:
Normal users don't see this, but to render you need to set up a RenderPassDescriptor. RenderPassDescriptors were designed drawing inspiration from Metal, which fits mobile GPU design very well.
Basically rendering gets encapsulated into <load> -> <render> -> <store> semantics. The load semantic can be set to dont_care, clear, or load.
Clear resets the TBDR's tile cache to a single colour (i.e. it's just a glClear). Dont_care loads nothing: the contents are undefined, whatever happens to be in RAM. Load loads the data from RAM.
Similarly, store means we store to RAM; dont_care means we discard the results (e.g. depth and stencil buffer contents can usually be discarded); and then there's Resolve and StoreAndMultisampleResolve.
This means we support "dont_care" load and store semantics, so we can call glInvalidateFramebuffer and tilers can avoid flushing their caches to RAM.
When the store semantic is set to "Resolve" (instead of StoreAndMultisampleResolve) we call glBlitFramebuffer to resolve the MSAA buffer, and we can then call glInvalidateFramebuffer on the MSAA texture so the tiler doesn't flush the MSAA surface's cache to RAM and only writes the resolved texture. This needs testing though; considering Android driver quality, I wouldn't be surprised if the driver gets confused and writes nothing instead.