Hmm... your theory about frames getting skipped sounds interesting, but it may be caused by triple buffering instead.
What happens in Metal is not caused by VSync itself, but it is related:
Metal requires us to acquire a Drawable. This is done in MetalWindow::nextDrawable. A synonym for "Drawable" in other APIs' terms would be backbuffer. We must acquire exactly one Drawable per frame. Acquiring more than one would be a waste; acquiring none means skipping the frame (and also causes lots of API issues, because you can't issue commands to a nil RenderTarget, though you can use a discardable RenderTarget as a dummy).
The Metal driver has a limited number of Drawables per process (usually between 2 and 4). If you request them too fast, you may not get one. This can happen if, for example, you're requesting them at 60fps but the GPU is taking 50ms per frame (20 fps), so by the time you request your third frame from the CPU, the GPU has at best only just finished the rendering commands issued for the first frame.
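To put rough numbers on that example, here is a minimal C++ sketch (not Ogre or Metal code). It assumes a pool of `poolSize` drawables, a CPU that requests one every `cpuIntervalMs`, and a GPU that renders frames one at a time in `gpuFrameMs` each and hands the drawable back when it finishes. The function name and all parameters are hypothetical, and the model ignores the time the display itself holds a drawable.

```cpp
#include <cassert>
#include <queue>

// Hypothetical model, not the Metal API.
// Returns the 0-based index of the first request that finds the pool empty,
// or -1 if none of the first maxRequests requests fails.
int firstFailedRequest(int poolSize, double cpuIntervalMs, double gpuFrameMs,
                       int maxRequests)
{
    assert(poolSize > 0 && cpuIntervalMs > 0.0 && gpuFrameMs > 0.0);

    std::queue<double> releaseTimes; // when each in-use drawable is handed back
    double gpuFreeAt = 0.0;          // when the GPU finishes its current backlog

    for (int i = 0; i < maxRequests; ++i)
    {
        const double now = i * cpuIntervalMs;

        // Hand back drawables whose frames the GPU has already finished.
        while (!releaseTimes.empty() && releaseTimes.front() <= now)
            releaseTimes.pop();

        if (static_cast<int>(releaseTimes.size()) >= poolSize)
            return i; // every drawable is still in use

        // The GPU renders frames in order, one after the other.
        const double gpuStart = gpuFreeAt > now ? gpuFreeAt : now;
        gpuFreeAt = gpuStart + gpuFrameMs;
        releaseTimes.push(gpuFreeAt);
    }
    return -1;
}
```

With a 16ms request interval and a 50ms GPU frame, a pool of 2 fails on the third request (index 2) and a pool of 3 on the fourth; with a 10ms GPU frame the pool never runs dry.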
To avoid running out of Drawables, and also to prevent race conditions (i.e. the CPU must not write to GPU buffer regions that the GPU is currently using), we use a global semaphore in MetalRenderSystem::_endFrameOnce:
void MetalRenderSystem::_endFrameOnce(void)
{
    RenderSystem::_endFrameOnce();
    cleanAutoParamsBuffers();
    __block dispatch_semaphore_t blockSemaphore = mMainGpuSyncSemaphore;
    [mActiveDevice->mCurrentCommandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer)
    {
        // GPU has completed rendering the frame and is done using the contents of any buffers
        // previously encoded on the CPU for that frame. Signal the semaphore and allow the CPU
        // to proceed and construct the next frame.
        dispatch_semaphore_signal( blockSemaphore );
    }];
    mActiveDevice->commitAndNextCommandBuffer();
//...
This semaphore is triple buffered (it allows up to three frames in flight) because of the hard-coded value "c_inFlightCommandBuffers" in MetalRenderSystem::_createRenderWindow.
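For readers who don't know GCD semaphores, the pattern can be sketched in portable C++. This is only an illustration of a counting semaphore that limits frames in flight, not Ogre's code: `InFlightLimiter` and its methods are hypothetical names, and the matching wait call (not shown in the excerpt above) is assumed to happen before the CPU starts building a new frame.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a dispatch semaphore created with an initial count
// (the role played by c_inFlightCommandBuffers in the post).
class InFlightLimiter
{
public:
    explicit InFlightLimiter(int maxInFlight) : mAvailable(maxInFlight)
    {
        assert(maxInFlight > 0);
    }

    // CPU side: block until a frame slot is free, then take it.
    void wait()
    {
        std::unique_lock<std::mutex> lock(mMutex);
        mCond.wait(lock, [this] { return mAvailable > 0; });
        --mAvailable;
    }

    // Non-blocking variant: returns false if every slot is in use.
    bool tryWait()
    {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mAvailable == 0)
            return false;
        --mAvailable;
        return true;
    }

    // GPU completion handler side: give the slot back.
    void signal()
    {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            ++mAvailable;
        }
        mCond.notify_one();
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    int mAvailable; // number of frames the CPU may still start
};
```

With a count of 3, the CPU can start three frames without blocking; the fourth has to wait until a completion handler calls `signal()`.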
My THEORY is that this is what's happening:
- Grab drawable
- Ogre culls and does everything, Issues render commands
- CPU side it takes 2ms
- GPU starts rendering
- Grab drawable
- Ogre culls and does everything, Issues render commands
- CPU side it takes 2ms
- GPU starts rendering
- Wait for the semaphore so the Drawable from steps 1-4 becomes available again (it won't be available until the next VBlank). This takes between 0 and 16ms
- Grab drawable
- Ogre culls and does everything, Issues render commands
- CPU side it takes 2ms
- GPU starts rendering
- Wait for the semaphore so a Drawable becomes available again (it won't be available until the next VBlank). This takes between 0 and 16ms
- Grab drawable
- Ogre culls and does everything, Issues render commands
- CPU side it takes 2ms
- GPU starts rendering
...
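The sequence above can be sketched as a small simulation. This is a toy model of the theory, not a measurement and not Ogre code. It assumes a frame's slot only comes back on the first VBlank at or after the GPU finishes it, with at most one frame presented per VBlank; it lumps the semaphore and the Drawable together, which the real code does not necessarily do. `simulateFrameStarts` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical model: returns the time (ms) at which the CPU starts each frame.
std::vector<double> simulateFrameStarts(int maxInFlight, double cpuMs, double gpuMs,
                                        double vsyncMs, int frameCount)
{
    assert(maxInFlight > 0 && vsyncMs > 0.0 && frameCount >= 0);

    std::vector<double> starts;
    std::vector<double> released; // when frame i's slot is handed back
    double cpuTime = 0.0;         // CPU clock
    double gpuFreeAt = 0.0;       // when the GPU finishes its backlog
    double lastPresent = -vsyncMs;

    for (int i = 0; i < frameCount; ++i)
    {
        // "Wait for semaphore": frame (i - maxInFlight) must have been released.
        if (i >= maxInFlight)
            cpuTime = std::max(cpuTime, released[i - maxInFlight]);

        starts.push_back(cpuTime); // "Grab drawable"
        cpuTime += cpuMs;          // cull and issue render commands

        // "GPU starts rendering" once the commands are in and it is idle.
        const double gpuStart = std::max(cpuTime, gpuFreeAt);
        gpuFreeAt = gpuStart + gpuMs;

        // Present on the first VBlank at or after completion, one frame per VBlank.
        const double vblank = std::ceil(gpuFreeAt / vsyncMs) * vsyncMs;
        lastPresent = std::max(vblank, lastPresent + vsyncMs);
        released.push_back(lastPresent);
    }
    return starts;
}
```

With two frames in flight, 2ms of CPU work, 1ms of GPU work and a 16ms VBlank interval, the model starts frames at 0, 2, 16, 32, 48 and 64ms: the first two are 2ms apart, then the CPU locks to the VBlank rate.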
In other words, my theory is that the CPU side only takes 2ms, but eventually the CPU must wait for the GPU, either because the GPU has to catch up (due to the frame taking too long to render) or because of VSync.
To test this theory, you should modify c_inFlightCommandBuffers in MetalRenderSystem::_createRenderWindow. Try the values 4, 2 and 1 and see how that affects the timing variance. Note that very high values (either > 4 or >= 4, I'm not sure which) will spam the log with errors about how the drawable couldn't be acquired (because we ran out of them).
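To compare the runs, something like the helper below could summarise the frame times you record for each value. It is a minimal sketch: `FrameStats` and `computeFrameStats` are hypothetical names, and how you collect the samples is up to you (for example, timing each frame of your render loop with std::chrono::steady_clock).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not part of Ogre.
struct FrameStats
{
    double meanMs;
    double stddevMs; // population standard deviation
    double minMs;
    double maxMs;
};

FrameStats computeFrameStats(const std::vector<double>& frameTimesMs)
{
    assert(!frameTimesMs.empty());

    double sum = 0.0;
    double minMs = frameTimesMs[0];
    double maxMs = frameTimesMs[0];
    for (double t : frameTimesMs)
    {
        sum += t;
        if (t < minMs) minMs = t;
        if (t > maxMs) maxMs = t;
    }
    const double mean = sum / static_cast<double>(frameTimesMs.size());

    double squaredDiffs = 0.0;
    for (double t : frameTimesMs)
        squaredDiffs += (t - mean) * (t - mean);
    const double variance = squaredDiffs / static_cast<double>(frameTimesMs.size());

    return FrameStats{ mean, std::sqrt(variance), minMs, maxMs };
}
```

If the theory holds, the mean should stay near the VSync interval for every value, while the spread between the shortest and longest frame changes with the number of frames in flight.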