[2.1][Metal] Delta time between frames are not consistent (v-sync?)

Discussion area about developing with Ogre-Next (2.1, 2.2 and beyond)


Post Reply
xissburg
Halfling
Posts: 83
Joined: Sun Feb 21, 2010 4:58 pm
x 28

[2.1][Metal] Delta time between frames are not consistent (v-sync?)

Post by xissburg »

When I print the value of timeSinceLast at MainLoopSingleThreaded.cpp:131 (e.g.

Code: Select all

std::cout << "dt " << std::fixed << timeSinceLast << std::endl;
) I don't get a consistent value in every frame. It alternates between ~14ms and ~2ms and the application shows 120fps while it looks like it's actually rendering only 60fps because I think that when it takes only ~2ms it's not drawing anything and Metal is just skipping a frame. By inserting that line of code this can be reproduced in any of the samples (in my machine, at least). I don't know if there's any extra synchronization that needs to be done with Metal for v-sync to work properly but from what I've seen there isn't really. Perhaps there's an issue with CPU-GPU sync (https://developer.apple.com/documentati ... ronization) ?
User avatar
dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.1][Metal] Delta time between frames are not consistent (v-sync?)

Post by dark_sylinc »

Mmm.... your theory about frames getting skipped sounds interesting. But it may be caused by triple buffer instead.

What happens in Metal is not that it's because of VSync, but related:

Metal requires us to acquire a Drawable. This is done in MetalWindow::nextDrawable. A synonym for "Drawable" in other API terms would be backbuffer. We must acquire only one Drawable per frame. Doing so more than once would be a waste, doing less means skipping frames (and also lots of API issues because you can't issue commands to a nil RenderTarget, though you can use a discardable RenderTarget to use as a dummy).

Metal driver has a limit amount of Drawables per process (usually between 2 or 4). If you request them too fast, you may not get one. This can happen if for example you're requesting them at 60fps but the GPU is taking 50ms per frame (20 fps) so by the time you request your third frame from the CPU, the GPU has just finished the rendering commands issued to the first frame.

To prevent from running out of Drawables and also to prevent race conditions (i.e. we cannot write to GPU buffer areas from the CPU that are currently in use by the GPU) we use a global semaphore in MetalRenderSystem::_endFrameOnce:

Code: Select all

void MetalRenderSystem::_endFrameOnce(void)
{
	RenderSystem::_endFrameOnce();

	cleanAutoParamsBuffers();

	__block dispatch_semaphore_t blockSemaphore = mMainGpuSyncSemaphore;
	[mActiveDevice->mCurrentCommandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer)
	{
		// GPU has completed rendering the frame and is done using the contents of any buffers
		// previously encoded on the CPU for that frame. Signal the semaphore and allow the CPU
		// to proceed and construct the next frame.
		dispatch_semaphore_signal( blockSemaphore );
	}];

	mActiveDevice->commitAndNextCommandBuffer();[
	//...
This semaphore is triple buffered because of the hard coded value "c_inFlightCommandBuffers" in MetalRenderSystem::_createRenderWindow.

My THEORY, is that this is what's happening:
  1. Grab drawable
  2. Ogre culls and does everything, Issues render commands
  3. CPU side it takes 2ms
  4. GPU starts rendering
  5. Grab drawable
  6. Ogre culls and does everything, Issues render commands
  7. CPU side it takes 2ms
  8. GPU starts rendering
  9. Wait for semaphore so Drawable from steps 1-4 becomes available again (it won't be available until the next V Blank). This takes between 0-16ms
  10. Grab drawable
  11. Ogre culls and does everything, Issues render commands
  12. CPU side it takes 2ms
  13. GPU starts rendering
  14. Wait for semaphore so Drawable becomes available again (it won't be available until the next V Blank). This takes between 0-16ms
  15. Grab drawable
  16. Ogre culls and does everything, Issues render commands
  17. CPU side it takes 2ms
  18. GPU starts rendering
    ...
In other words, my theory is that CPU side only takes 2ms, but eventually the problem is that CPU must wait for the GPU, either because the GPU has to catch up (due to the frame taking too long to render) or because of VSync.

To test this theory, you should modify c_inFlightCommandBuffers in MetalRenderSystem::_createRenderWindow. Try the values 4, 2 and 1 and see how that affects the timing variance. Note that very high values ( either > 4 or >=4 ) will result in spamming the log with errors about how the drawable couldn't be acquired (because we ran out of them).
xissburg
Halfling
Posts: 83
Joined: Sun Feb 21, 2010 4:58 pm
x 28

Re: [2.1][Metal] Delta time between frames are not consistent (v-sync?)

Post by xissburg »

I have tried all sorts of values for c_inFlightCommandBuffers and it makes no difference, which is really strange I think (I have not read the code throughly btw to precisely understand what it does). I was making sure the RenderSystem_Metal.framework was being updated inside the app bundle. Also, this does not happen when running on the iPhone. I get 60 fps reported and the frame time is always around ~16ms.
User avatar
dark_sylinc
OGRE Team Member
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1278
Contact:

Re: [2.1][Metal] Delta time between frames are not consistent (v-sync?)

Post by dark_sylinc »

Mmm... I was 99% certain modifying c_inFlightCommandBuffers would fix it.

This is fishy indeed.

The iOS sample relies on CADisplayLink timer to have the OS call us 60 times per second, which is why it works correctly.
The macOS sample uses the same code as Linux/Windows where the window polling events are managed by SDL and we try to "render as fast possible" (as fast possible = as fast as the RenderSystem / API / GPU will allow it). Obviously something is wrong here.

I am currently clueless :roll:
Post Reply