FPS hiccups profiling

spacegaier
OGRE Team Member
Posts: 4293
Joined: Mon Feb 04, 2008 2:02 pm
Location: Germany

Re: FPS hiccups profiling

Post by spacegaier » Mon Feb 11, 2013 8:23 pm

Firefox was barely able to display that and was practically unusable (I stopped once FF passed 1.7 GB of RAM ;) ).

I think we need a better output format. Perhaps a CSV, which people could then graph with something like "R", or Excel files.
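A minimal sketch of what a CSV export could look like. The `Sample` record, the column layout, and the function names are assumptions made for illustration; they are not taken from the profiler discussed in this thread.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the real profiler's data layout may differ.
struct Sample {
    int frame;             // frame index in which the call was measured
    std::string function;  // function name
    double ms;             // time spent, in milliseconds
};

// Quote a field for CSV: wrap it in double quotes and double any embedded quote.
std::string csvField(const std::string& text) {
    std::string out = "\"";
    for (char c : text) {
        if (c == '"') out += '"';
        out += c;
    }
    out += '"';
    return out;
}

// Build the whole CSV document: one header row, then one row per sample.
std::string toCsv(const std::vector<Sample>& samples) {
    std::ostringstream csv;
    csv << "frame,function,ms\n";
    for (const Sample& s : samples) {
        csv << s.frame << ',' << csvField(s.function) << ',' << s.ms << '\n';
    }
    return csv.str();
}
```

Writing the result of `toCsv` to a file should give something that spreadsheet tools and R can load as a plain table, one row per measured call.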

PS: Amazing how a 2 GB HTML file shrinks down to 30 MB once all the whitespace and other stuff is compressed :shock:

But this whole topic is indeed very interesting! Keep it up Assaf!

Assaf Raman
OGRE Team Member
Posts: 3092
Joined: Tue Apr 11, 2006 3:58 pm
Location: TLV, Israel

Post by Assaf Raman » Mon Feb 11, 2013 9:09 pm

Yes, I know the file is too big. My next step is to solve that... I think I can change the code to only output the bottlenecks with their call stacks.

Zonder
Ogre Magi
Posts: 1133
Joined: Mon Aug 04, 2008 7:51 pm
Location: Manchester - England

Post by Zonder » Tue Feb 12, 2013 10:29 am

CSV might not be a bad idea; you can then import it into a database and query it. Pre-filtering ofc would be nice :)

Post by Zonder » Tue Feb 12, 2013 10:32 am

Opera opened that file in seconds; it obviously obeys the "render soon and wait for the rest" rules of HTML :)

Klaim
Old One
Posts: 2565
Joined: Sun Sep 11, 2005 1:04 am
Location: Paris, France

Post by Klaim » Tue Feb 12, 2013 3:17 pm

Why not JSON, then make the page load a part of the data and the rest progressively on request? I mean, you have javascript available, just put the data inside and process+display them only when necessary?

Post by spacegaier » Tue Feb 12, 2013 4:33 pm

Klaim wrote: Why not JSON, then make the page load a part of the data and the rest progressively on request? I mean, you have javascript available, just put the data inside and process+display them only when necessary?
That would solve the issue for browsers, but with a CSV it would be easier to do calculations and graphs.

Post by Klaim » Tue Feb 12, 2013 7:59 pm

spacegaier wrote:
Klaim wrote: Why not JSON, then make the page load a part of the data and the rest progressively on request? I mean, you have javascript available, just put the data inside and process+display them only when necessary?
Would solve the issue for browsers, but with a CSV we would have the option to do calculations and graphs more easily.
Yes, I meant whatever format is easy to parse from JavaScript and can be embedded in the page.

Post by Assaf Raman » Tue Feb 12, 2013 8:23 pm

I think all we need is a summary of the call stacks of the bottlenecks; the rest isn't that interesting, so I think that dropping most of the report is the best way to go, regardless of format.
Regarding the format: I think HTML is the best way to display the call stack tree, which is why I am going to stick with it.
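A rough sketch of a "bottlenecks only" call stack tree rendered as nested HTML lists. The `CallNode` type, the inclusive-time field, and the threshold parameter are hypothetical and only illustrate the idea; the actual profiler may store and prune its data differently.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical call-tree node; inclusiveMs covers the node and all of its children.
struct CallNode {
    std::string name;
    double inclusiveMs;
    std::vector<CallNode> children;
};

// Escape the characters that would otherwise be read as HTML markup.
std::string escapeHtml(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Append the node as a nested <ul>/<li> list, skipping every subtree whose
// inclusive time is below thresholdMs.
void renderBottlenecks(const CallNode& node, double thresholdMs, std::ostringstream& html) {
    if (node.inclusiveMs < thresholdMs) return;
    html << "<li>" << escapeHtml(node.name) << " (" << node.inclusiveMs << " ms)";
    std::ostringstream kids;
    for (const CallNode& child : node.children) {
        renderBottlenecks(child, thresholdMs, kids);
    }
    if (!kids.str().empty()) html << "<ul>" << kids.str() << "</ul>";
    html << "</li>";
}

// Wrap the pruned tree in a top-level list.
std::string bottleneckReport(const CallNode& root, double thresholdMs) {
    std::ostringstream html;
    html << "<ul>";
    renderBottlenecks(root, thresholdMs, html);
    html << "</ul>";
    return html.str();
}
```

Pruning on inclusive time keeps the path from the root down to each slow call intact, because a parent is always at least as slow as its slowest child.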

Post by Klaim » Wed Feb 13, 2013 12:00 am

Assaf Raman wrote: Regarding the format - I think html will be best to display the call stack tree - that is why I am going to stick with it.
HTML implies JavaScript, which implies JSON, which is why I was suggesting generating the data as JSON and putting it into the HTML, but in a way that lets the JavaScript code display something before the browser has finished reading the JSON, showing only the important info first. It's not a lot of work, but it is indeed more than just displaying less info.


Maybe I was not clear enough.
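
A minimal sketch of the generator side of the embedded-JSON idea: the report writer emits the data as a JSON block inside the HTML page, slowest entries first. The `Hotspot` type, the `profile-data` element id, and the function names are assumptions for illustration only; the page-side script that reads the block is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical summary record for one function.
struct Hotspot {
    std::string function;
    double ms;
};

// Escape a string as a JSON literal that is safe inside an HTML <script> element.
std::string jsonString(const std::string& text) {
    static const char* hex = "0123456789abcdef";
    std::string out = "\"";
    for (unsigned char c : text) {
        if (c == '"' || c == '\\') {
            out += '\\';
            out += static_cast<char>(c);
        } else if (c < 0x20 || c == '<') {
            // Control characters and '<' (which could end the script block) as \u00XX.
            out += "\\u00";
            out += hex[c >> 4];
            out += hex[c & 15];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out + "\"";
}

// Emit the data block. Sorting slowest-first lets page script show the
// important entries before it walks the rest of the array.
std::string embedHotspots(std::vector<Hotspot> hotspots) {
    std::sort(hotspots.begin(), hotspots.end(),
              [](const Hotspot& a, const Hotspot& b) { return a.ms > b.ms; });
    std::ostringstream html;
    html << "<script id=\"profile-data\" type=\"application/json\">[";
    for (std::size_t i = 0; i < hotspots.size(); ++i) {
        if (i) html << ',';
        html << "{\"function\":" << jsonString(hotspots[i].function)
             << ",\"ms\":" << hotspots[i].ms << '}';
    }
    html << "]</script>";
    return html.str();
}
```

Because the data sits inline in the page, a script in the same file can parse it with `JSON.parse` and decide how much of it to turn into visible markup, without issuing any extra request.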

bstone
OGRE Expert User
Posts: 1920
Joined: Sun Feb 19, 2012 9:24 pm
Location: Russia

Post by bstone » Wed Feb 13, 2013 8:16 am

Anything that fetches JSON on request won't work with a local HTML file. You need a web server to read the report using multiple requests.

Post by Zonder » Wed Feb 13, 2013 8:52 am

If it's filtered then it won't matter; it can be pure HTML, as you shouldn't have pages and pages of bottlenecks! :)

syedhs
Silver Sponsor
Posts: 2702
Joined: Mon Aug 29, 2005 3:24 pm
Location: Kuala Lumpur, Malaysia

Post by syedhs » Wed Feb 13, 2013 10:13 am

CSV is better IMO, as you can use Excel to do aggregations, filtering, and highlighting, not to mention formulas.

Post by Assaf Raman » Wed Feb 13, 2013 10:47 am

Guys, let's move on from the CSV-or-HTML discussion and assume you are going to get both.

Post by Assaf Raman » Thu Feb 14, 2013 10:09 am

I improved the performance by using the addresses of functions instead of their names (saves time), and translating the addresses to function names only while generating the final report.
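
A minimal sketch of the address-keyed approach described above, not the code from the attachment. The `AddressProfiler` class and the `resolve` callback are hypothetical; a real resolver would query debug symbols (on Windows, for example, something built on DbgHelp's `SymFromAddr`).

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>

// Accumulates time per function address; no strings are touched while profiling.
class AddressProfiler {
public:
    // Hot path: called for every measured call, so it only does an
    // integer-keyed hash lookup and an addition.
    void record(std::uintptr_t functionAddress, double ms) {
        totals_[functionAddress] += ms;
    }

    // Cold path: resolve each distinct address exactly once, while the report
    // is being built. `resolve` is a hypothetical address-to-name callback.
    std::map<std::string, double> report(
        const std::function<std::string(std::uintptr_t)>& resolve) const {
        std::map<std::string, double> byName;
        for (const auto& entry : totals_) {
            byName[resolve(entry.first)] += entry.second;
        }
        return byName;
    }

private:
    std::unordered_map<std::uintptr_t, double> totals_;
};
```

The saving comes from the ratio of calls to distinct functions: `record` may run millions of times, while the name lookup runs once per distinct address.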
Attachment: profiler.zip (20.78 KiB)

TheOnlyJoey
Halfling
Posts: 53
Joined: Sun Apr 10, 2011 12:05 pm
Location: The Netherlands

Post by TheOnlyJoey » Sat Oct 19, 2013 2:08 pm

Great project!

Has this been tested on Linux yet?
Also, could you maybe create a small tutorial/readme on how to add this to an existing project?
I am currently profiling our engine, and this would be quite useful.

Thanks

0xC0DEFACE
OGRE Expert User
Posts: 84
Joined: Thu May 21, 2009 4:55 am

Post by 0xC0DEFACE » Mon Mar 24, 2014 2:23 am

Hi Assaf!

Sorry that I'm late to the party, but I don't check these forums as frequently as I used to.

Anyway I have recently spent time debugging stalls and dropped frames and would like to tell you and anyone else reading about some of my experiences.

I have a copy of VTune XA and frequently use it for profiling; however, VTune sometimes isn't the best tool for the job. Hot spots that occur in only 1 of 300 frames get lost in the noise of the hot spots of the other 299 frames.
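
To make the "lost in the noise" point concrete with illustrative numbers: if 299 frames take 5 ms and one takes 100 ms, the mean frame time only rises to about 5.3 ms, so an aggregate view barely moves. A per-frame check against the median catches it. The sketch below is one possible way to do that; the function names and the `factor` parameter are assumptions, not part of any tool mentioned in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median frame time (upper median for even counts). Works on a copy, so the
// caller's data keeps its frame order.
double medianMs(std::vector<double> frameMs) {
    if (frameMs.empty()) return 0.0;
    const std::size_t mid = frameMs.size() / 2;
    std::nth_element(frameMs.begin(), frameMs.begin() + mid, frameMs.end());
    return frameMs[mid];
}

// Indices of frames that took more than `factor` times the median frame time.
std::vector<std::size_t> findSpikes(const std::vector<double>& frameMs, double factor) {
    std::vector<std::size_t> spikes;
    const double limit = medianMs(frameMs) * factor;
    for (std::size_t i = 0; i < frameMs.size(); ++i) {
        if (frameMs[i] > limit) spikes.push_back(i);
    }
    return spikes;
}
```

The returned frame indices can then be used to pull out only the samples recorded during those frames for closer inspection.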

Adding manual profiling code throughout my code (as you seem to be doing now) has been mildly successful. The problem with it is that it produces lots of data that is difficult to analyse.

However, they both pale in comparison to GPUView. GPUView is provided as part of the Windows Performance Toolkit. You run it while your application is running and it captures HUGE volumes of information until you stop it. It manages to do this without impacting system performance, which is great. It can then display the information in a useful way and relate it all to the graphics pipeline. I've found that it's great for detecting multithreading-related rendering stalls, such as main thread context switches or locks waiting too long; it can also help with things like hardware buffer allocations causing stalls. It is really amazing, and anyone profiling stalls should certainly look at it.

Here are two problems I found while using it, which I hope will illustrate its usefulness.

1. The main thread context switches. I have N hardware threads on this machine (8 in my case), so Ogre creates N threads for background use. All N of these threads have the same thread priority as the main thread, and the Windows scheduler gives them roughly equal CPU time. When all 8 threads were busy doing some background loading, the main thread was starved of CPU time and therefore did not render for a brief time. This resulted in hard-to-catch dropped frames. The solution was to make Ogre create only N-1 threads and also to increase the thread priority of the main thread. Huge difference.

2. I have a GTX 670 here. I began noticing dropped frames in windowed mode (Windows 7) again and wasn't sure what the cause was. VTune wasn't showing any issues and there wasn't much happening in terms of threading or heavy processing. So I opened GPUView. I found that I was rendering at 200 FPS, cleanly rendering multiple frames in one vsync interval. The command queue wasn't overloaded and everything looked fine; however, the Windows Desktop Window Manager was not flipping the buffer when it should. Sometimes half of my frames would not get flipped, resulting in my 200 FPS feeling like 30... Eventually I tried running on my second monitor. BAM! Totally smooth. It seems my particular card has one DVI port that runs optimally in windowed mode and one that doesn't. I didn't find a solution for that problem, but I was able to rule it out as being an issue with my code. (Screenshot below.)
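
The N-1 rule from the first problem above can be sketched as follows. The function names are hypothetical, and how the count is handed to Ogre is left out because the post does not say; raising the main thread's priority is platform-specific (on Windows it would go through `SetThreadPriority`) and is not shown.

```cpp
#include <thread>

// Leave one hardware thread free for the main (render) thread, but always
// keep at least one background worker.
unsigned workerThreadCount(unsigned hardwareThreads) {
    if (hardwareThreads <= 1) return 1;
    return hardwareThreads - 1;
}

// std::thread::hardware_concurrency() may return 0 when the count is unknown;
// workerThreadCount maps that case to a single worker.
unsigned defaultWorkerThreadCount() {
    return workerThreadCount(std::thread::hardware_concurrency());
}
```

On the 8-thread machine from the post this yields 7 workers, so the scheduler always has one hardware thread that the background load cannot saturate.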

I would strongly encourage you to check it out. It's a very handy tool to add to your performance measuring toolkit.

Before you know what you are looking at, it's hard to tell what's going on in the profile, but once you have read about it, it's very handy.

http://graphics.stanford.edu/~mdfisher/GPUView.html

The blue vertical lines mark the vsync intervals. You can clearly see that the flip queue is not flipping every frame. HTH

c0deface.

GPUVIEW.jpg (252.57 KiB)

Post by Assaf Raman » Mon Mar 24, 2014 8:12 am

Regarding the half FPS in windowed mode: I can confirm that this is an NVIDIA vsync driver issue.
Did you try running D3D9Ex? It runs better with vsync and D3DSWAPEFFECT_FLIPEX.
http://msdn.microsoft.com/en-us/library ... 85%29.aspx
D3DSWAPEFFECT_FLIPEX

Designates when an application is adopting flip mode, during which time an application's frame is passed instead of copied to the Desktop Window Manager(DWM) for composition when the application is presenting in windowed mode. Flip mode allows an application to more efficiently use memory bandwidth as well as enabling an application to take advantage of full-screen-present statistics. Flip mode does not affect full-screen behavior. A sample application that uses D3DPRESENT_FORCEIMMEDIATE and D3DSWAPEFFECT_FLIPEX is the D3D9ExFlipEx sample on the MSDN Code Gallery.

Note If you create a swap chain with D3DSWAPEFFECT_FLIPEX, you can't override the hDeviceWindow member of the D3DPRESENT_PARAMETERS structure when you present a new frame for display. That is, you must pass NULL to the hDestWindowOverride parameter of IDirect3DDevice9Ex::PresentEx to instruct the runtime to use the hDeviceWindow member of D3DPRESENT_PARAMETERS for the presentation.

Differences between Direct3D 9 and Direct3D 9Ex:

D3DSWAPEFFECT_FLIPEX is only available in Direct3D9Ex running on Windows 7 (or more current operating system).

Post by 0xC0DEFACE » Tue Mar 25, 2014 3:37 am

Thanks for that. I am still running an older version of Ogre and have an upgrade to the latest on my list of jobs for this exact reason, as I suspected DX9EX would help with the issue, although I probably won't get around to that until next week. I'll let you know how it goes and whether it helps.