[2.1] OpenGL warnings : AMD vs Nvidia

Discussion area about developing with Ogre2 branches (2.1, 2.2 and beyond)
Kinslore
Gnoblar
Posts: 18
Joined: Sun Jul 26, 2015 8:55 pm
x 5

[2.1] OpenGL warnings : AMD vs Nvidia

Post by Kinslore »

Hello !

For my project, I create meshes manually in a function, and I want to do it smoothly.
To that end, a secondary thread fills some arrays (positions, normals, indices...), which are then copied into a buffer I create and map in the main thread. Here is the code:

Code: Select all

Ogre::MeshPtr TerrainGenerator::generateTile ()
{
	// Name generation
	std::string tileName ("terrainTile") ;
	std::stringstream numberConverter ;
	numberConverter << _threadGenerationParameters._gridPosition.x << "x" << _threadGenerationParameters._gridPosition.y ;

	tileName += numberConverter.str() ;

	// Load the mesh
	Ogre::MeshPtr resultMesh = Ogre::MeshManager::getSingleton().createManual(tileName, Ogre::ResourceGroupManager::DEFAULT_RESOURCE_GROUP_NAME) ;

	// If already in memory...
	if (resultMesh->getNumSubMeshes())
		return resultMesh ;

	// Else we create it
	Ogre::SubMesh* sub = resultMesh->createSubMesh() ;

	// Vertex declaration
	Ogre::VertexElement2Vec vertexElements ;
	vertexElements.push_back(Ogre::VertexElement2(Ogre::VET_FLOAT3, Ogre::VES_POSITION)) ;
	vertexElements.push_back(Ogre::VertexElement2(Ogre::VET_FLOAT3, Ogre::VES_NORMAL)) ;

	if (_threadGenerationParameters._generateUv)
		vertexElements.push_back(Ogre::VertexElement2(Ogre::VET_FLOAT2, Ogre::VES_TEXTURE_COORDINATES)) ;

	// Vertex number for generation
	unsigned int vertexNumber = _threadGenerationResult._vertexNumber ;
	unsigned int vertexSize = _threadGenerationResult._vertexTotalSize ;

	// Get the arrays generated in a SIMD fashion
	float* vertices = reinterpret_cast<float*>(OGRE_MALLOC_SIMD(sizeof(float) * (vertexNumber * vertexSize), Ogre::MEMCATEGORY_GEOMETRY)) ;
	// Copy the results
	memcpy(vertices, _threadGenerationResult._finalPointBuffer, sizeof(float) * (vertexNumber * vertexSize)) ;

	// Buffer creation
	Ogre::VertexBufferPacked* vBuffer = NULL ;

	try
	{
		vBuffer = _vaoManager->createVertexBuffer(vertexElements, vertexNumber, Ogre::BT_IMMUTABLE, vertices, false) ;
	}
	catch (Ogre::Exception&)
	{
		OGRE_FREE_SIMD(vertices, Ogre::MEMCATEGORY_GEOMETRY) ;
		vBuffer = NULL ;

		// Rethrow the original exception instead of copying it
		throw ;
	}

	// Index buffer 
	unsigned int indexNumber = _threadGenerationResult._indexNumber ;
	unsigned int indexByteSize = _threadGenerationResult._longIndex ? sizeof(Ogre::uint32) : sizeof(Ogre::uint16) ;
	Ogre::IndexBufferPacked::IndexType indexType = _threadGenerationResult._longIndex ? Ogre::IndexBufferPacked::IT_32BIT : Ogre::IndexBufferPacked::IT_16BIT ;

	Ogre::IndexBufferPacked* iBuffer = NULL ;

	// Declared at function scope so it can be freed at the end of the function
	void* indices = OGRE_MALLOC_SIMD(indexByteSize * indexNumber, Ogre::MEMCATEGORY_GEOMETRY) ;

	// Copy from the right source array depending on the index type (32 or 16-bit values)
	if (_threadGenerationResult._longIndex)
		memcpy(indices, _threadGenerationResult._finalIndexBufferLong, indexByteSize * indexNumber) ;
	else
		memcpy(indices, _threadGenerationResult._finalIndexBufferShort, indexByteSize * indexNumber) ;

	try
	{
		iBuffer = _vaoManager->createIndexBuffer(indexType, indexNumber, Ogre::BT_IMMUTABLE, indices, false) ;
	}
	catch (Ogre::Exception&)
	{
		OGRE_FREE_SIMD(indices, Ogre::MEMCATEGORY_GEOMETRY) ;
		iBuffer = NULL ;

		throw ;
	}

	// Dealing with VAO
	Ogre::VertexBufferPackedVec vertexBuffers ;
	vertexBuffers.push_back(vBuffer) ;

	Ogre::VertexArrayObject* vao = _vaoManager->createVertexArrayObject(vertexBuffers, iBuffer, Ogre::v1::RenderOperation::OT_TRIANGLE_LIST) ;

	// LOD
	// First level
	sub->mVao[0].push_back(vao) ;
	sub->mVao[1].push_back(vao) ;

	// Bounds
	Ogre::Vector3 centerBox (((int)_tileSize - 1) * _threadGenerationParameters._spaceBetweenVertices * 0.5f, 0, ((int)_tileSize - 1) * _threadGenerationParameters._spaceBetweenVertices * 0.5f) ;
	Ogre::Vector3 halfBoxBound = Ogre::Vector3 () ;
	halfBoxBound.x = ((int)_tileSize - 1) * 0.5f * _threadGenerationParameters._spaceBetweenVertices ;
	halfBoxBound.y = std::max(std::fabs(_threadGenerationResult._extremumHeights.x), std::fabs(_threadGenerationResult._extremumHeights.y)) ;
	halfBoxBound.z = ((int)_tileSize - 1) * 0.5f * _threadGenerationParameters._spaceBetweenVertices ;

	resultMesh->_setBounds(Ogre::Aabb(centerBox, halfBoxBound), true) ;
	resultMesh->_setBoundingSphereRadius(centerBox.length()) ;

	// Free 
	OGRE_FREE_SIMD(vertices, Ogre::MEMCATEGORY_GEOMETRY) ;
	OGRE_FREE_SIMD(indices, Ogre::MEMCATEGORY_GEOMETRY) ;

	return resultMesh ;
}
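Side note on the _longIndex switch above: it boils down to whether the index values fit in 16 bits. A self-contained sketch of that decision, with hypothetical helper names (needsLongIndices and indexBufferBytes are illustrations, not Ogre API):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 16-bit indices can only address 65536 vertices (values 0..65535),
// so any mesh beyond that needs 32-bit indices, at twice the
// index-buffer size.
bool needsLongIndices(std::uint32_t vertexCount)
{
    return vertexCount > 65536u; // index values go up to vertexCount - 1
}

std::size_t indexBufferBytes(std::uint32_t vertexCount,
                             std::uint32_t indexCount)
{
    return indexCount * (needsLongIndices(vertexCount) ? sizeof(std::uint32_t)
                                                       : sizeof(std::uint16_t));
}
```

Picking 16-bit indices when possible halves the index buffer, which matters for terrain tiles streamed at runtime.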
It runs fine, and I can load meshes dynamically without any stall. But recently I switched from an AMD GPU to an Nvidia GPU. On the AMD chipset (a Radeon HD 7400M), everything runs as intended. However, I tried on two Nvidia GPUs (a Quadro K2100 and a GeForce GTX 560 Ti), and it stalls every time I load a mesh.
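As an aside, the allocate / free-on-catch / rethrow pattern appears several times in the code above; a small RAII guard can express it once. A minimal self-contained sketch in plain C++, where std::malloc/std::free stand in for OGRE_MALLOC_SIMD/OGRE_FREE_SIMD and createBufferOrThrow is a hypothetical stand-in for createVertexBuffer:

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

// Minimal scope guard: frees the allocation if the scope is left
// before dismiss() is called (e.g. because an exception was thrown).
struct SimdGuard
{
    void* ptr;
    bool  armed;

    explicit SimdGuard(void* p) : ptr(p), armed(true) {}
    ~SimdGuard() { if (armed) std::free(ptr); }

    void dismiss() { armed = false; }   // creation succeeded, keep the memory
};

// Simulated buffer creation that can fail, like createVertexBuffer.
void* createBufferOrThrow(void* data, bool fail)
{
    if (fail)
        throw std::runtime_error("buffer creation failed");
    return data;
}

// Returns true when the guard released the memory on the failure path.
bool demoGuard(bool fail)
{
    void* mem = std::malloc(64);
    SimdGuard guard(mem);
    try
    {
        createBufferOrThrow(mem, fail);
        guard.dismiss();            // success: we keep ownership
    }
    catch (const std::runtime_error&)
    {
        return true;                // guard's destructor frees mem here
    }
    std::free(mem);                 // success path: free manually, as in the post
    return false;
}
```

This keeps each try/catch block down to the call itself, with cleanup handled in one place.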

First, during launch, some warnings pop up :

Code: Select all

OpenGL:performance(medium) 131186: Buffer performance warning : Buffer object 5 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (0), usage hint is GL_STATIC_DRAW) is being copied/moved from VIDEO memory to HOST memory.
OpenGL:performance(medium) 131186: Buffer performance warning : Buffer object 3 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (0), and GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (7), usage hint is GL_STATIC_DRAW) is being copied/moved from VIDEO memory to HOST memory.
Then, every time I render a frame after filling a mesh with the function above:

Code: Select all

OpenGL:performance(medium) 131154: Pixel-path performance warning: Pixel transfer is synchronized with 3D rendering.
And while this warning appears, the rendering halts for a short time, as if something is waiting before rendering.

From what I found, this is due to some upload from the CPU side to the GPU side.
I am wondering why it happens only on Nvidia GPUs. Is it because I do not fulfill some requirement of their architecture, or is it Ogre? I will take any advice!

Thanks !

dark_sylinc
OGRE Team Member
Posts: 4211
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 802

Re: [2.1] OpenGL warnings : AMD vs Nvidia

Post by dark_sylinc »

I admit I haven't tested on NV in a very long time. My NV card died months ago and I haven't replaced it since.

It seems your code is correct. As for the warnings: do they happen right after that code runs?

There are a few remarks I can make:
1. The texture loading code is old, and while refactoring I found it used a PBO (Pixel Buffer Object) to load a texture asynchronously, but then immediately destroyed it. Unless the driver goes to great lengths, this causes the loading to become synchronous. Unfortunately, at the time it was very hard to fix, and the issue still remains in Ogre's codebase.
Eventually textures will be refactored, but I'm afraid it needs more investigation (i.e. it's not a one-liner).

I suspect this is what you're running into because:

Code: Select all

Pixel-path performance warning: Pixel transfer is synchronized with 3D rendering.
sounds like a texture issue and not a vertex issue. One good way to test that theory is to use a default material that uses no texture.

If that's not the problem, then likely NV's driver is going nuts, because there is no other pixel transfer going on when loading a mesh!
If that is the problem, then you could avoid the stall by creating the materials upfront, which would cause all textures to be loaded earlier.

Note that it is also possible the stall is caused not by this, but by GLSL shaders being compiled (if a new shader needs to be compiled). AMD is quite fast at compiling GLSL shaders, but I don't know about NV.
One way to be sure is to use the microcode cache, which saves the compiled results to disk (so the stall would only happen on the first runs).
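The principle of that cache can be sketched generically: compile once, save the binary to disk, and load it on later runs instead of recompiling. This is just the caching idea with stand-in names (expensiveCompile and getMicrocode are hypothetical helpers, not Ogre's GpuProgramManager API):

```cpp
#include <chrono>
#include <cstdio>   // std::remove, for cache cleanup in tests
#include <fstream>
#include <iterator>
#include <string>
#include <thread>

// Stand-in for an expensive GLSL compile: returns a "binary" for the source.
std::string expensiveCompile(const std::string& source)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulated stall
    return "microcode:" + source;
}

// Compile-once cache: if a cached binary exists on disk, load it and skip
// the compile (and therefore the stall); otherwise compile and save it.
std::string getMicrocode(const std::string& source,
                         const std::string& cachePath,
                         bool* compiledThisRun)
{
    std::ifstream in(cachePath, std::ios::binary);
    if (in)
    {
        *compiledThisRun = false;
        return std::string(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
    }

    *compiledThisRun = true;
    std::string binary = expensiveCompile(source);
    std::ofstream out(cachePath, std::ios::binary);
    out << binary;
    return binary;
}
```

With such a cache, the compile cost is paid only on the first run; later runs just read the blob back from disk.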

2. As for the warnings at launch: I would definitely have to take a look at some point (when I get an NV card in my hands...), though it's likely these warnings come from some v1 legacy code during initialization. Certain parts of the initialization of the GL3+ render system are very complicated, so it wouldn't surprise me.
Especially because it says 'GL_STATIC_DRAW', whereas v2 objects use glBufferStorage to create buffers (that flag doesn't even exist there!).

Kinslore
Gnoblar
Posts: 18
Joined: Sun Jul 26, 2015 8:55 pm
x 5

Re: [2.1] OpenGL warnings : AMD vs Nvidia

Post by Kinslore »

Hello, thanks for the quick answer !

It occurs during :

Code: Select all

Ogre::Root::renderOneFrame() ;
Each mesh has the same Hlms assigned. I don't think anything is being loaded, apart from the first tile and the geometry.

I also tried not setting the material and keeping the basic one. I still get these warnings. That's why I think it is linked neither to the textures nor to the GLSL shaders (which are the same for every mesh; I didn't even use properties in the HLMS). But then, what is left?

I am going to check the drivers, but they are not that old. I will post an update once they are updated!

EDIT : Drivers are now updated, and nothing has changed :/

Kinslore
Gnoblar
Posts: 18
Joined: Sun Jul 26, 2015 8:55 pm
x 5

Re: [2.1] OpenGL warnings : AMD vs Nvidia

Post by Kinslore »

Hey there !

I got some time to do some debugging, and have some results, even if I can't say whether they are useful or not.

I did those tests with a simple mesh: the plane created by Ogre itself. Here is the creation process:

Code: Select all

Ogre::v1::MeshPtr planeV1 = Ogre::v1::MeshManager::getSingleton().createPlane(...) ;

Ogre::MeshPtr planeV2 = Ogre::MeshManager::getSingleton().createManual(...) ;
planeV2->importV1(planeV1.get(), false, false, true) ;

// ...
// Some code to get the mesh and everything
// ...

Ogre::Item* planeItem = _sceneManager->createItem(planeV2, Ogre::SCENE_STATIC) ;

Ogre::SceneNode* planeNode = _someNode->createChildSceneNode(Ogre::SCENE_STATIC) ;
planeNode->attachObject(planeItem) ;

// This one is related to the third point of this post, and won't be called in both cases
planeItem->setDatablock("planeTest") ;

// Then we set the position and such if wanted, and let the rendering go
The HLMS linked to the "planeTest" datablock has textures in it. In fact, I used the one I made for the terrain generator I am working on (water). As inputs I have some vec4s for the pass buffers, and textures coming from the compositor itself (results from past passes).

So, I ran the program until I could find the lines producing the errors or warnings. Here are the results:

1. For every texture I load for the first time, I get the Pixel-path warning, as you predicted:

Code: Select all

OpenGL:performance(medium) 131154: Pixel-path performance warning: Pixel transfer is synchronized with 3D rendering.
Maybe it is the lines you mentioned. Anyway, this is not a problem for me as such: the first load does not happen at a random time, and since I won't reload those textures, they stay in memory afterwards.


2. During a quad pass, the first time we go through it, I get this kind of warning:

Code: Select all

OpenGL:performance(medium) 131186: Buffer performance warning : Buffer object 5 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (0), usage hint is GL_STATIC_DRAW) is being copied/moved from VIDEO memory to HOST memory.
OpenGL:error(high) 1281: GL_INVALID_VALUE error generated. <index> exceeds the maximum number of vertex attributes.
I am not sure why, though. It occurs during line 218:

Code: Select all

_sceneManager->renderSingleObject(...) ;
I didn't dig further, but I didn't get those with the AMD GPU.


3. Then, when I load a new mesh, I get this warning, which locks the render for a short amount of time:

Code: Select all

OpenGL:performance(medium) 131154: Pixel-path performance warning: Pixel transfer is synchronized with 3D rendering.
It only occurs the first time the mesh is used, later in the render process; after that I won't get it. For the plane mesh, I get the warning when creating it for the first time; then, using the tile system, everything runs smoothly even if I use it again. Also, it appears only during the scene pass related to the render group associated with the Item.

This is the problem that is really annoying me for now. I tested two cases:

- I asked for a custom HLMS datablock to be bound to the Item
- I didn't ask for that and let the default behaviour

If I ask for the custom datablock "planeTest"
I get the error during the RenderQueue's render. Here:

Code: Select all

Ogre::Root::RenderOneFrame() L.1032
Ogre::Root::_updateAllRenderTargets() L.1509
Ogre::CompositorManager2::_update(...) L.562
Ogre::CompositorWorkspace::_update() L.541
Ogre::CompositorNode::_update(...) L.501
Ogre::CompositorPassScene::execute(...) L.173
Ogre::RenderTarget::_updateViewportRenderPhase02(...) L.227
Ogre::Viewport::_updateRenderPhase02(...) L.210
Ogre::Camera::_renderScenePhase02(...) L.401
Ogre::SceneManager::_renderPhase02 (...) L.1304
Ogre::RenderQueue::render(...) L.443
So, this line :

Code: Select all

mCommandBuffer->execute() ;
I couldn't debug further, because when I step line by line and reach this line, called from CommandBuffer::execute L.99:

Code: Select all

(*CbExecutionTable[cmd->commandType])(this, cmd) ;
I get a read error at address 0xFFFFFFFF... Letting it run without stepping here doesn't cause the error. Maybe some polymorphism is confusing the debugger...

But I guess a command within the buffer causes the Pixel-path warning to appear, only on the first use of the mesh.
I checked the commands pushed by the HLMS; they concern the vec4 values and the textures.

The textures are pushed during fillBuffersFor (every frame, if I'm right) and loaded beforehand. That's why I don't think they are the problem. There must be another command, pushed by something else, that causes this halt, I think.

If I don't ask for it
I get the error later. Here is the stack when it is triggered:

Code: Select all

Ogre::Root::RenderOneFrame() L.1032
Ogre::Root::_updateAllRenderTargets() L.1509
Ogre::CompositorManager2::_update(...) L.593
Ogre::RenderSystem::_update() L.1056
Ogre::GL3PlusVaoManager::_update L.1029
So, after this line executes, I get the warning:

Code: Select all

OCGE(mFrameSyncVec[mDynamicBufferCurrentFrame] = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0)) ;
I don't know what this means, nor what material or HLMS it is using. But it is outside the CommandBuffer's execution scope...
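A guess at what that fence is for (an assumption on my side, not verified against Ogre's source): glFenceSync drops a marker after the frame's GPU commands, and the VaoManager keeps one fence per buffered frame, so that before reusing a frame's dynamic-buffer region it only waits on the fence issued N frames ago, which the GPU has almost certainly passed already. A toy, self-contained sketch of that ring scheme, with plain integers standing in for GLsync objects:

```cpp
#include <cassert>
#include <vector>

// Toy model of a per-frame fence ring: with N frames in flight, before
// writing into slot (frame % N) the CPU only waits on the fence issued
// N frames ago -- so there is no stall in the common case.
struct FrameFenceRing
{
    explicit FrameFenceRing(std::size_t numFrames)
        : fences(numFrames, -1), currentFrame(0) {}

    // Returns the frame number whose fence must be waited on before
    // reusing this slot, or -1 if the slot has never been used.
    long long beginFrame()
    {
        return fences[currentFrame % fences.size()];
    }

    // Equivalent of glFenceSync at the end of the frame: remember which
    // frame's commands the fence in this slot corresponds to.
    void endFrame()
    {
        fences[currentFrame % fences.size()] = (long long)currentFrame;
        ++currentFrame;
    }

    std::vector<long long> fences;
    std::size_t currentFrame;
};
```

With 3 frames in flight, frame 3 reuses slot 0 and only waits on the fence from frame 0, which is 3 frames old by then.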

Anyway, I hope this information will be useful. As for me, I am a little lost, lacking some knowledge about Ogre's inner workings. Let me know if you want me to debug specific points further!

Thanks !
