[2.1] DX11 Locking a texture sub region [SOLVED]

SolarPortal
OGRE Contributor
Posts: 203
Joined: Sat Jul 16, 2011 8:29 pm
Location: UK

Post by SolarPortal »

Hi, I seem to have stumbled on another issue with Ogre3D, this time with DX11.

I am trying to lock a region of a texture, edit the data, and unlock it again. This all works great in GL; in DX11, however, it only seems to work if the entire texture is locked and edited. I have looked into the source for lockImpl, unlockImpl, _mapstagingbuffer etc. and played around with a few of the properties. I did manage to get it to partially copy to a region by editing this line:
OgreD3D11HardwarePixelBuffer.cpp, function _mapstagingbuffer, line 226:

Code: Select all

mLockBox.left, mLockBox.top, 0,
I changed both lock box offsets to zero, since this call copies into the staging buffer, which in theory should start from 0 when mapping a region rather than reuse the full texture's starting coordinates.

I also edited _unmapstagingbuffer, because when the data is copied back it again uses the main texture's lock box rather than the staging buffer's width and height:

Code: Select all

....
        if(flags == D3D11_MAP_READ_WRITE || flags == D3D11_MAP_READ || flags == D3D11_MAP_WRITE)  
        {
            D3D11_BOX srcBoxDx11 = OgreImageBoxToDx11Box(mLockBox);
            srcBoxDx11.front = 0;
            srcBoxDx11.back = mLockBox.getDepth();
			srcBoxDx11.left = 0;
			srcBoxDx11.right = mLockBox.getWidth();
			srcBoxDx11.top = 0;
			srcBoxDx11.bottom = mLockBox.getHeight();
....
Alas, it did not work brilliantly. It gave some result in the locked region, but some of the data was out of place or incomplete. Perhaps someone could shed some light, as having to edit the entire image is far too sluggish. Again, this all works well with the GL renderer :)

Oh, and here's the code we are using for creating the texture, editing it and re-uploading it (it's only a single-channel image, to start simple):

.cpp

Code: Select all


	void PbsMaterialsGameState::createTexture()
	{
		
		imageSize = 128;

		// Create the data array that will store the image CPU side and get saved with terrain and loaded back into.
		data = static_cast<Ogre::uint8*>(OGRE_MALLOC(imageSize * imageSize * sizeof(Ogre::uint8), Ogre::MEMCATEGORY_RESOURCE));

		parentTex = Ogre::TextureManager::getSingletonPtr()->createManual("STPhysicsVisibility",
			Ogre::ResourceGroupManager::DEFAULT_RESOURCE_GROUP_NAME,
			Ogre::TEX_TYPE_2D_ARRAY, imageSize, imageSize, 1, 0, Ogre::PF_L8, Ogre::TU_STATIC);

		// default and initialise to white
		Ogre::Box bbox(0, 0, imageSize, imageSize);
		Ogre::v1::HardwarePixelBufferSharedPtr buf = parentTex->getBuffer();
		Ogre::uint8* pInit = static_cast<Ogre::uint8*>(buf->lock(bbox, Ogre::v1::HardwarePixelBuffer::HBL_DISCARD).data);
		memset(pInit, 255, Ogre::PixelUtil::getNumElemBytes(Ogre::PF_L8) * imageSize * imageSize);
		buf->unlock();

		// Get pointer to the buffer for downloading/uploading data
		buffer = parentTex->getBuffer().getPointer();

		// Download the data.
		download();
	}

	void PbsMaterialsGameState::destroyTexture()
	{
		Ogre::TextureManager::getSingletonPtr()->remove(parentTex);
		parentTex.setNull();
	}

	void PbsMaterialsGameState::download()
	{
		Ogre::uint8* pDst = data;
		// Download data
		Ogre::Image::Box box(0, 0, imageSize, imageSize);
		Ogre::uint8* pSrc = static_cast<Ogre::uint8*>(buffer->lock(box, Ogre::v1::HardwareBuffer::HBL_READ_ONLY).data);
		size_t srcInc = Ogre::PixelUtil::getNumElemBytes(buffer->getFormat());
		for (size_t y = box.top; y < box.bottom; ++y)
		{
			for (size_t x = box.left; x < box.right; ++x)
			{
				*pDst++ = static_cast<Ogre::uint8>(*pSrc);
				pSrc += srcInc;
			}
		}
		buffer->unlock();
	}

	void PbsMaterialsGameState::dirty()
	{
		Ogre::Rect rect;
		rect.top = 0; rect.bottom = imageSize;
		rect.left = 0; rect.right = imageSize;
		dirtyRect(rect);
	}

	void PbsMaterialsGameState::dirtyRect(const Ogre::Rect& rect)
	{
		//if (mDirty)
		//{
		//	mDirtyBox.left = std::min(mDirtyBox.left, static_cast<Ogre::uint32>(rect.left));
		//	mDirtyBox.top = std::min(mDirtyBox.top, static_cast<Ogre::uint32>(rect.top));
		//	mDirtyBox.right = std::max(mDirtyBox.right, static_cast<Ogre::uint32>(rect.right));
		//	mDirtyBox.bottom = std::max(mDirtyBox.bottom, static_cast<Ogre::uint32>(rect.bottom));
		//}
		//else{
			mDirtyBox.left = static_cast<Ogre::uint32>(rect.left);
			mDirtyBox.right = static_cast<Ogre::uint32>(rect.right);
			mDirtyBox.top = static_cast<Ogre::uint32>(rect.top);
			mDirtyBox.bottom = static_cast<Ogre::uint32>(rect.bottom);
			mDirty = true;
		//}
	}

	void PbsMaterialsGameState::upload()
	{
		if (mDirtyBox.right > imageSize){ mDirtyBox.right = imageSize; }
		if (mDirtyBox.bottom > imageSize){ mDirtyBox.bottom = imageSize; }
		if (mDirtyBox.top < 0){ mDirtyBox.top = 0; }
		if (mDirtyBox.left < 0){ mDirtyBox.left = 0; }

		// Only reupload the data if this is dirty
		if (data && mDirty){
			Ogre::uint8* pDstBase = static_cast<Ogre::uint8*>(buffer->lock(mDirtyBox, Ogre::v1::HardwarePixelBuffer::HBL_NORMAL).data);
			size_t dstInc = Ogre::PixelUtil::getNumElemBytes(buffer->getFormat());
			for (size_t y = 0; y < mDirtyBox.getHeight(); ++y)
			{
				Ogre::uint8* pDst = pDstBase + y * mDirtyBox.getWidth() * dstInc;
				for (size_t x = 0; x < mDirtyBox.getWidth(); ++x)
				{
					*pDst = 0;
					pDst += dstInc;
				}
			}
			buffer->unlock();
			mDirty = false;
		}
	}
.h

Code: Select all

		size_t terrainIdx;
		size_t imageSize;

		void createTexture();
		void destroyTexture();

		Ogre::TexturePtr parentTex;
		Ogre::uint8* data;
		Ogre::v1::HardwarePixelBuffer* buffer; // References the main texture buffer

		void dirty();
		void dirtyRect(const Ogre::Rect& rect);
		Ogre::Box mDirtyBox;

		Ogre::uint8* getData(){ return data; };
		void freeResources();

		size_t getSize(){ return imageSize; }
		Ogre::TexturePtr getTexture(){ return parentTex; }

		void download();
		void upload();

		bool mDirty;
This was all placed in the PbsMaterialsGameState demo for testing in debug. We create the texture at the top, before generating the plane, and then pass it into the plane's datablock as the diffuse texture. We also changed the UV of the plane to 1:1:

Code: Select all

		createTexture();
		Ogre::TexturePtr temptex = parentTex;

        Ogre::v1::MeshPtr planeMeshV1 = Ogre::v1::MeshManager::getSingleton().createPlane( "Plane v1",
                                            Ogre::ResourceGroupManager::DEFAULT_RESOURCE_GROUP_NAME,
                                            Ogre::Plane( Ogre::Vector3::UNIT_Y, 1.0f ), 50.0f, 50.0f,
                                            1, 1, true, 1, 1.0f, 1.0f, Ogre::Vector3::UNIT_Z,
                                            Ogre::v1::HardwareBuffer::HBU_STATIC,
                                            Ogre::v1::HardwareBuffer::HBU_STATIC );

        Ogre::MeshPtr planeMesh = Ogre::MeshManager::getSingleton().createManual(
                    "Plane", Ogre::ResourceGroupManager::DEFAULT_RESOURCE_GROUP_NAME );

        planeMesh->importV1( planeMeshV1.get(), true, true, true );

        {
            Ogre::Item *item = sceneManager->createItem( planeMesh, Ogre::SCENE_DYNAMIC );
            item->setDatablock( "Marble" );
            Ogre::SceneNode *sceneNode = sceneManager->getRootSceneNode( Ogre::SCENE_DYNAMIC )->
                                                    createChildSceneNode( Ogre::SCENE_DYNAMIC );
            sceneNode->setPosition( 0, -1, 0 );
            sceneNode->attachObject( item );

            //Change the addressing mode of the roughness map to wrap via code.
            //Detail maps default to wrap, but the rest to clamp.
            assert( dynamic_cast<Ogre::HlmsPbsDatablock*>( item->getSubItem(0)->getDatablock() ) );
            Ogre::HlmsPbsDatablock *datablock = static_cast<Ogre::HlmsPbsDatablock*>( item->getSubItem(0)->getDatablock() );
            //Make a hard copy of the sampler block
			Ogre::HlmsSamplerblock samplerblock;
            samplerblock.mU = Ogre::TAM_WRAP;
            samplerblock.mV = Ogre::TAM_WRAP;
            samplerblock.mW = Ogre::TAM_WRAP;

			datablock->setTexture(Ogre::PBSM_DIFFUSE, 0, temptex, &samplerblock);
        }
In the key release handler we then added a few commands for keys 3, 4 and 5:

Code: Select all

        else if( arg.keysym.sym == SDLK_3 )
        {
              // Make all black,
              dirty();
              upload();
       }
       else if (arg.keysym.sym == SDLK_4)
       {
              // Dirty a region of the texture
              Ogre::Rect rect(10, 10, 30, 30);
              dirtyRect(rect);
              upload();
       }else if (arg.keysym.sym == SDLK_5)
       {
              // Reset back to white
              Ogre::Box bbox(0, 0, imageSize, imageSize);
              Ogre::v1::HardwarePixelBufferSharedPtr buf = parentTex->getBuffer();
              Ogre::uint8* pInit = static_cast<Ogre::uint8*>(buf->lock(bbox, Ogre::v1::HardwarePixelBuffer::HBL_DISCARD).data);
              memset(pInit, 255, Ogre::PixelUtil::getNumElemBytes(Ogre::PF_L8) * imageSize * imageSize);
              buf->unlock();
       }
Key 3 - This edits the region to black for the entire image (works fine)
Key 4 - This edits a region within the texture to black and uploads
Key 5 - This clears the texture back to white again for testing Key 3 & 4

In OpenGL, pressing key 4 produces this, which is the region 10,10 -> 30,30 of a 128x128 image:
Image

DX11, however, will not show any changes unless the modifications above are applied, and even then it doesn't work properly.
Is this perhaps a bug?
I personally don't want to break these functions, as they seem intrinsically linked with creating texture arrays.

Thanks for your time and your help. I look forward to one of this community's great responses :D

P.S. I also commented out all the spheres and boxes just to see the plane render :)
I also cleared the Marble material of all textures so that only the diffuse shows.
Last edited by SolarPortal on Fri Nov 10, 2017 12:02 pm, edited 1 time in total.
Lead developer of the Skyline Game Engine: https://aurasoft-skyline.co.uk
dark_sylinc
OGRE Team Member
Posts: 5296
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina

Re: [2.1] DX11 Locking a texture sub region

Post by dark_sylinc »

I am very tempted to say that bugs like these are, and will be, fixed in 2.2.

To be honest, the D3D11 texture code in 2.1 is full of hacks. First, it does a lousy job, and second, Ogre's texture methods don't map well to D3D11.
So it's no surprise to me that you're having trouble when editing subregions. The last time I tried to improve that section I had too many WTFs and gave up.

Perhaps a more suitable alternative would be to fully edit a temporary texture and then perform a Texture -> Texture blit? (Be warned this is slow in GL, so it would be a D3D11-only path.)

Alternatively, try editing via blitFromMemory.

I'm sorry I couldn't give more help.

Re: [2.1] DX11 Locking a texture sub region

Post by SolarPortal »

Thanks for the response. That explains a lot of the issues I have with D3D11. It's just that the performance difference to GL is huge :shock: as on my system it can run at up to just under twice the speed... :shock:

I have tried blitting with D3D11 on cubemaps, trying to blit 6 images into 1 cube texture, but it ended up crashing in the SwapTarget area. Perhaps it will work better on standard 2D textures.

I have also experienced a few crashes when loading NPOT textures, but again these crash in the lockImpl function.
It's a shame, as those were the couple of issues holding us back from using D3D11.

I might try writing some direct D3D11 routines to fulfil these jobs, but from what I can see, only OgreD3D11HardwarePixelBuffer.cpp needs some work to make the DX11 system run smoothly :P

Thanks for responding anyway :)

Re: [2.1] DX11 Locking a texture sub region

Post by SolarPortal »

Well, I managed to get it to work really well, and the performance is just under double the speed of GL texture manipulation for me!
I will post my results tomorrow as it's late here now :)

There were fixes required in the D3D11 pixel buffer, and the way of working with the mapped data is completely different to GL, but I shall post more about this tomorrow :)

Sorry, I am buzzing!!! :lol:

Re: [2.1] DX11 Locking a texture sub region

Post by SolarPortal »

OK, time for a write-up, as simple as possible :P After countless hours trying to figure out why the locked region was not working, we managed to get a single row, from left to right with any offset, working as expected. However, when we stepped down the rows, things just didn't add up, which was extremely confusing... and frustrating, haha :P It took both of our team members on this one task for double brain power :?

DX11 handles texture sub-region manipulation very differently from GL. DX11 always uses a row pitch equal to or greater than the image width (e.g. a 16x16 image might give you a row pitch of 128, which really makes things harder to get working), because it can add padding between rows, whereas GL uses the width of the dirty box to step from row to row. Basically, GL gives you consecutive image data for the locked rows, whereas DX11 has padding between the end of one row of the box and the beginning of the next. Here's a quick image to show what I mean:

Image
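
The same layout can be sketched in code. This is a minimal, self-contained illustration rather than Ogre code: LockedRegion, fillRegion and demoChangedBytes are hypothetical names, and the pitch is assumed to be counted in texels, which is how the code in this thread treats box.rowPitch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the memory a lock returns: `data` points at the
// first texel of the locked box and consecutive rows start `rowPitch` texels
// apart (rowPitch >= width).
struct LockedRegion
{
    std::uint8_t *data;
    std::size_t width;         // texels per row inside the box
    std::size_t height;        // rows inside the box
    std::size_t rowPitch;      // texels between the starts of two rows
    std::size_t bytesPerTexel; // 1 for a single-channel format
};

// Fill every texel of the locked box with `value`, skipping the padding.
inline void fillRegion(const LockedRegion &r, std::uint8_t value)
{
    assert(r.rowPitch >= r.width);
    for (std::size_t y = 0; y < r.height; ++y)
    {
        std::uint8_t *row = r.data + y * r.rowPitch * r.bytesPerTexel;
        for (std::size_t i = 0; i < r.width * r.bytesPerTexel; ++i)
            row[i] = value;
    }
}

// Simulate a 4x3 box whose rows are padded to 8 texels, clear the box to 0
// and count how many bytes of the 255-initialised memory were changed.
inline std::size_t demoChangedBytes()
{
    std::vector<std::uint8_t> memory(8 * 3, 255);
    LockedRegion region{memory.data(), 4, 3, 8, 1};
    fillRegion(region, 0);
    std::size_t changed = 0;
    for (std::uint8_t b : memory)
        changed += (b == 0) ? 1 : 0;
    return changed;
}
```

Only the 4 x 3 texels of the box are touched; the 4 padding texels at the end of each row keep their old value. Stepping rows by the box width instead of the pitch would write into that padding and miss the later rows.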

However, even this knowledge does not help on its own, as there is a slight error in the source. Well, I say error, but it's more that something is missing :P
In OgreD3D11HardwarePixelBuffer.cpp, in the function _unmapstagingbuffer(bool copyback), these lines need to be added:

Code: Select all

				srcBoxDx11.left = 0;
				srcBoxDx11.right = mLockBox.getWidth();
				srcBoxDx11.top = 0;
				srcBoxDx11.bottom = mLockBox.getHeight();
This makes the source box describe exactly the data that was edited when it is copied back into the destination texture. Otherwise the wrong region of the data is copied and you end up seeing no changes, or only partial changes, in the GPU destination texture.
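
To make the two coordinate spaces explicit, here is a small sketch of the box arithmetic. Box2D, CopyBack and makeCopyBack are hypothetical names invented for illustration (they are not Ogre or D3D11 types), and the sketch assumes, as the tweak above does, that the edited texels are read from the origin of the staging buffer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal box type (left/top inclusive, right/bottom exclusive).
struct Box2D
{
    std::uint32_t left, top, right, bottom;
    std::uint32_t width() const { return right - left; }
    std::uint32_t height() const { return bottom - top; }
};

// Arguments of a staging -> texture copy for a locked sub-region.
struct CopyBack
{
    Box2D srcBox;       // region to read, in staging-buffer coordinates
    std::uint32_t dstX; // where to write, in texture coordinates
    std::uint32_t dstY;
};

// The source box starts at (0, 0) and has the size of the lock box; only the
// destination position uses the lock box offset.
inline CopyBack makeCopyBack(const Box2D &lockBox)
{
    assert(lockBox.right >= lockBox.left && lockBox.bottom >= lockBox.top);
    CopyBack c;
    c.srcBox = Box2D{0, 0, lockBox.width(), lockBox.height()};
    c.dstX = lockBox.left;
    c.dstY = lockBox.top;
    return c;
}
```

For the lock box (10, 10)-(30, 30) this yields a source box of (0, 0)-(20, 20) and a destination position of (10, 10).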

Here's where things get interesting.....

When computing the offset of a row inside the nested for loop, GL uses the width of the dirty box whereas DX11 uses the row pitch. For a single-channel image the change is as simple as this:

Code: Select all

// DX11 -- Rowpitch could be the size of the image or greater than the image if DirectX added padding to the data.
Ogre::uint8* pDst = pDstBase + (y * rowPitch) * dstInc;

// GL -- mDirtyBox.getWidth() is the width of the region to be edited... e.g. width could be 10 if region is (40, 40), (50, 50) on an image
//Ogre::uint8* pDst = pDstBase + y * mDirtyBox.getWidth() * dstInc;
To see how these lines are used in the full code block, please see below:

Code: Select all

	void PbsMaterialsGameState::upload()
	{
		if (mDirtyBox.right > imageSize){ mDirtyBox.right = imageSize; }
		if (mDirtyBox.bottom > imageSize){ mDirtyBox.bottom = imageSize; }
		if (mDirtyBox.top < 0){ mDirtyBox.top = 0; }
		if (mDirtyBox.left < 0){ mDirtyBox.left = 0; }

		// Only reupload the data if this is dirty
		if (data && mDirty){
			Ogre::PixelBox box = buffer->lock(mDirtyBox, Ogre::v1::HardwarePixelBuffer::HBL_NORMAL);
			size_t rowPitch = box.rowPitch; // in texels; may be larger than the box width on DX11
			Ogre::uint8* pDstBase = static_cast<Ogre::uint8*>(box.data);
			//Ogre::uint8* pDstBase = static_cast<Ogre::uint8*>(buffer->lock(mDirtyBox, Ogre::v1::HardwarePixelBuffer::HBL_NORMAL).data);
			size_t dstInc = Ogre::PixelUtil::getNumElemBytes(buffer->getFormat());
			for (size_t y = 0; y < mDirtyBox.getHeight(); ++y)
			{
				// DX11
				Ogre::uint8* pDst = pDstBase + (y * rowPitch) * dstInc;

				// GL
				//Ogre::uint8* pDst = pDstBase + y * mDirtyBox.getWidth() * dstInc;
				for (size_t x = 0; x < mDirtyBox.getWidth(); ++x)
				{
					//float random = Ogre::Math::RangeRandom(0, 100);
					//*pDst = (Ogre::uint8)random;
					*pDst = 0;
					pDst += dstInc;
				}
			}
			buffer->unlock();
			mDirty = false;
		}
	}
To make it easier to pick the row pitch or the box width depending on GL or DX, we created a function that does this for us:

Code: Select all

size_t RenderManager::getGlOrDXRowPitch(size_t rowPitch, size_t width)
{
	if (getRenderSystemType() == RenderManager::RS_DX11){
		return rowPitch;
	}else{
		return width;
	}
}
This is then used together with locking a region of a texture, like so:

Code: Select all

		// Get the pixel box rather than returning the data straight away to gain access to the rowPitch from the PixelBox
		Ogre::PixelBox box = buffer->lock(mDirtyBox, Ogre::v1::HardwarePixelBuffer::HBL_NORMAL);
		Ogre::uint8* pDstBase = static_cast<Ogre::uint8*>(box.data);
		size_t dstInc = Ogre::PixelUtil::getNumElemBytes(buffer->getFormat());

		// Get the row pitch for the upload if using DX11 or get the rect box width if GL
		// DX11 uses padding to reach the next row.
		size_t rowPitch = scene.renderMgr->getGlOrDXRowPitch(box.rowPitch, mDirtyBox.getWidth());
		
		....
		
		// Then when editing the data inside your for loop you can use rowPitch for either GL or DX to save some lines of code
		// So your offset for the row would end up looking like this for both Renderers:
		Ogre::uint8* pDst = pDstBase + y * rowPitch * dstInc;	
		
		....
Note that this is only for a single-channel texture, e.g. PF_L8. Things get a little trickier with multi-channel textures when you want to edit a single channel: GL is happy to edit just one channel of data at an offset to that channel.
DX11, however, needs the data filled in for every channel, otherwise the GPU can end up clearing the texel data or writing something random, so we did the following:

Code: Select all

void SkyTerrainBlendLayer::upload()
{
	// Only reupload the data if this is dirty
	if (data && mDirty){
		Ogre::PixelBox box = buffer->lock(mDirtyBox, Ogre::v1::HardwarePixelBuffer::HBL_NORMAL);
		Ogre::uint8* pDstBase = static_cast<Ogre::uint8*>(box.data);
		size_t dstInc = Ogre::PixelUtil::getNumElemBytes(buffer->getFormat());

		// Get the row pitch for the upload if using DX11 or get the rect box width if GL
		// DX11 uses padding to reach the next row.
		size_t rowPitch = scene.renderMgr->getGlOrDXRowPitch(box.rowPitch, mDirtyBox.getWidth());

		SkyTerrain* terrain = skyTerrainMgr.getTerrainTile(0);
		RenderManager::RenderSystemType renderSystem = scene.renderMgr->getRenderSystemType();
		switch (renderSystem)
		{
			case RenderManager::RenderSystemType::RS_DX11:
				{
					float* mBlendData1 = terrain->getBlendLayer(2)->getBlendData(); // B
					float* mBlendData2 = terrain->getBlendLayer(1)->getBlendData(); // G
					float* mBlendData3 = terrain->getBlendLayer(0)->getBlendData(); // R
					float* mBlendData4 = terrain->getBlendLayer(3)->getBlendData(); // A

					Ogre::uint32 pos = mDirtyBox.top * channelWidth + mDirtyBox.left;
					mBlendData1 = mBlendData1 + pos;
					mBlendData2 = mBlendData2 + pos;
					mBlendData3 = mBlendData3 + pos;
					mBlendData4 = mBlendData4 + pos;

					for (size_t y = 0; y < mDirtyBox.getHeight(); ++y)
					{
						size_t yPos = y * channelWidth;
						float* pSrc1 = mBlendData1 + yPos;
						float* pSrc2 = mBlendData2 + yPos;
						float* pSrc3 = mBlendData3 + yPos;
						float* pSrc4 = mBlendData4 + yPos;

						Ogre::uint8* pDst = pDstBase + y * rowPitch * dstInc;
						for (size_t x = 0; x < mDirtyBox.getWidth(); ++x){
							*pDst++ = static_cast<Ogre::uint8>(*pSrc1++ * 255);
							*pDst++ = static_cast<Ogre::uint8>(*pSrc2++ * 255);
							*pDst++ = static_cast<Ogre::uint8>(*pSrc3++ * 255);
							*pDst++ = static_cast<Ogre::uint8>(*pSrc4++ * 255);
						}
					}
				}
				break;

			case RenderManager::RenderSystemType::RS_GL:
				{
					float* pSrcBase = data + mDirtyBox.top * channelWidth + mDirtyBox.left;
					pDstBase += mChannelOffset;
					for (size_t y = 0; y < mDirtyBox.getHeight(); ++y)
					{
						float* pSrc = pSrcBase + y * channelWidth;
						Ogre::uint8* pDst = pDstBase + y * rowPitch * dstInc;
						for (size_t x = 0; x < mDirtyBox.getWidth(); ++x)
						{
							*pDst = static_cast<Ogre::uint8>(*pSrc++ * 255);
							pDst += dstInc;
						}
					}
				}
				break;
		}

		buffer->unlock();
		mDirty = false;
	}
}
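
The DX11 branch above converts each weight with a plain `* 255` cast. As a hedged variation, not something the posted code does, the conversion can clamp first so that out-of-range weights cannot wrap around. toByte and packRow are hypothetical helper names, and the blend weights are assumed to be meant to lie in [0, 1].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Convert a blend weight to a byte, clamping to [0, 1] and rounding to the
// nearest value.
inline std::uint8_t toByte(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return static_cast<std::uint8_t>(v * 255.0f + 0.5f);
}

// Write `count` complete 4-byte texels from four separate float planes.
// The channel order is whatever the caller passes (B, G, R, A in the post).
inline void packRow(std::uint8_t *dst, const float *c0, const float *c1,
                    const float *c2, const float *c3, std::size_t count)
{
    assert(dst && c0 && c1 && c2 && c3);
    for (std::size_t x = 0; x < count; ++x)
    {
        *dst++ = toByte(c0[x]);
        *dst++ = toByte(c1[x]);
        *dst++ = toByte(c2[x]);
        *dst++ = toByte(c3[x]);
    }
}
```

Called once per row with the row's destination pointer, this writes all four channels of every texel, which is the requirement described above for DX11.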
Well, we hope this information helps someone else who encounters this problem. If the source change could be merged back into the Ogre3D SDK, that would be great, as it would make merging much easier for us.
Note: this has only been used with uncompressed formats (PF_L8, PF_A8R8G8B8); we have not tested formats such as DXT1, but it's unusual to use DXT1 and want to edit a sub-region.
However, we have tested all the demos and everything seems to run as expected :)

Also, the textures were created with TU_STATIC, so staging buffers were used for editing rather than static buffers.

Using all this knowledge, we have managed to get all our textures to edit data on both DX11 and OpenGL perfectly, and DX11 is far faster than GL for this :)

Thanks for your time reading and hope this helps :D

Edit: Forgot to post an image of it working in DX11:

Image

This is the region 10,10,30,30 of a 128x128 texture. The colours are random per texel to show where each texel is :)
Also, we are only outputting the diffuse texture in the shader, which is why it's white.

Re: [2.1] DX11 Locking a texture sub region

Post by dark_sylinc »

You should always be using pitch. In GL pitch should be the same as width.
SolarPortal wrote: Fri Nov 10, 2017 12:00 pm
Well, we hope this information helps someone else who encounters this problem and if the source change could be merged back into the ogre3d SDK, that would be great as it would make merging much easier.
I'm having trouble following what needs to be merged to source. Could you make it more obvious? Thanks!

Btw in Ogre 2.2 these issues definitely are solved. You do have to deal with bytesPerRow, but we provide TextureBox::at( x, y, z ) to get it always right. Also it works with compressed formats.
For performance reasons you'll call at( 0, y, z ) and then increment the pointers, rather than calling it for every pixel (each row is contiguous)
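
The access pattern described here (resolve the start of a row once, then walk the contiguous row with a pointer) might look like the sketch below. MappedView and clearView are hypothetical names and this is not Ogre's TextureBox; it only illustrates the idea, with the pitch expressed in bytes per row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical view of mapped texture memory with a byte pitch.
struct MappedView
{
    std::uint8_t *data;
    std::size_t width, height;
    std::size_t bytesPerPixel;
    std::size_t bytesPerRow; // >= width * bytesPerPixel

    // Address of pixel (x, y); the pitch is always used, never the width.
    std::uint8_t *at(std::size_t x, std::size_t y) const
    {
        assert(x < width && y < height);
        return data + y * bytesPerRow + x * bytesPerPixel;
    }
};

// Set every byte of every pixel to `value`, calling at() once per row.
inline void clearView(const MappedView &v, std::uint8_t value)
{
    if (v.width == 0)
        return;
    for (std::size_t y = 0; y < v.height; ++y)
    {
        std::uint8_t *p = v.at(0, y);
        std::uint8_t *end = p + v.width * v.bytesPerPixel;
        while (p != end)
            *p++ = value;
    }
}
```

Because the same code works whether bytesPerRow equals the row size or is larger, no per-renderer switch is needed.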

Re: [2.1] DX11 Locking a texture sub region [SOLVED]

Post by SolarPortal »

Sure, no problem. The only tweak for this topic is:

Code: Select all

				srcBoxDx11.left = 0;
				srcBoxDx11.right = mLockBox.getWidth();
				srcBoxDx11.top = 0;
				srcBoxDx11.bottom = mLockBox.getHeight();
which goes in _unmapstagingbuffer():

Code: Select all

		void D3D11HardwarePixelBuffer::_unmapstagingbuffer(bool copyback)
		{
			_unmap(mStagingBuffer);

			if (copyback)
			{
				D3D11_BOX srcBoxDx11 = OgreImageBoxToDx11Box(mLockBox);
				srcBoxDx11.front = 0;
				srcBoxDx11.back = mLockBox.getDepth();
				
				// << CODE TWEAK
				srcBoxDx11.left = 0;
				srcBoxDx11.right = mLockBox.getWidth();
				srcBoxDx11.top = 0;
				srcBoxDx11.bottom = mLockBox.getHeight();
				// << END CODE TWEAK
				
				if (PixelUtil::isCompressed(mFormat))
				{
					const uint32 blockWidth = PixelUtil::getCompressedBlockWidth(mFormat, true);
					const uint32 blockHeight = PixelUtil::getCompressedBlockHeight(mFormat, true);

					srcBoxDx11.right = std::max(srcBoxDx11.left + blockWidth, srcBoxDx11.right);
					srcBoxDx11.bottom = std::max(srcBoxDx11.top + blockHeight, srcBoxDx11.bottom);
				}

				unsigned int dstSubresource = D3D11CalcSubresource(mSubresourceIndex, mLockBox.front + mFace,
					mParentTexture->getNumMipmaps() + 1);
				mDevice.GetImmediateContext()->CopySubresourceRegion(
					mParentTexture->getTextureResource(),
					dstSubresource,
					mLockBox.left, mLockBox.top, 0, //TODO: Support 3D array textures
					mStagingBuffer, 0, &srcBoxDx11);

				SAFE_RELEASE(mStagingBuffer);
			}
		}
Edit: Also good to know the extra details, but we won't be moving to 2.2 for a while... we have only just got our engine fully working with 2.1 :lol:

Re: [2.1] DX11 Locking a texture sub region [SOLVED]

Post by dark_sylinc »

SolarPortal wrote: Fri Nov 10, 2017 5:37 pm sure no problem, the only tweak for this topic is the:

Code: Select all

				srcBoxDx11.left = 0;
				srcBoxDx11.right = mLockBox.getWidth();
				srcBoxDx11.top = 0;
				srcBoxDx11.bottom = mLockBox.getHeight();
Oh no. That would mean that either:
  • D3D11 is creating a Staging Texture as large as the whole texture, even when a subregion is involved. The StagingTexture should've been much smaller + apply your fix. Or just apply your fix and waste RAM.
  • For some reason, the Staging Texture needs to be that large, meaning that the returned pointer should've been offset to the start of the box and not to the beginning of the Staging Texture. Some more obscure bug could be hidden.

Re: [2.1] DX11 Locking a texture sub region [SOLVED]

Post by SolarPortal »

D3D11 is creating a Staging Texture as large as the whole texture, even when a subregion is involved. The StagingTexture should've been much smaller + apply your fix. Or just apply your fix and waste RAM.
You are certainly right about this one. In one of our tests, we passed the pixel box directly through to the function:
void D3D11HardwarePixelBuffer::createStagingBuffer()
The texture description in that function creates the staging buffer at the same size as the whole texture, which one would think is incorrect.
These are the lines in question:

function: (createStagingBuffer)

Code: Select all

desc.Width = std::max<uint32>(minWidth, mWidth);
desc.Height = std::max<uint32>(minHeight, mHeight);
which we changed to use the size of the pixel box being edited:

Code: Select all

case TEX_TYPE_2D_ARRAY:
	{
		D3D11_TEXTURE2D_DESC desc;
		tex->GetTex2D()->GetDesc(&desc);

		// <<< ---- BEGIN EDIT
		//desc.Width = std::max<uint32>(minWidth, mWidth);
		//desc.Height = std::max<uint32>(minHeight, mHeight);
		desc.Width = std::max<uint32>(minWidth, box.getWidth());
		desc.Height = std::max<uint32>(minHeight, box.getHeight());
		// <<< ---- END EDIT
		
		desc.MipLevels = 0;
		desc.BindFlags = 0;
		desc.MiscFlags = 0;
		desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE | D3D11_CPU_ACCESS_READ;
		desc.Usage = D3D11_USAGE_STAGING;
	}
The pixel box was passed through as an argument from _mapstagingbuffer() like this:

Code: Select all

void D3D11HardwarePixelBuffer::_mapstagingbuffer(D3D11_MAP flags, PixelBox &box)
{
	if (!mStagingBuffer)
		createStagingBuffer(box); // <<< ---- Pass the edited box region through to the staging buffer.
.....			
and was then used as shown above in the createStagingBuffer(Ogre::PixelBox &box) function (note the added pixel box argument).

Note: I have just given this another go, and with the pixel box width and height passed to the staging buffer, my demos still seem to work, which means better RAM usage as well as sub-region editing. It was only a quick test, though :P Still, point 1 of your bullet points seems to be the correct answer.

However, without editing those 4 initial lines in the unmap, I never got anything working.

Edit: Did a performance test comparing a staging buffer of pixel box size against one of full texture size. Here are the stats:
Original Texture: 4096
Edited Region: (10,10), (30,30)

Update the region every 10 frames and run a random colour generator on an A8R8G8B8 format texture:

Code: Select all

float random = Ogre::Math::RangeRandom(0, 100);
*pDst++ = (Ogre::uint8)random;
random = Ogre::Math::RangeRandom(0, 100); *pDst++ = (Ogre::uint8)random;
random = Ogre::Math::RangeRandom(0, 100); *pDst++ = (Ogre::uint8)random;
random = Ogre::Math::RangeRandom(0, 100); *pDst++ = (Ogre::uint8)random;
(Release)
Without Pixelbox size for staging buffer: 170-200fps
With Pixelbox size for staging buffer: 300fps

(Debug)
Both modes ran at 50fps...

So there is a notable performance increase by using the pixel box size for the staging buffer.
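
For a rough sense of scale, the two staging sizes in this test can be compared with some quick arithmetic. stagingBytes is a hypothetical helper, and the figures assume a 4096x4096 texture at 4 bytes per pixel with no row padding, so they are lower bounds rather than measured values.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for an uncompressed, unpadded 2D staging surface.
// Real drivers may pad rows, so treat the result as a lower bound.
inline std::size_t stagingBytes(std::size_t width, std::size_t height,
                                std::size_t bytesPerPixel)
{
    assert(bytesPerPixel > 0);
    return width * height * bytesPerPixel;
}
```

stagingBytes(4096, 4096, 4) is 67,108,864 bytes (64 MiB), while the 20x20 edited region needs stagingBytes(20, 20, 4) = 1,600 bytes, so the box-sized staging buffer moves far less data on every lock and unlock.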

PC Specs:
Cpu - Intel core i7 3770
Ram: 16gb DDR3
Gpu -GTX 750Ti

Edit2: The only time I think it might crash when generating a staging buffer is if the format is compressed and the width and height are not a multiple of the block compression size (DXT1 uses 4x4 blocks). For example, it crashes when loading a 900x900 texture with 2 or more mipmaps, as the second mip (450x450) is not a multiple of 4, which crashes the staging buffer generation.
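
The block arithmetic behind this can be sketched as follows. alignUp and mipSize are hypothetical helpers, and whether padding the staging size like this actually avoids the crash is an untested assumption, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to the next multiple of `block` (block > 0).
inline std::uint32_t alignUp(std::uint32_t value, std::uint32_t block)
{
    assert(block > 0);
    return (value + block - 1) / block * block;
}

// Size of mip `level` for a base dimension, never smaller than 1.
inline std::uint32_t mipSize(std::uint32_t base, std::uint32_t level)
{
    assert(level < 32);
    std::uint32_t s = base >> level;
    return s > 0 ? s : 1;
}
```

For the 900x900 example, mipSize(900, 1) is 450, which is not a multiple of 4; alignUp(450, 4) gives 452, the next size that holds whole 4x4 blocks.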

Edit 3: Just trying it in our engine at the moment and it has not gone as well, so I am doing a full clean and rebuild to check whether that was the issue.
Will edit this post again once I have tested with our engine again :)
Yeah, it was crashing in _mapstagingbuffer() because the offset into the staging buffer should be 0,0 rather than left, top.

In 30 minutes or so I will post to this topic again with the code that needs changing, to make it easier for you @dark_sylinc to merge back in :) as this post is a lot of testing :P

Re: [2.1] DX11 Locking a texture sub region [SOLVED]

Post by SolarPortal »

First, apologies for the double posting, but this makes it easier for you to merge.

The changes to fix this problem once and for all are as follows :)

File: OgreD3D11HardwarePixelBuffer.h
Around line 69: change:

Code: Select all

//void createStagingBuffer();
void createStagingBuffer(Ogre::PixelBox &box);
File: OgreD3D11HardwarePixelBuffer.cpp
Function: D3D11HardwarePixelBuffer::_mapstagingbuffer(D3D11_MAP flags, PixelBox &box)
Around line: 204: change:

Code: Select all

//	createStagingBuffer();
	createStagingBuffer(box); //<< --- Send the pixel box through to the function.
Around line: 226: change:

Code: Select all

mDevice.GetImmediateContext()->CopySubresourceRegion(
	mStagingBuffer, 0,
	
	// ---  BEGIN EDIT
	//mLockBox.left, mLockBox.top, 0,
	0, 0, 0, // << -- We need to start the staging buffer at zero.
	// --- END EDIT
	
	mParentTexture->getTextureResource(), subresource, &srcBoxDx11);
Function:D3D11HardwarePixelBuffer::_unmapstagingbuffer(bool copyback)
Around line: 439: add these 4 lines:

Code: Select all

			
	_unmap(mStagingBuffer);

	if (copyback)
	{
		D3D11_BOX srcBoxDx11 = OgreImageBoxToDx11Box(mLockBox);
		srcBoxDx11.front = 0;
		srcBoxDx11.back = mLockBox.getDepth();
		
		// ---  BEGIN EDIT
		srcBoxDx11.left = 0;
		srcBoxDx11.right = mLockBox.getWidth();
		srcBoxDx11.top = 0;
		srcBoxDx11.bottom = mLockBox.getHeight();
		// ---  END EDIT
	...
Function: D3D11HardwarePixelBuffer::createStagingBuffer()
Add the argument to the function:

Code: Select all

void D3D11HardwarePixelBuffer::createStagingBuffer(Ogre::PixelBox &box) // Pass in pixel box as reference.
Then in the same function, change each of the:

Code: Select all

desc.Width = std::max<uint32>(minWidth, mWidth); // and
desc.Height = std::max<uint32>(minHeight, mHeight);
to:

Code: Select all

desc.Width = std::max<uint32>(minWidth, box.getWidth());        // Note: notice the box.getWidth() instead of mWidth
desc.Height = std::max<uint32>(minHeight, box.getHeight());     // Note: notice the box.getHeight() instead of mHeight
This is needed for the cases:
  • TEX_TYPE_1D
  • TEX_TYPE_2D, TEX_TYPE_CUBE_MAP and TEX_TYPE_2D_ARRAY
  • TEX_TYPE_3D
For case TEX_TYPE_3D, the depth may also need to be taken from the box, like so:

Code: Select all

desc.Depth = box.getDepth();
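
Taken together, the changes amount to the round trip below. This is a CPU-only simulation with hypothetical names (Surface, copyRegion, editSubRegion), not D3D11 or Ogre code: copyRegion loosely plays the role of CopySubresourceRegion, and the staging surface is only as large as the locked box.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU-side stand-in for a single-channel texture or staging surface.
struct Surface
{
    std::size_t width, height;
    std::vector<std::uint8_t> texels;
    Surface(std::size_t w, std::size_t h, std::uint8_t fill)
        : width(w), height(h), texels(w * h, fill) {}
    std::uint8_t &at(std::size_t x, std::size_t y) { return texels[y * width + x]; }
};

// Copy a w x h block from (srcX, srcY) in `src` to (dstX, dstY) in `dst`.
inline void copyRegion(Surface &dst, std::size_t dstX, std::size_t dstY,
                       Surface &src, std::size_t srcX, std::size_t srcY,
                       std::size_t w, std::size_t h)
{
    assert(srcX + w <= src.width && srcY + h <= src.height);
    assert(dstX + w <= dst.width && dstY + h <= dst.height);
    for (std::size_t y = 0; y < h; ++y)
        for (std::size_t x = 0; x < w; ++x)
            dst.at(dstX + x, dstY + y) = src.at(srcX + x, srcY + y);
}

// Lock [left, right) x [top, bottom), paint it with `value`, and unlock.
inline void editSubRegion(Surface &texture, std::size_t left, std::size_t top,
                          std::size_t right, std::size_t bottom, std::uint8_t value)
{
    assert(left <= right && top <= bottom);
    const std::size_t w = right - left, h = bottom - top;
    Surface staging(w, h, 0);
    copyRegion(staging, 0, 0, texture, left, top, w, h); // map: texture -> staging at (0, 0)
    for (std::uint8_t &t : staging.texels)               // CPU edit of the staging copy
        t = value;
    copyRegion(texture, left, top, staging, 0, 0, w, h); // unmap: staging (0, 0) -> texture
}
```

Editing the region (10, 10)-(30, 30) of a 128x128 surface this way changes exactly 20 x 20 = 400 texels and leaves the rest untouched.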
Hope this helps and is the fix you were looking for :)

I have tested it in the samples and in our engine and it works a treat :)