[2.2] Question about the metal render system

Discussion area about developing with Ogre-Next (2.1, 2.2 and beyond)


Post by Nucleartree »

Hi everyone.

I've been trying to get a GUI library named Colibrigui working with Metal on macOS, and I'm stuck on a problem related to shader generation.

Firstly, this library already supports Ogre 2.2 with GLSL and HLSL. I actually got quite far: by writing a workaround for this problem, I got it rendering text and widgets with Metal. The problem itself relates to clip distances.

Hlms/Unlit/Metal/VertexShader_vs.metal contains the following:

Code:

	@foreach( hlms_pso_clip_distances, n )
		float gl_ClipDistance@n [[clip_distance]];
	@end

This produces a final shader which contains this:

Code:

		float gl_ClipDistance0 [[clip_distance]];
	
		float gl_ClipDistance1 [[clip_distance]];
	
		float gl_ClipDistance2 [[clip_distance]];
	
		float gl_ClipDistance3 [[clip_distance]];

The GUI library has a custom Hlms implementation built on top of Unlit. It does this in its calculateHashForPreCreate:

Code:

setProperty( HlmsBaseProp::PsoClipDistances, 4 );

That explains why I get four of those floats in my shader. The problem is that compiling the shader then fails with this:

Code:

Metal SL Compiler Error in 300000001VertexShader_vs:
Compilation failed: 

program_source:149:9: error: declaration with attribute 'clip_distance' already specified
                float gl_ClipDistance1 [[clip_distance]];
  ~~~~~~^~~~~~~~~~~~~~~~
program_source:147:9: note: previous declaration with attribute 'clip_distance' here
                float gl_ClipDistance0 [[clip_distance]];
        ^
program_source:151:9: error: declaration with attribute 'clip_distance' already specified
                float gl_ClipDistance2 [[clip_distance]];
  ~~~~~~^~~~~~~~~~~~~~~~
program_source:147:9: note: previous declaration with attribute 'clip_distance' here
                float gl_ClipDistance0 [[clip_distance]];
        ^
program_source:153:9: error: declaration with attribute 'clip_distance' already specified
                float gl_ClipDistance3 [[clip_distance]];
  ~~~~~~^~~~~~~~~~~~~~~~
program_source:147:9: note: previous declaration with attribute 'clip_distance' here
                float gl_ClipDistance0 [[clip_distance]];

My analysis is that Metal doesn't allow more than one variable to carry the [[clip_distance]] attribute, and I've got four.

So if anyone knows Ogre's Metal implementation, I'd like to know what the purpose of the @foreach loop in the base shader is. It seems to me that if hlms_pso_clip_distances were ever greater than 1, shader compilation would fail.
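
To make the concern concrete, here is a minimal C++ sketch (not Ogre code) that mimics the expansion. The name expandForeach is hypothetical, and the sketch assumes @foreach( count, n ) simply repeats its body for n from 0 to count - 1, which is consistent with the generated output above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the Hlms @foreach expansion (not an Ogre API).
// Assumption: the loop body is emitted once for each n in [0, count).
std::vector<std::string> expandForeach( int count )
{
    assert( count >= 0 );
    std::vector<std::string> lines;
    for( int n = 0; n < count; ++n )
    {
        // Every emitted member carries the attribute, so count > 1
        // means more than one [[clip_distance]] member in the struct.
        lines.push_back( "float gl_ClipDistance" + std::to_string( n ) +
                         " [[clip_distance]];" );
    }
    return lines;
}
```

Calling expandForeach( 4 ) gives the four declarations seen in the generated shader. Any count above 1 produces more than one member tagged [[clip_distance]], which is exactly what the compiler rejects.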

The complete generated shader is below. It contains some of the GUI-specific code as well as the Unlit code.

Code:


#include <metal_stdlib>
using namespace metal;

struct float1
{
	float x;
	float1() {}
	float1( float _x ) : x( _x ) {}
};

inline float3x3 toMat3x3( float4x4 m )
{
    return float3x3( m[0].xyz, m[1].xyz, m[2].xyz );
}
inline float3x3 toMat3x3( float3x4 m )
{
	return float3x3( m[0].xyz, m[1].xyz, m[2].xyz );
}

#define ogre_float4x3 float3x4

//Short used for read operations. It's an int in GLSL & HLSL. An ushort in Metal
#define rshort2 ushort2
#define rint uint
//Short used for write operations. It's an int in GLSL. An ushort in HLSL & Metal
#define wshort2 ushort2
#define wshort3 ushort3

#define toFloat3x3( x ) toMat3x3( x )
#define buildFloat3x3( row0, row1, row2 ) float3x3( row0, row1, row2 )

#define min3( a, b, c ) min( a, min( b, c ) )
#define max3( a, b, c ) max( a, max( b, c ) )

#define mul( x, y ) ((x) * (y))
#define lerp mix
#define INLINE inline
#define NO_INTERPOLATION_PREFIX
#define NO_INTERPOLATION_SUFFIX [[flat]]

#define finalDrawId drawId

#define floatBitsToUint(x) as_type<uint>(x)
#define uintBitsToFloat(x) as_type<float>(x)
#define floatBitsToInt(x) as_type<int>(x)
#define lessThan( a, b ) (a < b)
#define discard discard_fragment()

#define inVs_vertex input.position
#define inVs_blendWeights input.blendWeights
#define inVs_blendIndices input.blendIndices
#define inVs_qtangent input.qtangent
	
		#define inVs_drawId input.drawId
	
    #define inVs_uv0 input.uv0
#define outVs_Position outVs.gl_Position
#define outVs_viewportIndex outVs.gl_ViewportIndex
#define outVs_clipDistance0 outVs.gl_ClipDistance0

#define gl_SampleMaskIn0 gl_SampleMask
//#define interpolateAtSample( interp, subsample ) interpolateAtSample( interp, subsample )
#define findLSB ctz

#define outPs_colour0 outPs.colour0
#define OGRE_Sample( tex, sampler, uv ) tex.sample( sampler, uv )
#define OGRE_SampleLevel( tex, sampler, uv, lod ) tex.sample( sampler, uv, level( lod ) )
#define OGRE_SampleArray2D( tex, sampler, uv, arrayIdx ) tex.sample( sampler, float2( uv ), arrayIdx )
#define OGRE_SampleArray2DLevel( tex, sampler, uv, arrayIdx, lod ) tex.sample( sampler, float2( uv ), ushort( arrayIdx ), level( lod ) )
#define OGRE_SampleArrayCubeLevel( tex, sampler, uv, arrayIdx, lod ) tex.sample( sampler, float3( uv ), ushort( arrayIdx ), level( lod ) )
#define OGRE_SampleGrad( tex, sampler, uv, ddx, ddy ) tex.sample( sampler, uv, gradient2d( ddx, ddy ) )
#define OGRE_SampleArray2DGrad( tex, sampler, uv, arrayIdx, ddx, ddy ) tex.sample( sampler, uv, ushort( arrayIdx ), gradient2d( ddx, ddy ) )
#define OGRE_ddx( val ) dfdx( val )
#define OGRE_ddy( val ) dfdy( val )
#define OGRE_Load2D( tex, iuv, lod ) tex.read( iuv, lod )
#define OGRE_Load2DMS( tex, iuv, subsample ) tex.read( iuv, subsample )

#define OGRE_Load3D( tex, iuv, lod ) tex.read( ushort3( iuv ), lod )

#define bufferFetch( buffer, idx ) buffer[idx]
#define bufferFetch1( buffer, idx ) buffer[idx]

#define structuredBufferFetch( buffer, idx ) buffer[idx]

#define OGRE_Texture3D_float4 texture3d<float>

#define OGRE_SAMPLER_ARG_DECL( samplerName ) , sampler samplerName
#define OGRE_SAMPLER_ARG( samplerName ) , samplerName

#define CONST_BUFFER_STRUCT_BEGIN( structName, bindingPoint ) struct structName
#define CONST_BUFFER_STRUCT_END( variableName )

#define FLAT_INTERPOLANT( decl, bindingPoint ) decl [[flat]]
#define INTERPOLANT( decl, bindingPoint ) decl

#define OGRE_OUT_REF( declType, variableName ) thread declType &variableName
#define OGRE_INOUT_REF( declType, variableName ) thread declType &variableName

#define OGRE_ARRAY_START( type ) {
#define OGRE_ARRAY_END }


// START UNIFORM STRUCT DECLARATION

//Uniforms that change per pass
struct PassData
{
	
	//Vertex shader
	float4x4 viewProj[2];
				//Pixel Shader
	float4 invWindowSize;
	

};

// END UNIFORM STRUCT DECLARATION

struct VS_INPUT
{
	float4 position [[attribute(VES_POSITION)]];
	float4 colour [[attribute(VES_DIFFUSE)]];

	float2 uv0 [[attribute(VES_TEXTURE_COORDINATES0)]];

	ushort drawId [[attribute(15)]];

	
	float4 normal [[attribute(VES_NORMAL)]];

	
};

struct PS_INPUT
{

	
		ushort materialId [[flat]];
		float4 colour;		
			float2 uv0;			

	

	float4 gl_Position [[position]];
	
		float gl_ClipDistance0 [[clip_distance]];
	
		float gl_ClipDistance1 [[clip_distance]];
	
		float gl_ClipDistance2 [[clip_distance]];
	
		float gl_ClipDistance3 [[clip_distance]];
	
};


	
		
	


vertex PS_INPUT main_metal
(
	VS_INPUT input [[stage_in]]
	
	// START UNIFORM DECLARATION
	
, constant PassData &passBuf [[buffer(CONST_SLOT_START+0)]]

	
//Uniforms that change per Item/Entity
//.x =
//Contains the material's start index.
//
//.y =
//shadowConstantBias. Send the bias directly to avoid an
//unnecessary indirection during the shadow mapping pass.
//Must be loaded with uintBitsToFloat
//
//.z =
//Contains 0 or 1 to index into passBuf.viewProj[]. Only used
//if hlms_identity_viewproj_dynamic is set.
, constant uint4 *worldMaterialIdx [[buffer(CONST_SLOT_START+2)]]

	, device const float4x4 *worldMatBuf [[buffer(TEX_SLOT_START+0)]]
	
	
	, uint gl_VertexID	[[vertex_id]]

	// END UNIFORM DECLARATION
)
{
	
		ushort drawId = input.drawId;
	

	PS_INPUT outVs;
	
	
		uint colibriDrawId = drawId + ((uint(gl_VertexID) - worldMaterialIdx[drawId].w) / 54u);
		#undef finalDrawId
		#define finalDrawId colibriDrawId
	
	#define worldViewProj 1.0f

	outVs.gl_ClipDistance0 = input.normal.x;
	outVs.gl_ClipDistance1 = input.normal.y;
	outVs.gl_ClipDistance2 = input.normal.z;
	outVs.gl_ClipDistance3 = input.normal.w;

	

	


	outVs.gl_Position = input.position * passBuf.viewProj[1];





	outVs.colour = input.colour;




	
		outVs.uv0.xy = input.uv0.xy;
	

	outVs.materialId = (ushort)worldMaterialIdx[finalDrawId].x;



	
	



	

	return outVs;
}

Thanks!

Post by dark_sylinc »

I've pushed a fix for it, but I don't know how well it will work.

Let me know if that works out for you. Basically, you will have to use outVs.gl_ClipDistance0.x, outVs.gl_ClipDistance0.y and so on up to outVs.gl_ClipDistance0.w (fortunately Colibri needs only 4).

Post by Nucleartree »

Hi dark_sylinc,

Thanks for your response. Unfortunately that didn't work. I also noticed some shader compilation errors in the samples that weren't there previously.

The shader compilation errors in the samples looked like this:

Code:

program_source:255:4: error: unknown type name 'float0'
                        float0 gl_ClipDistance0 : [[clip_distance]];
   ^
program_source:255:30: error: lambda expressions are not supported in Metal
                        float0 gl_ClipDistance0 : [[clip_distance]];

The template was emitting the type float0, which doesn't exist. It was also writing the clip distance declaration regardless of whether clip distances were being used. I'm also not sure the : is meant to be there.

I fixed that particular error by wrapping the declaration in a check that the partial clip distance count is greater than 1:

Code:

@property( partial_pso_clip_distances > 1 )
	float@value( partial_pso_clip_distances ) gl_ClipDistance@value( full_pso_clip_distances ) : [[clip_distance]];
@end

So it only writes the declaration if that value is something valid. That got the samples working for me.

Secondly, it seems a float4 cannot be given the clip_distance attribute:

Code:

program_source:150:29: error: type 'float4' (aka 'vector_float4') is not valid for attribute 'clip_distance'
                float4 gl_ClipDistance0 [[clip_distance]];
                            ^~~~~~~~~~~~~

Some investigation shows it has to look like this:

Code:

float gl_ClipDistance [[clip_distance]] [4];

Bit weird. So my complete changes looked like this:

Code:

	@pdiv( full_pso_clip_distances, hlms_pso_clip_distances, 4 )
	@pmod( partial_pso_clip_distances, hlms_pso_clip_distances, 4 )
	@foreach( full_pso_clip_distances, n )
		float gl_ClipDistance@n [[clip_distance]] [4];
	@end
	@property( partial_pso_clip_distances == 1 )
		float gl_ClipDistance@value( full_pso_clip_distances ) [[clip_distance]];
	@else
		@property( partial_pso_clip_distances > 1 )
			float gl_ClipDistance@value( full_pso_clip_distances ) [[clip_distance]] [@value( partial_pso_clip_distances )];
		@end
	@end

That applies to both Unlit and PBS. Please let me know what you think.
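
For anyone following the template logic, here is a rough C++ sketch (not Ogre code) of what it expands to for a given count. The name expandClipDistanceDecls is hypothetical, and the sketch assumes @pdiv and @pmod do integer division and modulo and that @foreach( count, n ) iterates n from 0 to count - 1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not an Ogre API) mirroring the template above.
// Assumptions: @pdiv/@pmod are integer division/modulo, and
// @foreach( count, n ) iterates n over [0, count).
std::vector<std::string> expandClipDistanceDecls( int clipDistances )
{
    assert( clipDistances >= 0 );
    const int full = clipDistances / 4;     // @pdiv( full_pso_clip_distances, ..., 4 )
    const int partial = clipDistances % 4;  // @pmod( partial_pso_clip_distances, ..., 4 )

    std::vector<std::string> decls;
    // One 4-element array per complete group of four clip distances.
    for( int n = 0; n < full; ++n )
        decls.push_back( "float gl_ClipDistance" + std::to_string( n ) +
                         " [[clip_distance]] [4];" );

    // The remainder becomes a scalar (exactly 1) or a smaller array (2 or 3).
    const std::string last =
        "float gl_ClipDistance" + std::to_string( full ) + " [[clip_distance]]";
    if( partial == 1 )
        decls.push_back( last + ";" );
    else if( partial > 1 )
        decls.push_back( last + " [" + std::to_string( partial ) + "];" );
    return decls;
}
```

With a count of 4 (the Colibrigui case) this yields the single declaration float gl_ClipDistance0 [[clip_distance]] [4];. With a count of 6 it yields two declarations that both carry the attribute, which I'd expect to hit the same "already specified" error as before, so I have only really exercised the 4 case.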

Post by dark_sylinc »

Pushed the fix.

Thanks for looking into this!
Btw, your final solution was overly complex because it was built on the workaround, which is no longer needed.

Turns out Metal had a sane implementation of clip_distance; it just had a weird syntax.
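
As a rough illustration of why the simpler form is enough, here is a C++ sketch (not Ogre code, and not necessarily what the pushed fix looks like). The name clipDistanceDecl is hypothetical, and the sketch assumes a single array member sized directly by the clip distance count, using the float gl_ClipDistance [[clip_distance]] [4]; syntax found earlier in the thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an Ogre API, and not the pushed fix itself).
// Assumption: one array member sized by the clip distance count replaces
// the per-group declarations; nothing is emitted when the count is zero.
std::string clipDistanceDecl( int clipDistances )
{
    assert( clipDistances >= 0 );
    if( clipDistances == 0 )
        return "";
    return "float gl_ClipDistance [[clip_distance]] [" +
           std::to_string( clipDistances ) + "];";
}
```

Because there is only ever one member carrying the attribute, the "already specified" error cannot occur regardless of the count, and no division into groups of four is needed.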