Support for Cg's glslX and hlslX profiles added

sparkprime
Ogre Magi
Posts: 1137
Joined: Mon May 07, 2007 3:43 am
Location: Ossining, New York
x 13

Post by sparkprime »

Looks like the GpuProgramParameters object gets destroyed twice. This is odd, since it's held in a SharedPtr. Even if it is shared between GPU programs (the GLSL one and the Cg one in this case), that should be fine. Is there any explicit destruction going on anywhere?
CABAListic
OGRE Retired Team Member
Posts: 2903
Joined: Thu Jan 18, 2007 2:48 pm
x 58

Post by CABAListic »

No, but the GpuProgramParameters creation is forwarded to the GLSL shader. Maybe I need to forward something else, too. It's possible that at some point there are two different parameter sets that get confused.
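
To illustrate why forwarding on its own should be harmless: with a reference-counted pointer, the wrapper can hand out the inner program's parameter object and it is still destroyed exactly once. Below is a minimal sketch that uses std::shared_ptr as a stand-in for Ogre::SharedPtr. Params, InnerProgram and WrapperProgram are hypothetical names for illustration, not OGRE classes.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a parameter set; counts destructor calls.
struct Params {
    static int destroyed;
    ~Params() { ++destroyed; }
};
int Params::destroyed = 0;

// Hypothetical inner program that actually creates the parameters.
struct InnerProgram {
    std::shared_ptr<Params> createParameters() {
        return std::make_shared<Params>();
    }
};

// Hypothetical wrapper that forwards creation to the inner program.
struct WrapperProgram {
    InnerProgram inner;
    std::shared_ptr<Params> createParameters() {
        std::shared_ptr<Params> p = inner.createParameters();
        assert(p);  // forwarding must not lose the object
        return p;
    }
};
```

If every holder only copies the shared pointer, the destructor count stays at one no matter how many programs share the object. A double destruction would then point at something corrupting the reference count rather than at the sharing itself.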

Post by sparkprime »

Here is a valgrind error reported while rendering:

==8806== Invalid write of size 4
==8806==    at 0x402EA0D: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==8806==    by 0x81D8905: Ogre::GpuProgramParameters::_updateAutoParams(Ogre::AutoParamDataSource const*, unsigned short) (OgreGpuProgramParams.cpp:1729)
==8806==    by 0x8291197: Ogre::Pass::_updateAutoParams(Ogre::AutoParamDataSource const*, unsigned short) const (OgrePass.cpp:1590)
==8806==    by 0x82FC633: Ogre::SceneManager::updateGpuProgramParameters(Ogre::Pass const*) (OgreSceneManager.cpp:7208)
==8806==    by 0x8307E05: Ogre::SceneManager::renderSingleObject(Ogre::Renderable*, Ogre::Pass const*, bool, bool, Ogre::HashedVector<Ogre::Light*> const*) (OgreSceneManager.cpp:3449)
==8806==    by 0xC70085F: ???
==8806==  Address 0xa3ed584 is 0 bytes after a block of size 156 alloc'd
==8806==    at 0x402CE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==8806==    by 0x81E47D3: std::vector<float, Ogre::STLAllocator<float, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::_M_fill_insert(__gnu_cxx::__normal_iterator<float*, std::vector<float, Ogre::STLAllocator<float, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > > >, unsigned int, float const&) (OgreMemoryStdAlloc.h:66)
==8806==    by 0x83E7825: Ogre::GLSLProgram::populateParameterNames(Ogre::SharedPtr<Ogre::GpuProgramParameters>) (OgreGLSLProgram.cpp:219)
==8806==    by 0x81F098F: Ogre::HighLevelGpuProgram::createParameters() (OgreHighLevelGpuProgram.cpp:95)
==8806==    by 0x83F8242: Ogre::CgProgram::createParameters() (OgreCgProgram.cpp:614)
==8806==    by 0xC93521F: ???
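
Reading the trace: the block is 156 bytes, which is 39 floats if a float is 4 bytes, and the invalid 4-byte write lands 0 bytes after it, i.e. at float index 39. Below is a minimal sketch of the kind of bounds check that would catch this before the memcpy. `checkedWrite` is a hypothetical helper for illustration, not an OGRE function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not an OGRE function): copy `count` floats from `src`
// into `buffer` starting at element `physicalIndex`, refusing to overrun.
// Returns true if the data fit and was copied, false otherwise.
bool checkedWrite(std::vector<float>& buffer, std::size_t physicalIndex,
                  const float* src, std::size_t count)
{
    assert(src != nullptr || count == 0);
    if (physicalIndex > buffer.size() || count > buffer.size() - physicalIndex)
        return false;  // an unchecked memcpy here is what valgrind would flag
    if (count > 0)
        std::memcpy(buffer.data() + physicalIndex, src, count * sizeof(float));
    return true;
}
```

With a 39-float buffer, a 16-float matrix fits at index 23 but not at index 24, and a single float at index 39 is rejected.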

Post by CABAListic »

A simple reproduction case would help me more at this point :)

Post by sparkprime »

I suspect it's going to be easier for me to debug this on my own than to whittle my 1500 lines of Cg down into a simple repro case :) I haven't worked out which of my shader instances is the problem yet. My bet is that there is some disagreement as to whether a uniform actually exists, which is firstly causing all my uniforms to get weird values and secondly causing a 4-byte buffer overrun that can touch the reference counter of a shared pointer. There are a lot of irritations with Cg optimising away unused uniforms when compiling to the asm profiles. It might be something to do with that.
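
One way to look for that kind of disagreement is to diff the uniform names each side believes in. This is a minimal sketch, assuming the two name lists have already been collected somehow. `missingUniforms` is a hypothetical helper, and how the names are gathered from Cg or GLSL is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: names present in `expected` but absent from `actual`.
// std::set keeps both ranges sorted, which std::set_difference requires.
std::vector<std::string> missingUniforms(const std::set<std::string>& expected,
                                         const std::set<std::string>& actual)
{
    std::vector<std::string> missing;
    std::set_difference(expected.begin(), expected.end(),
                        actual.begin(), actual.end(),
                        std::back_inserter(missing));
    assert(missing.size() <= expected.size());
    return missing;
}
```

Running it in both directions shows uniforms that were dropped as well as ones that appeared under an unexpected name.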

Post by sparkprime »

This looks very suspicious:

Before:
uniform vec4 _view1[4];
uniform vec4 _proj1[4];
uniform vec4 _shadow_view_proj11[4];
uniform vec4 _shadow_view_proj21[4];
uniform vec4 _shadow_view_proj31[4];
uniform vec4 _sun_pos_ws1;
uniform vec3 _camera_pos_ws1;

After:
uniform mat4 view;
uniform mat4 proj;
uniform vec4 _shadow_viewproj1[4];
uniform mat4 shadow_view_proj2;
uniform mat4 shadow_view_proj3;
uniform vec4 sun_pos_ws;
uniform vec3 camera_pos_ws;

What happened with _shadow_view_proj11?
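
One reading that fits the output above, though it is only a guess since the cleanup code is not shown here: a plain substring replace of `_proj1` with `proj` would also match inside `_shadow_view_proj11` and give exactly `_shadow_viewproj1`. The sketch below contrasts that with a replace that only matches whole identifiers. Both functions are hypothetical illustrations, not the actual cleanup code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// True for characters that can appear inside a GLSL identifier.
static bool isIdentChar(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Naive variant: replaces every occurrence of `from`, even inside longer names.
std::string replaceAll(std::string text, const std::string& from,
                       const std::string& to)
{
    if (from.empty()) return text;
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size()))
        text.replace(pos, from.size(), to);
    return text;
}

// Safer variant: replaces `from` only where it is a complete identifier.
std::string replaceIdentifier(std::string text, const std::string& from,
                              const std::string& to)
{
    if (from.empty()) return text;
    std::size_t pos = text.find(from);
    while (pos != std::string::npos) {
        const std::size_t end = pos + from.size();
        assert(end <= text.size());
        const bool startOk = (pos == 0) || !isIdentChar(text[pos - 1]);
        const bool endOk = (end == text.size()) || !isIdentChar(text[end]);
        if (startOk && endOk) {
            text.replace(pos, from.size(), to);
            pos = text.find(from, pos + to.size());
        } else {
            pos = text.find(from, pos + 1);  // skip this partial match
        }
    }
    return text;
}
```

The naive variant turns `_shadow_view_proj11` into `_shadow_viewproj1`, while the identifier-aware variant leaves it alone and still rewrites a standalone `_proj1`.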

Post by CABAListic »

You remember how I wrote that my cleanup of Cg's output is a hack? Well, apparently you've found the first hole in the hack :) It's probably to do with the underscores in the variable name, I'll look into it.

Can you post the original Cg shader, please?

Post by sparkprime »

Sure, it'll take me a minute as it's actually partially generated code... are you able to come on IRC?

Post by CABAListic »

I'll be heading to bed soon, so take your time :)

Post by sparkprime »

Actually, I forgot that I don't generate Cg. Instead I use macros in the Cg and generate a very long list of command-line parameters for the Cg compiler:

-O3 -DUSE_VERTEX_COLOURS=0 -DUSE_DIFFUSE_MAP=0 -DPREMULTIPLIED_ALPHA=0 -DUSE_NORMAL_MAP=0 -DUSE_SPECULAR_MAP=0 -DUSE_SPECULAR_FROM_DIFFUSE=0 -DUSE_SPECULAR_FROM_DIFFUSE_ALPHA=0 -DUSE_GLOSS_FROM_SPECULAR_ALPHA=0 -DUSE_TRANSLUCENCY_MAP=0 -DUSE_PAINT_MAP=0 -DUSE_PAINT_COLOUR=0 -DUSE_PAINT_MASK=0 -DUSE_PAINT_ALPHA=0 -DUSE_MICROFLAKES=0 -DUSE_STIPPLE_TEXTURE=0 -DUSE_OVERLAY_OFFSET=0 -DBLEND=1 -DBLENDED_BONES=0 -DFLIP_BACKFACE_NORMALS=1 -DWORLD_GEOMETRY=1 -DSHADING_MODEL=0 -DRECEIVE_SHADOWS=1 -DUSE_FOG=1 -DNO_TEXTURE_LOOKUPS=0 -DUSE_HEIGHTMAP_BLENDING=1 -DGAMMA_CORRECTION_IN=2.2 -DGAMMA_CORRECTION_OUT=2.2 -DUSE_REVERSE_SPECULAR=1 -DMAX_LIGHT_RANGE=1 -DSHADOW_PADDING=4 -DSHADOW_DIST1=14 -DSHADOW_DIST2=30 -DSHADOW_DIST3=500 -DSHADOW_FADE_START=500 -DSHADOW_FADE_END=500 -DSHADOW_RES=2048 -DEMULATE_PCF=0 -DSHADOW_FILTER_TAPS=16 -DSHADOW_FILTER_NOISE=0 -DSHADOW_FILTER_DITHER=1 -DSPREAD1=8 -DSPREAD2=8 -DSPREAD3=2.24 -DEXTRA_MAPS="" -DEXTRA_MAPS_AGAIN="" -DEXTRA_MAP_ARGS="" -DBLEND_UNIFORMS="uniform float spec_diff_brightness0,uniform float spec_diff_contrast0,uniform float2 uv_animation0,uniform float2 uv_scale0,uniform float2 uv0," -DDIFFUSE_MAPS="" -DNORMAL_MAPS="" -DSPEC_MAPS="" -DTRAN_MAPS="" -DBLEND_UNIFORM_ARGS="spec_diff_brightness0, spec_diff_contrast0, uv_animation0, uv_scale0, uv0_, " -DUVS="uv0, " -DUV_SCALES="uv_scale0, " -DUV_ANIMATIONS="uv_animation0, " -DSPEC_DIFF_CONTRASTS="spec_diff_contrast0, " -DSPEC_DIFF_BRIGHTNESSES="spec_diff_brightness0, " -DSHADOW_MAPS="uniform sampler2D shadow_map1 : register(s0),uniform sampler2D shadow_map2 : register(s1),uniform sampler2D shadow_map3 : register(s2)," -DFORWARD_PART=1 -DDEFERRED_AMBIENT_SUN_PART=1 -DDEFERRED_LIGHTS_PART=0
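
For anyone wanting to reproduce this kind of argument list, here is a minimal sketch of building `-DNAME=value` arguments from name/value pairs. `buildDefineArgs` is a hypothetical helper. Its quoting rule (quote values that are empty or contain a space) is an assumption read off the list above, not the generator actually used.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: join (name, value) pairs into "-DNAME=value ..." form.
// Assumption: values that are empty or contain a space get double quotes.
std::string buildDefineArgs(
    const std::vector<std::pair<std::string, std::string>>& defines)
{
    std::string out;
    for (const auto& def : defines) {
        assert(!def.first.empty());  // a macro needs a name
        if (!out.empty()) out += ' ';
        const std::string& value = def.second;
        const bool needsQuotes =
            value.empty() || value.find(' ') != std::string::npos;
        out += "-D" + def.first + "=";
        out += needsQuotes ? "\"" + value + "\"" : value;
    }
    return out;
}
```

A vector of pairs is used rather than a map so that the arguments come out in the order they were added.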


// (c) David Cunningham 2011, Licensed under the MIT license: http://www.opensource.org/licenses/mit-license.php

#include <system/uber.cgh>
#include <system/uber_recv.cgh>

#if EMISSIVE_PART==1

// {{{ EMISSIVE VERTEX PROGRAM
void vp_main (
        in float3 pos_os : POSITION,
        in float3 normal_os : NORMAL,
        #if USE_EMISSIVE_MAP==1
        in float2 uv0 : TEXCOORD0,
        #endif
        #if WORLD_GEOMETRY==1
        in float visibility : TEXCOORD1,
        #endif

        #if BLENDED_BONES>0
        in float4 bone : BLENDINDICES,
        in float4 bone_weights : BLENDWEIGHT,
        #endif
        uniform float3x4 bone_matrixes[70],
        uniform float4x4 world,
        uniform float4x4 view,
        uniform float4x4 proj,
        uniform float3 camera_pos_ws,

        out float4 out00 : TEXCOORD0,
        out float4 for_rasteriser : POSITION
) {
        float2 uv0_ = float2(1,1);
        #if USE_EMISSIVE_MAP==1
                uv0_ = uv0;
        #endif


        float3 pos_ws;
        float3 normal_ws;
        float3 tangent_ws;
        float3 tangent_os;

        #if BLENDED_BONES>0
                transform_vertex_bones(bone_weights, bone_matrixes, bone, pos_os, normal_os, tangent_os,
                                       pos_ws, normal_ws, tangent_ws);
        #else
                transform_vertex(world, pos_os, normal_os, tangent_os,
                                 pos_ws, normal_ws, tangent_ws);
        #endif
        pos_ws += ground_overlay_offset(pos_ws, normal_ws, camera_pos_ws);

        float3 pos_vs = mul(view, float4(pos_ws,1)).xyz;
        for_rasteriser = mul(proj, float4(pos_vs,1));

        out00.xy = uv0_.xy;

        #if WORLD_GEOMETRY==1
                out00.z = visibility;
        #endif

        out00.w = -pos_vs.z;
}
// }}}

// {{{ EMISSIVE FRAGMENT PROGRAM
void fp_main (
        in float4 out00 : TEXCOORD0,

        in float2 wpos : WPOS,

        sampler2D emissive_map0 : register(s0),

        uniform float2 uv_animation0,
        uniform float2 uv_scale0,

        uniform float3 surf_emissive,

        uniform float4 the_fog_params,
        uniform float4 custom_param,
        uniform float4 time,

        out float4 pixel0 : COLOR0
) {

        float2 uv0_ = out00.xy;
        #if WORLD_GEOMETRY==1
        float visibility = out00.z;
        #else
        float visibility = custom_param.x;
        #endif
        float cam_dist = out00.w;

        float3 emissive_colour = surf_emissive;
        float emissive_alpha = 1;
        #if USE_EMISSIVE_MAP==1
                float2 uv = uv_scale0 * uv0_;
                uv += time.x*uv_animation0;
                float4 emi_texel = tex2D(emissive_map0, uv);
                emissive_colour *= gamma_correct(emi_texel.rgb);
                emissive_alpha *= emi_texel.a;
        #endif

        emissive_colour *= visibility;
        emissive_colour *= fog_weakness(the_fog_params.x, cam_dist);

        pixel0.rgb = emissive_colour * emissive_alpha;
}
// }}}

#endif // EMISSIVE_PART==1


#if FORWARD_PART==1 && DEFERRED_AMBIENT_SUN_PART==1

// {{{ RECEIVER VERTEX PROGRAM
void vp_main (
        in float3 pos_os : POSITION,
        in float3 normal_os : NORMAL,
        #if USE_NORMAL_MAP==1
        in float4 tangent_with_parity : TANGENT,
        #endif
        #if USE_DIFFUSE_MAP==1 || USE_NORMAL_MAP==1 || USE_SPECULAR_MAP==1 || USE_TRANSLUCENCY_MAP==1 || USE_PAINT_MAP==1
        in float2 uv0 : TEXCOORD0,
        #endif
        #if USE_VERTEX_COLOURS==3
        in float3 colour : COLOR,
        #endif
        #if USE_VERTEX_COLOURS==4
        in float4 colour : COLOR,
        #endif
        #if BLEND>1
        in float sharpness : TEXCOORD1,
        in float3 blend0 : TEXCOORD2, // tex 1, 2, 3  (0 we get for free)
        #endif
        #if BLEND>4
        in float3 blend1 : TEXCOORD3, // tex 4, 5, 6
        #endif
        #if BLEND>7
        in float3 blend2 : TEXCOORD4, // tex 7, 8, 9
        #endif
        #if WORLD_GEOMETRY==1
        in float visibility : TEXCOORD1,
        #endif

        #if BLENDED_BONES>0
        in float4 bone : BLENDINDICES,
        in float4 bone_weights : BLENDWEIGHT,
        #endif
        uniform float3x4 bone_matrixes[70],
        uniform float4x4 world,
        uniform float4x4 view,
        uniform float4x4 proj,
        uniform float4x4 shadow_view_proj1,
        uniform float4x4 shadow_view_proj2,
        uniform float4x4 shadow_view_proj3,
        uniform float4 sun_pos_ws,
        uniform float3 camera_pos_ws,

        out float4 out00 : TEXCOORD0,
        out float4 out01 : TEXCOORD1,
        out float4 out02 : TEXCOORD2,
        out float4 out03 : TEXCOORD3,
        out float4 out04 : TEXCOORD4,
        out float4 out05 : TEXCOORD5,
        out float4 out06 : TEXCOORD6,
        out float4 out07 : TEXCOORD7,
        out float4 for_rasteriser : POSITION,
        out float4 vert_colour : COLOR0
) {
        #if USE_NORMAL_MAP==1
        float3 tangent_os = tangent_with_parity.xyz;
        float tangent_parity = tangent_with_parity.w;
        #endif

        vert_colour = float4(1, 1, 1, 1);

        #if USE_VERTEX_COLOURS>0
                vert_colour.rgb *= colour.rgb;
                #if USE_VERTEX_COLOURS==4
                        vert_colour.a *= colour.a;
                #endif
        #endif

        // we need the worldspace one in order to calculate the distance from the sun plane, which is the plane
        // that is perpendicular to the sun direction and intersects (0,0,0).
        // viewspace is not good enough because (0,0,0) is not the same as in the shadow casting phase.
        float3 sun_dir_ws = sun_pos_ws.xyz;  // assume directional light, vector points towards sun

        float2 uv0_ = float2(1,1);
        #if USE_DIFFUSE_MAP==1 || USE_NORMAL_MAP==1 || USE_SPECULAR_MAP==1 || USE_TRANSLUCENCY_MAP==1 || USE_PAINT_MAP==1
                uv0_ = uv0;
        #endif


        float3 pos_ws;
        float3 normal_ws;
        float3 tangent_ws;
        #if USE_NORMAL_MAP==0
        float3 tangent_os;
        #endif

        #if BLENDED_BONES>0
                transform_vertex_bones(bone_weights, bone_matrixes, bone, pos_os, normal_os, tangent_os,
                                       pos_ws, normal_ws, tangent_ws);
        #else
                transform_vertex(world, pos_os, normal_os, tangent_os,
                                 pos_ws, normal_ws, tangent_ws);
        #endif
        pos_ws += ground_overlay_offset(pos_ws, normal_ws, camera_pos_ws);

        float3 camera_dir_ws = camera_pos_ws - pos_ws;

        // position in light (sun) space for shadow mapping
        #if RECEIVE_SHADOWS == 1
                float4 pos_ls1 = mul(shadow_view_proj1, float4(pos_ws,1));
                float3 pos_ls1_ = pos_ls1.xyw;
                float4 pos_ls2 = mul(shadow_view_proj2, float4(pos_ws,1));
                float3 pos_ls2_ = pos_ls2.xyw;
                float4 pos_ls3 = mul(shadow_view_proj3, float4(pos_ws,1));
                float3 pos_ls3_ = pos_ls3.xyw;
        #endif

        float3 pos_vs = mul(view, float4(pos_ws,1)).xyz;
        for_rasteriser = mul(proj, float4(pos_vs,1));

        float sun_dist_ = -dot(pos_ws,sun_dir_ws);

        #if RENDER_OBJECT_NORMAL==1
                vert_colour.rgb = zero(vert_colour.rgb) + direction_to_colour(normal_os);
        #endif
        #if RENDER_NORMAL==1
                vert_colour.rgb = zero(vert_colour.rgb) + direction_to_colour(normal_ws);
        #endif
        #if RENDER_TANGENT==1
                #if USE_NORMAL_MAP==1
                        vert_colour.rgb = zero(vert_colour.rgb) + direction_to_colour(tangent_ws);
                #endif
        #endif

        out00.xyz = (normal_ws);
        out01.xyz = (sun_dir_ws);
        out02.xyz = (camera_dir_ws);
        #if USE_NORMAL_MAP==1
                out03.xyz = (tangent_ws);
                out04.x = tangent_parity;
        #endif

        out00.w = sun_dist_;
        out01.w = -pos_vs.z;

        out02.w = uv0_.x;

        out03.w = uv0_.y;

        out04.y = 0;
        out04.z = 0;

        #if RECEIVE_SHADOWS == 1
                out05.xyz = pos_ls1_;
                out06.xyz = pos_ls2_;
                out07.xyz = pos_ls3_;
        #endif

        #if BLEND>1
        // ranges from 1 (very smooth transition) to lots (sharp transition)
        // This is used to exaggerate (or soften) the blend
                float contrast = 1/max((1-sharpness)*(1-sharpness), 0.0000001);
        #endif

        #if BLEND>1
                out04.w = contrast;
                
                out05.w = blend0.r;
                out06.w = blend0.g;
                out07.w = blend0.b;
        #endif

        #if USE_MICROFLAKES == 1
                out05.w = pos_os.x;
                out06.w = pos_os.y;
                out07.w = pos_os.z;
        #endif

        #if WORLD_GEOMETRY==1
                out04.w = visibility;
        #endif

}
// }}}

// {{{ RECEIVE FRAGMENT PROGRAM
void fp_main (
        in float4 vert_colour : COLOR0,
        in float face : FACE,
        in float4 out00 : TEXCOORD0,
        in float4 out01 : TEXCOORD1,
        in float4 out02 : TEXCOORD2,
        in float4 out03 : TEXCOORD3,
        in float4 out04 : TEXCOORD4,
        in float4 out05 : TEXCOORD5,
        in float4 out06 : TEXCOORD6,
        in float4 out07 : TEXCOORD7,

        in float2 wpos : WPOS,

        SHADOW_MAPS

        EXTRA_MAPS

        BLEND_UNIFORMS

        uniform float4 surf_diffuse,
        uniform float3 surf_specular,
        uniform float surf_gloss,

        uniform float3 sun_diffuse,
        uniform float3 sun_specular,
        uniform float4 the_fog_params,
        uniform float3 the_fog_colour,
        uniform float4 custom_param,
        uniform float alpha_rej,
        uniform float3 misc,
        uniform float4 time,
        uniform float shadow_oblique_cutoff,
        uniform float3 scene_ambient_colour,
        uniform float3 texture_size,
        uniform float4 col1,
        uniform float4 col2,
        uniform float4 col3,
        uniform float4 col4,
        uniform float3 col_spec1,
        uniform float3 col_spec2,
        uniform float3 col_spec3,
        uniform float3 col_spec4,
        uniform float microflakes_mask,
        uniform float render_target_flipping,

        out float4 pixel0 : COLOR0
) {


        // {{{ decoding interpolators

        float3 normal_ws = normalize(out00.xyz);
        float3 sun_dir_ws = normalize(out01.xyz);
        float3 camera_dir_ws = normalize(out02.xyz);
        #if USE_NORMAL_MAP==1
                float3 tangent_ws = normalize(out03.xyz);
                float tangent_parity = out04.x;
        #endif

        float sun_dist_ = out00.w;
        float cam_dist = out01.w;

        float surf_shadow_strength = misc.x;
        float sky_light_strength = misc.y;

        float2 uv0_ = float2(out02.w, out03.w);

        float3 pos_ls1_;
        float3 pos_ls2_;
        float3 pos_ls3_;
        #if RECEIVE_SHADOWS == 1
                pos_ls1_ = out05.xyz;
                pos_ls2_ = out06.xyz;
                pos_ls3_ = out07.xyz;
        #endif

        float2 screen_pos = wpos.xy;

        #if BLEND <= 1
                float blend[BLENDSZ] = {1};
        #elif BLEND <= 4
                float blend[BLENDSZ] = { 0.5, out05.w, out06.w, out07.w };
                /* too many blends is bad for performance
                #elif BLEND <= 7
                float blend[7] = { 0.5, out05.w, out06.w, out07.w, out04.r, out04.g, out04.b };
                #elif BLEND <= 10
                float blend[10] = { 0.5, out05.w, out06.w, out07.w, out04.r, out04.g, out04.b, out03.r, out03.g, out03.b };
                */
        #endif
        #if USE_MICROFLAKES==1
                float3 pos_os = float3(out05.w, out06.w, out07.w);
        #endif
        #if WORLD_GEOMETRY==1
        float visibility = out04.w;
        #else
        float visibility = custom_param.x;
        #endif

        #if BLEND <= 1
                float contrast = 1;
        #else
                float contrast = out04.w;
        #endif


        // }}}

        float3 diff_colour;
        float3 spec_colour;
        float gloss;
        float translucency;
        float pixel_alpha;

        #if ABUSING_AMBIENT==1
        surf_specular = vert_colour.rgb;
        #endif

        forward_pass(vert_colour.rgb, vert_colour.a,
                     face,
                     surf_diffuse.rgb, surf_diffuse.a, surf_specular, surf_gloss,
                     alpha_rej,
                     time,
                     col1, col2, col3, col4,
                     col_spec1, col_spec2, col_spec3, col_spec4,
                     microflakes_mask,
                     #if USE_MICROFLAKES==1
                     pos_os,
                     #endif
                     render_target_flipping,
                     contrast,
                     #if USE_NORMAL_MAP==1
                     tangent_ws,
                     tangent_parity,
                     #endif
                     blend,
                     visibility,
                     screen_pos,
                     texture_size,
                     EXTRA_MAP_ARGS
                     BLEND_UNIFORM_ARGS

                     diff_colour, normal_ws, spec_colour, gloss, translucency, pixel_alpha);

        float3 sky_ws = float3(0,0,1);

        float3 pixel_colour = deferred_shading(diff_colour,
                                               normal_ws,
                                               spec_colour,
                                               gloss,
                                               translucency,
                                               cam_dist,
                                               sky_ws,
  
                                               sun_dir_ws,
                                               camera_dir_ws,
                                               shadow_oblique_cutoff,
                                               sun_dist_,
                                               screen_pos,
                                               surf_shadow_strength,
                                               #if RECEIVE_SHADOWS == 1
                                               shadow_map1,
                                               shadow_map2,
                                               shadow_map3,
                                               #endif
                                               #if SHADOW_FILTER_NOISE == 1
                                               shadow_filter_noise,
                                               #endif
                                               pos_ls1_,
                                               pos_ls2_,
                                               pos_ls3_,
                                               scene_ambient_colour,
                                               sun_diffuse,
                                               sun_specular,
                                               sky_light_strength,
                                               the_fog_params.x,
                                               the_fog_colour);

        pixel0.rgb = tone_map(pixel_colour)/MAX_LIGHT_RANGE;
        pixel0.a = pixel_alpha;
}
// }}}

#endif



#if FORWARD_PART==1 && DEFERRED_AMBIENT_SUN_PART==0

// {{{ FORWARD RECEIVER VERTEX PROGRAM
void vp_main (
        in float3 pos_os : POSITION,
        in float3 normal_os : NORMAL,
        #if USE_NORMAL_MAP==1
        in float4 tangent_with_parity : TANGENT,
        #endif
        #if USE_DIFFUSE_MAP==1 || USE_NORMAL_MAP==1 || USE_SPECULAR_MAP==1 || USE_TRANSLUCENCY_MAP==1 || USE_PAINT_MAP==1
        in float2 uv0 : TEXCOORD0,
        #endif
        #if USE_VERTEX_COLOURS==3
        in float3 colour : COLOR,
        #endif
        #if USE_VERTEX_COLOURS==4
        in float4 colour : COLOR,
        #endif
        #if BLEND>1
        in float sharpness : TEXCOORD1,
        in float3 blend0 : TEXCOORD2, // tex 1, 2, 3  (0 we get for free)
        #endif
        #if BLEND>4
        in float3 blend1 : TEXCOORD3, // tex 4, 5, 6
        #endif
        #if BLEND>7
        in float3 blend2 : TEXCOORD4, // tex 7, 8, 9
        #endif
        #if WORLD_GEOMETRY==1
        in float visibility : TEXCOORD1,
        #endif

        #if BLENDED_BONES>0
        in float4 bone : BLENDINDICES,
        in float4 bone_weights : BLENDWEIGHT,
        #endif
        uniform float3x4 bone_matrixes[70],
        uniform float4x4 world,
        uniform float4x4 view,
        uniform float4x4 proj,

        uniform float3 camera_pos_ws,

        out float4 out00 : TEXCOORD0,
        out float4 out01 : TEXCOORD1,
        out float4 out02 : TEXCOORD2,
        out float4 out03 : TEXCOORD3,
        out float4 out04 : TEXCOORD4,
        out float4 out05 : TEXCOORD5,
        out float4 out06 : TEXCOORD6,
        out float4 out07 : TEXCOORD7,
        out float4 for_rasteriser : POSITION,
        out float4 vert_colour : COLOR0
) {

        #if USE_NORMAL_MAP==1
        float3 tangent_os = tangent_with_parity.xyz;
        float tangent_parity = tangent_with_parity.w;
        #endif

        vert_colour = float4(1,1,1, 1);

        #if USE_VERTEX_COLOURS>0
                vert_colour.rgb *= colour.rgb;
                #if USE_VERTEX_COLOURS==4
                        vert_colour.a *= colour.a;
                #endif
        #endif

        float2 uv0_ = float2(1,1);
        #if USE_DIFFUSE_MAP==1 || USE_NORMAL_MAP==1 || USE_SPECULAR_MAP==1 || USE_TRANSLUCENCY_MAP==1 || USE_PAINT_MAP==1
                uv0_ = uv0;
        #endif


        float3 pos_ws;
        float3 normal_ws;
        float3 tangent_ws;
        #if USE_NORMAL_MAP==0
        float3 tangent_os;
        #endif

        #if BLENDED_BONES>0
                transform_vertex_bones(bone_weights, bone_matrixes, bone, pos_os, normal_os, tangent_os,
                                       pos_ws, normal_ws, tangent_ws);
        #else
                transform_vertex(world, pos_os, normal_os, tangent_os,
                                 pos_ws, normal_ws, tangent_ws);
        #endif
        pos_ws += ground_overlay_offset(pos_ws, normal_ws, camera_pos_ws);

        float3 pos_vs = mul(view, float4(pos_ws,1)).xyz;
        for_rasteriser = mul(proj, float4(pos_vs,1));

        #if RENDER_OBJECT_NORMAL==1
                vert_colour.rgb = zero(vert_colour.rgb) + direction_to_colour(normal_os);
        #endif
        #if RENDER_NORMAL==1
                vert_colour.rgb = zero(vert_colour.rgb) + direction_to_colour(normal_ws);
        #endif
        #if RENDER_TANGENT==1
                #if USE_NORMAL_MAP==1
                        vert_colour.rgb = zero(vert_colour.rgb) + direction_to_colour(tangent_ws);
                #endif
        #endif

        out00.xyz = (normal_ws);
        #if USE_NORMAL_MAP==1
                out03.xyz = (tangent_ws);
                out04.x = tangent_parity;
        #endif

        out01.w = -pos_vs.z;

        out02.w = uv0_.x;

        out03.w = uv0_.y;

        out04.y = 0;
        out04.z = 0;

        #if BLEND>1
        // ranges from 1 (very smooth transition) to lots (sharp transition)
        // This is used to exaggerate (or soften) the blend
                float contrast = 1/max((1-sharpness)*(1-sharpness), 0.0000001);
        #endif

        #if BLEND>1
                out04.w = contrast;
                
                out05.w = blend0.r;
                out06.w = blend0.g;
                out07.w = blend0.b;
        #endif

        #if USE_MICROFLAKES == 1
                out05.w = pos_os.x;
                out06.w = pos_os.y;
                out07.w = pos_os.z;
        #endif

        #if WORLD_GEOMETRY==1
                out04.w = visibility;
        #endif

}
// }}}

// {{{ FORWARD RECEIVE FRAGMENT PROGRAM
void fp_main (
        in float4 vert_colour : COLOR0,
        in float face : FACE,
        in float4 out00 : TEXCOORD0,
        in float4 out01 : TEXCOORD1,
        in float4 out02 : TEXCOORD2,
        in float4 out03 : TEXCOORD3,
        in float4 out04 : TEXCOORD4,
        in float4 out05 : TEXCOORD5,
        in float4 out06 : TEXCOORD6,
        in float4 out07 : TEXCOORD7,

        in float2 wpos : WPOS,

        EXTRA_MAPS

        BLEND_UNIFORMS

        uniform float4 surf_diffuse,
        uniform float3 surf_specular,
        uniform float surf_gloss,
        uniform float shadow_oblique_cutoff,

        uniform float4 custom_param,
        uniform float alpha_rej,
        uniform float4 time,
        uniform float3 texture_size,
        uniform float4 col1,
        uniform float4 col2,
        uniform float4 col3,
        uniform float4 col4,
        uniform float3 col_spec1,
        uniform float3 col_spec2,
        uniform float3 col_spec3,
        uniform float3 col_spec4,
        uniform float microflakes_mask,
        uniform float render_target_flipping,
        uniform float far_clip_distance,

        out float4 pixel0 : COLOR0,
        out float4 pixel1 : COLOR1,
        out float4 pixel2 : COLOR2,
        out float4 pixel3 : COLOR3
) {

        // {{{ decoding interpolators

        float3 normal_ws = normalize(out00.xyz);
        #if USE_NORMAL_MAP==1
                float3 tangent_ws = normalize(out03.xyz);
                float tangent_parity = out04.x;
        #endif

        float2 uv0_ = float2(out02.w, out03.w);
        float cam_dist = out01.w;

        float2 screen_pos = wpos.xy;

        #if BLEND <= 1
                float blend[BLENDSZ] = {1};
        #elif BLEND <= 4
                float blend[BLENDSZ] = { 0.5, out05.w, out06.w, out07.w };
                /* too many blends is bad for performance
                #elif BLEND <= 7
                float blend[7] = { 0.5, out05.w, out06.w, out07.w, out04.r, out04.g, out04.b };
                #elif BLEND <= 10
                float blend[10] = { 0.5, out05.w, out06.w, out07.w, out04.r, out04.g, out04.b, out03.r, out03.g, out03.b };
                */
        #endif
        #if USE_MICROFLAKES==1
                float3 pos_os = float3(out05.w, out06.w, out07.w);
        #endif
        #if WORLD_GEOMETRY==1
        float visibility = out04.w;
        #else
        float visibility = custom_param.x;
        #endif

        #if BLEND <= 1
                float contrast = 1;
        #else
                float contrast = out04.w;
        #endif

        // }}}

        float3 diff_colour;
        float3 spec_colour;
        float gloss;
        float translucency;
        float pixel_alpha;

        #if ABUSING_AMBIENT==1
        surf_specular = vert_colour.rgb;
        #endif

        forward_pass(vert_colour.rgb, vert_colour.a,
                     face,
                     surf_diffuse.rgb, surf_diffuse.a, surf_specular, surf_gloss,
                     alpha_rej,
                     time,
                     col1, col2, col3, col4,
                     col_spec1, col_spec2, col_spec3, col_spec4,
                     microflakes_mask,
                     #if USE_MICROFLAKES==1
                     pos_os,
                     #endif
                     render_target_flipping,
                     contrast,
                     #if USE_NORMAL_MAP==1
                     tangent_ws,
                     tangent_parity,
                     #endif
                     blend,
                     visibility,
                     screen_pos,
                     texture_size,
                     EXTRA_MAP_ARGS
                     BLEND_UNIFORM_ARGS

                     diff_colour, normal_ws, spec_colour, gloss, translucency, pixel_alpha);

        cam_dist = cam_dist / far_clip_distance;

        pack_deferred(pixel0, pixel1, pixel2, pixel3,
                      shadow_oblique_cutoff, diff_colour, normal_ws, spec_colour, cam_dist, gloss);


}
// }}}

#endif



#if FORWARD_PART==0 && DEFERRED_AMBIENT_SUN_PART==1

// {{{ DEFERRED RECEIVER VERTEX PROGRAM
void vp_main (
        in float4 pos_ss : POSITION,

        uniform float4x4 quad_proj,
        uniform float3 top_left_ray,
        uniform float3 top_right_ray,
        uniform float3 bottom_left_ray,
        uniform float3 bottom_right_ray,

        out float2 uv0_ : TEXCOORD0,
        out float3 ray_ : TEXCOORD1,
        out float4 for_rasteriser : POSITION
) {
        for_rasteriser = mul(quad_proj, pos_ss);
        //for_rasteriser = float4(pos_ss.xy, 0, 1);

        // D3D9 notes:
        // this is pre-rasterisation so use the offset values
        // this will give slightly offset rays but the rasteriser
        // will correct them
        uv0_ = (pos_ss.xy) * float2(0.5,-0.5) + float2(0.5,0.5);
        ray_ = lerp(
                lerp(top_left_ray,top_right_ray, uv0_.x),
                lerp(bottom_left_ray,bottom_right_ray, uv0_.x),
                uv0_.y);

        uv0_ = sign(pos_ss.xy) * float2(0.5,-0.5) + float2(0.5,0.5);
}
// }}}

// {{{ DEFERRED RECEIVER FRAGMENT PROGRAM
void fp_main (

        in float2 uv : TEXCOORD0,
        in float3 ray : TEXCOORD1,
        //in float3 ray2 : TEXCOORD2,
        in float2 wpos : WPOS,

        SHADOW_MAPS

        uniform float3 scene_ambient_colour,
        uniform float3 sun_diffuse,
        uniform float3 sun_specular,
        uniform float4 the_fog_params,
        uniform float3 the_fog_colour,
        uniform float3 misc,
        uniform float4 sun_pos_ws,
        uniform float4x4 shadow_view_proj1,
        uniform float4x4 shadow_view_proj2,
        uniform float4x4 shadow_view_proj3,
        uniform float3 camera_pos_ws,
        uniform float far_clip_distance,
        uniform float near_clip_distance,

        uniform float4x4 view_proj, // for depth

        sampler2D tex0 : register(s0),
        sampler2D tex1 : register(s1),
        sampler2D tex2 : register(s2),
        sampler2D tex3 : register(s3),

        out float4 pixel : COLOR0,
        out float depth : DEPTH
) {

        float shadow_oblique_cutoff;
        float3 diff_colour;
        float3 normal_ws;
        float3 spec_colour;
        float normalised_cam_dist;
        float gloss;
        float4 texel0 = tex2D(tex0, uv);
        unpack_deferred(texel0, tex2D(tex1, uv), tex2D(tex2, uv), tex2D(tex3, uv), 
                        shadow_oblique_cutoff, diff_colour, normal_ws, spec_colour, normalised_cam_dist, gloss);
        float translucency = 0;

        if (normalised_cam_dist>=1) discard;

        float3 pos_ws = camera_pos_ws + normalised_cam_dist*ray;
        //float3 pos_vs = mul(view, float4(pos_ws,1)).rgb;

        float cam_dist = normalised_cam_dist * far_clip_distance;
        ray = normalize(ray);

        float3 sun_dir_ws = sun_pos_ws.xyz;  // assume directional light, vector points towards sun
        //float3 sun_dir_vs = mul(view,float4(sun_dir_ws,0)).xyz;  // assume directional light, vector points towards sun

        float3 camera_dir_ws = -ray;
        //float3 camera_dir_vs = mul(view,float4(camera_dir_ws,0)).rgb;

        float sun_dist = -dot(pos_ws,sun_dir_ws);

        float surf_shadow_strength = misc.x;
        float sky_light_strength = misc.y;

        float3 pos_ls1_;
        float3 pos_ls2_;
        float3 pos_ls3_;
        #if RECEIVE_SHADOWS == 1
                float4 pos_ls1 = mul(shadow_view_proj1, float4(pos_ws,1));
                pos_ls1_ = pos_ls1.xyw;
                float4 pos_ls2 = mul(shadow_view_proj2, float4(pos_ws,1));
                pos_ls2_ = pos_ls2.xyw;
                float4 pos_ls3 = mul(shadow_view_proj3, float4(pos_ws,1));
                pos_ls3_ = pos_ls3.xyw;
        #endif

        float2 screen_pos = wpos.xy;

        float3 sky_ws = float3(0,0,1);

        float3 pixel_colour = deferred_shading(diff_colour,
                                               normal_ws,
                                               spec_colour,
                                               gloss,
                                               translucency,
                                               cam_dist,
                                               sky_ws,
  
                                               sun_dir_ws,
                                               camera_dir_ws,
                                               shadow_oblique_cutoff,
                                               sun_dist,
                                               screen_pos,
                                               surf_shadow_strength,
                                               #if RECEIVE_SHADOWS == 1
                                               shadow_map1,
                                               shadow_map2,
                                               shadow_map3,
                                               #endif
                                               #if SHADOW_FILTER_NOISE == 1
                                               shadow_filter_noise,
                                               #endif
                                               pos_ls1_, pos_ls2_, pos_ls3_,
                                               scene_ambient_colour,
                                               sun_diffuse,
                                               sun_specular,
                                               sky_light_strength,
                                               the_fog_params.x,
                                               the_fog_colour);

        pixel.rgb = tone_map(pixel_colour) / MAX_LIGHT_RANGE;
        pixel_colour /= max(1, max(pixel_colour.r, max(pixel_colour.g, pixel_colour.b)));
        // view space is right-handed -- negative Z is depth
        //pos_vs.z = min(-near_clip_distance, pos_vs.z);
        //pixel.rgb = zero(pixel.rgb) + float3(1,1,1) * mod(cam_dist,1);
        //pixel.rgb = zero(pixel.rgb) + mod(float3(64,0,0) * texel0.rgb, 1);
        //pixel.rgb = zero(pixel.rgb) + mod(dot(float3(256*256*255,256*255,255), texel0.rgb)/(256*256*256-1)*8, 1);
        //if (texel0.g > 1.0001) pixel.rgb = float3(1,0,0);
        //pixel.rgb = zero(pixel.rgb) + float3(-normalised_cam_dist,0,normalised_cam_dist);
        //pixel.rgb = zero(pixel.rgb) + mod(pos_ws, 1.0);
        //pixel.rgb = zero(pixel.rgb) + tex2D(tex0, uv).rgb;
        //pixel.rgb = zero(pixel.rgb) + direction_to_colour(ray);
        //pixel.rgb = zero(pixel.rgb) + tone_map(diff_colour);
        pixel.a = 1;


        float4 projected = mul(view_proj, float4(pos_ws,1));
        // Whether we are using d3d9 or gl rendersystems,
        // ogre gives us the view_proj in a 'standard' form, which is
        // right-handed with a depth range of [-1,+1].
        // Since we are outputting depth in the fragment shader, the range is [0,1]
        depth = 0.5 + (projected.z / projected.w) / 2.0;
        //depth = 1;
}
// }}}

#endif


#if DEFERRED_LIGHTS_PART==1

// {{{ DEFERRED LIGHTS VERTEX PROGRAM
void vp_main (
        in float3 pos_ws : POSITION,
        in float3 light_aim_ws : NORMAL,
        in float3 diff_colour : COLOR0,
        in float3 spec_colour : COLOR1,
        in float3 light_pos_ws : TEXCOORD0,
        in float3 light_param : TEXCOORD1,

        uniform float4x4 view_proj,
        uniform float render_target_flipping,

        out float3 uv0_ : TEXCOORD0,
        out float3 light_aim_ws_ : TEXCOORD2,
        out float3 diff_colour_ : TEXCOORD3,
        out float3 spec_colour_ : TEXCOORD4,
        out float3 light_pos_ws_ : TEXCOORD5,
        out float3 light_param_ : TEXCOORD6,
        out float4 for_rasteriser : POSITION
) {
        for_rasteriser = mul(view_proj, float4(pos_ws,1));
        uv0_ = for_rasteriser.xyw;
        for_rasteriser.y *= render_target_flipping;

        light_aim_ws_ = light_aim_ws;
        diff_colour_ = diff_colour;
        spec_colour_ = spec_colour;
        light_pos_ws_ = light_pos_ws;
        light_param_ = light_param;
}
// }}}

// {{{ DEFERRED LIGHTS FRAGMENT PROGRAM
void fp_main (

        in float3 uv_ : TEXCOORD0,
        //in float3 ray : TEXCOORD1,
        in float3 light_aim_ws_ : TEXCOORD2,
        in float3 diff_colour_ : TEXCOORD3,
        in float3 spec_colour_ : TEXCOORD4,
        in float3 light_pos_ws_ : TEXCOORD5,
        in float3 light_param_ : TEXCOORD6,

        uniform float3 top_left_ray,
        uniform float3 top_right_ray,
        uniform float3 bottom_left_ray,
        uniform float3 bottom_right_ray,

        uniform float4 the_fog_params, // attenuate light to hide it behind fog
        uniform float3 camera_pos_ws,
        uniform float far_clip_distance,
        uniform float4 viewport_size,

        sampler2D tex0 : register(s0),
        sampler2D tex1 : register(s1),
        sampler2D tex2 : register(s2),
        sampler2D tex3 : register(s3),

        out float3 pixel : COLOR0
) {

        float2 uv = uv_.xy/uv_.z;
        uv = uv * float2(0.5,-0.5) + float2(0.5,0.5);
        float3 ray = lerp(lerp(top_left_ray,top_right_ray, uv.x),
                          lerp(bottom_left_ray,bottom_right_ray, uv.x),
                          uv.y);
        // hack to stop it 'shimmering' in d3d9
        if (d3d9() > 0) uv -= 0.5 * viewport_size.zw; // zw is 1/w 1/h

        float shadow_oblique_cutoff;
        float3 surf_diff_colour;
        float3 normal_ws;
        float3 surf_spec_colour;
        float normalised_cam_dist;
        float gloss;
        unpack_deferred(tex2D(tex0, uv), tex2D(tex1, uv), tex2D(tex2, uv), tex2D(tex3, uv), 
                        shadow_oblique_cutoff, surf_diff_colour, normal_ws, surf_spec_colour, normalised_cam_dist, gloss);

        if (normalised_cam_dist>=1) discard;

        float3 pos_ws = camera_pos_ws + normalised_cam_dist*ray;

        float cam_dist = normalised_cam_dist * far_clip_distance; // scalar distance, as in the ambient/sun pass
        ray = normalize(ray);

        float3 camera_dir_ws = -ray;

        float3 light_ray_ws = light_pos_ws_ - pos_ws;
        float light_dist = length(light_ray_ws);
        float3 light_dir_ws = light_ray_ws / light_dist;

        float inner = light_param_.x;
        float outer = light_param_.y;
        float range = light_param_.z;

        float light_intensity = light_attenuation(light_param_.z, light_dist);

        float angle = -dot(light_aim_ws_, light_dir_ws);
        if (outer != inner) {
                float occlusion = clamp((angle-inner)/(outer-inner), 0.0, 1.0);
                light_intensity *= (1-occlusion);
        }

        float diff_exposure = dot(normal_ws, light_dir_ws);
        float diff_illumination = max(diff_exposure, 0.0);
        float3 diff_component = surf_diff_colour * diff_colour_ * diff_illumination;

        float spec_exposure = -dot(reflect(light_dir_ws, normal_ws), camera_dir_ws);
        float spec_illumination = pow(max(0.0000001,spec_exposure),gloss);
        float3 spec_component = spec_illumination * surf_spec_colour * spec_colour_;

        pixel = light_intensity * (diff_component + spec_component) * fog_weakness(the_fog_params.x, cam_dist);
        //pixel = zero(pixel) + 10.0/255*float3(1,1,1);
        //pixel = zero(pixel) + mod(pos_ws,1);
        //pixel = zero(pixel) + float3(uv,0);
        //pixel = zero(pixel);
}
// }}}

#endif

// vim: ts=8:sw=8:et

Post by sparkprime »

Actually there are 2 more files. It's probably better to give you the preprocessed output somehow... But there is no material script I can give you as all that is done with procedural code.

Post by CABAListic »

Maybe just give me the full debug output from the logs (the before and after GLSL code), that might be enough. I have a vague idea, I'll look into it tomorrow.

Post by sparkprime »

The find/replace in CgProgram::fixHighLevelOutput isn't going to work, I'm afraid. It matches substrings of identifiers rather than whole identifiers, so it may change uniforms, and indeed variable names and other things, beyond what you want it to change. E.g. I have a variable called view_proj and a variable called shadow_view_proj, so the view_proj renaming is affecting the wrong variables. There is a possibility that a uniform could be renamed to a variable name, which would cause the variable definition to shadow the uniform definition and cause the uniform to essentially disappear. I'm not sure if this is actually happening, however, as it seems to require some quite weird variable names. Probably there is some other reason why there is a buffer overrun.
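
A minimal sketch of the substring problem. The mangled names (_view_proj1, _shadow_view_proj1) are made up for illustration and the names Cg really emits may differ; replaceAll is a hypothetical stand-in for a plain find/replace pass, not the actual code in CgProgram::fixHighLevelOutput.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Plain substring replace with no notion of identifier boundaries.
std::string replaceAll(std::string text, const std::string& from, const std::string& to)
{
    assert(!from.empty());  // an empty pattern would never advance
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();   // continue after the text just written
    }
    return text;
}
```

Undoing the mangling of view_proj with replaceAll(source, "_view_proj1", "view_proj") on a source that also declares _shadow_view_proj1 rewrites the second uniform to _shadowview_proj, a name that nothing will ever bind.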

Of course one could find/replace using something like the \<\> regular expression syntax. But there are deeper issues here. Cg must mangle for a reason. I presume this is to avoid hitting a GL keyword, or it could be to separate things that are in different namespaces in the source language but in the same namespace in the target language (e.g. fields and methods when compiling from Java to C++). So if you rename them back again, you face the issues the mangling was trying to avoid. I assume the variables in the generated code are all uniquely named; renaming them back could break that as well, and solving it would require proper parsing / symbol table translation (almost a whole new compiler). Still, I think there are two reasonable choices.

1) Reject Cg programs that include variables whose names might cause a problem, i.e. Ogre would not support some shaders that are valid Cg shaders.
2) Do some translation at uniform binding time in the GpuProgramParams, to translate the Ogre names into mangled names.

1 seems preferable to me. It means still reversing the mangling, and Cg programs might need some porting in extreme cases, but it's a lot simpler to get working.

Post by CABAListic »

You're right, of course. Actually, most of the hack was meant precisely to avoid 2) in the first place, but the more I think about it, the more likely it seems that 2) is the only way to make sure that you don't run into weird find/replace problems as you described. Because even if you discard programs that have keyword issues, that one would still be problematic to fix.

The type declaration change to mat4 etc. has to remain, but that's a lot less problematic because it searches for a complete declaration, so different variables can't interfere with each other. I'll have to think about the easiest way to do the translation in the GpuProgramParameters.

Post by CABAListic »

Ok, I committed a first try at implementing 2), it's pushed in the repository. From my local tests it appears to be working in principle, but there is at least one strange bug: If I enable the glslX profiles for the sample CelShading shaders, the Ogre head is rendered in red. I have no idea what's causing this, other shaders I tested work fine.

Anyway, give it a try. I don't expect it to work for your case, but it would be interesting to know how much further it gets you :)

Post by CABAListic »

I narrowed down the issue with the CelShading sample to the custom parameters assigned to the shader. No idea why that is causing an issue; it works for the hlslX profiles, and it worked for glslX with my previous hack. Given that non-custom parameters look good I have no idea what is the issue, but I'll find out sooner or later :)

Post by sparkprime »

Will do. BTW this looks like the bug I reported with colour_op_ex http://www.ogre3d.org/mantis/view.php?id=541

Post by CABAListic »

Hm, I don't think so; if I replace the custom parameter with a constant parameter, it works fine. There is something fishy going on when custom parameters are assigned, I'll have to dig into the code.

Post by sparkprime »

I use a single float4 custom parameter per entity; that may have been the cause of the overrun if it wasn't counted for some reason.

Post by CABAListic »

Well, so far I haven't gotten any further with the problem, but I thought I'd explain briefly the approach I've taken. In the GLSL/HLSL output produced by Cg I use the comments at the beginning to determine the parameter name changes and record them in a map. I also clean up any matrix declaration in GLSL as I explained earlier. However, I no longer reverse the variable name changes, because as you pointed out, this would need a more sophisticated approach than a simple string replace. (But might still be doable with reasonable effort if the current approach fails.)
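
A rough sketch of collecting the renames from the header comments. The line format is an assumption that has not been checked against every Cg version: "//var <type> <original> : <semantic> : <mangled>[, <size>] : <index> : <used>". parseNameChanges and trim are hypothetical helpers, not Ogre or Cg functions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Trim spaces and tabs from both ends of a string.
static std::string trim(const std::string& s)
{
    const std::size_t first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Collect original -> mangled names from the ASSUMED "//var" comment lines.
std::map<std::string, std::string> parseNameChanges(const std::string& source)
{
    std::map<std::string, std::string> renames;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 6, "//var ") != 0) continue;

        // Split the rest of the line into ':'-separated fields.
        std::istringstream fields(line.substr(6));
        std::string decl, semantic, resource;
        if (!std::getline(fields, decl, ':')) continue;
        if (!std::getline(fields, semantic, ':')) continue;
        if (!std::getline(fields, resource, ':')) continue;

        // Skip vertex inputs/outputs such as "$vin.POSITION".
        semantic = trim(semantic);
        if (!semantic.empty() && semantic[0] == '$') continue;

        // decl is "<type> <original>": keep the last word.
        decl = trim(decl);
        const std::size_t space = decl.find_last_of(" \t");
        if (space == std::string::npos) continue;
        const std::string original = decl.substr(space + 1);
        assert(!original.empty());  // decl was trimmed, so a last word exists

        // resource is "<mangled>[, <size>]": drop the size and any "[0]" suffix.
        std::string mangled = trim(resource.substr(0, resource.find(',')));
        mangled = mangled.substr(0, mangled.find('['));

        if (!mangled.empty() && mangled != original) renames[original] = mangled;
    }
    return renames;
}
```

The "[0]" suffix is stripped here only so the base names can be compared; whether the real binding needs the suffixed variant is left open.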

Instead I intercept every GpuProgramParameters object that is created for the shader in either createParameters or getDefaultParameters. That parameter object is supplied by the actual GLSL/HLSL shader with a GpuNamedConstants object mapping variable names to physical indices. I take that map and create a new GpuNamedConstants object that contains the original parameter names, then supply it to the GpuProgramParameters object. All user code sees the new mapping with the original parameter names, so all user code should work fine. And in theory, the internal mapping should also be consistent. Indeed, the approach appears to work fine with HLSL shaders, however with GLSL, something seems to get lost along the way. It's not a complete loss, because some shaders do work fine and others (like the CelShading shaders) work partially, but obviously there's still some work to be done. Hopefully I can figure out the missing piece soon, and hopefully it's not going to turn out to be a stopper, but you never know.
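
The remapping step can be sketched roughly as follows. ConstantDef is a stand-in for Ogre's GpuConstantDefinition reduced to one field, and withOriginalNames is a hypothetical helper; this is an illustration of the idea under those assumptions, not the committed code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in for GpuConstantDefinition; only the field discussed above.
struct ConstantDef {
    std::size_t physicalIndex;
};

typedef std::map<std::string, ConstantDef> ConstantMap;  // name -> definition
typedef std::map<std::string, std::string> RenameMap;    // original -> mangled

// Build a map keyed by the original Cg names that carries the same physical
// indices the delegate shader reported under the mangled names.
ConstantMap withOriginalNames(const ConstantMap& mangledDefs, const RenameMap& renames)
{
    // Invert the rename table: mangled -> original.
    std::map<std::string, std::string> toOriginal;
    for (const auto& r : renames) toOriginal[r.second] = r.first;

    ConstantMap result;
    for (const auto& entry : mangledDefs) {
        // Treat "name[0]" variants by renaming only the part before '['.
        const std::size_t bracket = entry.first.find('[');
        const std::string base = entry.first.substr(0, bracket);
        const std::string suffix =
            bracket == std::string::npos ? std::string() : entry.first.substr(bracket);

        const auto found = toOriginal.find(base);
        const std::string name =
            (found == toOriginal.end() ? base : found->second) + suffix;

        // Two mangled names must never collapse onto one original name.
        assert(result.find(name) == result.end());
        result[name] = entry.second;
    }
    return result;
}
```

Note that the result holds copies of the definitions, so user code sees the original names but any later change to a definition stays local to this map.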

In the case of the CelShading sample, it's the custom parameters that do not get through to the shader properly. Their value gets corrupted somewhere. I do not know, however, if those are the only parameter types with issues.

Post by sparkprime »

I was able to build and test after merging in v1-9

It actually seems to look identical, and I still get the double free crash. I see some of my positions are wrong, as if the world matrix is getting in there corrupted, but only for certain objects. It is very weird. I don't see any issues with uniform names though.

What's with all the [0] variants of the names?

Post by sparkprime »

Here's a log of it running in valgrind with an empty scene:

http://spark.woaf.net/grit/valgrind_cg_glsl.txt

In valgrind I don't get the crash on exit that I usually get, which I suspect is because valgrind doesn't let buffer overruns corrupt other things; it traps them and ignores the write or something like that.

Fixing that buffer overrun has to be a priority, but I'm not sure exactly which shader it's happening with and what's special about that shader. It could be that it actually happens with all shaders but this is an empty scene so only the skybox is rendered. One theory is that I am rendering an empty pass for my deferred light boxes and coronas, and this is dragging in a particle shader. Take it all with a pinch of salt :)

Post by CABAListic »

sparkprime wrote:What's with all the [0] variants of the names?
Don't know, they are discovered like that by the GLSL shader and appear to be necessary for some shaders. I don't know enough about GLSL binding procedures to fully understand it.

I will probably have to create a simpler test bed for this; so far I've experimented with the SampleBrowser, but it's limited for actual debugging work. I'll look into the double free, but it may take some time. I have some other things I need to get back to (non-Ogre related).

Post by areay »

First up, great work CABAListic, this is going to make life a helluva lot easier for us.
CABAListic wrote:Hm, I don't think the Cg work broke that as it doesn't touch anything outside of shaders. But if that problem is present also in our vanilla repository, I'd welcome a bug report nonetheless :)
sparkprime wrote:It could be a bug introduced by changes in my fork, but I suspect a regression in Ogre itself. Unless any of the samples use colour_op_ex? Should be easy enough to replace one of the texture units in a sample material with a solid red colour and see if it works.
This problem sounds very much like this bug that was only just fixed a day ago http://www.ogre3d.org/mantis/view.php?id=541

Post by CABAListic »

Alright, I found out why custom parameters (and likely other special parameters) aren't working properly with the glsl profiles in my new approach. Basically, every shader parameter set (a GpuProgramParameters object) gets a GpuNamedConstants object which maps parameter names to a GpuConstantDefinition. Whenever a named parameter is set, the map is used to find the appropriate GpuConstantDefinition, which contains the physicalIndex that tells the param set where to store the param data. The GpuNamedConstants object is usually shared between each param set and the shader program.

In theory one can (with a single const_cast) replace the mapping in the original GpuNamedConstants object to set other parameter names, and that works fine with parameter assignment. Indeed, the Cg hlsl profiles appear to work well this way, so the HLSL programs don't care about this replacement. But GLSL shaders do, which has to do with the fact that they need to assemble a combined link program consisting of fragment and vertex shader and that parameter binding happens in this combined program. GLSLPrograms use a hack and scan their program source to find out the available parameters early and create their GpuNamedConstants object. But only the LinkProgram can assemble the list of actually existing parameter names from the GL API, and then it uses the GLSLProgram's GpuNamedConstants to find out which parameter belongs to which program. Given that this can happen long after parameters have been assigned, if you change the name mapping in the GpuNamedConstants, this lookup will fail.

So instead I created a separate GpuNamedConstants in the CgProgram that was a copy of the original, but with the parameter names replaced, then assigned that one to any created GpuProgramParameters. This still works fine if delegating to an HLSL shader, but with GLSL there is now another problem. And that is that the GpuNamedConstants are unfortunately not constant. The GpuProgramParameters can and will modify certain attributes of a GpuConstantDefinition, mainly the variability parameter that tells Ogre in what context this constant can change. So if you replace the GpuNamedConstants in the parameter set, they will change the parameter in the new map. However, the GLSLLinkProgram uses the original maps to assemble its combined parameter list and therefore uses the original GpuConstantDefinitions when considering variability. So the modifications from the GpuProgramParameters are never seen by the LinkProgram, and therefore the variability is not correct, and shader parameters may not be updated when they need to be.
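
A small sketch of why the renamed copy goes stale. ConstantDef stands in for GpuConstantDefinition, the variability bit values are illustrative, and copyWithRename / addVariability are hypothetical helpers that only mimic the behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in for GpuConstantDefinition, reduced to two fields.
struct ConstantDef {
    std::size_t physicalIndex;
    unsigned variability;  // bit mask; the values used with it are illustrative
};

typedef std::map<std::string, ConstantDef> ConstantMap;

// Copy the map, giving one entry a different name (the "renamed copy" approach).
ConstantMap copyWithRename(const ConstantMap& original,
                           const std::string& mangled, const std::string& userName)
{
    ConstantMap copy;
    for (const auto& entry : original)
        copy[entry.first == mangled ? userName : entry.first] = entry.second;
    return copy;
}

// Mimics a parameter set widening the variability of a definition
// in whichever map it was handed.
void addVariability(ConstantMap& defs, const std::string& name, unsigned mask)
{
    ConstantMap::iterator it = defs.find(name);
    assert(it != defs.end());
    it->second.variability |= mask;
}
```

After addVariability runs on the copy, the entry in the original map still carries the old mask, which is the value a link program reading the original map would see.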

There is no easy way to fix this, imho it's a design flaw in Ogre. A relatively clean way would be to separate the modifiable GpuConstantDefinitions from the named param mapping, i.e. store them separately in the shader program and have the map work on pointers. Then you could replace the GpuNamedConstants object for param sets, and they'd still operate on the original GpuConstantDefinitions if they need to modify them. But this change (even though a normal Ogre user would probably never see it as it's mostly hidden behind Ogre's API) would require modifications to Ogre as well as all of the RenderSystems, and I don't feel comfortable doing it. It just asks for some subtle new bugs in some obscure shader setup that you'd never see when testing the changes until they come bite you in the ass.
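
A sketch of the "map works on pointers" idea, assuming std::shared_ptr for ownership (Ogre would presumably use its own pointer types). ConstantDef and renamedView are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Stand-in for GpuConstantDefinition, reduced to two fields.
struct ConstantDef {
    std::size_t physicalIndex;
    unsigned variability;
};

typedef std::shared_ptr<ConstantDef> ConstantDefPtr;
typedef std::map<std::string, ConstantDefPtr> ConstantMap;

// Build a second name map over the SAME definitions; only the keys differ.
ConstantMap renamedView(const ConstantMap& original,
                        const std::map<std::string, std::string>& mangledToUser)
{
    ConstantMap view;
    for (const auto& entry : original) {
        assert(entry.second);  // every name must point at a definition
        const auto renamed = mangledToUser.find(entry.first);
        view[renamed == mangledToUser.end() ? entry.first : renamed->second] = entry.second;
    }
    return view;
}
```

Because both maps point at one definition, a variability change made through the renamed view is visible to whoever still reads the original map.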

So unless you disagree, I'll go back to replacing Cg's new parameter names with the old ones in the source output so that I can leave the parameter mapping untouched. To avoid the previous bug, I'll modify the find/replace so that it looks at the characters immediately before and after each find position to determine if I actually found an instance of the variable I'm searching for, or if the find is part of a larger variable name. As you said, the drawback of this method is that if a user chooses a variable name that is reserved in GLSL, but not in Cg, then this will fail, and probably in a rather unrecognizable way. But I'll see if I can find a comprehensive list of GLSL reserved keywords to compare variable names to so that maybe I can warn the user if this happens.
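
A minimal sketch of such a boundary-checked replace plus a keyword check. replaceIdentifier and isReservedInGlsl are hypothetical helpers, the mangled names in the usage note are made up, and the keyword list is a small illustrative subset rather than the full GLSL list.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// True if c can be part of an identifier.
static bool isIdentChar(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Replace `from` only where it stands as a whole identifier.
std::string replaceIdentifier(std::string text, const std::string& from, const std::string& to)
{
    assert(!from.empty());
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        const std::size_t end = pos + from.size();
        const bool startsWord = pos == 0 || !isIdentChar(text[pos - 1]);
        const bool endsWord = end == text.size() || !isIdentChar(text[end]);
        if (startsWord && endsWord) {
            text.replace(pos, from.size(), to);
            pos += to.size();
        } else {
            ++pos;  // part of a longer identifier: leave it alone
        }
    }
    return text;
}

// A few GLSL keywords and reserved words that are legal identifiers in Cg.
// Illustrative subset only.
bool isReservedInGlsl(const std::string& name)
{
    static const std::set<std::string> reserved = {
        "attribute", "varying", "vec2", "vec3", "vec4",
        "mat2", "mat3", "mat4", "input", "output"};
    return reserved.count(name) != 0;
}
```

With this, replaceIdentifier(source, "_view_proj1", "view_proj") leaves a declaration of _shadow_view_proj1 untouched, and a warning can be logged whenever isReservedInGlsl returns true for an original parameter name.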

Also, if you could find a simple repro case for the crashes you encounter, I'd be grateful. In my tests so far, the only crash I've seen happens when I combine vertex and fragment programs of which only one uses the glslX profiles. I haven't investigated that one yet. Reloading a shader appears to work fine on Windows.