
Parsing of .material script

Posted: Thu May 21, 2009 12:55 pm
by jblecanard
Hello there

Please check out this forum entry: http://www.ogre3d.org/forums/viewtopic.php?f=2&t=50060. It seems strange that the parser correctly handles commas as the decimal separator in floating-point values but not dots. I failed to submit a ticket in the bug tracker, so I am posting here. It may be a Linux bug and not a Windows one (I'm a Linux programmer). And if it's not a bug but the normal behavior, I think it should be documented in the corresponding manual section (http://www.ogre3d.org/docs/manual/manual_14.html#SEC23).

Thanks to all the developers for their work!

Regards

Re: Parsing of .material script

Posted: Thu May 21, 2009 1:00 pm
by spacegaier
As I've already posted in your other thread, dots are the right token for floating numbers. I'm quite sure there is no problem with the parser but a problem with your script.

Re: Parsing of .material script

Posted: Thu May 21, 2009 1:06 pm
by jblecanard
I want to believe you, but the facts are here... let's see how the other thread evolves. I created this one before seeing your reply. Sorry!

Regards

Edit : spacegaier and I did not find the origin of the problem. It seems that we need an expert ;)

Re: Parsing of .material script

Posted: Thu May 21, 2009 5:40 pm
by Kentamanos
Is this some sort of localization thing somehow? It seems to be flipping his digit separator and decimal point character?

Re: Parsing of .material script

Posted: Thu May 21, 2009 5:57 pm
by sparkprime

Code:

    Real StringConverter::parseReal(const String& val)
    {
        // Use istringstream for direct correspondence with toString
        std::istringstream str(val);
        Real ret = 0;
        str >> ret;

        return ret;
    }
Seems very likely to me.
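
To illustrate why this function is sensitive to the locale: a default-constructed std::istringstream picks up the global C++ locale, and that locale's numeric facet decides which character counts as the decimal point. The sketch below is only an illustration under stated assumptions, not Ogre code: CommaDecimal, parseWith and makeCommaLocale are hypothetical names, and the facet merely imitates a comma-decimal locale so that the example does not depend on which system locales are installed.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that imitates a locale whose decimal separator is ','.
struct CommaDecimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Hypothetical helper: same extraction as parseReal, but with an explicit locale.
double parseWith(const std::locale& loc, const std::string& text) {
    std::istringstream stream(text);
    stream.imbue(loc);
    double value = 0.0;
    stream >> value;  // extraction stops at the first character it cannot use
    return value;
}

// Builds a copy of the classic "C" locale with the comma facet installed.
std::locale makeCommaLocale() {
    return std::locale(std::locale::classic(), new CommaDecimal);
}
```

With makeCommaLocale(), "0,5" parses as 0.5, while "0.5" stops at the dot and yields 0. With std::locale::classic(), "0.5" parses as 0.5. That matches the symptom described in the other thread.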

Re: Parsing of .material script

Posted: Thu May 21, 2009 5:58 pm
by sparkprime
Run your program with the LANG environment variable set to C

LANG=C ./myapp

Re: Parsing of .material script

Posted: Thu May 21, 2009 6:00 pm
by Klaim
Can't you force the locale instead? In the function sparkprime posted?

Re: Parsing of .material script

Posted: Thu May 21, 2009 6:16 pm
by jblecanard
Hu...

I did not use any function for parsing the file; I let Ogre do it itself by calling "Ogre::ResourceGroupManager::getSingleton().initialiseAllResourceGroups();"

Running the program with LANG=C does not change anything. :(

I agree that it looks like a locale problem, but I would be very surprised by it. Can you imagine the parsing of any source code (in C or C++, for example) depending on the locale? It would be useless and painful. Writing a .material script is a developer task, so it would be surprising for the syntax to depend on the developer's language :s. What if you want to share your program with foreigners?

Re: Parsing of .material script

Posted: Thu May 21, 2009 6:33 pm
by Kentamanos
jblecanard wrote:I agree that it looks like a locale problem, but I would be very surprised by it. Can you imagine the parsing of any source code (in C or C++, for example) depending on the locale? It would be useless and painful. Writing a .material script is a developer task, so it would be surprising for the syntax to depend on the developer's language :s. What if you want to share your program with foreigners?
If it's a localization thing, then I definitely agree it's a bug. Not trying to excuse it, just trying to find the reason. :)

Re: Parsing of .material script

Posted: Thu May 21, 2009 6:59 pm
by jblecanard
Ok...

I should try sparkprime's idea, but I do not understand how to use his function. Can you be more specific, sparkprime?

Re: Parsing of .material script

Posted: Thu May 21, 2009 9:21 pm
by sinbad
He basically means setting the 'LANG' environment variable to 'C' before running your program. However, in practice it would be better to just make the parser able to choose an appropriate locale. I've never really dealt with this before, since I've only worked in regions where decimals are delimited with a point.

It's inappropriate to change parseReal to override the locale universally, because people might use it to parse localised values too. Therefore I guess we need a locale option to each of the conversion methods (don't want to make a static option here, it's unfriendly to threading). Something like this:

Code:

    //-----------------------------------------------------------------------
    Real StringConverter::parseReal(const String& val, const std::locale& loc)
    {
        // Use istringstream for direct correspondence with toString
        std::istringstream str(val);
        str.imbue(loc);
        Real ret = 0;
        str >> ret;

        return ret;
    }
Where 'loc' is defaulted to a static const locale which is initialised to the default. Then, in the parsers, we pass in a locale of std::locale("en_GB") or similar to fix the format.

For people with more localisation experience than me - does that make sense? Is there a better way to do this? Frankly I'm surprised no-one's raised this before.
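
As a rough, standalone sketch of the idea above (free functions rather than the actual StringConverter API, with double standing in for Real): the default argument follows the current global locale, and a hypothetical script-side wrapper, parseScriptReal, pins the format. Note one swap: it uses std::locale::classic() instead of a named locale such as "en_GB", on the assumption that a named locale may not be installed on every system, whereas the classic "C" locale is always available and uses '.' as the decimal point.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Sketch of a locale-aware conversion as a free function (not the Ogre API).
// The default argument is evaluated at each call, so it follows the current
// global locale, which keeps the existing behaviour for other callers.
double parseReal(const std::string& val, const std::locale& loc = std::locale()) {
    std::istringstream str(val);
    str.imbue(loc);
    double ret = 0.0;
    str >> ret;
    return ret;
}

// Hypothetical script-side wrapper: always uses the classic "C" locale,
// whose decimal separator is '.', regardless of the user's environment.
double parseScriptReal(const std::string& val) {
    return parseReal(val, std::locale::classic());
}
```

Callers that really want localised input (logging round-trips, user-entered values) would keep calling parseReal with the default or an explicit locale, while the script parsers would go through the fixed-format wrapper.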

Re: Parsing of .material script

Posted: Thu May 21, 2009 9:47 pm
by Kentamanos
sinbad wrote:Frankly I'm surprised no-one's raised this before.
That is weird...

Countries this could potentially affect: http://en.wikipedia.org/wiki/Decimal_se ... imal_comma

Re: Parsing of .material script

Posted: Thu May 21, 2009 9:53 pm
by jblecanard
Hello sinbad !

What do you think about this :
I agree that it looks like a locale problem, but I would be very surprised by it. Can you imagine the parsing of any source code (in C or C++, for example) depending on the locale? It would be useless and painful. Writing a .material script is a developer task, so it would be surprising for the syntax to depend on the developer's language :s. What if you want to share your program with foreigners?
Don't you agree? In my opinion, a parser should not be tied to any locale. I'm French and I've never seen a program that parses real values with a comma, even a French one. French programmers also use the dot to delimit decimals on a computer. I don't think that people use it to parse localised values either because, as you said, no-one's raised this before. I think (as a non-English speaker) that parsing values in a localized format does not make sense for a script file. It makes sense for a log or a report file, for instance.

Regards

Re: Parsing of .material script

Posted: Thu May 21, 2009 10:12 pm
by sinbad
The thing is, we have lots of French programmers here and this really hasn't hit the radar before. That suggests to me that something in your environment is triggering the default locale inside VC++ to use commas for decimal separators when other people's aren't. I'm intrigued about what that might be.

I could hard-code it to "en_GB" but there are 2 questions which I can't answer, I need non-English people to debate / confirm:
1. Would that cause other problems?
2. Instead of changing code, is there a VC setting or something which you could change to do this, since other French programmers haven't had this problem?

I don't want to just hard-code something here without asking, because I have no experience of localisation.

And you've missed a point of mine - StringConverter is not just used for parsing, it's used for lots of other things like logging (and whatever else users might use it for). Therefore hard-coding it would not just affect the parsers, that's why I asked about making the converter behave differently per usage mode. Therefore your argument for hard-coding is an oversimplification, and I come back to my code example above which allows switching depending on context, which I wanted some feedback on.

Re: Parsing of .material script

Posted: Thu May 21, 2009 10:16 pm
by Kentamanos
sinbad: The other thread says he's on Linux, so it's not a VC++ issue. That might also partially explain why it hasn't been raised before?

Re: Parsing of .material script

Posted: Thu May 21, 2009 10:30 pm
by Klaim
Should the default locale of a compiled application be the OS locale? I'm not sure about that: all my applications' default locales "seem" to be "english"... Until now I always assumed that in C++ (not in .NET) the default locale was always "english" until you redefine it in the code... now I'm doubting... (I'm French too)

Or maybe it's because we use English versions of Visual Studio? I do this so that I don't have to translate errors and other compiler/IDE-specific messages when asking for help (most of the time on English forums).
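
On the question of the default: a C++ program starts with its global locale set to the classic "C" locale, and that only changes if the program (or a library it uses) calls std::locale::global or std::setlocale. A small diagnostic along these lines can show what a given process is actually using; decimalPointOf and describeLocale are hypothetical helper names, not part of Ogre.

```cpp
#include <locale>
#include <string>

// Returns the decimal separator that stream extraction would use for 'loc'.
char decimalPointOf(const std::locale& loc) {
    return std::use_facet<std::numpunct<char> >(loc).decimal_point();
}

// Describes a locale as "<name> uses '<separator>'" for quick diagnostics.
std::string describeLocale(const std::locale& loc) {
    return loc.name() + " uses '" + decimalPointOf(loc) + "'";
}
```

Printing describeLocale(std::locale()) at start-up and again just before the resources are loaded would show whether something changed the global locale in between. std::locale("") gives the user's preferred locale for comparison, though it may throw if the environment names a locale that is not installed.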

Re: Parsing of .material script

Posted: Thu May 21, 2009 10:41 pm
by spacegaier
Klaim wrote:Or maybe it's because we use English versions of Visual Studio? I do this so that I don't have to translate errors and other compiler/IDE-specific messages when asking for help (most of the time on English forums).
I'm on Windows and use the German version of Visual Studio and (although we Germans are in the wiki list posted above) the dots for floating-point numbers work here (and only the dots).

Re: Parsing of .material script

Posted: Thu May 21, 2009 10:42 pm
by jblecanard
sinbad wrote:And you've missed a point of mine - StringConverter is not just used for parsing, it's used for lots of other things like logging (and whatever else users might use it for). Therefore hard-coding it would not just affect the parsers, that's why I asked about making the converter behave differently per usage mode. Therefore your argument for hard-coding is an oversimplification, and I come back to my code example above which allows switching depending on context, which I wanted some feedback on.
You're absolutely right, and my opinion is that parsers should use a dedicated function, because parsing a file is a lot more complex (in theory) than just translating data between String and real values. This way, you keep a single syntax for all developers (which is a key point, I think), and other tools are easier to develop. I'm thinking of the Blender plugin, which writes .material scripts with dots (so I have to convert them ^^). And what if I want to share my software with you? I would have to generate a correct .material script for each configuration...

And you're also right about my particular computer. I'm using Linux (Arch Linux) and I'm coding with Eclipse CDT. My compiler is GCC and my build system is CMake (which also runs on Windows and Mac OS; maybe you have heard of it). I don't think that the way I compile my source code changes anything, because I do not compile Ogre each time; I just link dynamically against the library, which is already compiled. But I'm not an expert in C++ compilation, so I may be wrong.

Re: Parsing of .material script

Posted: Thu May 21, 2009 10:53 pm
by Klaim
After a quick check there : http://www.linuxtopia.org/online_books/ ... g_101.html

I made a little test (French OS, VS 2009 English, empty console application ) :

Code:

// Illustrates effects of locales.
#include <cstdlib>	// std::system
#include <iostream>
#include <locale>
using namespace std;

int main() {
	// Global C++ locale: "C" unless the program has changed it
	locale def;
	cout << def.name() << endl;
	locale current = cout.getloc();
	cout << current.name() << endl;
	float val = 1234.56f;
	cout << val << endl;
	// Change to French/France ("french" is a Windows locale name)
	cout.imbue(locale("french"));
	current = cout.getloc();
	cout << current.name() << endl;
	cout << val << endl;

	cout << "Enter the literal 7890,12: ";
	cin.imbue(cout.getloc());
	cin >> val;
	cout << val << endl;
	// Back to the initial locale for the final print
	cout.imbue(def);
	cout << val << endl;

	std::system("pause");	// Windows-only: keeps the console open
} ///:~

It displays only points...
I just checked: my computer (at work) is set to use . in the French locale. I remember that it's often set like that to avoid tool problems (some tools use the locale, others don't, so to be safe we set the whole system to . ).

(I'll have to test at home...)

Re: Parsing of .material script

Posted: Thu May 21, 2009 10:59 pm
by jblecanard
Okay, I found which environment variable affects the behavior. It is LC_ALL and not LANG. Setting LC_ALL to C solves the problem and makes the parser accept dots. I can easily handle that. But I still think that parsing should not be tied to locales ;). This is also evidence that it is a locale problem (we did not have any in the previous posts).

Edit: there's a real problem. When LC_ALL is set to C, my software starts in English, because the internationalization tools I'm using rely on the LC_ALL environment variable.
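
One possible workaround, sketched here under assumptions rather than tested against Ogre or any particular internationalization library: instead of forcing everything to C through the environment, the application can build a locale that keeps the user's settings for every category except numbers, which are taken from the classic "C" locale, and install it as the global locale. Both function names below are hypothetical.

```cpp
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: the user's preferred locale for everything except
// number formatting and parsing, which are taken from the classic "C" locale.
std::locale makeUserLocaleWithClassicNumbers() {
    std::locale user = std::locale::classic();
    try {
        user = std::locale("");  // user's environment (LANG, LC_ALL, ...)
    } catch (const std::runtime_error&) {
        // The environment names a locale that is not installed: keep "C".
    }
    return std::locale(user, std::locale::classic(), std::locale::numeric);
}

// Hypothetical check: parse with a default-constructed stream, which uses
// whatever the global locale is at the time of the call.
double parseWithGlobalLocale(const std::string& text) {
    std::istringstream stream(text);
    double value = 0.0;
    stream >> value;
    return value;
}
```

Calling std::locale::global(makeUserLocaleWithClassicNumbers()) once at start-up, before the resource groups are initialised, would make streams created afterwards read "0.5" with a dot. Whether translated messages keep working this way depends on how the internationalization tools read the locale, so that part would need checking.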

Re: Parsing of .material script

Posted: Fri May 22, 2009 7:05 am
by wacom
Why not accept both dots and commas in a material script?

Re: Parsing of .material script

Posted: Fri May 22, 2009 7:28 am
by banal
I think the solution suggested by sinbad is the best so far. I'm not a localization expert either and have never encountered these problems (although I'm also from a country that usually uses the comma as separator).
Material scripts (and other scripts for that matter) should be parsed using a fixed locale setting, so that they are interchangeable. Where locale settings are indeed needed, the function could use a custom locale.

Re: Parsing of .material script

Posted: Fri May 22, 2009 7:32 am
by jacmoe
Yes, something is definitely totally off.
I haven't experienced anything like it - and I am also using a locale (Danish) where commas and periods are swapped. Windows/Linux.
Have no idea what you did, but you did something bad. :)

I don't see any reason at all to change anything.

Re: Parsing of .material script

Posted: Fri May 22, 2009 9:53 am
by jblecanard
jacmoe wrote:Yes, something is definitely totally off.
I haven't experienced anything like it - and I am also using a locale (Danish) where commas and periods are swapped. Windows/Linux.
Have no idea what you did, but you did something bad. :)

I don't see any reason at all to change anything.
Do you use internationalization in your programs? Can you show me your environment variables when running the program (under Linux)? Maybe I did something wrong, but I have compiled a lot of things and Ogre is the first to do this. I still believe that the parser should not depend on a locale in any case, and I also think that sinbad's idea is the best one.

Re: Parsing of .material script

Posted: Fri May 22, 2009 10:08 am
by jacmoe
I am not using internationalization in my own programs (yet), but a lot of the programs I use do.
I can't post my env vars, yet, because I'm on my Windows 'box' atm.
Maybe a permanent Linux resident can post theirs?