Advanced compilers for Ogre's scripts
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
Once the AST is generated the further processing is up to me. The import thing can be handled fairly easily. The requested script is loaded (I'll provide a listener for "hooking" this, if you want to customize loading), and fed into the parser to generate another AST. This new one is then substituted in for the original script's import tree node. Further processing continues oblivious to the fact that the new elements never actually existed in the original script.
For the variables: handling it at parse time seems rather difficult. My plan was to handle it in between parsing and compiling. It would be one quick pass through the tree to process the variables. And yes, this pass would use a scoped symbol table to do it (I'm thinking something like a stack of symbol tables). The one thing that is both tricky and nice is that these variables aren't typed. They should be able to be set to any arbitrary ogre script construct. So, upon replacement the variable values will need to be expanded and transformed into ASTs themselves. Not worrying about type information, though, will make it simpler.
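To make the "stack of symbol tables" idea concrete, here is a minimal sketch in Python (chosen only for brevity; the real implementation would be C++). The class name `ScopedSymbols` and its methods are hypothetical, and values are stored as plain token lists because the variables are untyped.

```python
class ScopedSymbols:
    """A stack of dictionaries; the innermost scope is the last element."""

    def __init__(self):
        self._scopes = [{}]  # start with a single global scope

    def push(self):
        """Enter a nested object (material, technique, pass, ...)."""
        self._scopes.append({})

    def pop(self):
        """Leave the current object and forget its definitions."""
        if len(self._scopes) == 1:
            raise RuntimeError("cannot pop the global scope")
        self._scopes.pop()

    def define(self, name, value):
        """Bind a variable in the innermost scope."""
        self._scopes[-1][name] = value

    def lookup(self, name):
        """Search from the innermost scope outwards."""
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)


symbols = ScopedSymbols()
symbols.define("red", ["1", "0", "0"])      # outer definition
symbols.push()
symbols.define("red", ["0.8", "0", "0"])    # shadows the outer one
inner = symbols.lookup("red")
symbols.pop()
outer = symbols.lookup("red")
```

A tree-walking pass would call `push` when it enters an object node and `pop` when it leaves, so shadowed definitions disappear automatically.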
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
Praetor wrote:
For the variables: handling it at parse time seems rather difficult. My plan was to handle it in between parsing and compiling. It would be one quick pass through the tree to process the variables. And yes, this pass would use a scoped symbol table to do it (I'm thinking something like a stack of symbol tables). The one thing that is both tricky and nice is that these variables aren't typed. They should be able to be set to any arbitrary ogre script construct. So, upon replacement the variable values will need to be expanded and transformed into ASTs themselves. Not worrying about type information, though, will make it simpler.

It's just the opposite: it's quite easy to handle it at parse time, or at least it has been for me. Instead of retrieving scene nodes created earlier during the parse (my case), you would retrieve values from a table/map. The scoping is quite easy as well: you can create scope strings and store each variable along with its scope string. Then, when retrieving a value, you take the definition with the longest scope string that still encloses the scope at the call point. For instance:
Code: Select all
material mat1 {
    technique {
        define red 1 0 0 // Stored as red@technique0@mat1 = 1 0 0
        pass {
            define red 0.8 0 0 // red@pass0@technique0@mat1 = 0.8 0 0
            diffuse $red // uses red@pass0@technique0@mat1 instead of red@technique0@mat1
        }
        pass {
            diffuse $red // uses red@technique0@mat1 as red@pass0@technique0@mat1 is out of scope (the current scope is red@pass1@technique0@mat1)
        }
    }
}
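The lookup illustrated above could be sketched roughly as follows. This is only an illustration in Python: the function name `resolve` and the list representation of the current scope (innermost element first) are assumptions, not part of any existing parser.

```python
def resolve(table, name, scope):
    """Find the most specific definition of `name` visible from `scope`.

    `scope` lists the enclosing objects, innermost first, for example
    ["pass1", "technique0", "mat1"]. Candidate keys are tried from the
    most specific scope string down to the bare (global) name.
    """
    for start in range(len(scope) + 1):
        key = "@".join([name] + scope[start:])
        if key in table:
            return table[key]
    raise KeyError(name)


table = {
    "red@technique0@mat1": "1 0 0",
    "red@pass0@technique0@mat1": "0.8 0 0",
}

# Inside the first pass the pass-level definition wins.
first_pass = resolve(table, "red", ["pass0", "technique0", "mat1"])
# Inside the second pass only the technique-level definition is in scope.
second_pass = resolve(table, "red", ["pass1", "technique0", "mat1"])
```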

-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
It gets the idea across, but there are other issues. I'll have to mull it over a bit.
Anyway, I got somewhere on the ANTLR front. I generated a basic lexer. However, I quickly realized it is not stand-alone: it depends on ANTLR's C++ runtime. I was hoping not to bring in any extra dependencies with this project. If we are going to accept extra dependencies, I'd just as soon use boost's spirit instead of ANTLR. It generates the parser in-line (you define the grammar in C++, and template meta-programming creates the parser as it compiles), I'm familiar with it, like most of boost it is a header-only library, and sinbad already plans on folding in some boost dependency in the future.
That's where I am now. I feel like I should probably work on my hand-crafted lexer parser some more. I will need definitive answers about dependencies soon though. Using spirit could really accelerate this development.
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
Well, as a mentor I can't help but tell you to think carefully about this (I completely forgot about that dependency!).
Though you're allowed to add dependencies if that will accelerate the development and make things cleaner, you know almost everything possible will be ported to boost to remove dependencies as much as possible. That would make this project a candidate for a re-work just to remove one dependency. And that means re-writing code and wasted effort.
Though I like ANTLR probably above all the other parser generators, this is a delicate issue and you must balance the pros and cons in depth to find a good, satisfying answer.
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
Precisely my point. I don't think ANTLR is a good way to go if only for that reason. It is a shame, but that is just the way things happen sometimes.
I've converted that prototype to see how quickly I can get a spirit parser generating an AST for any arbitrary ogre script (any old .material or .particle, etc.). Depending on the scope of boost that will be integrated into Ogre, using spirit might be a negligible dependency to add.
I'll also think more about handling variables during parsing. The main issue is that base elements will define variables, which are overwritten by child elements. It is this inheritance behavior coupled with the variables that makes it a little more complex. Anyway, I'll do some more prototyping and find the best approach.
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
@everyone I noticed an inconsistency. One of the goals of these new compilers is to allow for more flexible input. That means newlines and tabs wherever the author wants. With one exception: properties. Properties still need to be in the form
Code: Select all
name value1 ... valuen
with a newline at the end. There can be no newlines in the middle of this sequence. So, while formatting is free-form everywhere else, in this instance it isn't. Now, I can make my parser handle this of course, and I will, but I think since we are using a C-like syntax it isn't out of the question to ask that these constructs end in a ';'.
Code: Select all
diffuse 0 1 1 1;
specular 1 1 1 1;
Now, it is unacceptable right now for me to *require* the semi-colon, since I want all current scripts to compile under the new system. However, I propose that I allow the semi-colons to be there. The old format without semi-colons becomes deprecated but acceptable for one release cycle, and then the whole system is transferred to semi-colon only. What do you think?
@kencho To continue our banter about symbols...
The issue that muddies symbol handling is inheritance. One material definition can inherit from another. The child material should then be able to override variables placed in the parent. I'm mulling over the mechanisms I'll use to compile the inheritance, but one thing is certain: it will have to happen after parsing is completed. To allow for the overwriting of parent variables I'll have to do variable processing after that, which means I can't do it during parsing. Does that make sense?
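To illustrate why the override has to wait until parsing is finished, here is a small Python sketch. The dictionary layout (`parent`, `vars`) and the function `effective_variables` are hypothetical and stand in for whatever the AST ends up looking like. Because the whole script is available, a child can be declared before its parent and still be resolved.

```python
def effective_variables(materials, name):
    """Merge variables along the inheritance chain, base first."""
    chain = []
    current = name
    while current is not None:
        if current in chain:
            raise ValueError("inheritance cycle at " + current)
        chain.append(current)
        current = materials[current]["parent"]

    merged = {}
    # Apply the most distant ancestor first so that children override it.
    for material_name in reversed(chain):
        merged.update(materials[material_name]["vars"])
    return merged


# The child is listed before its parent on purpose.
materials = {
    "Child": {"parent": "Base", "vars": {"red": "0.8 0 0"}},
    "Base": {"parent": None, "vars": {"red": "1 0 0", "shine": "10"}},
}
child_vars = effective_variables(materials, "Child")
```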
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
First, my opinion regarding the newlines... I might be wrong, but I think that's only a problem where the values are strings themselves (for instance, cube maps or multiframe animations). In all the other cases you have statements of the form "keyword value(s)", so newlines shouldn't be a problem, as values are either numbers or variables starting with $ (plus the aforementioned "string values").
I think this is a very problematic issue, as you're breaking a lot of the script API by requiring extra characters at the end of lines. None of the existing resource tools would work, and you know how appreciated the few existing tools for Ogre are.
I think that we should put a lot more thought into this issue given its impact and importance.
Onto the symbols: you've got a good point with inheritance, though it would still be possible to do it at parse time. As long as you import the ASTs at the moment you find the import statement, you could attach the values to a structure other than a map, say, an acyclic (inheritance) graph.
That's just my guess, but I still have faith in parse time.
Theoretically, if you require variables to be declared before being used, you can do it at parse time, am I wrong?
PS: This is a very nice and instructive discussion
Thanks!
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
Kencho wrote:
I think this is a very problematic issue, as you're breaking a lot of the script API by requiring extra characters at the end of lines. None of the existing resource tools would work, and you know how appreciated the few existing tools for Ogre are. I think that we should put a lot more thought into this issue given its impact and importance.

That is exactly my apprehension. I don't want to break any current scripts. I can certainly make it work now. I just wanted comments on the apparent inconsistency. Formatting of the script doesn't matter, until you start specifying properties, and then it does... I was saying we could continue to support the current format for now, but I would build in the option to add the trailing ';'. If we start changing examples and manuals slowly over time, it won't be too much of a burden when the optional ';' becomes mandatory. I have no idea what a timeline for the change could be. It could be never. But until then, allowing an optional ';' at the end of properties won't hurt anything. I'll make sure those scripts without it still work.
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
I was guessing that keywords are detected by the lexer and not returned as mere string tokens, so "string values" shouldn't be a problem either. Have you given this any thought? I'm working on a heavy GUI assignment for class right now and my mind is quite scrambled (due partly to the lack of sleep). Maybe it's possible to use the initial formula after all.
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
I've been working on parsers based on boost's spirit parsers. It is going really well, with some really amazing error handling (this is the best part so far). I'm doing some custom AST generation (spirit has built-in AST builders, but I'm not using them so I have more control).
The main bit of news is an idea I had for handling the aforementioned problems with newlines and such in properties. Instead of trying to split properties up into groups during the parsing phase, which I've found to be hard due to the lack of an end-of-property character (like C and C++'s ';'), I'll simply parse out a "block" of property settings. The tree would look like this:
Code: Select all
material
    Test
    :
    Parent
    lighting
    off
    diffuse
    1
    1
    1
    1
    technique
Notice that the properties are placed into the parent node as a flat block instead of sorted like
Code: Select all
lighting
    off
diffuse
    1
    1
    1
    1
The job of sorting out the properties can be left to the compiler, which will be better able to handle it, since it will know the keywords for properties. A desirable side-effect is that properties should now be tolerant of newlines and such splitting them up in the script.
The parser is currently exposed as a single free function, parse, which returns an AST structure. I will be finishing up this parsing function, and then start on the base class for all compilers. This base class will do some pre-processing on the AST, then hand off further processing to subclasses.
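As a rough illustration of how a compiler could sort the flat property block described above back into individual properties, here is a Python sketch. The function `group_properties` and the keyword set are hypothetical. It assumes the compiler knows every property keyword and that no value token collides with a keyword, which would need extra care in a real implementation.

```python
def group_properties(tokens, keywords):
    """Split a flat token list into [keyword, value, ...] groups."""
    groups = []
    for token in tokens:
        if token in keywords:
            groups.append([token])        # a keyword starts a new property
        elif groups:
            groups[-1].append(token)      # anything else is a value
        else:
            raise ValueError("value before any property keyword: " + token)
    return groups


flat = ["lighting", "off", "diffuse", "1", "1", "1", "1"]
grouped = group_properties(flat, {"lighting", "diffuse"})
```

Since grouping depends only on token order, newlines between the values of one property no longer matter.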
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
More testing reveals that I will probably have to create a mechanism for passing in the header and property keywords to the parser. Spirit is nice in that it contains dynamic parsing capabilities. The compilers will need to tell the parser which keywords denote headers ("material", "overlay", "technique") and which are properties ("lighting", "diffuse"). I can then dynamically generate the right parser for the job on-the-fly. The technique I think I'll use will even show performance improvements over brute-force matching a list of keywords.
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
Praetor wrote:
More testing reveals that I will probably have to create a mechanism for passing in the header and property keywords to the parser. Spirit is nice in that it contains dynamic parsing capabilities. The compilers will need to tell the parser which keywords denote headers ("material", "overlay", "technique") and which are properties ("lighting", "diffuse"). I can then dynamically generate the right parser for the job on-the-fly. The technique I think I'll use will even show performance improvements over brute-force matching a list of keywords.

Token attributes? Detected and initialised by the lexer, they seem the perfect candidate for that.
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
Possibly solved. No need for the compiler to pass down into the parser what is expected to be property tokens either. The parser can handle it just fine. I haven't completed my unit testing yet, but I'll commit what I have for the parser so far and upload the new version of the tests.
Basically, I got around the ambiguities in the grammar by using a sneaky trick in spirit to force a look-ahead. If the parser looks ahead and sees this kind of construction:
Code: Select all
type{...
type Name{...
type : Name{...
It uses the rule governing what I'm calling "script objects" (materials, particle systems, the items in a script which have properties and possibly children). For parsing the properties I had to switch into character-level parsing (from phrase-level) and explicitly specify where and what whitespace is after each element. Here's a grammar for the property names:
Code: Select all
property_name ::= A-Za-z (A-Za-z0-9_)*
The property values can be like the names (there are a few more characters allowed like '/' and '.'), a variable ('$' followed by the same grammar as property names), or a number (either integer or floating point).
Here's a snippet of one of the tests that mixes these "objects" and properties:
Code: Select all
type Name1{
    property1 value1 value2
    type2{property2 1.0 0 0;}
    type3{}
}
Notice the ';' on property2. It is NOT required. The '}' is successfully recognized as an end-of-property character, but the ';' is acceptable too. You may get used to adding ';' to the end of properties in new scripts you write, since OgreScript is allegedly inspired by C syntax.
So, that's where I currently am. The parsing so far is passing all tests and creating very clean abstract syntax trees. I've been thinking on and off about the kinds of transformations I want to do with those trees to handle variables and imports and I think I have a handle on it.
The thing I am now considering is a kind of special syntax for imports. Imports are totally new to this version of the compiler so I feel I have a little more slack. What about concepts of packages? In Java if you want to import something specific you do this:
Code: Select all
import somewhere.something.Class
Can't we do the same thing? Say my material wants to inherit from a material called Base, in a script called base_materials.material.
Code: Select all
import base_materials.Base
Or, if you want everything from that script:
Code: Select all
import base_materials.*
So, yeah, that's hard-coding that material scripts will now actually have to be *.material. And what about .program files that some exporters create? I don't know yet. Does anyone have any thoughts on this sort of pseudo-package system? It seems to me like a good way of being explicit about dependencies, while keeping things clean and organized. Also, remember that there will be a listener interface to hook into the importing mechanism of the compiler. It will give you the "path" of the import requested, and you return a DataStreamPtr. So, if there is some custom import processing to do, that interface should give you the flexibility to achieve it.
The final thing I want to mention is the addition of a new keyword that works hand-in-hand with the new import mechanics and the new style of inheritance. Let's consider this production:
Code: Select all
material Base
{
    ...
}
material Child : Base
{
    ...
}
Currently, Base is actually created in the MaterialManager. Child then gets the actual material "Base" and does a material copy on it. What if you only wanted Base to be a Base, and not an actual, usable material? And what about all those objects which can't do what materials can, like techniques and passes? I propose a keyword for top-level objects only (those that have no parent and sit at the top scope of a script) which tells the underlying compilers not to actually compile them to real objects. The material copy will not happen any more. All inheritance is done through transformations of the AST. I understand "template" is already a keyword for particle systems, so how about "abstract"?
Code: Select all
abstract technique BaseTechnique{...}
material Test
{
    technique : BaseTechnique{...}
}
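The property-name and property-value rules described earlier in this post can be approximated with regular expressions. The Python sketch below is only an approximation of the spirit grammar: the exact character sets, the optional sign on numbers, and the category names returned by the hypothetical `classify` function are assumptions.

```python
import re

# property_name ::= letter (letter | digit | '_')*
NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# A variable is '$' followed by the same grammar as a property name.
VARIABLE = re.compile(r"\$[A-Za-z][A-Za-z0-9_]*")
# Integer or floating point; the optional sign is an assumption.
NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")
# Values may also contain a few extra characters such as '/' and '.'.
WORD = re.compile(r"[A-Za-z][A-Za-z0-9_./]*")


def classify(token):
    """Return a rough category for a single property token."""
    if VARIABLE.fullmatch(token):
        return "variable"
    if NUMBER.fullmatch(token):
        return "number"
    if NAME.fullmatch(token):
        return "name"
    if WORD.fullmatch(token):
        return "word"
    return "invalid"


tokens = ["diffuse", "$red", "1.0", "0", "textures/rock.png", "9lives"]
kinds = [classify(token) for token in tokens]
```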
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
Praetor wrote:
Code: Select all
type Name1{
    property1 value1 value2
    type2{property2 1.0 0 0;}
    type3{}
}
Notice the ';' on property2. It is NOT required. The '}' is successfully recognized as an end-of-property character, but the ';' is acceptable too. You may get used to adding ';' to the end of properties in new scripts you write, since OgreScript is allegedly inspired by C syntax.

The problem is not getting used to adding ';' at the end. The problem is that existing tools and resources won't accept it... Isn't there a different way to do that? I'm sure there is one.

Praetor wrote:
The thing I am now considering is a kind of special syntax for imports. Imports are totally new to this version of the compiler so I feel I have a little more slack. What about concepts of packages? In Java if you want to import something specific you do this:
Code: Select all
import somewhere.something.Class
Can't we do the same thing? Say my material wants to inherit from a material called Base, in a script called base_materials.material.
Code: Select all
import base_materials.Base
Or, if you want everything from that script:
Code: Select all
import base_materials.*
So, yeah, that's hard-coding that material scripts will now actually have to be *.material. And what about .program files that some exporters create? I don't know yet. Does anyone have any thoughts on this sort of pseudo-package system? It seems to me like a good way of being explicit about dependencies, while keeping things clean and organized. Also, remember that there will be a listener interface to hook into the importing mechanism of the compiler. It will give you the "path" of the import requested, and you return a DataStreamPtr. So, if there is some custom import processing to do, that interface should give you the flexibility to achieve it.

Hmm... I find it an interesting topic, though it will probably kill the flexibility Ogre has regarding resources. Some people prefer to keep all the resources in a single place, others prefer to group them by resource type... Not to mention that some people would have to tweak existing resources to make them work with their own.
I would suggest explicitly defining a "namespace" declaration for this, though then I wouldn't see the point.

Praetor wrote:
The final thing I want to mention is the addition of a new keyword that works hand-in-hand with the new import mechanics and the new style of inheritance. Let's consider this production:
Code: Select all
material Base
{
    ...
}
material Child : Base
{
    ...
}
Currently, Base is actually created in the MaterialManager. Child then gets the actual material "Base" and does a material copy on it. What if you only wanted Base to be a Base, and not an actual, usable material? And what about all those objects which can't do what materials can, like techniques and passes? I propose a keyword for top-level objects only (those that have no parent and sit at the top scope of a script) which tells the underlying compilers not to actually compile them to real objects. The material copy will not happen any more. All inheritance is done through transformations of the AST. I understand "template" is already a keyword for particle systems, so how about "abstract"?
Code: Select all
abstract technique BaseTechnique{...}
material Test
{
    technique : BaseTechnique{...}
}

Yeah, abstract rules (literally).
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
In that context (or in any valid context currently) the ';' is not required. I can at any time easily revoke it. I wasn't understanding your point earlier, but I think I get it now. I never wanted to make the ';' required in the current iteration, but any sort of transitional period could break current tools. Something to think about.
I'm struggling with the import thing. Right now, it seems to me that using it to make dependencies between scripts explicit is a good idea. Remember, explicit over implicit. This will solve the problem of trying to ensure one script gets compiled before another. With explicit dependencies it will be handled for you. The only question is how. The rigidity of the Java-style packages and imports is sometimes cited as a weakness. Perhaps we can utilize the ':' character, which already has some special significance for inheritance here.
Code: Select all
import base_materials.material:Base
This will cause the compiler to search through the resource groups for base_materials.material. When it is found, that script will be turned into an AST in the normal manner. The top-level object called "Base" will be extracted and then inserted into the original script (the one doing the importing). Once normal compilation resumes, it will be as if Base were declared within the same file.
For now, I'll see how easily I can extend the grammar to accept "abstract" as a modifier for top-level objects. It will still be a far simpler grammar than some I've seen.
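A minimal Python sketch of the import resolution described above, under some loud assumptions: `parse_import`, `resolve_import` and `fake_loader` are hypothetical names, top-level objects are modelled as plain dictionaries, and `fake_loader` stands in for the resource-group search, the listener hook and the parser.

```python
def parse_import(spec):
    """Split 'script:Object' at the last ':' into (script, object)."""
    script, sep, target = spec.rpartition(":")
    if not sep or not script or not target:
        raise ValueError("expected 'script:Object', got " + repr(spec))
    return script, target


def resolve_import(spec, load_ast):
    """Load the named script and return the requested top-level object."""
    script, target = parse_import(spec)
    for node in load_ast(script):
        if node["name"] == target:
            return node
    raise LookupError(target + " not found in " + script)


def fake_loader(script):
    """Stand-in for locating, loading and parsing a script."""
    scripts = {
        "base_materials.material": [
            {"type": "material", "name": "Base", "children": []},
            {"type": "material", "name": "Other", "children": []},
        ]
    }
    return scripts[script]


script_name, object_name = parse_import("base_materials.material:Base")
imported = resolve_import("base_materials.material:Base", fake_loader)
```

The returned node would then be spliced into the importing script's AST in place of the import statement.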
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
Praetor wrote:
I'm struggling with the import thing. Right now, it seems to me that using it to make dependencies between scripts explicit is a good idea. Remember, explicit over implicit. This will solve the problem of trying to ensure one script gets compiled before another. With explicit dependencies it will be handled for you. The only question is how. The rigidity of the Java-style packages and imports is sometimes cited as a weakness. Perhaps we can utilize the ':' character, which already has some special significance for inheritance here.
Code: Select all
import base_materials.material:Base
This will cause the compiler to search through the resource groups for base_materials.material. When it is found, that script will be turned into an AST in the normal manner. The top-level object called "Base" will be extracted and then inserted into the original script (the one doing the importing). Once normal compilation resumes, it will be as if Base were declared within the same file.

Ah, I get your point here. Seems interesting, though you will be parsing the same script many times when looking for base objects... The clearest case would be importing, or using as a base, an object defined later. For instance:
Code: Select all
import thisFile.material:BaseMaterial
material ExtendedMaterial {
...
}
material BaseMaterial {
...
}

-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
Anything in the same file will be available for use, no matter if it's declared before or after. Remember it is an AST here, so a traversal to find a top-level parent object is a relatively cheap operation. Order of declaration is no problem.
On the front of loading and parsing multiple times, I thought of that. For now it could be a naive algorithm, loading and parsing each time. Of course, we can then play with the memory budget and think about caching ASTs for imported scripts. If the same script is imported again, we'll get a significant speed boost. I'm so far confident (with no data to back this up) that the AST for an average script has trivial memory size. Each node has a handful of integers and pointers, and one list which points to that node's children. Not exactly a behemoth. But we'll only truly know after the unit tests are done and some real-world examples are created.
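The caching idea could look something like this Python sketch. `AstCache` and `fake_parse` are hypothetical, and the counter exists only to show that the second request does not trigger another parse.

```python
class AstCache:
    """Parse each imported script at most once and reuse the result."""

    def __init__(self, parse_script):
        self._parse_script = parse_script
        self._cache = {}
        self.parse_count = 0

    def get(self, script_name):
        if script_name not in self._cache:
            self.parse_count += 1
            self._cache[script_name] = self._parse_script(script_name)
        return self._cache[script_name]


def fake_parse(script_name):
    """Stand-in for loading a script and building its AST."""
    return {"script": script_name, "children": []}


cache = AstCache(fake_parse)
first = cache.get("base_materials.material")
second = cache.get("base_materials.material")
```

One caveat with this approach: callers share the cached tree, so any pass that modifies an imported AST should work on a copy.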
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
I had a relaxing few days completely away from work, but now I'm back.
I've tightened up the test cases, which I will upload. I'm also prepared to commit the parser, which I'm doing now. The final test case I want to write tests error handling. I'm very, very pleased with the error handling and reporting I've managed. I can pinpoint line and column numbers in the input, and output specific error codes to identify the error type (such as PE_OPENBRACEEXPECTED or PE_NEWLINEEXPECTED).
I've also added the "abstract" keyword to the grammar. It went in without any problems. This new keyword only works on top-level objects.
Code: Select all
material Test
{
abstract technique{}
}
That does not work. It doesn't make any sense.
Code: Select all
abstract technique Test{}
That, however, at the top level, works great. As in other languages, abstract means the object is meant to be overridden. Expect to see it pop up a lot in resource libraries (like a collection of reusable particle system components, or material components).
So far, the AST generation is going great. I'm calling it finished for now. It appears to handle all valid productions and reject all invalid ones, and the ASTs it makes are logical and well-organized. I'm going to move on to the ScriptCompiler base class. The first step is to devise a number of passes which transform and pre-process the AST before compiler subclasses do the translation into the final form. For now, I'll take the naive approach, and each transformation will receive its own pass over the AST. No optimizations. They can always be added later when the need and opportunity arise.
1. Process top-level import statements (expand imports into the actual AST representing the imported script)
2. Expand the AST of base objects into the subtree of the objects which derive from them.
3. Do variable replacement in the properties
[EDIT] Tests are uploaded to http://www.rit.edu/~bjj1478/files/SoCTests.zip
Last edited by Praetor on Mon May 28, 2007 8:20 pm, edited 1 time in total.
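To make the three naive passes concrete, here is a small Python sketch, one function per pass. Everything in it is an assumption for illustration: the `Obj` structure, the function names, the `$name` variable spelling, and the choice to leave abstract objects out of the final result are not taken from the actual ScriptCompiler.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Obj:
    """Hypothetical top-level object: name, optional parent, abstract flag, properties."""
    name: str
    parent: Optional[str] = None
    abstract: bool = False
    props: Dict[str, str] = field(default_factory=dict)


def expand_imports(objs: List[Obj], library: Dict[str, Obj],
                   imports: List[str]) -> List[Obj]:
    """Pass 1: splice each imported object in front of the script's own objects."""
    return [library[name] for name in imports] + objs


def expand_inheritance(objs: List[Obj]) -> List[Obj]:
    """Pass 2: copy inherited properties into each derived object (child wins)."""
    by_name = {o.name: o for o in objs}  # index first: declaration order is irrelevant

    def merged(o: Obj) -> Dict[str, str]:
        base = merged(by_name[o.parent]) if o.parent else {}
        return {**base, **o.props}

    return [Obj(o.name, o.parent, o.abstract, merged(o)) for o in objs]


def replace_variables(objs: List[Obj], variables: Dict[str, str]) -> List[Obj]:
    """Pass 3: substitute $name property values from a symbol table."""
    def sub(value: str) -> str:
        return variables[value[1:]] if value.startswith("$") else value

    return [Obj(o.name, o.parent, o.abstract,
                {k: sub(v) for k, v in o.props.items()}) for o in objs]


library = {"Base": Obj("Base", abstract=True,
                       props={"lighting": "on", "ambient": "$amb"})}
script = [Obj("Extended", parent="Base", props={"lighting": "off"})]

objs = expand_imports(script, library, ["Base"])
objs = expand_inheritance(objs)
objs = replace_variables(objs, {"amb": "0.2 0.2 0.2"})
concrete = {o.name: o.props for o in objs if not o.abstract}
```

Each pass walks the whole list once and returns a new one, which matches the "no optimizations" approach. The sketch does no cycle detection on parents and uses a flat symbol table rather than a scoped one.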
-
- OGRE Retired Moderator
- Posts: 4011
- Joined: Fri Sep 19, 2003 6:28 pm
- Location: Burgos, Spain
- x 2
-
- Gnome
- Posts: 393
- Joined: Thu Dec 08, 2005 9:57 pm
- x 1
This looks really good! I'm sure this will make the scripting system even better than it is now.
Is there a plan to include a preprocessor for #define / #ifdef style macros? I know this is not of much use for material/overlay scripts, but I would sure be glad for it.
I use the Compiler2Pass to define data structures dynamically, so it would be perfect to just do an "include" for some common definitions. A macro system would improve this even more, such that one could do...
Code: Select all
struct A {
uint32 a
float b
#ifdef Version_2
bool32 is_on
#endif
}
...to implement just the differences between two versions of the data structure definitions.
But I know this is a different kind of situation than most scripts encounter.
Anyway, please keep up the great work!
Edit: Oh. I thought that the "import" keyword was already accepted for implementation. Oh well, doesn't matter. I should read more carefully.
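For what it's worth, the kind of #ifdef filtering asked about here can be sketched in a few lines of Python. This is a hypothetical stand-alone filter (the `preprocess` function is made up for illustration, handles only #ifdef/#endif, and is not part of Compiler2Pass or any planned Ogre feature):

```python
from typing import Iterable, List, Set


def preprocess(lines: Iterable[str], defined: Set[str]) -> List[str]:
    """Keep or drop lines between #ifdef NAME and #endif (nesting ok, no #else)."""
    out: List[str] = []
    stack: List[bool] = []  # one entry per open #ifdef: is that block active?
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#ifdef"):
            name = stripped[len("#ifdef"):].strip()
            stack.append(name in defined)
        elif stripped == "#endif":
            if not stack:
                raise ValueError("#endif without matching #ifdef")
            stack.pop()
        elif all(stack):  # emit only if every enclosing block is active
            out.append(line)
    if stack:
        raise ValueError("unterminated #ifdef")
    return out


source = [
    "struct A {",
    "uint32 a",
    "float b",
    "#ifdef Version_2",
    "bool32 is_on",
    "#endif",
    "}",
]
v1 = preprocess(source, set())
v2 = preprocess(source, {"Version_2"})
```

With nothing defined, `v1` is the struct without `is_on`; with `Version_2` defined, `v2` includes it. The directive lines themselves never reach the parser in either case.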

-
- OGRE Retired Team Member
- Posts: 3335
- Joined: Tue Jun 21, 2005 8:26 pm
- Location: Rochester, New York, US
- x 3
Yeah, the import keyword is going in, I'm just trying to find the best way to implement it (in terms of the scripting syntax).
Right now, there are no plans to make a preprocessor. That doesn't mean it will never happen, just not in the first release. As we move towards a Boost dependency we actually gain access to something interesting: wave. It is a fully-functional C-style preprocessor. It's possible that this could be plugged into the system and run over the input before any further processing is done. In order to keep a handle on the project though, I need to narrow the scope in certain areas, at least for the summer.
With regards to that import keyword, I had some more ideas. Using a new-style syntax like this would be great:
Code: Select all
import BaseMaterial from bases.material
import * from bases.material
Or, something more C/C++-like would be:
Code: Select all
import bases.material::BaseMaterial
import bases.material::*
Personally, I like the first form better, but I suppose the second is more inspired by C syntax, which was the basis for this language. Any votes either way here?
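Whichever form wins, both carry the same two pieces of information: the script and the target object (or `*`). A rough Python sketch that accepts either spelling, with the regexes and the `parse_import` name being assumptions rather than the real grammar:

```python
import re
from typing import Tuple

# Form 1: import <target> from <script>
FROM_FORM = re.compile(r"^import\s+(\S+)\s+from\s+(\S+)$")
# Form 2: import <script>::<target>
SCOPE_FORM = re.compile(r"^import\s+(\S+?)::(\S+)$")


def parse_import(line: str) -> Tuple[str, str]:
    """Return (script, target) for either proposed syntax; target may be "*"."""
    line = line.strip()
    m = FROM_FORM.match(line)
    if m:
        target, script = m.groups()
        return script, target
    m = SCOPE_FORM.match(line)
    if m:
        script, target = m.groups()
        return script, target
    raise ValueError(f"not an import statement: {line!r}")


a = parse_import("import BaseMaterial from bases.material")
b = parse_import("import bases.material::BaseMaterial")
star = parse_import("import * from bases.material")
```

Here `a` and `b` come out identical, so the choice between the two forms is purely about how the script reads, not about what the compiler can do with it.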
-
- OGRE Retired Team Member
- Posts: 19269
- Joined: Sun Oct 06, 2002 11:19 pm
- Location: Guernsey, Channel Islands
- x 66
I like the former better myself, although I'd use either without much issue.
BTW I'm occasionally adding new features to Shoggoth (like polygon_mode_overrideable and normalise_normals), I'm not sure how you want to handle that. You can perhaps apply patches from the CVS mailing list to your branch if you want to pull them in.