Small Protocol Buffer test

jacmoe
OGRE Retired Moderator
Posts: 20570
Joined: Thu Jan 22, 2004 10:13 am
Location: Denmark
x 179

Small Protocol Buffer test

Post by jacmoe »

I've been fooling around with Google Protocol Buffers a bit, to see if it's usable as an alternative to XML for scene loading. :)

This is my test XML file:

Code:

<scene formatVersion="1.0.0" author="PnP Terrain Creator">
    <nodes>
        <node>
            <position x="10.337000" y="110.991287" z="1009.718018" />
            <quaternion qw="1.000000" qx="0.000000" qy="0.000000" qz="0.000000" />
            <scale x="0.100000" y="0.100000" z="0.100000" />
            <entity name="ogrehead" meshfile="ogrehead.mesh" />
            <userData>
                <property type="BOOL" name="static" data="false" />
                <property type="STRING" name="script" data="" />
                <property type="STRING" name="name" data="thename" />
            </userData>
        </node>
        <node>
            <position x="162.940994" y="92.747040" z="981.447021" />
            <quaternion qw="-0.377235" qx="0.000000" qy="-0.926118" qz="0.000000" />
            <scale x="0.100000" y="0.100000" z="0.100000" />
            <entity name="ogrehead" meshfile="ogrehead.mesh" />
            <userData>
                <property type="BOOL" name="static" data="false" />
                <property type="STRING" name="name" data="second head" />
            </userData>
        </node>
        <node>
            <position x="57.161037" y="62.299145" z="853.399048" />
            <quaternion qw="0.980982" qx="0.194097" qy="0.000000" qz="0.000000" />
            <scale x="0.100000" y="0.100000" z="0.100000" />
            <entity name="cube" meshfile="cube.mesh" />
            <userData>
                <property type="BOOL" name="static" data="false" />
            </userData>
        </node>
        <node>
            <position x="527.401001" y="101.923523" z="835.505005" />
            <quaternion qw="-0.941894" qx="0.010783" qy="0.335727" qz="-0.002666" />
            <scale x="0.100000" y="0.100000" z="0.100000" />
            <entity name="tudorhouse" meshfile="tudorhouse.mesh" />
            <userData>
                <property type="BOOL" name="static" data="true" />
                <property type="STRING" name="script" data="tudorhouse.script" />
                <property type="STRING" name="name" data="The Tudorhouse" />
            </userData>
        </node>
        <node>
            <position x="510.214996" y="46.337009" z="876.419983" />
            <quaternion qw="-0.630757" qx="0.000000" qy="-0.775980" qz="0.000000" />
            <scale x="0.100000" y="0.100000" z="0.100000" />
            <entity name="ninja" meshfile="ninja.mesh" />
            <userData>
                <property type="BOOL" name="static" data="false" />
                <property type="STRING" name="script" data="ninja1.script" />
                <property type="STRING" name="name" data="FabNinja" />
            </userData>
        </node>
        <node>
            <position x="491.822998" y="49.951878" z="886.934021" />
            <quaternion qw="1.000000" qx="0.000000" qy="0.000000" qz="0.000000" />
            <scale x="0.100000" y="0.100000" z="0.100000" />
            <entity name="Barrel" meshfile="Barrel.mesh" />
            <userData>
                <property type="BOOL" name="static" data="false" />
                <property type="STRING" name="script" data="barrel.script" />
                <property type="STRING" name="name" data="ninjabarrel" />
            </userData>
        </node>
    </nodes>
</scene>
And this is what my proto file looks like:

Code:

package dotproto;

message Vec3f {
	optional float x = 1 [default = 0.0];
	optional float y = 2 [default = 0.0];
	optional float z = 3 [default = 0.0];
}

message Quat4f {
	optional float w = 1 [default = 0.0];
	optional float x = 2 [default = 0.0];
	optional float y = 3 [default = 0.0];
	optional float z = 4 [default = 0.0];
}

message Property {
	optional string type = 1 [default = ""];
	optional string name = 2 [default = ""];
	optional string data = 3 [default = ""];
}

message Entity {
	optional string name = 1 [default = ""];
	optional string meshfile = 2 [default = ""];
}

message Node {
	optional string name = 1;
	optional Vec3f position = 2;
	optional Quat4f quaternion = 3;
	optional Vec3f scale = 4;
	optional Entity entity = 5;
	repeated Property property = 6;
}

message Scene {
	repeated Node node = 1;
}
All I had to do was run the protoc compiler on the proto file, like this:
protoc --cpp_out=. dotscene.proto

And it gave me two files: dotscene.pb.cc and dotscene.pb.h.

Now, all I had to do was add those files to a project, link to Google Protocol Buffers and tell the project where the Google PB headers are.

This is a small test application, which loads a scene XML file and converts it to a protocol buffer:

Code:

#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <google/protobuf/text_format.h>
#include "dotscene.pb.h" // generated by protoc from dotscene.proto

#include <string>
#include <sstream>
#include "tinyxml/tinyxml.h"

template <class T>
bool from_string(T& t, 
                 const std::string& s, 
                 std::ios_base& (*f)(std::ios_base&))
{
  std::istringstream iss(s);
  return !(iss >> f >> t).fail();
}


int _tmain(int argc, _TCHAR* argv[])
{
	GOOGLE_PROTOBUF_VERIFY_VERSION;

	// a dotproto scene instance
	dotproto::Scene scene;

	// open xml document
	TiXmlDocument xml;
	if (!xml.LoadFile("testexport.xml"))
	{
		return 1; // non-zero exit code signals failure
	}
	// get scene base node
	TiXmlElement* pSceneRootNode = xml.FirstChildElement("scene");
	if (!pSceneRootNode)
	{
		return 1;
	}

	// loop through nodes
	TiXmlElement* pNodesNode = pSceneRootNode->FirstChildElement("nodes");
	if (!pNodesNode)
	{
		return 1;
	}
	for (TiXmlElement* pNode = pNodesNode->FirstChildElement("node");
		pNode != 0; pNode = pNode->NextSiblingElement("node"))
	{

		// add a node to the scene
		dotproto::Node* node = scene.add_node(); 

		// read position
		TiXmlElement* pPosition = pNode->FirstChildElement("position");
		float pos_x = 0.0f, pos_y = 0.0f, pos_z = 0.0f;
		from_string<float>(pos_x, pPosition->Attribute("x"), std::dec);
		from_string<float>(pos_y, pPosition->Attribute("y"), std::dec);
		from_string<float>(pos_z, pPosition->Attribute("z"), std::dec);

		// write position
		dotproto::Vec3f* pos = node->mutable_position();
		pos->set_x(pos_x);
		pos->set_y(pos_y);
		pos->set_z(pos_z);

		// read orientation
		TiXmlElement* pQuaternion = pNode->FirstChildElement("quaternion");
		float qw = 0.0f, qx = 0.0f, qy = 0.0f, qz = 0.0f;
		from_string<float>(qw, pQuaternion->Attribute("qw"), std::dec);
		from_string<float>(qx, pQuaternion->Attribute("qx"), std::dec);
		from_string<float>(qy, pQuaternion->Attribute("qy"), std::dec);
		from_string<float>(qz, pQuaternion->Attribute("qz"), std::dec);

		// write orientation
		dotproto::Quat4f* quat = node->mutable_quaternion();
		quat->set_w(qw);
		quat->set_x(qx);
		quat->set_y(qy);
		quat->set_z(qz);

		// read scale
		TiXmlElement* pScale = pNode->FirstChildElement("scale");
		float sx = 0.0f, sy = 0.0f, sz = 0.0f;
		from_string<float>(sx, pScale->Attribute("x"), std::dec);
		from_string<float>(sy, pScale->Attribute("y"), std::dec);
		from_string<float>(sz, pScale->Attribute("z"), std::dec);
		
		// write scale
		dotproto::Vec3f* scale = node->mutable_scale();
		scale->set_x(sx);
		scale->set_y(sy);
		scale->set_z(sz);

		// read and write entity
		TiXmlElement* pEntity = pNode->FirstChildElement("entity");
		dotproto::Entity* ent = node->mutable_entity();
		ent->set_name(pEntity->Attribute("name"));
		ent->set_meshfile(pEntity->Attribute("meshfile"));

		// See if there's any userData
		TiXmlElement* pUserdataNode = pNode->FirstChildElement("userData");
		if(pUserdataNode)
		{
			// Loop the properties
			for (TiXmlElement* pProperty = pUserdataNode->FirstChildElement();
				pProperty != 0; pProperty = pProperty->NextSiblingElement())
			{
				// read and write property
				dotproto::Property* prop1 = node->add_property();
				prop1->set_type(pProperty->Attribute("type"));
				prop1->set_name(pProperty->Attribute("name"));
				prop1->set_data(pProperty->Attribute("data"));
				
			}
		}
	}

	// Are we ready to write a protocol buffer?
	if(scene.IsInitialized())
	{
		// write protocol buffer in standard binary format.
		std::string path = "./" + std::string("test.pb");
		std::fstream output(path.c_str(), std::ios::out | std::ios::binary );
		scene.SerializeToOstream(&output);

		// write a protocol buffer in human readable text format.
		std::string theproto;
		google::protobuf::TextFormat::PrintToString(scene, &theproto);
		std::string thepath = "./" + std::string("test.txt");
		std::fstream textoutput(thepath.c_str(), std::ios::out | std::ios::binary );
		textoutput << theproto;
	}


	// Open protocol buffer for reading.
	std::string inpath = "./" + std::string("test.pb");
	std::fstream input(inpath.c_str(), std::ios::in | std::ios::binary );
	// Parse it
	scene.ParseFromIstream(&input);

	// Print the protocol buffer to the screen.
	
	// Loop through nodes in scene
	for (int i = 0; i < scene.node_size(); i++)
	{
		// grab a node
		const dotproto::Node& out_node = scene.node(i);
		// pretty print node values
		std::cout << "Node Position: " << out_node.position().x() << "," << out_node.position().y() << "," << out_node.position().z() << std::endl;
		std::cout << "Node Orientation: " << out_node.quaternion().w() << "," << out_node.quaternion().x() << "," << out_node.quaternion().y() << "," << out_node.quaternion().z() << std::endl;
		std::cout << "Node Scale: " << out_node.scale().x() << "," << out_node.scale().y() << "," << out_node.scale().z() << std::endl;
		std::cout << "Node Entity name: " << out_node.entity().name() << std::endl;
		std::cout << "Node Entity meshfile: " << out_node.entity().meshfile() << std::endl;

		// Loop through userdata
		for (int j = 0; j < out_node.property_size(); j++)
		{
			// grab a property
			const dotproto::Property& out_property = out_node.property(j);
			// pretty print property values
			std::cout << "Node Userdata type: " << out_property.type() << std::endl;
			std::cout << "Node Userdata name: " << out_property.name() << std::endl;
			std::cout << "Node Userdata data: " << out_property.data() << std::endl;
		}
	}		

	return 0;
}
There's more code than actually needed, because it also saves a human-readable copy of the buffer and loads the binary file back in for pretty-printing to the screen.
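
One caveat about the converter above: as far as I know, TinyXML's `Attribute()` returns a null pointer when an attribute is missing, and `from_string` would then build a `std::string` from null, which is undefined behaviour. A minimal sketch of a more defensive helper follows; `parse_float` and its `fallback` parameter are names invented for this illustration, not part of TinyXML or protobuf.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: parse a float from a possibly-null C string.
// Returns `fallback` when the pointer is null or the text is not a number.
float parse_float(const char* text, float fallback)
{
    if (text == nullptr)
        return fallback;

    std::istringstream iss(text);
    float value = 0.0f;
    if ((iss >> value).fail())
        return fallback;
    return value;
}
```

It could stand in for each `from_string<float>(...)` call, for example `pos->set_x(parse_float(pPosition->Attribute("x"), 0.0f));`.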

I think it's a nice alternative to XML.

What do you think? :)
/* Less noise. More signal. */
Ogitor Scenebuilder - powered by Ogre, presented by Qt, fueled by Passion.
OgreAddons - the Ogre code suppository.
Kojack
OGRE Moderator
Posts: 7157
Joined: Sun Jan 25, 2004 7:35 am
Location: Brisbane, Australia
x 534

Re: Small Protocol Buffer test

Post by Kojack »

Interesting.
What does the actual data file look like?

Re: Small Protocol Buffer test

Post by jacmoe »

The input XML is 3.28 KB, and the binary protocol buffer (pb) is 900 bytes.
The human readable text output is 2.56 KB and looks like this:

Code:

node {
  position {
    x: 10.337
    y: 110.99129
    z: 1009.718
  }
  quaternion {
    w: 1
    x: 0
    y: 0
    z: 0
  }
  scale {
    x: 0.1
    y: 0.1
    z: 0.1
  }
  entity {
    name: "ogrehead"
    meshfile: "ogrehead.mesh"
  }
  property {
    type: "BOOL"
    name: "static"
    data: "false"
  }
  property {
    type: "STRING"
    name: "script"
    data: ""
  }
  property {
    type: "STRING"
    name: "name"
    data: "thename"
  }
}
node {
  position {
    x: 162.941
    y: 92.74704
    z: 981.447
  }
  quaternion {
    w: -0.377235
    x: 0
    y: -0.926118
    z: 0
  }
  scale {
    x: 0.1
    y: 0.1
    z: 0.1
  }
  entity {
    name: "ogrehead"
    meshfile: "ogrehead.mesh"
  }
  property {
    type: "BOOL"
    name: "static"
    data: "false"
  }
  property {
    type: "STRING"
    name: "name"
    data: "second head"
  }
}
node {
  position {
    x: 57.161037
    y: 62.299145
    z: 853.39905
  }
  quaternion {
    w: 0.980982
    x: 0.194097
    y: 0
    z: 0
  }
  scale {
    x: 0.1
    y: 0.1
    z: 0.1
  }
  entity {
    name: "cube"
    meshfile: "cube.mesh"
  }
  property {
    type: "BOOL"
    name: "static"
    data: "false"
  }
}
node {
  position {
    x: 527.401
    y: 101.92352
    z: 835.505
  }
  quaternion {
    w: -0.941894
    x: 0.010783
    y: 0.335727
    z: -0.002666
  }
  scale {
    x: 0.1
    y: 0.1
    z: 0.1
  }
  entity {
    name: "tudorhouse"
    meshfile: "tudorhouse.mesh"
  }
  property {
    type: "BOOL"
    name: "static"
    data: "true"
  }
  property {
    type: "STRING"
    name: "script"
    data: "tudorhouse.script"
  }
  property {
    type: "STRING"
    name: "name"
    data: "The Tudorhouse"
  }
}
node {
  position {
    x: 510.215
    y: 46.337009
    z: 876.42
  }
  quaternion {
    w: -0.630757
    x: 0
    y: -0.77598
    z: 0
  }
  scale {
    x: 0.1
    y: 0.1
    z: 0.1
  }
  entity {
    name: "ninja"
    meshfile: "ninja.mesh"
  }
  property {
    type: "BOOL"
    name: "static"
    data: "false"
  }
  property {
    type: "STRING"
    name: "script"
    data: "ninja1.script"
  }
  property {
    type: "STRING"
    name: "name"
    data: "FabNinja"
  }
}
node {
  position {
    x: 491.823
    y: 49.951878
    z: 886.934
  }
  quaternion {
    w: 1
    x: 0
    y: 0
    z: 0
  }
  scale {
    x: 0.1
    y: 0.1
    z: 0.1
  }
  entity {
    name: "Barrel"
    meshfile: "Barrel.mesh"
  }
  property {
    type: "BOOL"
    name: "static"
    data: "false"
  }
  property {
    type: "STRING"
    name: "script"
    data: "barrel.script"
  }
  property {
    type: "STRING"
    name: "name"
    data: "ninjatest"
  }
}
Should I post the binary file? It will look weird.

I still have to test it against a full-blown scene file, though. Google claims that protocol buffers are 20 to 100 times faster to parse than XML.
The size is also a boon, if you have a lot of data files.
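
To see where the size saving comes from, here is a hand-written sketch of how a single `float` field ends up on the wire, assuming the documented protobuf encoding (one tag byte holding the field number and wire type 5, then four little-endian bytes). `encode_float_field` is a made-up name for illustration; real code would simply call the generated `SerializeToOstream`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Encode one float field by hand: a one-byte tag (field number << 3 | 5),
// followed by the four bytes of the IEEE 754 value, least significant first.
// Field numbers above 15 need a multi-byte tag, which this sketch rejects.
std::string encode_float_field(unsigned field_number, float value)
{
    assert(field_number >= 1 && field_number <= 15);
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));

    std::string out;
    out.push_back(static_cast<char>((field_number << 3) | 5u));
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<char>((bits >> shift) & 0xFFu));
    return out;
}
```

The XML attribute `z="1009.718018"` takes 15 bytes of text, while `encode_float_field(3, 1009.718018f)` produces 5 bytes. Strings such as mesh names are stored byte for byte in both formats, so they shrink far less.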

Re: Small Protocol Buffer test

Post by jacmoe »

The compiler generated header file is 1001 lines long, and the corresponding implementation file is 542 lines.
Google saved me from a lot of work. :)

And the nice thing is that my own code remains the same regardless of updates to the proto file, provided that I only add to it.
Like adding oFusion tags or OgreMAX tags.
A proto file can include other proto files.
A perfect candidate for a separate proto file would be the math "messages".

And application specific tags.

I would call it SWIG for data files. :wink:
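
The "only add to it" rule works because, as I understand the wire format, every field is written as a tag plus a value whose length can be worked out from the tag alone, so an older reader can step over fields it does not know. The following is a hand-rolled sketch of that idea using only the standard library, not the code protoc generates; the plain `Vec3f` struct, `read_varint` and `parse_vec3f` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

struct Vec3f { float x = 0.0f; float y = 0.0f; float z = 0.0f; };

// Read a base-128 varint starting at `pos`; returns false on truncated input.
static bool read_varint(const std::string& buf, std::size_t& pos, std::uint64_t& value)
{
    value = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        const std::uint8_t byte = static_cast<std::uint8_t>(buf[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0)
            return true;
    }
    return false;
}

// Parse the three float fields this reader knows (1 = x, 2 = y, 3 = z) and
// skip every other field by its wire type, so newer writers stay readable.
bool parse_vec3f(const std::string& buf, Vec3f& out)
{
    std::size_t pos = 0;
    while (pos < buf.size()) {
        std::uint64_t tag = 0;
        if (!read_varint(buf, pos, tag))
            return false;
        const std::uint64_t field = tag >> 3;
        const unsigned wire_type = static_cast<unsigned>(tag & 7);

        if (wire_type == 5) {                        // 32-bit value
            if (buf.size() - pos < 4) return false;
            std::uint32_t bits = 0;
            for (int i = 0; i < 4; ++i)
                bits |= static_cast<std::uint32_t>(
                            static_cast<std::uint8_t>(buf[pos + i])) << (8 * i);
            pos += 4;
            float value = 0.0f;
            std::memcpy(&value, &bits, sizeof(value));
            if (field == 1) out.x = value;
            else if (field == 2) out.y = value;
            else if (field == 3) out.z = value;
            // any other field number is unknown and silently ignored
        } else if (wire_type == 0) {                 // varint
            std::uint64_t ignored = 0;
            if (!read_varint(buf, pos, ignored)) return false;
        } else if (wire_type == 1) {                 // 64-bit value
            if (buf.size() - pos < 8) return false;
            pos += 8;
        } else if (wire_type == 2) {                 // length-delimited
            std::uint64_t length = 0;
            if (!read_varint(buf, pos, length)) return false;
            if (buf.size() - pos < length) return false;
            pos += static_cast<std::size_t>(length);
        } else {
            return false;                            // groups are not handled here
        }
    }
    return true;
}
```

A buffer written by a newer schema with extra fields 4 and 5 still yields the same x, y and z here, which is the behaviour the generated classes give you for free.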
RedEyeCoder
Gnome
Posts: 344
Joined: Sat Jun 16, 2007 7:29 am
Location: Brisbane, Australia

Re: Small Protocol Buffer test

Post by RedEyeCoder »

I looked into protocol buffers when they were released and really liked what I saw. Unfortunately I have not had time to use them in anything serious. I am very interested in hearing some performance results from you :)
nullsquared
Old One
Posts: 3245
Joined: Tue Apr 24, 2007 8:23 pm
Location: NY, NY, USA
x 11

Re: Small Protocol Buffer test

Post by nullsquared »

I'm planning on using Google's protocol buffer for entity/level formats.
Klaim
Old One
Posts: 2565
Joined: Sun Sep 11, 2005 1:04 am
Location: Paris, France
x 56

Re: Small Protocol Buffer test

Post by Klaim »

Same here, I plan to use it for all game-related information (level design).
I have a specific graph structure that describes the "geography" (not the appropriate word in my case...) of a level. I don't think I'll use protobuf for that, as it doesn't seem suited, but for all other level design data it seems to be the right tool.

Re: Small Protocol Buffer test

Post by Klaim »

Has anyone here tested Protobuf for storing a graph representation?

I'll have to store a graph soon and my other data uses protobuf, but for the graph I'm not sure it's a good idea - gut feeling, not logic...

Re: Small Protocol Buffer test

Post by RedEyeCoder »

I used protocol buffers for a level format in a recent project and it worked flawlessly. The only real issue I had was related to workflow: with XML it is very easy to tweak and change the data, but with protocol buffers I could no longer do that. I imagine it would be possible to convert one to the other as part of a tool chain, but that would have been overkill for what I was working on.

I used the C++ code generator, by the way; I imagine the others are equally easy to use.

In my most recent project I have actually gone back to XML because I don't yet fully understand the data that will need to be stored (so the format changes frequently). Luckily, the application design allows me to switch to protocol buffers later if required.

Re: Small Protocol Buffer test

Post by Klaim »

OK, so the main problem could be that format changes can cost some time...

Hmmm, I think I can live with it. I have to think through the basis of the format for long-term usage anyway, so I'll have to take care of that regardless.

Re: Small Protocol Buffer test

Post by nullsquared »

I don't see why format change would be a problem...?

Re: Small Protocol Buffer test

Post by Klaim »

By "problem" I meant that each format change can take a lot of time (if the project is very big).
It's not a problem, it's a possible constraint.
stealth977
Gnoll
Posts: 638
Joined: Mon Dec 15, 2008 6:14 pm
Location: Istanbul, Turkey
x 42

Re: Small Protocol Buffer test

Post by stealth977 »

Just a little note:

Last month, I made a small test with protocol buffers. The test was for creating a scene format to be used with Ogitor. Its promise to be much faster and more compact inspired me to do the test. Unfortunately, my test results are below (compared to XML; the scene contained 8K objects):
1 - It was actually 40% smaller (I was expecting files some 3x-10x smaller, no way!) (700K PB, 1.3M XML)
2 - It was 50% slower to write (this was unexpected!) (2 seconds XML vs 4 seconds PB)
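
Taking the figures in parentheses at face value, and assuming "1.3M" means 1300K, the two results can be worked out as below. The function names are made up for this sketch.

```cpp
#include <cassert>

// Percentage by which `smaller` undercuts `larger` (same unit for both).
double percent_smaller(double larger, double smaller)
{
    assert(larger > 0.0);
    return (larger - smaller) / larger * 100.0;
}

// How many times longer `slow` takes compared with `fast`.
double slowdown_factor(double fast, double slow)
{
    assert(fast > 0.0);
    return slow / fast;
}
```

With those inputs, `percent_smaller(1300.0, 700.0)` comes to roughly 46, and `slowdown_factor(2.0, 4.0)` is 2, so the PB write took twice as long as the XML one (100% slower if the timings are exact, rather than 50%).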

I honestly didn't check the reading speed, but I don't think anything 50% slower to write could be more than 2x faster to read, maybe not even that much. Also, it has some other problems:
1 - You must have a clear structural format. When it comes to storing a dynamic scene graph with different possible nodes, the resulting PB becomes huge and not that compact.
2 - PB was created to be used with small files, which is why its structures are called messages. It's for sending messages over a network; it is not actually built to be used as a saving system, and anything bigger than a 1MB save size will drastically affect performance (those are not my words, it's written on Google's PB site).
3 - PBs cannot be quickly edited with Notepad.
4 - PBs must be fed with a static set of data, whereas XML can be fed with anything you want. You need to recreate your structure to add any new properties to your messages (though it will still be backward compatible), and then in code you must make the correct assignments to those properties, whereas in XML you can dump as many new properties as you like and read them back just by changing the read/write functions. Also, you can't change the data type of a property: if a property is an int, you can't later decide to make it a float, but you can do whatever suits you in XML.

So, please make sure PBs are the right tool for you before you get deep into something. They are the Xth wonder of the world in some cases, but not in all cases!

That's just my 2 cents,

ismail,
Ismail TARIM
Ogitor - Ogre Scene Editor
WWW:http://www.ogitor.org
Repository: https://bitbucket.org/ogitor

Re: Small Protocol Buffer test

Post by Klaim »

Thank you for your feedback :) That will really help me think more about it before using it.

Re: Small Protocol Buffer test

Post by nullsquared »

stealth977 wrote: Just a little note:

Last month, I made a small test with protocol buffers. The test was for creating a scene format to be used with Ogitor. Its promise to be much faster and more compact inspired me to do the test. Unfortunately, my test results are below (compared to XML; the scene contained 8K objects):
1 - It was actually 40% smaller (I was expecting files some 3x-10x smaller, no way!) (700K PB, 1.3M XML)

It depends on the structure of your XML, of course. Try storing more data, and the XML will grow much bigger than the protocol buffer.

stealth977 wrote: 2 - It was 50% slower to write (this was unexpected!) (2 seconds XML vs 4 seconds PB)

The protocol buffers have some very complex tricks to minimize file size. In previous versions of protocol buffers, optimization defaulted to code size, not speed. In the current version (which is about 11 days old as of this post), optimization is for code speed. You can change the behaviour explicitly using:

Code:

option optimize_for = SPEED;
stealth977 wrote: I honestly didn't check the reading speed, but I don't think anything 50% slower to write could be more than 2x faster to read, maybe not even that much. Also, it has some other problems:

Check with the SPEED optimization, and both writing and reading should be much faster. And just for the record, you cannot estimate the read performance of something from its write performance in a test. They are two separate tasks.

stealth977 wrote: 1 - You must have a clear structural format. When it comes to storing a dynamic scene graph with different possible nodes, the resulting PB becomes huge and not that compact.

Actually, I found it to be much more compact than XML. What you can do is this:

Code:

message cameraInfo
{
    optional float fov = 1;
    // etc.
}

message entityInfo
{
    optional string mesh = 1;
    optional string material = 2;
}

message movableObject
{
    optional cameraInfo camera = 1;
    optional entityInfo entity = 2;
    // etc.
}

message sceneNode
{
    repeated movableObject attachments = 1;
}
I find this a lot easier to use than XML, and I believe it is a lot more compact, as well, as all of the fields are optional.
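
Reading such a message back comes down to checking which optional sub-message is present. As far as I know, protoc generates a `has_<field>()` accessor for every optional field; the structs below are hand-written stand-ins (with their own names) that only mimic that shape so the sketch compiles without protobuf, and `describe` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the classes protoc would generate from the messages above;
// only the has_*/getter shape is mirrored, nothing else.
struct CameraInfo { float fov = 0.0f; };
struct EntityInfo { std::string mesh; std::string material; };

struct MovableObject {
    bool camera_set = false;
    bool entity_set = false;
    CameraInfo camera_value;
    EntityInfo entity_value;

    bool has_camera() const { return camera_set; }
    bool has_entity() const { return entity_set; }
    const CameraInfo& camera() const { return camera_value; }
    const EntityInfo& entity() const { return entity_value; }
};

// Decide what kind of attachment a message describes by checking which
// optional sub-message is present.
std::string describe(const MovableObject& object)
{
    if (object.has_camera())
        return "camera";
    if (object.has_entity())
        return "entity: " + object.entity().mesh;
    return "empty";
}
```

An unset sub-message costs nothing in the serialized file, which is why a message with many optional fields can still stay compact.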
stealth977 wrote: 2 - PB was created to be used with small files, which is why its structures are called messages. It's for sending messages over a network; it is not actually built to be used as a saving system, and anything bigger than a 1MB save size will drastically affect performance (those are not my words, it's written on Google's PB site).

Where do you see that? All I see is this:

overview page wrote: Protocol buffers are now Google's lingua franca for data – at time of writing, there are 48,162 different message types defined in the Google code tree across 12,183 .proto files. They're used both in RPC systems and for persistent storage of data in a variety of storage systems.

stealth977 wrote: 3 - PBs cannot be quickly edited with Notepad.

Actually, they can, if you serialize/parse the text version. But then you lose most of the benefits of protocol buffers, as they are meant to be binary rather than human-editable.

stealth977 wrote: 4 - PBs must be fed with a static set of data, whereas XML can be fed with anything you want. You need to recreate your structure to add any new properties to your messages (though it will still be backward compatible), and then in code you must make the correct assignments to those properties, whereas in XML you can dump as many new properties as you like and read them back just by changing the read/write functions. Also, you can't change the data type of a property: if a property is an int, you can't later decide to make it a float, but you can do whatever suits you in XML.

Right. And this is why it makes C++ code (and, I presume, Java code as well) so much easier. No more parsing properties and converting to the correct type. You save/load using correctly typed fields, no conversions.

stealth977 wrote: So, please make sure PBs are the right tool for you before you get deep into something. They are the Xth wonder of the world in some cases, but not in all cases!

That's just my 2 cents,

ismail,

Just my $0.02 as well. But, please, in future tests, test with the newest version of the library, or at least with optimize_for = SPEED if you're testing speed.

Re: Small Protocol Buffer test

Post by jacmoe »

I am guilty of luring Stealth into the pbuffer idea, because I believed it to be far superior to XML.
Was rather surprised to see that it didn't work out for Ogitor.
It should have been much faster to serialize back and forth, due to it being a statically typed format. Still completely backwards compatible (like the Ogre mesh format).
I still believe it's superior, though. With some doubts.
Would probably choose SQLite instead - or maybe a combination?
Am not so sure. :)

Re: Small Protocol Buffer test

Post by Kojack »

For a human readable and editable data format alternative to xml, don't forget good old Lua. That's what it was originally created for (data files, not game scripting).

Here's the xml from the first post converted to lua source (well, for only 2 nodes, I didn't feel like converting the whole thing):

Code:

scene
{ 
	formatVersion = "1.0.0",
	author = "PnP Terrain Creator",
	{ 
		position = {10.337000, 110.991287, 1009.718018},
		quaternion = {1.000000, 0.000000, 0.000000, 0.000000}, 
		scale = {0.100000, 0.100000, 0.100000},
		entity = {name = "ogrehead", meshfile = "ogrehead.mesh"},
		userData = { {"static", false}, {"script", ""}, {"name", "thename"} }
	},
	{ 
		position = {162.940994, 92.747040, 981.447021},
		quaternion = {-0.377235, 0.000000, -0.926118, 0.000000}, 
		scale = {0.100000, 0.100000, 0.100000},
		entity = {name = "ogrehead", meshfile = "ogrehead.mesh"},
		userData = { {"static", false}, {"name", "second head"} }
	}
}	
I could have put in all the w, x, y, z bits, but it was pointless bloat.
For a binary version, feed it through the lua compiler.
Accessing the data from c++ should be pretty simple, and parsing should be fast.
The main benefit would be if you are already using Lua for your scripting. If it's there then you might as well use it for several purposes, instead of adding on something extra like xml or protocol buffers.
(Not saying protocol buffers are bad, this is just another alternative to xml)

Re: Small Protocol Buffer test

Post by jacmoe »

I like the way Nebula Device does it: persisting objects through scripts.
Your Lua thing looks similar. :)

Re: Small Protocol Buffer test

Post by stealth977 »

Some small side notes:

1 - I used the latest version, but didn't optimize for speed.
2 - If you keep your structures small, of course the size can be smaller, but when you need to nest stuff, like using repeated fields, your files grow very quickly, especially if you have a lot of repeated fields as in a scene graph (Node, Entity, Camera, Light, Billboard, etc.).
3 - In XML you can nest anything in anything to an unlimited depth; in PBs you can't nest anything unless you adjust the depth and types by hand, and you still cannot nest a message inside itself.
4 - PBs' type conversion is much more annoying than XML's. I don't remember any set_rotation(Rot); instead you need to type mutable_rotation()->set_x(X); mutable_rotation()->set_y(Y); mutable_rotation()->set_z(Z); mutable_rotation()->set_w(W); etc.
5 - What about undefined data? XML can handle it, and although PBs have a way to do it (DynamicMessage), it's much more cumbersome. (If you ask why it's needed: if your plugins share the same file format, you can't predefine what data they save and how they want to save it, and with XML you don't have to.)
6 - I spent 2 days reading every single sentence available on Google's site about PBs: every manual, every definition, every sample, even most of the header files and reflection stuff, before implementing the TEST :) So I am pretty sure about the sentence, but don't ask me to spend another two days locating it :)

So, to sum up, all the speed increase and data size gain depends on HOW MUCH YOU RESTRICT YOUR APPLICATION. All I wanted to say is that, in my case, the gains weren't worth the restrictions; I would prefer spending 2x the size to restricting my file format.

Still, I must say that PBs is a wonderful piece of code: if you have a predefined, strict file format in mind, you can type a proto file in 5 minutes and watch PBs create an implementation C++ file with thousands of lines for you in a second :)

ismail,

Re: Small Protocol Buffer test

Post by nullsquared »

stealth977 wrote: Some small side notes:

1 - I used the latest version, but didn't optimize for speed.

stealth977 wrote: Last month, I made a small test with protocol buffers.

That is not the latest version. The latest version is ~11 days old as of this post. You cannot do a speed test if you're not optimizing for speed. That's like entering a race with a Lamborghini with a Beetle engine in it.

stealth977 wrote: 2 - If you keep your structures small, of course the size can be smaller, but when you need to nest stuff, like using repeated fields, your files grow very quickly, especially if you have a lot of repeated fields as in a scene graph (Node, Entity, Camera, Light, Billboard, etc.).

Well, clearly. But you must also repeat this in XML, which voids your point ;)

stealth977 wrote: 3 - In XML you can nest anything in anything to an unlimited depth; in PBs you can't nest anything unless you adjust the depth and types by hand, and you still cannot nest a message inside itself.

You can nest things to unlimited depth in protocol buffers just fine :| . You can't nest a message inside itself because it wouldn't make sense. Each instance would have another instance ad infinitum.

stealth977 wrote: 4 - PBs' type conversion is much more annoying than XML's. I don't remember any set_rotation(Rot); instead you need to type mutable_rotation()->set_x(X); mutable_rotation()->set_y(Y); mutable_rotation()->set_z(Z); mutable_rotation()->set_w(W); etc.

That's right. And then you use:

Code:

// Ogre::Quaternion's constructor takes its components in (w, x, y, z) order.
Ogre::Quaternion rot(msg.w(), msg.x(), msg.y(), msg.z());
to read it. Compare that to parsing string attributes of an XML tag.
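
A minimal sketch of the comparison being made here, with no protobuf or Ogre dependency. QuatMsg is a hand-written, hypothetical stand-in for a protoc-generated message, Quat is a stand-in for Ogre::Quaternion, and the attribute names qw/qx/qy/qz are taken from the test XML at the top of the thread. It only illustrates typed accessors versus string conversion and says nothing about real-world speed.

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-in for a protoc-generated message with four float fields.
struct QuatMsg {
    float w_ = 1.0f, x_ = 0.0f, y_ = 0.0f, z_ = 0.0f;
    float w() const { return w_; }
    float x() const { return x_; }
    float y() const { return y_; }
    float z() const { return z_; }
};

// Stand-in quaternion type, stored in (w, x, y, z) order.
struct Quat { float w, x, y, z; };

// Binary path: the fields are already typed, so reading is plain accessor calls.
Quat fromMessage(const QuatMsg& msg) {
    return Quat{msg.w(), msg.x(), msg.y(), msg.z()};
}

// XML path: every attribute arrives as text and has to be converted.
using Attributes = std::map<std::string, std::string>;

float attribute(const Attributes& attrs, const std::string& key, float fallback) {
    const auto it = attrs.find(key);
    return it == attrs.end() ? fallback : std::strtof(it->second.c_str(), nullptr);
}

Quat fromXml(const Attributes& attrs) {
    return Quat{attribute(attrs, "qw", 1.0f), attribute(attrs, "qx", 0.0f),
                attribute(attrs, "qy", 0.0f), attribute(attrs, "qz", 0.0f)};
}
```

Both paths end in the same four floats; the difference is only where the text-to-number conversion happens.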
stealth977 wrote: 5 - What about undefined data? XML can handle it, but although PBs have a way to do it (DynamicMessage), it's much more cumbersome. (If you ask why it's needed: if your plugins are sharing the same file format, you can't predefine what data they save and how they want to save it, and you don't have to when it comes to XML)
I guess I'll give you that point. However, this is why I believe there is a strong separation between graphics and other things, and why a scene editor should be just that, a scene editor, and exactly nothing more.
stealth977 wrote: 6 - I spent 2 days reading every single sentence available on the Google site about PBs, every manual, every definition, every sample, even most of the header files and reflection stuff, before implementing the TEST :) So, I am pretty sure about the sentence, but don't ask me to spend another two days to locate it :)
I spent about a week reading every single sentence on the site while implementing not a test, but my actual level format. Your statement that protobuffers are not for storage is false, as I've shown with a quote from the overview page. And I cannot find anywhere where it says that protobuffers are inefficient for larger files.
stealth977 wrote: So, to sum up, all the speed increase and data size gain depends on HOW MUCH YOU RESTRICT YOUR APPLICATION. All I wanted to say is that, for my case, the gains weren't worth the restrictions; I would prefer spending 2x the size instead of restricting my file format.

Still, I must say that PBs are a wonderful piece of code. If you have a predefined, strict file format in mind, you can type a proto file in 5 minutes and watch PBs create a C++ implementation file with thousands of lines for you in a second :)

ismail,
Once again, simply my $0.02.
stealth977
Gnoll
Posts: 638
Joined: Mon Dec 15, 2008 6:14 pm
Location: Istanbul, Turkey
x 42

Re: Small Protocol Buffer test

Post by stealth977 »

I have a child, actually I have a son. The very first day I learned my wife was pregnant, I started reading books about handling children. The term may sound awkward, but anyway English is not my mother tongue :) And the very first thing I learned from reading about the subject was: "What is important is not the amount of time you spend with your child, but the quality of the time you spend with your child."

So, why am I making such a strange start? Because when I made a reference to the Google site, the reference was denied to exist. And when I said I am sure about it since I spent 2 days reading everything about PBs, someone else tried to compare it with the week he spent on the subject. Well, I am stating again: "It's not the amount of time that is important, but the quality of the time you spent". Check below:
http://code.google.com/intl/tr-TR/apis/ ... iques.html

The Important Part:

Large Data Sets
Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.
That said, Protocol Buffers are great for handling individual messages within a large data set. Usually, large data sets are really just a collection of small pieces, where each small piece may be a structured piece of data. Even though Protocol Buffers cannot handle the entire set at once, using Protocol Buffers to encode each piece greatly simplifies your problem: now all you need is to handle a set of byte strings rather than a set of structures.
Protocol Buffers do not include any built-in support for large data sets because different situations call for different solutions. Sometimes a simple list of records will do while other times you may want something more like a database. Each solution should be developed as a separate library, so that only those who need it need to pay the costs.
Now about the second paragraph stating that protocol buffers can handle many small messages in a large database, refer to subject 1 in the above link:
Streaming Multiple Messages
If you want to write multiple messages to a single file or stream, it is up to you to keep track of where one message ends and the next begins. The Protocol Buffer wire format is not self-delimiting, so protocol buffer parsers cannot determine where a message ends on their own. The easiest way to solve this problem is to write the size of each message before you write the message itself. When you read the messages back in, you read the size, then read the bytes into a separate buffer, then parse from that buffer. (If you want to avoid copying bytes to a separate buffer, check out the CodedInputStream class (in both C++ and Java) which can be told to limit reads to a certain number of bytes.)
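
A minimal sketch of the size-prefix technique the quoted passage describes, working on plain byte strings so it does not depend on the protobuf library. The 4-byte little-endian length header and the function names appendFramed/splitFramed are assumptions made for this illustration, not anything the protobuf documentation prescribes. In real use each payload would presumably be the bytes produced by a message's SerializeToString() and handed back to ParseFromString().

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Append one record: a 4-byte little-endian length, then the payload bytes.
void appendFramed(std::string& stream, const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    for (int shift = 0; shift < 32; shift += 8)
        stream.push_back(static_cast<char>((n >> shift) & 0xFF));
    stream += payload;
}

// Split a stream back into its payloads.
// Returns false if the stream ends in the middle of a header or a payload.
bool splitFramed(const std::string& stream, std::vector<std::string>& out) {
    std::size_t pos = 0;
    while (pos < stream.size()) {
        if (stream.size() - pos < 4) return false;       // truncated header
        std::uint32_t n = 0;
        for (int i = 0; i < 4; ++i)
            n |= static_cast<std::uint32_t>(
                     static_cast<unsigned char>(stream[pos + i])) << (8 * i);
        pos += 4;
        if (stream.size() - pos < n) return false;       // truncated payload
        out.emplace_back(stream, pos, n);                // copy n bytes from pos
        pos += n;
    }
    return true;
}
```

Because each record carries its own length, the reader never needs to look inside a payload to find where the next one starts.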
I find the two quite contradictory. If you are still asking why, all I can say is: get a neural network upgrade :) (just kidding, no offense intended)

Protocol Buffers were developed to be a perfect network transmission and storage platform. There is also a PDF from Google developers where they describe the various aspects of data retrieval and storage, with an in-depth analysis of retrieval and storage costs measured in nanoseconds. The summary of that large paper is that storing data as semi-compressed message caches scattered among different machines, and retrieving it from there when needed, is way faster than storing and retrieving data on a single physical medium (the first technique is about 10x faster even when using a regular network structure). Of course this conclusion is very much expected from the creators of GFS, since it is also the idea GFS was based on :)

Now, back to the point. Google makes heavy use of distributed systems and effective messaging between those systems, and PBs were created with that in mind. They are a very quick solution for creating your own file format, which is also why their tool and their format use a word with two meanings: PROTO as short for PROTOCOL and also for PROTOTYPE. You can create prototypes of binary formats in minutes, convert them to actual C/C++/Python or Java code in no time, and use them.

But as I stated before, there is always a CATCH. PBs have all the disadvantages of binary formats:
1 - They have strict rules
2 - Making changes is much more problematic
3 - You need to integrate them inside your code cycle (will explain later)
4 - Slower to write compared to text dumping. Why is writing speed important? For my specific project, the user reads the file ONCE but writes the file very often. It is expected behavior for an editor to take a bit longer to prepare a file for editing, but it's not tolerable to wait too long each time you press SAVE :)
5 - PBs are not nesting friendly. Let's say you create a node in XML; then you can add any number of children under that node, and each can be a different kind of node, to unlimited depth. But in PBs, there is no equivalent way of saving:
struct A
{
    string name;
    int value;
    A *children;
};
Now that's what I call nesting. And if you think of it as a class, not a struct, the problem gets bigger, because the children may not be the same class but derivatives of the class with different data structures. It is easy to save them in XML: just nest the children under the parent's node. But you can't do that in PBs; instead you need to write each child as a separate message, put a pointer to its parent in it, and then, after parsing everything, re-create the parent-child relations.
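
A minimal sketch of the workaround described above: each node is written as its own flat record carrying a reference to its parent, and the tree is rebuilt after all records have been read. Node, Record, flatten and rebuild are hypothetical names for this illustration, and the parent reference is assumed to be an index into the record list rather than a real pointer. Each Record would correspond to one separately serialized message.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// In-memory tree node, similar in spirit to struct A above.
struct Node {
    std::string name;
    int value = 0;
    std::vector<std::unique_ptr<Node>> children;
};

// One flat record per node; parent is an index into the record list, -1 for the root.
struct Record {
    std::string name;
    int value;
    int parent;
};

// Depth-first flatten: a parent is always written before its children.
void flatten(const Node& node, int parent, std::vector<Record>& out) {
    const int self = static_cast<int>(out.size());
    out.push_back(Record{node.name, node.value, parent});
    for (const auto& child : node.children) flatten(*child, self, out);
}

// Rebuild the tree. Relies on parents preceding children, as flatten() guarantees.
std::unique_ptr<Node> rebuild(const std::vector<Record>& records) {
    std::vector<Node*> byIndex;   // record index -> rebuilt node
    std::unique_ptr<Node> root;
    for (const Record& r : records) {
        auto node = std::make_unique<Node>();
        node->name = r.name;
        node->value = r.value;
        Node* raw = node.get();
        if (r.parent < 0) root = std::move(node);
        else byIndex[r.parent]->children.push_back(std::move(node));
        byIndex.push_back(raw);
    }
    return root;
}
```

The extra pass in rebuild() is the cost being complained about: the hierarchy is no longer visible in the file layout and has to be reconstructed from the parent indices.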

Anyway, there are a lot of things that can be said about PBs, either in favor or against. But please keep one thing in mind and that is who created them and for what purpose? Then you can understand where and how to use them better. You can understand where they will be effective and where not.

As I said before, it's a wonderful creation, but don't forget: heart massage can revive the dead, but it can also kill when applied to a living person. So make sure you are using it at the right time and place :)

ismail,

NOTE: I forgot to explain the "Integrating in your code" part. For example, in my project I use OGRE's name-value pair approach for transmitting/copying/cloning/creating the in-editor objects. But if I want to use protocol buffers, I have two options:
1 - Bear with converting the name-value pairs to PBs and back, which requires a different if/else block for every single type of object in the engine,
2 - Or worse, use PBs inside the engine to transmit the data in native format, which in turn is more alien to the OGRE core.

You can guess the difference between: for each x in the NameValuePairList, write("%s=\"%s\"", x->first, x->second); and: if x->first == "something" then data().set_something(convert(x->second)), and so on for every field. The first one is much more general purpose, doesn't even need strict in-engine integration, and is easy to extend. You don't even need to know how many variables there are or which type they are, you just dump them :) And when reading them, you just create the same list by x[attributename] = attributevalue, so the reading and writing code doesn't even need to know what it is actually doing :)
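
A minimal sketch of the generic dump-and-reload approach described above. NameValuePairList is declared locally as a std::map of strings, mirroring the shape of Ogre's type of the same name; writeAttributes and readAttributes are hypothetical helpers, and the reader assumes well-formed input whose values contain no double quotes, since no escaping is handled.

```cpp
#include <map>
#include <string>

// Same shape as Ogre's NameValuePairList: a map from string to string.
using NameValuePairList = std::map<std::string, std::string>;

// Write every pair as name="value" without knowing what the pairs mean.
std::string writeAttributes(const NameValuePairList& params) {
    std::string out;
    for (const auto& p : params) {
        if (!out.empty()) out += ' ';
        out += p.first + "=\"" + p.second + "\"";
    }
    return out;
}

// Read the pairs back into the same kind of list.
// Assumes well-formed input and values without embedded double quotes.
NameValuePairList readAttributes(const std::string& text) {
    NameValuePairList params;
    std::size_t pos = 0;
    while (true) {
        const std::size_t eq = text.find("=\"", pos);
        if (eq == std::string::npos) break;
        const std::size_t end = text.find('"', eq + 2);
        if (end == std::string::npos) break;
        const std::size_t start = text.find_first_not_of(' ', pos);
        params[text.substr(start, eq - start)] = text.substr(eq + 2, end - eq - 2);
        pos = end + 1;
    }
    return params;
}
```

Neither function knows how many properties exist or what type they represent, which is the flexibility being argued for; the price is that every value travels as text.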

And both the performance gain and the data size gain are not as large as Google states when it comes to storing and retrieving on a single computer.
jacmoe
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 20570
Joined: Thu Jan 22, 2004 10:13 am
Location: Denmark
x 179
Contact:

Re: Small Protocol Buffer test

Post by jacmoe »

The contradiction you speak of concerns streaming multiple pbuffers in the same stream; it does not contradict using pbuffers as records in a dataset.
Since pbuffers don't store their size (why would they?), you need to keep track of the sizes yourself, as you would with any custom binary stream.
Or just don't use a single stream, then. :)
I think it's rather neat to have each GOB store/retrieve themselves in a pbuffer. Those buffers can be serialised as a scenefile/savegame, sent as messages to other components, across the network, etc.
I think I'll experiment with putting those pbuffers in a hierarchy, a tree - in a database, a filesystem hierarchy, or even another pbuffer.

However, maybe XML is better suited as an intermediary scene format than pbuffers. And that would explain why it doesn't work for something like Ogitor. :wink:

/* moved it to the lounge */
/* Less noise. More signal. */
Ogitor Scenebuilder - powered by Ogre, presented by Qt, fueled by Passion.
OgreAddons - the Ogre code suppository.
jacmoe
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 20570
Joined: Thu Jan 22, 2004 10:13 am
Location: Denmark
x 179
Contact:

Re: Small Protocol Buffer test

Post by jacmoe »

I am curious how Nullsquared and RedEyeCoder are using pbuffers - you seem to be quite happy with it.
stealth977
Gnoll
Posts: 638
Joined: Mon Dec 15, 2008 6:14 pm
Location: Istanbul, Turkey
x 42

Re: Small Protocol Buffer test

Post by stealth977 »

Due to the structure of PBs, if I am not mistaken, you either need to write each record as an individual message and keep track of each message and its size in the stream, or you need a higher-level message with repeated fields of the smaller message.

Option 1 makes the system more complex and renders some of PBs' features useless; option 2 means a message > 1MB, so go look for an alternative solution, as stated by the Google developers :)

Man, I am stating once more: make sure you understand very well why they were created and what they were created for :)

Also, think of a scenario where you are passing values between many different objects. If you use PBs inside your engine so that you can actually benefit from all those size/speed improvements, all the objects need to know the message layout of all the other objects so that they can communicate. But when you use a NameValuePairList, they don't need to know the data structure of other objects; they can just check for what they need, like:

HMM, did that object pass me a position? ni = list->find("position"). Oh cool, thank you sending object, now I'll use that position :) Or: oh damn, that object doesn't pass a position, so let me use a default for it :)

But in protocol buffers, you can't do a passedbuffer().position() if you haven't defined position as a static member of every kind of data that might be passed. Of course you can simulate it in PBs, but that is so much of a burden that it defeats the point of using them.
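
A minimal sketch of the "check for what you need, otherwise use a default" lookup described above. NameValuePairList is again declared locally as a std::map of strings, and valueOr is a hypothetical helper name.

```cpp
#include <map>
#include <string>

using NameValuePairList = std::map<std::string, std::string>;

// Return the value stored under key, or fallback when the sender did not pass it.
std::string valueOr(const NameValuePairList& params,
                    const std::string& key,
                    const std::string& fallback) {
    const auto it = params.find(key);
    return it != params.end() ? it->second : fallback;
}
```

The receiver only names the keys it cares about, so senders can add new keys without the receiver changing.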

nullsquared may be very happy using them:
* If his save system has a very static layout.
* If he doesn't need to serialize very complex derived hierarchies.
* If the serialized data is only used by the serializer itself, so that no interchangeability is required (and recordsets are not polymorphic).
* If the number of records is relatively small.
* And finally, if he hasn't made an accurate comparison of what he gets and what he loses by using them :)

All of that forms the perfect environment for PB usage. If any requirement doesn't fit one or more of the statements above, PBs start to lose efficiency and quickly drop below the break-even point.

ismail,
jacmoe
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 20570
Joined: Thu Jan 22, 2004 10:13 am
Location: Denmark
x 179
Contact:

Re: Small Protocol Buffer test

Post by jacmoe »

Each optional field in a message has a function called inline bool has_whatever() const.

So when parsing our entity name, we just do this:

Code: Select all

dotproto::Go go;
if(go.has_name()) name = go.name();
No need to search. Just ask directly. More efficient. :wink:
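
A rough sketch of the accessor pattern being relied on here, written by hand so it compiles without protoc. The class Go below is only a hypothetical stand-in for the generated dotproto::Go; the real class comes from the .proto file and has more members. entityName is likewise a hypothetical helper.

```cpp
#include <string>

// Hand-written stand-in for a generated message with one optional string field.
class Go {
public:
    bool has_name() const { return has_name_; }
    const std::string& name() const { return name_; }   // empty default when unset
    void set_name(const std::string& value) { name_ = value; has_name_ = true; }

private:
    std::string name_;
    bool has_name_ = false;
};

// Use the field when it was set, otherwise fall back to a caller-supplied default.
std::string entityName(const Go& go, const std::string& fallback) {
    return go.has_name() ? go.name() : fallback;
}
```

The presence check is a flag test rather than a map lookup, which is the efficiency point made above; the trade-off is that the field has to be declared in the message definition up front.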

As for multiple pbuffers in one stream, I don't really see how that's useless. That's how you'd use a stream in any case, if the stream is composed of several separate buffers. Just use a good old-fashioned header.
Or don't use a stream with multiple buffers in it.
Glob the pbuffers in a database, or.. :wink:

Of course, it's not a silver bullet. We can agree on that.