
Json made easy

In my last blog post, ‘An ode to the cast operator’, I talked about combining 3 simple pieces of C++ syntactic sugar in order to make a library API simple to use:

  • Cast operators
  • Subscript operators
  • Value semantic smart pointer wrappers.
I’ve now spent some more hours with the problem library that made me write that last post, and have ended up with a useful wrapper library that I would like to share with you.
I’ve called this library ‘JSON made easy for C++’, or JsonMe++ for short, and it’s now available on GitHub. The result isn’t a full bidirectional JSON library, at least not at this moment. It just provides the facility for accessing the data inside a JSON document or string, which is the functionality I needed myself. Given that this library, a wrapper for the JSON-Glib library, is now complete to the point that it should be usable by anyone, I wanted to revisit this subject to show how the resulting API is indeed trivial to use thanks to the above 3 flavours of syntactic sugar.
So let’s look at a piece of code using this library:
jsonme::JsonMeLib jsonlib;
try {
   jsonme::Node topnode=jsonlib.parseFile(pathToFile);
} catch (jsonme::ParseError &e) {
  ..
}
This simple piece of code is actually the hardest part. If the file path provided is invalid, or the file isn’t valid JSON, an exception will get thrown. But let’s not look at that. The interesting part is that there is no ‘new’ or ‘*’ anywhere in the code. The user of the library doesn’t need to juggle pointers, wrap them in smart pointers to take care of resource management, mix pointer semantics and value semantics in the code, etc.
The library takes care of all of this. All the user is exposed to are JsonMeLib and Node objects that can be used with value semantics, while internally they take care of resource management and keep object copies relatively light. This is where we see value semantic smart pointer wrappers in action: both JsonMeLib and Node are such wrappers. So when the result of parseFile is assigned to topnode, smart pointers under the hood take care of proper resource management without the library user being exposed to it. Resource management is a difficult subject for many C++ developers and a large source of subtle bugs. The library user actually needs no knowledge of RAII, smart pointers or resource management in C++ to use the library without fear of creating resource management bugs. That is, if there are resource management bugs, they are the fault of the library author.
JSON basically knows 3 types of nodes:
  • Objects
  • Arrays
  • Scalars
The first two are where the second piece of syntactic sugar comes in: the subscript operator. The library uses size_t indexed subscript operators for arrays and std::string indexed subscript operators for objects.
  size_t index=0;
  jsonme::Node firstfoobar=topnode["foolist"][index]["bar"];
The overloaded subscript operators, together with our value semantic smart pointer wrapper, give the JSON object an interface that is friendly to the library user: it’s intuitive and does not require getting into the library’s internal types.
Combining these two with cast operators, we make it even easier for the library user:
   long long magicvalue=topnode["magic"];
   std::string owner=topnode["foolist"][index]["bar"];
The library takes a step back from this to allow for validation. Node has a nodetype() method that can return jsonme::INVALID. In addition, between Node and the primitive scalar types the library has a class Scalar that the user may use when working with JSON data structures that are not hard-coded into the C++ code. That is, the library user can choose to look into some more of the API for finer control, but the bottom line is that the provided syntactic sugar keeps the part of the API the user has to know pretty small.

An ode to the cast operator

Working on a project that is written 90% in Python and uses JSON for its configuration, I had to parse the Python-generated JSON in C++. Not a problem: Boost has an excellent and usable library with JSON support, the Boost Property Tree library. Unfortunately, the target platform comes with version 1.40 of Boost, and guess what, the Property Tree library first became available in version 1.41. Off to look for a different JSON library that is available as a standard package for the target platform. After searching the packaging system of my target platform I found that it comes with JSON-Glib.

At first I was happy to have found an alternative JSON library for my project, but my joy quickly subsided when I realized it was a C library with one of those APIs. Until we get the root node of our JSON document all looks fine, but once we have our first JsonNode, expecting a transparent API that maps effortlessly to the base C++ types, we end up being exposed to all kinds of JSON-Glib classes, and some classes from the core GNOME library.

Having written some of those libraries myself, I know it’s easy to start believing in the merits of your own project’s internal type system and to think nothing of exposing it to your library users. There are, however, a few useful tricks you can use with relatively little effort to spare your library users from needing to know your library’s internal type system:

  • Cast operators
  • Subscript operators
  • Value semantic smart-pointer wrappers.

When designing an API for a library using these 3 tools, it is possible to greatly decrease the learning curve for using your library. You provide syntactic sugar that hides much of your internal type system from the part of your user base that isn’t that interested in learning about your no-doubt brilliantly designed library type hierarchy.

The cast operator

The most interesting of the 3 tools is the C++ cast operator. Adding a cast operator to a class allows an object of that class, when it can validly be represented as a base type such as std::string, to simply be assigned to a variable of that type. Yes, it’s just syntactic sugar, but it’s syntactic sugar that allows your library user to not learn about the internals of your library. We shall come back to the cast operator later, but let’s start off with an example of an API class that uses cast operators. The following is part of a wrapper library for JSON-Glib that I am currently working on:

#include <string>

namespace jsonme {

typedef enum {FLOAT,INTEGER,STRING,BOOL,NULLVAL} scalartype;

class AbstractScalar {
public:
  virtual ~AbstractScalar(){}
  virtual jsonme::scalartype type()=0;
  // One cast operator per base type a JSON scalar can be represented as.
  virtual operator long double()=0;
  virtual operator long long()=0;
  virtual operator std::string()=0;
  virtual operator bool()=0;
  virtual bool isNull()=0;
};

}
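To make the effect of cast operators concrete, here is a minimal self-contained sketch. TextScalar, hostOf and portOf are hypothetical names made up for this illustration; they are not part of JSON-Glib or of the wrapper library. The sketch assumes a scalar that keeps its value as text and converts it on demand.

```cpp
#include <cassert>
#include <string>

// Hypothetical toy type, only meant to illustrate cast operators.
// It stores a scalar as text and converts it when asked to.
class TextScalar {
    std::string mText;
public:
    explicit TextScalar(const std::string& text) : mText(text) {}

    // Each cast operator lets the object initialize a variable
    // of the target type directly.
    operator std::string() const { return mText; }
    operator long long() const { return std::stoll(mText); }
    operator bool() const { return mText == "true"; }
};

// Callers only name plain C++ types; the conversions happen implicitly.
std::string hostOf(const TextScalar& scalar) {
    std::string host = scalar;   // uses operator std::string()
    return host;
}

long long portOf(const TextScalar& scalar) {
    long long port = scalar;     // uses operator long long()
    return port;
}
```

Note that the sketch only initializes variables from the scalar. Assigning to an already existing std::string is a different overload situation and can turn out ambiguous when several cast operators are present, so a design like this needs some care.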

Subscript operator

Another piece of powerful syntactic sugar provided by languages that support operator overloading, like C++ and Python, is the subscript operator. For our JSON-Glib wrapper, for example, the subscript operator can be used both for accessing JSON object members by key and for accessing JSON array members by index. Again an excerpt from the wrapper library I’m working on:

typedef enum {OBJECT,ARRAY,SCALAR} nodetype;

class Node;

class AbstractNode {
public:
  virtual ~AbstractNode(){}
  virtual jsonme::nodetype type()=0;
  virtual Node operator[](std::string name)=0;
  virtual size_t size()=0;
  virtual Node operator[](size_t index)=0;
  virtual operator Scalar()=0;
};

If we combine this with the previously discussed cast operator, our JSON wrapper library could in theory be used as follows:

std::string firsthostname=rootnode[0]["host"]["name"][0];

As should be clear from this, the user needs very little knowledge, if any, of the types the API is composed of.
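A rough, self-contained sketch of how chained subscripts and a cast operator can work together. The Tree class and its append and set helpers are hypothetical and exist only for this illustration; the sketch also assumes that a missing key or index simply yields an empty node, which is a choice made here and says nothing about how the real wrapper behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy tree, only meant to show the call syntax.
class Tree {
    std::string mText;               // payload when used as a scalar
    std::vector<std::string> mKeys;  // member names when used as an object
    std::vector<Tree> mChildren;     // array elements or object members
public:
    Tree() {}
    explicit Tree(const std::string& text) : mText(text) {}

    // Builders used to set up the example data.
    void append(const Tree& child) { mChildren.push_back(child); }
    void set(const std::string& key, const Tree& child) {
        mKeys.push_back(key);
        mChildren.push_back(child);
    }

    // Array access by position; out of range yields an empty node.
    Tree operator[](std::size_t index) const {
        if (index < mChildren.size()) return mChildren[index];
        return Tree();
    }

    // Object access by member name; unknown names yield an empty node.
    Tree operator[](const std::string& key) const {
        for (std::size_t i = 0; i < mKeys.size(); ++i) {
            if (mKeys[i] == key) return mChildren[i];
        }
        return Tree();
    }

    // Cast operator: a node can initialize a std::string directly.
    operator std::string() const { return mText; }
};

std::string firstHostName() {
    Tree host;
    host.set("name", Tree("alpha"));
    Tree entry;
    entry.set("host", host);
    Tree root;
    root.append(entry);

    std::size_t first = 0;
    // One expression walks array -> object -> object -> scalar.
    std::string name = root[first]["host"]["name"];
    return name;
}
```

The toy deliberately offers a conversion to std::string only. Once a class with a string-keyed subscript operator also converts implicitly to an integer type, an expression such as node["key"] can become ambiguous with the built-in subscript on pointers, so that combination needs care.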

Value semantic smart-pointer wrappers

A third piece of syntactic sugar that may not be directly obvious is the use of value semantic smart-pointer wrappers in our API.  As a rule, a library should not burden the users of that library with the issue of having to think about resource management. C++ provides the concept of smart pointers to help with correctly managing resources. Exposing and forcing the use of smart pointers can be a proper choice, for example returning an auto_ptr or a unique_ptr from a factory is much better than returning a raw pointer. In the library wrapper we are discussing here however, it may be much more suitable to completely hide the usage of smart pointers (that have pointer semantics) by providing a value semantics wrapper for the smart pointer.

class Node: public AbstractNode {
  boost::shared_ptr<AbstractNode> mNode;
public:
  Node(AbstractNode *node):mNode(node){}
  // No destructor needed: the shared_ptr deletes the wrapped node
  // once the last copy of this Node is gone.
  jsonme::nodetype type(){ return mNode->type();}
  Node operator[](std::string name) { return (*mNode)[name];}
  size_t size() { return mNode->size();}
  Node operator[](size_t index) { return (*mNode)[index];}
  operator Scalar() { return (*mNode);}
};
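The idea of a value semantic wrapper around a smart pointer can also be shown in isolation. The following is a minimal sketch using std::shared_ptr from C++11 rather than boost::shared_ptr; Document, DocumentImpl, makeDocument and ownersAfterCopy are hypothetical names chosen for this illustration only.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical implementation class; users of the API never see it.
class DocumentImpl {
    std::string mTitle;
public:
    explicit DocumentImpl(const std::string& title) : mTitle(title) {}
    const std::string& title() const { return mTitle; }
};

// Value semantic handle: it can be copied, returned and stored like an int.
// The shared_ptr inside keeps the implementation alive while any copy
// exists and frees it exactly once, so no destructor is written here.
class Document {
    std::shared_ptr<const DocumentImpl> mImpl;
public:
    explicit Document(const std::string& title)
        : mImpl(std::make_shared<DocumentImpl>(title)) {}
    std::string title() const { return mImpl->title(); }
    // Number of handles sharing the implementation (for illustration only).
    long owners() const { return mImpl.use_count(); }
};

// Returning by value hands over a handle, not a copy of the implementation.
Document makeDocument() { return Document("config"); }

long ownersAfterCopy() {
    Document original = makeDocument();
    Document copy = original;   // cheap: only the reference count changes
    return copy.owners();       // two handles, one implementation
}
```

User code written against Document contains no new, no delete and no pointer syntax, which is the point of the wrapper.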

Conclusion

I hope that with the above examples I’ve shown how adding a little bit of syntactic sugar to your library API can make a lot of sense when you want to spare your library users a relatively steep learning curve. I have myself often been too lazy and presumptuous about my own libraries and their type hierarchies to use these constructs thoroughly and frequently enough in my APIs, but running into JSON-Glib while only needing to parse a config file made me aware once more of the user perspective on those APIs. Remember that if your library is any good it will have more users than it has developers, users who will be grateful for a reduced learning curve. And if it’s a library for a tiny market, then you may still increase the number of potential users by adding these 3 ingredients to your API. I like all 3 of the syntactic sugar components described above, but to me the C++ cast operator is their absolute champion.

Programming language wishlist

My previous blog post was about garbage collection being anti-productive. It’s my strong opinion that with respect to resource management, Java has got it completely wrong, C++ got it completely right, and languages with deterministic memory management (reference counting) like CPython come in as a good second to C++. I have too little knowledge of C# to truly judge the productivity and scalability properties of its hybrid resource management model, which at first glance seems to be quite a bit better than Java’s. Basically I would say (taking into account what I have heard and read about C#) that with respect to resource management the scores are:

  • +2 for C++
  • +1 for python
  • +/- for C#
  • -2 for Java

But I will be the first to admit that Java, and especially its more secure little brother Joe-E, has many merits on other fronts, while C++ definitely has its weaknesses. Looking at the wide range of programming languages available, and at their features and misfeatures, there doesn’t seem to be a single language that doesn’t have at least one ingredient that is incompatible with my personal taste. This made me think: what if I could order a programming language the way I can order a pizza? Tuna pizza, double garlic, pickled green peppers, hold the onions. What toppings would I choose for my programming language? It’s likely that my taste in programming languages is just as disgusting to you as my taste in pizza is to many of my friends 😉 My priorities basically are scalable high productivity and high integrity. If you have other priorities, your pizza will likely get quite a different combination of toppings.

Perl’s expressiveness and built-in regular expression support.

While I try not to use Perl anymore, I started my career as a Unix system administrator doing a lot of Perl, and have done many small and mid-sized projects in Perl. While I’m no longer the big Perl fan I used to be, Perl’s expressiveness keeps pulling me back to it if I need to build something small in a hurry. No language beats Perl with respect to how little code or how little time it takes to build something when in a hurry. Combined with the built-in regex support that is really well integrated in the language, this is a topping I would love to have on my programming language pizza.

C++’s full encapsulation of resources, add some automated resource management.

As I showed in my previous blog post on resource management, C++ RAII allows resources to be fully encapsulated without breaking encapsulation or burdening polymorphism. This property in C++ comes at the price of not having automated memory management. I’d like to have my cake and eat it too here: C++’s full encapsulation of resources, but with some more syntactic sugar for RAII to further boost productivity.
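As a reminder of what RAII encapsulation looks like, here is a small sketch. Pool, Lease and Worker are hypothetical names invented for this example; the pool merely counts how many leases are outstanding, standing in for any real resource such as a file or a socket.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: it only counts how many leases are outstanding.
class Pool {
    int mInUse = 0;
public:
    void acquire() { ++mInUse; }
    void release() { --mInUse; }
    int inUse() const { return mInUse; }
};

// RAII guard: acquires in the constructor, releases in the destructor.
class Lease {
    Pool& mPool;
public:
    explicit Lease(Pool& pool) : mPool(pool) { mPool.acquire(); }
    ~Lease() { mPool.release(); }
    Lease(const Lease&) = delete;
    Lease& operator=(const Lease&) = delete;
};

// Worker holds a resource, but nothing in its interface says so.
// Its compiler-generated destructor releases the lease.
class Worker {
    Lease mLease;
public:
    explicit Worker(Pool& pool) : mLease(pool) {}
};

int leasesHeldInsideScope() {
    Pool pool;
    Worker worker(pool);
    return pool.inUse();   // evaluated while worker is still alive
}

int leasesLeftAfterFailure() {
    Pool pool;
    try {
        Worker worker(pool);
        throw std::runtime_error("something went wrong");
    } catch (const std::runtime_error&) {
        // worker was destroyed during stack unwinding
    }
    return pool.inUse();
}
```

Code that uses Worker never has to know that a resource is involved, and the lease is returned even when an exception cuts the scope short.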

C++’s generics and TMP support

Nothing boosts productivity like being smart about the ‘Don’t Repeat Yourself’ or DRY principle. Nothing spells DRY like C++ generics and template metaprogramming.
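A small sketch of both ingredients, with names (join, Factorial) picked just for this illustration: one generic function that serves every container of streamable elements, and a classic bit of template metaprogramming that the compiler evaluates.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Generics: one definition serves every container whose
// elements can be written to a stream.
template <typename Container>
std::string join(const Container& items, const std::string& separator) {
    std::ostringstream out;
    bool first = true;
    for (const auto& item : items) {
        if (!first) out << separator;
        out << item;
        first = false;
    }
    return out.str();
}

// Template metaprogramming: the value is computed at compile time.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

static_assert(Factorial<5>::value == 120, "evaluated by the compiler");
```

The same join works for a std::vector<int> and a std::list<std::string> without a second line of code being written for it.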

C++/Python’s operator overloading

While operator overloading is just syntactic sugar to many, to me it is more like syntactic honey, making things go down much smoother in many cases. Function objects (objects overloading the ‘()’ operator) are a great alternative to old-fashioned C callback functions. In C++, cast operators can save you much work when refactoring. Operator overloading on its own will boost your productivity, but when combined with generics this boost is multiplied.
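A brief sketch of a function object combined with a generic algorithm. RunningTotal, forEach and sumOf are hypothetical names used only here.

```cpp
#include <cassert>
#include <vector>

// Function object: unlike a C callback it carries its own state,
// so no separate "user data" pointer has to be passed around.
class RunningTotal {
    long mTotal = 0;
public:
    void operator()(int value) { mTotal += value; }
    long total() const { return mTotal; }
};

// Generic algorithm: accepts anything that can be called with an int,
// whether it is a function object, a lambda or a plain function.
template <typename Callback>
Callback forEach(const std::vector<int>& values, Callback callback) {
    for (int value : values) callback(value);
    return callback;   // hand the (possibly stateful) callable back
}

long sumOf(const std::vector<int>& values) {
    return forEach(values, RunningTotal()).total();
}
```

The algorithm is written once, and the overloaded ‘()’ operator is what lets it treat the stateful object exactly like a function.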

Python’s strong, safe but property-based type system.

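Strong and safe type systems are great, but sometimes they get in your way. In Python a strong and safe type system is combined with property-based compatibility (duck typing): an object is accepted wherever it offers the attributes and methods that are actually used.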

Message passing concurrency only, hold the raw threads.

While shared mutable state can get you into trouble, you haven’t seen any real trouble until you have tried doing large scale development with raw threads. Raw threads give you shared state concurrency with all the troubles that sharing state brings with it. These problems can really kill your productivity and create hard to track down bugs. Message passing concurrency offers a less problematic alternative. I would like my ideal language not to supply me with raw threads, but to hide these (and lock-free containers) behind an abstraction layer that provides message passing concurrency instead.
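What such an abstraction layer could look like can be sketched in C++ itself. The Channel class and sumViaMessages below are hypothetical and deliberately minimal: the thread, mutex and condition variable are hidden inside, and the only things that cross the thread boundary are messages. The convention that a negative number means "stop" is an assumption of this sketch.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical minimal channel: the locking lives in here and nowhere else.
template <typename T>
class Channel {
    std::queue<T> mQueue;
    std::mutex mMutex;
    std::condition_variable mReady;
public:
    void send(T message) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mQueue.push(std::move(message));
        }
        mReady.notify_one();
    }

    // Blocks until a message is available.
    T receive() {
        std::unique_lock<std::mutex> lock(mMutex);
        mReady.wait(lock, [this] { return !mQueue.empty(); });
        T message = std::move(mQueue.front());
        mQueue.pop();
        return message;
    }
};

// The worker owns its running total; no variable is shared between threads.
int sumViaMessages(int count) {
    Channel<int> requests;
    Channel<int> replies;
    std::thread worker([&requests, &replies] {
        int total = 0;
        for (;;) {
            int value = requests.receive();
            if (value < 0) break;   // negative value means "stop" in this sketch
            total += value;
        }
        replies.send(total);
    });
    for (int i = 1; i <= count; ++i) requests.send(i);
    requests.send(-1);
    int result = replies.receive();
    worker.join();
    return result;
}
```

In an ideal language the Channel would be the only concurrency primitive on offer, so that code like the worker above could not reach for a shared variable even if it wanted to.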

No class variable support.

As with shared state concurrency, class variables are essentially shared mutable state, and shared mutable state is something that can be tricky to work with. There are a few basic rules for working with shared mutable state that come from object capability theory and from the basics of code testability. In essence, class variables are not compatible with these rules.

No pointer arithmetic support

This for me is the thing I dislike most about C++. Pointer arithmetic will get you into trouble. It’s a tempting thing to use in C++, and that’s why I don’t want it in my ideal language. It’s like one of those fattening ingredients on your pizza that you know is bad for you, but you have a craving for it so you take it.

No reflection or typeof

Reflection and typeof will let you get away with bad design. They will let you be more productive in the short haul, but bad design will in the end make you regret taking the shortcut and make you pay for it many times over.

Fully ocap with persistent object support

The E language, a language that I have used for smaller personal projects (it’s quite impossible to convince my co-workers that this great but exotic language is worth investing serious time in), is fully OCAP (a sane subset of OO for building high integrity systems) and has support for persistent applications that survive system reboots without losing state. The E language is fully compatible with my MinorFs project and allows for fitting least authority design into a multi-granularity system design from the OS level up. My ideal language would take quite a few things from E. It’s an amazing little language that has gotten way too little press.

Co-worker compatible

A language could have all the ingredients above, but a real project requires multiple people working on it. This means that the ideal language, unlike E, should be compatible with the often conservative views of co-workers with respect to investing in new technology. The ideal language should be similar enough to languages my co-workers know (Java, C++, Python, C#, Delphi, Perl) to be acceptable to the conservative mind.

The above list is to a point just personal taste, but I believe it’s mostly about personal taste in priorities, which for me are productivity and high integrity. My ideal language would allow me and my co-workers to write large scale high integrity systems with the highest possible productivity.