Tag Archives: C++

RaiiCap pattern: Injected Singleton alternative for C++

The Singleton design pattern is a so-called creational pattern from the venerable GoF design patterns book. While this book is often seen as something of a software engineering Bible, I dare say (at the risk of being stoned to death by GoF bigots) that the Singleton pattern, while widely used and immensely popular in software design, is in fact a very controversial pattern. It is a design pattern that I would want to refer to as an anti-pattern. So what is this Singleton pattern meant for? The GoF book states that a singleton is intended for when you aim to:

“Ensure a class only has one instance, and provide a global access point to it.”

No wait. “Ensure a class only has one instance”: I could get into why you may wish for such a thing. But “provide a global access point to it”? There are basically only a few situations where providing global access to something isn’t an extremely bad idea.

  1. If it’s a constant.
  2. If it is a free standing function that does in no way use static mutables.
  3. A combination of one and two.

In any other conceivable scenario, global access is a terrible idea. It leads to tight coupling, to excess authority and knowledge, hard to test code, code that is impossible to reason about with respect to security properties like integrity and confidentiality, etc, etc.

So if we agree that having global access to a singleton is a very bad idea, at least for any singleton that could not just as well have been implemented as a free standing function that accesses no static mutables, we can look at what a useful pattern would be that accommodates “Ensure a class only has one instance” while at all cost avoiding the “provide a global access point to it” part.

Well, let’s take a step back once more. There are two questions we need to ask about our problem before we set out to fix it: “Why would we want to ensure that a class has only one instance?” and “Why would we need any design pattern to do that?”. The answer to the first question could be abstracted to: “There is a single shared resource that we need to manage.” When you look at how singletons are used, you often see singletons managing a small pool of resources. So we could probably rephrase this to: “There is a scarce shared resource that we need to manage.”

So now we have the problem clear: it’s really a resource management issue. We have a scarce resource of which a substantial portion is acquired at class construction time (or, in the GoF Singleton pattern, lazily, but this is hardly an essential property that the Singleton strives to accomplish). So what we basically don’t want, if we have a large code base with many people working on it, is that a developer looks at our class and thinks: “hey, that looks about right, let’s instantiate one of those over here”.

So basically what we need to stop this from happening is to find a way to limit access to the constructional patterns for creating instances of a resource management class.

To summarize my reasoning thus far, we have reduced:

“Ensure a class only has one instance, and provide a global access point to it.”

to:

“Ensure a class only has one instance”

that in turn we have expanded to the larger:

“Limit access to the constructional patterns  of a resource management class”

So let’s see if we can devise a new pattern for accomplishing this goal. I shall look at this from C++, as I feel I’ve solved this problem with a few C++ specific constructs and idioms. Solving the same problem with a pattern suitable for other languages is something I must currently leave as an exercise for the reader.

In C++, the base pattern for resource management is called RAII. I’m not going to rehash RAII here, but basically it can be described as a ‘constructor acquires, destructor releases, single responsibility’ pattern. Given that we identified our Singleton problem as being a resource management problem, any alternative to the Singleton pattern in C++ should thus IMO rely on RAII as its basis.

So what’s the problem with RAII that keeps us from using it as a Singleton alternative already? The problem is that in most cases a RAII object that claims a substantial portion of a scarce resource can be constructed from anywhere within the code base. There is nothing about a constructor, other than maybe its constructor arguments, that keeps you from instantiating an object from any place where you may like to do so. And there is nothing about the interface of a RAII class that should prompt the always-in-a-hurry maintenance programmer to consider that he or she may be depleting a resource, or may be breaking the resource’s access model, by instantiating one more object.

As you might know from previous blog posts, I’m a big fan of capability based systems. C++ is about as far removed from capability secure languages as a language can get, and by not being memory safe it could by definition never have true capabilities, but this should not keep us from using a capability-like object in solving our problem. That is, we are not protecting our constructor from anything that would need real capabilities here. We just want to make the in-a-hurry maintenance programmer think twice before claiming a scarce resource. So let’s just call these capability-like objects for C++ RAIICaps, as they will be used in combination with RAII classes and are rather capability-like in their nature. So what would a RAIICap look like? Consider the following simple C++ template:

template <typename T>
class raiicap {
    friend int main(int,char **);
    raiicap() {}   // private and empty: only friends such as main can create one
};

Now within our RAII class we shall define our constructor as:

class raiiclass {
  public:
    // No raiicap<raiiclass>, no raiiclass instance.
    raiiclass(raiicap<raiiclass> const &);
};

So what does this template do? Basically not much: it has an empty constructor and nothing more. But this empty constructor is private, which keeps the class from being instantiated in the normal way. C++ however has the friend keyword, which can be used to give a befriended class or function access to the private parts of a class. The raiicap has no private member variables, but as its constructor is private, only its friends can instantiate it. The above declares the program’s main as a friend. This means that only main can instantiate raiicaps, which it can then delegate to other parts of the program. Given that the RAII class requires a raiicap<raiiclass> reference as constructor parameter, this simple pattern effectively limits access to the constructor of our RAII classes.

In main, this could look something like:

int main(int,char **) {
  raiicap<Foo> foo_raii_cap;
  Bar bar(foo_raii_cap);
  Baz baz;   // not 'Baz baz();', which would declare a function
  // ... the rest of the program ...
}

In the above, main instantiates a raiicap for Foo and constructor-injects this raiicap into a new Bar object, bar. The bar object now has the capability to create Foo objects. It might further delegate this capability, but this will need to be done explicitly. The baz object that is also instantiated does not get a raiicap for Foo and thus has no access to Foo’s constructor.
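To make the pieces above concrete, here is a minimal, self-contained sketch of the whole arrangement. It is only an illustration under a few assumptions: the bodies of Foo, Bar and Baz are made up, and instead of main the friend is a hypothetical function compose_program, chosen purely so that the sketch can be called from a test.

```cpp
// The one place allowed to mint capabilities. In the post this role is
// played by main; compose_program is a hypothetical stand-in.
int compose_program();

template <typename T>
class raiicap {
    friend int compose_program();
    // User-provided rather than "= default": with a defaulted private
    // constructor, C++11 to C++17 would still accept raiicap<T>{} as
    // aggregate initialisation and the protection would be gone.
    raiicap() {}
};

// RAII class guarding a scarce resource: no capability, no instance.
class Foo {
  public:
    explicit Foo(raiicap<Foo> const &) : units_(21) {}
    int units() const { return units_; }
  private:
    int units_;   // made-up payload standing in for the real resource
};

// Bar was handed the capability, so it may create Foo objects.
class Bar {
  public:
    explicit Bar(raiicap<Foo> const &cap) : cap_(cap) {}
    int work() const {
        Foo foo(cap_);   // allowed: Bar holds a raiicap<Foo>
        return foo.units();
    }
  private:
    raiicap<Foo> cap_;   // copying is public, so delegation stays possible
};

// Baz was given nothing; it has no way to obtain a raiicap<Foo>.
class Baz {
  public:
    int work() const { return 0; }
};

int compose_program() {
    raiicap<Foo> foo_raii_cap;   // only this friend can write this line
    Bar bar(foo_raii_cap);
    Baz baz;
    return bar.work() + baz.work();
}
```

Adding a line such as `raiicap<Foo> cap;` inside Baz::work should be rejected by the compiler because the constructor is private, which is exactly the speed bump the pattern is after.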

I think the above shows that, at least in C++, simple alternatives to the GoF Singleton pattern exist, at least for that part of the Singleton’s intent that wasn’t a really bad idea to begin with. I’m pretty sure that similar patterns should be possible in other programming languages as well. The most important thing to realize is that singletons are in fact a resource management pattern, and that for managing access to the allocation of scarce resources, it’s a pattern that is both horrible and that, as shown above, has alternatives that don’t have the same horrible side effects.


Json made easy

In my last blog ‘An ode to the cast operator’ I talked about using a combination of 3 simple C++ syntactic sugar constructs in order to make a library API simple to use:

  • Cast operators
  • Subscript operators
  • Value semantic smart pointer wrappers.
I’ve now spent some more hours with the problem library that made me write that last blog, and have come to a useful wrapper library that I would like to share with you.
I’ve called this library ‘JSON made easy for C++’ or JsonMe++ for short, and it’s now available on github. The result isn’t a full bidirectional JSON lib, at least not at this moment. It just provides the facility for accessing the data inside a JSON document or string, which is the functionality I myself needed. Given that this library, a wrapper for the JSON-Glib library, is now complete up to the point that it should be usable by anyone, I wanted to revisit this subject to show how the resulting API is indeed trivial to use by merit of the above 3 flavours of syntactic sugar.
So let’s look at a piece of code using this library:
jsonme::JsonMeLib jsonlib;
try {
   jsonme::Node topnode=jsonlib.parseFile(pathToFile);
   // ... use topnode here ...
} catch (jsonme::ParseError &e) {
   // the path was invalid or the file was not valid JSON
}
This simple piece of code is actually the hardest part. If the file path provided is invalid, or the file isn’t valid JSON, an exception will get thrown. But let’s not look at that. The interesting part is that there is no ‘new’ or ‘*’ anywhere in the code. The user of the library doesn’t need to juggle pointers, wrap them in smart pointers in order to take care of resource management, mix pointer semantics and value semantics in the code, etc. The library takes care of all of this. All the user is exposed to are JsonMeLib and Node objects that can be used with value semantics, and that internally take care of effective resource management and of efficient, relatively light object copy behavior.
This is where we see value semantic smart pointer wrappers in action. Both JsonMeLib and Node are value semantic smart pointer wrappers. So when the result of parseFile is assigned to topnode, smart pointers are taking care of proper resource management under the hood, without the user of the library being exposed to it. Resource management is a difficult subject for many C++ developers and a large source of subtle bugs. The library user actually needs no knowledge about RAII, smart pointers or resource management in C++ in order to use the library without fear of creating resource management bugs. That is, if there are resource management bugs, it would be the fault of the library author.
JSON basically knows 3 types of nodes:
  • Objects
  • Arrays
  • Scalars
The first two are where the second piece of syntactic sugar comes in: the subscript operator. The library uses size_t indexed subscript operators for arrays and std::string indexed subscript operators for objects.
  size_t index=0;
  jsonme::Node firstfoobar=topnode["foolist"][index]["bar"];
The overloaded subscript operators, together with our value semantic smart pointer wrapper, give the JSON object an interface that is friendly to the user of the library: it’s intuitive and does not require the library user to get into the library’s type internals.
Combining these two with cast operators, we make it even easier for the library user:
   long long magicvalue=topnode["magic"];
   std::string owner=topnode["foolist"][index]["bar"];
The library takes a step back from this to allow for validation. Node has a nodetype() method that can return jsonme::INVALID. Next to this, between Node and the primitive scalar types, the library has a class Scalar that the user may use when working with JSON data structures that are not hard coded into the C++ code. That is, the library user can choose to look into some more of the API for finer control, but the bottom line is that the provided syntactic sugar keeps the amount of API the user needs to know down to a pretty small surface.

An ode to the cast operator

Working on a project that is written 90% in Python and that uses JSON for its configuration, I had to parse the Python generated JSON in C++. Not a problem: C++ Boost has an excellent and usable library with JSON support, the Boost Property Tree library. Unfortunately however, the target platform comes with the 1.40 version of the Boost library, and guess what, the Boost Property Tree library first became available in the 1.41 version. Off to look for a different JSON library that is available as a standard package for the target platform. After searching the packaging system of my target platform I found it comes with JSON-Glib.

First I started out being happy about the fact that I had found an alternative JSON library for my project, but my joy quickly subsided when I realized it was a C library with one of those APIs. Until we get the root node of our JSON document all looks fine. But once we have our first JsonNode, while expecting to get a transparent API that maps effortlessly to the base C++ types, we end up being exposed to all kinds of JSON-Glib classes, and to some classes from the core Gnome library.

Having written some of those libraries myself, I know it’s easy to start believing in the merits of your own project’s internal type system, thinking nothing of exposing it to your library users. There are however a few useful tricks you can use, with relatively little effort, to spare your library users from needing any knowledge of your library’s internal type system.

  • Cast operators
  • Subscript operators
  • Value semantic smart-pointer wrappers.

When designing an API for a library using these 3 tools, it is possible to greatly decrease the learning curve for using your library. You provide syntactic sugar that hides much of your internal type system from the part of your user base that isn’t that interested in learning about your no-doubt brilliantly designed library type hierarchy.

The cast operator

The most interesting of the 3 tools is the C++ cast operator. Adding a cast operator to a class allows an object of that class, when it can validly be represented as a base type such as an std::string, to simply be assigned to a variable of that type. Yes, it’s just syntactic sugar, but it’s syntactic sugar that allows your library user to not learn about the internals of your library. We shall come back to the cast operator later, but let’s start off with an example of an API class that uses cast operators. The following is part of a wrapper library for JSON-Glib that I am currently working on:

typedef enum {FLOAT,INTEGER,STRING,BOOL,NULLVAL} scalartype;

class AbstractScalar {
  public:
    virtual ~AbstractScalar(){}
    virtual jsonme::scalartype type()=0;
    // Cast operators: a scalar can be assigned directly to these base types.
    virtual operator long double()=0;
    virtual operator long long()=0;
    virtual operator std::string()=0;
    virtual operator bool()=0;
    virtual bool isNull()=0;
};
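As a small illustration of what such cast operators buy the caller, here is a self-contained toy. TextScalar is a hypothetical class invented for this sketch (it is not part of the wrapper library), and it assumes the value is simply stored as text.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical scalar that keeps its value as text and converts on demand.
class TextScalar {
  public:
    explicit TextScalar(std::string const &text) : text_(text) {}

    // Each cast operator lets the object initialise a plain base type.
    operator std::string() const { return text_; }
    operator long long() const {
        return std::strtoll(text_.c_str(), nullptr, 10);
    }
    operator bool() const { return text_ == "true"; }

  private:
    std::string text_;
};
```

With this in place a caller can write `long long port = scalar;` or `std::string name = scalar;` without knowing anything about TextScalar. One thing to keep in mind with several cast operators on one class: initialisation as shown here is fine, but plain assignment to an already existing std::string can become ambiguous, because std::string can also be assigned from a single char.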


Subscript operator

Another piece of powerful syntactic sugar, provided by languages like C++ and Python that support operator overloading, is the subscript operator. For our JSON-Glib wrapper, for example, the subscript operator could be used both for accessing JSON object members by key and for accessing JSON array members by index. Again an excerpt from the wrapper library I’m working on:

typedef enum {OBJECT,ARRAY,SCALAR} nodetype;

class Node;

class AbstractNode {
  public:
    virtual ~AbstractNode(){}
    virtual jsonme::nodetype type()=0;
    // Object member by key.
    virtual Node operator[](std::string name)=0;
    virtual size_t size()=0;
    // Array element by index.
    virtual Node operator[](size_t index)=0;
    virtual operator Scalar()=0;
};


If we combine this with the previously discussed cast operator, our JSON wrapper library could in theory be used as follows:

std::string firsthostname= rootnode[0]["host"]["name"][0];

As should be clear from this, the user potentially needs very little knowledge, if any, about the types the API is composed of.
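The combination of subscript and cast operators can be tried out in isolation with a toy tree. The sketch below is not the wrapper library: this Node, its constructors and make_sample are hypothetical, and std::shared_ptr (C++11) is assumed instead of the Boost one.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a JSON node, used only to show the operator chaining.
class Node {
  public:
    Node() {}                                        // an "invalid" node
    explicit Node(std::string const &text);          // scalar
    explicit Node(std::vector<Node> const &items);   // array
    explicit Node(std::map<std::string, Node> const &members);  // object

    Node operator[](std::string const &key) const;   // object member by key
    Node operator[](std::size_t index) const;        // array element by index
    operator std::string() const;                    // scalar as text
    bool valid() const { return static_cast<bool>(impl_); }

  private:
    struct Impl;
    std::shared_ptr<Impl const> impl_;   // copies of a Node share one Impl
};

struct Node::Impl {
    std::string text;
    std::vector<Node> items;
    std::map<std::string, Node> members;
};

Node::Node(std::string const &text) {
    std::shared_ptr<Impl> impl = std::make_shared<Impl>();
    impl->text = text;
    impl_ = impl;
}

Node::Node(std::vector<Node> const &items) {
    std::shared_ptr<Impl> impl = std::make_shared<Impl>();
    impl->items = items;
    impl_ = impl;
}

Node::Node(std::map<std::string, Node> const &members) {
    std::shared_ptr<Impl> impl = std::make_shared<Impl>();
    impl->members = members;
    impl_ = impl;
}

Node Node::operator[](std::string const &key) const {
    if (!impl_) return Node();
    std::map<std::string, Node>::const_iterator found = impl_->members.find(key);
    if (found == impl_->members.end()) return Node();
    return found->second;
}

Node Node::operator[](std::size_t index) const {
    if (!impl_ || index >= impl_->items.size()) return Node();
    return impl_->items[index];
}

Node::operator std::string() const {
    if (!impl_) return std::string();
    return impl_->text;
}

// Builds the equivalent of {"foolist": [ {"bar": "alice"} ]} by hand.
Node make_sample() {
    std::map<std::string, Node> entry;
    entry.emplace("bar", Node("alice"));
    std::vector<Node> list;
    list.push_back(Node(entry));
    std::map<std::string, Node> top;
    top.emplace("foolist", Node(list));
    return Node(top);
}
```

Given `Node topnode = make_sample();` and `std::size_t index = 0;`, the line `std::string owner = topnode["foolist"][index]["bar"];` reads the nested value without the caller touching any internal type. In this toy a wrong key or index simply yields an invalid Node rather than an error, which is a design choice of the sketch and not a statement about the real library.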

Value semantic smart-pointer wrappers.

A third piece of syntactic sugar that may not be directly obvious is the use of value semantic smart-pointer wrappers in our API.  As a rule, a library should not burden the users of that library with the issue of having to think about resource management. C++ provides the concept of smart pointers to help with correctly managing resources. Exposing and forcing the use of smart pointers can be a proper choice, for example returning an auto_ptr or a unique_ptr from a factory is much better than returning a raw pointer. In the library wrapper we are discussing here however, it may be much more suitable to completely hide the usage of smart pointers (that have pointer semantics) by providing a value semantics wrapper for the smart pointer.

class Node: public AbstractNode {
    boost::shared_ptr<AbstractNode> mNode;
  public:
    Node(AbstractNode *node):mNode(node){}
    // No destructor: the shared_ptr releases the wrapped node when the last copy goes away.
    jsonme::nodetype type(){ return mNode->type();}
    Node operator[](std::string name) { return (*mNode)[name];}
    size_t size() { return mNode->size();}
    Node operator[](size_t index) { return (*mNode)[index];}
    operator Scalar() { return (*mNode);}
};
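The essence of a value semantic smart-pointer wrapper can be shown with very little code. In the following sketch Payload and Handle are hypothetical names, std::shared_ptr from C++11 is assumed in place of boost::shared_ptr, and the counter passed in exists only to make the lifetime observable.

```cpp
#include <memory>

// Hypothetical implementation object that the library wants to hide.
class Payload {
  public:
    explicit Payload(int *live) : live_(live) { ++*live_; }
    ~Payload() { --*live_; }
    Payload(Payload const &) = delete;
    Payload &operator=(Payload const &) = delete;
    int answer() const { return 42; }
  private:
    int *live_;   // counts Payload objects currently alive
};

// What the library user sees: an ordinary copyable value.
class Handle {
  public:
    explicit Handle(int *live) : impl_(std::make_shared<Payload>(live)) {}
    int answer() const { return impl_->answer(); }
  private:
    // No user-written destructor or copy operations are needed: copying a
    // Handle copies the shared_ptr, and the last copy to go away deletes
    // the Payload.
    std::shared_ptr<Payload> impl_;
};
```

Copying a Handle is cheap and never duplicates the Payload, and the user never writes new, delete or a pointer type.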



I hope that with the above examples I’ve shown how adding a little bit of syntactic sugar to your library API can make a lot of sense when you want to spare your library users a relatively steep learning curve. I have myself often been too lazy and presumptuous with respect to my own libraries and their type hierarchy to be thorough and sufficiently frequent in the usage of these constructs in my APIs, but running into JSON-Glib while only needing to parse a config file made me aware once more of the user perspective of those APIs. Remember that if your library is any good, it will have more users than it has developers: users that will be grateful for a reduced learning curve. And if it’s a library for a tiny market, then you may still increase the number of potential users by adding these 3 ingredients to your API. I like all 3 of the syntactic sugar components described above, but to me the C++ cast operator is their absolute champion.

Hypothesis: Reference stealing should be default (POLA) behaviour.

Those of you who have done programming in C++ are likely familiar with the auto_ptr smart pointer, or are familiar with coding standards that caution against the use of this type of smart pointer. With more and more parts of the new C++ standard making their way into C++ compilers, a new smart pointer type, unique_ptr, is now available in the latest and hippest versions of some of the popular C++ compilers as part of the standard library.

This new smart pointer type is said to deprecate the old auto_ptr smart pointer. If we combine the numerous C++ coding standards warning us against the use of auto_ptr with the C++ standard committee deprecating auto_ptr, it is hard not to jump to the conclusion that there must be something very wrong with this type of smart pointer.

In C++ we have multiple ways to allocate objects and other data types. We have two ‘no-worries’ ways of allocating, and one that requires the programmer to think about what he is doing, as the language provides no default resource management for that type of allocation. An object or other data type can be allocated:

  1. On the stack
  2. Part of a composite object
  3. On the heap.

When an object is allocated on the heap using new, the new invocation returns a raw pointer to the newly allocated object. To any seasoned C++ programmer, a bare raw pointer in a random place in the code is an abomination, something that needs fixing. You know when you go to the beach and run into your mom’s old friend wearing a bathing suit that is way too tiny for someone of her age who doesn’t go to the gym every second day, and you think: “damn, please put some clothes on that”?

Well, that’s how any C++ programmer (that is any good at his/her job) looks at raw heap pointers. In C++ there are many clothes to choose from for raw heap pointers. The most elementary clothes are simple RAII objects that wrap a heap object pointer inside a stack object and thus control the heap object’s lifetime. On the other side of the spectrum there is the shared_ptr, which as its name betrays is a shared, reference counted smart pointer, the closest you will get to garbage collection in C++. Somewhere between those two we find a strange little creature called auto_ptr.

The auto_ptr is probably the simplest of all smart pointers. It basically works like a RAII object with pointer semantics most of the time, but it has one funny little property that makes it behave rather oddly: copy and assignment will under the hood result in a move instead of a copy. There are situations where it’s clear that this is the desired behavior, for example when implementing the factory method or the abstract factory design patterns. These patterns can return an auto_ptr instead of a raw pointer, and the move will be exactly what anyone would want and would expect.

The problem however arises when an auto_ptr is used as an argument to a method. A method might steal our pointer by for example assigning it to a temporary or to a member variable of the invoked object.  I believe everyone can see there is a problem in there somewhere, but maybe, just maybe there is actually a glimpse of a solution to a problem hidden in there. A problem by the way that most programmers, especially C++ programmers don’t realize exists.
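The stealing described above can be shown with today’s tools. Since auto_ptr was deprecated in C++11 and removed in C++17, the sketch below uses std::unique_ptr, where the transfer is the same but has to be spelled out by the caller with std::move. Carol, Bob and kept_value are hypothetical names used for illustration only.

```cpp
#include <memory>
#include <utility>

struct Carol { int value; };

class Bob {
  public:
    // Taking the smart pointer by value means the caller has to give it up;
    // bob then keeps it in a member variable.
    void foo(std::unique_ptr<Carol> carol) { kept_ = std::move(carol); }
    int kept_value() const { return kept_ ? kept_->value : -1; }
  private:
    std::unique_ptr<Carol> kept_;
};
```

After `bob.foo(std::move(carol));` the caller’s carol is empty and bob is the sole owner. With auto_ptr the same thing happened on a plain `bob.foo(carol);`, without any visible hint at the call site, which is the behavior the coding standards warn about.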

So let’s have a look at the problem. When trying to build high integrity systems, there is one important principle that we must apply to every aspect of the system design and development. This principle is called the Principle Of Least Authority, or POLA for short. According to this principle we should, to put it simply, reduce the dependencies of each piece of code to the minimum possible. So what type of dependencies do we see in our pointer stealing example? We see an object Bob with a method foo, and we see that this foo method takes some kind of reference to a Carol object. We see some object Alice, with a reference to Bob and a reference to Carol, that invokes:

  • bob->foo(somewrapper(carol))

So what does bob->foo do with carol and, more importantly, if we treat bob and its foo method as a black box, what would we want bob->foo to be able to do with carol? Basic usable security doctrine states that the default action should be the most secure one. So what would be the sane, most secure default here? Basically this depends on the concurrency properties of Bob with respect to Alice. First let’s assume that bob is a regular (non-active) object. Then the least authority option would be for somewrapper to be a RAII style temporary object that revokes access to carol when it goes out of scope. That way bob’s access to carol would be scoped to the foo invocation.
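One way such a scoped somewrapper could look is sketched below. Everything in it is an assumption made for illustration: ScopedGrant plays the role of somewrapper, Revocable is what bob receives, and the shared slot between them is just one simple way to implement revocation. It is not thread safe, and since bob could still copy the raw pointer out of get(), it only makes overstaying awkward, in line with C++ not having true capabilities.

```cpp
#include <memory>

struct Carol { int value; };

// What bob receives: a view that goes dead when the grant ends.
template <typename T>
class Revocable {
  public:
    Revocable() {}
    explicit Revocable(std::shared_ptr<T *> slot) : slot_(slot) {}
    T *get() const { return slot_ ? *slot_ : nullptr; }   // nullptr once revoked
  private:
    std::shared_ptr<T *> slot_;
};

// Plays the role of 'somewrapper': access lasts as long as this object lives.
template <typename T>
class ScopedGrant {
  public:
    explicit ScopedGrant(T &target) : slot_(std::make_shared<T *>(&target)) {}
    ~ScopedGrant() { *slot_ = nullptr; }                  // revoke
    ScopedGrant(ScopedGrant const &) = delete;
    ScopedGrant &operator=(ScopedGrant const &) = delete;
    Revocable<T> handle() const { return Revocable<T>(slot_); }
  private:
    std::shared_ptr<T *> slot_;
};

class Bob {
  public:
    int foo(Revocable<Carol> carol) {
        stash_ = carol;                 // bob tries to hang on to carol
        return carol.get()->value;
    }
    bool can_still_reach_carol() const { return stash_.get() != nullptr; }
  private:
    Revocable<Carol> stash_;
};

// Alice's side of the invocation.
int alice_invokes(Bob &bob, Carol &carol) {
    ScopedGrant<Carol> grant(carol);
    return bob.foo(grant.handle());
}   // grant is destroyed here, so bob's stashed handle no longer reaches carol
```

During foo bob can use carol normally; once alice_invokes returns, the handle bob stashed away yields a null pointer.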

Basically, when building software using a multi threading or multi process architecture, there are two modes of concurrency we could use.

  1. Shared state concurrency using raw threads, mutexes, semaphores, etc
  2. Message passing concurrency using communicating event loops, lock-free FIFOs, active objects etc.

As with shared state in general, when we are looking for high system integrity, shared state concurrency tends to be the solution we should steer away from. So basically message passing concurrency should be the way to go here.

Now let’s assume that we are using a message passing style of concurrency where alice and bob (and not necessarily carol) are active objects. What would be the sane least authority action for this invocation? Scoping access to the foo invocation would make absolutely no sense in this scenario, as the foo method will be handled asynchronously by bob. This means we do need to give bob a reference to carol that will not be automatically revoked. Given that carol is not necessarily an active object, this reference would make carol an object that is shared between two concurrently running active objects. Sharing between two threads of operation implies quite some more authority, and thus less implicit integrity for the overall system, so the safest default operation may be exactly what we considered to be wrong with auto_ptr in C++. With single core computers becoming hard to find and the normal number of cores in a computer still going up, I believe it’s safe to say that we should assume modern programs will use some form of concurrent processing, which according to POLA should probably be message passing concurrency. That is, the foo method should by default steal the reference to carol, and alice should be explicit if she wants to retain her own copy of the reference to carol. This line of reasoning brings me to the following hypothesis with respect to least authority.

  • Reference stealing should be default (POLA) behavior for method invocation.

If this hypothesis is right, then the C++ auto_ptr flaw that the new C++ standard claims to fix by deprecating auto_ptr might actually not be a flaw that needs fixing, but a starting point for making high integrity concerns into default behavior, in line with usable security doctrine.

Why garbage collection is anti-productive.

While the non GC language C(++) is still the most popular language in the world (at least if we count C and C++ as the same language, if we look at them as separate languages, Java is the most popular language), GC languages as a whole currently make up a larger share than non GC languages, and new languages seem to build on virtual machine environments that mandate GC. It thus would be good to look at what GC has gotten us so far.

Let’s start by looking at what GC is for. GC is basically a form of automated resource management for one specific type of resource: Random Access Memory or RAM. In computer programming there are multiple types of resources: RAM, open file handles, locks, threads, etc. GC automates resource management for just one of these resource types: memory.

This means that in a GC language there are two types of resource: memory (a resource you should no longer worry about), and anything else, for which you should do some manual resource management.

Looking at this, it would seem that GC is a pretty good deal. You don’t have to worry anymore about resource management for memory so you gain productivity there, and for other resources nothing changes. But it is this last assumption that ends up as a giant mistake.

To understand why thinking that nothing changes for non-memory resource management in GC languages is a mistake, let’s have a look at how a C++ programmer would do resource management on raw memory.

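class RawMemoryWrapper {
    char *mRaw;
    // Not copyable: two wrappers must never delete the same buffer.
    RawMemoryWrapper(RawMemoryWrapper const &);
    RawMemoryWrapper &operator=(RawMemoryWrapper const &);
  public:
    RawMemoryWrapper(size_t bytes):
        mRaw(new char[bytes]){}
    ~RawMemoryWrapper(){ delete []mRaw;}
    operator char * () { return mRaw;}
};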

We see a special class (referred to in C++ as a RAII class) that wraps the resource. When a RAII object is created, its constructor acquires the resource, hence the name ‘Resource Acquisition Is Initialization’ or RAII. When the RAII object goes out of scope, its destructor is called and the resource is released automagically. A third construct, which is not actually part of the RAII idiom but is often used together with it, is a C++ cast operator that allows RAII classes to be used as a drop-in replacement for the resource they wrap. RAII helps wrapping any resource we might want to use in a fully encapsulating way. You can make composite classes with RAII member variables or without them; there is nothing transitive about resources in RAII, which is a good thing from an OO perspective.
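The same idiom works unchanged for resources other than memory. As a minimal sketch, assume a hypothetical scarce resource, SlotPool, with a fixed number of slots, and a RAII class SlotLease that claims one slot for as long as it lives; both names are invented for this example.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical scarce resource: a pool with a fixed number of slots.
class SlotPool {
  public:
    explicit SlotPool(std::size_t capacity) : capacity_(capacity), in_use_(0) {}
    bool acquire() {
        if (in_use_ == capacity_) return false;
        ++in_use_;
        return true;
    }
    void release() { --in_use_; }
    std::size_t in_use() const { return in_use_; }
  private:
    std::size_t capacity_;
    std::size_t in_use_;
};

// RAII class: the constructor acquires a slot, the destructor releases it.
class SlotLease {
  public:
    explicit SlotLease(SlotPool &pool) : pool_(pool) {
        if (!pool_.acquire()) throw std::runtime_error("pool exhausted");
    }
    ~SlotLease() { pool_.release(); }
    SlotLease(SlotLease const &) = delete;
    SlotLease &operator=(SlotLease const &) = delete;
  private:
    SlotPool &pool_;
};

// Both leases are alive while the return value is computed and are
// released when the function returns, on every exit path.
std::size_t peak_usage(SlotPool &pool) {
    SlotLease first(pool);
    SlotLease second(pool);
    return pool.in_use();
}
```

The caller of peak_usage never sees a release call; leaving the scope is the release.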

Although a RAII class isn’t that big, for memory it’s clear there IS overhead in having to write RAII classes when compared to GC. From this we might conclude that this alone makes GC a winner; after all, if on one front we win and on another front we don’t lose, we would have won. This assumption however is wrong because of one elementary property of GC: GC works with the abstraction of eternal objects. If an object goes out of scope, basically nothing happens. No destructor is called, no memory is freed. In most GC implementations, even when the memory the object held is freed, no destructor is called. This property of GC basically means that GC excludes the possibility of using the RAII idiom on non-memory resources.

So what is the cost of not being able to use RAII for non-memory resources in GC languages? This cost can be expressed with a statement that shocked me when I grasped its implications:

  • Without RAII, ‘being a resource’  becomes transitive to composition.

Once you grasp what this means for object orientation, you will be as shocked as me by the popularity of GC and about the fact that the productivity gain that GC offers is so clearly a myth. To me it has become absolutely clear that GC does not only not improve resource management related productivity, it actually decreases resource management related productivity in such a dramatic way that the persistence of the productivity myth is absolutely shocking.

Imagine that you have a relatively large program. You will be pretty sure to have many resources in need of manual management. With RAII classes, these resources could be managed once, deep down in some library code. In a large program, compositional patterns may easily go 10 or 20 layers deep. Without RAII, given the transitive nature of being a resource, every class at each of these layers that is composed from a resource becomes a resource itself. The result is that there would hardly be any non-resource classes left within the higher abstraction layers of such a project.
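For contrast, this is what the layering looks like with RAII in C++. In the sketch below only the bottom class knows it holds a resource; the three layers above it are ordinary classes with no destructor and no close method. All class names are hypothetical and the counter only serves to make the release observable.

```cpp
// Hypothetical resource that counts how many instances are open.
class Connection {
  public:
    explicit Connection(int &open) : open_(open) { ++open_; }
    ~Connection() { --open_; }
    Connection(Connection const &) = delete;
    Connection &operator=(Connection const &) = delete;
  private:
    int &open_;
};

// Layer 1: composed from a resource, yet contains no resource handling code.
class Repository {
  public:
    explicit Repository(int &open) : connection_(open) {}
  private:
    Connection connection_;
};

// Layer 2
class Service {
  public:
    explicit Service(int &open) : repository_(open) {}
  private:
    Repository repository_;
};

// Layer 3
class Application {
  public:
    explicit Application(int &open) : service_(open) {}
  private:
    Service service_;
};

// The connection is open while app lives and closed when it goes out of scope.
int open_while_running(int &open) {
    Application app(open);
    return open;
}
```

Without deterministic destructors, each of Repository, Service and Application would need its own explicit close or dispose method, and every user of each of them would have to remember to call it.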

So we started off trying to reduce the amount of manual resource management, but what happened? We exploded the amount of manual resource management, and in the process broke the encapsulation principle of good OO programming practice. A user of a class should not need to know whether the class being used happens to be composed using non-memory resources. We can clearly state that:

  • GC breaks OO encapsulation.

But it doesn’t end there. It gets worse. Given that we are using OO design and polymorphism, the interface of a class should not change depending on the class implementation. So if we can conceive of both an implementation that is composed without non-memory resources and one that is composed with them, then given the transitive nature of being a resource in GC languages, we MUST assume when designing an interface that there might be an implementation that is composed using non-memory resources. This means that we need to treat any polymorphic interface as a resource, even if its implementation isn’t composed using any actual non-memory resources. We can thus come to the following conclusion:
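Again for contrast, in C++ with RAII a polymorphic interface does not have to advertise whether an implementation holds resources. The following sketch assumes a hypothetical Store interface with one implementation that owns only memory and one that also holds a simulated lock; the caller treats both identically.

```cpp
#include <map>
#include <memory>
#include <string>

// The interface says nothing about resources; it only has a virtual destructor.
class Store {
  public:
    virtual ~Store() {}
    virtual void put(std::string const &key, int value) = 0;
    virtual int get(std::string const &key) const = 0;
};

// Implementation that owns nothing but memory.
class MemoryStore : public Store {
  public:
    void put(std::string const &key, int value) override { values_[key] = value; }
    int get(std::string const &key) const override {
        std::map<std::string, int>::const_iterator found = values_.find(key);
        return found == values_.end() ? 0 : found->second;
    }
  private:
    std::map<std::string, int> values_;
};

// Implementation that additionally holds a (simulated) non-memory resource.
class LockedStore : public MemoryStore {
  public:
    explicit LockedStore(int &locks_held) : locks_held_(locks_held) { ++locks_held_; }
    ~LockedStore() override { --locks_held_; }
  private:
    int &locks_held_;
};

// The caller neither knows nor cares which implementation it was given;
// the store is destroyed, and any lock released, when the call is over.
int round_trip(std::unique_ptr<Store> store) {
    store->put("answer", 42);
    return store->get("answer");
}
```

In a language without deterministic destruction, Store itself would need a close method just in case, and round_trip would have to call it for every implementation.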

  • GC breaks OO polymorphism.

This in turn means that our already larger need for manual resource management expands once more to even larger dimensions.

So lets look at all the facts once more:

  • Without RAII, ‘being a resource’  becomes transitive to composition.
  • GC breaks OO encapsulation.
  • GC breaks OO polymorphism.
  • As a result GC ‘explodes’ the need for manual resource management and thus reduces productivity.

With GC having started out as a way to reduce manual resource management and increase productivity, it’s pretty clear that GC has ended up being one big mistake.