Why garbage collection is anti-productive.

While the non-GC language C(++) is still the most popular language in the world (at least if we count C and C++ as one language; counted separately, Java is the most popular), GC languages as a whole currently hold a larger share than non-GC languages, and new languages tend to build on virtual machine environments that mandate GC. It is therefore worth looking at what GC has gotten us so far.

Let’s start by looking at what GC is for. GC is basically a form of automated resource management for one specific type of resource: Random Access Memory, or RAM. In computer programming there are multiple types of resources: RAM, open file handles, locks, threads, and so on. GC automates resource management for just one of these: memory.

This means that in a GC language there are two types of resource: memory (a resource you no longer have to worry about) and everything else, for which you still have to do manual resource management.

Looking at this, it would seem that GC is a pretty good deal. You don’t have to worry anymore about resource management for memory so you gain productivity there, and for other resources nothing changes. But it is this last assumption that ends up as a giant mistake.

To understand why it is a mistake to think that nothing changes for non-memory resource management in GC languages, let’s have a look at how a C++ programmer would do resource management on raw memory.

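#include <cstddef>

// RAII wrapper around a raw heap buffer.
class RawMemoryWrapper {
    char *mRaw;
public:
    explicit RawMemoryWrapper(std::size_t bytes) : mRaw(new char[bytes]) {}
    ~RawMemoryWrapper() { delete[] mRaw; }
    // Copying would lead to a double delete, so it is disabled.
    RawMemoryWrapper(const RawMemoryWrapper &) = delete;
    RawMemoryWrapper &operator=(const RawMemoryWrapper &) = delete;
    // Conversion operator: lets the wrapper stand in for the raw pointer.
    operator char *() { return mRaw; }
};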

We see a special class (referred to in C++ as a RAII class) that wraps the resource. When a RAII object is created, its constructor acquires the resource, hence the name ‘Resource Acquisition Is Initialization’, or RAII. When the RAII object goes out of scope, its destructor is called and the resource is released automatically. A third construct, not actually part of the RAII idiom but often used together with it, is a C++ conversion operator that allows a RAII object to be used as a drop-in replacement for the resource it wraps. RAII lets us wrap any resource we might want to use in a fully encapsulating way. You can make composite classes with or without RAII member variables; there is nothing transitive about resources in RAII, which is a good thing from an OO perspective.
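The same idiom carries over to non-memory resources. The following is a minimal sketch, not a definitive implementation: `HandleTable` and `ScopedHandle` are hypothetical names invented for illustration, and the table merely counts open handles as a stand-in for something like an OS file handle.

```cpp
#include <cassert>

// Hypothetical stand-in for an OS-level resource table (e.g. file handles).
// It only counts how many handles are currently open.
struct HandleTable {
    int open = 0;
    void acquire() { ++open; }
    void release() { --open; }
};

// RAII wrapper: acquires in the constructor, releases in the destructor.
class ScopedHandle {
    HandleTable &mTable;
public:
    explicit ScopedHandle(HandleTable &table) : mTable(table) { mTable.acquire(); }
    ~ScopedHandle() { mTable.release(); }
    ScopedHandle(const ScopedHandle &) = delete;
    ScopedHandle &operator=(const ScopedHandle &) = delete;
};

// Returns the number of handles still open after the inner scope has ended.
int openHandlesAfterScope() {
    HandleTable table;
    {
        ScopedHandle handle(table);
        assert(table.open == 1);  // acquired on construction
    }                             // destructor runs here, no explicit release call
    return table.open;
}
```

The caller never writes a release call; leaving the scope is enough.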

Although a RAII class isn’t that big, for memory it’s clear there IS overhead in having to write RAII classes compared to GC. From this we might conclude that this alone makes GC a winner; after all, if we win on one front and don’t lose on another, we have won. This assumption, however, is wrong because of one elementary property of GC: GC works with the abstraction of eternal objects. If an object goes out of scope, basically nothing happens. No destructor is called, no memory is freed. In most GC implementations, even when the memory the object held is eventually freed, no destructor is called. This property of GC basically means that GC excludes the possibility of using the RAII idiom for non-memory resources.

So what is the cost of not being able to use RAII for non-memory resources in GC languages? This cost can be expressed with a thing that shocked me when I grasped its implications:

  • Without RAII, ‘being a resource’  becomes transitive to composition.

Once you grasp what this means for object orientation, you will be as shocked as me by the popularity of GC and about the fact that the productivity gain that GC offers is so clearly a myth. To me it has become absolutely clear that GC does not only not improve resource management related productivity, it actually decreases resource management related productivity in such a dramatic way that the persistence of the productivity myth is absolutely shocking.

Imagine that you have a relatively large program. You are pretty sure to have many resources in need of manual management. With RAII classes, these resources could be managed once, deep down in some library code. In a large program, compositional patterns may easily go 10 or 20 layers deep. Without RAII, given the transitive nature of being a resource, every class at each of these layers that is composed using a resource becomes a resource itself, so hardly any non-resource classes remain within the higher abstraction layers of such a project.

So we started off trying to reduce the amount of manual resource management, but what happened? We exploded the amount of manual resource management and in the process broke the encapsulation principle of good OO programming practice. A user of a class should not need to know whether the class being used happens to be composed using non-memory resources. We can clearly state that:
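To make the transitivity concrete, here is a rough sketch that emulates in C++ what code looks like when destructors are not available for releasing resources. All names (`ManualFile`, `ManualFifo`, `ManualQueue`, `fileLeaks`) are hypothetical, and the "file" is just a flag.

```cpp
#include <cassert>

// Hypothetical resource with no destructor-based release, as in a GC language.
struct ManualFile {
    bool isOpen = true;
    void close() { isOpen = false; }
};

// The fifo owns a file, so it must expose close() and becomes a resource itself.
struct ManualFifo {
    ManualFile file;
    void close() { file.close(); }
};

// The queue owns a fifo, so the obligation propagates one layer further up.
struct ManualQueue {
    ManualFifo fifo;
    void close() { fifo.close(); }
};

// Returns whether the deep file is still open once the caller is done.
bool fileLeaks(bool callerRemembersToClose) {
    ManualQueue queue;
    if (callerRemembersToClose) {
        queue.close();
    }
    return queue.fifo.file.isOpen;
}
```

Every layer needs its own forwarding `close()`, and every user of every layer has to remember to call it; forgetting it at any level leaves the file open.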

  • GC breaks OO encapsulation.

But it doesn’t end there. It gets worse. Given that we are using OO design and polymorphism, the interface of a class should not change depending on its implementation. So if we can conceive of a class being implemented either without or with a composition that uses non-memory resources, then given the transitive nature of being a resource in GC languages, we MUST assume when designing an interface that there might be an implementation composed using non-memory resources. This means that we need to treat any polymorphic interface as a resource, even if its current implementation isn’t composed using any actual non-memory resources. We can thus come to the following conclusion:

  • GC breaks OO polymorphism.

This in turn means that our already larger need for manual resource management expands once more to even larger dimensions.

So let’s look at all the facts once more:
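For contrast, a small sketch of how the RAII side handles the polymorphic case, assuming C++14 for `std::make_unique`. The names `AbstractFifo`, `MemoryFifo`, `HandleFifo` and `useFifo` are hypothetical, and the non-memory resource is simulated with a counter.

```cpp
#include <cassert>
#include <deque>
#include <memory>

// The interface only needs a virtual destructor; it says nothing about resources.
struct AbstractFifo {
    virtual ~AbstractFifo() = default;
    virtual void push(int value) = 0;
};

// Implementation that uses memory only.
class MemoryFifo : public AbstractFifo {
    std::deque<int> mItems;
public:
    void push(int value) override { mItems.push_back(value); }
};

// Implementation that holds a simulated non-memory resource (counted handle).
class HandleFifo : public AbstractFifo {
    int &mOpenHandles;
public:
    explicit HandleFifo(int &openHandles) : mOpenHandles(openHandles) { ++mOpenHandles; }
    ~HandleFifo() override { --mOpenHandles; }
    void push(int) override {}  // a real implementation would write to the handle
};

// User code is identical for both implementations and contains no release call.
void useFifo(std::unique_ptr<AbstractFifo> fifo) {
    fifo->push(42);
}  // the concrete destructor runs here

// Returns the number of simulated handles still open after both uses.
int openHandlesAfterUse() {
    int openHandles = 0;
    useFifo(std::make_unique<MemoryFifo>());
    useFifo(std::make_unique<HandleFifo>(openHandles));
    return openHandles;
}
```

Here the interface stays free of any release method, whichever implementation ends up behind it.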

  • Without RAII, ‘being a resource’  becomes transitive to composition.
  • GC breaks OO encapsulation.
  • GC breaks OO polymorphism.
  • As a result GC ‘explodes’ the need for manual resource management and thus reduces productivity.

With GC having started out as a way to reduce manual resource management and increase productivity, it’s pretty clear that GC has ended up being one big mistake.

5 responses to “Why garbage collection is anti-productive.”

  1. All of the above is minor … and seems to be just an excuse to favor C++

    e.g. on a list in order of importance:
    1. GC improves productivity
    2. GC reduces bugs

    90. In Java (but not really C#) resource management is more difficult.

    That said, there is a case for non-GC languages in high-performance libs which have a high staff-to-functionality ratio. This is one of the things BitC hopes to fix.

  2. I would say:
    1) GC improves productivity up to the point where the reduced overhead of not having to do RAII for memory is overshadowed by the increased overhead of managing composite objects that contain deep non-memory resources.
    2) GC reduces bugs up to the point where the bug reduction from automated memory management is overshadowed by the increased bug count from accidentally failing to treat composite objects that contain deep non-memory resources as resources themselves.

    So it’s basically a scalability issue. GC works great on a small scale, reasonably on a mid scale, and ends up doing the opposite of what it set out to do (that is, it reduces productivity and increases bugs) on a large scale.

    The above is not really an excuse to favor C++, but my experience with and knowledge of GC is limited to JVM-based languages, so my view may be quite biased on that end.

  3. Do I understand right that the first drawback (Without RAII, ‘being a resource’ becomes transitive to composition) makes sense only when we use the composition pattern? And if we do not use it, then there are no problems?

    Could you make a small code example for the first drawback? For example, if we take a code sample from Wikipedia and replace system output by output from a file in print() method, will the code suffer from transitivity?

    • Hi Fralik, it’s not just about the pattern named ‘composite pattern’, it’s about any ‘part of’ relationship (like a wheel being part of a car). It’s about what the following two statements imply about B.

      1) A is a resource in need of manual resource management
      2) B is composed using one or more A

      If these statements imply:

      3) B is a resource in need of manual resource management

      then ‘being a resource’ is transitive to composition. In most larger projects you will have many layers of composition, for example we might have:

      * Open file handle ‘mFile’ (resource)
      * A fifo ‘mFifo[0]’ that is composed using mFile
      * A priority queue ‘mQueue’ composed using mFifo
      * A worker ‘mWorker[0]’ composed using mQueue
      * A multiworker ‘multiworker’ composed using mWorker

      With RAII, only the direct user of the resource (the fifo) needs to do resource management. The user of a fifo doesn’t, as a fifo isn’t a resource. Without RAII, ‘being a resource’ is transitive to composition. So anywhere you instantiate a fifo, a priority queue, a worker or a multiworker, you need to do manual resource management.
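As a rough illustration of the RAII side of these layers, here is a sketch under stated assumptions: the class names mirror the list above but are otherwise hypothetical, the open file handle is simulated with a counter, and the multiworker holds two workers as separate members.

```cpp
#include <cassert>

// Simulated open file handle; the counter tracks how many are currently open.
class File {
    int &mOpen;
public:
    explicit File(int &openCount) : mOpen(openCount) { ++mOpen; }
    ~File() { --mOpen; }
    File(const File &) = delete;
    File &operator=(const File &) = delete;
};

// Each layer simply has the layer below as a member; none declares a destructor.
class Fifo {
    File mFile;
public:
    explicit Fifo(int &openCount) : mFile(openCount) {}
};

class PriorityQueue {
    Fifo mFifo;
public:
    explicit PriorityQueue(int &openCount) : mFifo(openCount) {}
};

class Worker {
    PriorityQueue mQueue;
public:
    explicit Worker(int &openCount) : mQueue(openCount) {}
};

class MultiWorker {
    Worker mWorkerA;
    Worker mWorkerB;
public:
    explicit MultiWorker(int &openCount) : mWorkerA(openCount), mWorkerB(openCount) {}
};

// Returns the number of simulated file handles open after the scope has ended.
int openHandlesAfterMultiWorker() {
    int openCount = 0;
    {
        MultiWorker multiworker(openCount);
        assert(openCount == 2);  // one file per worker, four layers down
    }                            // both files are released here
    return openCount;
}
```

Only `File` contains any release logic; the four layers above it are ordinary classes.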

      In the above example we ‘know’ there is a deep resource hidden in, for example, a priority queue. When we add polymorphism, we don’t know whether there could be a deep resource somewhere down below. Our current fifo might be implemented without a file handle, but nothing tells us whether a future AbstractFifo implementation might need manual resource management. Let’s show some example code to illustrate:

      void dosomething(AbstractFifoFactory &fifofactory) {
          PriorityQueue composit(fifofactory);
          // ... any code
      }

      If we were sure our current and future fifos would all be implemented without deep resources, the above code would be OK both with and without RAII. Given the very nature of polymorphism, however, we are never sure of this. So while with RAII the above code remains unchanged even with potential deep resources, without RAII we would need to do just-in-case manual resource management here:

      // Java-style pseudocode: without RAII the caller has to release explicitly.
      void dosomething(AbstractFifoFactory fifofactory) {
          PriorityQueue composit = new PriorityQueue(fifofactory);
          try {
              // any code
          } finally {
              composit.release();
          }
      }

  4. Just for the record: in C# we have the Disposable idiom, which is close to RAII scoped semantics, e.g.:
    using (var resource = new SomeResource()) // SomeResource: any class implementing IDisposable
    {
        // do something with resource
    } // Dispose() is called at the end of the scope, even if an exception is thrown

    But on your point about breaking encapsulation principles:
    if I have a class that is first implemented internally by storing everything in a MemoryStream and I then decide to switch to a FileStream, I have to introduce a Dispose() method and, by convention, expect that you will call it. Either that or, worse, I need a complete redesign now 😦
