# Tag Archives: MinorFs

## Rumpelstiltskin and his children: Part 2

In this post, I tried to explain the Rumpelstiltskin tree-graph algorithm. Apparently in the second part of my explanation, the part explaining the actual algorithm, I ‘ve ended up skipping a few steps that were apparently essential for understanding what the algorithm is supposed to do, and how it relates to the directional tree. So in this follow-up I’ll try to fill up the holes I left in my previous post.

So lets start off with the directional tree graph from my previous post:

The first thing that is important here is that each node in this graph has both a strong name (a password capability) and a weak name (a name that carries no authority by itself. The arcs between the nodes above are the strong names.

Now consider that each node has knowledge only of its own strong name and of the weak names of its children. If we were only concerned with maintaining the unidirectionality of our tree graph, than the following image would allow us to derive a child strong name knowing the parent’s strong name and the child’s weak name:

So if we have the strong parent name ‘Rumpelstiltskin’ and the weak child name ‘Bob’, we could use some kind of one way function to determine the strong name of that child. In this case ‘Slardibartfast’. Having the name ‘Rumpelstiltskin’ thus implies authorty to the parent node and all of its children, but the name ‘Slardibartfast’ impies no knowledge or authority over regarding the parent node.

Next to decomposition, there is an other aspect we are interested in: attenuation. If attenuation was the only aspect we were interested in, than an other use of a one way hash function would be usable:

In this picture, our node comes with two strong names. The first name gives unattenuated authority to the node. Using a one way hash function we create a second strong name for the same node. This second name implies attenuated access to the node, In the case of a file-system, this second strong name could be a read-only capability to a file or directory.

Now comes the interesting part: we want to combine the concept of decomposition with the concept of attenuation in a single algorithm in such a way that, and this is the essential quality of the Rumpelstiltskin tree-graph algorithm: “attenuated decomposition results in the same strong name as would decomposed attenuation”.

On a first glance it would seem that we’ve just combined the two diagrams, but there is an essential addition: the use of a secret salt. Again like in the attenuation only diagram, we use a one way hash to create a strong name for attenuated access, but instead of using the unattenuated strong name to determine the unattenuated strong name for the child, we now use the strong name that implies attenuated access to the parent node to create a strong name that implies unattenuated access to the child node. Without the secret salt, doing so would imply that someone with attenuated authority to the parent could unduly gain unattenuated access to the child.  By introducing the secret salt to the second hash operation, we allow the parent node to act as proxy for all decomposition operations. That is, the parent node would need to:

• Allow decomposing unattenuated authority to a parent to strong names implying unattenuated authority to the child.
• Allow decomposing attenuated authority to a parent to strong names implying attenuated authority to the child.
• NOT allow decomposing attenuated authority to a parent to strong names implying unattenuated authority to the child.

The fact that the holder of the strong name does not have access to the secret salt ensures that these properties can be guaranteed by the algorithm.Looking at the above diagram, its easy to see that attenuation of decomposition and decomposition of attenuation lead to exactly the same strong names. In order to do so we had to introduce a (server side) secret salt in order to disallow unintended privilege escalation.

We’re almost there, but there is still one more step to consider. Getting back to our imp, we must consider ‘where does our imp live? Or for a file-system, where is the serialisation of this node actually stored, and how is it stored? Lets look at the case of a file-system. Lets consider the nodes are serialized to a file-system that is supplied by some cloud storage provider that we do not fully trust. We would want to encrypt the file. Given that the node may be accessed with any of the two strong names, we can’t use the unattenuated access capability here, but we could use the strong name that implies attenuated authority as an encryption key. Doing so however implies we should never disclose either of our strong names to the cloud storage provider.  As you may expect by now, again the one-way hash operation comes to the rescue. We use a third one time hash operation to calculate the relative path where the encrypted  node serialisation is to be stored.   If we want to consider the scenario that there are entities that posses attenuated access strong names and also posess write access to our cloud storage directories, than we could again add a secret salt to this third hash operation, but in most cases it would seem that salting would not be needed for the third hash operation.

I hope that this follow up blog post helps in the wider comprehension of what I think is a very interesting algorithm for creating attenuatable decomposable unidirectional tree graphs using password-capabilities/strong-names.

## Rumpelstiltskin and his children

When communicating about my efforts on the upcoming version of my least authority file-system collection, I often seem unable to fully communicate the base design of the core capability-based file-system, even to many cryptographically educated people. In this post I’ll try to elaborate on the algorithm that lies at the center of all my efforts: The Rumpelstiltskin hash-tree algorithm.

Rumpelstilskin

In a fairy tale by the Grimm brothers that most of you will probably know by heart, a girl, as a result of her fathers big mouth, found herself in a rather problematic situation .  The king had locked her in a tower room filled with straw and had declared that she be decapitated unless she would spin the straw into gold by morning, An imp showed up and offered to help her out in exchange for her necklace, and the girl gladly accepted the offer. Next day, larger room with more straw, same story now in exchange for her ring. third and final day, even a larger room with even more straw, but as the girl has nothing left to give in exchange, the imp makes the girl promise to give up her firstborn child in exchange for his services.

Later in the story, the girl, now a queen married to the king is visited again by the imp who comes to claim his price. After pleading with the imp, the imp agrees that he will give up his claim if the queen guesses his true name within 3 days. When by chanche the queens messenger overhears the imp chanting:

“Today I bake, tomorrow I brew, then the Queen’s child I shall stew. For nobody knows my little game, for Rumpelstiltskin is my name.”

Learning his name, the queen is able to wield authority over the evil imp, forcing him to give up his claim to her firstborn.

You could say that the unguessable name “Rumpelstiltskin” is a token of authority not unlike what in information security theory we refer to as a capability, or more precisely a password-capability.

Directional tree graph

The CapFs file-system I’m working on, like most file-systems constitutes a tree. Unlike many file-systems though, the CapFs tree structure aims to be purely directional. That is, while in DOS, Unix, OSX or Linux we are used to navigate ‘up’ using a command like “cd ..”, the CapFs ‘directional’ tree adheres to the rule that:

“Authority to a node implies authority to its children but NOT to its parent.”

Names and true names

In our fairy tale, the imp its true name was Rumpelstiltskin (cap). This was the name that implied authority over the imp. The imp might have had an other non authoritative name used by others to address him, but if he had it was not relevant for the story.  Maybe Rumpelstiltskin himself did not have an other non authoritative name, but if he had any children, and those children had true names like him that could be used to wield authority over them, Rumpelstiltskin would probably often address his children by their non authoritative names.

Now talking of authority, if Rumpelstiltskin (cap) would want to delegate the authority to one of his children to an other imp, what options would there be? Lets assume Rumpelstiltskin (cap) had a kid named Slartibartfast (cap) for who he used the casual (non authoritative) name Bob.  There would be two ways someone could come to wield authority over Bob, either by using his true name Slartibartfast or by having authority over his parent Rumpelstiltskin that by proxy would give authority over Bob.  There are thus two authoritive ways to designate Bob:

1. Slartibartfast
2. Rumpelstiltskin::Bob

Back to our directional tree

So if we project Rumpelstiltskin and Bob to our directional tree graph, we could say that Rumpelstiltskin might be our tree’s root node. Rumpelstilskin being the root node has no name but his true authority carrying name. Bob and any other of Rumpelstiltskin’s descendants would have two names:

• A normal name that implies no direct authority (name)
• An unguessable name that implies authority (capability)

Attenuation

Where using a name and a capability (true name) for each non-root node in our tree takes care of making authority to branches and leafs decomposable, the authority to any sub branch that is implied by a capability  is still absolute. If we look back at our file-system example, one might want to delegate the authority to read without delegating the authority to write. We could provide non-root nodes with a second capability that would imply attenuated authority to a node, for example read only.  So now we have:

• A normal name that implies no direct authority (name)
• An unguessable name that implies FULL authority (capability)
• An unguessable name that implies attenuated authority (capability)

The algorithm

Given that we have only a single type of attenuation (for example read-only), there is an interesting algorithm we can use to securely calculate new  unguessable names for non root node’s given its name and the unguesable name of its parent.

In the above diagram Key1 would be the equivalent of ‘Rumpelstiltskin’, Key2 would be the equivalent of a read-only capability to Rumpelstiltskin. The subnode name would for example be ‘Bob’,  and Subnode Key1 would be the equivalent of Slartibartfast.

Going from Rumpelstiltskin to a read-only attenuated capability to Rumpelstiltskin would be straight forward. We basically take an HMAC hash of the unattenuated authority capability (Rumpelstiltskin) and some static string, and use a representation of the resulting key2 as attenuated authority capability.  There is no secret salt needed in this step, so a shared static string is used instead.

For going from a parent capability to a child capability, there are two scenario’s:

• From unattenuated parent capability to unattenuated child capability.
• From attenuated parent capability to attenuated child capability.

To allow both scenario’s to be used, the calculation of the unattenuated child capability is done by taking an HMAC hash of the parent attenuated authority capability and the child’s name,  together with a secret salt.

The secret salt makes sure that we can create a system where a client holding an attenuated authority capability can ask a server for an attenuated authority child capability, but can’t itself derive an unattenuated authority child capability from an attenuated authority parent capability.

Secure persistence

When looking at the diagram of the Rumpelstiltskin hash-tree algorithm, we see there is is yet a HMAC operation and a key3 that we did not yet discus. The problem is that if we want to implement something of persistence using relatively public storage, for example with a cloud storage provide we don’t fully trust to keep our big bag of secrets secret, we will want to:

• encrypt physical file’s
• store our files in a way that does not disclose capabilities.

In order to address the first issue, we shall let our attenuated authority capability double as file encryption key for our file. We now use a third HMAC hash operation to create a 4th name that we use to determine the location where the encrypted file is stored.

High-security versus high-scalability

Regarding the final HMAC hashing steps there are two possible implementation mode’s for the Rumpelstiltskin hash-tree algoritm: high-security and high-scalability.  In the high scalability version, encryption and decryption could potentially be client side operations, and the server takes care only of enforcing read-only attenuation. In this mode, the storage path (key3) is calculated using a HMAC hash of the attenuated authority capability together with a static string. If in contrast we value security so much that we are willing to sacrifice the scalability that comes with client side en/de-cryption in order to prevent a cloud storage provider with access to an attenuated authority (read-only) capability to combine his explicit and his implicit authority into an unattenuated authority capability, than high-security mode would include the servers secret salt in the HMAC calculation of key4, mandating server side encryption and decryption.

In my own project I shall be using the Rumpelstiltskin hash-tree algorithm in high-scalability mode for now. Maybe in the future switching to  high-security may turn out to be desired, but for now the hole plugged by giving up the potential for adding scalability seems to small to justify the price of using high-security mode.

Conclusion

I hope reading this blog post somewhat clarifies the Rumpelstiltskin hash-tree algorithm, and I hope others may find the algoritm usefull for other projects as well. I’ll be using this algorithm in MinorFs2, but I feel it might have much broader usability than just user space file-systems. Please use this algorithm as you please, and comment on this post if you think this information is usefull or still needs clarifications somewhere.

## MinorFs2

People who have read my blog, have read my article, of been at any of my public talks know about the problems of the unix \$HOME and \$TEMP facilities. MinorFs, a set of least authority file-systems aimed to solve this problem in a relatively ‘pure’ way. That is, it was a single granularity, single paradigm solution that gave pseudo persistent processes their own private storage that could be decomposed and delegated to other pseudo persistent processes.

A few years after I released MinorFs and a few years after I wrote a Linux Journal article about MinorFs, although its painful to admit, its time to come to the conclusion that that my attempts to make the world see the advantages of the purity of model that the stack AppArmor/MinorFs/E provided has failed.

At the same time, the problem with a \$HOME and \$TEMP directory that are shared between all the programs a user may run is becoming a bigger and bigger problem.  On one side we have real value being stored in the users \$HOME dir by programs like bitcoin. On the other side we have HTML5 that relies more and more on the browser being able to reliably store  essential parts and information of rich internet application.

The realization of these two facts made me come to an important conclusion: Its time for major changes to MinorFs.  Now I had two options. Do I patch a bunch of changes on top of the existing Perl code base, or do I start from scratch. In the past I had tried to get MinorFs accepted as an AppArmor package in Ubuntu. At that point I ran into the problem that MinorFs had rather exotic perl modules as dependencies.  So if I ever want a new version of MinorFs to be accepted as a companion package for AppArmor, I would have to rewrite it quite a bit to not use those exotic dependencies. Add to this the major changes needed to get MinorFs as practical as it gets without compromising security, I had to come to the conclusion that there was little to no benefit in re-using the old MinorFs code.  This made me have to change my earlier assertion to: Its time to write a new version of MinorFs from scratch.

So what should this new version of MinorFs do in order to make it more practical? What should I do to help ang det it packaged in the major distributions that package AppArmor? The second question is easy, be carefull with dependencies. The first question however turned out to be less simple.

I have a very persistent tendency to strive for purity of model and purity of design. But after a few years of seeing that such purity can lead to failure to adopt, I had to take a major leap and convince myself that where purity gets in the way of likelihood to be adopted, purity had to make way.

After a lot of thinking I managed to concentrate all of the impurity into a single place. A place that allowed for configuration by the user in such a way that a purity sensitive user, packagers and administrators could create a relatively pure system by way of the config, while practically inclined users, packagers and administrators could just ignore purity al together. The place where I concentrated the impurity is the persistence id service. This service that didn’t exist in the old MinorFs maps process-id’s to persistence-id’s, but it does this in a way where one process might get a persistence-id that implies a whole different level of granularity than the persistence-id an other process maps to. Where the old MinorFs had only one level of granularity (the pseudo persistent process), MinorFs2 allows different processes to exist at different granularity levels according to the needs and possibilities of their code-base.

This is the base of the first of my practical approaches. It suffers one program that requires the existing user level granularity to co-exist with for example an other program that benefits from living at the finest granularity level that MinorFs2 provides.  I tried to come up with every potentially usable granularity level I could come up with. In practice some of these levels might turn out to be useless or unneeded, but from a practical viewpoint its better to have to much than to have the one missing that would be the best fit for a particular program.

So what would be the most important practical implication of allowing multiple granularities down to user level granularity? The big goal would be : Allow to simply and effectively  replace the usage of the normal \$HOME and \$TEMP with the usage of MinorFs2.

We should make it possible to mount MinorFs2 filesystems at /tmp and /home and have all software function normally, but without having to worry about malware or hackers with access to the same user id gaining access to their secrets, or being able to compromise their integrity.

This practical goal completely ignores the benefits of decomposition and delegation, but it does make your computer a much safer place, while in theory still allowing application developers an upgrade path to fine grained least authority.

An other practical choice I had to make was replacing the use of symbolic links with  using overlay file-systems for minorfs2_home_fs and minorfs2_temp_fs and dissalow  ‘raw’ access to minorfs2_cap_fs to unconfined processes.  I won’t get into the details about what this entails, but basicaly I had to abandon the golden rule that states: ‘don’t prohibit what you cant enforce‘.  Unconfined processes have access to the guts of the processes running under the same uid. This makes them capable of stealing the sparse capabilities that MinorFs uses. I took the practical approach to:

• Limit the use of raw sparse caps to delegation and attenuation (less to steal)
• Disallow unconfined processes from directly using sparse caps

This is an other practical issue that makes stuff a bit impure. In the pure world there would be no unconfined processes in a pure world, no way to steal sparse caps from the guts of an other process. So what do we do, we break a golden rule and close the gap as good as we can. Knowing that:

• If the unconfined malicious process can convince a confined process to proxy for it shall be able tu use the stolen sparse cap.
• If a non malicious unconfined process wants to use a sparse cap it can’t.

It hurts having to make such impure design decisions,  it feels like I’m doing something bad,  badly plugging a hole by breaking legitimate use cases.  I hope that the pain of  the practical approach will work out being worth  it  and I’ll be able to create something with a much higher adoption rate than the old MinorFs.

## Taming mutable state for file-systems.

After my January 2009 Linux Journal article on MinorFs, I had a talk titled ﻿taming mutable state for file-systems that I gave several times over the past two year. Actually I gave this talk 7 times in 2009, once more in 2010, and my last appointment to give this talk this month (may 2011) bounced at the last moment.  I guess, that however much I enjoyed giving this talk, its unlikely that I will be giving it again. As a way to say goodbye to the material of this talk, I will dedicate this blog post to talking about my favored  talk 😉 While I did put the slides to this talk online, they were not completely self explanatory, so hopefully this blog post can open up my talk, that I probably won’t be giving any more times, to other people interested in least authority and high integrity system design.

As I hate phones going off in the middle of my talk,  I like to start my talks with a bit of scare tactics. I have a chrystal watter jug with a no phone zone sticker on it that I fill with water and a fake Nokia phone.  I than show my jug to the people in the room and asking them to please set their phones to silent, informing them that if they have problems doing so, than I would be happy to offer my jug as a solution to that problem.  Untill now, these scare tactics have worked and I have been able to give my talk without being interrupted by annoying phones each of the times.

My talk starts off with something I stole shamelessly from a presentation by Alan Karp. I talk to my audience about an extremely powerful program. A program that has the power to:

• Mail these files to the competition.
• Delete or compromise their files.
• Initiate a network tunnel, allowing their competition into their network

Then we let the audience think about what program this might be before showing a picture of solitaire, and we explain that while we don’t expect solitaire to do these things, it does have the power to do these things. As there will always be some Linux and Mac users who enjoy laughing at the perceived insecurity of Microsoft products,  I than go on explaining that this is not just a problem in the Microsoft world, but that other operating systems have exactly the same problem. That is, Linux for example is just as bad, so we change the picture in our slide from solitaire to my favorite old school Linux game sokoban.

Next we expand on the problem, saying that while sokoban might be OK, there are a lot of programs running on our system, written by even more people, with even more people in a position to compromise one of these programs into doing bad things. Then we extend it further by talking about network applications like a web browser, and how even if these are benignly written, an exploitable bug might easily transform these programs into something that will exploit the extensive powers that it is given.

Now other than Alan’s talk where I stole the solitaire stuff from, I don’t go on talking about how much power solitaire/sokoban has to do all these things, and how according to the principle of least authority solitaire/sokoban should not have the right to for example access that confidential ‘global’ data, but I take the opposite approach in that I talk about what this confidential data might be, and that it had no reason for being global in the first place.  I say that if we have an editor that was used to create a secret, this editor has no power to protect that secret from sokoban.

Than I went on to paint an extended picture of a secret where we wanted to share confidential information written in our editor with a friend using e-mail. I painted a scenario where the user would have 20 programs that she run on a regular basis on her system. 3 of these programs were our editor,  a mail client and encryption software.  I tried to explain that only the editor and the encryption software had any business with access to the secret.

Than we get to what I feel is the core of my talk. Mutable state. I have a slide that very graphically shows the potential difference between two ways of dealing with mutable state. Either as shared mutable state or as private mutable state. won’t describe the slide in detail, but it involved a rather vivid pictures of  lavatories, and what being public could lead to.

From our lavatories we came to the point that we were going to look at file systems and global mutable state, where we had to come to the conclusion that with all the users programs running as the same user, the file system, for all practical purposes, only gave us public mutable state and no private mutable state. From that we went back to look at the core problems with the concept of global mutable state, which are:

• That it can potentially be modified from anywhere.
• That any subsystem may rely on it.
• That it creates a high potential for mutual dependencies.
• That it makes composite systems harder to analyze or review.
• That it makes composite systems harder to test.
• That it basically in many cases  violates the principle of least authority.

Now with the problem so clearly identified, and with a small kid at home who loves to watch Bob the builder, I couldn’t resist but while creating my slides to let Bob ask the question ‘can we fix it?’….. Taking a few steps back to have to come to the conclusion that the problem might have already be fixed in an other domain, computer programming.

In computer programming we have different kinds of problematic shared mutable state:

• global variables
• class variables
• singletons

I like to refer to global variables as the obvious evil of the devil we know, class variables as the lesser evil, and singleton’s as the devil we don’t.  So now we show what computer programing has done to solve the problem. We show that OO has given us private member variables and a concept known as pass by reference, and that it, in its basic form has given us two lesser evils (singletons and class variables) we can use to avoid the bigger evil (global variables).  Now from two sides the lesser evils are under fire in computer programing. From the high integrity side that gives us the object capabilities model (a sub set of OO that excludes implicitly shared mutable state), and from the TDD side where dependency injection is used as a way to address the testability issues that come with implicitly shared mutable state. Now we dive into one side of this, object capabilities, and more specifically a cute little language called E. This language shows us some of the capability based security principles that we can apply on our file system problem.  Next to this, as we will later see, this language can provide us with the roof of a whole high integrity building that we will try to build.

So what makes E, or object capability languages as a whole such a great thing? Basically, not focusing primary on Trojans but on exploitable bugs, its about the size of the trusted code base. If I want to protect my secret, how much lines of code do I need to trust? In any non ocap language the answer basically is ‘all of them’. If for example an average program is 50000 lines of C or C++code, than if this program had access to my secret, I would be trusting 50k lines of code with my secret. Using an ocap language, its quite reasonable to have a core of a program designed according to the principle of least authority (POLA) that truly  needs access to the secret. The great thing about an ocap language is that it allows you to easily proof that only that for example 1000 lines of code to be considered trusted. So using an ocap language for trusted programs,  we could reduce the size of the trusted code base per trusted program for our secret to a few percent of its original size.

Now starting at the roof with building is seldom a good idea, we have Bob the Builder start out with a look at the foundation. The foundation we choose consists of two components:

1. The AppArmor access control framework for Suse and Ubuntu.
2. FUSE : Filesystems in userspace, a Linux/BSD library+kernel module for building custom filesystems in userspace.

AppArmor allows us to take away non essential ambient authority from all our processes, including the part of the file-system that should be considered as  global mutable parts of the filesystems from a user process perspective. Now in the place of where the public mutable state used to be, we drop in our own ‘private’ replacement, MinorFs. I won’t rehash MinorFS in this article as its extensively covered by my linux journal article, but basicaly it replaces the ‘public’ \$TMP and \$HOME  with a ‘private’ \$TMP and \$HOME, and allows for ways to pass by reference in order to do object oriented style pass by reference.

So now that we have our foundation (AppArmor, Fuse) in place, and have put our walls up (MinorFs), its time to look at our roof (E) again. Looking at our initial scenario of our user that wanted to share a secret document, the solution we buikd allows us to only have to trust our editor and our encryption tool with our secret. So instead of 20 programs of 50000 lines of code each we need to trust, there are only two. When these two programs would be implemented in E,  we would have to trust only 1000 lines of code per program instead of 50000 lines.  As a whole this would thus mean that our hypothetical trusted code base went down from one million lines of code to a mere two thousand lines of code., a factor of 500.

Two thousand lines of code other than one million are quite possible and affordable to audit for trust-ability and integrity. This means that our multi story least authority stack can provide us with a great and affordable way of building high integrity systems.  MinorFs is just a proof of concept, but it acts as an essential piece of glue for building high integrity systems. I hope people who read my article and/or this blog, and those people that were at one of my talks will think of MinorFs and of the multi story AppArmor+MinorFs+E approach that I advocated here and will apply the lessons learned in their own high integrity system designs.