When communicating about my efforts on the upcoming version of my least authority file-system collection, I often seem unable to fully communicate the base design of the core capability-based file-system, even to many cryptographically educated people. In this post I’ll try to elaborate on the algorithm that lies at the center of all my efforts: The Rumpelstiltskin hash-tree algorithm.
In a fairy tale by the Grimm brothers that most of you will probably know by heart, a girl, as a result of her fathers big mouth, found herself in a rather problematic situation . The king had locked her in a tower room filled with straw and had declared that she be decapitated unless she would spin the straw into gold by morning, An imp showed up and offered to help her out in exchange for her necklace, and the girl gladly accepted the offer. Next day, larger room with more straw, same story now in exchange for her ring. third and final day, even a larger room with even more straw, but as the girl has nothing left to give in exchange, the imp makes the girl promise to give up her firstborn child in exchange for his services.
Later in the story, the girl, now a queen married to the king is visited again by the imp who comes to claim his price. After pleading with the imp, the imp agrees that he will give up his claim if the queen guesses his true name within 3 days. When by chanche the queens messenger overhears the imp chanting:
“Today I bake, tomorrow I brew, then the Queen’s child I shall stew. For nobody knows my little game, for Rumpelstiltskin is my name.”
Learning his name, the queen is able to wield authority over the evil imp, forcing him to give up his claim to her firstborn.
You could say that the unguessable name “Rumpelstiltskin” is a token of authority not unlike what in information security theory we refer to as a capability, or more precisely a password-capability.
Directional tree graph
The CapFs file-system I’m working on, like most file-systems constitutes a tree. Unlike many file-systems though, the CapFs tree structure aims to be purely directional. That is, while in DOS, Unix, OSX or Linux we are used to navigate ‘up’ using a command like “cd ..”, the CapFs ‘directional’ tree adheres to the rule that:
“Authority to a node implies authority to its children but NOT to its parent.”
Names and true names
In our fairy tale, the imp its true name was Rumpelstiltskin (cap). This was the name that implied authority over the imp. The imp might have had an other non authoritative name used by others to address him, but if he had it was not relevant for the story. Maybe Rumpelstiltskin himself did not have an other non authoritative name, but if he had any children, and those children had true names like him that could be used to wield authority over them, Rumpelstiltskin would probably often address his children by their non authoritative names.
Now talking of authority, if Rumpelstiltskin (cap) would want to delegate the authority to one of his children to an other imp, what options would there be? Lets assume Rumpelstiltskin (cap) had a kid named Slartibartfast (cap) for who he used the casual (non authoritative) name Bob. There would be two ways someone could come to wield authority over Bob, either by using his true name Slartibartfast or by having authority over his parent Rumpelstiltskin that by proxy would give authority over Bob. There are thus two authoritive ways to designate Bob:
Back to our directional tree
So if we project Rumpelstiltskin and Bob to our directional tree graph, we could say that Rumpelstiltskin might be our tree’s root node. Rumpelstilskin being the root node has no name but his true authority carrying name. Bob and any other of Rumpelstiltskin’s descendants would have two names:
- A normal name that implies no direct authority (name)
- An unguessable name that implies authority (capability)
Where using a name and a capability (true name) for each non-root node in our tree takes care of making authority to branches and leafs decomposable, the authority to any sub branch that is implied by a capability is still absolute. If we look back at our file-system example, one might want to delegate the authority to read without delegating the authority to write. We could provide non-root nodes with a second capability that would imply attenuated authority to a node, for example read only. So now we have:
- A normal name that implies no direct authority (name)
- An unguessable name that implies FULL authority (capability)
- An unguessable name that implies attenuated authority (capability)
Given that we have only a single type of attenuation (for example read-only), there is an interesting algorithm we can use to securely calculate new unguessable names for non root node’s given its name and the unguesable name of its parent.
In the above diagram Key1 would be the equivalent of ‘Rumpelstiltskin’, Key2 would be the equivalent of a read-only capability to Rumpelstiltskin. The subnode name would for example be ‘Bob’, and Subnode Key1 would be the equivalent of Slartibartfast.
Going from Rumpelstiltskin to a read-only attenuated capability to Rumpelstiltskin would be straight forward. We basically take an HMAC hash of the unattenuated authority capability (Rumpelstiltskin) and some static string, and use a representation of the resulting key2 as attenuated authority capability. There is no secret salt needed in this step, so a shared static string is used instead.
For going from a parent capability to a child capability, there are two scenario’s:
- From unattenuated parent capability to unattenuated child capability.
- From attenuated parent capability to attenuated child capability.
To allow both scenario’s to be used, the calculation of the unattenuated child capability is done by taking an HMAC hash of the parent attenuated authority capability and the child’s name, together with a secret salt.
The secret salt makes sure that we can create a system where a client holding an attenuated authority capability can ask a server for an attenuated authority child capability, but can’t itself derive an unattenuated authority child capability from an attenuated authority parent capability.
When looking at the diagram of the Rumpelstiltskin hash-tree algorithm, we see there is is yet a HMAC operation and a key3 that we did not yet discus. The problem is that if we want to implement something of persistence using relatively public storage, for example with a cloud storage provide we don’t fully trust to keep our big bag of secrets secret, we will want to:
- encrypt physical file’s
- store our files in a way that does not disclose capabilities.
In order to address the first issue, we shall let our attenuated authority capability double as file encryption key for our file. We now use a third HMAC hash operation to create a 4th name that we use to determine the location where the encrypted file is stored.
High-security versus high-scalability
Regarding the final HMAC hashing steps there are two possible implementation mode’s for the Rumpelstiltskin hash-tree algoritm: high-security and high-scalability. In the high scalability version, encryption and decryption could potentially be client side operations, and the server takes care only of enforcing read-only attenuation. In this mode, the storage path (key3) is calculated using a HMAC hash of the attenuated authority capability together with a static string. If in contrast we value security so much that we are willing to sacrifice the scalability that comes with client side en/de-cryption in order to prevent a cloud storage provider with access to an attenuated authority (read-only) capability to combine his explicit and his implicit authority into an unattenuated authority capability, than high-security mode would include the servers secret salt in the HMAC calculation of key4, mandating server side encryption and decryption.
In my own project I shall be using the Rumpelstiltskin hash-tree algorithm in high-scalability mode for now. Maybe in the future switching to high-security may turn out to be desired, but for now the hole plugged by giving up the potential for adding scalability seems to small to justify the price of using high-security mode.
I hope reading this blog post somewhat clarifies the Rumpelstiltskin hash-tree algorithm, and I hope others may find the algoritm usefull for other projects as well. I’ll be using this algorithm in MinorFs2, but I feel it might have much broader usability than just user space file-systems. Please use this algorithm as you please, and comment on this post if you think this information is usefull or still needs clarifications somewhere.