Tag Archives: Least authority

Why HTTPS everywhere is a horrible idea (for now).


While privacy is a valuable thing, and while encryption in general helps improve privacy and can be used to help improve security, in this blog post I will discuss how more encryption can actually harm you when used with a fundamentally flawed public key infrastructure. Before we go on and discuss what the problems are, a bit of background.

When confidential communication is needed between, for example, your web browser and your bank, how do your browser and the bank’s web server achieve this? The following steps will take place:

  • Your browser will ask the internet’s Domain Name System (DNS) for the IP address of ‘www.yourbank.com’. The Domain Name System will resolve the name and come back with an IP address.
  • Your browser will initiate a TCP connection to the IP address it got back from the Domain Name System.
  • Once the TCP connection is established, the client and server will initiate the ‘handshake’ phase of the Transport Layer Security protocol.
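For the concretely minded, here is a minimal sketch of those three steps using Python’s standard library (the host name is just the hypothetical bank from this post, so don’t expect it to resolve):

```python
import socket
import ssl

host = "www.yourbank.com"  # hypothetical bank domain used in this post

# Step 1: ask DNS for an IP address for the host name.
ip = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)[0][4][0]

# Step 2: set up a TCP connection to that IP address.
raw = socket.create_connection((ip, 443))

# Step 3: run the TLS handshake on top of the TCP connection.
ctx = ssl.create_default_context()
tls = ctx.wrap_socket(raw, server_hostname=host)
print(tls.version(), tls.cipher())
tls.close()
```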

After the handshake phase, everything should be fine and dandy, but is it? What would an adversary need to do to defeat the confidentiality of your connection? Given that without encryption the adversary would already need access to the transmitted data to read your data, we shall assume the adversary is sniffing all network traffic. Now the first thing an adversary needs to do to defeat the above setup is to fool your browser into thinking it is your bank. It can do this quite easily, given that the Domain Name System (in its most basic form) runs on top of the User Datagram Protocol (UDP), a trivial connection-less protocol that can effortlessly be spoofed to make your browser believe your bank’s server is running on the attacker’s IP. So now, after the TCP connection has been established to what your browser believes is your bank, the TLS handshake begins. Our attacker could try to impersonate our bank, or he could, and this is the scenario we shall look at, attempt to take the role of ‘man in the middle’. That is, next to making your browser think it is your bank, it will actually connect to your bank and relay content between your browser and your bank, either just to snoop on your traffic, or until it is ready to strike by changing transaction content. But let’s not get ahead of ourselves. Our client has connected to our attacker and our attacker has made a connection to our bank, so the attacker’s machine can act as man in the middle. What attack vectors can it use?

  • It can launch a downgrade attack by offering the real server only an inferior subset of the ciphers offered by the real client. The cipher suite used in the connection can thus be forced down to the weakest common denominator between the real client and the real server. This way the attacker can weaken the strength of the encryption, or force the use of a cipher suite that can later be weakened by other man-in-the-middle tricks.
  • It can provide the client with a rogue certificate for ‘www.yourbank.com’. This will be harder to pull off, but doing so would leave the attacker with a fully decrypted stream of traffic.

The last scenario is often described by many people as relatively unlikely. Let me try to elaborate why it is not. The certificate offered by the attacker has some security in it. Your browser won’t accept it unless it is signed by a ‘trusted’ Certificate Authority (CA). Your browser will trust only about 50 or so CA’s, so that sounds kinda OK, doesn’t it? Well, there is another catch with the CA based public key infrastructure: not only will your browser trust ANY of these 50 CA’s to sign ANY domain, it will also trust many sub-CA’s to do so. In total there are thought to be over 600 certificate authorities in over 50 countries that might be used to sign a rogue certificate for your domain. The problem with trusting ANY CA to sign ANY domain arises from the basic mathematics of probability for such cases. What these properties basically boil down to is the following horrific fact:

The probability that none of 600 equally trustworthy CA’s could somehow be persuaded, tricked or compromised by our attacker in a way that would allow it to get a rogue certificate signed is equal to that probability for a single one of these CA’s raised to the power of 600.

So if I can put 100% trust in each of these 600, the cumulative trust would be 100%, fine. 99.99%? We are at 94%, which is still pretty decent, sure. 99.9%? Now things start to crumble, as we are down to only 55% of cumulative trust. 99%? All cumulative trust basically evaporates, as we are down to 0.24%.
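A few lines of Python reproduce these numbers (a sketch of the arithmetic only, nothing more):

```python
# Cumulative trust when ANY of ~600 equally trustworthy CA's can sign ANY domain.
for p in (1.0, 0.9999, 0.999, 0.99):
    print(f"per-CA trust {p:.2%} -> cumulative trust {p ** 600:.2%}")
# per-CA trust 100.00% -> cumulative trust 100.00%
# per-CA trust 99.99% -> cumulative trust 94.18%
# per-CA trust 99.90% -> cumulative trust 54.86%
# per-CA trust 99.00% -> cumulative trust 0.24%
```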

While these numbers are pretty bad, we must address the subject of attacker resources. If our attacker is our neighbor’s 14 year old kid trying to mess with us on his computer, then 99.99% might pretty well be a realistic number. A group of skilled and determined cyber criminals? 99.9% might pretty well be a realistic number, and thus a real concern when communicating with our bank, at least from a technical perspective. There could be economic aspects that make this type of attack less likely for committing banking fraud. Now how about industrial espionage? Nation states? 99% would sound like a pretty low estimate for those types of adversaries. Remember, 99% had us down to 0.24% as cumulative trust factor, and that is assuming equal trustworthiness for all CA’s.

So basically, in the current CA infrastructure we can safely say that HTTPS protects us from script kiddies and (some) non-targeted cybercrime attacks. It might even protect us to a certain level from mass surveillance. But that’s it. It does not protect us from targeted attacks through organized criminal activities. It does not protect any high stakes intellectual property from industrial espionage, nor does it by itself protect the privacy of political dissidents and the like.

So you might say: some protection is better than none, right? Well, not exactly. There is one thing that HTTPS and SSL in general is perfectly good at protecting traffic from: “YOU!”


Remember Trojan Horses? Programs that act like one thing but actually are another. Or how about malicious content on compromised websites that exploits vulnerabilities in your browser or your browser’s Flash plugin? Nasty stuff running on your machine with access to all of your sensitive data. If they want to get your data out of your computer and into theirs, then HTTPS would be a good way to do it. Now compare the situation of using HTTPS for all your web traffic to using HTTPS only for connecting to sites that you a) visit regularly and b) actually need protection for. In the latter situation, unexpected malicious encrypted traffic will stand out. It’s not my bank, it’s not my e-mail, I’m not ordering anything, so why am I seeing encrypted traffic? When using HTTPS for every site that offers it, though, we are creating a situation where Trojans and other malware can remain under the radar.

But back to the issue with the CA based infrastructure. There is another issue: the issue of patterns of trust. When you hire a contractor to work on your shed, there is a common pattern of introduction that is the major factor in the trust involved in getting your shed fixed. The contractor will introduce you to the carpenter, and after that there will be a partial trust relationship between you and the carpenter that is a sibling of the trust relationship you have with the contractor. In modern web based technologies similar relationships are not uncommon, but the CA based architecture is currently the only mechanism available, and it is a mechanism that doesn’t allow for the concept of secure introduction. While domain name based trust might be suitable for establishing trust with our contractor, introduction based trust is a form of trust establishment that is completely immune to the kind of problems that domain name based trust initiation suffers from. In order to establish the introduction based trust, the server equivalent of the contractor could simply send us a direct unforgeable reference to the server equivalent of the carpenter. It’s like the contractor has issued a mobile phone to the carpenter for communication with clients and then gives the phone number to the client. Not a normal phone number, but some 60-digit-long phone number that nobody could dial unless they had been handed that phone number first. The client knows that the person answering the phone should be the carpenter. The carpenter knows that the person calling him on that number should be a client. No CA’s or certificates needed, period. Unfortunately though, the simple technology needed for the use of these webkeys currently isn’t implemented in our browsers. With a webkey, no certificate or CA is needed, as the measures for validating the public key of the server are hard coded into the URL itself.
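To make the webkey idea a bit more tangible, here is an illustrative sketch, and nothing more than that: this is not the actual webkey specification, no current browser does this, and the URL format (a key fingerprint carried in the URL) and the helper name are my own assumptions. The point is only that trust can come from the reference itself instead of from any CA:

```python
import hashlib
import socket
import ssl

def open_pinned(url_fingerprint_hex: str, host: str, port: int = 443):
    """Connect over TLS and accept the server only if the SHA-256 of its
    certificate matches the fingerprint carried in the URL itself."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # trust comes from the fingerprint,
    ctx.verify_mode = ssl.CERT_NONE     # not from any certificate authority
    tls = ctx.wrap_socket(socket.create_connection((host, port)),
                          server_hostname=host)
    der = tls.getpeercert(binary_form=True)
    if hashlib.sha256(der).hexdigest() != url_fingerprint_hex:
        tls.close()
        raise ssl.SSLError("server key does not match the webkey fingerprint")
    return tls
```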

So on one side we have an outdated DNS system, a set of outdated legacy cipher suites and a dangerously untrustworthy CA infrastructure that undermine the positive side of using HTTPS, and on the other side we have untrustworthy programs with way too many privileges running on our machines, exposing us to the negative sides of HTTPS. The illusion of security offered by a mediocre security solution like this can be much worse than using no security at all, and the Trojan and malware aspects make it worse than that. Basically there are quite a lot of things that need fixing before using HTTPS everywhere stops being detrimental to security. The following list isn’t complete, but gives an idea of the kind of things we need before HTTPS everywhere becomes a decent concept:

  1. Least authority everywhere. First and foremost, trojans and exploited clients should not by default be ‘trusted’ applications with access to our sensitive data.
  2. The current CA based PKI infrastructure with >600 ‘trusted’ CA’s in over 50 countries must be replaced urgently:
    1. NameCoin everywhere. Use of technology like that demonstrated by NameCoin is a possible path.
    2. DANE everywhere. The extensive use of DNSSEC with DANE could offer an alternative to CA’s that is a significant improvement over the CA based infrastructure.
  3. TLS1.2 everywhere. Older versions of TLS and many of the ciphers defined in those standards should become deprecated.
  4. DNSSEC everywhere.
  5. WebKeys everywhere. We really need webkey support in the browser.

I realize the above rant may be a tad controversial and not in line with the popular view that encryption is good medicine. I do however find it important to illuminate the dark side of HTTPS everywhere, and to show that it’s just one piece from the center of the puzzle, while it would be much better to start solving the puzzle with edge pieces instead.


Security: debunking the ‘weakest link’ myth.

[Image: the weakest link, a plastic tie joining together a metal chain]

“The user is the weakest link”, “Problem Exists Between Keyboard And Chair”, “Layer 8 issue”. We have all heard these mentioned hundreds of times. Most security specialists truly believe it to be true, but in this blog post I will not only show that NO, the user is not the weakest link, I hope to also show that in fact the ‘belief’ that the user is the weakest link may be the reason that our information security industry appears to be stuck in the 1990s.

Harsh words? Sure, but bear with me as I try to explain. Once you understand the fallacy and the impact of the idea that the user is the weakest link in the security chain, you are bound to be shocked by what the industry is selling us today: that they are in fact selling shark cages as protection against malaria.

There are at least six major weak links in today’s information security landscape. The first one we know about: the user. There is no denying the user is a weak link, especially when working with many of today’s security solutions, but there are five other important weak links we need to look at. Links that arguably all would need to be stronger than the user in order for our user to be considered the weakest link. I hope to show that not one, but every single one of these other five links is in fact significantly weaker than our user. Here we have the full list; I will explain each bullet later:

  • The user
  • Disregard for socio-genetic security-awareness
  • Identity centric security models
  • Single granularity abstractions
  • Public/global mutable state
  • Massive size of the trusted code-base


Let’s first look at what our user is. Our user, at least in most cases, will be a member of the human race. We as humans share many millennia of history, and during all of these millennia we arguably have been a species that uses social patterns of cooperation as a way to accomplish great things. One of the pivotal concepts in these cooperative patterns has always been the concept of delegation. Imagine our human history with a severe restriction on delegation. We would probably still be living in caves. If we had not gone extinct, that is. Delegation is part of our culture, it’s part of our socio-genetic heritage. We humans are ‘programmed’ to know how to handle delegation. Unfortunately however, free delegation is a concept that many a security architect feels to be an enemy of security. When users share their passwords, 99 out of 100 security people will interpret this as a user error. This while the user is simply acting in the way he was programmed to: he is delegating in order to get work done. So what do security people do? They try to stop the user from delegating any authority by coming up with better forms of authentication. Forms that completely stop the possibility of delegation of authority. Or they try to educate the user into not sharing his password, resulting in less efficient work processes. The true problem is that, lacking secure tokens of ‘authority’ that the user could use for delegation, the user opts to delegate the only token of authority he has: his ‘identity’. We see that not only are we ignoring all of the user’s strengths in his ability to use patterns of safe collaboration, we are actually fighting our own socio-genetic strengths by introducing stronger authentication that stops delegation. Worse, by training our users, we are forcing them to unlearn what should be their primary strength.

While we are at the subject of passwords, consider that delegation of a username/password equates to functional abdication, and consider that safe collaboration requires decomposition, attenuation and revocability. Now look at what happens when you want to do something on your computer that requires user approval. In many cases, you will be presented with a pop-up that asks you to confirm your action by providing your password. Now wait: if delegation of a password is potential abdication of all the user’s powers, we are training our users into abdicating any time they want to get some real work done. Anyone heard of Pavlov and his dog? Well, our desktop security solutions apparently are in the business of supplying our users with all the Pavlovian training they need to become ideal phishing targets. Tell me again how the user is the weakest link!

If we realize that we can tap into the strengths of the user’s socio-genetic security awareness by facilitating patterns of safe collaboration between the user and other users, and between the user and the programs he uses, it becomes clear that while passwords are horrible, they are horrible for a completely different reason than most security people think. The main problem is not that they are horrible tokens for authentication and that we need better authentication that stops delegation altogether. The problem is that they are horrible, single granularity, non-attenuable and non-decomposable tokens of authorization. Our security solutions are too much centered around the concept of identity, and too little around the concept of authority and its use in safe collaborative patterns.

Identity also is a single granularity concept. As malware has shown, the granularity of individual users for access control becomes meaningless once a Trojan runs under the user’s identity. Identity locks access control in to a single granularity level. This while access control is relevant at multiple levels, going up as far as whole nations and down as deep as individual methods within a small object inside of a process that is an instantiation of a program run by a specific user. Whenever you use identity for access control, you are locking your access control patterns into that one, rather coarse granularity level. This while many of the access control patterns in the cooperative, relatively safe interaction patterns between people are not actually that much different from patterns that are possible between individual objects in a running program. Single granularity abstractions such as identity are massively overused and are hurting information security.

It’s not just identity, it’s also how we share state. Global, public or widely shared mutable state creates problems at many granularities.

  • It makes composite systems hard to analyse and review.
  • It makes composite systems hard to test.
  • It creates a high potential for violating the Principle Of Least Authority (POLA).
  • It introduces a giant hurdle for reducing the trusted code-base size.

We need only to look at Heartbleed to understand how the size of the trusted code-base matters. In the current access control ecosystem, the trusted code-base is so gigantic that there simply aren’t enough eyeballs in the world to keep up with everything. In a least authority ecosystem, OpenSSL would have been part of a much smaller trusted code-base that would never have allowed a big issue such as Heartbleed to stay undiscovered for as long as it has.

So let’s revisit our list of weak links and ask which one could be identified as the weakest link.

  • The user
  • Disregard for socio-genetic security-awareness
  • Identity centric security models
  • Single granularity abstractions
  • Public/global mutable state
  • Massive size of the trusted code-base

While I’m not sure what the weakest link may be, it’s pretty clear that the user won’t become the weakest link until we’ve addressed each of the other five.

I hope the above has convinced you not only that the user is indeed not the weakest link, but also that many of our efforts to ‘fix’ the user have not just been ineffective, they have been extremely harmful. We need to stop creating stronger authentication and educating users not to share passwords until we have better alternatives for delegation. We need to stop overusing identity and subjecting our users to Pavlovian training that is turning them into ideal phishing victims. Once we start realizing this, the socio-genetic security awareness of our users can become a large, almost untapped foundation for a significantly more secure information security ecosystem.

Killing the goose that lays the golden eggs (infosec)

We all know there is a lot of money being spent on information security products, services, training, etc., and we all know there is still a lot of damage from cyber-crime and other types of information security breaches. But only when we add up the numbers, and look at what free market principles have turned the information security industry into, does it become clear that there is something very, very wrong with information security today. Many of my best friends work in this industry, so I imagine I might have a few less friends after posting this blog post today, but I feel that the realisation I have about the industry just screams to be shared with the world. Please don’t kill the messenger, but feel free to set me straight if you feel my assertions below are in any way unfair.

If we look at information security, we see that the market size of the information security industry is somewhere around 70 billion USD per year. If we look at what this bus-load of money is supposed to protect us from, cyber-crime and other information security related damages, we see that there is a lot of room for improvement. The total yearly global damages from cyber-crime and other information security failure related incidents currently, according to different sources, seem to be somewhere between 200 billion and 500 billion USD. If we take it to be somewhere in the middle, we can estimate the total yearly cost of information security products, services and failure of information security at round about 420 billion USD per year. To put this into perspective: with 7 billion people on this planet and a world wide GDP per capita of about 10,000 USD, that adds up to about 0.6% of the world’s total GDP. If we scale this to the GDP per capita of some western countries like the US or the Netherlands, we end up with every US man, woman and child on average paying $300 a year for information security related cost, or about €200 for every man, woman and child in the Netherlands. For, for example, a US family of four this would add up to about $200 for information security products and services and $1000 for damages, or about $100 a month in total.
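A quick back-of-the-envelope check of these numbers (the roughly 50,000 USD GDP per capita for a country like the US is my own rounded assumption; all other figures are the rough estimates from the text):

```python
security_market = 70e9    # yearly infosec products and services, USD
damages = 350e9           # midpoint of the 200-500 billion damage estimates
total = security_market + damages          # ~420 billion USD per year

share_of_gdp = total / (7e9 * 10_000)      # world population * GDP per capita
print(f"{share_of_gdp:.1%} of world GDP")  # ~0.6%

per_person_us = share_of_gdp * 50_000      # scaled to ~50,000 USD GDP per capita
family_of_four = 4 * per_person_us
print(per_person_us, family_of_four, family_of_four / 12)  # ~300, ~1200, ~100
```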

Information security apparently is both relatively inefficient and relatively expensive.  So what’s the problem with information security? Can’t we fix it to at least be more effective?

As anyone who has been reading my blog before will probably know, I’m very much convinced that by using different techniques and paradigms to either reduce the size of the trusted code-base, or to sync information security models with our socio-genetic security awareness, it should be possible to greatly improve the integrity of information technology systems, and more importantly, to reduce the impact and cost of security breaches. I’m pretty much convinced that with the right focus this could make information security about an order of magnitude more effective, potentially at an order of magnitude less cost.

If we were to translate this to the numbers above, we should be able to reduce the damage done by cyber-crime and other infosec breaches for our US family of four to about $100. That would be about 1% of the total global IT spending, while at the same time reducing the global cost of information security related products and services for our family to about $20, or about 0.2% of the total global IT spending.

Sounds good, right? Well no, at least not from an investor’s point of view apparently. While to most of us this should sound like a desirable cost reduction, this apparently isn’t a realistic idea. When, half a decade ago, I was attempting to get investors to buy into an info-sec product I wanted to build a start-up around, it turned out that potential investors don’t really like the idea of reducing the information security market size by an order of magnitude, or even the idea of making information security significantly more effective. To them doing so would be the equivalent of killing the goose that lays the golden eggs.


So if investors aren’t going to allow the infosec industry to become the lean and mean information technology protection machine that we all want it to be, how can we kill the goose without solid investments?

From a commercial perspective, and this is basically my personal interpretation of the feedback I got from my talks to what I thought would be potential investors or partners, information security products should:

  • Not significantly reduce revenues from other information security investments by the same investors.
  • Never saturate the market with one-time sales, so either it should require periodic updates or it should generate substantial consulting and/or training related revenues.
  • Allow the arms-race to continue. Keep it interesting and economically viable for the bad guys to invest in breaking today’s security products, so tomorrow we can create new products and services we can sell.

In contrast, for the people buying them, information security products should:

  • Reduce the total cost of IT system ownership.
  • Be low-maintenance.
  • Be cognitively compatible with (IT) staff.
  • Make it economically uninteresting for the bad-guys to continue the arms-race.

So do economic free market principles make it impossible to move information security into the realm that allows the second list of desirables to be satisfied? In the current IT landscape it seems that they do. Information security vendors are rather powerful and very capable of spreading the fear, uncertainty and doubt (FUD) that is needed to scare other parties away from reducing the need for their services and products. This seems especially obvious in the case of operating system vendors. The OLPC BitFrost project for example has shown the world what is possible security wise with the simple concept of mutually exclusive privileges for software. It would be trivial for Google to implement such a scheme for Android, effectively eradicating over 90% of today’s Android malware and making additional AV software lose most of its worth. Apple introduced the concept of a PowerBox based flexible jail to its desktop operating system, potentially effectively eradicating the need for AV. A bit later AV vendors launched a media offensive claiming Apple was years behind its main competitor regarding security and stating they were willing to help Apple clean up the mess. Given that most of us think that infosec vendors know more about infosec than OS vendors do, especially given the earlier track record of what used to be the OS-market monopolist, such claims tend to stick. Infosec vendors, especially AV vendors, know very well how to play the FUD game with the media in such a way that they seem to effectively keep OS vendors from structurally plugging the holes they need for selling their outdated technology. I’m pretty sure that Microsoft, Google and Apple are perfectly capable of finding solutions that make their OSes significantly more secure without AV products than they would ever be with any upcoming generation of add-on AV protection. OLPC’s BitFrost has shown what is possible without the need for backward compatibility, while HP-Labs Polaris and, I dare claim, my MinorFS project together have shown that very much is possible in the realm of retrofitting least authority onto operating systems. OS vendors are making small steps, but given that they are rightfully scared of the media power that FUD spreading AV companies can apparently command, they can not be expected to kill the goose that lays the golden eggs.

So how about open source? Forget about Linux, at least the kernel related stuff: much of the development on Linux is being done by companies with a large interest in infosec services, and the companies that haven’t got such an interest have much to fear from AV company induced FUD in the media. But the concept of open source goose killing is quite an interesting one. We are trying to reduce global infosec related cost by many, many billions, while a few handfuls of projects, each requiring the equivalent of just a few million in man-hours, would likely be sufficient, combined, to make such a major impact on the technical level. Investors won’t help, it’s not in their interest. OS vendors have too much to lose when they pick a fight with AV vendors, and openly investing in goose killing would be an outright declaration of war against the AV industry. While spare-time open source projects can produce great products, spare time is scarce and for most of us open source spare time development is a relatively low priority. So to make any impact, at least part of the people working on such projects should have development of these products as a source of income. Volunteers are invaluable, but we can’t work with volunteers alone if we want to overthrow the information security industry. We don’t want to fall into the same trap that infosec vendors and investors have fallen into: any commercial interest in the end product would be contrary to the goals we are trying to achieve. So how could we fund these developers?

The best position, and the only one that has a slight chance of success, would seem to be that of a non-profit charity organisation. A charity organisation free from commercial ties to infosec and OS vendors and service providers. Such an organisation could act with the purpose of:

  • Funding the partially paid development of free and open-source initiatives that show promise of both reducing the global IT security related cost and increasing integrity, confidentiality, privacy and availability of computing devices and IT infrastructure.
  • Coordinating contact between volunteer developers and new projects, and handling procedures to allow talented volunteers to go from volunteer to (part-time) paid developers.
  • Marketing these projects.
  • Defending all such projects against legal and media FUD campaigns by the AV industry.

Could this become reality? I think with the right people it could. I know I could not play more than a small role in the creation, but I would definitely put private time and money into such an organisation, and if and when others would do likewise, we would have a great place to start from. I think it’s important, in order for the information security field to progress, that we kill the goose that lays the golden eggs. OS vendors used to be what was holding back infosec; now however it’s the information security industry itself, most notably the AV industry, that has almost become a media variant of a protection racket scheme.

 

Rumpelstiltskin and his children: Part 2

In my previous post, I tried to explain the Rumpelstiltskin tree-graph algorithm. Apparently, in the second part of my explanation, the part explaining the actual algorithm, I’ve ended up skipping a few steps that were essential for understanding what the algorithm is supposed to do, and how it relates to the directional tree. So in this follow-up I’ll try to fill the holes I left in my previous post.

So let’s start off with the directional tree graph from my previous post:

[Diagram: directional tree graph]

The first thing that is important here is that each node in this graph has both a strong name (a password capability) and a weak name (a name that carries no authority by itself). The arcs between the nodes above are the strong names.

Now consider that each node has knowledge only of its own strong name and of the weak names of its children. If we were only concerned with maintaining the unidirectionality of our tree graph, then the following image would allow us to derive a child’s strong name knowing the parent’s strong name and the child’s weak name:

[Diagram: step1a, deriving a child’s strong name from the parent’s strong name and the child’s weak name]

So if we have the strong parent name ‘Rumpelstiltskin’ and the weak child name ‘Bob’, we could use some kind of one way function to determine the strong name of that child, in this case ‘Slartibartfast’. Having the name ‘Rumpelstiltskin’ thus implies authority over the parent node and all of its children, but the name ‘Slartibartfast’ implies no knowledge of or authority over the parent node.

Next to decomposition, there is another aspect we are interested in: attenuation. If attenuation were the only aspect we were interested in, then another use of a one way hash function would suffice:

[Diagram: step1b, deriving an attenuated strong name from the unattenuated strong name]

In this picture, our node comes with two strong names. The first name gives unattenuated authority to the node. Using a one way hash function we create a second strong name for the same node. This second name implies attenuated access to the node. In the case of a file-system, this second strong name could be a read-only capability to a file or directory.

Now comes the interesting part: we want to combine the concept of decomposition with the concept of attenuation in a single algorithm in such a way that, and this is the essential quality of the Rumpelstiltskin tree-graph algorithm: “attenuated decomposition results in the same strong name as would decomposed attenuation”.

[Diagram: step2, combining decomposition and attenuation with a secret salt]

At first glance it would seem that we’ve just combined the two diagrams, but there is an essential addition: the use of a secret salt. Again, like in the attenuation-only diagram, we use a one way hash to create a strong name for attenuated access, but instead of using the unattenuated strong name to determine the unattenuated strong name for the child, we now use the strong name that implies attenuated access to the parent node to create the strong name that implies unattenuated access to the child node. Without the secret salt, doing so would imply that someone with attenuated authority to the parent could unduly gain unattenuated access to the child. By introducing the secret salt to the second hash operation, we allow the parent node to act as proxy for all decomposition operations. That is, the parent node would need to:

  • Allow decomposing unattenuated authority to a parent to strong names implying unattenuated authority to the child.
  • Allow decomposing attenuated authority to a parent to strong names implying attenuated authority to the child.
  • NOT allow decomposing attenuated authority to a parent to strong names implying unattenuated authority to the child.

The fact that the holder of the strong name does not have access to the secret salt ensures that these properties can be guaranteed by the algorithm. Looking at the above diagram, it’s easy to see that attenuation of decomposition and decomposition of attenuation lead to exactly the same strong names. In order to get there we had to introduce a (server side) secret salt to disallow unintended privilege escalation.

We’re almost there, but there is still one more step to consider. Getting back to our imp, we must consider: where does our imp live? Or, for a file-system: where is the serialisation of this node actually stored, and how is it stored? Let’s look at the case of a file-system, and let’s consider that the nodes are serialised to a file-system that is supplied by some cloud storage provider that we do not fully trust. We would want to encrypt the file. Given that the node may be accessed with either of the two strong names, we can’t use the unattenuated access capability here, but we could use the strong name that implies attenuated authority as an encryption key. Doing so however implies we should never disclose either of our strong names to the cloud storage provider. As you may expect by now, again the one-way hash operation comes to the rescue. We use a third one-way hash operation to calculate the relative path where the encrypted node serialisation is to be stored. If we want to consider the scenario that there are entities that possess attenuated access strong names and also possess write access to our cloud storage directories, then we could again add a secret salt to this third hash operation, but in most cases it would seem that salting would not be needed for the third hash operation.

[Diagram: step3, using a third hash to derive the storage location]
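For those who prefer code over diagrams, here is a minimal sketch of the three hash steps, assuming HMAC-SHA256. The exact arrangement of the HMAC inputs, the constants and the encodings are illustrative assumptions of mine, not the actual MinorFs2 format:

```python
import hmac
import hashlib

ATTENUATE = b"attenuate"        # shared, non-secret constant
LOCATION = b"location"          # shared, non-secret constant
SECRET_SALT = b"server-secret"  # known only to the server acting as proxy

def _h(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def attenuate(strong_name: bytes) -> bytes:
    """Unattenuated strong name -> attenuated (e.g. read-only) strong name."""
    return _h(strong_name, ATTENUATE)

def decompose(attenuated_parent: bytes, weak_child_name: bytes) -> bytes:
    """Attenuated parent strong name + weak child name -> unattenuated child
    strong name.  Needs the secret salt, so only the server can do this."""
    return _h(SECRET_SALT, attenuated_parent + b"/" + weak_child_name)

def storage_path(attenuated: bytes) -> str:
    """Attenuated strong name -> relative path of the encrypted serialisation."""
    return _h(attenuated, LOCATION).hex()

# The holder of the unattenuated parent name may be handed the unattenuated
# child name; the holder of only the attenuated parent name may be handed
# only the attenuated child name.
parent_full = hashlib.sha256(b"Rumpelstiltskin").digest()
parent_ro = attenuate(parent_full)
child_full = decompose(parent_ro, b"Bob")   # given out only to holders of parent_full
child_ro = attenuate(child_full)            # given out to holders of parent_ro
# parent_ro also doubles as the encryption key for the node stored at storage_path(parent_ro).
```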

I hope that this follow-up blog post helps in the wider comprehension of what I think is a very interesting algorithm for creating attenuable, decomposable, unidirectional tree graphs using password-capabilities/strong-names.

Rumpelstiltskin and his children

When communicating about my efforts on the upcoming version of my least authority file-system collection, I often seem unable to fully communicate the base design of the core capability-based file-system, even to many cryptographically educated people. In this post I’ll try to elaborate on the algorithm that lies at the center of all my efforts: The Rumpelstiltskin hash-tree algorithm.

Rumpelstiltskin

[Image: Rumpelstiltskin, Classics Illustrated Junior]

In a fairy tale by the Grimm brothers that most of you will probably know by heart, a girl, as a result of her father’s big mouth, found herself in a rather problematic situation. The king had locked her in a tower room filled with straw and had declared that she be decapitated unless she would spin the straw into gold by morning. An imp showed up and offered to help her out in exchange for her necklace, and the girl gladly accepted the offer. Next day, a larger room with more straw, same story, now in exchange for her ring. The third and final day, an even larger room with even more straw, but as the girl has nothing left to give in exchange, the imp makes the girl promise to give up her firstborn child in exchange for his services.

Later in the story, the girl, now a queen married to the king, is visited again by the imp, who comes to claim his prize. After pleading with the imp, the imp agrees that he will give up his claim if the queen guesses his true name within three days. Then, by chance, the queen’s messenger overhears the imp chanting:

“Today I bake, tomorrow I brew, then the Queen’s child I shall stew. For nobody knows my little game, for Rumpelstiltskin is my name.”

Learning his name, the queen is able to wield authority over the evil imp, forcing him to give up his claim to her firstborn.

You could say that the unguessable name “Rumpelstiltskin” is a token of authority not unlike what in information security theory we refer to as a capability, or more precisely a password-capability.

Directional tree graph

The CapFs file-system I’m working on, like most file-systems, constitutes a tree. Unlike many file-systems though, the CapFs tree structure aims to be purely directional. That is, while in DOS, Unix, OSX or Linux we are used to navigating ‘up’ using a command like “cd ..”, the CapFs ‘directional’ tree adheres to the rule that:

“Authority to a node implies authority to its children but NOT to its parent.”

Names and true names

In our fairy tale, the imp’s true name was Rumpelstiltskin (cap). This was the name that implied authority over the imp. The imp might have had another, non authoritative, name used by others to address him, but if he had one it was not relevant for the story. Maybe Rumpelstiltskin himself did not have another non authoritative name, but if he had any children, and those children had true names like him that could be used to wield authority over them, Rumpelstiltskin would probably often address his children by their non authoritative names.

Now, talking of authority, if Rumpelstiltskin (cap) wanted to delegate the authority over one of his children to another imp, what options would there be? Let’s assume Rumpelstiltskin (cap) had a kid named Slartibartfast (cap) for whom he used the casual (non authoritative) name Bob. There would be two ways someone could come to wield authority over Bob: either by using his true name Slartibartfast, or by having authority over his parent Rumpelstiltskin, which by proxy would give authority over Bob. There are thus two authoritative ways to designate Bob:

  1. Slartibartfast
  2. Rumpelstiltskin::Bob

Back to our directional tree

[Diagram: directional tree graph]

So if we project Rumpelstiltskin and Bob onto our directional tree graph, we could say that Rumpelstiltskin might be our tree’s root node. Rumpelstiltskin, being the root node, has no name but his true, authority carrying name. Bob and any other of Rumpelstiltskin’s descendants would have two names:

  • A normal name that implies no direct authority (name)
  • An unguessable name that implies authority (capability)

Attenuation

Where using a name and a capability (true name) for each non-root node in our tree takes care of making authority over branches and leaves decomposable, the authority over any sub branch that is implied by a capability is still absolute. If we look back at our file-system example, one might want to delegate the authority to read without delegating the authority to write. We could provide non-root nodes with a second capability that implies attenuated authority to a node, for example read-only. So now we have:

  • A normal name that implies no direct authority (name)
  • An unguessable name that implies FULL authority (capability)
  • An unguessable name that implies attenuated authority (capability)

The algorithm

Given that we have only a single type of attenuation (for example read-only), there is an interesting algorithm we can use to securely calculate new unguessable names for a non-root node, given its name and the unguessable name of its parent.

[Diagram: TripleHash]

In the above diagram Key1 would be the equivalent of ‘Rumpelstiltskin’, Key2 would be the equivalent of a read-only capability to Rumpelstiltskin. The subnode name would for example be ‘Bob’,  and Subnode Key1 would be the equivalent of Slartibartfast.

Going from Rumpelstiltskin to a read-only attenuated capability to Rumpelstiltskin is straightforward. We basically take an HMAC hash of the unattenuated authority capability (Rumpelstiltskin) and some static string, and use a representation of the resulting key2 as attenuated authority capability. There is no secret salt needed in this step, so a shared static string is used instead.

For going from a parent capability to a child capability, there are two scenarios:

  • From unattenuated parent capability to unattenuated child capability.
  • From attenuated parent capability to attenuated child capability.

To allow both scenarios to be used, the calculation of the unattenuated child capability is done by taking an HMAC hash of the parent’s attenuated authority capability and the child’s name, together with a secret salt.

The secret salt makes sure that we can create a system where a client holding an attenuated authority capability can ask a server for an attenuated authority child capability, but can’t itself derive an unattenuated authority child capability from an attenuated authority parent capability.

Secure persistence

When looking at the diagram of the Rumpelstiltskin hash-tree algorithm, we see there is yet another HMAC operation and a key3 that we did not yet discuss. The problem is that if we want to implement some form of persistence using relatively public storage, for example with a cloud storage provider we don’t fully trust to keep our big bag of secrets secret, we will want to:

  • encrypt the physical files.
  • store our files in a way that does not disclose capabilities.

In order to address the first issue, we shall let our attenuated authority capability double as the file encryption key for our file. We then use a third HMAC hash operation to create a fourth name (key3) that we use to determine the location where the encrypted file is stored.

High-security versus high-scalability

Regarding the final HMAC hashing step there are two possible implementation modes for the Rumpelstiltskin hash-tree algorithm: high-security and high-scalability. In the high-scalability version, encryption and decryption could potentially be client side operations, and the server takes care only of enforcing read-only attenuation. In this mode, the storage path (key3) is calculated using an HMAC hash of the attenuated authority capability together with a static string. If in contrast we value security so much that we are willing to sacrifice the scalability that comes with client side en/de-cryption, in order to prevent a cloud storage provider with access to an attenuated authority (read-only) capability from combining his explicit authority and his implicit authority into what amounts to an unattenuated authority capability, then high-security mode would include the server’s secret salt in the HMAC calculation of key3, mandating server side encryption and decryption.
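As a sketch of the difference between the two modes (with the same caveat as before: the helper and the constants are illustrative assumptions, not the actual implementation):

```python
import hmac
import hashlib

SECRET_SALT = b"server-secret"  # only available server side

def storage_path(attenuated_cap: bytes, mode: str = "high-scalability") -> str:
    if mode == "high-scalability":
        # No server secret: a client holding the read-only capability can compute
        # the location itself and do its own encryption and decryption.
        key3 = hmac.new(attenuated_cap, b"location", hashlib.sha256).digest()
    else:  # "high-security"
        # Server secret mixed in: only the server can map a capability to its
        # location, so encryption and decryption must happen server side.
        key3 = hmac.new(SECRET_SALT, attenuated_cap + b"location",
                        hashlib.sha256).digest()
    return key3.hex()
```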

In my own project I shall be using the Rumpelstiltskin hash-tree algorithm in high-scalability mode for now. Maybe in the future switching to high-security may turn out to be desirable, but for now the hole plugged by giving up the potential for added scalability seems too small to justify the price of using high-security mode.

Conclusion

I hope reading this blog post somewhat clarifies the Rumpelstiltskin hash-tree algorithm, and I hope others may find the algorithm useful for other projects as well. I’ll be using this algorithm in MinorFs2, but I feel it might have much broader usability than just user space file-systems. Feel free to use this algorithm as you please, and comment on this post if you think this information is useful or still needs clarification somewhere.

MinorFs2

People who have read my blog, have read my article, or have been at any of my public talks know about the problems of the Unix $HOME and $TEMP facilities. MinorFs, a set of least authority file-systems, aimed to solve this problem in a relatively ‘pure’ way. That is, it was a single granularity, single paradigm solution that gave pseudo persistent processes their own private storage that could be decomposed and delegated to other pseudo persistent processes.

A few years after I released MinorFs, and a few years after I wrote a Linux Journal article about MinorFs, although it’s painful to admit, it’s time to come to the conclusion that my attempts to make the world see the advantages of the purity of model that the AppArmor/MinorFs/E stack provided have failed.

At the same time, a $HOME and $TEMP directory that are shared between all the programs a user may run is becoming a bigger and bigger problem. On one side we have real value being stored in the user’s $HOME dir by programs like Bitcoin. On the other side we have HTML5, which relies more and more on the browser being able to reliably store essential parts and information of rich internet applications.

The realization of these two facts made me come to an important conclusion: it’s time for major changes to MinorFs. Now I had two options: do I patch a bunch of changes on top of the existing Perl code base, or do I start from scratch? In the past I had tried to get MinorFs accepted as an AppArmor companion package in Ubuntu. At that point I ran into the problem that MinorFs had rather exotic Perl modules as dependencies. So if I ever want a new version of MinorFs to be accepted as a companion package for AppArmor, I would have to rewrite it quite a bit to not use those exotic dependencies. Adding to this the major changes needed to get MinorFs as practical as it gets without compromising security, I had to come to the conclusion that there was little to no benefit in re-using the old MinorFs code. This made me change my earlier assertion to: it’s time to write a new version of MinorFs from scratch.

So what should this new version of MinorFs do in order to make it more practical? And what should I do to help get it packaged in the major distributions that package AppArmor? The second question is easy: be careful with dependencies. The first question however turned out to be less simple.

I have a very persistent tendency to strive for purity of model and purity of design. But after a few years of seeing that such purity can lead to failure of adoption, I had to take a major leap and convince myself that where purity gets in the way of the likelihood of being adopted, purity has to make way.

After a lot of thinking I managed to concentrate all of the impurity into a single place. A place that allowed for configuration by the user in such a way that purity sensitive users, packagers and administrators could create a relatively pure system by way of the config, while practically inclined users, packagers and administrators could just ignore purity altogether. The place where I concentrated the impurity is the persistence-id service. This service, which didn’t exist in the old MinorFs, maps process-id’s to persistence-id’s, but it does this in a way where one process might get a persistence-id that implies a whole different level of granularity than the persistence-id another process maps to. Where the old MinorFs had only one level of granularity (the pseudo persistent process), MinorFs2 allows different processes to exist at different granularity levels according to the needs and possibilities of their code-base.
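As a purely hypothetical illustration of the idea (this is not the actual MinorFs2 code, its configuration format, or its real identity inputs), the persistence-id service could look something like this, with the configured granularity deciding how much of a process’s identity ends up in its persistence-id:

```python
import hashlib

def persistence_id(uid: int, exe_path: str, exe_digest: str,
                   parent_chain: tuple, granularity: str) -> str:
    """Map one running process to a persistence-id at the configured granularity."""
    if granularity == "user":
        material = f"{uid}"                            # behaves like the old shared $HOME
    elif granularity == "program":
        material = f"{uid}:{exe_path}"                 # one private store per program, per user
    elif granularity == "pseudo-persistent-process":
        material = f"{uid}:{exe_digest}:{':'.join(parent_chain)}"  # finest level
    else:
        raise ValueError("unknown granularity level")
    return hashlib.sha256(material.encode()).hexdigest()
```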

This is the basis of the first of my practical approaches. It allows one program that requires the existing user level granularity to co-exist with, for example, another program that benefits from living at the finest granularity level that MinorFs2 provides. I tried to come up with every potentially usable granularity level I could think of. In practice some of these levels might turn out to be useless or unneeded, but from a practical viewpoint it’s better to have too many than to be missing the one that would be the best fit for a particular program.

So what would be the most important practical implication of allowing multiple granularities, down to user level granularity? The big goal would be: to simply and effectively replace the usage of the normal $HOME and $TEMP with the usage of MinorFs2.

We should make it possible to mount MinorFs2 filesystems at /tmp and /home and have all software function normally, but without having to worry about malware or hackers with access to the same user id gaining access to their secrets, or being able to compromise their integrity.

This practical goal completely ignores the benefits of decomposition and delegation, but it does make your computer a much safer place, while in theory still allowing application developers an upgrade path to fine grained least authority.

Another practical choice I had to make was replacing the use of symbolic links with the use of overlay file-systems for minorfs2_home_fs and minorfs2_temp_fs, and disallowing ‘raw’ access to minorfs2_cap_fs for unconfined processes. I won’t get into the details of what this entails, but basically I had to abandon the golden rule that states: ‘don’t prohibit what you can’t enforce’. Unconfined processes have access to the guts of the processes running under the same uid. This makes them capable of stealing the sparse capabilities that MinorFs uses. I took the practical approach to:

  • Limit the use of raw sparse caps to delegation and attenuation (less to steal)
  • Disallow unconfined processes from directly using sparse caps

This is another practical issue that makes stuff a bit impure. In a pure world there would be no unconfined processes, and thus no way to steal sparse caps from the guts of another process. So what do we do? We break a golden rule and close the gap as well as we can, knowing that:

  • If an unconfined malicious process can convince a confined process to proxy for it, it shall be able to use the stolen sparse cap.
  • If a non-malicious unconfined process wants to use a sparse cap, it can’t.

It hurts having to make such impure design decisions, it feels like I’m doing something bad, badly plugging a hole by breaking legitimate use cases. I hope that the pain of the practical approach will turn out to be worth it, and that I’ll be able to create something with a much higher adoption rate than the old MinorFs.

Taming mutable state for file-systems.

 

After my January 2009 Linux Journal article on MinorFs, I had a talk titled ‘Taming mutable state for file-systems’ that I gave several times over the past two years. Actually I gave this talk 7 times in 2009, once more in 2010, and my last appointment to give this talk this month (May 2011) bounced at the last moment. I guess that, however much I enjoyed giving this talk, it’s unlikely that I will be giving it again. As a way to say goodbye to the material of this talk, I will dedicate this blog post to talking about my favorite talk 😉 While I did put the slides of this talk online, they were not completely self explanatory, so hopefully this blog post can open up my talk, which I probably won’t be giving any more times, to other people interested in least authority and high integrity system design.

As I hate phones going off in the middle of my talk, I like to start my talks with a bit of scare tactics. I have a crystal water jug with a no-phone-zone sticker on it that I fill with water and a fake Nokia phone. I then show my jug to the people in the room and ask them to please set their phones to silent, informing them that if they have problems doing so, I would be happy to offer my jug as a solution to that problem. Until now, these scare tactics have worked and I have been able to give my talk without being interrupted by annoying phones every single time.

My talk starts off with something I stole shamelessly from a presentation by Alan Karp. I talk to my audience about an extremely powerful program. A program that has the power to:

  • Read their confidential files.
  • Mail these files to the competition.
  • Delete or compromise their files.
  • Initiate a network tunnel, allowing their competition into their network.

Then we let the audience think about what program this might be before showing a picture of solitaire, and we explain that while we don’t expect solitaire to do these things, it does have the power to do these things. As there will always be some Linux and Mac users who enjoy laughing at the perceived insecurity of Microsoft products, I then go on to explain that this is not just a problem in the Microsoft world, but that other operating systems have exactly the same problem. That is, Linux for example is just as bad, so we change the picture in our slide from solitaire to my favorite old school Linux game sokoban.

Next we expand on the problem, saying that while sokoban might be OK, there are a lot of programs running on our system, written by even more people, with even more people in a position to compromise one of these programs into doing bad things. Then we extend it further by talking about network applications like a web browser, and how even if these are benignly written, an exploitable bug might easily transform these programs into something that will exploit the extensive powers that they are given.

Now unlike Alan’s talk, where I stole the solitaire stuff from, I don’t go on talking about how much power solitaire/sokoban has to do all these things, and how according to the principle of least authority solitaire/sokoban should not have the right to, for example, access that confidential ‘global’ data. Instead I take the opposite approach, in that I talk about what this confidential data might be, and argue that it had no reason for being global in the first place. I say that if we have an editor that was used to create a secret, this editor has no power to protect that secret from sokoban.

Then I went on to paint an extended picture of a secret, where we wanted to share confidential information written in our editor with a friend using e-mail. I painted a scenario where the user had 20 programs that she ran on a regular basis on her system. Three of these programs were our editor, a mail client and encryption software. I tried to explain that only the editor and the encryption software had any business having access to the secret.

Then we get to what I feel is the core of my talk: mutable state. I have a slide that very graphically shows the potential difference between two ways of dealing with mutable state: either as shared mutable state or as private mutable state. I won’t describe the slide in detail, but it involved some rather vivid pictures of lavatories, and what being public could lead to.

From our lavatories we came to the point where we were going to look at file systems and global mutable state, where we had to come to the conclusion that, with all the user’s programs running as the same user, the file system, for all practical purposes, only gives us public mutable state and no private mutable state. From that we went back to look at the core problems with the concept of global mutable state, which are:

  • That it can potentially be modified from anywhere.
  • That any subsystem may rely on it.
  • That it creates a high potential for mutual dependencies.
  • That it makes composite systems harder to analyze or review.
  • That it makes composite systems harder to test.
  • That it basically in many cases  violates the principle of least authority.

Now with the problem so clearly identified, and with a small kid at home who loves to watch Bob the Builder, I couldn’t resist, while creating my slides, letting Bob ask the question ‘can we fix it?’. Taking a few steps back, we have to come to the conclusion that the problem might already have been fixed in another domain: computer programming.

In computer programming we have different kinds of problematic shared mutable state:

  • global variables
  • class variables
  • singletons

I like to refer to global variables as the obvious evil, the devil we know; class variables as the lesser evil; and singletons as the devil we don’t know. So now we show what computer programming has done to solve the problem. We show that OO has given us private member variables and a concept known as pass by reference, and that it, in its basic form, has given us two lesser evils (singletons and class variables) that we can use to avoid the bigger evil (global variables). Now from two sides the lesser evils are under fire in computer programming: from the high integrity side, which gives us the object capabilities model (a subset of OO that excludes implicitly shared mutable state), and from the TDD side, where dependency injection is used as a way to address the testability issues that come with implicitly shared mutable state. Now we dive into one side of this, object capabilities, and more specifically a cute little language called E. This language shows us some of the capability based security principles that we can apply to our file system problem. Next to this, as we will later see, this language can provide us with the roof of a whole high integrity building that we will try to build.
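A toy contrast, in Python rather than E and purely for illustration, of what implicitly shared mutable state versus explicitly passed references means here:

```python
# Ambient authority: any code in the process can reach the module-level secret.
SECRET = "my confidential document"

def sokoban():
    return SECRET               # nothing stops the game from reading it

# Capability style: only code that is explicitly handed a reference can use it.
class Secret:
    def __init__(self, value: str):
        self._value = value
    def reveal(self) -> str:
        return self._value

def editor(secret: Secret) -> str:
    return secret.reveal()      # received a reference, so it has the authority

def game() -> str:
    return "no secret in sight" # never handed the reference, nothing to leak
```

Of course Python itself does not enforce this the way an ocap language like E does; the point is only to show the difference in pattern between reaching for global state and being handed a reference.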

So what makes E, or object capability languages as a whole, such a great thing? Basically, not focusing primarily on Trojans but on exploitable bugs, it’s about the size of the trusted code base. If I want to protect my secret, how many lines of code do I need to trust? In any non-ocap language the answer basically is ‘all of them’. If for example an average program is 50000 lines of C or C++ code, then if this program had access to my secret, I would be trusting 50k lines of code with my secret. Using an ocap language, it’s quite reasonable to have a core of the program, designed according to the principle of least authority (POLA), that truly needs access to the secret. The great thing about an ocap language is that it allows you to easily prove that only that core, of for example 1000 lines of code, needs to be considered trusted. So using an ocap language for trusted programs, we could reduce the size of the trusted code base per trusted program for our secret to a few percent of its original size.

Now, as starting a building at the roof is seldom a good idea, we have Bob the Builder start out with a look at the foundation. The foundation we choose consists of two components:

  1. The AppArmor access control framework for SUSE and Ubuntu.
  2. FUSE: Filesystems in Userspace, a Linux/BSD library plus kernel module for building custom filesystems in userspace.

AppArmor allows us to take away non-essential ambient authority from all our processes, including the parts of the file-system that, from a user process perspective, should be considered global mutable state. Now in the place where the public mutable state used to be, we drop in our own ‘private’ replacement, MinorFs. I won’t rehash MinorFs in this article, as it is extensively covered by my Linux Journal article, but basically it replaces the ‘public’ $TMP and $HOME with a ‘private’ $TMP and $HOME, and allows for object oriented style pass by reference between processes.

So now that we have our foundation (AppArmor, FUSE) in place, and have put our walls up (MinorFs), it’s time to look at our roof (E) again. Looking at our initial scenario of our user who wanted to share a secret document, the solution we build allows us to only have to trust our editor and our encryption tool with our secret. So instead of 20 programs of 50000 lines of code each that we need to trust, there are only two. If these two programs were implemented in E, we would have to trust only 1000 lines of code per program instead of 50000 lines. As a whole this would thus mean that our hypothetical trusted code base went down from one million lines of code to a mere two thousand lines of code, a factor of 500.

Two thousand lines of code, as opposed to one million, are quite possible and affordable to audit for trustworthiness and integrity. This means that our multi-story least authority stack can provide us with a great and affordable way of building high integrity systems. MinorFs is just a proof of concept, but it acts as an essential piece of glue for building high integrity systems. I hope people who read my article and/or this blog, and those people that were at one of my talks, will think of MinorFs and of the multi-story AppArmor+MinorFs+E approach that I advocated here, and will apply the lessons learned in their own high integrity system designs.