
Why HTTPS everywhere is a horrible idea (for now).


While privacy is a valuable thing, and while encryption in general helps improve privacy and can be used to help improve security, in this blog post I will discuss how more encryption can actually harm you when used with a fundamentally flawed public key infrastructure. Before we go on and discuss what the problems are, a bit of background.

When confidential communication is needed between, for example, your web browser and your bank, how do your browser and the bank's web server achieve it? The following steps take place:

  • Your browser will ask the internet's Domain Name System (DNS) for the IP address of 'www.yourbank.com'. The Domain Name System will resolve the name and come back with an IP address.
  • Your browser will initiate a TCP connection to the IP address it got back from the Domain Name System.
  • Once the TCP connection is established, the client and server will initiate the ‘handshake’ phase of the Transport Layer Security protocol.

After the handshake phase, everything should be fine and dandy, but is it? What would an adversary need to do to defeat the confidentiality of your connection? Given that even without encryption the adversary would need access to the transmitted data in order to read it, we shall assume the adversary is sniffing all network traffic. The first thing an adversary needs to do to defeat the above setup is to fool your browser into thinking it is your bank. It can do this quite easily, given that the Domain Name System (in its most basic form) runs on top of the User Datagram Protocol (UDP), a trivial connection-less protocol that can effortlessly be spoofed to make your browser believe your bank's server is running on the attacker's IP.

So now, after the TCP connection has been established to what your browser believes is your bank, the TLS handshake begins. Our attacker could try to impersonate our bank, or he could, and this is the case we shall look at, attempt to take the role of 'man in the middle'. That is, next to making your browser think it is your bank, it will actually connect to your bank and relay content between your browser and your bank, either just to snoop on your traffic or until it is ready to strike by changing transaction content. But let's not get ahead of ourselves. Our client has connected to our attacker and our attacker has made a connection to our bank, so the attacker's machine can act as man in the middle. What attack vectors can it use?

  • It can launch a downgrade attack by offering the real server an inferior subset of the ciphers offered by the real client. This way the cipher suite used in the connection can be forced down to the lowest common denominator between the real client and the real server, weakening the strength of the encryption or forcing the use of a cipher suite that can later be weakened by other man-in-the-middle tricks.
  • It can provide the client with a rogue certificate for 'www.yourbank.com'. This is harder to pull off, but doing so would leave the attacker with a fully decrypted stream of traffic.

The last scenario is often described as relatively unlikely. Let me try to elaborate why it is not. The certificate offered by the attacker has some security in it: your browser won't accept it unless it is signed by a 'trusted' Certificate Authority (CA). Your browser will trust only about 50 or so CAs, so that sounds kinda OK, doesn't it? Well, there is another trick with the CA-based public key infrastructure: not only will your browser trust ANY of these 50 CAs to sign ANY domain, it will also trust many sub-CAs to do so. In total there are over 600 certificate authorities in over 50 countries that might be used to sign a rogue certificate for your domain. The problem with trusting ANY CA to sign ANY domain arises from basic probability calculus, which boils down to the following horrific fact:

The probability that none of 600 equally trustworthy CAs could somehow be persuaded, tricked or compromised by our attacker in a way that would allow him to get a rogue certificate signed is equal to the probability for a single one of these CAs, raised to the power of 600.

So if I can put 100% trust in each of these 600 CAs, the cumulative trust is 100%, fine. 99.99%? We are at 94%, which is still pretty decent. 99.9%? Now things start to crumble, as we are down to only 55% cumulative trust. 99%? All cumulative trust has basically evaporated, as we are down to 0.24%.
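
To make the arithmetic concrete, here is the calculation as a tiny C program; the 600 CAs and the per-CA trust levels are simply the figures from the text above:

#include <stdio.h>
#include <math.h>

/* Cumulative trust when ANY of n equally trusted CAs may sign for any
 * domain: every single CA must remain honest, so the per-CA trust is
 * raised to the power n.  Compile with: cc trust.c -lm */
int main(void) {
    const double n = 600.0;
    const double per_ca[] = { 1.0, 0.9999, 0.999, 0.99 };
    for (int i = 0; i < 4; i++) {
        printf("per-CA trust %8.4f%% -> cumulative trust %8.4f%%\n",
               100.0 * per_ca[i], 100.0 * pow(per_ca[i], n));
    }
    return 0;
}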

While these numbers are pretty bad, we must address the subject of attacker resources. If our attacker is our neighbor's 14-year-old kid trying to mess with us from his computer, then 99.99% might well be a realistic number. A group of skilled and determined cyber criminals? 99.9% might well be realistic, and thus a real concern when communicating with our bank, at least from a technical perspective; there may be economic aspects that make this type of attack less likely for committing banking fraud. Now how about industrial espionage? Nation states? 99% sounds like a rather low estimate for those types of adversaries. Remember, 99% had us down to a cumulative trust factor of 0.24%, and that is assuming equal trustworthiness for all CAs.

So basically, in the current CA infrastructure we can safely say that HTTPS protects us from script kiddies and (some) non-targeted cybercrime attacks. It might even protect us to a certain level from mass surveillance. But that's it. It does not protect us from targeted attacks through organized criminal activities. It does not protect high-stakes intellectual property from industrial espionage, nor does it by itself protect the privacy of political dissidents and the like.

So you might say: some protection is better than none, right? Well, not exactly. There is one thing that HTTPS, and SSL in general, is perfectly good at protecting traffic from: YOU!


Remember Trojan Horses? Programs that act like one thing but actually are another. Or how about malicious content on compromised websites that exploits vulnerabilities in your browser or your browser's Flash plugin? Nasty stuff running on your machine with access to all of your sensitive data. If they want to get your data out of your computer and into theirs, then HTTPS is a good way to do it. Now compare the situation of using HTTPS for all your web traffic to using HTTPS only for connecting to sites you a) visit regularly and b) actually need protection for. In the latter situation, unexpected malicious encrypted traffic will stand out. It's not my bank, it's not my e-mail, I'm not ordering anything, so why am I seeing encrypted traffic? When using HTTPS for every site that offers it, though, we are creating a situation where Trojans and other malware can remain under the radar.

But back to the issues with the CA-based infrastructure. There is another issue: the issue of patterns of trust. When you hire a contractor to work on your shed, there is a common pattern of introduction that is the biggest factor in the trust involved in getting your shed fixed. The contractor will introduce you to the carpenter, and after that there will be a partial trust relationship between you and the carpenter that is a sibling of the trust relationship you have with the contractor. In modern web-based technologies similar relationships are not uncommon, but the CA-based architecture is currently the only mechanism available, and it is a mechanism that doesn't allow for the concept of secure introduction. While domain-name-based trust might be suitable for establishing trust with our contractor, introduction-based trust is completely immune to the kinds of problems that domain-name-based trust initiation suffers from. To establish the introduction-based trust, the server equivalent of the contractor could simply send us a direct unforgeable reference to the server equivalent of the carpenter. It's as if the contractor has issued a mobile phone to the carpenter for communication with clients and then gives the phone number to the client. Not a normal phone number, but some maybe 60-digit phone number that nobody could dial unless they had been handed it first. The client knows that the person answering the phone should be the carpenter. The carpenter knows that the person calling him on that number should be a client. No CAs or certificates needed, period. Unfortunately though, the simple technology needed for the use of these webkeys currently isn't implemented in our browsers. With a webkey, no certificate or CA is needed, as the measures for validating the public key of the server are hard-coded into the URL itself.
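
To make the webkey idea a bit more tangible, here is a minimal sketch in C, under the assumption (mine, purely for illustration) that the unguessable part of the URL carries a SHA-256 fingerprint of the server's public key:

#include <string.h>
#include <openssl/sha.h>

/* Hypothetical webkey validation: the URL itself pins the server's
 * public key, so no certificate and no CA are consulted at all. */
int webkey_matches(const unsigned char *server_pubkey_der, size_t der_len,
                   const unsigned char expected_fp[SHA256_DIGEST_LENGTH]) {
    unsigned char actual_fp[SHA256_DIGEST_LENGTH];

    /* Hash the public key the server presented during the handshake... */
    SHA256(server_pubkey_der, der_len, actual_fp);
    /* ...and accept the connection only if it matches the fingerprint
     * that was baked into the unguessable URL we were handed. */
    return memcmp(actual_fp, expected_fp, SHA256_DIGEST_LENGTH) == 0;
}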

So on one side we have an outdated DNS system, a set of outdated legacy cipher suites and a dangerously untrustworthy CA infrastructure undermining the positive side of using HTTPS, and on the other side we have untrustworthy programs with way too many privileges running on our machines exposing us to the negative sides of HTTPS. The illusion of security offered by a mediocre security solution like this can be much worse than using no security at all, and the Trojan and malware aspects make it worse than that. Basically, there are quite a lot of things that need fixing before using HTTPS everywhere stops being detrimental to security. The following list isn't complete, but gives an idea of the kind of things we need before HTTPS everywhere becomes a decent concept:

  1. Least authority everywhere. First and foremost, trojans and exploited clients should not by default be ‘trusted’ applications with access to our sensitive data.
  2. The current CA based PKI infrastructure with >600 ‘trusted’ CA’s in over 50 countries must be replaced urgently:
    1. NameCoin everywhere. Using technology like that demonstrated by NameCoin is one possible path.
    2. DANE everywhere. The extensive use of DNSSEC with DANE could offer an alternative to CAs that is a significant improvement over the CA-based infrastructure.
  3. TLS 1.2 everywhere. Older versions of TLS, and many of the ciphers defined in those standards, should be deprecated.
  4. DNSSEC everywhere.
  5. WebKeys everywhere. We really need webkey support in the browser.

I realize the above rant may be a tad controversial and not in line with the popular view that encryption is good medicine. I do however find it important to illuminate the dark side of HTTPS everywhere and to show that it's just one piece from the center of the puzzle, while it would be much better to start solving the puzzle with edge pieces instead.

Security: debunking the ‘weakest link’ myth.


"The user is the weakest link", "Problem Exists Between Keyboard And Chair", "Layer 8 issue". We have all heard these mentioned hundreds of times, and most security specialists truly believe them to be true. In this blog post I will not only show that NO, the user is not the weakest link; I hope to also show that the 'belief' that the user is the weakest link may be the reason our information security industry appears to be stuck in the 1990s.

Harsh words? Sure, but bear with me as I try to explain. Once you understand the fallacy and the impact of the idea that the user is the weakest link in the security chain, you are bound to be shocked by what the industry is selling us today: that they are, in fact, selling shark cages as protection against malaria.

There are at least six major weak links in today's information security landscape. The first one we know about: the user. There is no denying the user is a weak link, especially when working with many of today's security solutions, but there are five other important weak links we need to look at, links that arguably would all need to be stronger than the user in order for the user to be considered the weakest link. I hope to show that not one, but every single one of these other five links is in fact significantly weaker than our user. Here is the full list; I will explain each bullet below:

  • The user
  • Disregard for socio-genetic security-awareness
  • Identity centric security models
  • Single granularity abstractions
  • Public/global mutable state
  • Massive size of the trusted code-base


Let's first look at who our user is. Our user, at least in most cases, will be a member of the human race. We humans share many millennia of history, and during all of those millennia we have arguably been a species that uses social patterns of cooperation to accomplish great things. One of the pivotal concepts in these cooperative patterns has always been delegation. Imagine our human history with a severe restriction on delegation: we would probably still be living in caves, if we had not gone extinct, that is. Delegation is part of our culture; it's part of our socio-genetic heritage. We humans are 'programmed' to know how to handle delegation. Unfortunately however, free delegation is a concept that many a security architect considers an enemy of security. When users share their passwords, 99 out of 100 security people will interpret this as a user error, while the user is simply acting the way he was programmed to act: he is delegating in order to get work done. So what do security people do? They try to stop the user from delegating any authority by coming up with better forms of authentication, forms that completely remove the possibility of delegating authority. Or they try to educate the user into not sharing his password, resulting in less efficient work processes. The true problem is that, lacking secure tokens of 'authority' that the user could use for delegation, the user opts to delegate the only token of authority he has: his 'identity'. Not only are we ignoring the user's strength in his ability to use patterns of safe collaboration, we are actually fighting our own socio-genetic strengths by introducing stronger authentication that stops delegation. Worse, by training our users, we are forcing them to unlearn what should be their primary strength.

While we are on the subject of passwords, consider that delegating a username/password equates to functional abdication, and consider that safe collaboration requires decomposition, attenuation and revocability. Now look at what happens when you want to do something on your computer that requires user approval. In many cases you will be presented with a pop-up that asks you to confirm your action by providing your password. Wait: if delegation of a password is potential abdication of all the user's powers, we are training our users into abdicating any time they want to get some real work done. Anyone heard of Pavlov and his dog? Well, our desktop security solutions are apparently in the business of supplying our users with all the Pavlovian training they need to become ideal phishing targets. Tell me again how the user is the weakest link!

If we realize that we can tap into the strengths of the user's socio-genetic security awareness by facilitating patterns of safe collaboration between the user and other users, and between the user and the programs he uses, it becomes clear that while passwords are horrible, they are horrible for a completely different reason than most security people think. The main problem is not that they are horrible tokens of authentication and that we need better authentication that stops delegation altogether. The problem is that they are horrible, single-granularity, non-attenuable and non-decomposable tokens of authorization. Our security solutions are too much centered on the concept of identity, and too little on the concept of authority and its use in safe collaborative patterns.

Identity is also a single-granularity concept. As malware has shown, the granularity of individual users for access control becomes meaningless once a trojan runs under the user's identity. Identity locks access control into a single granularity level, while access control is relevant at many levels, going up as far as whole nations and down as deep as individual methods within a small object inside a process that is an instantiation of a program run by a specific user. Whenever you use identity for access control, you are locking your access-control patterns into that one, rather coarse granularity level, even though many of the relatively safe cooperative interaction patterns between people are not actually that different from the patterns possible between individual objects in a running program. Single-granularity abstractions such as identity are massively overused and are hurting information security.

It's not just identity, it's also how we share state. Global, public or widely shared mutable state creates problems at many granularities:

  • It makes composite systems hard to analyse and review.
  • It makes composite systems hard to test.
  • It creates a high potential for violating the Principle Of Least Authority (POLA).
  • It introduces a giant hurdle for reducing the trusted code-base size.

We need only look at Heartbleed to understand how the size of the trusted code-base matters. In the current access-control ecosystem, the trusted code-base is so gigantic that there simply aren't enough eyeballs in the world to keep up with everything. In a least-authority ecosystem, OpenSSL would have been part of a much smaller trusted code-base that would never have allowed a big issue such as Heartbleed to stay undiscovered for as long as it did.

So let's revisit our list of weak links and ask which one could be identified as the weakest link.

  • The user
  • Disregard for socio-genetic security-awareness
  • Identity centric security models
  • Single granularity abstractions
  • Public/global mutable state
  • Massive size of the trusted code-base

While I'm not sure which one is the weakest link, it's pretty clear that the user won't become the weakest link until we've addressed each of the other five.

I hope the above has convinced you not only that the user is indeed not the weakest link, but that many of our efforts to 'fix' the user have not merely been ineffective, they have been outright harmful. We need to stop creating stronger authentication and educating users not to share passwords until we have better alternatives for delegation. We need to stop overusing identity and subjecting our users to Pavlovian training that turns them into ideal phishing victims. And we need to start realizing that the socio-genetic security awareness of our users is a large, almost untapped foundation for a significantly more secure information security ecosystem.

The relevance of Pacman and the van de Graaff generator to peer to peer networking in IPv4 networks.

In the late 1990s I was working on an OSI-like layering model for peer-to-peer networks. In the early 2000s the Code Red worm hit the internet, and a person I hold in high regard, who was aware of some of my work, ended up appealing to my sense of responsibility regarding the possible use of my algorithms in internet worms. After thorough consideration I decided to stop my efforts on a multi-layered pure-P2P stack and remove all information on the lowest-layer algorithm I had come up with.

About half a decade later, when P2P and malware were starting to rather crudely come together, I ended up discussing my algorithm with a security specialist at a conference, and he advised me to write a limited public paper on a hypothetical worm that would use this technology. Worms in those days were rather crude and, let's call it, 'loud', while my hypothetical Shai Hulud worm would be rather stealthy. I did some simulations and ended up sending a simple high-level textual description to some of my CERT contacts, so they would at least know what to look for. APT wasn't a thing yet in those days, so looking for low-footprint patterns that might hide a stealthy worm really was not a priority for anyone.

Now another half decade has passed, and crypto-currencies, BitCoin, etc. have advanced P2P trust way beyond what I envisioned in the late 1990s. Next to that, the infosec community has evolved quite a bit, and a worm, even a stealthy one, should be within the scope of modern APT-focused monitoring. Further, I still believe the algorithms may indeed prove useful for the benign purpose I initially envisioned them for: layered trusted pure P2P.

[Image: a van de Graaff generator]

I won't give the exact details of the algorithm. Not only do I not want to make things too easy for malware builders; even if I wanted to, quite a bit of life happened between Code Red and now, and my original files were lost along the way, so I have to work from memory (if anyone still has a copy of my original Shai Hulud paper, please drop me a message). As it is probably better to be vague than to be wrong, I won't give details I'm not completely sure about anymore.

So let's talk about the two concepts that inspired the algorithm: the van de Graaff generator and the Pacman video game. Many of you will know the van de Graaff generator (pictured above) as something purely fun and totally unrelated to IT. The metal sphere used in the generator, though, has a very interesting property: the electrons on a charged sphere space themselves perfectly evenly over its surface, and the mathematics that lets one calculate how this happens is basic high-school math. The first version of my algorithm was based on mapping the IPv4 address space onto a spherical coordinate system, and while the math was manageable, I ended up with the reverse of the problem that projecting a round globe onto a flat map gives: too many virtual IP addresses ended up at the poles.


Then something hit me: in the old Pacman game, if you went off the flat surface on the left edge, you came out on the right, and vice versa. If you take the IPv4 address space and use 16 bits for the X axis and 16 bits for the Y axis, you can create what basically is a Pacman-sphere tile. If you then place copies of that tile around a central tile on all sides, you end up with a 3×3 square of copies of your Pacman tile. Now we get back to our electrons: it turns out that if we take a single electron on our central tile and take the position of that electron as the center of a new virtual tile, we can use that virtual tile as a window for the electrodynamics relevant to that individual electron. Further, it turns out that if we disregard all but the closest N electrons, the dynamics of the whole Pacman sphere remain virtually the same.
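
In code, the wrap-around geometry is trivial. Here is a sketch of the distance function on such a Pacman tile; the 16/16-bit split of the IPv4 address is from the text, the helper names are my own:

#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* Shortest distance along one wrapping 16-bit axis of the tile. */
static double axis_delta(uint16_t a, uint16_t b) {
    double d = fabs((double)a - (double)b);
    return d > 32768.0 ? 65536.0 - d : d; /* go around the edge instead */
}

/* Distance between two IPv4-derived positions on the Pacman tile:
 * high 16 bits are the X coordinate, low 16 bits the Y coordinate. */
static double torus_distance(uint32_t ip_a, uint32_t ip_b) {
    double dx = axis_delta((uint16_t)(ip_a >> 16), (uint16_t)(ip_b >> 16));
    double dy = axis_delta((uint16_t)(ip_a & 0xFFFF), (uint16_t)(ip_b & 0xFFFF));
    return sqrt(dx * dx + dy * dy);
}

int main(void) {
    /* Two addresses near opposite corners end up close on the torus. */
    uint32_t a = (10u << 16) | 100u;
    uint32_t b = (65530u << 16) | 65500u;
    printf("wrapped distance: %.1f\n", torus_distance(a, b));
    return 0;
}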

This was basically the core concept of part one of the algorithm. Every node in a peer-to-peer network has two positions:

  • A static position determined by its IP
  • A dynamic position determined by the electrodynamic interaction with its peers.

A node would start off with its dynamic position set to its static position, and would start looking for peers by scanning random positions. If another node was discovered, information on the dynamic positions of the node itself and its closest neighbours would be exchanged. Each node would thus keep in contact only with its N closest peers in terms of virtual position. This interaction makes the connected nodes' virtual locations spread relatively evenly over the address space, as the sketch below illustrates.
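
Here is what one relaxation step of those dynamics might look like, under the simplifying assumption (mine) of an inverse-square repulsion between a node and its N closest peers; the constants and names are illustrative only:

#include <math.h>

#define TILE 65536.0 /* length of each wrapping 16-bit axis */

typedef struct { double x, y; } vpos;

/* Shortest signed displacement along one wrapping axis. */
static double wrap(double d) {
    if (d >  TILE / 2) d -= TILE;
    if (d < -TILE / 2) d += TILE;
    return d;
}

/* One relaxation step: push our dynamic position away from the n
 * closest peers as if all nodes carried the same electric charge. */
static void relax(vpos *self, const vpos *peers, int n, double gain) {
    double fx = 0.0, fy = 0.0;
    for (int i = 0; i < n; i++) {
        double dx = wrap(self->x - peers[i].x);
        double dy = wrap(self->y - peers[i].y);
        double d2 = dx * dx + dy * dy + 1e-9; /* avoid division by zero */
        double d  = sqrt(d2);
        fx += dx / (d2 * d); /* unit vector scaled by 1/d^2 */
        fy += dy / (d2 * d);
    }
    self->x = fmod(self->x + gain * fx + TILE, TILE);
    self->y = fmod(self->y + gain * fy + TILE, TILE);
}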


Once stabilized, there would be a number of triangles between our node and its closest peers equal to N. Each individual triangle can be divided into six smaller triangles, which can then be divided between the nodes.


Now comes the stealthy part of the algorithm, the part that had me and others so scared that I felt it important to keep this simple algorithm hidden for well over a decade: given that there is hardly any overlap between the triangles, each node would have exactly 2N triangles worth of address space to scan for peers, and any given triangle would be scanned by only one connected peer.

The math for this algorithm is significantly simpler even than the simple high-school math needed for the three-dimensional sphere. I hope this simple algorithm can prove useful for P2P design, and I trust that in 2014 the infosec community has grown sufficiently to deal with the stealthiness this algorithm would imply if it were used in malware. Further, I believe that advances in distributed trust have finally made the use of this algorithm in solid P2P architectures a serious option. I hope my insights in choosing this moment to finally publish this potentially powerful P2P algorithm are on the mark, and that I am publishing it neither before the infosec community is ready nor after the P2P development community needed it.

Killing the goose that lays the golden eggs (infosec)

We all know there is a lot of money being spent on information security products, services, training, etc., and we all know there is still a lot of damage from cyber-crime and other types of information security breaches. But only when we add up the numbers and look at what free-market principles have turned the information security industry into does it become clear that there is something very, very wrong with information security today. Many of my best friends work in this industry, so I imagine I might have a few less friends after posting this blog post, but I feel that the realisation I have about the industry just screams to be shared with the world. Please don't kill the messenger, but feel free to set me straight if you feel my assertions below are in any way unfair.

If we look at information security, we see that the market size of the information security industry is somewhere around 70 billion USD per year. If we look at what this bus-load of money is supposed to protect us from, cyber-crime and other information-security-related damages, we see there is a lot of room for improvement. The total yearly global damage from cyber-crime and other incidents related to information security failure currently, according to different sources, seems to be somewhere between 200 and 500 billion USD. If we take it to be somewhere in the middle, we can estimate the total yearly cost of information security products, services and failures at round about 420 billion USD per year. To put this into perspective: with 7 billion people on this planet and a worldwide GDP per capita of about 10,000 USD, that adds up to about 0.6% of the world's total GDP. If we scale this to the GDP per capita of western countries like the US or the Netherlands, we end up with every US man, woman and child paying on average $300 a year in information-security-related costs, or about €200 for every man, woman and child in the Netherlands. For a US family of four, for example, this adds up to about $200 for information security products and services and $1000 for damages, about $100 a month in total.
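
Spelled out, the back-of-the-envelope arithmetic looks like this (all inputs are the estimates from the paragraph above):

#include <stdio.h>

int main(void) {
    const double industry = 70e9;  /* infosec market size, USD per year     */
    const double damages  = 350e9; /* midpoint of the 200-500 billion range */
    const double total    = industry + damages;
    const double people   = 7e9;   /* world population                      */
    const double gdp_pp   = 10e3;  /* worldwide GDP per capita, USD         */

    printf("total yearly cost  : %.0f billion USD\n", total / 1e9);
    printf("share of world GDP : %.2f%%\n", 100.0 * total / (people * gdp_pp));
    printf("cost per capita    : %.0f USD (world average)\n", total / people);
    return 0;
}

Scaling that 60 USD world average up by the roughly five-fold higher GDP per capita of the US is what gives the $300 per person mentioned above.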

Information security apparently is both relatively inefficient and relatively expensive.  So what’s the problem with information security? Can’t we fix it to at least be more effective?

As anyone who has been reading my blog before will probably know, I'm very much convinced that by using different techniques and paradigms, to either reduce the size of the trusted code-base or to sync information security models with our socio-genetic security awareness, it should be possible to greatly improve the integrity of information technology systems and, more importantly, to reduce the impact and cost of security breaches. I'm pretty much convinced that with the right focus we could make information security about an order of magnitude more effective, potentially at an order of magnitude less cost.

If we translate this to the numbers above, we should be able to reduce the damage done by cyber-crime and other infosec breaches for our US family of four to about $100 per year, which would be about 1% of total global IT spending, while at the same time reducing the cost of information security products and services for our family to about $20, or about 0.2% of total global IT spending.

Sounds good, right? Well no, at least not from an investor's point of view, apparently. While to most of us this should sound like a desirable cost reduction, it apparently isn't a realistic idea. When, half a decade ago, I was attempting to get investors to buy into an infosec product I wanted to build a start-up around, it turned out that potential investors don't really like the idea of reducing the information security market size by an order of magnitude, or even the idea of making information security significantly more effective. To them, doing so would be the equivalent of killing the goose that lays the golden eggs.


So if investors aren’t going to allow the infosec industry to become the lean and mean information technology protection machine that we all want it to be, how can we kill the goose without solid investments?

From a commercial perspective, and this is basically my personal interpretation of the feedback I got from my talks with what I thought would be potential investors or partners, information security products should:

  • Not significantly reduce revenues from other information security investments by the same investors.
  • Never saturate the market with one-time sales, so either it should require periodic updates or it should generate substantial consulting and/or training related revenues.
  • Allow the arms-race to continue. Keep it interesting and economically viable for the bad guys to invest in breaking today's security products, so tomorrow we can create new products and services to sell.

In contrast, for the people buying them, information security products should:

  • Reduce the total cost of IT system ownership.
  • Be low-maintenance.
  • Be cognitively compatible with (IT) staff.
  • Make it economically uninteresting for the bad-guys to continue the arms-race.

So do free-market economic principles make it impossible to move information security into the realm where the second list of desirables can be satisfied? In the current IT landscape it seems they do. Information security vendors are rather powerful and very capable of spreading the fear, uncertainty and doubt needed to scare other parties away from reducing the need for their services and products. This seems especially obvious in the case of operating system vendors. The OLPC BitFrost project, for example, has shown the world what is possible security-wise with the simple concept of mutually exclusive privileges for software. It would be trivial for Google to implement such a scheme for Android, effectively eradicating over 90% of today's Android malware and making additional AV software lose most of its worth. Apple introduced the concept of a PowerBox-based flexible jail to its desktop operating system, potentially effectively eradicating the need for AV. A bit later, AV vendors launched a media offensive claiming Apple was years behind its main competitor regarding security and stating they were willing to help Apple clean up the mess. Such claims stick, given that most of us assume infosec vendors know more about infosec than OS vendors do, especially given the earlier track record of what used to be the OS-market monopolist. Infosec vendors, especially AV vendors, know very well how to play the FUD game with the media, in such a way that they seem to effectively keep OS vendors from structurally plugging the holes they need for selling their outdated technology. I'm pretty sure that Microsoft, Google and Apple are perfectly capable of finding solutions that make their OSes significantly more secure without AV products than they would ever be with any upcoming generation of add-on AV protection. OLPC's BitFrost has shown what is possible without the need for backward compatibility, while HP Labs' Polaris, and I dare claim my own MinorFS project, have shown that very much is possible in the realm of retrofitting least authority onto existing operating systems. OS vendors are making small steps, but given that they are rightfully scared of the media power that FUD-spreading AV companies can apparently command, they cannot be expected to kill the goose that lays the golden eggs.

So how about open source? Forget about Linux, at least the kernel-related stuff: much of the development on Linux is being done by companies with a large interest in infosec services, and even the companies without such an interest have much to fear from AV-company-induced FUD in the media. But the concept of open-source goose killing is quite an interesting one. We are trying to reduce global infosec-related cost by many, many billions, while a few handfuls of projects, each requiring the equivalent of mere millions in man-hours, would likely combine to make a major impact on the technical level. Investors won't help; it's not in their interest. OS vendors have too much to lose when they pick a fight with AV vendors, and openly investing in goose killing would be an outright declaration of war against the AV industry. While spare-time open-source projects can produce great products, spare time is scarce, and for most of us open-source spare-time development is a relatively low priority. So to make any impact, at least part of the people working on such projects should have the development of these products as a source of income. Volunteers are invaluable, but we can't work with volunteers alone if we want to overthrow the information security industry. And we don't want to fall into the same trap that infosec vendors and investors have fallen into: any commercial interest in the end product would be contrary to the goals we are trying to achieve. So how could we fund these developers?

The best positioned party, and the only one with a slight chance of success, would seem to be a non-profit charity organisation, free from commercial ties to infosec vendors, OS vendors and service providers. Such an organisation could act with the purpose of:

  • Funding the partially paid development of free and open-source initiatives that show promise of both reducing global IT-security-related costs and increasing the integrity, confidentiality, privacy and availability of computing devices and IT infrastructure.
  • Coordinating contact between volunteer developers and new projects, and handling procedures that allow talented volunteers to go from volunteer to (part-time) paid developer.
  • Marketing these projects.
  • Defending all such projects against legal and media FUD campaigns by the AV industry.

Could this become reality? I think with the right people it could. I know I could not play more than a small role in its creation, but I would definitely put private time and money into such an organisation, and if and when others did likewise, we would have a great place to start from. I think it is important for the progress of the information security field that we kill the goose that lays the golden eggs. OS vendors used to be what was holding back infosec; now it is the information security industry itself, most notably the AV industry, which has almost become a media variant of a protection racket.

 

Rumpelstiltskin and his children: Part 2

In my previous post I tried to explain the Rumpelstiltskin tree-graph algorithm. In the second part of that explanation, the part explaining the actual algorithm, I apparently ended up skipping a few steps that were essential for understanding what the algorithm is supposed to do and how it relates to the directional tree. So in this follow-up I'll try to fill the holes I left.

So let's start off with the directional tree graph from my previous post:

[Figure: the directional tree graph]

The first thing that is important here is that each node in this graph has both a strong name (a password capability) and a weak name (a name that carries no authority by itself). The arcs between the nodes above are the strong names.

Now consider that each node has knowledge only of its own strong name and of the weak names of its children. If we were only concerned with maintaining the unidirectionality of our tree graph, then the following scheme would allow us to derive a child's strong name knowing the parent's strong name and the child's weak name:

[Figure: step 1a, deriving a child's strong name from the parent's strong name and the child's weak name]

So if we have the strong parent name 'Rumpelstiltskin' and the weak child name 'Bob', we can use some kind of one-way function to determine the strong name of that child, in this case 'Slartibartfast'. Having the name 'Rumpelstiltskin' thus implies authority over the parent node and all of its children, but the name 'Slartibartfast' implies no knowledge of or authority over the parent node.

Next to decomposition, there is another aspect we are interested in: attenuation. If attenuation were the only aspect we were interested in, then another use of a one-way hash function would suffice:

[Figure: step 1b, deriving an attenuated strong name from an unattenuated one]

In this picture our node comes with two strong names. The first name gives unattenuated authority over the node. Using a one-way hash function, we create a second strong name for the same node. This second name implies attenuated access to the node; in the case of a file-system, this second strong name could be a read-only capability to a file or directory.

Now comes the interesting part: we want to combine the concept of decomposition with the concept of attenuation in a single algorithm, in such a way that, and this is the essential quality of the Rumpelstiltskin tree-graph algorithm, "attenuated decomposition results in the same strong name as decomposed attenuation".

[Figure: step 2, combining attenuation and decomposition using a secret salt]

At first glance it would seem that we've just combined the two diagrams, but there is an essential addition: the use of a secret salt. As in the attenuation-only diagram, we use a one-way hash to create a strong name for attenuated access, but instead of using the parent's unattenuated strong name to determine the unattenuated strong name of the child, we now use the strong name that implies attenuated access to the parent node to create the strong name that implies unattenuated access to the child node. Without the secret salt, doing so would imply that someone with attenuated authority over the parent could unduly gain unattenuated access to the child. By introducing the secret salt into the second hash operation, we let the parent node act as proxy for all decomposition operations. That is, the parent node would need to:

  • Allow decomposing unattenuated authority to a parent to strong names implying unattenuated authority to the child.
  • Allow decomposing attenuated authority to a parent to strong names implying attenuated authority to the child.
  • NOT allow decomposing attenuated authority to a parent to strong names implying unattenuated authority to the child.

The fact that the holder of the strong name does not have access to the secret salt ensures that these properties can be guaranteed by the algorithm. Looking at the above diagram, it's easy to see that attenuation of decomposition and decomposition of attenuation lead to exactly the same strong names; to make that possible without unintended privilege escalation, we had to introduce the (server-side) secret salt.
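
Here is a minimal sketch of the two derivations in C, assuming HMAC-SHA256 as the one-way function; the 'attenuate' label, the buffer sizes and the function names are my own illustration, not the actual implementation:

#include <string.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

#define KEYLEN 32 /* SHA-256 digest size */

/* Attenuated (e.g. read-only) strong name: one HMAC of the unattenuated
 * strong name over a shared static label; no salt is needed here. */
static void attenuate(const unsigned char key1[KEYLEN],
                      unsigned char key2[KEYLEN]) {
    unsigned int len = KEYLEN;
    HMAC(EVP_sha256(), key1, KEYLEN,
         (const unsigned char *)"attenuate", 9, key2, &len);
}

/* Unattenuated strong name of a child: an HMAC keyed with the parent's
 * ATTENUATED strong name, over the secret salt plus the child's weak
 * name.  Only the server holds the salt, so only the server can
 * decompose.  (A real implementation would bound-check the buffer.) */
static void decompose(const unsigned char parent_key2[KEYLEN],
                      const char *weak_name,
                      const unsigned char *salt, size_t salt_len,
                      unsigned char child_key1[KEYLEN]) {
    unsigned char msg[256];
    unsigned int len = KEYLEN;
    size_t name_len = strlen(weak_name);

    memcpy(msg, salt, salt_len);
    memcpy(msg + salt_len, weak_name, name_len);
    HMAC(EVP_sha256(), parent_key2, KEYLEN,
         msg, salt_len + name_len, child_key1, &len);
}

Because the child's unattenuated name is derived from the parent's attenuated name, the two orders of operation meet in the same place: attenuate(decompose(...)) for a full-authority holder, and the server-proxied decomposition offered to a read-only holder, both arrive at the same child Key2.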

We're almost there, but there is still one more step to consider. Getting back to our imp: where does our imp live? Or, for a file-system: where is the serialisation of this node actually stored, and how is it stored? Let's consider the case where the nodes are serialized to a file-system supplied by some cloud storage provider that we do not fully trust. We would want to encrypt the file. Given that the node may be accessed with either of its two strong names, we can't use the unattenuated-access capability here, but we can use the strong name that implies attenuated authority as an encryption key. Doing so, however, implies we should never disclose either of our strong names to the cloud storage provider. As you may expect by now, the one-way hash operation again comes to the rescue: we use a third one-way hash operation to calculate the relative path where the encrypted node serialisation is to be stored. If we want to cover the scenario where entities possess attenuated-access strong names and also possess write access to our cloud storage directories, then we could again add a secret salt to this third hash operation, but in most cases salting would not seem to be needed for the third hash operation.

[Figure: step 3, deriving the storage location from the attenuated strong name]
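
Continuing the sketch above, the third hash step could look as follows; the 'location' label is again my own placeholder:

#include <openssl/evp.h>
#include <openssl/hmac.h>

#define KEYLEN 32

/* Storage name (Key3): one more HMAC away from the attenuated strong
 * name, which itself doubles as the file encryption key.  Hex-encoding
 * the result gives a relative path that reveals nothing about either
 * of the node's strong names. */
static void storage_name(const unsigned char key2[KEYLEN],
                         unsigned char key3[KEYLEN]) {
    unsigned int len = KEYLEN;
    HMAC(EVP_sha256(), key2, KEYLEN,
         (const unsigned char *)"location", 8, key3, &len);
}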

I hope this follow-up blog post helps the wider comprehension of what I think is a very interesting algorithm for creating attenuatable, decomposable, unidirectional tree graphs using password capabilities (strong names).

Rumpelstiltskin and his children

When communicating about my efforts on the upcoming version of my least authority file-system collection, I often seem unable to fully communicate the base design of the core capability-based file-system, even to many cryptographically educated people. In this post I’ll try to elaborate on the algorithm that lies at the center of all my efforts: The Rumpelstiltskin hash-tree algorithm.

Rumpelstiltskin


In a fairy tale by the Grimm brothers that most of you will probably know by heart, a girl, as a result of her father's big mouth, found herself in a rather problematic situation. The king had locked her in a tower room filled with straw and had declared that she be decapitated unless she spun the straw into gold by morning. An imp showed up and offered to help her out in exchange for her necklace, and the girl gladly accepted the offer. The next day: a larger room with more straw, same story, now in exchange for her ring. The third and final day: an even larger room with even more straw, but as the girl had nothing left to give in exchange, the imp made the girl promise to give up her firstborn child in exchange for his services.

Later in the story the girl, now a queen married to the king, is visited again by the imp, who comes to claim his prize. After pleading with the imp, the queen gets him to agree to give up his claim if she guesses his true name within three days. Then, by chance, the queen's messenger overhears the imp chanting:

“Today I bake, tomorrow I brew, then the Queen’s child I shall stew. For nobody knows my little game, for Rumpelstiltskin is my name.”

Learning his name, the queen is able to wield authority over the evil imp, forcing him to give up his claim to her firstborn.

You could say that the unguessable name “Rumpelstiltskin” is a token of authority not unlike what in information security theory we refer to as a capability, or more precisely a password-capability.

Directional tree graph

The CapFs file-system I'm working on, like most file-systems, constitutes a tree. Unlike many file-systems though, the CapFs tree structure aims to be purely directional. That is, while in DOS, Unix, OSX or Linux we are used to navigating 'up' using a command like "cd ..", the CapFs directional tree adheres to the rule that:

“Authority to a node implies authority to its children but NOT to its parent.”

Names and true names

In our fairy tale, the imp's true name was Rumpelstiltskin (a capability). This was the name that implied authority over the imp. The imp might have had another, non-authoritative name used by others to address him, but if he had, it was not relevant to the story. Maybe Rumpelstiltskin himself did not have another non-authoritative name, but if he had any children, and those children had true names like his that could be used to wield authority over them, Rumpelstiltskin would probably often address his children by their non-authoritative names.

Now talking of authority: if Rumpelstiltskin wanted to delegate the authority over one of his children to another imp, what options would there be? Let's assume Rumpelstiltskin had a kid named Slartibartfast (a capability), for whom he used the casual (non-authoritative) name Bob. There would be two ways someone could come to wield authority over Bob: either by using his true name Slartibartfast, or by having authority over his parent Rumpelstiltskin, which by proxy would give authority over Bob. There are thus two authoritative ways to designate Bob:

  1. Slartibartfast
  2. Rumpelstiltskin::Bob

Back to our directional tree

[Figure: the directional tree graph]

So if we project Rumpelstiltskin and Bob onto our directional tree graph, we could say that Rumpelstiltskin might be our tree's root node. Rumpelstiltskin, being the root node, has no name other than his authority-carrying true name. Bob and any other descendant of Rumpelstiltskin would have two names:

  • A normal name that implies no direct authority (name)
  • An unguessable name that implies authority (capability)

Attenuation

Where using a name and a capability (true name) for each non-root node in our tree makes authority over branches and leaves decomposable, the authority over any sub-branch implied by a capability is still absolute. If we look back at our file-system example, one might want to delegate the authority to read without delegating the authority to write. We could provide non-root nodes with a second capability that implies attenuated authority over a node, for example read-only. So now we have:

  • A normal name that implies no direct authority (name)
  • An unguessable name that implies FULL authority (capability)
  • An unguessable name that implies attenuated authority (capability)

The algorithm

Given that we have only a single type of attenuation (for example read-only), there is an interesting algorithm we can use to securely calculate the unguessable names of a non-root node, given its weak name and the unguessable name of its parent.

[Figure: the triple-HMAC derivation diagram]

In the above diagram, Key1 would be the equivalent of 'Rumpelstiltskin' and Key2 the equivalent of a read-only capability to Rumpelstiltskin. The subnode name would for example be 'Bob', and Subnode Key1 would be the equivalent of 'Slartibartfast'.

Going from Rumpelstiltskin to a read-only attenuated capability to Rumpelstiltskin is straightforward: we take an HMAC hash of the unattenuated authority capability (Rumpelstiltskin) and some static string, and use a representation of the resulting Key2 as the attenuated authority capability. No secret salt is needed in this step, so a shared static string is used instead.

For going from a parent capability to a child capability, there are two scenarios:

  • From unattenuated parent capability to unattenuated child capability.
  • From attenuated parent capability to attenuated child capability.

To allow both scenarios, the calculation of the unattenuated child capability is done by taking an HMAC hash of the parent's attenuated authority capability and the child's name, together with a secret salt.

The secret salt makes sure that we can create a system where a client holding an attenuated authority capability can ask a server for an attenuated authority child capability, but can’t itself derive an unattenuated authority child capability from an attenuated authority parent capability.

Secure persistence

Looking at the diagram of the Rumpelstiltskin hash-tree algorithm, we see there is yet another HMAC operation, producing a Key3, that we have not discussed yet. The problem is that if we want to implement persistence on relatively public storage, for example with a cloud storage provider we don't fully trust to keep our big bag of secrets secret, we will want to:

  • encrypt the physical files;
  • store our files in a way that does not disclose capabilities.

To address the first issue, we let our attenuated authority capability double as the encryption key for our file. We then use a third HMAC hash operation to create one more name (Key3), which we use to determine the location where the encrypted file is stored.

High-security versus high-scalability

Regarding this final HMAC hashing step, there are two possible implementation modes for the Rumpelstiltskin hash-tree algorithm: high-security and high-scalability. In the high-scalability version, encryption and decryption can potentially be client-side operations, and the server takes care only of enforcing read-only attenuation. In this mode, the storage path (Key3) is calculated using an HMAC hash of the attenuated authority capability together with a static string. If, in contrast, we value security so much that we are willing to sacrifice the scalability that comes with client-side en/decryption, in order to prevent a cloud storage provider with access to an attenuated authority (read-only) capability from combining his explicit and implicit authority into an unattenuated authority capability, then high-security mode includes the server's secret salt in the HMAC calculation of Key3, mandating server-side encryption and decryption.
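
In code the difference between the two modes comes down to a single decision about the salt. A sketch, with the same caveat as before that the labels and names are illustrative, not the actual implementation:

#include <string.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

#define KEYLEN 32

/* Key3 derivation in both modes.  Passing salt_len == 0 gives the
 * high-scalability variant: any holder of Key2, including a client,
 * can compute the storage path and do its own en/decryption.  Passing
 * the server's secret salt gives the high-security variant, forcing
 * all en/decryption through the server. */
static void derive_key3(const unsigned char key2[KEYLEN],
                        const unsigned char *salt, size_t salt_len,
                        unsigned char key3[KEYLEN]) {
    unsigned char msg[256];
    unsigned int len = KEYLEN;

    memcpy(msg, "location", 8);
    if (salt_len > 0)
        memcpy(msg + 8, salt, salt_len);
    HMAC(EVP_sha256(), key2, KEYLEN, msg, 8 + salt_len, key3, &len);
}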

In my own project I shall be using the Rumpelstiltskin hash-tree algorithm in high-scalability mode for now. Maybe in the future switching to high-security mode will turn out to be desirable, but for now the hole plugged by giving up the potential for added scalability seems too small to justify the price of using high-security mode.

Conclusion

I hope reading this blog post somewhat clarifies the Rumpelstiltskin hash-tree algorithm, and I hope others may find the algorithm useful for their own projects as well. I'll be using this algorithm in MinorFs2, but I feel it might have much broader usability than just user-space file-systems. Please use this algorithm as you please, and comment on this post if you think this information is useful or still needs clarification somewhere.

Caps-Lock, security and PAM (Pluggable Authentication Module)

When logging into my desktop system (running Ubuntu), every once in a while an accidentally pressed Caps-Lock key keeps me from successfully logging in. While this is just a minor and rare annoyance, it stuck in the back of my mind and I started wondering if something could and should be done about it.
What if I made the log-in password caps-lock insensitive on my system? The first thing we need to understand is that caps-lock insensitive isn't the same as case insensitive. If my password is 'TomtiDom14', a caps-lock insensitive password check should match only 'TomtiDom14' and 'tOMTIdOM14'. So unlike case insensitivity, where there would be 256 different valid passwords, seriously degrading security by dropping a bit of entropy for each dual-case character in the password, caps-lock insensitivity takes away only a single bit of entropy for the whole password. While losing a single bit of entropy in theory cuts the time or resources needed to brute-force your password in half, this is not really that relevant if we take the approach that caps-lock insensitivity is an authentication-system issue. We are thus not talking about making the password hash caps-lock insensitive, which would indeed be a bad idea that cuts brute-force time in half if someone were to get his or her hands on the hash of your password. We are talking about an authentication system that, on failure, simply tries the case-inverted version of the presented password. Authentication systems have excellent measures for rate-limiting or even blocking brute-force attacks, so a single bit of entropy should not be much of a problem for this approach.

Looking at the usability increase, however, consider what happens when someone accidentally has caps-lock pressed and tries to enter a password. Chances are that he or she won't think of checking the caps-lock LED. No, the first thought will be of having made a typo, so the password will be typed in caps-inverted one or more consecutive times. Then he or she will start to wonder about having mixed it up with another password, and will try that password, maybe two or three times, then perhaps yet another one. In the end, the user may have entered enough false passwords to trigger a complete lock-out. So while making our authentication caps-lock insensitive slightly decreases the security of the password, and while accidentally having caps-lock pressed tends to be rare, the usability consequences of not having a caps-lock insensitive authentication system end up being rather big.

So now that we have established that a caps-lock insensitive authentication system for our passwords is quite a desirable goal, we need to look at how to build such a system.

On (Ubuntu) Linux, the authentication system is a modular system using so-called Pluggable Authentication Modules (PAM). There are many authentication modules available, and looking at the source code of the PAM system, there appeared to be no central place to solve the problem elegantly and simply. I thus chose to fix the problem for myself by looking just at stand-alone systems using the basic pam_unix module. Fixing the problem for other modules can probably be done in a similar way. Although solving the problem in individual modules may not be the most elegant solution, it does show that it's almost trivial to do so in a somewhat less elegant way.

Before we can change the way the pam_unix PAM module validates passwords, we need to get our hands on the sources of the PAM system. On my Ubuntu system I had to use the following commands to fetch and configure the source:

bzr branch https://code.launchpad.net/~ubuntu-core-dev/pam/ubuntu

cd ubuntu

./configure

I had to add a -lfl flag to the LIBS definition in a few Makefiles to make the code-base build on my system. After that I could build the module:

cd modules/pam_unix

make clean

Now for the almost trivial patch to pam_unix. The file 'support.c' defines a function named '_unix_verify_password' that seems like the perfect place for it.

A few pages down in this function we see an invocation of the function 'verify_pwd_hash'. This function validates the password as it was originally typed against the password hash stored on the system. By adding a little piece of code after this invocation, we can add a second invocation of the same function with a caps-inverted version of the password. The additional code looks as follows:


/* If the first check fails, let's try again with inverted case. */
if (retval != PAM_SUCCESS) {
    /* Allocate a zeroed buffer for the case-inverted password. */
    char *capsinvertedpass = (char *) calloc(strlen(p) + 1, sizeof(char));
    if (capsinvertedpass == NULL) {
        pam_syslog(pamh, LOG_CRIT, "no memory for caps inverted password.");
    } else {
        size_t ip_index;
        size_t ip_len = strlen(p);
        /* Case-invert every character. */
        for (ip_index = 0; ip_index < ip_len; ip_index++) {
            char ip_c = p[ip_index];
            char ip_c2 = ip_c;
            if ((!(ip_c < 'A')) && (!(ip_c > 'Z'))) {
                ip_c2 = ip_c - 'A' + 'a'; /* uppercase to lowercase */
            } else if ((!(ip_c < 'a')) && (!(ip_c > 'z'))) {
                ip_c2 = ip_c - 'a' + 'A'; /* lowercase to uppercase */
            }
            /* Put the inverted character in the new password string. */
            capsinvertedpass[ip_index] = ip_c2;
        }
        /* Try once more, now with the caps-inverted password. */
        retval = verify_pwd_hash(capsinvertedpass, salt, off(UNIX__NONULL, ctrl));
        free(capsinvertedpass);
    }
}

Now we build the code again and move the updated module into place:

make

sudo mv /lib/x86_64-linux-gnu/security/pam_unix.so /lib/x86_64-linux-gnu/security/pam_unix.so.original

sudo mv .libs/pam_unix.so /lib/x86_64-linux-gnu/security/pam_unix.so

sudo chown root:root /lib/x86_64-linux-gnu/security/pam_unix.so

sudo chmod 644  /lib/x86_64-linux-gnu/security/pam_unix.so

Now, just to make sure everything still works, we test a login:

sudo login

Everything worked fine, and it's relatively simple to do. We can now log in with or without caps-lock active. It's quite a simple patch that should be almost as simple to apply to other PAM modules, and I hope the above description helps others achieve the same. I feel that although an accidental caps-lock is rare, it's sufficiently annoying when it happens to warrant a patch like the one described above. There is a minor security implication, but IMO the usability benefits outweigh the effective loss of a single bit of entropy.