
Why HTTPS everywhere is a horrible idea (for now).


While privacy is a valuable thing, and while encryption in general helps improve privacy and can be used to help improve security, in this blog post I will discuss how more encryption can actually harm you when it is used on top of a fundamentally flawed public key infrastructure. Before we go on and discuss what the problems are, a bit of background.

When confidential communication is needed between, for example, your web browser and your bank, how do your browser and the bank's web server achieve this? The following steps will take place (a rough sketch of these steps in code follows the list):

  • Your browser will ask the internet’s Domain Name System (DNS) for the IP address of ‘www.yourbank.com’. The Domain Name System will resolve the name and come back with an IP address.
  • Your browser will initiate a TCP connection to the IP address it got back from the Domain Name System.
  • Once the TCP connection is established, the client and server will initiate the ‘handshake’ phase of the Transport Layer Security protocol.
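
For the curious, here is a minimal sketch of these three steps using nothing but Python's standard library. The host name 'www.yourbank.com' is of course just a placeholder for this illustration:

```python
# Minimal sketch of the three steps above, using Python's standard library.
# 'www.yourbank.com' is a placeholder host, not a real endpoint.
import socket
import ssl

host = "www.yourbank.com"

# Step 1: ask DNS for the IP address of the host.
ip = socket.gethostbyname(host)

# Step 2: open a TCP connection to that IP on port 443.
tcp = socket.create_connection((ip, 443))

# Step 3: run the TLS handshake on top of the TCP connection.
# The default context checks the server certificate against the system's
# list of trusted CAs -- the very mechanism discussed below.
context = ssl.create_default_context()
tls = context.wrap_socket(tcp, server_hostname=host)

print("Negotiated:", tls.version(), tls.cipher())
tls.close()
```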

After the handshake phase, everything should be fine and dandy, but is it? What would an adversary need to do to defeat the confidentiality of your connection? Given that even without encryption the adversary would need access to the transmitted data to read it, we shall assume the adversary is sniffing all network traffic. Now the first thing an adversary needs to do to defeat the above setup is to fool your browser into thinking it is talking to your bank. It can do this quite easily, given that the Domain Name System (in its most basic form) runs on top of the User Datagram Protocol (UDP), a trivial connection-less protocol that can effortlessly be spoofed to make your browser believe your bank’s server is running on the attacker’s IP. So now, after the TCP connection has been established to what your browser believes is your bank, the TLS handshake begins. Our attacker could try to impersonate our bank, or he could, and this is the scenario we shall look at, attempt to take on the role of ‘man in the middle’. That is, next to making your browser think it is your bank, it will actually connect to your bank and relay content between your browser and your bank, either just so it can snoop on your traffic, or until it is ready to strike by changing transaction content. But let’s not get ahead of ourselves. Our client has connected to our attacker, and our attacker has made a connection to our bank, so the attacker’s machine can act as man in the middle. What attack vectors can it use?

  • It can launch a downgrade attack by offering the real server only an inferior subset of the ciphers offered by the real client. This way the cipher suite used in the connection can be made to be the weakest common denominator between the real client and the real server, weakening the strength of the encryption or forcing the use of a cipher suite that can later be broken by other man-in-the-middle tricks.
  • It can provide the client with a rogue certificate for ‘www.yourbank.com’. This will be harder to pull off, but doing so would leave the attacker with a fully decrypted stream of traffic.

The last scenario is often described as being relatively unlikely. Let me try to elaborate why it is not. The certificate offered by the attacker has some security in it: your browser won’t accept it unless it is signed by a ‘trusted’ Certificate Authority (CA). Your browser will trust only about 50 or so CAs, so that sounds kind of OK, doesn’t it? Well, there is another catch with the CA-based public key infrastructure: not only will your browser trust ANY of these 50 CAs to sign ANY domain, it will also trust many sub-CAs to do so. In total there are thought to be over 600 certificate authorities in over 50 countries that might be used to sign a rogue certificate for your domain. The problem with trusting ANY CA to sign ANY domain arises from the basic rules of probability for such cases. What these rules boil down to is the following horrific fact:

The probability that none of 600 equally trustworthy CAs can somehow be persuaded, tricked or compromised by our attacker into signing a rogue certificate is equal to the probability that a single one of these CAs resists such an attempt, raised to the power of 600.

So if I can put 100% trust in each of these 600 CAs, the cumulative trust would be 100%, fine. 99.99%? We are at 94%, which is still pretty decent. 99.9%? Now things start to crumble, as we are down to only 55% of cumulative trust. 99%? All cumulative trust has basically evaporated, as we are down to 0.24%.
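
These numbers are easy to verify; here is a quick check in Python of the cumulative-trust figures above:

```python
# Quick check of the cumulative trust numbers above: if each of the 600 CAs
# independently withstands our attacker with probability p, the chance that
# none of them can be abused to sign a rogue certificate is p ** 600.
for p in (1.0, 0.9999, 0.999, 0.99):
    print(f"per-CA trust {p:.2%} -> cumulative trust {p ** 600:.2%}")

# per-CA trust 100.00% -> cumulative trust 100.00%
# per-CA trust 99.99% -> cumulative trust 94.18%
# per-CA trust 99.90% -> cumulative trust 54.86%
# per-CA trust 99.00% -> cumulative trust 0.24%
```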

While these numbers are pretty bad, we must address the subject of attacker resources. If our attacker is our neighbour’s 14-year-old kid trying to mess with us on his computer, then 99.99% might well be a realistic number. A group of skilled and determined cyber criminals? 99.9% might well be a realistic number, and thus a real concern when communicating with our bank, at least from a technical perspective. There may be economic aspects that make this type of attack less likely for committing banking fraud. Now how about industrial espionage? Nation states? 99% sounds like a rather low estimate for those types of adversaries. Remember, 99% had us down to 0.24% as cumulative trust factor, and that is assuming equal trustworthiness for all CAs.

So basically, with the current CA infrastructure we can safely say that HTTPS protects us from script kiddies and (some) non-targeted cybercrime attacks. It might even protect us to a certain level from mass surveillance. But that’s it. It does not protect us from targeted attacks by organized crime. It does not protect high-stakes intellectual property from industrial espionage, nor does it by itself protect the privacy of political dissidents and the like.

So you might say: some protection is better than none, right? Well, not exactly. There is one thing that HTTPS and SSL in general are perfectly good at protecting traffic from: “YOU!”


Remember Trojan horses? Programs that act like one thing but actually are another. Or how about malicious content on compromised websites that exploits vulnerabilities in your browser or your browser’s Flash plugin? Nasty stuff running on your machine with access to all of your sensitive data. If the people behind it want to get your data out of your computer and into theirs, then HTTPS is a good way to do it. Now compare the situation of using HTTPS for all your web traffic to using HTTPS only for connecting to sites you a) visit regularly and b) actually need protecting. In the latter situation, unexpected malicious encrypted traffic will stand out. It’s not my bank, it’s not my e-mail, I’m not ordering anything, so why am I seeing encrypted traffic? When using HTTPS for every site that offers it, though, we are creating a situation where Trojans and other malware can remain under the radar.

But back to the issue with the CA-based infrastructure. There is another issue: the issue of patterns of trust. When you hire a contractor to work on your shed, there is a common pattern of introduction that is the most important factor in the trust involved in getting your shed fixed. The contractor will introduce you to the carpenter, and after that there will be a partial trust relationship between you and the carpenter that is a sibling of the trust relationship you have with the contractor. In modern web-based technologies similar relationships are not uncommon, but the CA-based architecture is currently the only mechanism available, and it is a mechanism that doesn’t allow for the concept of secure introduction. While domain-name-based trust might be suitable for establishing trust with our contractor, introduction-based trust is a form of trust establishment that is completely immune to the kind of problems domain-name-based trust initiation suffers from. In order to establish the introduction-based trust, the server equivalent of the contractor could simply send us a direct unforgeable reference to the server equivalent of the carpenter. It is as if the contractor had issued a mobile phone to the carpenter for communication with clients and then gave the phone number to the client. Not a normal phone number, but some 60-digit-long phone number that nobody could dial unless they had been handed it first. The client knows that the person answering the phone should be the carpenter. The carpenter knows that the person calling him on that number should be a client. No CAs or certificates needed, period. Unfortunately, the simple technology needed for the use of these webkeys currently isn’t implemented in our browsers. With a webkey, no certificate or CA is needed, as the means for validating the public key of the server are hard-coded into the URL itself.
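
To make the webkey idea a bit more concrete, here is a rough sketch in Python. The httpsy://&lt;fingerprint&gt;@host/ URL format and the check_webkey helper are made up for this illustration (and here the hash covers the whole certificate rather than just the public key, to keep the code short); actual webkey/YURL proposals differ in their details, but the essence is the same: the URL itself pins the server's key material, so no CA needs to be consulted.

```python
# Rough illustration of the webkey idea: the URL itself carries a
# fingerprint of the server certificate, so no CA is consulted at all.
# The 'httpsy://<sha256-fingerprint>@host/' format is made up for this
# example; real webkey/YURL proposals differ in detail.
import hashlib
import ssl
from urllib.parse import urlsplit


def check_webkey(url: str) -> bool:
    parts = urlsplit(url)
    expected_fp = parts.username          # fingerprint embedded in the URL
    host = parts.hostname
    # Fetch whatever certificate the server presents. No CA validation is
    # done here; the fingerprint comparison below *is* the validation.
    pem = ssl.get_server_certificate((host, parts.port or 443))
    der = ssl.PEM_cert_to_DER_cert(pem)
    actual_fp = hashlib.sha256(der).hexdigest()
    return actual_fp == expected_fp


# Hypothetical usage: the 'contractor' site hands out this URL, and the
# client will only trust the 'carpenter' server whose certificate matches.
# check_webkey("httpsy://3f79bb7b435b05321651daefd374cd@carpenter.example/")
```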

So on one side we have an outdated DNS system, a set of outdated legacy cipher suites and a dangerously untrustworthy CA infrastructure that undermine the positive side of using HTTPS, and on the other side we have untrustworthy programs with way too many privileges running on our machines, exposing us to the negative sides of HTTPS. The illusion of security offered by a mediocre security solution like this can be much worse than using no security at all, and the Trojan and malware aspects make it worse still. Basically there are quite a lot of things that need fixing before using HTTPS everywhere stops being detrimental to security. The following list isn’t complete, but gives an idea of the kind of things we need before HTTPS everywhere becomes a decent concept:

  1. Least authority everywhere. First and foremost, trojans and exploited clients should not by default be ‘trusted’ applications with access to our sensitive data.
  2. The current CA based PKI infrastructure with >600 ‘trusted’ CA’s in over 50 countries must be replaced urgently:
    1. NameCoin everywhere. Use of technology like that demonstrated by NameCoin is a possible path.
    2. DANE everywhere. The extensive use of DNSSEC with DANE could offer an alternative to CA’s that is a significant improvement over the CA based infrastructure.
  3. TLS 1.2 everywhere. Older versions of TLS and many of the ciphers defined in those standards should be deprecated.
  4. DNSSEC everywhere.
  5. WebKeys everywhere. We really need webkey support in the browser.

I realize the above rant may be a tad controversial and not in line with the popular view that encryption is good medicine. I do however find it important to illuminate the dark side of HTTPS everywhere and to show that it’s just one piece from the center of the puzzle, while it would be much better to start solving the puzzle with the edge pieces instead.


The relevance of Pacman and the van de Graaff generator to peer to peer networking in IPv4 networks.

In the late 1990s I was working on an OSI-like layering model for peer to peer networks. In the early 2000s the Code Red worm hit the internet, and a person I hold in high regard, who was aware of some of my work, appealed to my sense of responsibility regarding the possible use of my algorithms in internet worms. After thorough consideration I decided to stop my efforts on a multi-layered pure-P2P stack, and to remove all information on the lowest-layer algorithm that I had come up with.

About half a decade later, when P2P and malware were starting to rather crudely come together, I ended up discussing my algorithm with a security specialist at a conference, and he advised me to write a limited public paper on a hypothetical worm that would use this technology. Worms in those days were rather crude and, let’s call it, ‘loud’, while my hypothetical Shai Hulud worm would be rather stealthy. I did some simulations, and ended up sending a simple high-level textual description to some of my CERT contacts, so they would at least know what to look for. APT wasn’t a thing yet in those days, so looking for low-footprint patterns that might hide a stealthy worm really was not a priority for anyone.

Now another half decade has passed, and cryptocurrency, Bitcoin, etc. have advanced P2P trust way beyond what I envisioned in the late 1990s. Next to that, the infosec community has evolved quite a bit, and a worm, even a stealthy one, should be within the scope of modern APT-focused monitoring. Further, I still believe the algorithms may indeed prove useful for the benign purposes that I initially envisioned them for: layered trusted pure P2P.

[Image: Van de Graaff generator]

I won’t give the exact details of the algorithm. Not only do I not want to make things too easy for malware builders, but even if I wanted to, quite a bit of life happened between Code Red and now, and my original files were lost along the way, so I have to do things from memory (if anyone still has a copy of my original Shai Hulud paper, please drop me a message). As it is probably better to be vague than to be wrong, I won’t give details I’m not completely sure about anymore.

So let’s talk about the two concepts that inspired the algorithm: the van de Graaff generator and the Pacman video game. Many of you will know the van de Graaff generator (picture above) as something purely fun and totally unrelated to IT. The metal sphere used in the generator, though, has a very interesting property. The electrons on a charged sphere spread themselves perfectly evenly over the surface of the sphere, and the mathematics needed to calculate how this happens is basic high-school math. The first version of my algorithm was based on mapping the IPv4 address space onto a spherical coordinate system, and while the math was manageable, I ended up with the reverse of the problem that projecting a round globe onto a flat map gives: too many virtual IP addresses ended up at the poles.

[Image: Pacman arcade game]

Then something hit me: in the old Pacman game, if you went off the flat playing field at the left edge, you came back out on the right, and vice versa. If you take the IPv4 address space and use 16 bits for the X axis and 16 bits for the Y axis, you can create what is basically a Pacman-style tile. If you then place the same tile around a central tile on all sides, you end up with a 3×3 square of copies of your Pacman tile. Now we get back to our electrons: it turns out that if we take a single electron on our central tile and use the position of that electron as the center of a new virtual tile, we can use that virtual tile as a window for the electrodynamics relevant to that individual electron. Further, it turns out that if we disregard all but the closest N electrons, the dynamics of the whole Pacman sphere remain virtually the same.
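
For those who want to play with the idea: what the Pacman wraparound produces is, mathematically, a torus. Here is a small sketch, assuming the straightforward split of an IPv4 address into a 16-bit X and a 16-bit Y coordinate:

```python
# Sketch of the Pacman tile idea: split an IPv4 address into a 16-bit X
# and a 16-bit Y coordinate and measure distance with wraparound on both
# axes, so there are no 'poles' like on the sphere.
import ipaddress
import math

SIZE = 1 << 16  # 65536 positions per axis


def ip_to_xy(ip: str) -> tuple[int, int]:
    n = int(ipaddress.IPv4Address(ip))
    return (n >> 16) & 0xFFFF, n & 0xFFFF


def torus_distance(a: tuple[int, int], b: tuple[int, int]) -> float:
    # On each axis the shortest path may wrap around the tile edge,
    # exactly like Pacman leaving the screen on the left and
    # reappearing on the right.
    dx = min(abs(a[0] - b[0]), SIZE - abs(a[0] - b[0]))
    dy = min(abs(a[1] - b[1]), SIZE - abs(a[1] - b[1]))
    return math.hypot(dx, dy)


print(torus_distance(ip_to_xy("192.0.2.1"), ip_to_xy("198.51.100.7")))
```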

Now this was basically the core concept of part one of the algorithm. If we say that every node in a peer to peer network has two positions:

  • A static position determined by its IP
  • A dynamic position determined by the electron-style repulsive interaction with its peers.

A node would start off with its dynamic position set to its static position, and would start looking for peers by scanning random positions. If another node was discovered, information on the dynamic positions of the node itself and of its closest neighbours would be exchanged. Each node would thus keep in contact only with its N closest peers in terms of virtual position. This interaction would make the ‘connected’ nodes’ virtual locations spread relatively evenly over the address space.
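
Purely as an illustration (and emphatically not the original algorithm, which I am reconstructing from memory anyway), a repulsion step of this kind could look roughly like the sketch below. It reuses SIZE and torus_distance from the previous snippet; N and the step size are arbitrary choices for the example.

```python
# Very loose sketch of how a node could let its dynamic position be pushed
# away from its N closest known peers on the Pacman torus. Reuses SIZE and
# torus_distance from the sketch above; N and 'step' are arbitrary here.

N = 8  # number of closest peers a node keeps in contact with


def relax(my_pos, peer_positions):
    """One relaxation step: move away from the N nearest peers."""
    nearest = sorted(peer_positions, key=lambda p: torus_distance(my_pos, p))[:N]
    fx = fy = 0.0
    for px, py in nearest:
        # Shortest wrapped offset on each axis (may cross the tile edge).
        dx = (my_pos[0] - px + SIZE // 2) % SIZE - SIZE // 2
        dy = (my_pos[1] - py + SIZE // 2) % SIZE - SIZE // 2
        d = max(torus_distance(my_pos, (px, py)), 1.0)
        # Inverse-square style repulsion, like charges on a surface.
        fx += dx / d ** 3
        fy += dy / d ** 3
    step = 1000.0
    return (int(my_pos[0] + step * fx) % SIZE,
            int(my_pos[1] + step * fy) % SIZE)
```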


Once stabilized, there would be a number of triangles between our node and its closest peers equal to the number N. Each individual triangle could be divided into 6 smaller triangles, which could then be divided between the nodes.


Now comes the stealthy part of the algorithm that had me and others so worried that I felt it important to keep this simple algorithm hidden for well over a decade: given that there is hardly any overlap between the triangles, each node would have exactly 2N triangles worth of address space to scan for peers, and any of these triangles would be scanned by only one connected peer.

The math for this algorithm is even significantly simpler than the simple high-school math needed for the three-dimensional sphere. I hope this simple algorithm can prove useful for P2P design, and I trust that in 2014 the infosec community has grown sufficiently to deal with the stealthiness that this algorithm would imply if it were to be used in malware. Further, I believe that advances in distributed trust have finally made use of this algorithm in solid P2P architectures a serious option. I hope my reasons for choosing this moment to finally publish this potentially powerful P2P algorithm are on the mark, and that I am not publishing it before the infosec community is ready, or after the P2P development community needed it.

Dividing by uncertainty

I have a great deal of respect for the work done by ISECOM on their OSSTMM. It is overall a great and accessibly written document on doing security audits in a thorough and methodical way. It is a document, however, that suffers from the same, let’s call it deterministic optimism, that is prevalent in the information security industry. That is, an optimism that is the result of a failure to come to grips with the nature of uncertainty. While in this post I talk about the OSSTMM and how its failure to deal with uncertainty makes it overly optimistic about the True Protection that it helps to calculate, the OSSTMM is probably the best thing there is in this infosec sub-field, so if even the OSSTMM doesn’t get this right, the whole sub-field may be in for a black-swan event that will prove the point I am trying to make here.

So what is this uncertainty I talk about? People tend to prefer hard numbers to fuzzy concepts like stochastics, but in many cases, certainly in information security, hard numbers are mostly impossible to get at, even when using a methodological approach like the one the OSSTMM provides. This means we can do one of two things:

  1. Ignore the uncertainty of our numbers.
  2. Work the uncertainty into our model.

A problem is that without an understanding of uncertainty, it is hard to know when it is safe to opt for the first option and when it is not. If a variable, for example the OSSTMM’s OpSec(sum) variable, has a level of uncertainty, a better representation than a single number would be a probability density function. A simplified variant of the probability density function is a simple histogram like the one used in the example below.

So instead of a single hard number (10, say) we have a histogram of possible values and their probabilities. So when does working with such histograms yield significantly different results than working with the hard numbers? Addition yields the same results, and multiplication yields results that are generally close enough, but there is one operation where the histogram will give you a result that can differ wildly, depending on your level of uncertainty and the proximity of your possible values to the dreaded zero value, and that operation is division.

In the OSSTMM, for example, the True Protection level is calculated by subtracting a security limit SecLimit from a base value. This means that if we underestimate SecLimit, we will end up being too optimistic about the true protection level. And how is SecLimit calculated? Exactly: by dividing by an uncertain value. Worse, by dividing by an uncertain value that was itself calculated by dividing by an uncertain value.

To understand why this dividing by uncertainty can yield such different results when the uncertainty is not taken into account, we can devise an artificial histogram to show how this happens. Let’s say we have an uncertain number X with an expected value of 3. Now let’s say the probability density histogram looks as follows:

  • 10% probability of being 1
  • 20% probability of being 2
  • 40% probability of being 3
  • 20% probability of being 4
  • 10% probability of being 5

Now let’s take the formula Y = 9 / X². When working with the expected value of 3, the result would be 9/(3×3) = 9/9 = 1. Let’s look at what happens if we apply this same formula to our histogram:

  • 10% probability of being 9
  • 20% probability of being 2.25
  • 40% probability of being 1
  • 20% probability of being 0.5625
  • 10% probability of being 0.36

Looking at the expected value of the result from the histogram (0.1×9 + 0.2×2.25 + 0.4×1 + 0.2×0.5625 + 0.1×0.36 ≈ 1.90), we see that this value ends up being almost twice the value we got when not taking the uncertainty into account. The lower input values, with their low probabilities, become the dominant factors in the result.
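
The whole example fits in a few lines of Python, for anyone who wants to check the arithmetic:

```python
# Reproducing the example: the formula Y = 9 / X**2 applied to the expected
# value of X versus propagated through the whole histogram.
histogram = {1: 0.10, 2: 0.20, 3: 0.40, 4: 0.20, 5: 0.10}

expected_x = sum(x * p for x, p in histogram.items())           # 3.0
naive_y = 9 / expected_x ** 2                                   # 1.0

expected_y = sum((9 / x ** 2) * p for x, p in histogram.items())  # ~1.90

print(naive_y, expected_y)  # 1.0 versus roughly 1.9
```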

Note that I am in no way suggesting that the numbers will be anywhere near as bad for the OSSTMM SecLimit variable; that depends greatly on the level of uncertainty of our input variables and their proximity to zero. But the above example does illustrate that not taking uncertainty into account when doing divisions can have big consequences. In the case of the OSSTMM, these consequences could make the calculated true protection level overly optimistic, which could in some cases lead to not implementing the level of security controls that this uncertainty would warrant. This example teaches us a very important and simple lesson about uncertainty: when dividing by an uncertain number, unless the uncertainty is small, be sure to include the uncertainty in your model, or be prepared to get a result that is dangerously incorrect.