Tag Archives: stochastics

Dividing by uncertainty

I have a great deal of respect for the work done by ISECOM in their OSSTMM. Its overall a great and accessibly written document on doing security audits in a thorough and methodically. Its a document however, that suffers from the same, lets call it deterministic optimism that is paramount in the information security industry.  That is, an optimism that is a result of a failure to come to grasp with the nature of uncertainty.  While in this post I talk about the OSSTMM and how its failure to deal with uncertainty makes it overly optimistic about the True Protection that it helps to calculate, the OSSTMM is actually probably the best thing there is in this infosec sub field, so if even the OSSTMM doesn’t get this right, the whole subfield may be in for a black-swan event that will proof the point I am trying to make here.

So what is this uncertainty I talk about?  People tend to prefer hard numbers to fuzzy concepts like stochastics , but in many cases, certainly in information security, hard numbers are mostly impossible to get at, even when using a methodological approach like OSSTMM provides.  This means we can do one of two things:

  1. Ignore the uncertainty of our numbers.
  2. Work the uncertainty into our model.

A problem is that without an understanding of uncertainty, it is hard to know when its safe to opt for the first option and when it is not. If a variable, for example OSSTMM its Opsec(sum) variable has a level of uncertainty, a better representation than a single number could be a probability density function. A simplified variant of the probability density function is a simple histogram like the one below.

So instead of the hard number (10) we have a histogram of numbers and their probability.  So when does working with such histograms yield  significant different results than working with the hard numbers? Addition yields the same results, multiplication yields results that are generally close enough, but there is one operation where the histogram will give you a result that can, depending on your level of uncertainty and the proximity of your possible values to the dreaded zero value, and that is division.

In OSSTMM for example, the True Protection level is calculated by subtracting a security limit SecLimit from a base value. This means that if we  underestimate SecLimit we will end up being to optimistic about the true protection level.  And how is SecLimit calculated? Exactly, by dividing by an uncertain value. Worse, by dividing by an uncertain value that was itself calculated by dividing by an uncertain value.

To understand why this dividing by uncertainty can yield such different results when forgetting to take the uncertainty into account, we can device an artificial histogram to show how this happens. Lets say we have an uncertain number X with an expected value of 3. Now lets say the probability density histogram looks as follows:

  • 10% probability of being 1
  • 20% probability of being 2
  • 40% probability of being 3
  • 20% probability of being 4
  • 10% probability of being 5

Now lets take the formula Y = 9 / (X^2). When working with the expected value of 3, the result would be 9/(3*3) = 9/9=1. Lets look what happens if we apply this same formula to our histogram:

  • 10% probability of being 9
  • 20% probability of being 2.25
  • 40% probability of being 1
  • 20% probability of being 0.5625
  • 10% probability of being 0.36

Looking at the expected value of the result from the histogram, we see that this value ends up being almost twice the value we got when not taking the uncertainty into account. The lower numbers with low probabilities become the dominant factors in the result.

Note that I am in no way suggesting that these numbers will be anywhere as bad for the OSSTMM SecLimit variable, this depends greatly on the level of uncertainty of our input variables and its proximity to zero, but the above example does illustrate that not taking uncertainty into account when doing divisions can have big consequences.  In the case of OSSTMM, these consequences could make the calculated true protection level overly optimistic, what could in some cases lead to not implementing the level of security controls that this uncertainty would warrant.  This example shows us a very important and simple lesson about uncertainty. When dividing by an uncertain number, unless the uncertainty is small, be sure to include the uncertainty into your model or be prepared to get a result that is dangerously incorrect.

Advertisements