Archived View for nox.im › posts › 2021 › 1230 › the-probability-of-six-sigma-events captured on 2024-03-21 at 15:10:54. Gemini links have been rewritten to link to archived content
I was recently in a conversation where someone described the likelihood of an occurrence as a "ten sigma event". My question was: what is the assumed distribution of the data? If normal, this would imply an expected frequency of once in several lifetimes of the universe. As it turns out, many people who use these terms lack an understanding of what they really mean. That justifies a brief post on the ludic fallacy.
Black Swan[1]
The term was coined from assumptions of the normal distribution to describe extreme movements in market prices when modeling exposure to losses. In the 2008 financial crisis, Goldman and Citi alike called the circumstances "unpredictable" and "unforeseen events". The fallacy of using the normal distribution is that it makes these types of events ___extremely___ rare. Under it, a six sigma event has a 0.000000197% probability of occurrence. In other words, with one observation per day, once every 1.38 million years. The catch is that a six-sigma event is that rare only if your probability distribution is normal.
Let's compute the probability and expected frequency of N sigma events. Let $\mu$ be the average (expected value) of a random variable $X$ with density $f(x)$:
$ \mu \equiv \operatorname{E}[X] = \int_{-\infty}^{+\infty} x f(x) \, dx $
The standard deviation $\sigma$ of $X$ is defined as the square root of the variance of $X$. For a data set this means: subtract the average from every data point and square the difference, average those squared differences over all data points, and take the square root of the result.
$ \sigma \equiv \sqrt{\operatorname E\left[(X - \mu)^2\right]} = \sqrt{ \int_{-\infty}^{+\infty} (x-\mu)^2 f(x) \, dx }, $
The probability density function of the Gaussian distribution is
$ f(x) = \frac{1}{\sigma \sqrt{2\pi} } e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} $
For normally distributed data, the values within one standard deviation of the mean account for about 68% of the set, within two standard deviations account for about 95%.
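The 68% and 95% figures can be checked empirically. The following is a minimal sketch, assuming a hypothetical sample of 100,000 draws from a standard Gaussian; the sample size, seed and the helper name `fraction_within` are illustrative choices, not part of the original calculation.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical sample: 100,000 draws from a Gaussian with mu = 0, sigma = 1.
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]

mu = statistics.fmean(samples)          # sample average
sigma = statistics.pstdev(samples, mu)  # population standard deviation


def fraction_within(k: float) -> float:
    """Share of samples that lie within k standard deviations of the mean."""
    inside = sum(1 for x in samples if abs(x - mu) <= k * sigma)
    return inside / len(samples)


print(f"within 1 sigma: {fraction_within(1):.2%}")
print(f"within 2 sigma: {fraction_within(2):.2%}")
```

The printed shares should land close to 68% and 95%, with small deviations due to sampling noise.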
The following table shows the probability and expected frequency of an observation falling outside $\mu \pm x\sigma$, assuming a Gaussian distribution (more on that shortly) and one observation per day. One can see that the probabilities fall extremely rapidly. Note that in most online material the probabilities are $\frac{1}{2}$ of the ones listed here and the frequencies are doubled, because in financial contexts one is often concerned with only one side of the distribution, the downside risk.
$\sigma$ | Probability $p$ | Expected frequency $f$
1 | 31.73% | every 3 days
2 | 4.55% | every 21 days
3 | 0.27% | every 370 days
4 | 0.0063% | every 15787 days
5 | 0.000057% | every 1 million days
6 | 0.000000197% | every 506 million days
7 | 0.00000000025% | every 390 billion days
8 | 0.00000000000012% | every 803 trillion days
9 | 0.000000000000000022% | ...
10 | 0.0000000000000000000015% | ...
We can see that under the Gaussian distribution, a $\mu \pm 8\sigma$ event has an expected frequency of about 803 trillion days, or roughly 2.2 trillion years. That is more than a hundred times the current age of the universe.
Wolfram Alpha was helpful in getting hold of this many decimal places during the calculation. The probabilities $p$ were calculated given
$ p = 1 - \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right) $
and the frequencies $f$ given
$ f = \tfrac{1}{1-\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)} $
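The Gaussian error function $\operatorname{erf}$ is closely related to the standard normal cumulative distribution function $\Phi$ via $\Phi(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$. It is a non-elementary sigmoid function defined by the integral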
$ \operatorname{erf} z = \frac{2}{\sqrt\pi}\int_0^z e^{-t^2}\,dt $
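The two formulas for $p$ and $f$ can be reproduced without Wolfram Alpha. This is a minimal sketch using Python's standard library; the function names are illustrative, and the conversion to years assumes one observation per day and 365.25 days per year. It uses `math.erfc`, the complementary error function $1 - \operatorname{erf}$, because computing `1 - math.erf(...)` directly loses all precision in double-precision floats beyond roughly eight sigma.

```python
import math

DAYS_PER_YEAR = 365.25  # assumption: one observation per calendar day


def two_sided_tail(x: float) -> float:
    """P(|X - mu| > x * sigma) for a Gaussian, i.e. 1 - erf(x / sqrt(2))."""
    # erfc(z) equals 1 - erf(z) but avoids subtracting two nearly equal numbers.
    return math.erfc(x / math.sqrt(2))


def expected_frequency_days(x: float) -> float:
    """Expected number of daily observations between two such events, 1 / p."""
    return 1.0 / two_sided_tail(x)


for x in range(1, 11):
    p = two_sided_tail(x)
    days = expected_frequency_days(x)
    years = days / DAYS_PER_YEAR
    print(f"{x:>2} sigma  p = {p:.3e}  every {days:.3e} days  (~{years:.3e} years)")
```

The output is in scientific notation rather than percentages, so a probability printed as `1.973e-09` corresponds to 0.000000197%.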
Many financial models, such as Modern Portfolio Theory and the Efficient Market Hypothesis, rely (or at least used to rely) heavily upon the assumption that market returns follow a normal distribution. That assumption is difficult to test, as tail events happen infrequently and with varying impact.
The standard deviation calculation actually makes no assumptions about the distribution, i.e. about how the data points are arranged around the average. It does not require the distribution to be normal (Gaussian). The tail probabilities in the table above do. If the data is not normally distributed, tail risk estimates derived from them are wrong.
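To get a feeling for how much the Gaussian assumption contributes, compare it with a bound that holds for any distribution with finite variance. Chebyshev's inequality, which is not part of the original calculation but is a standard result, states for $k > 0$:

$ P\left(|X - \mu| \geq k\sigma\right) \leq \frac{1}{k^2} $

For $k = 6$ this gives an upper bound of $\frac{1}{36} \approx 2.78\%$. Without knowing the distribution, a six sigma event can therefore only be said to occur at most about once in 36 observations, compared to the 0.000000197% obtained under normality.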
Nassim Nicholas Taleb called the misuse of games to model real-life situations the "ludic fallacy" (The Black Swan, 2007): the discrepancy between normality and reality. The word "ludic" originates from the Latin noun "ludus", meaning "play, game, sport".
Most people, however, assume that the standard deviation and the normal distribution are linked together. For example, the Black–Scholes model of option pricing is based on normally distributed log returns. Another "cool" concept used to sell more business books is "Six Sigma Quality" or "Six Sigma Practice", which strives to achieve fewer than 3.4 defects per million. Note that this figure is not the six sigma row of the table above halved, which would be about 1 defect per billion. It is the one-sided Gaussian tail at $4.5\sigma$, because the methodology conventionally allows the process mean to drift by $1.5\sigma$. Either way, this is obviously ludicrous: the figure assumes a normal distribution for flaws in the manufacturing process. A different distribution of errors may yield hundreds of errors per million at six sigmas.
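To illustrate how strongly the tail probability depends on the distribution, here is a minimal sketch comparing the Gaussian with a heavy-tailed alternative. The choice of Student's t distribution with 3 degrees of freedom is an assumption made purely for illustration, not a claim about any real market or manufacturing process. It has a finite standard deviation of $\sqrt{3}$ and a closed-form cumulative distribution, so no external library is needed.

```python
import math


def gaussian_tail(k: float) -> float:
    """Two-sided P(|X - mu| > k * sigma) under a Gaussian distribution."""
    return math.erfc(k / math.sqrt(2))


def student_t3_tail(k: float) -> float:
    """Two-sided P(|T| > k * sigma) for Student's t with 3 degrees of freedom.

    For nu = 3 the standard deviation is sqrt(3), so the threshold k * sigma
    expressed in units of t / sqrt(3) is simply k. Requires k > 0.
    """
    x = k
    # One-sided tail from the closed-form CDF of the t distribution with nu = 3,
    # rewritten with atan(1/x) = pi/2 - atan(x) to keep precision for large x.
    one_sided = (math.atan(1.0 / x) - x / (1.0 + x * x)) / math.pi
    return 2.0 * one_sided


k = 6
print(f"Gaussian  : {gaussian_tail(k) * 1e6:.6f} events per million")
print(f"Student t3: {student_t3_tail(k) * 1e6:.1f} events per million")
```

Under this assumed distribution, a six sigma event should come out at roughly 0.19%, on the order of 1,900 events per million, while the Gaussian gives about 0.002 per million. The standard deviation is the same concept in both cases; only the shape of the tails differs.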