It is sometimes hard to notice, but many of the foundations of computer science and programming come from mathematics and its many branches.
One example is the famous Big O notation, a technique so useful for evaluating an algorithm’s time and space complexity that nowadays it is part of most job interviews.
The Big O notation has been known in mathematics since at least the 19th century [1] and used in computer science since its beginnings, but you may be surprised to learn that we’ve been using it wrong all this time! Let’s see why.
O() (Big O)
As mentioned earlier, Big O has been around for a long time. We write it as the capital Latin letter O in italics, where the O stands for the German “Ordnung”, meaning “order” [2]. Its formal definition is:
\begin{alignedat}{1} f(x) &= O\bigl(g(x)\bigr)\ \text{as } x\to\infty \\[6pt] &\Longleftrightarrow\; \exists\,M>0,\ \exists\,x_{0}\ \text{s.t.}\ \forall\,x\ge x_{0}:\ |f(x)|\le M\,|g(x)| \end{alignedat}
Let me explain that with the following example:
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        if (j == 10) break;  /* the inner loop never does more than 10 units of work */
        do_work(i, j);
    }
}
We can define a function f(n) that tells us how many steps this code executes, counting calls to do_work. For n up to 10 the break never triggers, so the inner loop runs n times per outer iteration; from n = 11 onwards the break caps it at 10 iterations. That gives us:
\begin{alignedat}{1} f(n) = \begin{cases} n^{2}, & 0 \le n \le 10,\\ 10n, & n \ge 11. \end{cases} \end{alignedat}
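If you want to sanity-check that formula, a quick sketch like the one below should print n² for n up to 10 and 10n from 11 onwards. Note that do_work is stubbed out and count_steps is a helper I am introducing purely for illustration; neither is part of the original snippet:
#include <stdio.h>

/* Stub standing in for whatever real work the loop body does. */
static void do_work(int i, int j) { (void)i; (void)j; }

/* Hypothetical helper: count how many times do_work runs for a given n. */
static long count_steps(int n) {
    long steps = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (j == 10) break;
            do_work(i, j);
            steps++;
        }
    }
    return steps;
}

int main(void) {
    /* Expect n*n for n <= 10 and 10*n for n >= 11. */
    for (int n = 1; n <= 15; n++)
        printf("n = %2d -> %ld steps\n", n, count_steps(n));
    return 0;
}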
Now, let’s imagine another function g(n) such that, when multiplied by a constant M that we choose, it returns a value greater than or equal to f(n) for large values of n. For example, all of these functions meet that criterion:
\begin{alignedat}{1} g_{1}(n) = n \quad (\text{for any } M \ge 10)\\ g_{2}(n) = n^{2} \quad (\text{for any } M > 0)\\ g_{3}(n) = e^{n} \quad (\text{for any } M > 0) \end{alignedat}
So, all these statements would be valid:
\begin{alignedat}{1} f(n) \text{ is } O(n)\\ f(n) \text{ is } O(n^2)\\ f(n) \text{ is } O(e^n) \end{alignedat}
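To tie those statements back to the formal definition, here is one concrete choice of constants among the many that work. From n = 11 onwards we have f(n) = 10n, so:
\begin{alignedat}{1} |f(n)| = 10n \le M \cdot |g_{1}(n)| \quad \text{for all } n \ge x_{0}, \quad \text{with } M = 10 \text{ and } x_{0} = 11, \end{alignedat}
and therefore f(n) is O(n). The other two statements hold with even more room to spare, since n² and eⁿ grow much faster than n.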
This means that what Big O really does is define an upper bound for our function, but that bound does not have to be tight.
Although this is still useful sometimes, we usually need a tighter value. Unfortunately, people got used to using Big O for both tight and non-tight complexity estimates, which led to more than one headache.
This is why, in 1976, Donald Knuth sent a letter to the SIGACT News computer science newsletter describing this problem. In that letter, he proposed an improved version of the Big Omega notation that Hardy and Littlewood had introduced in 1914, and a new Big Theta notation based on the mathematical concept of “same order of magnitude” [3]. Knuth’s idea gained traction, and it’s what the computer science community uses to this day.
Ω() (Big Omega)
Just like Big O defines an upper bound for a function, Big Omega (or just Ω) defines a function’s lower bound.
The formal definition is:
\begin{aligned} \Omega(f(n)) \text{ denotes the set of all } g(n) \text{ such that }\\ \text{there exist positive constants } C \text{ and } n_{0}\\ \text{ with } g(n) \ge Cf(n) \text{ for all } n \ge n_{0} \end{aligned}
Considering the same example and function f(n) from the Big O section above, we can imagine another function g'(n) such that, when multiplied by a constant C that we choose, it returns a value less than or equal to f(n) for large values of n. For example, all of these functions meet that criterion:
\begin{alignedat}{1} g'_{1}(n) = n \quad (\text{for } C = 1 \text{ and } n_{0} = 1)\\ g'_{2}(n) = \sqrt{n} \quad (\text{for } C = 1 \text{ and } n_{0} = 1)\\ g'_{3}(n) = \log(n) \quad (\text{for } C = 1 \text{ and } n_{0} = 1) \end{alignedat}
And all these statements would be valid:
\begin{alignedat}{1} f(n) \text{ is } \Omega(n)\\ f(n) \text{ is } \Omega(\sqrt{n})\\ f(n) \text{ is } \Omega(\log(n)) \end{alignedat}
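As before, we can tie these statements back to the definition with one possible choice of constants (there are many others):
\begin{alignedat}{1} f(n) \ge n \ge \sqrt{n} \quad \text{and} \quad f(n) \ge n \ge \log(n) \quad \text{for all } n \ge 1, \quad \text{so } C = 1 \text{ and } n_{0} = 1 \text{ work in all three cases.} \end{alignedat}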
Similarly to Big O, Big Omega defines a lower bound, but it does not require that bound to be tight. Any lower bound that meets the criteria is still valid.
Θ() (Big Theta)
Last but not least, Big Theta describes the actual order of growth of an algorithm for large inputs: it means that the Big O and Big Omega statements both hold at the same time. Formally, it’s defined as:
\begin{aligned} \Theta(f(n)) \text{ denotes the set of all } g(n) \text{ such that }\\ \text{there exist positive constants } C \text{, } C' \text{ and } n_{0}\\ \text{ with } Cf(n) \le g(n) \le C'f(n) \text{ for all } n \ge n_{0} \end{aligned}
This is, in fact, what most people really mean when they give the order of an algorithm.
Using the same example as before, we can see that these functions g''(n) meet our criteria:
\begin{alignedat}{1} g''_{1}(n) = n \quad (\text{for } C = 0.1 \text{, } C' = 1 \text{ and } n_{0} = 11)\\ g''_{2}(n) = n + \log(n) \end{alignedat}
And so, these statements are true:
\begin{alignedat}{1} f(n) \text{ is } \Theta(n)\\ f(n) \text{ is } \Theta(n + \log(n)) \end{alignedat}
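Once more, we can check this against the definition with one valid choice of constants. From n = 11 onwards we have f(n) = 10n, so:
\begin{alignedat}{1} 0.1 \cdot f(n) = n \le n \le 10n = 1 \cdot f(n) \quad \text{for all } n \ge 11, \end{alignedat}
which is exactly the two-sided sandwich the definition asks for, using the constants listed above.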
Clear Communications
Despite Knuth’s work and the community’s subsequent adoption of the notations he proposed, the confusion around these terms has not gone away. To this day, we keep writing Big O when we really mean something different, usually Big Theta.
This is pure speculation, but I believe one reason Big O remains the most popular way to refer to complexity is that our keyboards have no key for the Omega or Theta symbols, unless you have a Greek keyboard 🙂, while the Latin letter O is present on most keyboards.
Whatever the reason, the misuse of the term remains a problem to this day. I would encourage people to use the correct terms unless doing so would cause more confusion: if everyone in your team says Big O when they mean Big Theta, maybe just follow along unless you really need to make the distinction.
Cheers,
José Miguel
References
[1] Landau Symbols, Wolfram MathWorld
[2] Big O notation, Wikipedia
[3] Big Omicron and Big Omega and Big Theta, Donald E. Knuth, SIGACT News, 1976