Kolmogorov complexity

tags: Complexity, Algorithmic Information theory, Computability theory

Definition

Invariance theorem

For two descriptive languages \(L_1\) and \(L_2\) and their respective associated Kolmogorov complexity functions \(K_1\) and \(K_2\), there exist a constant \(c\) — dependant only on \(L_1, L_2\) such that \[ \forall s, -c \leq K_1(s) - K_2(s) \leq c \]

In other words, there is always a bounded difference between the Kolmogorov complexity in two separate description languages.

This is proven by using the fact that the two descriptive languages have to be Turing-complete and therefore there exists an interpreter in \(L_1\) for \(L_2\) of finite size \(l\).

Uncomputability

Lemma

There exist strings of arbitrarily large Kolmogorov complexity. Formally: for each \(n \in \mathbb{N}\), there is a string s with \(K(s) \geq n\).

Proof: Otherwise infinitely many sequences would have finitely many generating programs.

Theorem

Kolmogorov complexity is not computable. There doesn’t exist a program that can compute for any string \(s\) the quantity \(K(s)\).

Proof: Let’s assume that a given description language (e.g. a common programming language) has an interepreter of size \(I\) in our Universal Turing machine and that there exist a program of lenght \(N\) that computes the Kolmogorov complexity of any string. We can write a program of fixed small size \(S\) that loops for each string of increasing length and stops when the result of the Kolmogorov complexity function is above \(2*(N+I+S)\).

Therefore, we found a contradiction with a string which Kolmogorov complexity is actually smaller than what our supposed function outputed.

Upper bound

Kolmogorov complexity is upper bounded by