Entropy

tags: Complexity metrics
references: (Shannon, Weaver 1975)

For a discrete random variable $X$ with outcomes $x_i$, $P(X=x_i) = P_i$, the entropy or uncertainty function of $X$ is defined as \[ H(X) = -\sum_{i=1}^{N} P_i \log P_i \]

Entropy is always positive, and is maximized when the uncertainty is maximal, that is when $P_1 = P_2 = … = P_N = \frac{1}{N}$ entropy in that case is $\log N$.

Interpretations:

$H$ measures uncertainty of a process (or the average number of yes/no answers one needs to specify the value of $i$).
Average information received by an observer when reading the outcomes of $X$.

If we encode the random variable $X$ with an erroneous encoding $P_i’$, the resulting code length is $\sum P_i \log \frac{1}{P_i’}$. The difference between this encoding and the optimal encoding is the Kullback-leibler divergence.

Shannon entropy for a sequence

In the case of a sequence of symbols $s_1, …, s_i, …$, following a translation invariant distribution (for any $n$ and $k$, $P(s_1s_2…s_n) = P(s_{1+k}s_{2+k}…s_{n+k})$).

We can define the block entropy of the $n$-tuple random variable $S=(S_1…S_n)$ like above \[ H_n = \sum_{s_1…s_n} P(s_1…s_n) \log P(s_1…s_n) \]

Then, we can define the average length of the description per symbol of the sequence as \[ h =\lim_{n \rightarrow \infty} h_n \] \[ h_n =H_{n+1} - H_n \]

$h$ is called by Shannon the entropy of the source emitting the sequence, or entropy of the sequence.

Bibliography

Claude E. Shannon, Warren Weaver. 1975. The Mathematical Theory of Communication. Urbana: University of Illinois Press.

Shannon entropy for a sequence

Bibliography

Links to this note

Comments