# Entropy

tags
Complexity metrics
references
(Shannon, Weaver 1975)

For a discrete random variable $$X$$ with outcomes $$x_i$$, $$P(X=x_i) = P_i$$, the entropy or uncertainty function of $$X$$ is defined as $H(X) = -\sum_{i=1}^{N} P_i \log P_i$

Entropy is always positive, and is maximized when the uncertainty is maximal, that is when $$P_1 = P_2 = … = P_N = \frac{1}{N}$$ entropy in that case is $$\log N$$.

Interpretations:

• $$H$$ measures uncertainty of a process (or the average number of yes/no answers one needs to specify the value of $$i$$).
• Average information received by an observer when reading the outcomes of $$X$$.

If we encode the random variable $$X$$ with an erroneous encoding $$P_i’$$, the resulting code length is $$\sum P_i \log \frac{1}{P_i’}$$. The difference between this encoding and the optimal encoding is the Kullback-leibler divergence.

## Shannon entropy for a sequence

In the case of a sequence of symbols $s_1, …, s_i, …$, following a translation invariant distribution (for any $$n$$ and $$k$$, $$P(s_1s_2…s_n) = P(s_{1+k}s_{2+k}…s_{n+k})$$).

We can define the block entropy of the $n$-tuple random variable $$S=(S_1…S_n)$$ like above $H_n = \sum_{s_1…s_n} P(s_1…s_n) \log P(s_1…s_n)$

Then, we can define the average length of the description per symbol of the sequence as $h =\lim_{n \rightarrow \infty} h_n$ $h_n =H_{n+1} - H_n$

$$h$$ is called by Shannon the entropy of the source emitting the sequence, or entropy of the sequence.

## Bibliography

1. . . The Mathematical Theory of Communication. Urbana: University of Illinois Press.