- tags
- Generative modelling
- papers
- (Sohl-Dickstein et al. 2015), (Ho et al. 2020)

## Principle of diffusion

### Forward diffusion

An image \(\boldsymbol{x}_0\) of size \(N \times N\) with \(c\) channels, viewed as a vector in \(\mathbb{R}^{N \times N \times c}\), is diffused at each timestep \(t\) into \(\boldsymbol{x}_t\). The forward diffusion step is defined as follows: \[ q(\boldsymbol{x}_t | \boldsymbol{x}_{t-1}) = \mathcal{N}(\boldsymbol{x}_t; \sqrt{1 - \beta_t}\, \boldsymbol{x}_{t-1}, \beta_t I) \] The probability of a sequence of images \(\boldsymbol{x}_1, \ldots, \boldsymbol{x}_T\) is then \[ q(\boldsymbol{x}_1, \ldots, \boldsymbol{x}_T | \boldsymbol{x}_0) = \prod_{t=1}^T q(\boldsymbol{x}_t|\boldsymbol{x}_{t-1}) \]
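As a minimal sketch of this step (the linear \(\beta_t\) schedule and image size below are illustrative assumptions, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_step(x_prev, beta_t, rng):
    """Sample x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

# Example: diffuse a 4x4 single-channel "image" for T steps.
T = 10
betas = np.linspace(1e-4, 0.02, T)  # hypothetical linear schedule
x = rng.standard_normal((4, 4, 1))  # stand-in for x_0
for beta in betas:
    x = forward_step(x, beta, rng)
```

Each call draws fresh Gaussian noise of the same shape as the image, so every pixel is perturbed independently.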

At each timestep, a new diffused image is sampled from a Gaussian distribution centered on \(\sqrt{1 - \beta_t}\, \boldsymbol{x}_{t-1}\) with covariance matrix \(\beta_t I\); the noise is drawn independently for each pixel, so the data is gradually perturbed.

This choice of scaled mean makes it possible to compute the distribution of \(\boldsymbol{x}_t\) directly for any timestep: \[ q(\boldsymbol{x}_t | \boldsymbol{x}_0) = \mathcal{N}(\boldsymbol{x}_t; \sqrt{\bar{\alpha}_t}\, \boldsymbol{x}_0, (1 - \bar{\alpha}_t) I) \] where \(\alpha_t = 1 - \beta_t\) and \(\bar{\alpha}_t = \prod_{i = 1}^t \alpha_i\).
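The closed form means \(\boldsymbol{x}_t\) can be sampled in one shot rather than by iterating \(t\) single steps. A sketch, again assuming an illustrative linear schedule:

```python
import numpy as np

rng = np.random.default_rng(1)

T = 1000
betas = np.linspace(1e-4, 0.02, T)  # hypothetical linear schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)     # running products: abar_t = prod_{i<=t} alpha_i

def q_sample(x0, t, rng):
    """Sample x_t ~ N(sqrt(abar_t) * x_0, (1 - abar_t) * I) directly."""
    abar = alpha_bars[t]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * noise

x0 = rng.standard_normal((8, 8, 3))  # stand-in for a training image
x_t = q_sample(x0, t=500, rng=rng)
```

This one-shot sampler is what makes training efficient: any \(t\) can be drawn at random without simulating the whole chain.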

The forward process progressively destroys structure in the data, mapping the distribution of images to a standard normal distribution in the limit of infinitely many timesteps.
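This convergence can be seen numerically: \(\bar{\alpha}_T \to 0\) as \(T\) grows, so \(q(\boldsymbol{x}_T | \boldsymbol{x}_0) \approx \mathcal{N}(0, I)\). A quick check with an assumed linear schedule:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # hypothetical linear schedule
alpha_bar_T = np.prod(1.0 - betas)   # mean scale sqrt(abar_T) and variance gap 1 - abar_T
print(alpha_bar_T)                   # near zero: x_T is essentially pure Gaussian noise
```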

### Iterative denoising

## Bibliography

- Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, Surya Ganguli. 2015. "Deep Unsupervised Learning Using Nonequilibrium Thermodynamics". arXiv. http://arxiv.org/abs/1503.03585.
- Jonathan Ho, Ajay Jain, Pieter Abbeel. 2020. "Denoising Diffusion Probabilistic Models". arXiv. http://arxiv.org/abs/2006.11239.