Flamingo

tags
Transformers, Computer vision, NLP, Chinchilla
paper
(Alayrac et al. 2022)

Architecture

Uses a frozen language model (e.g. Chinchilla) conditioned on visual representations produced by a Normalizer-Free ResNet (NFNet) vision encoder.
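
In the paper, the conditioning happens through gated cross-attention layers inserted between the frozen language-model blocks: text tokens attend to visual tokens, and the result is added back through a tanh gate initialised at zero, so the model starts out behaving exactly like the original language model. Below is a minimal sketch of that idea only. It assumes single-head attention with no learned projections, toy list-based vectors, and hypothetical function names (`cross_attend`, `gated_cross_attention`); it is not the paper's implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(text_tokens, visual_tokens):
    """Single-head dot-product attention: text tokens are the queries,
    visual tokens serve as both keys and values.
    Assumption: learned projections are omitted to keep the sketch short."""
    dim = len(text_tokens[0])
    attended = []
    for q in text_tokens:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in visual_tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, visual_tokens)) for i in range(dim)]
        )
    return attended


def gated_cross_attention(text_tokens, visual_tokens, gate):
    """Residual update: text + tanh(gate) * cross_attention(text, visual).
    With gate = 0.0 the visual pathway contributes nothing."""
    g = math.tanh(gate)
    attended = cross_attend(text_tokens, visual_tokens)
    return [
        [t + g * a for t, a in zip(tok, att)]
        for tok, att in zip(text_tokens, attended)
    ]


# Toy inputs (illustrative values only).
text = [[1.0, 0.0], [0.0, 1.0]]
visual = [[0.5, 0.5], [1.0, -1.0]]

at_init = gated_cross_attention(text, visual, gate=0.0)  # gate closed
opened = gated_cross_attention(text, visual, gate=1.0)   # gate partly open
```

With the gate at zero the text representations pass through unchanged, which is why the frozen language model's behaviour is preserved at the start of training; as the gate is learned, visual information is mixed in gradually.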

Parameter count

80B (largest variant)

Bibliography

  1. Alayrac et al. 2022. "Flamingo: A Visual Language Model for Few-shot Learning". arXiv. http://arxiv.org/abs/2204.14198.
