- tags: Transformers, Computer vision, NLP, Chinchilla
- paper: (Alayrac et al. 2022)
Architecture
Uses a frozen language model (e.g. Chinchilla) conditioned on visual representations produced by a Normalizer-Free ResNet (NFNet) vision encoder.
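To make "conditioned on a visual representation" concrete, here is a minimal sketch of one way text tokens can attend to visual features: a single-head, tanh-gated cross-attention step. The gating idea follows the Flamingo paper as I understand it, but everything else is an assumption for illustration: the function names, the toy dimensions, the random weights, and the omission of layer norm, feed-forward layers, and the resampling of visual features. It is not the authors' implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def gated_cross_attention(text, visual, w_q, w_k, w_v, alpha):
    """Hypothetical helper: text tokens (T x d) attend to visual features (V x d).

    The attended visual signal is scaled by tanh(alpha) and added to the
    text stream, so alpha = 0 leaves the text activations unchanged.
    """
    q = matmul(text, w_q)    # queries come from the language side
    k = matmul(visual, w_k)  # keys come from the vision side
    v = matmul(visual, w_v)  # values come from the vision side
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    attended = matmul(weights, v)
    gate = math.tanh(alpha)
    return [
        [t + gate * a for t, a in zip(t_row, a_row)]
        for t_row, a_row in zip(text, attended)
    ]


def random_matrix(rows, cols):
    """Toy stand-in for real activations or learned weights."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
d = 4                                      # toy hidden size (assumption)
text = random_matrix(5, d)                 # 5 text-token activations
visual = random_matrix(3, d)               # 3 visual feature vectors
w_q, w_k, w_v = (random_matrix(d, d) for _ in range(3))

out_init = gated_cross_attention(text, visual, w_q, w_k, w_v, alpha=0.0)
out_open = gated_cross_attention(text, visual, w_q, w_k, w_v, alpha=0.5)
```

With `alpha = 0.0` the gate is closed and `out_init` equals `text`, which is why a frozen language model can have such layers inserted without changing its behaviour at the start of training. A non-zero `alpha` lets visual information flow into the text stream.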
Parameter count
80B (largest model; smaller 3B and 9B variants also exist)
Bibliography
- Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, et al. 2022. "Flamingo: A Visual Language Model for Few-shot Learning". arXiv. http://arxiv.org/abs/2204.14198.