- tags
- Transformers, GPT, Chinchilla
- paper
- (Glaese et al. 2022)
- blog post
- DeepMind announcement blog post
Architecture
Starts from the Chinchilla 70B model and adds RLHF (Reinforcement Learning from Human Feedback). Like GopherCite, it also supports its answers with inline evidence.
Parameter count
70B
Bibliography
- Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh, et al. 2022. "Improving Alignment of Dialogue Agents via Targeted Human Judgements". arXiv. DOI.