- tags
- Transformers, GPT, NLP
- paper
- (Ouyang et al. 2022)
Architecture
The model starts from a pretrained GPT-3, which is first fine-tuned on human demonstrations. A reward model is then trained on human preference comparisons between model outputs, and the policy is further fine-tuned against that reward model with reinforcement learning (PPO).
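The reward model is trained on pairwise comparisons with the ranking loss -log(sigmoid(r(x, y_w) - r(x, y_l))), where y_w is the preferred completion. A minimal NumPy sketch of that objective, using random feature vectors as hypothetical stand-ins for transformer hidden states (not the paper's actual training code):

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(w, features):
    """Scalar reward: a linear head over (stand-in) hidden features."""
    return features @ w

def pairwise_loss_grad(w, chosen, rejected):
    """Gradient of -log(sigmoid(r(chosen) - r(rejected))) w.r.t. w."""
    margin = reward(w, chosen) - reward(w, rejected)
    sig = 1.0 / (1.0 + np.exp(-margin))
    # dL/d(margin) = sig - 1; chain through margin = (chosen - rejected) @ w
    return (sig - 1.0) * (chosen - rejected)

# Synthetic comparisons: "chosen" is whichever vector a hidden w_true prefers.
dim = 8
w_true = rng.normal(size=dim)
pairs = []
for _ in range(500):
    a, b = rng.normal(size=dim), rng.normal(size=dim)
    pairs.append((a, b) if a @ w_true >= b @ w_true else (b, a))

# Train the reward head by SGD on the ranking loss.
w = np.zeros(dim)
lr = 0.1
for _ in range(20):
    for chosen, rejected in pairs:
        w -= lr * pairwise_loss_grad(w, chosen, rejected)

# The trained model should rank the preferred completion higher in most pairs.
accuracy = np.mean([reward(w, c) > reward(w, r) for c, r in pairs])
print(f"preference accuracy: {accuracy:.2f}")
```

In the full pipeline this scalar reward then drives the PPO fine-tuning stage, with a KL penalty keeping the policy close to the supervised model.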
Parameter count
175B
Bibliography
- Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, et al. 2022. "Training Language Models to Follow Instructions with Human Feedback". arXiv.