- tags
  - Transformers, NLP
- paper
  - (Radford et al. 2018)
Successors
The GPT architecture was improved and extended in GPT-2 and GPT-3. The original model, retroactively called “GPT-1”, was quickly superseded, but “GPT” is still used to refer to this family of models.
Parameter count
117M
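
The 117M figure can be roughly reproduced from the hyperparameters reported in the paper: 12 decoder layers, 768-dimensional states, 3072-dimensional feed-forward layers, a BPE vocabulary of roughly 40,478 tokens, and a 512-token context window. The sketch below is a back-of-the-envelope Python estimate, not an exact count: it ignores bias and LayerNorm parameters and assumes the output projection shares weights with the input embedding, as in the original model.

```python
# Back-of-the-envelope parameter count for GPT-1, using the
# hyperparameters reported in Radford et al. 2018.
d_model = 768         # hidden state size
n_layers = 12         # decoder layers
d_ffn = 3072          # feed-forward inner size (4 * d_model)
vocab_size = 40478    # ~40,000 BPE merges plus base tokens (approximate)
n_positions = 512     # learned position embeddings for a 512-token context

# Token and position embeddings; the output projection is assumed
# tied to the token embedding, so it adds no extra parameters.
embeddings = vocab_size * d_model + n_positions * d_model

attention_per_layer = 4 * d_model * d_model  # Q, K, V, and output projections
ffn_per_layer = 2 * d_model * d_ffn          # two feed-forward weight matrices
per_layer = attention_per_layer + ffn_per_layer  # biases/LayerNorms add <0.1%

total = embeddings + n_layers * per_layer
print(f"{total / 1e6:.1f}M parameters")  # ~116.4M, consistent with the quoted 117M
```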
Bibliography
- Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. "Improving Language Understanding by Generative Pre-Training". OpenAI.