- tags
- Transformers, GPT, NLP
- website
- Microsoft Project Turing
Architecture
The architecture is a decoder-only transformer similar to GPT-2 and GPT-3, with parameter optimizations and a custom software/hardware training stack (DeepSpeed with the ZeRO optimizer) to improve training efficiency.
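The core of such a GPT-style decoder is causal (masked) self-attention, where each token can only attend to itself and earlier positions. A minimal single-head sketch in numpy (not the actual Turing implementation, which is multi-head and heavily optimized):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv, Wo):
    """x: (seq_len, d_model); single attention head for brevity."""
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    # Causal mask: position t may not attend to positions > t.
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -1e9
    return softmax(scores) @ v @ Wo

# Toy usage with random weights.
rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.normal(size=(T, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))
y = causal_self_attention(x, Wq, Wk, Wv, Wo)
print(y.shape)  # (4, 8)
```

The mask is what makes the model autoregressive: the output at position 0 is unaffected by any later token.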
Parameter count
17B parameters originally (Turing-NLG, 2020); the successor Megatron-Turing NLG, built with NVIDIA in 2021, scales up to 530B.
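These headline numbers are consistent with the standard rough estimate for a decoder-only transformer, roughly 12 · n_layers · d_model² (≈4d² for attention projections plus ≈8d² for the MLP per layer, ignoring embeddings). The layer/hidden configs below are the published ones as best I recall and should be treated as assumptions:

```python
def approx_params(n_layers, d_model):
    """Rough decoder-only transformer parameter count: 12 * L * d^2
    (QKV + attention output ~= 4*d^2, MLP ~= 8*d^2 per layer; embeddings ignored)."""
    return 12 * n_layers * d_model ** 2

# Assumed configs: Turing-NLG (78 layers, d_model 4256),
# Megatron-Turing NLG (105 layers, d_model 20480).
turing_nlg = approx_params(78, 4256)
mt_nlg = approx_params(105, 20480)
print(f"{turing_nlg / 1e9:.1f}B")  # ~17B
print(f"{mt_nlg / 1e9:.1f}B")      # ~528B, marketed as 530B
```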