This is an extension of BERT with more data and a better optimized training procedure.
- Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. . "Roberta: A Robustly Optimized BERT Pretraining Approach". arXiv. http://arxiv.org/abs/1907.11692.