- tags
- Transformers, RoBERTa, NLP
- paper
- (Conneau et al. 2020)
Architecture
The model (XLM-R) extends RoBERTa's masked-language-modeling recipe to roughly 100 languages, pretrained on filtered CommonCrawl data with a single shared SentencePiece vocabulary of about 250k tokens. The paper's main insight is how model capacity and vocabulary size must scale with the number of languages to avoid degrading per-language performance (the "curse of multilinguality").
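A minimal usage sketch (not from the paper): the checkpoints are published on the Hugging Face Hub as `xlm-roberta-base` and `xlm-roberta-large`, and since XLM-R is a RoBERTa-style masked language model, the standard fill-mask pipeline works across languages with one shared vocabulary.

```python
# Minimal sketch using Hugging Face transformers; assumes the public
# `xlm-roberta-base` checkpoint (pip install transformers torch).
from transformers import pipeline

# XLM-R uses "<mask>" as its mask token.
fill_mask = pipeline("fill-mask", model="xlm-roberta-base")

# One model, one vocabulary, many languages: the same checkpoint
# fills masks in English and French prompts alike.
print(fill_mask("The capital of France is <mask>.")[0]["token_str"])
print(fill_mask("La capitale de la France est <mask>.")[0]["token_str"])
```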
Parameter count
- Base = 270M
- Large = 550M
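These rounded counts can be sanity-checked by loading the checkpoints and summing parameter sizes; the sketch below assumes the same Hub checkpoints, and the exact totals differ slightly from the paper's rounded figures depending on which head layers are included.

```python
# Sketch: count parameters of the two public XLM-R checkpoints.
# Assumes Hugging Face transformers; the ~250k-token shared vocabulary
# means the embedding matrix dominates the base model's count.
from transformers import AutoModel

for name in ("xlm-roberta-base", "xlm-roberta-large"):
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")
```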
Bibliography
- Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov. 2020. "Unsupervised Cross-lingual Representation Learning at Scale". arXiv:1911.02116. DOI: 10.48550/arXiv.1911.02116.