XLM-RoBERTa

tags
Transformers, RoBERTa, NLP
paper
(Conneau et al. 2020)

Architecture

The model keeps the RoBERTa architecture and masked-language-modelling objective but pretrains it on filtered CommonCrawl text covering roughly 100 languages, with a much larger (~250K) SentencePiece vocabulary. The paper's contribution is less an architectural change than an analysis of how model capacity, vocabulary size, and data scale trade off in multilingual pretraining (the "curse of multilinguality").
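
A minimal sketch of using the model through the Hugging Face `transformers` library; the `xlm-roberta-base` checkpoint name and the fill-mask example below are illustrative, not taken from the note:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

# One checkpoint covers all ~100 languages; here a French fill-mask example.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

text = "La capitale de la France est <mask>."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the <mask> position and decode the highest-scoring token.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```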

Parameter count

  • Base = 270M
  • Large = 550M
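
A quick way to sanity-check these figures is to count parameters directly; a rough sketch (totals can differ slightly from the paper's rounded numbers depending on which head is loaded and on weight tying):

```python
from transformers import AutoModel

# Rough check of the parameter counts above; the large embedding matrix
# (~250K-token vocabulary) accounts for much of the total.
for name in ("xlm-roberta-base", "xlm-roberta-large"):
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")
```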

Bibliography

  1. . . "Unsupervised Cross-lingual Representation Learning at Scale". arXiv. DOI.
