XLM-RoBERTa

tags
Transformers, RoBERTa, NLP
paper
(Conneau et al. 2020)

Architecture

The model is a multilingual extension of RoBERTa: it keeps the same Transformer architecture and masked-language-modelling objective, but is pre-trained on CommonCrawl data covering 100 languages with a single shared vocabulary. The paper's contribution is largely empirical, analysing how choices such as model capacity, vocabulary size and per-language training data affect cross-lingual performance at scale.
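As a minimal sketch (assuming the Hugging Face `transformers` library and its public `xlm-roberta-base` checkpoint), the model is used exactly like a monolingual RoBERTa masked language model; because the vocabulary is shared across languages, the same head handles prompts in any of them without language-specific inputs:

```python
from transformers import pipeline

# One shared vocabulary across languages, so the same fill-mask head
# works on English and French prompts alike.
fill_mask = pipeline("fill-mask", model="xlm-roberta-base")

print(fill_mask("Paris is the <mask> of France."))
print(fill_mask("Paris est la <mask> de la France."))
```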

Parameter count

  • Base = 270M
  • Large = 550M
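Both figures can be checked directly from the released checkpoints. A small sketch (again assuming the Hugging Face `transformers` package, which pulls in PyTorch) that loads each model and sums its parameter tensors:

```python
from transformers import AutoModelForMaskedLM

# Count parameters of the public XLM-R checkpoints.
for name in ["xlm-roberta-base", "xlm-roberta-large"]:
    model = AutoModelForMaskedLM.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")
```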

Bibliography

  1. Conneau, Alexis, et al. 2020. "Unsupervised Cross-lingual Representation Learning at Scale". arXiv preprint.