DistilBERT

tags
Transformers, BERT, NLP
paper
(Sanh et al. 2020)

Architecture

It is a distilled version of BERT: a Transformer encoder with 6 layers instead of BERT-base's 12 (same hidden size), trained with knowledge distillation from BERT. It is about 40% smaller, roughly 60% faster at inference, and retains about 97% of BERT's language-understanding performance.
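The paper's training objective combines a soft-target distillation loss with the masked-language-modeling loss and a cosine embedding loss. Below is a minimal numpy sketch of just the soft-target component (temperature-scaled KL divergence between teacher and student logits); function names and the toy logits are illustrative, not from the paper's code.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer distributions.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) over soft targets, scaled by T^2 to keep
    # gradient magnitudes comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T**2)

# Toy example: one token position, vocabulary of size 3.
student = np.array([[2.0, 1.0, 0.1]])
teacher = np.array([[2.2, 0.9, 0.2]])
loss = distillation_loss(student, teacher)
```

The loss is non-negative and reaches zero only when the student matches the teacher's soft distribution exactly.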

Parameter count

66M

Bibliography

  1. Sanh, Victor, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2020. "DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter". arXiv.
