T5

tags
Transformers, NLP
paper
(Raffel et al. 2020)

Architecture

It is the same encoder-decoder architecture as the original Transformer, except that the sinusoidal position embeddings are replaced by simplified relative position embeddings (similar in spirit to Transformer-XL): each query-key offset is mapped to a bucket, and a learned scalar bias per bucket is added to the attention logits.
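A minimal sketch of the relative-position bucketing, modeled on the scheme described in the paper (the function name and default values `num_buckets=32`, `max_distance=128` follow the reference implementation; treat the details as an assumption, not the exact T5 code):

```python
import math

def relative_position_bucket(relative_position, bidirectional=True,
                             num_buckets=32, max_distance=128):
    # Map a (key_pos - query_pos) offset to a bucket index.
    # Small offsets get their own exact bucket; larger offsets share
    # logarithmically sized buckets, capped at max_distance.
    ret = 0
    n = relative_position
    if bidirectional:
        # Half the buckets for negative offsets, half for positive.
        num_buckets //= 2
        if n > 0:
            ret += num_buckets
        n = abs(n)
    else:
        n = max(-n, 0)
    max_exact = num_buckets // 2
    if n < max_exact:
        return ret + n  # exact bucket for nearby positions
    # Logarithmic bucket for distant positions.
    val = max_exact + int(
        math.log(n / max_exact) / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    )
    return ret + min(val, num_buckets - 1)
```

The attention layer then looks up a learned scalar per bucket (and per head) and adds it to the raw attention scores, so all positions beyond `max_distance` share one bucket.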

The largest model has 11B parameters.

Bibliography

1. Raffel, Colin, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". arXiv.
