- tags: Transformers, NLP
- paper: (Zhang et al. 2020)
Architecture
PEGASUS is a standard Transformer encoder/decoder whose pre-training objective is tailored to abstractive summarization: gap-sentence generation (GSG), in which important sentences are removed/masked from the input document and the model is trained to generate them from the remaining sentences.
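A minimal sketch of running the pre-trained encoder/decoder for summarization via the Hugging Face transformers library; the checkpoint name is an assumption (any PEGASUS checkpoint works the same way):

```python
# Sketch: abstractive summarization with a pre-trained PEGASUS checkpoint
# via Hugging Face transformers (checkpoint name is an assumption).
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

checkpoint = "google/pegasus-xsum"  # assumed fine-tuned checkpoint
tokenizer = PegasusTokenizer.from_pretrained(checkpoint)
model = PegasusForConditionalGeneration.from_pretrained(checkpoint)

document = (
    "PEGASUS pre-trains a Transformer encoder/decoder by masking whole "
    "sentences from a document and generating them as a pseudo-summary."
)
inputs = tokenizer(document, truncation=True, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```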
Parameter count
- Base = 223M
- Large = 568M
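A quick way to sanity-check these figures is to count the parameters of a loaded checkpoint; a sketch, assuming the Hugging Face "google/pegasus-large" checkpoint matches the paper's Large configuration:

```python
# Sketch: count parameters of a loaded checkpoint to compare against the
# reported sizes (assumes "google/pegasus-large" matches the Large model).
from transformers import PegasusForConditionalGeneration

model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-large")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")  # should come out near the reported 568M
```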
Bibliography
- Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu. 2020. "PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization". arXiv. http://arxiv.org/abs/1912.08777.