- tags
  - Transformers, BART, NLP
- paper
  - (Li et al. 2022)
Architecture
DQ-BART is a distilled and quantized version of BART: the model is jointly distilled into a student with fewer layers and quantized to low-precision weights. This greatly reduces the model size while largely preserving sequence-to-sequence performance.
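As a rough illustration of the quantization half, the sketch below applies PyTorch's post-training dynamic quantization to a stock BART checkpoint. This is only an approximation: DQ-BART uses quantization-aware training jointly with distillation rather than quantizing after the fact, and the checkpoint name `facebook/bart-base` plus the use of Hugging Face Transformers are assumptions for the example, not part of the paper.

```python
import io

import torch
from transformers import BartForConditionalGeneration

# Full-precision BART checkpoint, standing in for a distilled student
# (DQ-BART's students additionally have fewer decoder layers).
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Post-training dynamic quantization: replace Linear layers with int8
# versions. DQ-BART instead trains with quantization in the loop, so
# this only approximates the resulting footprint reduction.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

def serialized_mb(m: torch.nn.Module) -> float:
    """Serialized state-dict size in megabytes."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.tell() / 1e6

print(f"fp32 model: {serialized_mb(model):.0f} MB")
print(f"int8 model: {serialized_mb(quantized):.0f} MB")
```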
Bibliography
- Zheng Li, Zijian Wang, Ming Tan, Ramesh Nallapati, Parminder Bhatia, Andrew Arnold, Bing Xiang, and Dan Roth. 2022. "DQ-BART: Efficient Sequence-to-sequence Model via Joint Distillation and Quantization". arXiv.