BART

Architecture

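BART is a Transformer-based encoder-decoder (sequence-to-sequence) model. The encoder is bidirectional, as in BERT, and the decoder is autoregressive (left-to-right), as in GPT. BART can therefore be seen as generalizing both models within a single architecture.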
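The difference between the two halves can be illustrated with their attention masks. The following is a minimal sketch in plain Python, not code from the BART implementation; the function names (`bidirectional_mask`, `causal_mask`, `cross_attention_mask`, `render`) are hypothetical and chosen only for illustration. It assumes the usual convention that a BERT-style encoder lets every token attend to every other token, while a GPT-style decoder lets each token attend only to itself and earlier tokens.

```python
def bidirectional_mask(n):
    """Encoder self-attention (BERT-like): every position may attend to every position."""
    return [[True] * n for _ in range(n)]


def causal_mask(n):
    """Decoder self-attention (GPT-like): position i may attend only to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def cross_attention_mask(tgt_len, src_len):
    """Decoder-to-encoder attention: each target position sees the whole encoded source."""
    return [[True] * src_len for _ in range(tgt_len)]


def render(mask):
    """Draw a mask as text: 'x' means attention is allowed, '.' means it is blocked."""
    return "\n".join(
        "".join("x" if allowed else "." for allowed in row) for row in mask
    )


if __name__ == "__main__":
    print("encoder self-attention (bidirectional):")
    print(render(bidirectional_mask(4)))
    print("decoder self-attention (causal):")
    print(render(causal_mask(4)))
    print("decoder-to-encoder cross-attention (3 target tokens, 4 source tokens):")
    print(render(cross_attention_mask(3, 4)))
```

In this picture, an encoder-only model uses just the first mask and a decoder-only model just the second; an encoder-decoder model combines all three.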

Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer (October 29, 2019). "BART: Denoising Sequence-to-sequence Pre-training for Natural Language Generation, Translation, and Comprehension". arXiv.