Transformers are a neural network architecture based on a mechanism called Attention.
They have been particularly successful in NLP, a trend that took off with the publication of a highly influential paper by Vaswani and colleagues (Vaswani et al. 2017). Transformers turned out to be very effective language models.
Transformers in PyTorch
PyTorch has several implementations of transformers, and the simplest to use is
torch.nn.Transformer. For example, its documentation proposes the following example:
import torch
import torch.nn as nn

transformer_model = nn.Transformer(nhead=16, num_encoder_layers=12)
src = torch.rand((10, 32, 512))
tgt = torch.rand((20, 32, 512))
out = transformer_model(src, tgt)
The only issue is that this implementation abstracts away most of what is going on inside a Transformer. That is great for rapidly firing up a large and complex model, but not so great for understanding how it works.
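To see what the abstraction hides, here is a minimal sketch of the core operation of the Vaswani et al. (2017) architecture, scaled dot-product attention. The function name and tensor shapes are illustrative choices, not part of the torch.nn.Transformer API:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    # Similarity of each query with each key: (batch, seq_q, seq_k)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    # Each row of weights sums to 1 over the keys
    weights = F.softmax(scores, dim=-1)
    # Weighted average of the value vectors: (batch, seq_q, d_k)
    return weights @ v

# Toy input: batch of 2 sequences of length 10 with dimension 64
q = torch.rand(2, 10, 64)
k = torch.rand(2, 10, 64)
v = torch.rand(2, 10, 64)
out = scaled_dot_product_attention(q, k, v)  # shape (2, 10, 64)
```

The real nn.Transformer layers wrap this in multi-head projections, residual connections, and layer normalization, but the weighted-average mechanism above is the heart of it.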
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. 2017. "Attention Is All You Need". arXiv:1706.03762 [cs]. http://arxiv.org/abs/1706.03762.