LM with RNNs
Different models have been studied, starting with the original recurrent neural network based language model (Mikolov et al. 2011).
LSTMs were later applied with more success than earlier models (Zaremba, Sutskever, and Vinyals 2015).
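As a rough sketch of what such a model looks like, here is a minimal LSTM language model in PyTorch; the class name, layer sizes, and other hyperparameters are illustrative placeholders, not the settings from the cited papers:

```python
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    """Minimal LSTM language model: embed tokens, run them through
    an LSTM, and project hidden states back to vocabulary logits."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_layers=2):
        # Illustrative sizes only, not those of Zaremba et al. (2015).
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, state=None):
        # token_ids: (batch, seq_len) tensor of token indices
        x = self.embedding(token_ids)
        output, state = self.lstm(x, state)
        logits = self.head(output)  # (batch, seq_len, vocab_size)
        return logits, state
```

Training minimizes the cross-entropy between the logits at position t and the actual token at position t+1.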
LM with Transformers
Language modeling and compression
Language models can be used to generate text from a prompt or starting sentence. Examples of this kind made models like GPT-2 and GPT-3 famous, thanks to their ability to generate long sequences of seemingly coherent text (Radford et al. 2019; Brown et al. 2020).
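The mechanism behind this is an autoregressive sampling loop. Below is a minimal sketch that works with the LSTMLanguageModel above; the generate function and its temperature parameter are assumptions for this example, not GPT-2's or GPT-3's actual decoding code, which typically adds refinements such as top-k or nucleus sampling:

```python
import torch

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=50, temperature=1.0):
    """Sample a continuation: repeatedly predict the next-token
    distribution and append one sampled token to the sequence."""
    ids = prompt_ids  # (1, prompt_len) tensor of token indices
    state = None
    for _ in range(max_new_tokens):
        # Feed the whole prompt once, then only the newest token,
        # carrying the recurrent state forward between steps.
        inputs = ids if state is None else ids[:, -1:]
        logits, state = model(inputs, state)
        probs = torch.softmax(logits[:, -1, :] / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # (1, 1)
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```

The state-carrying trick is specific to RNNs; a Transformer LM would instead refeed the full sequence at each step (or cache its attention keys and values).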
- Notes on: Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data by Bender, E. M., & Koller, A. (2020)
- Notes on: Neural Architecture Search with Reinforcement Learning by Zoph, B., & Le, Q. V. (2017)
- Notes on: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention by Katharopoulos, A., Vyas, A., Pappas, N., & Fleuret, F. (2020)
- Word vectors