Different models have been studied, starting from the initial
recurrent neural network based language model (Mikolov et al. 2011).
LSTMs were then used with more success than previous models
(Zaremba, Sutskever, and Vinyals 2015).
Language models can be used to generate text from a prompt or
starting sentence. This is the kind of example that made models
like GPT-2 and GPT-3 famous, because of their ability to generate
long sequences of apparently coherent text (Radford et al. 2019; Brown
et al. 2020).
Brown, Tom B., Benjamin Mann, Nick Ryder,
Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind
Neelakantan, et al. 2020. “Language Models Are Few-Shot Learners.”
arXiv:2005.14165 [Cs], June.
Mikolov, Tomas, Martin Karafiát, Lukáš
Burget, Jan Černocký, and Sanjeev Khudanpur. 2011. “Recurrent
Neural Network Based Language Model.” In Interspeech 2011.
Radford, Alec, Jeffrey Wu, Rewon Child,
David Luan, Dario Amodei, and Ilya Sutskever. 2019. “Language
Models Are Unsupervised Multitask Learners.” OpenAI Blog 1.
Zaremba, Wojciech, Ilya Sutskever, and
Oriol Vinyals. 2015. “Recurrent Neural Network Regularization.”
arXiv:1409.2329 [Cs], February.