- tags
- Transformers, GPT
- paper
- (Shuster et al. 2022)
Architecture
This is an extension (SeeKeR, for Search-engine→Knowledge→Response) that can be applied to any Transformer model by introducing “search”, “knowledge”, and “response” modules during training. The same underlying model is applied serially: it first generates a search query from the context, then distills the retrieved documents into a relevant knowledge sentence, and finally generates a response conditioned on that knowledge. It has the same applications as the base model it extends; a sketch of the pipeline follows.
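A minimal sketch of the three-module pipeline, assuming a single `generate(prompt)` function wrapping the shared Transformer and a hypothetical `search(query)` helper for the external search engine. The control tokens are illustrative placeholders, not the paper's exact special tokens.

```python
from typing import Callable, List

def seeker_respond(
    context: str,
    generate: Callable[[str], str],      # one shared Transformer, reused per module
    search: Callable[[str], List[str]],  # external search engine (assumed helper)
) -> str:
    # Search module: the model generates a search query from the context.
    query = generate(context + " __generate-query__")

    # Retrieval happens outside the model, via the search engine.
    documents = search(query)

    # Knowledge module: the model distills the retrieved documents
    # into a single relevant knowledge sentence.
    knowledge = generate("\n".join(documents) + "\n" + context + " __generate-knowledge__")

    # Response module: the model produces the final response,
    # conditioned on the context plus the generated knowledge.
    return generate(context + "\n" + knowledge + " __generate-response__")
```

Because every module reuses the same weights, the approach adds no parameters beyond the base model, which is why the parameter count below simply defers to the model being extended.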
Parameter count
Depends on the base model being extended.
Bibliography
- Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston. 2022. "Language Models That Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion". arXiv.