SeeKeR

tags: Transformers, GPT
paper: (Shuster et al. 2022)

Architecture

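SeeKeR is an extension that can be applied to any Transformer model by introducing “search”, “knowledge”, and “response” modules during pre-training. The modules run in sequence: the model generates a search query, distills knowledge from the retrieved documents, and then generates the final response. It has the same applications as the base model it extends.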
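
The flow through the three modules can be illustrated with a minimal Python sketch. It is not the authors' implementation: `seeker_respond`, the `generate` and `search` callables, and the `[search]`, `[knowledge]`, and `[response]` prompt prefixes are all hypothetical names chosen for illustration. The sketch assumes a single text-in/text-out model that is called once per module and an arbitrary retrieval backend.

```python
from typing import Callable, List

# Hypothetical interfaces: any text-in/text-out language model and any
# retrieval backend can be plugged in here.
Generate = Callable[[str], str]
Search = Callable[[str], List[str]]


def seeker_respond(context: str, generate: Generate, search: Search) -> str:
    """Chain the search, knowledge, and response modules with one shared model."""
    # 1. Search module: turn the context into a search query.
    #    The "[search]" prefix is an assumed way to select the module.
    query = generate(f"[search] {context}")

    # 2. Knowledge module: condense the retrieved documents into a
    #    short knowledge string relevant to the context.
    documents = search(query)
    knowledge = generate(f"[knowledge] {context}\n" + "\n".join(documents))

    # 3. Response module: produce the final output from the context
    #    plus the generated knowledge.
    return generate(f"[response] {context}\n{knowledge}")
```

Because the base model is only reached through `generate`, the same wrapper could in principle sit on top of either a dialogue model or a prompt-completion model, which matches the claim that the applications are those of the base model.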

Parameter count

Depends on the base model being extended.

Bibliography

Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston. March 29, 2022. "Language Models That Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion". arXiv.
