- tags
- NLP, Computer science
- source
- (Weiss et al. 2021)
Summary
This paper introduces a programming language that is inspired by the way Transformers process input data. The language is called Restricted Access Sequence Processing Language (RASP).
Data is represented as sequences, which is the structure transformers manipulate (since they have been designed for NLP applications). The language has two types of internal data representation:
- Sequence operators (s-ops) are functions that translate sequences into sequences. For example, the
tokens
operator is the identity and thelength
operator is defined bylength(["h", "i"]) = [2, 2]
. - Selectors are functions that takes two sequences and a boolean predicate that can be applied to all pairs of input values. It returns a matrix which values are the result of the boolean op on pairs of values. \[ \text{select}([1, 2, 3], [0, 3, 4], <) = \begin{bmatrix} \text{F} & \text{T} & \text{T}\newline \text{F} & \text{T} & \text{T}\newline \text{F} & \text{F} & \text{T} \end{bmatrix} \] Selectors are paired with aggregators that will convert a selector matrix and a sequence to a sequence. The selector filters and average input values: \[ \text{aggregate}(\begin{bmatrix} \text{F} & \text{T} & \text{T} \newline \text{F} & \text{T} & \text{T}\newline \text{F} & \text{F} & \text{T} \end{bmatrix} , [0, 12, 3]) = [7.5, 7.5, 3] \]
Correspondence between RASP ops and transformer components:
- Built-in s-ops like
indices
andtokens
are similar to input embeddings of a transformer. - Element-wise s-ops are related to feed-forward components in transformers which are just regular neural networks.
- Selection and aggregation are akin to attention in transformers.
To evaluate how “good” this new language is for understanding and designing transformer architectures, the authors study three main questions:
- its ability to upper bound the number of heads and layers required to solve a task,
- the tightness of that bound,
- its feasibility in a transformer
Bibliography
- Gail Weiss, Yoav Goldberg, Eran Yahav. . "Thinking Like Transformers". Arxiv:2106.06981 [cs]. http://arxiv.org/abs/2106.06981.