- source
- (Yang, Piantadosi 2022)
- tags
- NLP, Artificial Intelligence, Machine learning

## Summary

This paper introduces a model for learning language from few examples while generalizing effectively.

This model builds sentences with functions that are compositions of elementary functions, including:

`pair(L, C)`

: Concatenates character C onto list L

`first(L)`

: Returns the first character of L

`flip(P)`

: Returns true with probability P

`if(B, X, Y)`

: Returns X if B, else Y (X and Y may be lists, sets, or probabilities)

- etc.

(See Table 1 in (Yang, Piantadosi 2022) for the complete list)
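To make this concrete, here is a minimal Python sketch of these primitives (my own illustration, not the paper's implementation). Representing strings as Python lists of characters, and the `toy_hypothesis` function with its 0.5 stopping probability, are assumptions made for the example; the `if` primitive is simply Python's conditional.

```python
import random

def pair(L, C):
    """Concatenate character C onto list L (returns a new list)."""
    return L + [C]

def first(L):
    """Return the first character of L."""
    return L[0]

def flip(P):
    """Return True with probability P."""
    return random.random() < P

# A toy hypothesis composed from the primitives: it emits the strings
# "a", "aa", "aaa", ... with geometrically decaying probability.
def toy_hypothesis():
    if flip(0.5):
        return ["a"]
    return pair(toy_hypothesis(), "a")

print("".join(toy_hypothesis()))  # e.g. "aaa"
```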

The space of functions constructed by composing the elementary ones above is the hypothesis space. The goal is to compute a posterior distribution over hypotheses \(P(H|D)\) given an example string set \(D\), estimated using Bayes' rule: \[ P(H|D) \propto P(D|H) P(H) \]

The prior \(P(H)\) is given a priori by a probabilistic context-free grammar over expressions, and the other term, \(P(D|H)\), is estimated by maximizing the likelihood.
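With a small, finite set of candidate hypotheses, scoring them could look like the following sketch. The geometric size-based prior standing in for the PCFG, and the Monte Carlo estimate of \(P(D|H)\), are simplifying assumptions of mine, not the paper's procedure:

```python
import math
from collections import Counter

def unnormalized_log_posterior(hypothesis, size, data, n_samples=10_000):
    """Return log P(D|H) + log P(H) for one candidate hypothesis.

    - P(H): geometric in expression size, a crude stand-in for a PCFG prior.
    - P(D|H): Monte Carlo estimate from strings sampled from the hypothesis.
    """
    log_prior = size * math.log(0.5)
    counts = Counter("".join(hypothesis()) for _ in range(n_samples))
    log_lik = 0.0
    for s in data:
        p = counts[s] / n_samples
        log_lik += math.log(p) if p > 0 else -math.inf
    return log_prior + log_lik

# e.g., with toy_hypothesis from the sketch above:
# unnormalized_log_posterior(toy_hypothesis, size=5, data=["a", "aa"])
```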

### Innovations

- The authors propose a special definition of the likelihood \(P(D|H)\) that can be evaluated incrementally on partial strings, so a hypothesis \(H\) that generates strings similar, but not identical, to those in \(D\) still receives partial credit (see the first sketch after this list).
- Hypotheses can be constructed from multiple sub-hypotheses.
- For non-deterministic expressions, multiple execution paths are considered simultaneously (see the second sketch after this list).
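A toy version of the partial-credit idea, assuming a simple geometric noise model with parameter `alpha` (the paper's actual likelihood definition differs):

```python
def partial_credit_likelihood(generated, observed, alpha=0.95):
    """Toy likelihood giving partial credit for a matching prefix.

    Each matched prefix character contributes a factor alpha; every
    leftover character on either side is treated as noise and costs
    (1 - alpha). An exact match of length n therefore scores alpha**n.
    """
    k = 0
    for g, o in zip(generated, observed):
        if g != o:
            break
        k += 1
    leftover = (len(generated) - k) + (len(observed) - k)
    return alpha**k * (1 - alpha)**leftover
```

And a sketch of considering all execution paths at once: instead of sampling one outcome of each `flip()`, both branches are followed with their probabilities. This is hand-unrolled for the `toy_hypothesis` above; the paper handles arbitrary stochastic expressions generically.

```python
def toy_hypothesis_dist(p=0.5, max_depth=10):
    """Exact output distribution of toy_hypothesis, obtained by following
    both branches of every flip() instead of sampling a single path."""
    dist = {}
    prob_reach = 1.0                      # probability of reaching depth n
    for n in range(1, max_depth + 1):
        dist["a" * n] = prob_reach * p    # flip() is True: stop with "a" * n
        prob_reach *= 1 - p               # flip() is False: recurse deeper
    return dist  # mass on strings longer than max_depth is truncated
```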

## Comments

It is hard to tell how scalable the method is, and relatively few details are given about how the inference is actually performed.

I think it is very cool to see “constructive” models of language that can “compute” the language instead of simply predicting the next token statistically. It also feels like this approach has much better potential as a data-efficient language model.

## Bibliography

- Yang, Yuan, and Steven T. Piantadosi. 2022. "One Model for the Learning of Language". *Proceedings of the National Academy of Sciences* 119 (5): e2021865119. DOI: [10.1073/pnas.2021865119](https://doi.org/10.1073/pnas.2021865119).