# Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data by Bender, E. M., & Koller, A. (2020)

source
(Bender and Koller 2020)
tags
NLP, Artificial Intelligence, Evaluating NLP

## Summary

The main point of the article can be summarized with this quote from the paper:

> We argue that the language modeling task, because it only uses form as training data, cannot in principle lead to learning of meaning. We take the term language model to refer to any system trained only on the task of string prediction, whether it operates over characters, words or sentences, and sequentially or not. We take (linguistic) meaning to be the relation between a linguistic form and communicative intent.

Several NLP papers (cited in the text) make overly confident claims about language models "understanding" a piece of text or knowledge.

The authors formalize meaning as a set of pairs $$M \subseteq E \times I$$, where each pair $$(e, i)$$ consists of a natural language expression $$e$$ and the communicative intent $$i$$ it expresses. In this framework, understanding means being able to retrieve $$i$$ when given $$e$$.

They also introduce the concept of conventional meaning: the communicative potential $$s$$ of a form $$e$$, which stays constant across the contexts in which $$e$$ is used.

When communicating, a speaker starts from an intent $$i$$ and has to choose a form $$e$$ whose conventional potential $$s$$ fits $$i$$. The paper's argument is that a corpus of text alone does not carry enough signal to learn the relation $$M$$.
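This formal setup can be sketched as a toy Python lookup; the example expressions and intent labels are invented for illustration and are not from the paper:

```python
# Toy sketch of the paper's formalization; expressions and intent labels
# are invented for illustration.
# Meaning M is a relation between expressions E and communicative intents I.
M = {
    ("Could you pass the salt?", "request(pass, salt)"),
    ("It's cold in here.", "request(close, window)"),
    ("It's cold in here.", "assert(temperature, low)"),  # one form, several intents
}

def understand(expression):
    """Understanding e means recovering the intent(s) i such that (e, i) is in M."""
    return {i for (e, i) in M if e == expression}

# A language model is trained only on the left-hand side (the forms);
# nothing in that training signal pins down the intents on the right.
forms_only = {e for (e, _) in M}
```

The sketch also shows why the relation is hard: a single form can map to several intents depending on context, yet only the forms appear in the training data.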

A nice example is Searle's Chinese Room thought experiment, in which a person answers questions in Chinese by consulting a library of Chinese books according to fixed rules. Such a person would appear competent without having any actual understanding of Chinese.

Another nice example is training language models on programming languages. No matter how well an LM captures the surface form of the code, it has no way of learning the relation between program inputs and outputs, because execution results never appear in the training data.
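A minimal Python sketch of this point, with an invented toy corpus (the names and examples are not from the paper):

```python
# Illustrative sketch (corpus invented): the training data of a code LM
# contains only the *form* of programs, never their behavior.
corpus = [
    "def add(a, b): return a + b",
    "def mul(a, b): return a * b",
]

# A string-prediction model can learn surface regularities of the form,
# e.g. which token tends to follow which:
bigrams = [
    (u, v)
    for line in corpus
    for u, v in zip(line.split(), line.split()[1:])
]

# But the meaning of add -- the relation between inputs and outputs, such as
# (2, 3) mapping to 5 -- only comes into existence when the form is executed,
# and executions are not part of a corpus of strings.
namespace = {}
exec(corpus[0], namespace)  # stepping outside pure form: actually running the code
result = namespace["add"](2, 3)
```

The `exec` step is exactly what a system trained on string prediction alone never gets to do: it models the distribution of forms, while the input/output relation lives outside the corpus.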

The authors provide more examples of why human language acquisition, which is grounded in interaction with other people and with the world, is fundamentally different from the way language models learn.

A good LM might learn to give seemingly meaningful answers, but it would need to store an unbounded amount of data and would ultimately fail to keep up with the ever-evolving set of forms humans use.