Natural language processing (NLP) is about creating algorithms that can manipulate and use language. It is often thought that NLP algorithms that provably “understand” language would be equivalent to human-level artificial intelligence.
There are several common ways to encode text data:
- One-hot encoding of characters
- One-hot encoding of words
- Byte-pair encoding (BPE), which operates on subword units and can be seen as a compromise between the two
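
The three encodings above can be compared on a tiny example. The following is a minimal sketch in plain Python, not a reference implementation: the helper names `one_hot` and `bpe_merges` and the sample string are made up for illustration, and the BPE part is deliberately simplified (it ignores word frequencies and end-of-word markers, which practical tokenizers usually handle).

```python
from collections import Counter


def one_hot(items, vocab):
    """Return one one-hot vector (list of 0/1) per item, indexed by vocab order."""
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for item in items:
        vec = [0] * len(vocab)
        vec[index[item]] = 1
        vectors.append(vec)
    return vectors


text = "low lower lowest"  # illustrative sample text

# Character level: small vocabulary, long sequences.
chars = list(text)
char_vocab = sorted(set(chars))
char_vectors = one_hot(chars, char_vocab)

# Word level: short sequences, but the vocabulary grows with the corpus.
words = text.split()
word_vocab = sorted(set(words))
word_vectors = one_hot(words, word_vocab)


def bpe_merges(words, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    corpus = [list(w) for w in words]  # start from single characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in corpus:
            pairs.update(zip(symbols, symbols[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # ties: first pair seen wins
        merges.append(best)
        merged = best[0] + best[1]
        new_corpus = []
        for symbols in corpus:
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_corpus.append(out)
        corpus = new_corpus
    return merges, corpus


merges, segmented = bpe_merges(words, num_merges=2)

print(len(char_vocab), len(chars))  # 8 distinct characters, 16 positions
print(word_vectors)                 # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(merges)                       # [('l', 'o'), ('lo', 'w')]
print(segmented)                    # [['low'], ['low', 'e', 'r'], ['low', 'e', 's', 't']]
```

After two merges the shared stem `low` becomes a single symbol while the rarer suffixes stay split into characters, which is the sense in which BPE sits between character-level and word-level encoding.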