This is done by outputting an extra halting probability at each update step, and considering two timelines:
- the input timeline, which plays the role of an outer loop: at each of its steps, a new input symbol is fed to the RNN, and the step produces a single output vector.
- the internal processing timeline, which is the inner loop run at each input step: it runs until the cumulative halting probability exceeds a threshold, and it emits one output value per inner step.
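
The two nested loops can be sketched in plain Python as follows. This is a minimal illustration and not a reference implementation: `toy_cell` is a hypothetical scalar stand-in for the RNN update, and the names `ponder` and `run`, the `threshold` and `max_steps` values, and the choice to merge the inner outputs into one value by a halting-weighted average are all assumptions made for the example. It also carries the last inner state forward, which is a simplification.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_cell(state, x):
    """Hypothetical RNN update: returns (new_state, output, halting_prob)."""
    new_state = math.tanh(0.5 * state + x)
    output = new_state
    halt = sigmoid(2.0 * new_state)  # the extra halting probability
    return new_state, output, halt


def ponder(state, x, cell=toy_cell, threshold=0.99, max_steps=10):
    """Inner loop: repeat the update on the same input until halting."""
    outputs, weights = [], []
    cumulative = 0.0
    for n in range(max_steps):
        state, out, halt = cell(state, x)
        outputs.append(out)  # one output value per inner step
        if cumulative + halt >= threshold or n == max_steps - 1:
            # Last step gets the remaining mass so the weights sum to 1.
            weights.append(1.0 - cumulative)
            break
        weights.append(halt)
        cumulative += halt
    # Assumption: combine inner outputs into the single per-input output.
    combined = sum(w * o for w, o in zip(weights, outputs))
    return state, combined, len(outputs)


def run(inputs, state=0.0):
    """Outer loop: one new input symbol per step, one output per step."""
    results = []
    for x in inputs:
        state, y, n_steps = ponder(state, x)
        results.append((y, n_steps))
    return results
```

Calling `run([0.1, -1.0, 2.0])` returns one `(output, inner_step_count)` pair per input symbol, so the number of inner steps can differ from one input to the next. The `max_steps` cap is there only to guarantee termination when the halting probabilities stay small.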