This paper introduces interesting ideas for training cellular automata, expressed as CNNs, to grow stable, self-repairing structures. The automata have 16-dimensional continuous states. The main modeling ideas are:
- Use hard-coded filters for the initial perception step. The filters are two Sobel convolutions (horizontal and vertical gradients), and their outputs are concatenated with the current state.
- Update rules are then 1x1 convolutions (the same small network applied independently to every cell) over the \(3 \times 16 = 48\) dimensional perception vector. The network has dimensions 48 -> 128 -> 16.
- Each cell is updated with probability 0.5 at each step, which makes the updates highly asynchronous.
- One of the 16 channels is an \(\alpha\) channel that determines whether the cell is alive or dead. The threshold for a cell to count as alive is 0.1. Dead cells have their state manually set to 0 at each step.
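The four ingredients listed above can be put together in a few lines of NumPy. This is only a minimal sketch of my reading of the update rule, not the authors' code: the function names, the index of the \(\alpha\) channel, the Sobel normalisation and the zero padding are all assumptions, and the alive test looks only at a cell's own \(\alpha\) value, as described above.

```python
import numpy as np

CHANNELS = 16           # state size per cell
ALPHA = 3               # index of the alpha channel (assumption)
ALIVE_THRESHOLD = 0.1   # alpha above this value means "alive"
FIRE_RATE = 0.5         # probability that a cell updates at a given step

# Fixed (non-learned) perception filters; the 1/8 scaling is an assumption.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32) / 8.0
SOBEL_Y = SOBEL_X.T


def depthwise_filter(state, kernel):
    """Apply one 3x3 kernel to every channel of an (H, W, C) grid, zero-padded."""
    h, w, _ = state.shape
    padded = np.pad(state, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(state)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return out


def perceive(state):
    """Concatenate the state with its two Sobel responses: 3 * 16 = 48 values per cell."""
    return np.concatenate(
        [state, depthwise_filter(state, SOBEL_X), depthwise_filter(state, SOBEL_Y)],
        axis=-1,
    )


def step(state, w1, b1, w2, b2, rng):
    """One CA step: perceive, run the per-cell 48 -> 128 -> 16 network, update stochastically."""
    hidden = np.maximum(perceive(state) @ w1 + b1, 0.0)   # ReLU hidden layer
    delta = hidden @ w2 + b2                              # proposed state change
    fire = rng.random(state.shape[:2] + (1,)) < FIRE_RATE # which cells update this step
    new_state = state + delta * fire
    alive = new_state[..., ALPHA:ALPHA + 1] > ALIVE_THRESHOLD
    return new_state * alive                              # dead cells are reset to 0
```

Because the matrix products act on the last axis only, the same weights are applied to every cell independently, which is what a 1x1 convolution does.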
The authors then apply several training tricks to make the patterns more robust, self-repairing, etc.
I find the paper quite interesting, especially its take on the CA update as a CNN. The fixed convolutions restrict the space of possible rules, which probably makes the search process more stable.
Some of the modeling choices, such as the size of the downstream neural network in the 1x1 convolutions and the asynchronous updates, are not clearly justified. The final quantization step to make the whole thing work in browsers is particularly interesting to me: they end up with a CA that has \(16 \times 8 = 128\)-bit states. Or maybe a 120-bit state plus an 8-bit semi-independent alive/dead state. That is \(2^{128} \approx 3.4 \times 10^{38}\) states (or \(2^{120} \approx 1.3 \times 10^{36}\) if the alive/dead channel is counted separately), which is many orders of magnitude larger than my experiments.
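The size of the state space is easy to check with exact integer arithmetic. The sketch below assumes 8 bits per channel, as in the quantized version discussed above; the variable and function names are mine.

```python
CHANNELS = 16
BITS_PER_CHANNEL = 8   # assumption: every channel is quantized to 8 bits

total_bits = CHANNELS * BITS_PER_CHANNEL                 # 128 bits per cell
total_states = 2 ** total_bits                           # all 16 channels
non_alpha_states = 2 ** (total_bits - BITS_PER_CHANNEL)  # 120 bits, alpha excluded


def order_of_magnitude(n):
    """Power of ten of a positive integer, computed exactly from its digit count."""
    return len(str(n)) - 1
```

Here `order_of_magnitude(total_states)` gives 38 and `order_of_magnitude(non_alpha_states)` gives 36, so the full 128-bit state has on the order of \(10^{38}\) values per cell and the 120-bit part on the order of \(10^{36}\).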
The very last paragraph I particularly like:
Engineering and machine learning
The models described in this article run on the powerful GPU of a modern computer or a smartphone. Yet, let’s speculate about what a “more physical” implementation of such a system could look like. We can imagine it as a grid of tiny independent computers, simulating individual cells. Each of those computers would require approximately 10Kb of ROM to store the “cell genome”: neural network weights and the control code, and about 256 bytes of RAM for the cell state and intermediate activations. The cells must be able to communicate their 16-value state vectors to neighbors. Each cell would also require an RGB-diode to display the color of the pixel it represents. A single cell update would require about 10k multiply-add operations and does not have to be synchronised across the grid. We propose that cells might wait for random time intervals between updates. The system described above is uniform and decentralised. Yet, our method provides a way to program it to reach the predefined global state, and recover this state in case of multi-element failures and restarts. We therefore conjecture this kind of modeling may be used for designing reliable, self-organising agents. On the more theoretical machine learning front, we show an instance of a decentralized model able to accomplish remarkably complex tasks. We believe this direction to be opposite to the more traditional global modeling used in the majority of contemporary work in the deep learning field, and we hope this work to be an inspiration to explore more decentralized learning modeling.
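As a rough sanity check on the figures in the quoted paragraph, the cost of one cell update can be counted from the 48 -> 128 -> 16 network plus the two Sobel filters. This is a back-of-the-envelope sketch under my own assumptions (both layers have biases, and each Sobel filter costs 9 multiply-adds per channel), not a figure taken from the paper.

```python
PERCEPTION = 48   # 3 * 16 values seen by each cell
HIDDEN = 128
CHANNELS = 16


def dense_macs(n_in, n_out):
    """Multiply-adds of a dense layer (equivalently, a 1x1 convolution at one cell)."""
    return n_in * n_out


sobel_macs = 2 * 9 * CHANNELS   # two 3x3 filters applied to each of the 16 channels
network_macs = dense_macs(PERCEPTION, HIDDEN) + dense_macs(HIDDEN, CHANNELS)
total_macs = sobel_macs + network_macs

# Weights plus biases; biases on both layers are an assumption.
n_params = network_macs + HIDDEN + CHANNELS
```

This gives 8192 multiply-adds for the network, 8480 including perception, and 8336 parameters, which is consistent with "about 10k multiply-add operations" and, if each weight is stored in roughly one byte, with a ROM budget of around 10 KB including control code.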