AI, Native Supercomputing and The Revival of Moore's Law
This work tackles the fundamental problem of hardware inefficiency in AI for researchers and engineers, but it appears incremental as it builds on existing supercomputing concepts without proven results.
The paper proposes a new AI computing architecture that pairs a universal computer with a universal learning machine to address the slowdown of Moore's Law. It suggests that the learning machine can overcome that slowdown by understanding linear algebra natively and by using Collective Streaming to distribute data to its computing units and collect their results.
Based on Alan Turing's proposition on AI and computing machinery, which shaped computing as we know it today, the new AI computing machinery should comprise a universal computer and a universal learning machine. The latter should understand linear algebra natively to overcome the slowdown of Moore's Law. In such a universal learning machine, a computing unit does not need to carry the legacy of a universal computing core. Data can be distributed to the computing units, and results collected from them, through Collective Streaming, which is reminiscent of Collective Communication in supercomputing. Neither a GPU-like deep memory hierarchy nor a TPU-like fine-grained mesh is necessary.
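The abstract does not spell out how Collective Streaming works, so the following is only a minimal software sketch of the distribute-and-collect pattern it describes. It assumes the pattern resembles the scatter, broadcast, and gather steps of collective communication, and applies it to a matrix-vector product. The names `scatter_rows`, `unit_matvec`, and `collective_matvec` are hypothetical and do not come from the paper.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def scatter_rows(matrix: Matrix, num_units: int) -> List[Matrix]:
    """Split the rows of `matrix` into `num_units` contiguous blocks (scatter)."""
    base, extra = divmod(len(matrix), num_units)
    blocks, start = [], 0
    for unit in range(num_units):
        # The first `extra` units take one additional row each.
        size = base + (1 if unit < extra else 0)
        blocks.append(matrix[start:start + size])
        start += size
    return blocks


def unit_matvec(block: Matrix, x: Vector) -> Vector:
    """Work of one hypothetical computing unit: a local matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in block]


def collective_matvec(matrix: Matrix, x: Vector, num_units: int) -> Vector:
    """Assumed model: scatter rows, broadcast x, compute locally, gather in order."""
    blocks = scatter_rows(matrix, num_units)             # distribute the data
    partials = [unit_matvec(b, x) for b in blocks]       # each unit sees all of x
    return [y for part in partials for y in part]        # collect the results


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [1.0, 1.0]
result = collective_matvec(A, x, num_units=2)
print(result)
```

In this sketch the units run one after another in a plain loop. In the architecture the abstract envisions, they would presumably operate in parallel, with the streaming fabric taking the place of the list operations used here.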