In-network Neural Networks
This work addresses low-latency, high-throughput neural network inference in networked systems. Its contribution is incremental: it adapts existing network hardware to run neural network models.
The paper tackles the problem of implementing neural networks directly in network hardware. It presents N2Net, a system that runs binary neural networks on commodity switching chips at packet processing speeds (billions of packets per second). The results demonstrate feasibility for simple models and suggest that more complex models could be supported with minor chip modifications.
We present N2Net, a system that implements binary neural networks using commodity switching chips deployed in network switches and routers. Our system shows that these devices can run simple neural network models, whose input is encoded in the packet header, at packet processing speeds (billions of packets per second). Furthermore, our experience highlights that switching chips could support even more complex models, provided that some minor and cheap modifications are made to the chip's design. We believe N2Net provides an interesting building block for future end-to-end networked systems.
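To make the idea of a binary neural network operating on header bits concrete, the following is a minimal sketch of the standard XNOR-and-popcount formulation of a binary neuron, written in Python for readability. It is an illustration under stated assumptions, not N2Net's implementation: the function names, the 4-bit input width, and the majority threshold are all hypothetical, and the abstract does not specify how the operations are mapped onto a switching chip's pipeline.

```python
from typing import Sequence


def popcount(x: int) -> int:
    """Return the number of set bits in a non-negative integer."""
    return bin(x).count("1")


def binary_neuron(inputs: int, weights: int, n_bits: int) -> int:
    """Evaluate one binary neuron over n_bits packed input bits.

    Each bit encodes a value in {-1, +1} as {0, 1}. XNOR marks the
    positions where input and weight agree, and the neuron fires (returns 1)
    when at least half of the positions agree. The threshold is an assumption
    made for this sketch.
    """
    mask = (1 << n_bits) - 1
    agreement = ~(inputs ^ weights) & mask  # bitwise XNOR, limited to n_bits
    return 1 if 2 * popcount(agreement) >= n_bits else 0


def binary_layer(inputs: int, weight_rows: Sequence[int], n_bits: int) -> int:
    """Evaluate a layer of binary neurons and pack the outputs into an int.

    Neuron i writes its output to bit i of the result.
    """
    out = 0
    for i, weights in enumerate(weight_rows):
        out |= binary_neuron(inputs, weights, n_bits) << i
    return out


if __name__ == "__main__":
    # Hypothetical 4-bit input, standing in for bits taken from a header field.
    header_bits = 0b1011
    layer_weights = [0b1011, 0b0100, 0b1010]
    print(bin(binary_layer(header_bits, layer_weights, n_bits=4)))
```

Every step here is a bitwise operation, a bit count, or a comparison on a fixed-width field, with no multiplications or floating-point arithmetic, which is the property that makes binary models a plausible fit for packet-processing hardware.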