Gianni Antichi

10.0 · NI · May 29

Laurin Brandner, Ayush Mishra, Sebastiano Miano et al.

Service meshes have recently emerged as the de facto standard for deploying microservices. Conceptually, they provide a uniform abstraction for inter-process communication (IPC) between services by implementing common networking mechanisms -- such as encryption, routing, and load balancing -- and by allowing these mechanisms to be configured and composed through high-level policies. Supporting these policies, however, comes at a significant performance cost: service meshes interpose proxies (``sidecars'') on the data path, which leads to numerous context switches. This paper presents L7FP, a fast path for service meshes that can enforce the vast majority of application-layer policies seen in the wild directly in kernel space. Given high-level policies, L7FP automatically synthesizes an eBPF-based data plane that enforces them in the kernel. L7FP accelerates existing microservices without any code modification, and transparently falls back to existing service proxies (the slow path) for the few unsupported policies. We fully implemented L7FP, with support for both TLS and HTTP/2. Compared to state-of-the-art service meshes, L7FP reduces the median request latency of realistic applications by up to $6\times$ while sustaining $3\times$ more throughput.
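The fast-path/slow-path split described in the abstract can be illustrated with a minimal Python sketch. It is not L7FP's implementation: the policy kinds, the `Policy` type, and the `dispatch` function are hypothetical names chosen for illustration, and the real system synthesizes an eBPF data plane rather than making this decision in user space. The sketch only shows the idea that a service stays on the fast path when every policy attached to it is supported, and otherwise falls back to the proxy.

```python
from dataclasses import dataclass

# Assumption: an illustrative set of policy kinds the fast path supports.
# The abstract does not enumerate the supported policies.
FAST_PATH_KINDS = {"route", "load_balance", "header_match"}


@dataclass(frozen=True)
class Policy:
    kind: str     # hypothetical policy category
    service: str  # service the policy applies to


def dispatch(service, policies):
    """Return "fast" if every policy for `service` is supported, else "slow"."""
    relevant = [p for p in policies if p.service == service]
    unsupported = [p for p in relevant if p.kind not in FAST_PATH_KINDS]
    # Any unsupported policy sends the service's traffic to the proxy (slow path).
    return "slow" if unsupported else "fast"


policies = [
    Policy("route", "cart"),
    Policy("load_balance", "cart"),
    Policy("route", "checkout"),
    Policy("custom_wasm_filter", "checkout"),  # hypothetical unsupported kind
]

print(dispatch("cart", policies))      # all policies supported
print(dispatch("checkout", policies))  # one policy unsupported
```

In this sketch the fallback decision is made per service; whether a real fast path decides per service, per connection, or per request is a design choice the abstract does not specify.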

DC · Sep 4, 2020

Running Neural Networks on the NIC

Giuseppe Siracusano, Salvator Galea, Davide Sanvito et al.

In this paper we show that the data plane of commodity programmable Network Interface Cards (NICs) can run neural network inference tasks required by packet monitoring applications, with low overhead. This is particularly important as the data transfer costs to the host system and dedicated machine learning accelerators, e.g., GPUs, can be more expensive than the processing task itself. We design and implement our system -- N3IC -- on two different NICs and we show that it can greatly benefit three different network monitoring use cases that require machine learning inference as a first-class primitive. N3IC can perform inference for millions of network flows per second, while forwarding traffic at 40 Gb/s. Compared to an equivalent solution implemented on a general-purpose CPU, N3IC can provide 100x lower processing latency, with a 1.5x increase in throughput.

2 Papers