ITJun 8, 2016
(Almost) Practical Tree CodesAnatoly Khina, Wael Halbawi, Babak Hassibi
We consider the problem of stabilizing an unstable plant driven by bounded noise over a digital noisy communication link, a scenario at the heart of networked control. To stabilize such a plant, one needs real-time encoding and decoding with an error probability profile that decays exponentially with the decoding delay. The works of Schulman and Sahai over the past two decades have developed the notions of tree codes and anytime capacity, and provided the theoretical framework for studying such problems. Nonetheless, there has been little practical progress in this area due to the absence of explicit constructions of tree codes with efficient encoding and decoding algorithms. Recently, linear time-invariant tree codes were proposed to achieve the desired result under maximum-likelihood decoding. In this work, we take one more step towards practicality, by showing that these codes can be efficiently decoded using sequential decoding algorithms, up to some loss in performance (and with some practical complexity caveats). We supplement our theoretical results with numerical simulations that demonstrate the effectiveness of the decoder in a control system setting.
SYOct 28, 2016
Multi-Rate Control over AWGN Channels via Analog Joint Source-Channel CodingAnatoly Khina, Gustav M. Pettersson, Victoria Kostina et al.
We consider the problem of controlling an unstable plant over an additive white Gaussian noise (AWGN) channel with a transmit power constraint, where the signaling rate of communication is larger than the sampling rate (for generating observations and applying control inputs) of the underlying plant. Such a situation is quite common since sampling is done at a rate that captures the dynamics of the plant and which is often much lower than the rate that can be communicated. This setting offers the opportunity of improving the system performance by employing multiple channel uses to convey a single message (output plant observation or control input). Common ways of doing so are through either repeating the message, or by quantizing it to a number of bits and then transmitting a channel coded version of the bits whose length is commensurate with the number of channel uses per sampled message. We argue that such "separated source and channel coding" can be suboptimal and propose to perform joint source-channel coding. Since the block length is short we obviate the need to go to the digital domain altogether and instead consider analog joint source-channel coding. For the case where the communication signaling rate is twice the sampling rate, we employ the Archimedean bi-spiral-based Shannon-Kotel'nikov analog maps to show significant improvement in stability margins and linear-quadratic Gaussian (LQG) costs over simple schemes that employ repetition.
26.5LGMay 23
Beyond Fixed Points: Superpolynomial Capacity of Asymmetric Hopfield NetworksAakash Kumar, Anatoly Khina, Frederik Mallmann-Trenn et al.
Classical Hopfield networks are limited to static patterns due to symmetric weights, whereas asymmetric networks can encode temporal sequences via limit-cycle attractors. Achieving high-capacity storage of long sequences in classical synchronous asymmetric networks, however, has remained a challenge. We present a simple and robust construction within the classical asymmetric Hopfield model with binary neurons and synchronous updates, that allows $n$ neurons to support $\exp\!\big(Ω(n/(\log n)^2)\big)$ distinct limit-cycle attractors, each with period $\exp\!\big(Ω(\sqrt n/\log n)\big)$ and robust to random noise with flip probability up to $\frac12-o(1)$, yielding superpolynomial capacity in both the number and length of stored sequences. This is the first demonstration of such capacity for asymmetric Hopfield networks, which we obtain by combining results from combinatorics, number theory and the analysis of opinion dynamics. Our findings show that synchronous asymmetric Hopfield networks possess a sequence-memory capacity which is larger and more robust than previously recognized, demonstrating that, in both biological and artificial neural systems, robust sequence representation can be achieved through coarse architectural motifs rather than complex nonlinearities.
LGDec 3, 2024
Batch Normalization DecomposedIdo Nachum, Marco Bondaschi, Michael Gastpar et al.
\emph{Batch normalization} is a successful building block of neural network architectures. Yet, it is not well understood. A neural network layer with batch normalization comprises three components that affect the representation induced by the network: \emph{recentering} the mean of the representation to zero, \emph{rescaling} the variance of the representation to one, and finally applying a \emph{non-linearity}. Our work follows the work of Hadi Daneshmand, Amir Joudaki, Francis Bach [NeurIPS~'21], which studied deep \emph{linear} neural networks with only the rescaling stage between layers at initialization. In our work, we present an analysis of the other two key components of networks with batch normalization, namely, the recentering and the non-linearity. When these two components are present, we observe a curious behavior at initialization. Through the layers, the representation of the batch converges to a single cluster except for an odd data point that breaks far away from the cluster in an orthogonal direction. We shed light on this behavior from two perspectives: (1) we analyze the geometrical evolution of a simplified indicative model; (2) we prove a stability result for the aforementioned~configuration.
LGNov 3, 2021
A Johnson--Lindenstrauss Framework for Randomly Initialized CNNsIdo Nachum, Jan Hązła, Michael Gastpar et al.
How does the geometric representation of a dataset change after the application of each randomly initialized layer of a neural network? The celebrated Johnson--Lindenstrauss lemma answers this question for linear fully-connected neural networks (FNNs), stating that the geometry is essentially preserved. For FNNs with the ReLU activation, the angle between two inputs contracts according to a known mapping. The question for non-linear convolutional neural networks (CNNs) becomes much more intricate. To answer this question, we introduce a geometric framework. For linear CNNs, we show that the Johnson--Lindenstrauss lemma continues to hold, namely, that the angle between two inputs is preserved. For CNNs with ReLU activation, on the other hand, the behavior is richer: The angle between the outputs contracts, where the level of contraction depends on the nature of the inputs. In particular, after one layer, the geometry of natural images is essentially preserved, whereas for Gaussian correlated inputs, CNNs exhibit the same contracting behavior as FNNs with ReLU activation.
SYSep 17, 2018
Learning-based attacks in cyber-physical systemsMohammad Javad Khojasteh, Anatoly Khina, Massimo Franceschetti et al.
We introduce the problem of learning-based attacks in a simple abstraction of cyber-physical systems---the case of a discrete-time, linear, time-invariant plant that may be subject to an attack that overrides the sensor readings and the controller actions. The attacker attempts to learn the dynamics of the plant and subsequently override the controller's actuation signal, to destroy the plant without being detected. The attacker can feed fictitious sensor readings to the controller using its estimate of the plant dynamics and mimic the legitimate plant operation. The controller, on the other hand, is constantly on the lookout for an attack; once the controller detects an attack, it immediately shuts the plant off. In the case of scalar plants, we derive an upper bound on the attacker's deception probability for any measurable control policy when the attacker uses an arbitrary learning algorithm to estimate the system dynamics. We then derive lower bounds for the attacker's deception probability for both scalar and vector plants by assuming a specific authentication test that inspects the empirical variance of the system disturbance. We also show how the controller can improve the security of the system by superimposing a carefully crafted privacy-enhancing signal on top of the "nominal control policy." Finally, for nonlinear scalar dynamics that belong to the Reproducing Kernel Hilbert Space (RKHS), we investigate the performance of attacks based on nonlinear Gaussian-processes (GP) learning algorithms.
SYSep 13, 2018
Algorithms for Optimal Control with Fixed-Rate FeedbackAnatoly Khina, Yorie Nakahira, Yu Su et al.
We consider a discrete-time linear quadratic Gaussian networked control setting where the (full information) observer and controller are separated by a fixed-rate noiseless channel. The minimal rate required to stabilize such a system has been well studied. However, for a given fixed rate, how to quantize the states so as to optimize performance is an open question of great theoretical and practical significance. We concentrate on minimizing the control cost for first-order scalar systems. To that end, we use the Lloyd-Max algorithm and leverage properties of logarithmically-concave functions and sequential Bayesian filtering to construct the optimal quantizer that greedily minimizes the cost at every time instant. By connecting the globally optimal scheme to the problem of scalar successive refinement, we argue that its gain over the proposed greedy algorithm is negligible. This is significant since the globally optimal scheme is often computationally intractable. All the results are proven for the more general case of disturbances with logarithmically-concave distributions and rate-limited time-varying noiseless channels. We further extend the framework to event-triggered control by allowing to convey information via an additional "silent symbol", i.e., by avoiding transmitting bits; by constraining the minimal probability of silence we attain a tradeoff between the transmission rate and the control cost for rates below one bit per sample.