RO Jul 16, 2022
Physics Embedded Neural Network Vehicle Model and Applications in Risk-Aware Autonomous Driving Using Latent Features
Taekyung Kim, Hojin Lee, Wonsuk Lee
Non-holonomic vehicle motion has been studied extensively using physics-based models. Common approaches when using these models interpret the wheel/ground interactions using a linear tire model and thus may not fully capture the nonlinear and complex dynamics under various environments. On the other hand, neural network models have been widely employed in this domain, demonstrating powerful function approximation capabilities. However, these black-box learning strategies completely abandon the existing knowledge of well-known physics. In this paper, we seamlessly combine deep learning with a fully differentiable physics model to endow the neural network with available prior knowledge. The proposed model shows better generalization performance than the vanilla neural network model by a large margin. We also show that the latent features of our model can accurately represent lateral tire forces without the need for any additional training. Lastly, we develop a risk-aware model predictive controller using proprioceptive information derived from the latent features. We validate our idea in two autonomous driving tasks under unknown friction, outperforming the baseline control framework.
66.7 AR Mar 27
Data Gravity and the Energy Limits of Computation
Wonsuk Lee, Jehoshua Bruck
Unlike the von Neumann architecture, which separates computation from memory, the brain tightly integrates them, an organization that large language models increasingly resemble. The crucial difference lies in the ratio of energy spent on computation versus data access: in the brain, most energy fuels compute, while in von Neumann architectures, data movement dominates. To capture this imbalance, we introduce the \emph{operation-operand disjunction constant} $G_d$, a dimensionless measure of the energy required for data transport relative to computation. As part of this framework, we propose the metaphor of \emph{data gravity}: just as mass exerts gravitational pull, large and frequently accessed data sets attract computation. We develop expressions for optimal computation placement and show that bringing the computation closer to the data can reduce energy consumption by a factor of $G_d^{(\beta - 1)/2}$, where $\beta \in (1, 3)$ captures the empirically observed distance-dependent energy scaling. We demonstrate that these findings are consistent with measurements across processors from 45\,nm to 7\,nm, as well as with results from processing-in-memory (PIM) architectures. High $G_d$ values are limiting; as $G_d$ increases, the energy required for data movement threatens to stall progress, slowing the scaling of large language models and pushing modern computing toward a plateau. Unless computation is realigned with data gravity, the growth of AI may be capped not by algorithms but by physics.
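To get a feel for the stated reduction factor $G_d^{(\beta - 1)/2}$, the short sketch below simply evaluates it. The function name and the sample values of $G_d$ and $\beta$ are hypothetical, chosen only for illustration; they are not measurements from the paper.

```python
def energy_reduction_factor(g_d: float, beta: float) -> float:
    """Evaluate G_d ** ((beta - 1) / 2), the factor quoted in the abstract."""
    if g_d <= 0:
        raise ValueError("G_d must be positive")
    if not 1 < beta < 3:
        raise ValueError("beta must lie in the open interval (1, 3)")
    return g_d ** ((beta - 1) / 2)


# Hypothetical inputs for illustration only (not values from the paper).
for g_d in (10.0, 100.0, 1000.0):
    for beta in (1.5, 2.0, 2.5):
        factor = energy_reduction_factor(g_d, beta)
        print(f"G_d={g_d:7.1f}  beta={beta:.1f}  factor={factor:8.2f}")
```

Because the exponent $(\beta - 1)/2$ lies strictly between 0 and 1 for $\beta \in (1, 3)$, the factor grows sublinearly in $G_d$ and approaches $G_d$ itself only as $\beta$ approaches 3.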
65.1 CL Apr 6
Compressible Softmax-Attended Language under Incompressible Attention
Wonsuk Lee
Across every attention head in five transformer language models (124M--7B parameters, four architecture families), the logit energy field $\tilde{E}$ reaches 90\% of its variance in 2--11 singular components. The \emph{learned} interaction matrix $W_Q^\mathrm{T} W_K$ needs 38--75 components for the same threshold out of $d_h \in \{64, 128\}$. The spectral gap is $5$--$25\times$ in effective rank. The attention mechanism allocates capacity uniformly across all $d_h$ dimensions, but language concentrates the actual interaction into a few. The compressibility of softmax-attended language is a property of the data, not the frame that analyzes it.
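The "components needed to reach 90\% of the variance" measurement can be sketched with a singular value decomposition. This is a minimal illustration, not the paper's code: the function name is hypothetical, and treating squared singular values as the variance carried by each component is an assumption about how the threshold is computed.

```python
import numpy as np


def components_for_variance(matrix: np.ndarray, threshold: float = 0.90) -> int:
    """Smallest number of singular components whose squared singular
    values sum to at least `threshold` of the total (assumed definition)."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    energy = singular_values ** 2
    total = energy.sum()
    if total == 0:
        return 0
    cumulative = np.cumsum(energy) / total
    # First index where the cumulative share reaches the threshold, as a count.
    count = int(np.searchsorted(cumulative, threshold) + 1)
    return min(count, len(singular_values))


# Synthetic example: one dominant direction plus weak noise.
rng = np.random.default_rng(0)
dominant = np.outer(rng.standard_normal(50), rng.standard_normal(50))
noisy = dominant + 0.01 * rng.standard_normal((50, 50))
print("components for 90% variance:", components_for_variance(noisy))
```

Applied to a head's row-centered logit matrix and to $W_Q^\mathrm{T} W_K$, the same count would give the two sides of the comparison described above.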
34.6 LG Apr 6
On the Invariants of Softmax Attention
Wonsuk Lee
Softmax attention maps every query--key interaction into a probability distribution, but the underlying structure remains largely unexplored. We define the \emph{energy field}, the row-centered attention logit, and show that it exhibits invariant properties across models, architectures, and inputs. Two classes of invariants emerge. \emph{Mechanism-level} invariants follow from the algebraic structure of softmax attention. They include a per-row zero-sum constraint, a rank bound determined by the head dimension, and spectral signatures that follow from them. \emph{Model-level} regularities are not required by the mechanism, yet hold in every autoregressive language model we test, spanning several architecture families. The energy field distributes its variance over key positions without concentrating at a few. This delocalization traces to a property of the key matrix we call \emph{key incoherence}. These invariants have practical consequences. The rank bound confines the energy field to a low-dimensional subspace. Key incoherence yields a per-head training monitor. All results are verified at multiple context lengths and input texts.
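The mechanism-level invariants can be checked numerically on random data. The sketch below assumes a single head with scaled dot-product logits and no causal mask; the variable names, the sizes, and the $\sqrt{d_h}$ scaling are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def energy_field(logits: np.ndarray) -> np.ndarray:
    """Row-centered attention logits: subtract each row's mean."""
    return logits - logits.mean(axis=1, keepdims=True)


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable row-wise softmax."""
    shifted = x - x.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_tokens, d_h = 32, 4                      # hypothetical sizes
Q = rng.standard_normal((n_tokens, d_h))
K = rng.standard_normal((n_tokens, d_h))
logits = Q @ K.T / np.sqrt(d_h)
E = energy_field(logits)

row_sums = E.sum(axis=1)                               # zero-sum per row
same_attention = np.allclose(softmax(E), softmax(logits))
rank_E = np.linalg.matrix_rank(E, tol=1e-8)            # at most d_h here

print("max |row sum|:", np.abs(row_sums).max())
print("softmax unchanged by centering:", same_attention)
print("rank of energy field:", rank_E, "with d_h =", d_h)
```

Row-centering is a right-multiplication by the centering matrix $I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\mathrm{T}$, so it cannot raise the rank above that of $QK^\mathrm{T}$, which is at most $d_h$. Softmax is invariant to per-row shifts, so the attention weights are unchanged.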
RO May 1, 2023
Learning Terrain-Aware Kinodynamic Model for Autonomous Off-Road Rally Driving With Model Predictive Path Integral Control
Hojin Lee, Taekyung Kim, Jungwi Mun et al.
High-speed autonomous driving in off-road environments has immense potential for various applications, but it also presents challenges due to the complexity of vehicle-terrain interactions. In such environments, it is crucial for the vehicle to predict its motion and adjust its controls proactively in response to environmental changes, such as variations in terrain elevation. To this end, we propose a method for learning a terrain-aware kinodynamic model that is conditioned on both proprioceptive and exteroceptive information. The proposed model generates reliable predictions of 6-degree-of-freedom motion and can even estimate contact interactions without requiring ground truth force data during training. This enables the design of a safe and robust model predictive controller through a cost function that penalizes sampled trajectories with unstable motion, unsafe interactions, and high levels of uncertainty derived from the model. We demonstrate the effectiveness of our approach through experiments on a simulated off-road track, showing that our proposed model-controller pair outperforms the baseline and ensures robust high-speed driving performance without control failure.
RO Jan 20, 2022
TOAST: Trajectory Optimization and Simultaneous Tracking using Shared Neural Network Dynamics
Taekyung Kim, Hojin Lee, Seongil Hong et al.
Neural networks have been increasingly employed in model predictive control (MPC) to control nonlinear dynamic systems. However, MPC still suffers from an achievable update rate that is insufficient to cope with model uncertainty and external disturbances. In this paper, we present a novel control scheme that designs an optimal tracking controller using the neural network dynamics of the MPC, so that it can be applied as a plug-and-play extension to any existing model-based feedforward controller. We also describe how our method handles a neural network containing history information, which does not follow a general form of dynamics. We evaluate the proposed method on classical control benchmarks with external disturbances. We also extend our control framework to an aggressive autonomous driving task with unknown friction. In all experiments, our method outperformed the compared methods by a large margin. Our controller also showed low control chattering levels, demonstrating that our feedback controller does not interfere with the optimal command of MPC.
RO Dec 18, 2021
Smooth Model Predictive Path Integral Control without Smoothing
Taekyung Kim, Gyuhyun Park, Kiho Kwak et al.
We present a sampling-based control approach that can generate smooth actions for general nonlinear systems without external smoothing algorithms. Model Predictive Path Integral (MPPI) control has been utilized in numerous robotic applications owing to its appealing characteristics for solving non-convex optimization problems. However, the stochastic nature of sampling-based methods can cause significant chattering in the resulting commands. Chattering becomes more prominent in cases where the environment changes rapidly, possibly even causing the MPPI to diverge. To address this issue, we propose a method that seamlessly combines MPPI with an input-lifting strategy. In addition, we introduce a new action cost to smooth the control sequence during trajectory rollouts while preserving the information theoretic interpretation of MPPI, which was derived from non-affine dynamics. We validate our method in two nonlinear control tasks with neural network dynamics: a pendulum swing-up task and a challenging autonomous driving task. The experimental results demonstrate that our method outperforms the MPPI baselines with additionally applied smoothing algorithms.
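The idea of combining MPPI with input lifting can be sketched on a toy problem: the controller samples noise on the action rate, and the action itself becomes part of the state and is obtained by integration. Everything below is an illustrative assumption rather than the authors' implementation: the toy dynamics ($\dot{x} = a$, $\dot{a} = v$), the quadratic tracking cost, the hyperparameters, and all function names are hypothetical. The control-cost term of standard MPPI and the paper's new action cost are both omitted.

```python
import numpy as np


def mppi_weights(costs: np.ndarray, temperature: float) -> np.ndarray:
    """Softmin weights over rollout costs (the usual MPPI weighting)."""
    shifted = costs - costs.min()          # subtract the minimum for stability
    weights = np.exp(-shifted / temperature)
    return weights / weights.sum()


def lifted_rollout_cost(x0, a0, rates, dt, target):
    """Cost of each sampled rate sequence under x' = a, a' = v (toy model)."""
    num_samples, horizon = rates.shape
    x = np.full(num_samples, float(x0))
    a = np.full(num_samples, float(a0))
    cost = np.zeros(num_samples)
    for t in range(horizon):
        a = a + rates[:, t] * dt           # action is integrated, not sampled
        x = x + a * dt
        cost += (x - target) ** 2
    return cost


def lifted_mppi_step(x0, a0, nominal_rates, rng, num_samples=256,
                     sigma=1.0, temperature=1.0, dt=0.05, target=1.0):
    """One MPPI update of the nominal action-rate sequence."""
    noise = rng.normal(0.0, sigma, size=(num_samples, nominal_rates.shape[0]))
    costs = lifted_rollout_cost(x0, a0, nominal_rates + noise, dt, target)
    weights = mppi_weights(costs, temperature)
    return nominal_rates + weights @ noise


# Closed-loop demo on the toy system.
rng = np.random.default_rng(0)
horizon, dt = 20, 0.05
x, a = 0.0, 0.0
nominal = np.zeros(horizon)
actions = []
for _ in range(100):
    nominal = lifted_mppi_step(x, a, nominal, rng, dt=dt)
    a += nominal[0] * dt                   # apply the first rate to the action
    x += a * dt
    actions.append(a)
    nominal = np.append(nominal[1:], 0.0)  # shift the plan by one step
print("final position:", x)
print("largest action change per step:", np.abs(np.diff(actions)).max())
```

In this lifted form, each step changes the applied action by only the rate times `dt`, so sampling noise in the rate does not appear directly as jumps in the command.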