CV | Dec 6, 2019
SAM: Squeeze-and-Mimic Networks for Conditional Visual Driving Policy Learning
Albert Zhao, Tong He, Yitao Liang et al.
We describe a policy learning approach that maps visual inputs to driving controls, conditioned on a turning command, and leverages side tasks on semantics and object affordances via a learned representation trained for driving. To learn this representation, we train a squeeze network to drive using annotations for the side task as input. This representation encodes the driving-relevant information associated with the side task while ideally throwing out side task-relevant but driving-irrelevant nuisances. We then train a mimic network to drive using only images as input and use the squeeze network's latent representation to supervise the mimic network via a mimicking loss. Notably, we do not aim to achieve the side task nor to learn features for it; instead, we aim to learn, via the mimicking loss, a representation of the side task annotations directly useful for driving. We test our approach using the CARLA simulator. In addition, we introduce a more challenging but realistic evaluation protocol in which a run that reaches the destination counts as successful only if it does not violate common traffic rules. A video summarizing this work is available at https://youtu.be/ipKAMzmJpMs, and code is available at https://github.com/twsq/sam-driving.
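The mimic network's training objective can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything here is an assumption for illustration: mean squared error for both terms, the weighting factor `mimic_weight`, and the function names. None of it is taken from the authors' code.

```python
from typing import Sequence


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean of squared element-wise differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mimic_training_loss(
    predicted_controls: Sequence[float],
    expert_controls: Sequence[float],
    mimic_latent: Sequence[float],
    squeeze_latent: Sequence[float],
    mimic_weight: float = 1.0,  # hypothetical trade-off hyperparameter
) -> float:
    """Driving loss plus a penalty pulling the mimic latent toward the squeeze latent."""
    # Term 1: how well the image-only mimic network imitates the expert controls.
    control_loss = mean_squared_error(predicted_controls, expert_controls)
    # Term 2: mimicking loss; the squeeze latent is treated as a fixed target.
    mimicking_loss = mean_squared_error(mimic_latent, squeeze_latent)
    return control_loss + mimic_weight * mimicking_loss
```

In this sketch the squeeze network is assumed to be trained first and held fixed, so its latent acts only as a supervision target while the mimic network is optimized.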
RO | May 18, 2025
Robust Planning for Autonomous Driving via Mixed Adversarial Diffusion Predictions
Albert Zhao, Stefano Soatto
We describe a robust planning method for autonomous driving that mixes normal and adversarial agent predictions output by a diffusion model trained for motion prediction. We first train a diffusion model to learn an unbiased distribution of normal agent behaviors. We then generate a distribution of adversarial predictions by biasing the diffusion model at test time to generate predictions that are likely to collide with a candidate plan. We score plans using expected cost with respect to a mixture distribution of normal and adversarial predictions, leading to a planner that is robust against adversarial behaviors but not overly conservative when agents behave normally. Unlike current approaches, we use neither risk measures that over-weight adversarial behaviors while placing little to no weight on low-cost normal behaviors, nor hard safety constraints that may not be appropriate for all driving scenarios. We show the effectiveness of our method on single-agent and multi-agent jaywalking scenarios as well as a red light violation scenario.
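The plan-scoring step can be sketched as a Monte Carlo estimate of expected cost under a two-component mixture. This is a toy under stated assumptions, not the authors' planner: the mixture weight `adversarial_weight`, the helper names, and the one-dimensional cost model are all hypothetical, and `toy_adversarial` stands in for the test-time biased diffusion sampler.

```python
from statistics import fmean
from typing import Callable, Sequence


def mixture_expected_cost(plan, normal_predictions, adversarial_predictions,
                          cost_fn: Callable, adversarial_weight: float) -> float:
    """Expected cost of a plan under a mixture of normal and adversarial predictions."""
    normal_cost = fmean([cost_fn(plan, p) for p in normal_predictions])
    adversarial_cost = fmean([cost_fn(plan, p) for p in adversarial_predictions])
    return (1.0 - adversarial_weight) * normal_cost + adversarial_weight * adversarial_cost


def select_plan(plans: Sequence, normal_predictions: Sequence,
                sample_adversarial: Callable, cost_fn: Callable,
                adversarial_weight: float):
    """Return the candidate plan with the lowest mixture expected cost."""
    return min(
        plans,
        key=lambda plan: mixture_expected_cost(
            plan, normal_predictions, sample_adversarial(plan), cost_fn, adversarial_weight
        ),
    )


# Hypothetical toy model: a plan is a speed in [0, 1]; a prediction is
# 1 if the pedestrian steps into the road and 0 otherwise.
def toy_cost(plan: float, prediction: int) -> float:
    collision_cost = 10.0 * prediction * plan  # faster plans collide harder
    progress_cost = 1.0 - plan                 # slower plans make less progress
    return collision_cost + progress_cost


def toy_adversarial(plan: float) -> list:
    # Stand-in for plan-conditioned adversarial sampling: the pedestrian always steps out.
    return [1, 1]
```

With normal predictions `[0, 0, 0, 0]` and candidate speeds `[0.0, 0.5, 1.0]`, a small adversarial weight such as 0.05 selects full speed, while a large one such as 0.5 selects stopping. This shows how the mixture weight trades off robustness against conservatism in this toy setting.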
LG | Apr 3, 2018
Learning to Search via Retrospective Imitation
Jialin Song, Ravi Lanka, Albert Zhao et al.
We study the problem of learning a good search policy for combinatorial search spaces. We propose retrospective imitation learning, which, after initial training by an expert, improves itself by learning from retrospective inspections of its own roll-outs. That is, when the policy eventually reaches a feasible solution in a combinatorial search tree after making mistakes and backtracking, it retrospectively constructs an improved search trace to the solution by removing backtracks, which is then used to further train the policy. A key feature of our approach is that it can iteratively scale up, or transfer, to larger problem sizes than those solved by the initial expert demonstrations, thus dramatically expanding its applicability beyond that of conventional imitation learning. We showcase the effectiveness of our approach on a range of tasks, including synthetic maze solving and combinatorial problems expressed as integer programs.
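The idea of removing backtracks can be sketched on an explicit search tree. This is a minimal illustration, assuming the roll-out recorded each visited node's parent; the function names and the `parent` dictionary are hypothetical and not taken from the paper.

```python
from typing import Dict, Hashable, List, Optional, Tuple


def retrospective_trace(parent: Dict[Hashable, Optional[Hashable]],
                        solution: Hashable) -> List[Hashable]:
    """Return the root-to-solution path, dropping every explored dead-end branch."""
    path = []
    node: Optional[Hashable] = solution
    # Walk parent pointers from the solution back up to the root (parent is None).
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path


def imitation_pairs(path: List[Hashable]) -> List[Tuple[Hashable, Hashable]]:
    """Turn a backtrack-free trace into (state, chosen child) training pairs."""
    return list(zip(path, path[1:]))


# Example roll-out: the policy first explored root -> a -> a1 (a dead end),
# backtracked, and then found the solution via root -> b -> b1 -> goal.
parent = {"root": None, "a": "root", "a1": "a", "b": "root", "b1": "b", "goal": "b1"}
trace = retrospective_trace(parent, "goal")
pairs = imitation_pairs(trace)
```

Here `trace` contains only the nodes on the direct path to the solution, and `pairs` gives the corrected decisions that could serve as new imitation targets for the policy.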