ROFeb 28, 2025Code
SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation ModelsJiawei Zhang, Xuan Yang, Taiqi Wang et al.
Traditional autonomous driving systems often struggle to connect high-level reasoning with low-level control, leading to suboptimal and sometimes unsafe behaviors. Recent advances in multimodal large language models (MLLMs), which process both visual and textual data, offer an opportunity to unify perception and reasoning. However, effectively embedding precise safety knowledge into MLLMs for autonomous driving remains a significant challenge. To address this, we propose SafeAuto, a framework that enhances MLLM-based autonomous driving by incorporating both unstructured and structured knowledge. First, we introduce a Position-Dependent Cross-Entropy (PDCE) loss to improve low-level control signal predictions when values are represented as text. Second, to explicitly integrate safety knowledge, we develop a reasoning component that translates traffic rules into first-order logic (e.g., "red light $\implies$ stop") and embeds them into a probabilistic graphical model (e.g., Markov Logic Network) to verify predicted actions using recognized environmental attributes. Additionally, our Multimodal Retrieval-Augmented Generation (RAG) model leverages video, control signals, and environmental attributes to learn from past driving experiences. Integrating PDCE, MLN, and Multimodal RAG, SafeAuto outperforms existing baselines across multiple datasets, enabling more accurate, reliable, and safer autonomous driving. The code is available at https://github.com/AI-secure/SafeAuto.
LGFeb 23, 2024Code
Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical ApplicationsZihan Zhou, Jonathan Booher, Khashayar Rohanimanesh et al.
Safe reinforcement learning tasks are a challenging domain despite being very common in the real world. The widely adopted CMDP model constrains the risks in expectation, which makes room for dangerous behaviors in long-tail states. In safety-critical domains, such behaviors could lead to disastrous outcomes. To address this issue, we first describe the problem with a stronger Uniformly Constrained MDP (UCMDP) model where we impose constraints on all reachable states; we then propose Objective Suppression, a novel method that adaptively suppresses the task reward maximizing objectives according to a safety critic, as a solution to the Lagrangian dual of a UCMDP. We benchmark Objective Suppression in two multi-constraint safety domains, including an autonomous driving domain where any incorrect behavior can lead to disastrous consequences. On the driving domain, we evaluate on open source and proprietary data and evaluate transfer to a real autonomous fleet. Empirically, we demonstrate that our proposed method, when combined with existing safe RL algorithms, can match the task reward achieved by baselines with significantly fewer constraint violations.
CVSep 27, 2025
Real-World Transferable Adversarial Attack on Face-Recognition SystemsAndrey Kaznacheev, Matvey Mikhalchuk, Andrey Kuznetsov et al.
Adversarial attacks on face recognition (FR) systems pose a significant security threat, yet most are confined to the digital domain or require white-box access. We introduce GaP (Gaussian Patch), a novel method to generate a universal, physically transferable adversarial patch under a strict black-box setting. Our approach uses a query-efficient, zero-order greedy algorithm to iteratively construct a symmetric, grayscale pattern for the forehead. The patch is optimized by successively adding Gaussian blobs, guided only by the cosine similarity scores from a surrogate FR model to maximally degrade identity recognition. We demonstrate that with approximately 10,000 queries to a black-box ArcFace model, the resulting GaP achieves a high attack success rate in both digital and real-world physical tests. Critically, the attack shows strong transferability, successfully deceiving an entirely unseen FaceNet model. Our work highlights a practical and severe vulnerability, proving that robust, transferable attacks can be crafted with limited knowledge of the target system.
CVJun 11, 2025
Inverting Black-Box Face Recognition Systems via Zero-Order Optimization in Eigenface SpaceAnton Razzhigaev, Matvey Mikhalchuk, Klim Kireev et al.
Reconstructing facial images from black-box recognition models poses a significant privacy threat. While many methods require access to embeddings, we address the more challenging scenario of model inversion using only similarity scores. This paper introduces DarkerBB, a novel approach that reconstructs color faces by performing zero-order optimization within a PCA-derived eigenface space. Despite this highly limited information, experiments on LFW, AgeDB-30, and CFP-FP benchmarks demonstrate that DarkerBB achieves state-of-the-art verification accuracies in the similarity-only setting, with competitive query efficiency.
LGJun 13, 2024
CIMRL: Combining IMitation and Reinforcement Learning for Safe Autonomous DrivingJonathan Booher, Khashayar Rohanimanesh, Junhong Xu et al.
Modern approaches to autonomous driving rely heavily on learned components trained with large amounts of human driving data via imitation learning. However, these methods require large amounts of expensive data collection and even then face challenges with safely handling long-tail scenarios and compounding errors over time. At the same time, pure Reinforcement Learning (RL) methods can fail to learn performant policies in sparse, constrained, and challenging-to-define reward settings such as autonomous driving. Both of these challenges make deploying purely cloned or pure RL policies in safety critical applications such as autonomous vehicles challenging. In this paper we propose Combining IMitation and Reinforcement Learning (CIMRL) approach - a safe reinforcement learning framework that enables training driving policies in simulation through leveraging imitative motion priors and safety constraints. CIMRL does not require extensive reward specification and improves on the closed loop behavior of pure cloning methods. By combining RL and imitation, we demonstrate that our method achieves state-of-the-art results in closed loop simulation and real world driving benchmarks.
MLFeb 7, 2022
Nonparametric Uncertainty Quantification for Single Deterministic Neural NetworkNikita Kotelevskii, Aleksandr Artemenkov, Kirill Fedyanin et al.
This paper proposes a fast and scalable method for uncertainty quantification of machine learning models' predictions. First, we show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution. Importantly, the proposed approach allows to disentangle explicitly aleatoric and epistemic uncertainties. The resulting method works directly in the feature space. However, one can apply it to any neural network by considering an embedding of the data induced by the network. We demonstrate the strong performance of the method in uncertainty estimation tasks on text classification problems and a variety of real-world image datasets, such as MNIST, SVHN, CIFAR-100 and several versions of ImageNet.
LGFeb 2, 2022
Smoothed Embeddings for Certified Few-Shot LearningMikhail Pautov, Olesya Kuznetsova, Nurislam Tursynbek et al.
Randomized smoothing is considered to be the state-of-the-art provable defense against adversarial perturbations. However, it heavily exploits the fact that classifiers map input objects to class probabilities and do not focus on the ones that learn a metric space in which classification is performed by computing distances to embeddings of classes prototypes. In this work, we extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings. We provide analysis of Lipschitz continuity of such models and derive robustness certificate against $\ell_2$-bounded perturbations that may be useful in few-shot learning scenarios. Our theoretical results are confirmed by experiments on different datasets.
CVNov 22, 2021
Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask ArchitectureDaria Bakshandaeva, Denis Dimitrov, Vladimir Arkhipkin et al.
Supporting the current trend in the AI community, we present the AI Journey 2021 Challenge called Fusion Brain, the first competition which is targeted to make the universal architecture which could process different modalities (in this case, images, texts, and code) and solve multiple tasks for vision and language. The Fusion Brain Challenge combines the following specific tasks: Code2code Translation, Handwritten Text recognition, Zero-shot Object Detection, and Visual Question Answering. We have created datasets for each task to test the participants' submissions on it. Moreover, we have collected and made publicly available a new handwritten dataset in both English and Russian, which consists of 94,128 pairs of images and texts. We also propose a multimodal and multitask architecture - a baseline solution, in the center of which is a frozen foundation model and which has been trained in Fusion mode along with Single-task mode. The proposed Fusion approach proves to be competitive and more energy-efficient compared to the task-specific one.
LGSep 22, 2021
CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural NetworksMikhail Pautov, Nurislam Tursynbek, Marina Munkhoeva et al.
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks -- small modifications of the input that change the predictions. Besides rigorously studied $\ell_p$-bounded additive perturbations, recently proposed semantic perturbations (e.g. rotation, translation) raise a serious concern on deploying ML systems in real-world. Therefore, it is important to provide provable guarantees for deep learning models against semantically meaningful input transformations. In this paper, we propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds that can be used in general attack settings. We estimate the probability of a model to fail if the attack is sampled from a certain distribution. Our theoretical findings are supported by experimental results on different datasets.
LGJul 8, 2021
Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension EstimationAlexander Ivanov, Gleb Nosovskiy, Alexey Chekunov et al.
Manifold hypothesis states that data points in high-dimensional space actually lie in close vicinity of a manifold of much lower dimension. In many cases this hypothesis was empirically verified and used to enhance unsupervised and semi-supervised learning. Here we present new approach to manifold hypothesis checking and underlying manifold dimension estimation. In order to do it we use two very different methods simultaneously - one geometric, another probabilistic - and check whether they give the same result. Our geometrical method is a modification for sparse data of a well-known box-counting algorithm for Minkowski dimension calculation. The probabilistic method is new. Although it exploits standard nearest neighborhood distance, it is different from methods which were previously used in such situations. This method is robust, fast and includes special preliminary data transformation. Experiments on real datasets show that the suggested approach based on two methods combination is powerful and effective.
LGJun 28, 2021
Certified Robustness via Randomized Smoothing over Multiplicative Parameters of Input TransformationsNikita Muravev, Aleksandr Petiushko
Currently the most popular method of providing robustness certificates is randomized smoothing where an input is smoothed via some probability distribution. We propose a novel approach to randomized smoothing over multiplicative parameters. Using this method we construct certifiably robust classifiers with respect to a gamma correction perturbation and compare the result with classifiers obtained via other smoothing distributions (Gaussian, Laplace, uniform). The experiments show that asymmetrical Rayleigh distribution allows to obtain better certificates for some values of perturbation parameters. To the best of our knowledge it is the first work concerning certified robustness against the multiplicative gamma correction transformation and the first to study effects of asymmetrical distributions in randomized smoothing.
CVJun 27, 2021
Darker than Black-Box: Face Reconstruction from Similarity QueriesAnton Razzhigaev, Klim Kireev, Igor Udovichenko et al.
Several methods for inversion of face recognition models were recently presented, attempting to reconstruct a face from deep templates. Although some of these approaches work in a black-box setup using only face embeddings, usually, on the end-user side, only similarity scores are provided. Therefore, these algorithms are inapplicable in such scenarios. We propose a novel approach that allows reconstructing the face querying only similarity scores of the black-box model. While our algorithm operates in a more general setup, experiments show that it is query efficient and outperforms the existing methods.
CVMar 19, 2021
MDMMT: Multidomain Multimodal Transformer for Video RetrievalMaksim Dzabraev, Maksim Kalashnikov, Stepan Komkov et al.
We present a new state-of-the-art on the text to video retrieval task on MSRVTT and LSMDC benchmarks where our model outperforms all previous solutions by a large margin. Moreover, state-of-the-art results are achieved with a single model on two datasets without finetuning. This multidomain generalisation is achieved by a proper combination of different video caption datasets. We show that training on different datasets can improve test results of each other. Additionally we check intersection between many popular datasets and found that MSRVTT has a significant overlap between the test and the train parts, and the same situation is observed for ActivityNet.
LGFeb 11, 2021
Quadric Hypersurface Intersection for Manifold Learning in Feature SpaceFedor Pavutnitskiy, Sergei O. Ivanov, Evgeny Abramov et al.
The knowledge that data lies close to a particular submanifold of the ambient Euclidean space may be useful in a number of ways. For instance, one may want to automatically mark any point far away from the submanifold as an outlier or to use the geometry to come up with a better distance metric. Manifold learning problems are often posed in a very high dimension, e.g. for spaces of images or spaces of words. Today, with deep representation learning on the rise in areas such as computer vision and natural language processing, many problems of this kind may be transformed into problems of moderately high dimension, typically of the order of hundreds. Motivated by this, we propose a manifold learning technique suitable for moderately high dimension and large datasets. The manifold is learned from the training data in the form of an intersection of quadric hypersurfaces -- simple but expressive objects. At test time, this manifold can be used to introduce a computationally efficient outlier score for arbitrary new data points and to improve a given similarity metric by incorporating the learned geometric structure into it.
LGDec 14, 2020
Robustness Threats of Differential PrivacyNurislam Tursynbek, Aleksandr Petiushko, Ivan Oseledets
Differential privacy (DP) is a gold-standard concept of measuring and guaranteeing privacy in data analysis. It is well-known that the cost of adding DP to deep learning model is its accuracy. However, it remains unclear how it affects robustness of the model. Standard neural networks are not robust to different input perturbations: either adversarial attacks or common corruptions. In this paper, we empirically observe an interesting trade-off between privacy and robustness of neural networks. We experimentally demonstrate that networks, trained with DP, in some settings might be even more vulnerable in comparison to non-private versions. To explore this, we extensively study different robustness measurements, including FGSM and PGD adversaries, distance to linear decision boundaries, curvature profile, and performance on a corrupted dataset. Finally, we study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect (decrease and increase) the robustness of the model.
CVNov 4, 2020
Mutual Modality Learning for Video Action ClassificationStepan Komkov, Maksim Dzabraev, Aleksandr Petiushko
The construction of models for video action classification progresses rapidly. However, the performance of those models can still be easily improved by ensembling with the same models trained on different modalities (e.g. Optical flow). Unfortunately, it is computationally expensive to use several modalities during inference. Recent works examine the ways to integrate advantages of multi-modality into a single RGB-model. Yet, there is still a room for improvement. In this paper, we explore the various methods to embed the ensemble power into a single model. We show that proper initialization, as well as mutual modality learning, enhances single-modality models. As a result, we achieve state-of-the-art results in the Something-Something-v2 benchmark.
CVJul 27, 2020
Black-Box Face Recovery from Identity FeaturesAnton Razzhigaev, Klim Kireev, Edgar Kaziakhmedov et al.
In this work, we present a novel algorithm based on an it-erative sampling of random Gaussian blobs for black-box face recovery, given only an output feature vector of deep face recognition systems. We attack the state-of-the-art face recognition system (ArcFace) to test our algorithm. Another network with different architecture (FaceNet) is used as an independent critic showing that the target person can be identified with the reconstructed image even with no access to the attacked model. Furthermore, our algorithm requires a significantly less number of queries compared to the state-of-the-art solution.
CVJun 28, 2020
Geometry-Inspired Top-k Adversarial PerturbationsNurislam Tursynbek, Aleksandr Petiushko, Ivan Oseledets
The brittleness of deep image classifiers to small adversarial input perturbations has been extensively studied in the last several years. However, the main objective of existing perturbations is primarily limited to change the correctly predicted Top-1 class by an incorrect one, which does not intend to change the Top-k prediction. In many digital real-world scenarios Top-k prediction is more relevant. In this work, we propose a fast and accurate method of computing Top-k adversarial examples as a simple multi-objective optimization. We demonstrate its efficacy and performance by comparing it to other adversarial example crafting techniques. Moreover, based on this method, we propose Top-k Universal Adversarial Perturbations, image-agnostic tiny perturbations that cause the true class to be absent among the Top-k prediction for the majority of natural images. We experimentally show that our approach outperforms baseline methods and even improves existing techniques of finding Universal Adversarial Perturbations.
CVOct 15, 2019
On adversarial patches: real-world attack on ArcFace-100 face recognition systemMikhail Pautov, Grigorii Melnikov, Edgar Kaziakhmedov et al.
Recent works showed the vulnerability of image classifiers to adversarial attacks in the digital domain. However, the majority of attacks involve adding small perturbation to an image to fool the classifier. Unfortunately, such procedures can not be used to conduct a real-world attack, where adding an adversarial attribute to the photo is a more practical approach. In this paper, we study the problem of real-world attacks on face recognition systems. We examine security of one of the best public face recognition systems, LResNet100E-IR with ArcFace loss, and propose a simple method to attack it in the physical world. The method suggests creating an adversarial patch that can be printed, added as a face attribute and photographed; the photo of a person with such attribute is then passed to the classifier such that the classifier's recognized class changes from correct to the desired one. Proposed generating procedure allows projecting adversarial patches not only on different areas of the face, such as nose or forehead but also on some wearable accessory, such as eyeglasses.
CVOct 14, 2019
Real-world adversarial attack on MTCNN face detection systemEdgar Kaziakhmedov, Klim Kireev, Grigorii Melnikov et al.
Recent studies proved that deep learning approaches achieve remarkable results on face detection task. On the other hand, the advances gave rise to a new problem associated with the security of the deep convolutional neural network models unveiling potential risks of DCNNs based applications. Even minor input changes in the digital domain can result in the network being fooled. It was shown then that some deep learning-based face detectors are prone to adversarial attacks not only in a digital domain but also in the real world. In the paper, we investigate the security of the well-known cascade CNN face detection system - MTCNN and introduce an easily reproducible and a robust way to attack it. We propose different face attributes printed on an ordinary white and black printer and attached either to the medical face mask or to the face directly. Our approach is capable of breaking the MTCNN detector in a real-world scenario.
CVAug 23, 2019
AdvHat: Real-world adversarial attack on ArcFace Face ID systemStepan Komkov, Aleksandr Petiushko
In this paper we propose a novel easily reproducible technique to attack the best public Face ID system ArcFace in different shooting conditions. To create an attack, we print the rectangular paper sticker on a common color printer and put it on the hat. The adversarial sticker is prepared with a novel algorithm for off-plane transformations of the image which imitates sticker location on the hat. Such an approach confuses the state-of-the-art public Face ID model LResNet100E-IR, ArcFace@ms1m-refine-v2 and is transferable to other Face ID models.