22.4 CR Mar 27 Code
Hermes Seal: Zero-Knowledge Assurance for Autonomous Vehicle Communications
Munawar Hasan, Apostol Vassilev, Edward Griffor et al.
The application of zero-knowledge proofs (ZKPs) in autonomous systems is an emerging area of research, motivated by the growing need for regulatory compliance, transparent auditing, and trustworthy operation in decentralized environments. A zk-SNARK is a powerful cryptographic tool that allows one party (the prover) to prove to another party (the verifier) that a statement about its own internal state is true, without revealing sensitive or proprietary data about that state. This paper proposes Hermes Seal: a zk-SNARK-based ZKP framework for enabling privacy-preserving, verifiable communication in vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) networks. The framework allows autonomous systems to generate cryptographic proofs of perception and decision-related computations without revealing proprietary models, sensor data, or internal system states, thereby supporting interoperability across heterogeneous autonomous systems. We present two real-world case studies implemented and empirically evaluated within our framework, demonstrating a step toward verifiable information exchange between autonomous systems. The first demonstrates real-time proof generation and verification, achieving 8 ms proof generation and 1 ms verification on a GPU. The second evaluates the performance of an autonomous vehicle (AV) perception stack, enabling proof of computation without exposing proprietary or confidential data. Furthermore, the framework can be integrated into AV perception stacks to facilitate verifiable interoperability and privacy-preserving cooperative perception. The demonstration code for this project is open source and available on GitHub.
22.5 LG May 28
Bounded Behavioral Indistinguishability for Black-Box LLM Distillation
Munawar Hasan
Black-box LLM distillation is usually evaluated as an output-matching problem: a student is considered successful when its responses are semantically similar to, or task-consistent with, those of a teacher. However, output similarity does not imply that the student is behaviorally indistinguishable from the model it imitates. We introduce bounded behavioral indistinguishability, formalized as $(\varepsilon, q, t, \mathbb{A})$-behavioral indistinguishability over an explicit prompt distribution, where $\varepsilon$ bounds distinguishing advantage, $q$ bounds oracle queries, $t$ bounds computation, and $\mathbb{A}$ denotes the adversary class. We instantiate this notion on Qwen and Llama teacher-student pairs using a controlled $5{,}000$-prompt behavioral probe suite. For each family, we compare the teacher with both the base student and the LoRA-distilled student, measuring whether distillation reduces distinguishability rather than merely improving similarity. LoRA raises semantic similarity from $0.788$ to $0.862$ for Qwen and from $0.814$ to $0.874$ for Llama. Yet adversarial evaluation reveals remaining behavioral differences: learned discriminators retain nonzero advantage, and pairwise category analysis shows artifacts concentrated in style/format, robustness, and domain-technical prompts. A pairwise teacher-identification adversary confirms this trend. With a different-family Llama judge and A/B-swap consistency filtering, Qwen distinguishing advantage drops from $0.158$ for the base student to $0.081$ after LoRA distillation. Query-budget experiments show that disagreement-guided acquisition does not consistently outperform stratified random sampling, indicating that coverage and diversity remain strong baselines. Our results show that semantic fidelity is useful but insufficient: black-box LLM distillation requires bounded, adversarial, and category-aware evaluation.
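To make the role of $\varepsilon$ and $q$ concrete, here is a minimal Python sketch of how a distinguishing advantage could be estimated from an adversary's guesses. It assumes the common cryptographic convention that advantage is `|2 * accuracy - 1|` in a balanced teacher-versus-student guessing game; the abstract does not state which definition the paper uses, and the function names `distinguishing_advantage` and `within_bound` are hypothetical.

```python
def distinguishing_advantage(guesses, labels):
    """Estimate advantage as |2 * accuracy - 1| (assumed convention).

    guesses[i] is the adversary's guess (1 = teacher, 0 = student) and
    labels[i] is the true source of the i-th response.
    """
    if len(guesses) != len(labels) or not labels:
        raise ValueError("guesses and labels must be non-empty and equal length")
    correct = sum(g == y for g, y in zip(guesses, labels))
    accuracy = correct / len(labels)
    return abs(2 * accuracy - 1)


def within_bound(guesses, labels, epsilon, q):
    """Check the epsilon bound for an adversary limited to q queries."""
    if len(labels) > q:
        raise ValueError("adversary exceeded its query budget q")
    return distinguishing_advantage(guesses, labels) <= epsilon


# Toy data: the adversary is right on 3 of 4 responses.
guesses = [1, 1, 0, 0]
labels = [1, 0, 0, 0]
print(distinguishing_advantage(guesses, labels))
print(within_bound(guesses, labels, epsilon=0.1, q=10))
```

Under this convention, an adversary that guesses at chance (accuracy 0.5) has advantage 0, and a perfect adversary has advantage 1.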
CV Jan 30
On the Assessment of Sensitivity of Autonomous Vehicle Perception
Apostol Vassilev, Munawar Hasan, Edward Griffor et al.
The viability of automated driving depends heavily on the ability of perception systems to provide accurate, reliable, real-time information for robust decision-making and maneuvers. These systems must perform reliably not only under ideal conditions, but also when challenged by natural and adversarial driving factors. Both types of interference can lead to perception errors and delays in detection and classification. Hence, it is essential to assess the robustness of the perception systems of automated vehicles (AVs) and explore strategies for making perception more reliable. We approach this problem by evaluating perception performance using predictive sensitivity quantification based on an ensemble of models, capturing model disagreement and inference variability across multiple models, under adverse driving scenarios in both simulated environments and real-world conditions. A notional architecture for assessing perception performance is proposed. A perception assessment criterion is developed based on an AV's stopping distance at a stop sign as a function of road surface (such as dry or wet asphalt) and vehicle speed. Five state-of-the-art computer vision models are used in our experiments: YOLO (v8-v9), DEtection TRansformer (DETR50, DETR101), and Real-Time DEtection TRansformer (RT-DETR). Diminished lighting conditions, e.g., resulting from the presence of fog and low sun altitude, have the greatest impact on the performance of the perception models. Additionally, adversarial road conditions such as occlusions of roadway objects increase perception sensitivity, and model performance drops when adversarial road conditions are combined with inclement weather. It is also demonstrated that the greater the distance to a roadway object, the greater the impact on perception performance and hence the lower the perception robustness.
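The stopping-distance criterion can be illustrated with the textbook kinematic model, in which stopping distance is reaction distance plus braking distance, $d = v\,t_r + v^2 / (2 \mu g)$. The sketch below is an assumption-laden illustration, not the paper's criterion: the friction coefficients, the reaction time, and the helper `detected_in_time` are all hypothetical choices made for the example.

```python
def stopping_distance(speed_mps, friction, reaction_time_s=1.5, g=9.81):
    """Reaction distance plus braking distance in metres.

    Assumes a flat road and a constant friction coefficient.
    """
    if speed_mps < 0 or friction <= 0:
        raise ValueError("speed must be non-negative and friction positive")
    reaction = speed_mps * reaction_time_s
    braking = speed_mps ** 2 / (2 * friction * g)
    return reaction + braking


def detected_in_time(detection_distance_m, speed_mps, friction, **kwargs):
    """True if the stop sign was detected far enough away to stop before it."""
    return detection_distance_m >= stopping_distance(speed_mps, friction, **kwargs)


# Illustrative friction coefficients (assumed values, not from the paper).
FRICTION = {"dry_asphalt": 0.7, "wet_asphalt": 0.4}

for surface, mu in FRICTION.items():
    d = stopping_distance(speed_mps=15.0, friction=mu)
    print(f"{surface}: stopping distance {d:.1f} m at 15 m/s")
```

Because braking distance grows with the square of speed and inversely with friction, a detection range that is adequate on dry asphalt can be too short on a wet surface at the same speed.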
LG Oct 23, 2023
Meta learning with language models: Challenges and opportunities in the classification of imbalanced text
Apostol Vassilev, Honglan Jin, Munawar Hasan
Detecting out-of-policy speech (OOPS) content is important but difficult. While machine learning is a powerful tool for this challenging task, it is hard to break the performance ceiling because of limits on the quantity and quality of training data and inconsistencies in OOPS definitions and data labeling. To realize the full potential of the limited resources available, we propose a meta learning technique (MLT) that combines individual models built with different text representations. We show analytically that the resulting technique is numerically stable and produces reasonable combining weights. We combine the MLT with a threshold-moving (TM) technique to further improve the performance of the combined predictor on highly imbalanced in-distribution and out-of-distribution datasets. We also provide computational results that show the statistically significant advantages of the proposed MLT approach. All authors contributed equally to this work.
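Threshold moving in general means replacing the default 0.5 decision threshold with one tuned for the minority class. The following minimal Python sketch picks the threshold that maximizes F1 on held-out scores; it illustrates the generic technique only, since the abstract does not say which objective or search procedure the paper uses, and the function names and toy data are hypothetical.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 of the positive class when predicting 1 for score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Search the observed scores for the threshold with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))


# Toy imbalanced validation set: the positives score below 0.5.
scores = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4]
labels = [0, 0, 0, 0, 1, 1]

print(f1_at_threshold(scores, labels, 0.5))  # default threshold misses every positive
t = best_threshold(scores, labels)
print(t, f1_at_threshold(scores, labels, t))
```

In practice the threshold would be chosen on a validation split and then applied unchanged to test data, so that the reported performance is not tuned on the evaluation set.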
2.7 AI Apr 6
Incompleteness of AI Safety Verification via Kolmogorov Complexity
Munawar Hasan
Ensuring that artificial intelligence (AI) systems satisfy formal safety and policy constraints is a central challenge in safety-critical domains. While limitations of verification are often attributed to combinatorial complexity and model expressiveness, we show that they arise from intrinsic information-theoretic limits. We formalize policy compliance as a verification problem over encoded system behaviors and analyze it using Kolmogorov complexity. We prove an incompleteness result: for any fixed, sound, computably enumerable verifier, there exists a complexity threshold beyond which true policy-compliant instances cannot be certified. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. This reveals a fundamental limitation of AI safety verification that is independent of computational resources, and it motivates proof-carrying approaches that provide instance-level correctness guarantees.
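For orientation, the classical result with the same shape is Chaitin's incompleteness theorem, stated below in LaTeX. This is the well-known theorem about formal systems, given here only as background; the paper's own theorem about verifiers and policy compliance is not reproduced in the abstract and may differ in its hypotheses and conclusion.

```latex
% Chaitin's incompleteness theorem (classical background, not the paper's theorem).
% K(x) denotes the Kolmogorov complexity of the string x.
\textbf{Theorem.} Let $F$ be a sound, computably enumerable formal system
able to express statements of the form ``$K(x) > n$''. Then there exists a
constant $c_F$, depending only on $F$, such that
\[
  F \nvdash \; K(x) > c_F \qquad \text{for every string } x,
\]
even though $K(x) > c_F$ holds for all but finitely many $x$.
```

The proof idea is a Berry-paradox argument: a short program that searches the theorems of $F$ for the first proof of "$K(x) > n$" would itself describe $x$ in roughly $\log n$ plus a constant number of bits, contradicting the proved statement once $n$ is large enough.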
CL Jun 23, 2020
Can you tell? SSNet -- a Sagittal Stratum-inspired Neural Network Framework for Sentiment Analysis
Apostol Vassilev, Munawar Hasan, Honglan Jin
When people try to understand nuanced language, they typically process multiple input sensor modalities to complete this cognitive task. It turns out the human brain even has a specialized neuron formation, called the sagittal stratum, that helps us understand sarcasm. We use this biological formation as the inspiration for designing a neural network architecture that combines predictions of different models on the same text to construct robust, accurate, and computationally efficient classifiers for sentiment analysis, and we study several different realizations. Among them, we propose a systematic new approach to combining multiple predictions based on a dedicated neural network and develop a mathematical analysis of it along with state-of-the-art experimental results. We also propose a heuristic-hybrid technique for combining models and back it up with experimental results on a representative benchmark dataset and comparisons to other methods to show the advantages of the new approaches.
CV Jan 1, 2020
Multi-lane Detection Using Instance Segmentation and Attentive Voting
Donghoon Chang, Vinjohn Chirakkal, Shubham Goswami et al.
Autonomous driving is becoming one of the leading industrial research areas, and many automobile companies are developing semi- to fully autonomous driving solutions. Among these solutions, lane detection is one of the vital driver-assist features and plays a crucial role in the decision-making process of the autonomous vehicle. A variety of solutions have been proposed to detect lanes on the road, ranging from hand-crafted features to state-of-the-art end-to-end trainable deep learning architectures. Most of these architectures are trained in a traffic-constrained environment. In this paper, we propose a novel solution to multi-lane detection that outperforms state-of-the-art methods in terms of both accuracy and speed. To achieve this, we also offer a dataset with a more intuitive labeling scheme than other benchmark datasets. Using our approach, we obtain a lane segmentation accuracy of 99.87% running at 54.53 fps (average).