Mizanur Rahman

CL
h-index61
48papers
900citations
Novelty38%
AI Score56

48 Papers

CLJan 8Code
Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

Mizanur Rahman, Mohammed Saidul Islam, Md Tahmid Rahman Laskar et al.

Text-to-Visualization (Text2Vis) systems translate natural language queries over tabular data into concise answers and executable visualizations. While closed-source LLMs generate functional code, the resulting charts often lack semantic alignment and clarity, qualities that can only be assessed post-execution. Open-source models struggle even more, frequently producing non-executable or visually poor outputs. Although supervised fine-tuning can improve code executability, it fails to enhance overall visualization quality, as traditional SFT loss cannot capture post-execution feedback. To address this gap, we propose RL-Text2Vis, the first reinforcement learning framework for Text2Vis generation. Built on Group Relative Policy Optimization (GRPO), our method uses a novel multi-objective reward that jointly optimizes textual accuracy, code validity, and visualization quality using post-execution feedback. By training Qwen2.5 models (7B and 14B), RL-Text2Vis achieves a 22% relative improvement in chart quality over GPT-4o on the Text2Vis benchmark and boosts code execution success from 78% to 97% relative to its zero-shot baseline. Our models significantly outperform strong zero-shot and supervised baselines and also demonstrate robust generalization to out-of-domain datasets like VIS-Eval and NVBench. These results establish GRPO as an effective strategy for structured, multimodal reasoning in visualization generation. We release our code at https://github.com/vis-nlp/RL-Text2Vis.

CLJul 4, 2024
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations

Md Tahmid Rahman Laskar, Sawsan Alqahtani, M Saiful Bari et al.

Large Language Models (LLMs) have recently gained significant attention due to their remarkable capabilities in performing diverse tasks across various domains. However, a thorough evaluation of these models is crucial before deploying them in real-world applications to ensure they produce reliable performance. Despite the well-established importance of evaluating LLMs in the community, the complexity of the evaluation process has led to varied evaluation setups, causing inconsistencies in findings and interpretations. To address this, we systematically review the primary challenges and limitations causing these inconsistencies and unreliable evaluations in various steps of LLM evaluation. Based on our critical review, we present our perspectives and recommendations to ensure LLM evaluations are reproducible, reliable, and robust.

98.8AIApr 28Code
DATAREEL: Automated Data-Driven Video Story Generation with Animations

Ridwan Mahbub, Syem Aziz, Mahir Ahmed et al.

Data videos are a powerful medium for visual data based storytelling, combining animated, chart-centric visualizations with synchronized narration. Widely used in journalism, education, and public communication, they help audiences understand complex data through clear and engaging visual explanations. Despite their growing impact, generating data-driven video stories remains challenging, as it requires careful coordination of visual encoding, temporal progression, and narration and substantial expertise in visualization design, animation, and video-editing tools. Recent advances in large language models offer new opportunities to automate this process; however, there is currently no benchmark for rigorously evaluating models on animated visualization-based video storytelling. To address this gap, we introduce DataReel, a benchmark for automated data-driven video story generation comprising 328 real-world stories. Each story pairs structured data, a chart visualization, and a narration transcript, enabling systematic evaluation of models' abilities to generate animated data video stories. We further propose a multi-agent framework that decomposes the task into planning, generation, and verification stages, mirroring key aspects of the human storytelling process. Experiments show that this multi-agent approach outperforms direct prompting baselines under both automatic and human evaluations, while revealing persistent challenges in coordinating animation, narration, and visual emphasis. We release DataReel at https://github.com/vis-nlp/DataReel.

IRJul 18, 2024Code
A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice

Shaina Raza, Mizanur Rahman, Safiullah Kamawal et al.

Recommender Systems (RS) play an integral role in enhancing user experiences by providing personalized item suggestions. This survey reviews the progress in RS inclusively from 2017 to 2024, effectively connecting theoretical advances with practical applications. We explore the development from traditional RS techniques like content-based and collaborative filtering to advanced methods involving deep learning, graph-based models, reinforcement learning, and large language models. We also discuss specialized systems such as context-aware, review-based, and fairness-aware RS. The primary goal of this survey is to bridge theory with practice. It addresses challenges across various sectors, including e-commerce, healthcare, and finance, emphasizing the need for scalable, real-time, and trustworthy solutions. Through this survey, we promote stronger partnerships between academic research and industry practices. The insights offered by this survey aim to guide industry professionals in optimizing RS deployment and to inspire future research directions, especially in addressing emerging technological and societal trends\footnote. The survey resources are available in the public GitHub repository https://github.com/VectorInstitute/Recommender-Systems-Survey. (Recommender systems, large language models, chatgpt, responsible AI)

CLMar 31, 2023
CQSumDP: A ChatGPT-Annotated Resource for Query-Focused Abstractive Summarization Based on Debatepedia

Md Tahmid Rahman Laskar, Mizanur Rahman, Israt Jahan et al.

Debatepedia is a publicly available dataset consisting of arguments and counter-arguments on controversial topics that has been widely used for the single-document query-focused abstractive summarization task in recent years. However, it has been recently found that this dataset is limited by noise and even most queries in this dataset do not have any relevance to the respective document. In this paper, we present a methodology for cleaning the Debatepedia dataset by leveraging the generative power of large language models to make it suitable for query-focused abstractive summarization. More specifically, we harness the language generation capabilities of ChatGPT to regenerate its queries. We evaluate the effectiveness of the proposed ChatGPT annotated version of the Debatepedia dataset using several benchmark summarization models and demonstrate that the newly annotated version of Debatepedia outperforms the original dataset in terms of both query relevance as well as summary generation quality. We will make this annotated and cleaned version of the dataset publicly available.

SDSep 9, 2022
Improving the Environmental Perception of Autonomous Vehicles using Deep Learning-based Audio Classification

Finley Walden, Sagar Dasgupta, Mizanur Rahman et al.

Sense of hearing is crucial for autonomous vehicles (AVs) to better perceive its surrounding environment. Although visual sensors of an AV, such as camera, lidar, and radar, help to see its surrounding environment, an AV cannot see beyond those sensors line of sight. On the other hand, an AV s sense of hearing cannot be obstructed by line of sight. For example, an AV can identify an emergency vehicle s siren through audio classification even though the emergency vehicle is not within the line of sight of the AV. Thus, auditory perception is complementary to the camera, lidar, and radar-based perception systems. This paper presents a deep learning-based robust audio classification framework aiming to achieve improved environmental perception for AVs. The presented framework leverages a deep Convolution Neural Network (CNN) to classify different audio classes. UrbanSound8k, an urban environment dataset, is used to train and test the developed framework. Seven audio classes i.e., air conditioner, car horn, children playing, dog bark, engine idling, gunshot, and siren, are identified from the UrbanSound8k dataset because of their relevancy related to AVs. Our framework can classify different audio classes with 97.82% accuracy. Moreover, the audio classification accuracies with all ten classes are presented, which proves that our framework performed better in the case of AV-related sounds compared to the existing audio classification frameworks.

CROct 31, 2022
Reinforcement Learning based Cyberattack Model for Adaptive Traffic Signal Controller in Connected Transportation Systems

Muhammad Sami Irfan, Mizanur Rahman, Travis Atkison et al.

In a connected transportation system, adaptive traffic signal controllers (ATSC) utilize real-time vehicle trajectory data received from vehicles through wireless connectivity (i.e., connected vehicles) to regulate green time. However, this wirelessly connected ATSC increases cyber-attack surfaces and increases their vulnerability to various cyber-attack modes, which can be leveraged to induce significant congestion in a roadway network. An attacker may receive financial benefits to create such a congestion for a specific roadway. One such mode is a 'sybil' attack in which an attacker creates fake vehicles in the network by generating fake Basic Safety Messages (BSMs) imitating actual connected vehicles following roadway traffic rules. The ultimate goal of an attacker will be to block a route(s) by generating fake or 'sybil' vehicles at a rate such that the signal timing and phasing changes occur without flagging any abrupt change in number of vehicles. Because of the highly non-linear and unpredictable nature of vehicle arrival rates and the ATSC algorithm, it is difficult to find an optimal rate of sybil vehicles, which will be injected from different approaches of an intersection. Thus, it is necessary to develop an intelligent cyber-attack model to prove the existence of such attacks. In this study, a reinforcement learning based cyber-attack model is developed for a waiting time-based ATSC. Specifically, an RL agent is trained to learn an optimal rate of sybil vehicle injection to create congestion for an approach(s). Our analyses revealed that the RL agent can learn an optimal policy for creating an intelligent attack.

CLApr 7, 2025Code
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

Ahmed Masry, Mohammed Saidul Islam, Mahir Ahmed et al.

Charts are ubiquitous, as people often use them to analyze data, answer questions, and discover critical insights. However, performing complex analytical tasks with charts requires significant perceptual and cognitive effort. Chart Question Answering (CQA) systems automate this process by enabling models to interpret and reason with visual representations of data. However, existing benchmarks like ChartQA lack real-world diversity and have recently shown performance saturation with modern large vision-language models (LVLMs). To address these limitations, we introduce ChartQAPro, a new benchmark that includes 1,341 charts from 157 diverse sources, spanning various chart types, including infographics and dashboards, and featuring 1,948 questions in various types, such as multiple-choice, conversational, hypothetical, and unanswerable questions, to better reflect real-world challenges. Our evaluations with 21 models show a substantial performance drop for LVLMs on ChartQAPro; e.g., Claude Sonnet 3.5 scores 90.5% on ChartQA but only 55.81% on ChartQAPro, underscoring the complexity of chart reasoning. We complement our findings with detailed error analyses and ablation studies, identifying key challenges and opportunities for advancing LVLMs in chart understanding and reasoning. We release ChartQAPro at https://github.com/vis-nlp/ChartQAPro.

CLNov 27, 2023
FakeWatch ElectionShield: A Benchmarking Framework to Detect Fake News for Credible US Elections

Tahniat Khan, Mizanur Rahman, Veronica Chatrath et al.

In today's technologically driven world, the spread of fake news, particularly during crucial events such as elections, presents an increasing challenge to the integrity of information. To address this challenge, we introduce FakeWatch ElectionShield, an innovative framework carefully designed to detect fake news. We have created a novel dataset of North American election-related news articles through a blend of advanced language models (LMs) and thorough human verification, for precision and relevance. We propose a model hub of LMs for identifying fake news. Our goal is to provide the research community with adaptable and accurate classification models in recognizing the dynamic nature of misinformation. Extensive evaluation of fake news classifiers on our dataset and a benchmark dataset shows our that while state-of-the-art LMs slightly outperform the traditional ML models, classical models are still competitive with their balance of accuracy, explainability, and computational efficiency. This research sets the foundation for future studies to address misinformation related to elections.

AISep 9, 2022
Audio Analytics-based Human Trafficking Detection Framework for Autonomous Vehicles

Sagar Dasgupta, Kazi Shakib, Mizanur Rahman et al.

Human trafficking is a universal problem, persistent despite numerous efforts to combat it globally. Individuals of any age, race, ethnicity, sex, gender identity, sexual orientation, nationality, immigration status, cultural background, religion, socioeconomic class, and education can be a victim of human trafficking. With the advancements in technology and the introduction of autonomous vehicles (AVs), human traffickers will adopt new ways to transport victims, which could accelerate the growth of organized human trafficking networks, which can make the detection of trafficking in persons more challenging for law enforcement agencies. The objective of this study is to develop an innovative audio analytics-based human trafficking detection framework for autonomous vehicles. The primary contributions of this study are to: (i) define four non-trivial, feasible, and realistic human trafficking scenarios for AVs; (ii) create a new and comprehensive audio dataset related to human trafficking with five classes i.e., crying, screaming, car door banging, car noise, and conversation; and (iii) develop a deep 1-D Convolution Neural Network (CNN) architecture for audio data classification related to human trafficking. We have also conducted a case study using the new audio dataset and evaluated the audio classification performance of the deep 1-D CNN. Our analyses reveal that the deep 1-D CNN can distinguish sound coming from a human trafficking victim from a non-human trafficking sound with an accuracy of 95%, which proves the efficacy of our framework.

85.8CLApr 21Code
Lost in Translation: Do LVLM Judges Generalize Across Languages?

Md Tahmid Rahman Laskar, Mohammed Saidul Islam, Mir Tafseer Nayeem et al.

Automatic evaluators such as reward models play a central role in the alignment and evaluation of large vision-language models (LVLMs). Despite their growing importance, these evaluators are almost exclusively assessed on English-centric benchmarks, leaving open the question of how well these evaluators generalize across languages. To answer this question, we introduce MM-JudgeBench, the first large-scale benchmark for multilingual and multimodal judge model evaluation, which includes over 60K pairwise preference instances spanning 25 typologically diverse languages. MM-JudgeBench integrates two complementary subsets: a general vision-language preference evaluation subset extending VL-RewardBench, and a chart-centric visual-text reasoning subset derived from OpenCQA, enabling systematic analysis of reward models (i.e., LVLM judges) across diverse settings. We additionally release a multilingual training set derived from MM-RewardBench, disjoint from our evaluation data, to support domain adaptation. By evaluating 22 LVLMs (15 open-source, 7 proprietary), we uncover substantial cross-lingual performance variance in our proposed benchmark. Our analysis further shows that model size and architecture are poor predictors of multilingual robustness, and that even state-of-the-art LVLM judges exhibit inconsistent behavior across languages. Together, these findings expose fundamental limitations of current reward modeling and underscore the necessity of multilingual, multimodal benchmarks for developing reliable automated evaluators.

CLMay 13, 2025Code
Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?

Md Tahmid Rahman Laskar, Mohammed Saidul Islam, Ridwan Mahbub et al.

Charts are ubiquitous as they help people understand and reason with data. Recently, various downstream tasks, such as chart question answering, chart2text, and fact-checking, have emerged. Large Vision-Language Models (LVLMs) show promise in tackling these tasks, but their evaluation is costly and time-consuming, limiting real-world deployment. While using LVLMs as judges to assess the chart comprehension capabilities of other LVLMs could streamline evaluation processes, challenges like proprietary datasets, restricted access to powerful models, and evaluation costs hinder their adoption in industrial settings. To this end, we present a comprehensive evaluation of 13 open-source LVLMs as judges for diverse chart comprehension and reasoning tasks. We design both pairwise and pointwise evaluation tasks covering criteria like factual correctness, informativeness, and relevancy. Additionally, we analyze LVLM judges based on format adherence, positional consistency, length bias, and instruction-following. We focus on cost-effective LVLMs (<10B parameters) suitable for both research and commercial use, following a standardized evaluation protocol and rubric to measure the LVLM judge's accuracy. Experimental results reveal notable variability: while some open LVLM judges achieve GPT-4-level evaluation performance (about 80% agreement with GPT-4 judgments), others struggle (below ~10% agreement). Our findings highlight that state-of-the-art open-source LVLMs can serve as cost-effective automatic evaluators for chart-related tasks, though biases such as positional preference and length bias persist.

CLJul 26, 2025Code
Text2Vis: A Challenging and Diverse Benchmark for Generating Multimodal Visualizations from Text

Mizanur Rahman, Md Tahmid Rahman Laskar, Shafiq Joty et al.

Automated data visualization plays a crucial role in simplifying data interpretation, enhancing decision-making, and improving efficiency. While large language models (LLMs) have shown promise in generating visualizations from natural language, the absence of comprehensive benchmarks limits the rigorous evaluation of their capabilities. We introduce Text2Vis, a benchmark designed to assess text-to-visualization models, covering 20+ chart types and diverse data science queries, including trend analysis, correlation, outlier detection, and predictive analytics. It comprises 1,985 samples, each with a data table, natural language query, short answer, visualization code, and annotated charts. The queries involve complex reasoning, conversational turns, and dynamic data retrieval. We benchmark 11 open-source and closed-source models, revealing significant performance gaps, highlighting key challenges, and offering insights for future advancements. To close this gap, we propose the first cross-modal actor-critic agentic framework that jointly refines the textual answer and visualization code, increasing GPT-4o`s pass rate from 26% to 42% over the direct approach and improving chart quality. We also introduce an automated LLM-based evaluation framework that enables scalable assessment across thousands of samples without human annotation, measuring answer correctness, code execution success, visualization readability, and chart accuracy. We release Text2Vis at https://github.com/vis-nlp/Text2Vis.

CRJun 1, 2025Code
A Large Language Model-Supported Threat Modeling Framework for Transportation Cyber-Physical Systems

M Sabbir Salek, Mashrur Chowdhury, Muhaimin Bin Munir et al.

Existing threat modeling frameworks related to transportation cyber-physical systems (CPS) are often narrow in scope, labor-intensive, and require substantial cybersecurity expertise. To this end, we introduce the Transportation Cybersecurity and Resiliency Threat Modeling Framework (TraCR-TMF), a large language model (LLM)-based threat modeling framework for transportation CPS that requires limited cybersecurity expert intervention. TraCR-TMF identifies threats, potential attack techniques, and relevant countermeasures for transportation CPS. Three LLM-based approaches support these identifications: (i) a retrieval-augmented generation approach requiring no cybersecurity expert intervention, (ii) an in-context learning approach with low expert intervention, and (iii) a supervised fine-tuning approach with moderate expert intervention. TraCR-TMF offers LLM-based attack path identification for critical assets based on vulnerabilities across transportation CPS entities. Additionally, it incorporates the Common Vulnerability Scoring System (CVSS) scores of known exploited vulnerabilities to prioritize threat mitigations. The framework was evaluated through two cases. First, the framework identified relevant attack techniques for various transportation CPS applications, 73% of which were validated by cybersecurity experts as correct. Second, the framework was used to identify attack paths for a target asset in a real-world cyberattack incident. TraCR-TMF successfully predicted exploitations, like lateral movement of adversaries, data exfiltration, and data encryption for ransomware, as reported in the incident. These findings show the efficacy of TraCR-TMF in transportation CPS threat modeling, while reducing the need for extensive involvement of cybersecurity experts. To facilitate real-world adoptions, all our codes are shared via an open-source repository.

CLAug 24, 2025Code
DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards

Aaryaman Kartha, Ahmed Masry, Mohammed Saidul Islam et al.

Dashboards are powerful visualization tools for data-driven decision-making, integrating multiple interactive views that allow users to explore, filter, and navigate data. Unlike static charts, dashboards support rich interactivity, which is essential for uncovering insights in real-world analytical workflows. However, existing question-answering benchmarks for data visualizations largely overlook this interactivity, focusing instead on static charts. This limitation severely constrains their ability to evaluate the capabilities of modern multimodal agents designed for GUI-based reasoning. To address this gap, we introduce DashboardQA, the first benchmark explicitly designed to assess how vision-language GUI agents comprehend and interact with real-world dashboards. The benchmark includes 112 interactive dashboards from Tableau Public and 405 question-answer pairs with interactive dashboards spanning five categories: multiple-choice, factoid, hypothetical, multi-dashboard, and conversational. By assessing a variety of leading closed- and open-source GUI agents, our analysis reveals their key limitations, particularly in grounding dashboard elements, planning interaction trajectories, and performing reasoning. Our findings indicate that interactive dashboard reasoning is a challenging task overall for all the VLMs evaluated. Even the top-performing agents struggle; for instance, the best agent based on Gemini-Pro-2.5 achieves only 38.69% accuracy, while the OpenAI CUA agent reaches just 22.69%, demonstrating the benchmark's significant difficulty. We release DashboardQA at https://github.com/vis-nlp/DashboardQA

CLAug 13, 2025Code
From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text

Ridwan Mahbub, Mohammed Saidul Islam, Mir Tafseer Nayeem et al.

Charts are very common for exploring data and communicating insights, but extracting key takeaways from charts and articulating them in natural language can be challenging. The chart-to-text task aims to automate this process by generating textual summaries of charts. While with the rapid advancement of large Vision-Language Models (VLMs), we have witnessed great progress in this domain, little to no attention has been given to potential biases in their outputs. This paper investigates how VLMs can amplify geo-economic biases when generating chart summaries, potentially causing societal harm. Specifically, we conduct a large-scale evaluation of geo-economic biases in VLM-generated chart summaries across 6,000 chart-country pairs from six widely used proprietary and open-source models to understand how a country's economic status influences the sentiment of generated summaries. Our analysis reveals that existing VLMs tend to produce more positive descriptions for high-income countries compared to middle- or low-income countries, even when country attribution is the only variable changed. We also find that models such as GPT-4o-mini, Gemini-1.5-Flash, and Phi-3.5 exhibit varying degrees of bias. We further explore inference-time prompt-based debiasing techniques using positive distractors but find them only partially effective, underscoring the complexity of the issue and the need for more robust debiasing strategies. Our code and dataset are publicly available here.

CLJun 6, 2024Code
BEADs: Bias Evaluation Across Domains

Shaina Raza, Mizanur Rahman, Michael R. Zhang

Recent advances in large language models (LLMs) have substantially improved natural language processing (NLP) applications. However, these models often inherit and amplify biases present in their training data. Although several datasets exist for bias detection, most are limited to one or two NLP tasks, typically classification or evaluation and do not provide broad coverage across diverse task settings. To address this gap, we introduce the \textbf{Bias Evaluations Across Domains} (\textbf{B}\texttt{EADs}) dataset, designed to support a wide range of NLP tasks, including text classification, token classification, bias quantification, and benign language generation. A key contribution of this work is a gold-standard annotation scheme that supports both evaluation and supervised training of language models. Experiments on state-of-the-art models reveal some gaps: some models exhibit systematic bias toward specific demographics, while others apply safety guardrails more strictly or inconsistently across groups. Overall, these results highlight persistent shortcomings in current models and underscore the need for comprehensive bias evaluation. Project: https://vectorinstitute.github.io/BEAD/ Data: https://huggingface.co/datasets/shainar/BEAD

CLMar 14, 2024Code
FakeWatch: A Framework for Detecting Fake News to Ensure Credible Elections

Shaina Raza, Tahniat Khan, Veronica Chatrath et al.

In today's technologically driven world, the rapid spread of fake news, particularly during critical events like elections, poses a growing threat to the integrity of information. To tackle this challenge head-on, we introduce FakeWatch, a comprehensive framework carefully designed to detect fake news. Leveraging a newly curated dataset of North American election-related news articles, we construct robust classification models. Our framework integrates a model hub comprising of both traditional machine learning (ML) techniques, and state-of-the-art Language Models (LMs) to discern fake news effectively. Our objective is to provide the research community with adaptable and precise classification models adept at identifying fake news for the elections agenda. Quantitative evaluations of fake news classifiers on our dataset reveal that, while state-of-the-art LMs exhibit a slight edge over traditional ML models, classical models remain competitive due to their balance of accuracy and computational efficiency. Additionally, qualitative analyses shed light on patterns within fake news articles. We provide our labeled data at https://huggingface.co/datasets/newsmediabias/fake_news_elections_labelled_data and model https://huggingface.co/newsmediabias/FakeWatch for reproducibility and further research.

CVNov 14, 2025
Toward Generalized Detection of Synthetic Media: Limitations, Challenges, and the Path to Multimodal Solutions

Redwan Hussain, Mizanur Rahman, Prithwiraj Bhattacharjee

Artificial intelligence (AI) in media has advanced rapidly over the last decade. The introduction of Generative Adversarial Networks (GANs) improved the quality of photorealistic image generation. Diffusion models later brought a new era of generative media. These advances made it difficult to separate real and synthetic content. The rise of deepfakes demonstrated how these tools could be misused to spread misinformation, political conspiracies, privacy violations, and fraud. For this reason, many detection models have been developed. They often use deep learning methods such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). These models search for visual, spatial, or temporal anomalies. However, such approaches often fail to generalize across unseen data and struggle with content from different models. In addition, existing approaches are ineffective in multimodal data and highly modified content. This study reviews twenty-four recent works on AI-generated media detection. Each study was examined individually to identify its contributions and weaknesses, respectively. The review then summarizes the common limitations and key challenges faced by current approaches. Based on this analysis, a research direction is suggested with a focus on multimodal deep learning models. Such models have the potential to provide more robust and generalized detection. It offers future researchers a clear starting point for building stronger defenses against harmful synthetic media.

CRJan 2, 2024
Experimental Validation of Sensor Fusion-based GNSS Spoofing Attack Detection Framework for Autonomous Vehicles

Sagar Dasgupta, Kazi Hassan Shakib, Mizanur Rahman

In this paper, we validate the performance of the a sensor fusion-based Global Navigation Satellite System (GNSS) spoofing attack detection framework for Autonomous Vehicles (AVs). To collect data, a vehicle equipped with a GNSS receiver, along with Inertial Measurement Unit (IMU) is used. The detection framework incorporates two strategies: The first strategy involves comparing the predicted location shift, which is the distance traveled between two consecutive timestamps, with the inertial sensor-based location shift. For this purpose, data from low-cost in-vehicle inertial sensors such as the accelerometer and gyroscope sensor are fused and fed into a long short-term memory (LSTM) neural network. The second strategy employs a Random-Forest supervised machine learning model to detect and classify turns, distinguishing between left and right turns using the output from the steering angle sensor. In experiments, two types of spoofing attack models: turn-by-turn and wrong turn are simulated. These spoofing attacks are modeled as SQL injection attacks, where, upon successful implementation, the navigation system perceives injected spoofed location information as legitimate while being unable to detect legitimate GNSS signals. Importantly, the IMU data remains uncompromised throughout the spoofing attack. To test the effectiveness of the detection framework, experiments are conducted in Tuscaloosa, AL, mimicking urban road structures. The results demonstrate the framework's ability to detect various sophisticated GNSS spoofing attacks, even including slow position drifting attacks. Overall, the experimental results showcase the robustness and efficacy of the sensor fusion-based spoofing attack detection approach in safeguarding AVs against GNSS spoofing threats.

AIOct 5, 2025
LLM-Based Data Science Agents: A Survey of Capabilities, Challenges, and Future Directions

Mizanur Rahman, Amran Bhuiyan, Mohammed Saidul Islam et al.

Recent advances in large language models (LLMs) have enabled a new class of AI agents that automate multiple stages of the data science workflow by integrating planning, tool use, and multimodal reasoning across text, code, tables, and visuals. This survey presents the first comprehensive, lifecycle-aligned taxonomy of data science agents, systematically analyzing and mapping forty-five systems onto the six stages of the end-to-end data science process: business understanding and data acquisition, exploratory analysis and visualization, feature engineering, model building and selection, interpretation and explanation, and deployment and monitoring. In addition to lifecycle coverage, we annotate each agent along five cross-cutting design dimensions: reasoning and planning style, modality integration, tool orchestration depth, learning and alignment methods, and trust, safety, and governance mechanisms. Beyond classification, we provide a critical synthesis of agent capabilities, highlight strengths and limitations at each stage, and review emerging benchmarks and evaluation practices. Our analysis identifies three key trends: most systems emphasize exploratory analysis, visualization, and modeling while neglecting business understanding, deployment, and monitoring; multimodal reasoning and tool orchestration remain unresolved challenges; and over 90% lack explicit trust and safety mechanisms. We conclude by outlining open challenges in alignment stability, explainability, governance, and robust evaluation frameworks, and propose future research directions to guide the development of robust, trustworthy, low-latency, transparent, and broadly accessible data science agents.

CLAug 13, 2025
The Perils of Chart Deception: How Misleading Visualizations Affect Vision-Language Models

Ridwan Mahbub, Mohammed Saidul Islam, Md Tahmid Rahman Laskar et al.

Information visualizations are powerful tools that help users quickly identify patterns, trends, and outliers, facilitating informed decision-making. However, when visualizations incorporate deceptive design elements-such as truncated or inverted axes, unjustified 3D effects, or violations of best practices-they can mislead viewers and distort understanding, spreading misinformation. While some deceptive tactics are obvious, others subtly manipulate perception while maintaining a facade of legitimacy. As Vision-Language Models (VLMs) are increasingly used to interpret visualizations, especially by non-expert users, it is critical to understand how susceptible these models are to deceptive visual designs. In this study, we conduct an in-depth evaluation of VLMs' ability to interpret misleading visualizations. By analyzing over 16,000 responses from ten different models across eight distinct types of misleading chart designs, we demonstrate that most VLMs are deceived by them. This leads to altered interpretations of charts, despite the underlying data remaining the same. Our findings highlight the need for robust safeguards in VLMs against visual misinformation.

CLOct 8, 2025
Deploying Tiny LVLM Judges for Real-World Evaluation of Chart Models: Lessons Learned and Best Practices

Md Tahmid Rahman Laskar, Mohammed Saidul Islam, Ridwan Mahbub et al.

Large Vision-Language Models (LVLMs) with only 7B parameters have shown promise as automated judges in chart comprehension tasks. However, tiny models (<=2B parameters) still perform poorly as judges, limiting their real-world use in resource-constrained settings. To address this, we propose two approaches to ensure cost-efficient evaluation: (i) multi-criteria prompting, which combines separate evaluation criteria into a single query, and (ii) domain-adaptive transfer learning, in which we fine-tune a 2B-parameter LVLM on synthetic judgments in a chart dataset to create the ChartJudge. Experiments show that multi-criteria prompting exposes robustness gaps, which led to a huge drop in performance for 7B models, including specialized LVLM judges like LLaVA-Critic. In addition, we find that our tiny LVLM (ChartJudge) can effectively transfer knowledge from one dataset to another to make it a more specialized model. Our fine-grained analysis across chart types and query complexities offers actionable insights into trade-offs between model size, prompt design, and transferability, enabling scalable, low-cost evaluation for chart reasoning tasks.

CVSep 29, 2025
Infrastructure Sensor-enabled Vehicle Data Generation using Multi-Sensor Fusion for Proactive Safety Applications at Work Zone

Suhala Rabab Saba, Sakib Khan, Minhaj Uddin Ahmad et al.

Infrastructure-based sensing and real-time trajectory generation show promise for improving safety in high-risk roadway segments such as work zones, yet practical deployments are hindered by perspective distortion, complex geometry, occlusions, and costs. This study tackles these barriers by integrating roadside camera and LiDAR sensors into a cosimulation environment to develop a scalable, cost-effective vehicle detection and localization framework, and employing a Kalman Filter-based late fusion strategy to enhance trajectory consistency and accuracy. In simulation, the fusion algorithm reduced longitudinal error by up to 70 percent compared to individual sensors while preserving lateral accuracy within 1 to 3 meters. Field validation in an active work zone, using LiDAR, a radar-camera rig, and RTK-GPS as ground truth, demonstrated that the fused trajectories closely match real vehicle paths, even when single-sensor data are intermittent or degraded. These results confirm that KF based sensor fusion can reliably compensate for individual sensor limitations, providing precise and robust vehicle tracking capabilities. Our approach thus offers a practical pathway to deploy infrastructure-enabled multi-sensor systems for proactive safety measures in complex traffic environments.

LGAug 25, 2025
Quantum-Classical Hybrid Framework for Zero-Day Time-Push GNSS Spoofing Detection

Abyad Enan, Mashrur Chowdhury, Sagar Dasgupta et al.

Global Navigation Satellite Systems (GNSS) are critical for Positioning, Navigation, and Timing (PNT) applications. However, GNSS are highly vulnerable to spoofing attacks, where adversaries transmit counterfeit signals to mislead receivers. Such attacks can lead to severe consequences, including misdirected navigation, compromised data integrity, and operational disruptions. Most existing spoofing detection methods depend on supervised learning techniques and struggle to detect novel, evolved, and unseen attacks. To overcome this limitation, we develop a zero-day spoofing detection method using a Hybrid Quantum-Classical Autoencoder (HQC-AE), trained solely on authentic GNSS signals without exposure to spoofed data. By leveraging features extracted during the tracking stage, our method enables proactive detection before PNT solutions are computed. We focus on spoofing detection in static GNSS receivers, which are particularly susceptible to time-push spoofing attacks, where attackers manipulate timing information to induce incorrect time computations at the receiver. We evaluate our model against different unseen time-push spoofing attack scenarios: simplistic, intermediate, and sophisticated. Our analysis demonstrates that the HQC-AE consistently outperforms its classical counterpart, traditional supervised learning-based models, and existing unsupervised learning-based methods in detecting zero-day, unseen GNSS time-push spoofing attacks, achieving an average detection accuracy of 97.71% with an average false negative rate of 0.62% (when an attack occurs but is not detected). For sophisticated spoofing attacks, the HQC-AE attains an accuracy of 98.23% with a false negative rate of 1.85%. These findings highlight the effectiveness of our method in proactively detecting zero-day GNSS time-push spoofing attacks across various stationary GNSS receiver platforms.

LGAug 11, 2025
Vision-Based Localization and LLM-based Navigation for Indoor Environments

Keyan Rahimi, Md. Wasiul Haque, Sagar Dasgupta et al.

Indoor navigation remains a complex challenge due to the absence of reliable GPS signals and the architectural intricacies of large enclosed environments. This study presents an indoor localization and navigation approach that integrates vision-based localization with large language model (LLM)-based navigation. The localization system utilizes a ResNet-50 convolutional neural network fine-tuned through a two-stage process to identify the user's position using smartphone camera input. To complement localization, the navigation module employs an LLM, guided by a carefully crafted system prompt, to interpret preprocessed floor plan images and generate step-by-step directions. Experimental evaluation was conducted in a realistic office corridor with repetitive features and limited visibility to test localization robustness. The model achieved high confidence and an accuracy of 96% across all tested waypoints, even under constrained viewing conditions and short-duration queries. Navigation tests using ChatGPT on real building floor maps yielded an average instruction accuracy of 75%, with observed limitations in zero-shot reasoning and inference time. This research demonstrates the potential for scalable, infrastructure-free indoor navigation using off-the-shelf cameras and publicly available floor plans, particularly in resource-constrained settings like hospitals, airports, and educational institutions.

LGAug 11, 2025
Grid2Guide: A* Enabled Small Language Model for Indoor Navigation

Md. Wasiul Haque, Sagar Dasgupta, Mizanur Rahman

Reliable indoor navigation remains a significant challenge in complex environments, particularly where external positioning signals and dedicated infrastructures are unavailable. This research presents Grid2Guide, a hybrid navigation framework that combines the A* search algorithm with a Small Language Model (SLM) to generate clear, human-readable route instructions. The framework first conducts a binary occupancy matrix from a given indoor map. Using this matrix, the A* algorithm computes the optimal path between origin and destination, producing concise textual navigation steps. These steps are then transformed into natural language instructions by the SLM, enhancing interpretability for end users. Experimental evaluations across various indoor scenarios demonstrate the method's effectiveness in producing accurate and timely navigation guidance. The results validate the proposed approach as a lightweight, infrastructure-free solution for real-time indoor navigation support.

CVJun 16, 2025
Evolution of ReID: From Early Methods to LLM Integration

Amran Bhuiyan, Mizanur Rahman, Md Tahmid Rahman Laskar et al.

Person re-identification (ReID) has evolved from handcrafted feature-based methods to deep learning approaches and, more recently, to models incorporating large language models (LLMs). Early methods struggled with variations in lighting, pose, and viewpoint, but deep learning addressed these issues by learning robust visual features. Building on this, LLMs now enable ReID systems to integrate semantic and contextual information through natural language. This survey traces that full evolution and offers one of the first comprehensive reviews of ReID approaches that leverage LLMs, where textual descriptions are used as privileged information to improve visual matching. A key contribution is the use of dynamic, identity-specific prompts generated by GPT-4o, which enhance the alignment between images and text in vision-language ReID systems. Experimental results show that these descriptions improve accuracy, especially in complex or ambiguous cases. To support further research, we release a large set of GPT-4o-generated descriptions for standard ReID datasets. By bridging computer vision and natural language processing, this survey offers a unified perspective on the field's development and outlines key future directions such as better prompt design, cross-modal transfer learning, and real-world adaptability.

CLMay 23, 2025
Retrieval Augmented Generation-based Large Language Models for Bridging Transportation Cybersecurity Legal Knowledge Gaps

Khandakar Ashrafi Akbar, Md Nahiyan Uddin, Latifur Khan et al.

As connected and automated transportation systems evolve, there is a growing need for federal and state authorities to revise existing laws and develop new statutes to address emerging cybersecurity and data privacy challenges. This study introduces a Retrieval-Augmented Generation (RAG) based Large Language Model (LLM) framework designed to support policymakers by extracting relevant legal content and generating accurate, inquiry-specific responses. The framework focuses on reducing hallucinations in LLMs by using a curated set of domain-specific questions to guide response generation. By incorporating retrieval mechanisms, the system enhances the factual grounding and specificity of its outputs. Our analysis shows that the proposed RAG-based LLM outperforms leading commercial LLMs across four evaluation metrics: AlignScore, ParaScore, BERTScore, and ROUGE, demonstrating its effectiveness in producing reliable and context-aware legal insights. This approach offers a scalable, AI-driven method for legislative analysis, supporting efforts to update legal frameworks in line with advancements in transportation technologies.

CLMay 29, 2023
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets

Md Tahmid Rahman Laskar, M Saiful Bari, Mizanur Rahman et al.

The development of large language models (LLMs) such as ChatGPT has brought a lot of attention recently. However, their evaluation in the benchmark academic datasets remains under-explored due to the difficulty of evaluating the generative outputs produced by this model against the ground truth. In this paper, we aim to present a thorough evaluation of ChatGPT's performance on diverse academic datasets, covering tasks like question-answering, text summarization, code generation, commonsense reasoning, mathematical problem-solving, machine translation, bias detection, and ethical considerations. Specifically, we evaluate ChatGPT across 140 tasks and analyze 255K responses it generates in these datasets. This makes our work the largest evaluation of ChatGPT in NLP benchmarks. In short, our study aims to validate the strengths and weaknesses of ChatGPT in various tasks and provide insights for future research using LLMs. We also report a new emergent ability to follow multi-query instructions that we mostly found in ChatGPT and other instruction-tuned models. Our extensive evaluation shows that even though ChatGPT is capable of performing a wide variety of tasks, and may obtain impressive performance in several benchmark datasets, it is still far from achieving the ability to reliably solve many challenging tasks. By providing a thorough assessment of ChatGPT's performance across diverse NLP tasks, this paper sets the stage for a targeted deployment of ChatGPT-like LLMs in real-world applications.

SPAug 19, 2021
A Sensor Fusion-based GNSS Spoofing Attack Detection Framework for Autonomous Vehicles

Sagar Dasgupta, Mizanur Rahman, Mhafuzul Islam et al.

This paper presents a sensor fusion based Global Navigation Satellite System (GNSS) spoofing attack detection framework for autonomous vehicles (AV) that consists of two concurrent strategies: (i) detection of vehicle state using predicted location shift -- i.e., distance traveled between two consecutive timestamps -- and monitoring of vehicle motion state -- i.e., standstill/ in motion; and (ii) detection and classification of turns (i.e., left or right). Data from multiple low-cost in-vehicle sensors (i.e., accelerometer, steering angle sensor, speed sensor, and GNSS) are fused and fed into a recurrent neural network model, which is a long short-term memory (LSTM) network for predicting the location shift, i.e., the distance that an AV travels between two consecutive timestamps. This location shift is then compared with the GNSS-based location shift to detect an attack. We have then combined k-Nearest Neighbors (k-NN) and Dynamic Time Warping (DTW) algorithms to detect and classify left and right turns using data from the steering angle sensor. To prove the efficacy of the sensor fusion-based attack detection framework, attack datasets are created for four unique and sophisticated spoofing attacks-turn-by-turn, overshoot, wrong turn, and stop, using the publicly available real-world Honda Research Institute Driving Dataset (HDD). Our analysis reveals that the sensor fusion-based detection framework successfully detects all four types of spoofing attacks within the required computational latency threshold.

SPAug 19, 2021
A Reinforcement Learning Approach for GNSS Spoofing Attack Detection of Autonomous Vehicles

Sagar Dasgupta, Tonmoy Ghosh, Mizanur Rahman

A resilient and robust positioning, navigation, and timing (PNT) system is a necessity for the navigation of autonomous vehicles (AVs). Global Navigation Satelite System (GNSS) provides satellite-based PNT services. However, a spoofer can temper an authentic GNSS signal and could transmit wrong position information to an AV. Therefore, a GNSS must have the capability of real-time detection and feedback-correction of spoofing attacks related to PNT receivers, whereby it will help the end-user (autonomous vehicle in this case) to navigate safely if it falls into any compromises. This paper aims to develop a deep reinforcement learning (RL)-based turn-by-turn spoofing attack detection using low-cost in-vehicle sensor data. We have utilized Honda Driving Dataset to create attack and non-attack datasets, develop a deep RL model, and evaluate the performance of the RL-based attack detection model. We find that the accuracy of the RL model ranges from 99.99% to 100%, and the recall value is 100%. However, the precision ranges from 93.44% to 100%, and the f1 score ranges from 96.61% to 100%. Overall, the analyses reveal that the RL model is effective in turn-by-turn spoofing attack detection.

LGAug 19, 2021
An Innovative Attack Modelling and Attack Detection Approach for a Waiting Time-based Adaptive Traffic Signal Controller

Sagar Dasgupta, Courtland Hollis, Mizanur Rahman et al.

An adaptive traffic signal controller (ATSC) combined with a connected vehicle (CV) concept uses real-time vehicle trajectory data to regulate green time and has the ability to reduce intersection waiting time significantly and thereby improve travel time in a signalized corridor. However, the CV-based ATSC increases the size of the surface vulnerable to potential cyber-attack, allowing an attacker to generate disastrous traffic congestion in a roadway network. An attacker can congest a route by generating fake vehicles by maintaining traffic and car-following rules at a slow rate so that the signal timing and phase change without having any abrupt changes in number of vehicles. Because of the adaptive nature of ATSC, it is a challenge to model this kind of attack and also to develop a strategy for detection. This paper introduces an innovative "slow poisoning" cyberattack for a waiting time based ATSC algorithm and a corresponding detection strategy. Thus, the objectives of this paper are to: (i) develop a "slow poisoning" attack generation strategy for an ATSC, and (ii) develop a prediction-based "slow poisoning" attack detection strategy using a recurrent neural network -- i.e., long short-term memory model. We have generated a "slow poisoning" attack modeling strategy using a microscopic traffic simulator -- Simulation of Urban Mobility (SUMO) -- and used generated data from the simulation to develop both the attack model and detection model. Our analyses revealed that the attack strategy is effective in creating a congestion in an approach and detection strategy is able to flag the attack.

CRJun 5, 2021
Sensor Fusion-based GNSS Spoofing Attack Detection Framework for Autonomous Vehicles

Sagar Dasgupta, Mizanur Rahman, Mhafuzul Islam et al.

In this study, a sensor fusion based GNSS spoofing attack detection framework is presented that consists of three concurrent strategies for an autonomous vehicle (AV): (i) prediction of location shift, (ii) detection of turns (left or right), and (iii) recognition of motion state (including standstill state). Data from multiple low-cost in-vehicle sensors (i.e., accelerometer, steering angle sensor, speed sensor, and GNSS) are fused and fed into a recurrent neural network model, which is a long short-term memory (LSTM) network for predicting the location shift, i.e., the distance that an AV travels between two consecutive timestamps. We have then combined k-Nearest Neighbors (k-NN) and Dynamic Time Warping (DTW) algorithms to detect turns using data from the steering angle sensor. In addition, data from an AV's speed sensor is used to recognize the AV's motion state including the standstill state. To prove the efficacy of the sensor fusion-based attack detection framework, attack datasets are created for three unique and sophisticated spoofing attacks turn by turn, overshoot, and stop using the publicly available real-world Honda Research Institute Driving Dataset (HDD). Our analysis reveals that the sensor fusion-based detection framework successfully detects all three types of spoofing attacks within the required computational latency threshold.

ROOct 16, 2020
Prediction-Based GNSS Spoofing Attack Detection for Autonomous Vehicles

Sagar Dasgupta, Mizanur Rahman, Mhafuzul Islam et al.

Global Navigation Satellite System (GNSS) provides Positioning, Navigation, and Timing (PNT) services for autonomous vehicles (AVs) using satellites and radio communications. Due to the lack of encryption, open-access of the coarse acquisition (C/A) codes, and low strength of the signal, GNSS is vulnerable to spoofing attacks compromising the navigational capability of the AV. A spoofed attack is difficult to detect as a spoofer (attacker who performs spoofing attack) can mimic the GNSS signal and transmit inaccurate location coordinates to an AV. In this study, we have developed a prediction-based spoofing attack detection strategy using the long short-term memory (LSTM) model, a recurrent neural network model. The LSTM model is used to predict the distance traveled between two consecutive locations of an autonomous vehicle. In order to develop the LSTM prediction model, we have used a publicly available real-world comma2k19 driving dataset. The training dataset contains different features (i.e., acceleration, steering wheel angle, speed, and distance traveled between two consecutive locations) extracted from the controlled area network (CAN), GNSS, and inertial measurement unit (IMU) sensors of AVs. Based on the predicted distance traveled between the current location and the immediate future location of an autonomous vehicle, a threshold value is established using the positioning error of the GNSS device and prediction error (i.e., maximum absolute error) related to distance traveled between the current location and the immediate future location. Our analysis revealed that the prediction-based spoofed attack detection strategy can successfully detect the attack in real-time.

CRMar 5, 2020
Change Point Models for Real-time Cyber Attack Detection in Connected Vehicle Environment

Gurcan Comert, Mizanur Rahman, Mhafuzul Islam et al.

Connected vehicle (CV) systems are cognizant of potential cyber attacks because of increasing connectivity between its different components such as vehicles, roadside infrastructure, and traffic management centers. However, it is a challenge to detect security threats in real-time and develop appropriate or effective countermeasures for a CV system because of the dynamic behavior of such attacks, high computational power requirement, and a historical data requirement for training detection models. To address these challenges, statistical models, especially change point models, have potentials for real-time anomaly detections. Thus, the objective of this study is to investigate the efficacy of two change point models, Expectation Maximization (EM) and two forms of Cumulative Summation (CUSUM) algorithms (i.e., typical and adaptive), for real-time V2I cyber attack detection in a CV Environment. To prove the efficacy of these models, we evaluated these two models for three different type of cyber attack, denial of service (DOS), impersonation, and false information, using basic safety messages (BSMs) generated from CVs through simulation. Results from numerical analysis revealed that EM, CUSUM, and adaptive CUSUM could detect these cyber attacks, DOS, impersonation, and false information, with an accuracy of (99%, 100%, 100%), (98%, 10%, 100%), and (100%, 98%, 100%) respectively.

CVJan 29, 2020
Dynamic Error-bounded Lossy Compression (EBLC) to Reduce the Bandwidth Requirement for Real-time Vision-based Pedestrian Safety Applications

Mizanur Rahman, Mhafuzul Islam, Jon C. Calhoun et al.

As camera quality improves and their deployment moves to areas with limited bandwidth, communication bottlenecks can impair real-time constraints of an ITS application, such as video-based real-time pedestrian detection. Video compression reduces the bandwidth requirement to transmit the video but degrades the video quality. As the quality level of the video decreases, it results in the corresponding decreases in the accuracy of the vision-based pedestrian detection model. Furthermore, environmental conditions (e.g., rain and darkness) alter the compression ratio and can make maintaining a high pedestrian detection accuracy more difficult. The objective of this study is to develop a real-time error-bounded lossy compression (EBLC) strategy to dynamically change the video compression level depending on different environmental conditions in order to maintain a high pedestrian detection accuracy. We conduct a case study to show the efficacy of our dynamic EBLC strategy for real-time vision-based pedestrian detection under adverse environmental conditions. Our strategy selects the error tolerances dynamically for lossy compression that can maintain a high detection accuracy across a representative set of environmental conditions. Analyses reveal that our strategy increases pedestrian detection accuracy up to 14% and reduces the communication bandwidth up to 14x for adverse environmental conditions compared to the same conditions but without our dynamic EBLC strategy. Our dynamic EBLC strategy is independent of detection models and environmental conditions allowing other detection models and environmental conditions to be easily incorporated in our strategy.

SYJan 1, 2020
Development of a Connected and Automated Vehicle Longitudinal Control Model

Mizanur Rahman, Md Rafiul Islam, Mashrur Chowdhury et al.

It is envisioned that, in the future, most vehicles on our roadway will be controlled autonomously and will be connected via vehicle to everything (V2X) wireless communication networks. Developing a connected and automated vehicle (CAV) longitudinal controller, which will consider safety, comfort and operational efficiency simultaneously, is a challenge. A CAV longitudinal controller is a complex system where a vehicle senses immediate upstream vehicles using its sensors and receives information about its surroundings via wireless connectivity, and move forward accordingly. In this study, we develop an information-aware driver model (IADM) that utilizes information regarding an immediate upstream vehicle of a subject CAV through CAV sensors and V2X connectivity while considering passenger comfort and operational efficiency along with maintaining safety gap for longitudinal vehicle motion of the autonomous vehicle. Unlike existing driver models for longitudinal control, the IADM intelligently fuses data received from in vehicle sensors, and immediate upstream vehicles of the subject CAV through wireless connectivity, and IADM parameters do not need to be calibrated for different traffic states, such as congested and non congested traffic conditions. It only requires defining the subject CAVs maximum acceleration and deceleration limit, and computation time that is needed to update the subject CAVs trajectory from its previous state. Our analyses suggest that the IADM (i) is able to maintain safety using a newly defined safe gap function depending on the speed and reaction time of a CAV; (ii) shows local stability and string stability and (iii) provides riding comfort for a range of autonomous driving aggressiveness depending on the passenger preferences.

SYDec 29, 2019
Grey Models for Short-Term Queue Length Predictions for Adaptive Traffic Signal Control

Gurcan Comert, Zadid Khan, Mizanur Rahman et al.

Traffic congestion at a signalized intersection greatly reduces the travel time reliability in urban areas. Adaptive signal control system (ASCS) is the most advanced traffic signal technology that regulates the signal phasing and timings considering the patterns in real-time in order to reduce congestion. Real-time prediction of queue lengths can be used to adjust the phasing and timings for different movements at an intersection with ASCS. The accuracy of the prediction varies based on the factors, such as the stochastic nature of the vehicle arrival rates, time of the day, weather and driver characteristics. In addition, accurate prediction for multilane, undersaturated and saturated traffic scenarios is challenging. Thus, the objective of this study is to develop queue length prediction models for signalized intersections that can be leveraged by ASCS using four variations of Grey systems: (i) the first order single variable Grey model (GM(1,1)); (ii) GM(1,1) with Fourier error corrections; (iii) the Grey Verhulst model (GVM), and (iv) GVM with Fourier error corrections. The efficacy of the GM is that they facilitate fast processing; as these models do not require a large amount of data; as would be needed in artificial intelligence models; and they are able to adapt to stochastic changes, unlike statistical models. We have conducted a case study using queue length data from five intersections with ASCS on a calibrated roadway network in Lexington, South Carolina. GM were compared with linear, nonlinear time series models, and long short-term memory (LSTM) neural network. Based on our analyses, we found that EGVM reduces the prediction error over closest competing models (i.e., LSTM and time series models) in predicting average and maximum queue lengths by 40% and 42%, respectively, in terms of Root Mean Squared Error, and 51% and 50%, respectively, in terms of Mean Absolute Error.

CVJul 2, 2019
Vision-based Pedestrian Alert Safety System (PASS) for Signalized Intersections

Mhafuzul Islam, Mizanur Rahman, Mashrur Chowdhury et al.

Although Vehicle-to-Pedestrian (V2P) communication can significantly improve pedestrian safety at a signalized intersection, this safety is hindered as pedestrians often do not carry hand-held devices (e.g., Dedicated short-range communication (DSRC) and 5G enabled cell phone) to communicate with connected vehicles nearby. To overcome this limitation, in this study, traffic cameras at a signalized intersection were used to accurately detect and locate pedestrians via a vision-based deep learning technique to generate safety alerts in real-time about possible conflicts between vehicles and pedestrians. The contribution of this paper lies in the development of a system using a vision-based deep learning model that is able to generate personal safety messages (PSMs) in real-time (every 100 milliseconds). We develop a pedestrian alert safety system (PASS) to generate a safety alert of an imminent pedestrian-vehicle crash using generated PSMs to improve pedestrian safety at a signalized intersection. Our approach estimates the location and velocity of a pedestrian more accurately than existing DSRC-enabled pedestrian hand-held devices. A connected vehicle application, the Pedestrian in Signalized Crosswalk Warning (PSCW), was developed to evaluate the vision-based PASS. Numerical analyses show that our vision-based PASS is able to satisfy the accuracy and latency requirements of pedestrian safety applications in a connected vehicle environment.

CYJun 24, 2019
Long Short-Term Memory Neural Networks for False Information Attack Detection in Software-Defined In-Vehicle Network

Zadid Khan, Mashrur Chowdhury, Mhafuzul Islam et al.

A modern vehicle contains many electronic control units (ECUs), which communicate with each other through the in-vehicle network to ensure vehicle safety and performance. Emerging Connected and Automated Vehicles (CAVs) will have more ECUs and coupling between them due to the vast array of additional sensors, advanced driving features and Vehicle-to-Everything (V2X) connectivity. Due to the connectivity, CAVs will be more vulnerable to remote attackers. In this study, we developed a software-defined in-vehicle Ethernet networking system that provides security against false information attacks. We then created an attack model and attack datasets for false information attacks on brake-related ECUs. After analyzing the attack dataset, we found that the features of the dataset are time-series that have sequential variation patterns. Therefore, we subsequently developed a long short term memory (LSTM) neural network based false information attack/anomaly detection model for the real-time detection of anomalies within the in-vehicle network. This attack detection model can detect false information with an accuracy, precision and recall of 95%, 95% and 87%, respectively, while satisfying the real-time communication and computational requirements.

CRNov 30, 2018
Change Point Models for Real-time V2I Cyber Attack Detection in a Connected Vehicle Environment

Gurcan Comert, Mizanur Rahman, Mhafuzul Islam et al.

Connected vehicle (CV) systems are cognizant of potential cyber attacks because of increasing connectivity between its different components such as vehicles, roadside infrastructure and traffic management centers. However, it is a challenge to detect security threats in real-time and develop appropriate/effective countermeasures for a CV system because of the dynamic behavior of such attacks, high computational power requirement and a historical data requirement for training detection models. To address these challenges, statistical models, especially change point models, have potentials for real-time anomaly detections. Thus, the objective of this study is to investigate the efficacy of two change point models, Expectation Maximization (EM) and Cumulative Sum (CUSUM), for real-time V2I cyber attack detection in a CV Environment. To prove the efficacy of these models, we evaluated these two models for three different type of cyber attack, denial of service (DOS), impersonation, and false information, using basic safety messages (BSMs) generated from CVs through simulation. Results from numerical analysis revealed that EM and CUSUM could detect these cyber attacks, DOS, impersonation, and false information, with an accuracy of 99\%, 100\%, and 98\%, and 100\%, 100\% and 98\%, respectively.

LGNov 8, 2018
Real time Traffic Flow Parameters Prediction with Basic Safety Messages at Low Penetration of Connected Vehicles

Mizanur Rahman, Mashrur Chowdhury, Jerome McClendon

The expected low market penetration of connected vehicles (CVs) in the near future could be a constraint in estimating traffic flow parameters, such as average travel speed of a roadway segment and average space headway between vehicles from the CV broadcasted data. This estimated traffic flow parameters from low penetration of connected vehicles become noisy compared to 100 percent penetration of CVs, and such noise reduces the real time prediction accuracy of a machine learning model, such as the accuracy of long short term memory (LSTM) model in terms of predicting traffic flow parameters. The accurate prediction of the parameters is important for future traffic condition assessment. To improve the prediction accuracy using noisy traffic flow parameters, which is constrained by limited CV market penetration and limited CV data, we developed a real time traffic data prediction model that combines LSTM with Kalman filter based Rauch Tung Striebel (RTS) noise reduction model. We conducted a case study using the Enhanced Next Generation Simulation (NGSIM) dataset, which contains vehicle trajectory data for every one tenth of a second, to evaluate the performance of this prediction model. Compared to a baseline LSTM model performance, for only 5 percent penetration of CVs, the analyses revealed that combined LSTM and RTS model reduced the mean absolute percentage error (MAPE) from 19 percent to 5 percent for speed prediction and from 27 percent to 9 percent for space-headway prediction. The statistical significance test with a 95 percent confidence interval confirmed no significant difference in predicted average speed and average space headway using this LSTM and RTS combination with only 5 percent CV penetration rate.

CRSep 13, 2018
Towards Secure Infrastructure-based Cooperative Adaptive Cruise Control

Manveen Kaur, Anjan Rayamajhi, Mizanur Rahman et al.

Cooperative Adaptive Cruise Control (CACC) is a pivotal vehicular application that would allow transportation field to achieve its goals of increased traffic throughput and roadway capacity. This application is of paramount interest to the vehicular technology community with a large body of literature dedicated to research within different aspects of CACC, including but not limited to security with CACC. Of all available literature, the overwhelming focus in on CACC utilizing vehicle-to-vehicle (V2V) communication. In this work, we assert that a qualitative increase in vehicle-to-infrastructure (V2I) and infrastructure-to-vehicle (I2V) involvement has the potential to add greater value to CACC. In this study, we developed a strategy for detection of a denial-of-service (DoS) attack on a CACC platoon where the system edge in the vehicular network plays a central role in attack detection. The proposed security strategy is substantiated with a simulation-based evaluation using the ns-3 discrete event network simulator. Empirical evidence obtained through simulation-based results illustrate successful detection of the DoS attack at four different levels of attack severity using this security strategy.

CVAug 27, 2018
Real-time Pedestrian Detection Approach with an Efficient Data Communication Bandwidth Strategy

Mizanur Rahman, Mhafuzul Islam, Jon Calhoun et al.

Vehicle-to-Pedestrian (V2P) communication can significantly improve pedestrian safety at a signalized intersection. It is unlikely that pedestrians will carry a low latency communication enabled device and activate a pedestrian safety application in their hand-held device all the time. Because of this limitation, multiple traffic cameras at the signalized intersection can be used to accurately detect and locate pedestrians using deep learning and broadcast safety alerts related to pedestrians to warn connected and automated vehicles around a signalized intersection. However, unavailability of high-performance computing infrastructure at the roadside and limited network bandwidth between traffic cameras and the computing infrastructure limits the ability of real-time data streaming and processing for pedestrian detection. In this paper, we develop an edge computing based real-time pedestrian detection strategy combining pedestrian detection algorithm using deep learning and an efficient data communication approach to reduce bandwidth requirements while maintaining a high object detection accuracy. We utilize a lossy compression technique on traffic camera data to determine the tradeoff between the reduction of the communication bandwidth requirements and a defined object detection accuracy. The performance of the pedestrian-detection strategy is measured in terms of pedestrian classification accuracy with varying peak signal-to-noise ratios. The analyses reveal that we detect pedestrians by maintaining a defined detection accuracy with a peak signal-to-noise ratio (PSNR) 43 dB while reducing the communication bandwidth from 9.82 Mbits/sec to 0.31 Mbits/sec, a 31x reduction.

SIJun 23, 2018
Search Rank Fraud De-Anonymization in Online Systems

Mizanur Rahman, Nestor Hernandez, Bogdan Carbunar et al.

We introduce the fraud de-anonymization problem, that goes beyond fraud detection, to unmask the human masterminds responsible for posting search rank fraud in online systems. We collect and study search rank fraud data from Upwork, and survey the capabilities and behaviors of 58 search rank fraudsters recruited from 6 crowdsourcing sites. We propose Dolos, a fraud de-anonymization system that leverages traits and behaviors extracted from these studies, to attribute detected fraud to crowdsourcing site fraudsters, thus to real identities and bank accounts. We introduce MCDense, a min-cut dense component detection algorithm to uncover groups of user accounts controlled by different fraudsters, and leverage stylometry and deep learning to attribute them to crowdsourcing site profiles. Dolos correctly identified the owners of 95% of fraudster-controlled communities, and uncovered fraudsters who promoted as many as 97.5% of fraud apps we collected from Google Play. When evaluated on 13,087 apps (820,760 reviews), which we monitored over more than 6 months, Dolos identified 1,056 apps with suspicious reviewer groups. We report orthogonal evidence of their fraud, including fraud duplicates and fraud re-posts.

SIJun 5, 2017
Stateless Puzzles for Real Time Online Fraud Preemption

Mizanur Rahman, Ruben Recabarren, Bogdan Carbunar et al.

The profitability of fraud in online systems such as app markets and social networks marks the failure of existing defense mechanisms. In this paper, we propose FraudSys, a real-time fraud preemption approach that imposes Bitcoin-inspired computational puzzles on the devices that post online system activities, such as reviews and likes. We introduce and leverage several novel concepts that include (i) stateless, verifiable computational puzzles, that impose minimal performance overhead, but enable the efficient verification of their authenticity, (ii) a real-time, graph-based solution to assign fraud scores to user activities, and (iii) mechanisms to dynamically adjust puzzle difficulty levels based on fraud scores and the computational capabilities of devices. FraudSys does not alter the experience of users in online systems, but delays fraudulent actions and consumes significant computational resources of the fraudsters. Using real datasets from Google Play and Facebook, we demonstrate the feasibility of FraudSys by showing that the devices of honest users are minimally impacted, while fraudster controlled devices receive daily computational penalties of up to 3,079 hours. In addition, we show that with FraudSys, fraud does not pay off, as a user equipped with mining hardware (e.g., AntMiner S7) will earn less than half through fraud than from honest Bitcoin mining.

SIMar 6, 2017
FairPlay: Fraud and Malware Detection in Google Play

Mahmudur Rahman, Mizanur Rahman, Bogdan Carbunar et al.

Fraudulent behaviors in Google Android app market fuel search rank abuse and malware proliferation. We present FairPlay, a novel system that uncovers both malware and search rank fraud apps, by picking out trails that fraudsters leave behind. To identify suspicious apps, FairPlay PCF algorithm correlates review activities and uniquely combines detected review relations with linguistic and behavioral signals gleaned from longitudinal Google Play app data. We contribute a new longitudinal app dataset to the community, which consists of over 87K apps, 2.9M reviews, and 2.4M reviewers, collected over half a year. FairPlay achieves over 95% accuracy in classifying gold standard datasets of malware, fraudulent and legitimate apps. We show that 75% of the identified malware apps engage in search rank fraud. FairPlay discovers hundreds of fraudulent apps that currently evade Google Bouncer detection technology, and reveals a new type of attack campaign, where users are harassed into writing positive reviews, and install and review other apps.