Armstrong Aboah

h-index12

29papers

630citations

Novelty38%

AI Score42

Ranked #63,328 of 194,257 authors (top 33%)#21,748 in CV (top 37%)

29 Papers

13.1CVApr 26, 2023Code

GazeSAM: What You See is What You Segment

Bin Wang, Armstrong Aboah, Zheyuan Zhang et al.

This study investigates the potential of eye-tracking technology and the Segment Anything Model (SAM) to design a collaborative human-computer interaction system that automates medical image segmentation. We present the \textbf{GazeSAM} system to enable radiologists to collect segmentation masks by simply looking at the region of interest during image diagnosis. The proposed system tracks radiologists' eye movement and utilizes the eye-gaze data as the input prompt for SAM, which automatically generates the segmentation mask in real time. This study is the first work to leverage the power of eye-tracking technology and SAM to enhance the efficiency of daily clinical practice. Moreover, eye-gaze data coupled with image and corresponding segmentation labels can be easily recorded for further advanced eye-tracking research. The code is available in \url{https://github.com/ukaukaaaa/GazeSAM}.

8.7CVSep 11, 2024Code

PaveSAM Segment Anything for Pavement Distress

Neema Jakisa Owor, Yaw Adu-Gyamfi, Armstrong Aboah et al.

Automated pavement monitoring using computer vision can analyze pavement conditions more efficiently and accurately than manual methods. Accurate segmentation is essential for quantifying the severity and extent of pavement defects and consequently, the overall condition index used for prioritizing rehabilitation and maintenance activities. Deep learning-based segmentation models are however, often supervised and require pixel-level annotations, which can be costly and time-consuming. While the recent evolution of zero-shot segmentation models can generate pixel-wise labels for unseen classes without any training data, they struggle with irregularities of cracks and textured pavement backgrounds. This research proposes a zero-shot segmentation model, PaveSAM, that can segment pavement distresses using bounding box prompts. By retraining SAM's mask decoder with just 180 images, pavement distress segmentation is revolutionized, enabling efficient distress segmentation using bounding box prompts, a capability not found in current segmentation models. This not only drastically reduces labeling efforts and costs but also showcases our model's high performance with minimal input, establishing the pioneering use of SAM in pavement distress segmentation. Furthermore, researchers can use existing open-source pavement distress images annotated with bounding boxes to create segmentation masks, which increases the availability and diversity of segmentation pavement distress datasets.

11.6CVApr 13, 2023

Real-time Multi-Class Helmet Violation Detection Using Few-Shot Data Sampling Technique and YOLOv8

Armstrong Aboah, Bin Wang, Ulas Bagci et al.

Traffic safety is a major global concern. Helmet usage is a key factor in preventing head injuries and fatalities caused by motorcycle accidents. However, helmet usage violations continue to be a significant problem. To identify such violations, automatic helmet detection systems have been proposed and implemented using computer vision techniques. Real-time implementation of such systems is crucial for traffic surveillance and enforcement, however, most of these systems are not real-time. This study proposes a robust real-time helmet violation detection system. The proposed system utilizes a unique data processing strategy, referred to as few-shot data sampling, to develop a robust model with fewer annotations, and a single-stage object detection model, YOLOv8 (You Only Look Once Version 8), for detecting helmet violations in real-time from video frames. Our proposed method won 7th place in the 2023 AI City Challenge, Track 5, with an mAP score of 0.5861 on experimental validation data. The experimental results demonstrate the effectiveness, efficiency, and robustness of the proposed system.

6.5CVNov 10, 2022

Driver Maneuver Detection and Analysis using Time Series Segmentation and Classification

Armstrong Aboah, Yaw Adu-Gyamfi, Senem Velipasalar Gursoy et al.

The current paper implements a methodology for automatically detecting vehicle maneuvers from vehicle telemetry data under naturalistic driving settings. Previous approaches have treated vehicle maneuver detection as a classification problem, although both time series segmentation and classification are required since input telemetry data is continuous. Our objective is to develop an end-to-end pipeline for frame-by-frame annotation of naturalistic driving studies videos into various driving events including stop and lane keeping events, lane changes, left-right turning movements, and horizontal curve maneuvers. To address the time series segmentation problem, the study developed an Energy Maximization Algorithm (EMA) capable of extracting driving events of varying durations and frequencies from continuous signal data. To reduce overfitting and false alarm rates, heuristic algorithms were used to classify events with highly variable patterns such as stops and lane-keeping. To classify segmented driving events, four machine learning models were implemented, and their accuracy and transferability were assessed over multiple data sources. The duration of events extracted by EMA were comparable to actual events, with accuracies ranging from 59.30% (left lane change) to 85.60% (lane-keeping). Additionally, the overall accuracy of the 1D-convolutional neural network model was 98.99%, followed by the Long-short-term-memory model at 97.75%, then random forest model at 97.71%, and the support vector machine model at 97.65%. These model accuracies where consistent across different data sources. The study concludes that implementing a segmentation-classification pipeline significantly improves both the accuracy for driver maneuver detection and transferability of shallow and deep ML models across diverse datasets.

6.5CVJun 10, 2022

The 1st Data Science for Pavements Challenge

Ashkan Behzadian, Tanner Wambui Muturi, Tianjie Zhang et al.

The Data Science for Pavement Challenge (DSPC) seeks to accelerate the research and development of automated vision systems for pavement condition monitoring and evaluation by providing a platform with benchmarked datasets and codes for teams to innovate and develop machine learning algorithms that are practice-ready for use by industry. The first edition of the competition attracted 22 teams from 8 countries. Participants were required to automatically detect and classify different types of pavement distresses present in images captured from multiple sources, and under different conditions. The competition was data-centric: teams were tasked to increase the accuracy of a predefined model architecture by utilizing various data modification methods such as cleaning, labeling and augmentation. A real-time, online evaluation system was developed to rank teams based on the F1 score. Leaderboard results showed the promise and challenges of machine for advancing automation in pavement monitoring and evaluation. This paper summarizes the solutions from the top 5 teams. These teams proposed innovations in the areas of data cleaning, annotation, augmentation, and detection parameter tuning. The F1 score for the top-ranked team was approximately 0.9. The paper concludes with a review of different experiments that worked well for the current challenge and those that did not yield any significant improvement in model accuracy.

3.9CVApr 13, 2023

Real-Time Helmet Violation Detection in AI City Challenge 2023 with Genetic Algorithm-Enhanced YOLOv5

Elham Soltanikazemi, Ashwin Dhakal, Bijaya Kumar Hatuwal et al.

This research focuses on real-time surveillance systems as a means for tackling the issue of non-compliance with helmet regulations, a practice that considerably amplifies the risk for motorcycle drivers or riders. Despite the well-established advantages of helmet usage, achieving widespread compliance remains challenging due to diverse contributing factors. To effectively address this concern, real-time monitoring and enforcement of helmet laws have been proposed as a plausible solution. However, previous attempts at real-time helmet violation detection have been hindered by their limited ability to operate in real-time. To overcome this limitation, the current paper introduces a novel real-time helmet violation detection system that utilizes the YOLOv5 single-stage object detection model. This model is trained on the 2023 NVIDIA AI City Challenge 2023 Track 5 dataset. The optimal hyperparameters for training the model are determined using genetic algorithms. Additionally, data augmentation and various sampling techniques are implemented to enhance the model's performance. The efficacy of the models is evaluated using precision, recall, and mean Average Precision (mAP) metrics. The results demonstrate impressive precision, recall, and mAP scores of 0.848, 0.599, and 0.641, respectively for the training data. Furthermore, the model achieves notable mAP score of 0.6667 for the test datasets, leading to a commendable 4th place rank in the public leaderboard. This innovative approach represents a notable breakthrough in the field and holds immense potential to substantially enhance motorcycle safety. By enabling real-time monitoring and enforcement capabilities, this system has the capacity to contribute towards increased compliance with helmet laws, thereby effectively reducing the risks faced by motorcycle riders and passengers.

8.1CVApr 18, 2022

A Region-Based Deep Learning Approach to Automated Retail Checkout

Maged Shoman, Armstrong Aboah, Alex Morehead et al.

Automating the product checkout process at conventional retail stores is a task poised to have large impacts on society generally speaking. Towards this end, reliable deep learning models that enable automated product counting for fast customer checkout can make this goal a reality. In this work, we propose a novel, region-based deep learning approach to automate product counting using a customized YOLOv5 object detection pipeline and the DeepSORT algorithm. Our results on challenging, real-world test videos demonstrate that our method can generalize its predictions to a sufficient level of accuracy and with a fast enough runtime to warrant deployment to real-world commercial settings. Our proposed method won 4th place in the 2022 AI City Challenge, Track 4, with an F1 score of 0.4400 on experimental validation data.

9.1CVApr 13, 2023Code

DeepSegmenter: Temporal Action Localization for Detecting Anomalies in Untrimmed Naturalistic Driving Videos

Armstrong Aboah, Ulas Bagci, Abdul Rashid Mussah et al.

Identifying unusual driving behaviors exhibited by drivers during driving is essential for understanding driver behavior and the underlying causes of crashes. Previous studies have primarily approached this problem as a classification task, assuming that naturalistic driving videos come discretized. However, both activity segmentation and classification are required for this task due to the continuous nature of naturalistic driving videos. The current study therefore departs from conventional approaches and introduces a novel methodological framework, DeepSegmenter, that simultaneously performs activity segmentation and classification in a single framework. The proposed framework consists of four major modules namely Data Module, Activity Segmentation Module, Classification Module and Postprocessing Module. Our proposed method won 8th place in the 2023 AI City Challenge, Track 3, with an activity overlap score of 0.5426 on experimental validation data. The experimental results demonstrate the effectiveness, efficiency, and robustness of the proposed system.

6.8CVOct 12, 2023Code

Image2PCI -- A Multitask Learning Framework for Estimating Pavement Condition Indices Directly from Images

Neema Jakisa Owor, Hang Du, Abdulateef Daud et al.

The Pavement Condition Index (PCI) is a widely used metric for evaluating pavement performance based on the type, extent and severity of distresses detected on a pavement surface. In recent times, significant progress has been made in utilizing deep-learning approaches to automate PCI estimation process. However, the current approaches rely on at least two separate models to estimate PCI values -- one model dedicated to determining the type and extent and another for estimating their severity. This approach presents several challenges, including complexities, high computational resource demands, and maintenance burdens that necessitate careful consideration and resolution. To overcome these challenges, the current study develops a unified multi-tasking model that predicts the PCI directly from a top-down pavement image. The proposed architecture is a multi-task model composed of one encoder for feature extraction and four decoders to handle specific tasks: two detection heads, one segmentation head and one PCI estimation head. By multitasking, we are able to extract features from the detection and segmentation heads for automatically estimating the PCI directly from the images. The model performs very well on our benchmarked and open pavement distress dataset that is annotated for multitask learning (the first of its kind). To our best knowledge, this is the first work that can estimate PCI directly from an image at real time speeds while maintaining excellent accuracy on all related tasks for crack detection and segmentation.

3.9CVJan 23, 2023

AI-Based Framework for Understanding Car Following Behaviors of Drivers in A Naturalistic Driving Environment

Armstrong Aboah, Abdul Rashid Mussah, Yaw Adu-Gyamfi

The most common type of accident on the road is a rear-end crash. These crashes have a significant negative impact on traffic flow and are frequently fatal. To gain a more practical understanding of these scenarios, it is necessary to accurately model car following behaviors that result in rear-end crashes. Numerous studies have been carried out to model drivers' car-following behaviors; however, the majority of these studies have relied on simulated data, which may not accurately represent real-world incidents. Furthermore, most studies are restricted to modeling the ego vehicle's acceleration, which is insufficient to explain the behavior of the ego vehicle. As a result, the current study attempts to address these issues by developing an artificial intelligence framework for extracting features relevant to understanding driver behavior in a naturalistic environment. Furthermore, the study modeled the acceleration of both the ego vehicle and the leading vehicle using extracted information from NDS videos. According to the study's findings, young people are more likely to be aggressive drivers than elderly people. In addition, when modeling the ego vehicle's acceleration, it was discovered that the relative velocity between the ego vehicle and the leading vehicle was more important than the distance between the two vehicles.

2.8CVApr 14, 2023

Real-Time Helmet Violation Detection Using YOLOv5 and Ensemble Learning

Geoffery Agorku, Divine Agbobli, Vuban Chowdhury et al.

The proper enforcement of motorcycle helmet regulations is crucial for ensuring the safety of motorbike passengers and riders, as roadway cyclists and passengers are not likely to abide by these regulations if no proper enforcement systems are instituted. This paper presents the development and evaluation of a real-time YOLOv5 Deep Learning (DL) model for detecting riders and passengers on motorbikes, identifying whether the detected person is wearing a helmet. We trained the model on 100 videos recorded at 10 fps, each for 20 seconds. Our study demonstrated the applicability of DL models to accurately detect helmet regulation violators even in challenging lighting and weather conditions. We employed several data augmentation techniques in the study to ensure the training data is diverse enough to help build a robust model. The proposed model was tested on 100 test videos and produced an mAP score of 0.5267, ranking 11th on the AI City Track 5 public leaderboard. The use of deep learning techniques for image classification tasks, such as identifying helmet-wearing riders, has enormous potential for improving road safety. The study shows the potential of deep learning models for application in smart cities and enforcing traffic regulations and can be deployed in real-time for city-wide monitoring.

8.7CVAug 10, 2024

Advancing Pavement Distress Detection in Developing Countries: A Novel Deep Learning Approach with Locally-Collected Datasets

Blessing Agyei Kyem, Eugene Kofi Okrah Denteh, Joshua Kofi Asamoah et al.

Road infrastructure maintenance in developing countries faces unique challenges due to resource constraints and diverse environmental factors. This study addresses the critical need for efficient, accurate, and locally-relevant pavement distress detection methods in these regions. We present a novel deep learning approach combining YOLO (You Only Look Once) object detection models with a Convolutional Block Attention Module (CBAM) to simultaneously detect and classify multiple pavement distress types. The model demonstrates robust performance in detecting and classifying potholes, longitudinal cracks, alligator cracks, and raveling, with confidence scores ranging from 0.46 to 0.93. While some misclassifications occur in complex scenarios, these provide insights into unique challenges of pavement assessment in developing countries. Additionally, we developed a web-based application for real-time distress detection from images and videos. This research advances automated pavement distress detection and provides a tailored solution for developing countries, potentially improving road safety, optimizing maintenance strategies, and contributing to sustainable transportation infrastructure development.

1.5CVOct 9, 2023

Edge Computing-Enabled Road Condition Monitoring: System Development and Evaluation

Abdulateef Daud, Mark Amo-Boateng, Neema Jakisa Owor et al.

Real-time pavement condition monitoring provides highway agencies with timely and accurate information that could form the basis of pavement maintenance and rehabilitation policies. Existing technologies rely heavily on manual data processing, are expensive and therefore, difficult to scale for frequent, networklevel pavement condition monitoring. Additionally, these systems require sending large packets of data to the cloud which requires large storage space, are computationally expensive to process, and results in high latency. The current study proposes a solution that capitalizes on the widespread availability of affordable Micro Electro-Mechanical System (MEMS) sensors, edge computing and internet connection capabilities of microcontrollers, and deployable machine learning (ML) models to (a) design an Internet of Things (IoT)-enabled device that can be mounted on axles of vehicles to stream live pavement condition data (b) reduce latency through on-device processing and analytics of pavement condition sensor data before sending to the cloud servers. In this study, three ML models including Random Forest, LightGBM and XGBoost were trained to predict International Roughness Index (IRI) at every 0.1-mile segment. XGBoost had the highest accuracy with an RMSE and MAPE of 16.89in/mi and 20.3%, respectively. In terms of the ability to classify the IRI of pavement segments based on ride quality according to MAP-21 criteria, our proposed device achieved an average accuracy of 96.76% on I-70EB and 63.15% on South Providence. Overall, our proposed device demonstrates significant potential in providing real-time pavement condition data to State Highway Agencies (SHA) and Department of Transportation (DOTs) with a satisfactory level of accuracy.

9.6CVAug 7, 2024Code

PaveCap: The First Multimodal Framework for Comprehensive Pavement Condition Assessment with Dense Captioning and PCI Estimation

Blessing Agyei Kyem, Eugene Kofi Okrah Denteh, Joshua Kofi Asamoah et al.

This research introduces the first multimodal approach for pavement condition assessment, providing both quantitative Pavement Condition Index (PCI) predictions and qualitative descriptions. We introduce PaveCap, a novel framework for automated pavement condition assessment. The framework consists of two main parts: a Single-Shot PCI Estimation Network and a Dense Captioning Network. The PCI Estimation Network uses YOLOv8 for object detection, the Segment Anything Model (SAM) for zero-shot segmentation, and a four-layer convolutional neural network to predict PCI. The Dense Captioning Network uses a YOLOv8 backbone, a Transformer encoder-decoder architecture, and a convolutional feed-forward module to generate detailed descriptions of pavement conditions. To train and evaluate these networks, we developed a pavement dataset with bounding box annotations, textual annotations, and PCI values. The results of our PCI Estimation Network showed a strong positive correlation (0.70) between predicted and actual PCIs, demonstrating its effectiveness in automating condition assessment. Also, the Dense Captioning Network produced accurate pavement condition descriptions, evidenced by high BLEU (0.7445), GLEU (0.5893), and METEOR (0.7252) scores. Additionally, the dense captioning model handled complex scenarios well, even correcting some errors in the ground truth data. The framework developed here can greatly improve infrastructure management and decision18 making in pavement maintenance.

11.3CVApr 12, 2024Code

Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis

Maged Shoman, Dongdong Wang, Armstrong Aboah et al.

This paper introduces our solution for Track 2 in AI City Challenge 2024. The task aims to solve traffic safety description and analysis with the dataset of Woven Traffic Safety (WTS), a real-world Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding. Our solution mainly focuses on the following points: 1) To solve dense video captioning, we leverage the framework of dense video captioning with parallel decoding (PDVC) to model visual-language sequences and generate dense caption by chapters for video. 2) Our work leverages CLIP to extract visual features to more efficiently perform cross-modality training between visual and textual representations. 3) We conduct domain-specific model adaptation to mitigate domain shift problem that poses recognition challenge in video understanding. 4) Moreover, we leverage BDD-5K captioned videos to conduct knowledge transfer for better understanding WTS videos and more accurate captioning. Our solution has yielded on the test set, achieving 6th place in the competition. The open source code will be available at https://github.com/UCF-SST-Lab/AICity2024CVPRW

9.6CVApr 15, 2024Code

Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets

Dai Quoc Tran, Armstrong Aboah, Yuntae Jeon et al.

This study addresses the evolving challenges in urban traffic monitoring detection systems based on fisheye lens cameras by proposing a framework that improves the efficacy and accuracy of these systems. In the context of urban infrastructure and transportation management, advanced traffic monitoring systems have become critical for managing the complexities of urbanization and increasing vehicle density. Traditional monitoring methods, which rely on static cameras with narrow fields of view, are ineffective in dynamic urban environments, necessitating the installation of multiple cameras, which raises costs. Fisheye lenses, which were recently introduced, provide wide and omnidirectional coverage in a single frame, making them a transformative solution. However, issues such as distorted views and blurriness arise, preventing accurate object detection on these images. Motivated by these challenges, this study proposes a novel approach that combines a ransformer-based image enhancement framework and ensemble learning technique to address these challenges and improve traffic monitoring accuracy, making significant contributions to the future of intelligent traffic management systems. Our proposed methodological framework won 5th place in the 2024 AI City Challenge, Track 4, with an F1 score of 0.5965 on experimental validation data. The experimental results demonstrate the effectiveness, efficiency, and robustness of the proposed system. Our code is publicly available at https://github.com/daitranskku/AIC2024-TRACK4-TEAM15.

6.2CVDec 23, 2025

PaveSync: A Unified and Comprehensive Dataset for Pavement Distress Analysis and Classification

Blessing Agyei Kyem, Joshua Kofi Asamoah, Anthony Dontoh et al.

Automated pavement defect detection often struggles to generalize across diverse real-world conditions due to the lack of standardized datasets. Existing datasets differ in annotation styles, distress type definitions, and formats, limiting their integration for unified training. To address this gap, we introduce a comprehensive benchmark dataset that consolidates multiple publicly available sources into a standardized collection of 52747 images from seven countries, with 135277 bounding box annotations covering 13 distinct distress types. The dataset captures broad real-world variation in image quality, resolution, viewing angles, and weather conditions, offering a unique resource for consistent training and evaluation. Its effectiveness was demonstrated through benchmarking with state-of-the-art object detection models including YOLOv8-YOLOv12, Faster R-CNN, and DETR, which achieved competitive performance across diverse scenarios. By standardizing class definitions and annotation formats, this dataset provides the first globally representative benchmark for pavement defect detection and enables fair comparison of models, including zero-shot transfer to new environments.

1.5CVApr 13, 2023

SigSegment: A Signal-Based Segmentation Algorithm for Identifying Anomalous Driving Behaviours in Naturalistic Driving Videos

Kelvin Kwakye, Younho Seong, Armstrong Aboah et al.

In recent years, distracted driving has garnered considerable attention as it continues to pose a significant threat to public safety on the roads. This has increased the need for innovative solutions that can identify and eliminate distracted driving behavior before it results in fatal accidents. In this paper, we propose a Signal-Based anomaly detection algorithm that segments videos into anomalies and non-anomalies using a deep CNN-LSTM classifier to precisely estimate the start and end times of an anomalous driving event. In the phase of anomaly detection and analysis, driver pose background estimation, mask extraction, and signal activity spikes are utilized. A Deep CNN-LSTM classifier was applied to candidate anomalies to detect and classify final anomalies. The proposed method achieved an overlap score of 0.5424 and ranked 9th on the public leader board in the AI City Challenge 2023, according to experimental validation results.

15.7CVMay 29, 2023Code

GazeGNN: A Gaze-Guided Graph Neural Network for Chest X-ray Classification

Bin Wang, Hongyi Pan, Armstrong Aboah et al.

Eye tracking research is important in computer vision because it can help us understand how humans interact with the visual world. Specifically for high-risk applications, such as in medical imaging, eye tracking can help us to comprehend how radiologists and other medical professionals search, analyze, and interpret images for diagnostic and clinical purposes. Hence, the application of eye tracking techniques in disease classification has become increasingly popular in recent years. Contemporary works usually transform gaze information collected by eye tracking devices into visual attention maps (VAMs) to supervise the learning process. However, this is a time-consuming preprocessing step, which stops us from applying eye tracking to radiologists' daily work. To solve this problem, we propose a novel gaze-guided graph neural network (GNN), GazeGNN, to leverage raw eye-gaze data without being converted into VAMs. In GazeGNN, to directly integrate eye gaze into image classification, we create a unified representation graph that models both images and gaze pattern information. With this benefit, we develop a real-time, real-world, end-to-end disease classification algorithm for the first time in the literature. This achievement demonstrates the practicality and feasibility of integrating real-time eye tracking techniques into the daily work of radiologists. To our best knowledge, GazeGNN is the first work that adopts GNN to integrate image and eye-gaze data. Our experiments on the public chest X-ray dataset show that our proposed method exhibits the best classification performance compared to existing methods. The code is available at https://github.com/ukaukaaaa/GazeGNN.

16.4CVJan 24, 2025

Context-CrackNet: A Context-Aware Framework for Precise Segmentation of Tiny Cracks in Pavement images

Blessing Agyei Kyem, Joshua Kofi Asamoah, Armstrong Aboah

The accurate detection and segmentation of pavement distresses, particularly tiny and small cracks, are critical for early intervention and preventive maintenance in transportation infrastructure. Traditional manual inspection methods are labor-intensive and inconsistent, while existing deep learning models struggle with fine-grained segmentation and computational efficiency. To address these challenges, this study proposes Context-CrackNet, a novel encoder-decoder architecture featuring the Region-Focused Enhancement Module (RFEM) and Context-Aware Global Module (CAGM). These innovations enhance the model's ability to capture fine-grained local details and global contextual dependencies, respectively. Context-CrackNet was rigorously evaluated on ten publicly available crack segmentation datasets, covering diverse pavement distress scenarios. The model consistently outperformed 9 state-of-the-art segmentation frameworks, achieving superior performance metrics such as mIoU and Dice score, while maintaining competitive inference efficiency. Ablation studies confirmed the complementary roles of RFEM and CAGM, with notable improvements in mIoU and Dice score when both modules were integrated. Additionally, the model's balance of precision and computational efficiency highlights its potential for real-time deployment in large-scale pavement monitoring systems.

11.8CVMar 27, 2025

Integrating Travel Behavior Forecasting and Generative Modeling for Predicting Future Urban Mobility and Spatial Transformations

Eugene Denteh, Andrews Danyo, Joshua Kofi Asamoah et al.

Transportation planning plays a critical role in shaping urban development, economic mobility, and infrastructure sustainability. However, traditional planning methods often struggle to accurately predict long-term urban growth and transportation demands. This may sometimes result in infrastructure demolition to make room for current transportation planning demands. This study integrates a Temporal Fusion Transformer to predict travel patterns from demographic data with a Generative Adversarial Network to predict future urban settings through satellite imagery. The framework achieved a 0.76 R-square score in travel behavior prediction and generated high-fidelity satellite images with a Structural Similarity Index of 0.81. The results demonstrate that integrating predictive analytics and spatial visualization can significantly improve the decision-making process, fostering more sustainable and efficient urban development. This research highlights the importance of data-driven methodologies in modern transportation planning and presents a step toward optimizing infrastructure placement, capacity, and long-term viability.

11.8CVApr 25, 2025

An Improved ResNet50 Model for Predicting Pavement Condition Index (PCI) Directly from Pavement Images

Andrews Danyo, Anthony Dontoh, Armstrong Aboah

Accurately predicting the Pavement Condition Index (PCI), a measure of roadway conditions, from pavement images is crucial for infrastructure maintenance. This study proposes an enhanced version of the Residual Network (ResNet50) architecture, integrated with a Convolutional Block Attention Module (CBAM), to predict PCI directly from pavement images without additional annotations. By incorporating CBAM, the model autonomously prioritizes critical features within the images, improving prediction accuracy. Compared to the original baseline ResNet50 and DenseNet161 architectures, the enhanced ResNet50-CBAM model achieved a significantly lower mean absolute percentage error (MAPE) of 58.16%, compared to the baseline models that achieved 70.76% and 65.48% respectively. These results highlight the potential of using attention mechanisms to refine feature extraction, ultimately enabling more accurate and efficient assessments of pavement conditions. This study emphasizes the importance of targeted feature refinement in advancing automated pavement analysis through attention mechanisms.

13.1CVOct 13, 2025

Task-Specific Dual-Model Framework for Comprehensive Traffic Safety Video Description and Analysis

Blessing Agyei Kyem, Neema Jakisa Owor, Andrews Danyo et al.

Traffic safety analysis requires complex video understanding to capture fine-grained behavioral patterns and generate comprehensive descriptions for accident prevention. In this work, we present a unique dual-model framework that strategically utilizes the complementary strengths of VideoLLaMA and Qwen2.5-VL through task-specific optimization to address this issue. The core insight behind our approach is that separating training for captioning and visual question answering (VQA) tasks minimizes task interference and allows each model to specialize more effectively. Experimental results demonstrate that VideoLLaMA is particularly effective in temporal reasoning, achieving a CIDEr score of 1.1001, while Qwen2.5-VL excels in visual understanding with a VQA accuracy of 60.80\%. Through extensive experiments on the WTS dataset, our method achieves an S2 score of 45.7572 in the 2025 AI City Challenge Track 2, placing 10th on the challenge leaderboard. Ablation studies validate that our separate training strategy outperforms joint training by 8.6\% in VQA accuracy while maintaining captioning quality.

3.7CVJan 13, 2024

3D Object Detection and High-Resolution Traffic Parameters Extraction Using Low-Resolution LiDAR Data

Linlin Zhang, Xiang Yu, Armstrong Aboah et al.

Traffic volume data collection is a crucial aspect of transportation engineering and urban planning, as it provides vital insights into traffic patterns, congestion, and infrastructure efficiency. Traditional manual methods of traffic data collection are both time-consuming and costly. However, the emergence of modern technologies, particularly Light Detection and Ranging (LiDAR), has revolutionized the process by enabling efficient and accurate data collection. Despite the benefits of using LiDAR for traffic data collection, previous studies have identified two major limitations that have impeded its widespread adoption. These are the need for multiple LiDAR systems to obtain complete point cloud information of objects of interest, as well as the labor-intensive process of annotating 3D bounding boxes for object detection tasks. In response to these challenges, the current study proposes an innovative framework that alleviates the need for multiple LiDAR systems and simplifies the laborious 3D annotation process. To achieve this goal, the study employed a single LiDAR system, that aims at reducing the data acquisition cost and addressed its accompanying limitation of missing point cloud information by developing a Point Cloud Completion (PCC) framework to fill in missing point cloud information using point density. Furthermore, we also used zero-shot learning techniques to detect vehicles and pedestrians, as well as proposed a unique framework for extracting low to high features from the object of interest, such as height, acceleration, and speed. Using the 2D bounding box detection and extracted height information, this study is able to generate 3D bounding boxes automatically without human intervention.

6.2CVJan 20, 2025

A Review Paper of the Effects of Distinct Modalities and ML Techniques to Distracted Driving Detection

Anthony. Dontoh, Stephanie. Ivey, Logan. Sirbaugh et al.

Distracted driving remains a significant global challenge with severe human and economic repercussions, demanding improved detection and intervention strategies. While previous studies have extensively explored single-modality approaches, recent research indicates that these systems often fall short in identifying complex distraction patterns, particularly cognitive distractions. This systematic review addresses critical gaps by providing a comprehensive analysis of machine learning (ML) and deep learning (DL) techniques applied across various data modalities - visual,, sensory, auditory, and multimodal. By categorizing and evaluating studies based on modality, data accessibility, and methodology, this review clarifies which approaches yield the highest accuracy and are best suited for specific distracted driving detection goals. The findings offer clear guidance on the advantages of multimodal versus single-modal systems and capture the latest advancements in the field. Ultimately, this review contributes valuable insights for developing robust distracted driving detection frameworks, supporting enhanced road safety and mitigation strategies.

2.3APNov 16, 2021

Identifying the Factors that Influence Urban Public Transit Demand

Armstrong Aboah, Lydia Johnson, Setul Shah

The rise in urbanization throughout the United States (US) in recent years has required urban planners and transportation engineers to have greater consideration for the transportation services available to residents of a metropolitan region. This compels transportation authorities to provide better and more reliable modes of public transit through improved technologies and increased service quality. These improvements can be achieved by identifying and understanding the factors that influence urban public transit demand. Common factors that can influence urban public transit demand can be internal and/or external factors. Internal factors include policy measures such as transit fares, service headways, and travel times. External factors can include geographic, socioeconomic, and highway facility characteristics. There is inherent simultaneity between transit supply and demand, thus a two-stage least squares (2SLS) regression modeling procedure should be conducted to forecast urban transit supply and demand. As such, two multiple linear regression models should be developed: one to predict transit supply and a second to predict transit demand. It was found that service area density, total average cost per trip, and the average number of vehicles operated in maximum service can be used to forecast transit supply, expressed as vehicle revenue hours. Furthermore, estimated vehicle revenue hours and total average fares per trip can be used to forecast transit demand, expressed as unlinked passenger trips. Additional data such as socioeconomic information of the surrounding areas for each transit agency and travel time information of the various transit systems would be useful to improve upon the models developed.

7.5LGNov 16, 2021

Comparative Analysis of Machine Learning Models for Predicting Travel Time

Armstrong Aboah, Elizabeth Arthur

In this paper, five different deep learning models are being compared for predicting travel time. These models are autoregressive integrated moving average (ARIMA) model, recurrent neural network (RNN) model, autoregressive (AR) model, Long-short term memory (LSTM) model, and gated recurrent units (GRU) model. The aim of this study is to investigate the performance of each developed model for forecasting travel time. The dataset used in this paper consists of travel time and travel speed information from the state of Missouri. The learning rate used for building each model was varied from 0.0001-0.01. The best learning rate was found to be 0.001. The study concluded that the ARIMA model was the best model architecture for travel time prediction and forecasting.

8.0CVJun 20, 2021

Mobile Sensing for Multipurpose Applications in Transportation

Armstrong Aboah, Michael Boeding, Yaw Adu-Gyamfi

Routine and consistent data collection is required to address contemporary transportation issues.The cost of data collection increases significantly when sophisticated machines are used to collect data. Due to this constraint, State Departments of Transportation struggles to collect consistent data for analyzing and resolving transportation problems in a timely manner. Recent advancements in the sensors integrated into smartphones have resulted in a more affordable method of data collection.The primary objective of this study is to develop and implement a smartphone application for data collection.The currently designed app consists of three major modules: a frontend graphical user interface (GUI), a sensor module, and a backend module. While the frontend user interface enables interaction with the app, the sensor modules collect relevant data such as video and accelerometer readings while the app is in use. The backend, on the other hand, is made up of firebase storage, which is used to store the gathered data.In comparison to other developed apps for collecting pavement information, this current app is not overly reliant on the internet enabling the app to be used in areas of restricted internet access.The developed application was evaluated by collecting data on the i70W highway connecting Columbia, Missouri, and Kansas City, Missouri.The data was analyzed for a variety of purposes, including calculating the International Roughness Index (IRI), identifying pavement distresses, and understanding driver's behaviour and environment .The results of the application indicate that the data collected by the app is of high quality.

15.1CVApr 14, 2021

A Vision-based System for Traffic Anomaly Detection using Deep Learning and Decision Trees

Armstrong Aboah, Maged Shoman, Vishal Mandal et al.

Any intelligent traffic monitoring system must be able to detect anomalies such as traffic accidents in real time. In this paper, we propose a Decision-Tree - enabled approach powered by Deep Learning for extracting anomalies from traffic cameras while accurately estimating the start and end time of the anomalous event. Our approach included creating a detection model, followed by anomaly detection and analysis. YOLOv5 served as the foundation for our detection model. The anomaly detection and analysis step entail traffic scene background estimation, road mask extraction, and adaptive thresholding. Candidate anomalies were passed through a decision tree to detect and analyze final anomalies. The proposed approach yielded an F1 score of 0.8571, and an S4 score of 0.5686, per the experimental validation.