CV · Mar 10, 2023 · Code
A POV-based Highway Vehicle Trajectory Dataset and Prediction Architecture
Vinit Katariya, Ghazal Alinezhad Noghre, Armin Danesh Pazho et al.
Vehicle trajectory datasets that provide multiple points of view (POVs) can be valuable for various traffic safety and management applications. Despite the abundance of trajectory datasets, few offer a comprehensive and diverse range of driving scenes, capturing multiple viewpoints of various highway layouts, merging lanes, and configurations. This limits their ability to capture the nuanced interactions between drivers, vehicles, and the roadway infrastructure. We introduce the Carolinas Highway Dataset (CHD), a vehicle trajectory, detection, and tracking dataset available at https://github.com/TeCSAR-UNCC/Carolinas_Dataset. CHD is a collection of 1.6 million frames captured in highway-based videos from eye-level and high-angle POVs at eight locations across the Carolinas, with 338,000 vehicle trajectories. The locations, timing of recordings, and camera angles were carefully selected to capture various road geometries, traffic patterns, lighting conditions, and driving behaviors. We also present PishguVe, a novel vehicle trajectory prediction architecture that uses attention-based graph isomorphism and convolutional neural networks; its code is available at https://github.com/TeCSAR-UNCC/PishguVe. The results demonstrate that PishguVe outperforms existing algorithms to become the new state-of-the-art (SotA) on bird's-eye, eye-level, and high-angle POV trajectory datasets. Specifically, it achieves a 12.50% and 10.20% improvement in ADE and FDE, respectively, over the current SotA on the NGSIM dataset. Compared to the best-performing models on CHD, PishguVe achieves lower ADE and FDE on eye-level data by 14.58% and 27.38%, respectively, and improves ADE and FDE on high-angle data by 8.3% and 6.9%, respectively.
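The abstract reports results as ADE (average displacement error) and FDE (final displacement error). As a minimal sketch of how these two metrics are commonly computed for a single predicted trajectory, the snippet below uses plain Python; the function names and the sample points are illustrative assumptions and are not taken from the PishguVe code base.

```python
import math


def ade_fde(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean displacement error at every predicted time step.
    errors = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    ade = sum(errors) / len(errors)  # mean error over the whole horizon
    fde = errors[-1]                 # error at the last predicted step
    return ade, fde


def relative_improvement(baseline_error, new_error):
    """Percent reduction of an error metric relative to a baseline value."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical three-step prediction versus its ground truth.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)
print(ade, fde)
```

Percentage improvements such as those quoted above are relative reductions of these errors against a baseline model, which is what `relative_improvement` computes.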
24.1 · CV · Apr 18 · Code
EdgeVTP: Exploration of Latency-efficient Trajectory Prediction for Edge-based Embedded Vision Applications
Seungjin Kim, Reza Jafarpourmarzouni, Christopher Neff et al.
Vehicle trajectory prediction is central to highway perception, but deployment on roadside edge devices necessitates bounded, deterministic end-to-end latency. We present EdgeVTP, an embedded-first trajectory predictor that combines interaction-aware graph modeling with a lightweight transformer backbone and a one-shot curve decoder. By predicting future motion as compact curve parameters (anchored at the last observed position) rather than horizon-scaled autoregressive waypoints, EdgeVTP reduces decoding overhead while producing smooth trajectories. To keep runtime predictable in crowded scenes, we explicitly bound interaction complexity via a locality graph with a hard neighbor cap. Across three highway benchmarks and two Jetson-class platforms, EdgeVTP achieves the lowest measured end-to-end latency under a protocol that includes graph construction and post-processing, while attaining state-of-the-art (SotA) prediction accuracy on two of the three datasets and competitive error on other benchmarks. Our code is available at https://github.com/SeungjinStevenKim/EdgeVTP.
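Two ideas in this abstract lend themselves to a small illustration: a locality graph with a hard neighbor cap, and decoding a whole future path in one shot from curve parameters anchored at the last observed position. The sketch below is only an assumption-laden illustration. The abstract does not state the curve family, so a polynomial without a constant term is assumed here, and the names `capped_neighbors` and `decode_curve` are hypothetical rather than part of EdgeVTP.

```python
import math


def capped_neighbors(positions, radius, max_neighbors):
    """Map each vehicle index to at most max_neighbors indices within radius, nearest first."""
    graph = {}
    for i, p in enumerate(positions):
        nearby = []
        for j, q in enumerate(positions):
            d = math.dist(p, q)
            if j != i and d <= radius:
                nearby.append((d, j))
        nearby.sort()  # nearest first, ties broken by index
        # The hard cap bounds the per-vehicle interaction cost regardless of crowding.
        graph[i] = [j for _, j in nearby[:max_neighbors]]
    return graph


def decode_curve(anchor, coeffs, horizon, dt=1.0):
    """Evaluate an assumed polynomial curve that starts at `anchor`.

    coeffs holds (ax, ay) pairs for powers t^1, t^2, ...; with no constant
    term the curve passes through the anchor at t = 0.
    """
    points = []
    for step in range(1, horizon + 1):
        t = step * dt
        x = anchor[0] + sum(ax * t ** (k + 1) for k, (ax, _) in enumerate(coeffs))
        y = anchor[1] + sum(ay * t ** (k + 1) for k, (_, ay) in enumerate(coeffs))
        points.append((x, y))
    return points


positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
print(capped_neighbors(positions, radius=3.0, max_neighbors=1))
print(decode_curve((5.0, 5.0), [(1.0, 0.0), (0.0, 0.5)], horizon=2))
```

Because all waypoints come from one set of parameters, the decoding cost does not depend on earlier predicted steps, unlike an autoregressive decoder. The brute-force neighbor search shown here is quadratic in the number of vehicles and is used only for clarity.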
CV · Aug 26, 2024 · Code
Towards Adaptive Human-centric Video Anomaly Detection: A Comprehensive Framework and A New Benchmark
Armin Danesh Pazho, Shanle Yao, Ghazal Alinezhad Noghre et al.
Human-centric Video Anomaly Detection (VAD) aims to identify human behaviors that deviate from normal. At its core, human-centric VAD faces substantial challenges, such as the complexity of diverse human behaviors, the rarity of anomalies, and ethical constraints. These challenges limit access to high-quality datasets and highlight the need for a dataset and framework supporting continual learning. Moving towards adaptive human-centric VAD, we introduce the HuVAD (Human-centric privacy-enhanced Video Anomaly Detection) dataset and a novel Unsupervised Continual Anomaly Learning (UCAL) framework. UCAL enables incremental learning, allowing models to adapt over time, bridging traditional training and real-world deployment. HuVAD prioritizes privacy by providing de-identified annotations and includes seven indoor/outdoor scenes, offering over 5x more pose-annotated frames than previous datasets. Our standard and continual benchmarks use a comprehensive set of metrics and demonstrate that UCAL-enhanced models achieve superior performance in 82.14% of cases, setting a new state-of-the-art (SOTA). The dataset can be accessed at https://github.com/TeCSAR-UNCC/HuVAD.
CV · Nov 14, 2023 · Code
VegaEdge: Edge AI Confluence Anomaly Detection for Real-Time Highway IoT-Applications
Vinit Katariya, Fatema-E-Jannat, Armin Danesh Pazho et al.
Vehicle anomaly detection plays a vital role in highway safety applications such as accident prevention, rapid response, traffic flow optimization, and work zone safety. With the surge of the Internet of Things (IoT) in recent years, there has arisen a pressing demand for Artificial Intelligence (AI) based anomaly detection methods designed to meet the requirements of IoT devices. Catering to this futuristic vision, we introduce a lightweight approach to vehicle anomaly detection by utilizing the power of trajectory prediction. Our proposed design identifies vehicles deviating from expected paths, flagging highway risks across different camera-viewing angles in real-world highway datasets. On top of that, we present VegaEdge, a sophisticated AI confluence designed for real-time security and surveillance applications in modern highway settings through edge-centric IoT-embedded platforms equipped with our anomaly detection approach. Extensive testing across multiple platforms and traffic scenarios showcases the versatility and effectiveness of VegaEdge. This work also presents the Carolinas Anomaly Dataset (CAD) to bridge the existing gap in datasets tailored for highway anomalies. In real-world scenarios, our anomaly detection approach achieves an AUC-ROC of 0.94, and our proposed VegaEdge design, on an embedded IoT platform, processes 738 trajectories per second in a typical highway setting. The dataset is available at https://github.com/TeCSAR-UNCC/Carolinas_Dataset#chd-anomaly-test-set.
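The general idea of flagging vehicles that deviate from their expected paths, and then evaluating the detector with AUC-ROC, can be sketched as follows. This is a generic illustration under assumptions: the score (mean gap between predicted and observed positions) and the function names are hypothetical and do not reproduce the VegaEdge pipeline, and the sample scores are made up.

```python
import math


def anomaly_score(predicted, observed):
    """Mean Euclidean gap between a predicted path and the path actually driven."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("paths must be non-empty and of equal length")
    return sum(math.dist(p, o) for p, o in zip(predicted, observed)) / len(predicted)


def auc_roc(scores, labels):
    """AUC as the chance that a random anomaly outscores a random normal sample.

    Ties count as half a win; labels are 1 for anomalous and 0 for normal.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores for four trajectories, two of them labelled anomalous.
print(anomaly_score([(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 2.0)]))
print(auc_roc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The pairwise formulation of AUC is quadratic in the number of samples; it is used here because it is easy to verify by hand.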
10.0 · AI · May 6
Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections
Vinit Katariya, Seungjin Kim, Curtis Craig et al.
Artificial intelligence (AI) and computer vision are transforming transportation data collection. This study introduces an AI-enabled analytics framework leveraging existing CCTV infrastructure to evaluate the impact of soft interventions, such as temporary pedestrian refuges and curb extensions, on vehicle speed and safety. Using deep learning and perspective-based speed estimation, we evaluated driver behavior in Minneapolis before and after interventions, with repeated post-installation monitoring in Week 1 and Week 2. Findings reveal that at unsignalized intersections, mean and 85th-percentile speeds fell by up to 18.75% and 16.56%, respectively, while pass-through traffic decreased by as much as 12.2%. Signalized intersections showed comparable reductions except at one location, with mean and 85th-percentile speeds dropping by up to 20.0% and 17.19%. These results demonstrate the traffic-calming effectiveness of soft infrastructure and underscore the utility of AI-powered methods for rapid, low-cost, and evidence-based transport policy evaluation.
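The before/after comparison rests on two summary statistics, the mean speed and the 85th-percentile speed, and on the percent drop between periods. A minimal sketch of that arithmetic is shown below. The percentile uses linear interpolation between sorted samples, which is one common convention and an assumption here since the abstract does not say which one was used, and the speed values are invented for illustration.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between sorted samples."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def percent_drop(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Invented per-vehicle speeds (mph) before and after an intervention.
before = [28.0, 31.0, 33.0, 36.0, 40.0]
after = [25.0, 27.0, 30.0, 31.0, 34.0]
print(percent_drop(sum(before) / len(before), sum(after) / len(after)))
print(percent_drop(percentile(before, 85), percentile(after, 85)))
```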
CV · Oct 14, 2022
Pishgu: Universal Path Prediction Network Architecture for Real-time Cyber-physical Edge Systems
Ghazal Alinezhad Noghre, Vinit Katariya, Armin Danesh Pazho et al.
Path prediction is an essential task for many real-world Cyber-Physical Systems (CPS) applications, from autonomous driving and traffic monitoring/management to pedestrian/worker safety. These real-world CPS applications need a robust, lightweight path prediction that can provide a universal network architecture for multiple subjects (e.g., pedestrians and vehicles) from different perspectives. However, most existing algorithms are tailor-made for a unique subject with a specific camera perspective and scenario. This article presents Pishgu, a universal lightweight network architecture, as a robust and holistic solution for path prediction. Pishgu's architecture can adapt to multiple path prediction domains with different subjects (vehicles, pedestrians), perspectives (bird's-eye, high-angle), and scenes (sidewalk, highway). Our proposed architecture captures the inter-dependencies within the subjects in each frame by taking advantage of Graph Isomorphism Networks and the attention module. We separately train and evaluate the efficacy of our architecture on three different CPS domains across multiple perspectives (vehicle bird's-eye view, pedestrian bird's-eye view, and human high-angle view). Pishgu outperforms state-of-the-art solutions in the vehicle bird's-eye view domain by 42% and 61% and pedestrian high-angle view domain by 23% and 22% in terms of ADE and FDE, respectively. Additionally, we analyze the domain-specific details for various datasets to understand their effect on path prediction and model interpretation. Finally, we report the latency and throughput for all three domains on multiple embedded platforms showcasing the robustness and adaptability of Pishgu for real-world integration into CPS applications.
CV · Mar 9, 2023
Understanding the Challenges and Opportunities of Pose-based Anomaly Detection
Ghazal Alinezhad Noghre, Armin Danesh Pazho, Vinit Katariya et al.
Pose-based anomaly detection is a video-analysis technique for detecting anomalous events or behaviors by examining human pose extracted from the video frames. Utilizing pose data alleviates privacy and ethical issues. Computationally, pose-based models are also less complex than pixel-based approaches. However, relying on pose introduces further challenges, such as noisy skeleton data, the loss of important pixel information, and insufficiently rich features. These problems are exacerbated by a lack of anomaly detection datasets that are sufficiently representative of real-world scenarios. In this work, we analyze and quantify the characteristics of two well-known video anomaly datasets to better understand the difficulties of pose-based anomaly detection. We take a step forward, exploring the discriminating power of pose and trajectory for video anomaly detection and their effectiveness based on context. We believe these experiments are beneficial for a better comprehension of pose-based anomaly detection and the datasets currently available. This will help researchers approach the task of anomaly detection with a clearer perspective, accelerating the development of robust models with better performance.
CV · Nov 11, 2023
VT-Former: An Exploratory Study on Vehicle Trajectory Prediction for Highway Surveillance through Graph Isomorphism and Transformer
Armin Danesh Pazho, Ghazal Alinezhad Noghre, Vinit Katariya et al.
Enhancing roadway safety has become an essential computer vision focus area for Intelligent Transportation Systems (ITS). As a part of ITS, Vehicle Trajectory Prediction (VTP) aims to forecast a vehicle's future positions based on its past and current movements. VTP is a pivotal element for road safety, aiding in applications such as traffic management, accident prevention, work-zone safety, and energy optimization. While most works in this field focus on autonomous driving, with the growing number of surveillance cameras, another sub-field emerges for surveillance VTP with its own set of challenges. In this paper, we introduce VT-Former, a novel transformer-based VTP approach for highway safety and surveillance. In addition to utilizing transformers to capture long-range temporal patterns, a new Graph Attentive Tokenization (GAT) module has been proposed to capture intricate social interactions among vehicles. This study seeks to explore both the advantages and the limitations inherent in combining transformer architecture with graphs for VTP. Our investigation, conducted across three benchmark datasets from diverse surveillance viewpoints, showcases the State-of-the-Art (SotA) or comparable performance of VT-Former in predicting vehicle trajectories. This study underscores the potential of VT-Former and its architecture, opening new avenues for future research and exploration.
CV · Sep 23, 2025 · Code
Adversarially-Refined VQ-GAN with Dense Motion Tokenization for Spatio-Temporal Heatmaps
Gabriel Maldonado, Narges Rashvand, Armin Danesh Pazho et al.
Continuous human motion understanding remains a core challenge in computer vision due to its high dimensionality and inherent redundancy. Efficient compression and representation are crucial for analyzing complex motion dynamics. In this work, we introduce an adversarially-refined VQ-GAN framework with dense motion tokenization for compressing spatio-temporal heatmaps while preserving the fine-grained traces of human motion. Our approach combines dense motion tokenization with adversarial refinement, which eliminates reconstruction artifacts like motion smearing and temporal misalignment observed in non-adversarial baselines. Our experiments on the CMU Panoptic dataset provide conclusive evidence of our method's superiority, outperforming the dVAE baseline by 9.31% SSIM and reducing temporal instability by 37.1%. Furthermore, our dense tokenization strategy enables a novel analysis of motion complexity, revealing that 2D motion can be optimally represented with a compact 128-token vocabulary, while 3D motion's complexity demands a much larger 1024-token codebook for faithful reconstruction. These results establish practical deployment feasibility across diverse motion analysis applications. The code base for this work is available at https://github.com/TeCSAR-UNCC/Pose-Quantization.
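The codebook sizes mentioned above (128 and 1024 tokens) refer to vector quantization, where each continuous feature vector is replaced by the index of its nearest codebook entry. The snippet below is a generic nearest-neighbor lookup written as a sketch. It does not model the encoder, the adversarial refinement, or the heatmap inputs described in the abstract, and the tiny three-entry codebook and function names are purely illustrative assumptions.

```python
def quantize(vectors, codebook):
    """Return, for each vector, the index (token) of its nearest codebook entry."""
    def sqdist(a, b):
        # Squared Euclidean distance; the square root is not needed for ranking.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sqdist(v, codebook[k]))
        for v in vectors
    ]


def reconstruct(tokens, codebook):
    """Map tokens back to their codebook vectors (the lossy reconstruction)."""
    return [codebook[t] for t in tokens]


# Illustrative 2-D codebook with three entries.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
tokens = quantize([(0.2, 0.1), (3.0, 3.5), (0.9, 1.2)], codebook)
print(tokens)
print(reconstruct(tokens, codebook))
```

A larger codebook lowers the reconstruction error at the cost of more tokens to model, which is the trade-off behind choosing a vocabulary size.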
SP · Mar 8, 2024
Enhancing Automatic Modulation Recognition for IoT Applications Using Transformers
Narges Rashvand, Kenneth Witham, Gabriel Maldonado et al.
Automatic modulation recognition (AMR) is vital for accurately identifying modulation types within incoming signals, a critical task for optimizing operations within edge devices in IoT ecosystems. This paper presents an innovative approach that leverages Transformer networks, initially designed for natural language processing, to address the challenges of efficient AMR. Our transformer network architecture is designed with the mindset of real-time edge computing on IoT devices. Four tokenization techniques are proposed and explored for creating proper embeddings of RF signals, specifically focusing on overcoming the limitations related to the model size often encountered in IoT scenarios. Extensive experiments reveal that our proposed method outperformed advanced deep learning techniques, achieving the highest recognition accuracy. Notably, our model achieves an accuracy of 65.75 on the RML2016 and 65.80 on the CSPB.ML.2018+ dataset.
LG · Aug 1, 2021
DeepTrack: Lightweight Deep Learning for Vehicle Path Prediction in Highways
Vinit Katariya, Mohammadreza Baharani, Nichole Morris et al.
Vehicle trajectory prediction is essential for enabling safety-critical intelligent transportation systems (ITS) applications used in management and operations. While there have been some promising advances in the field, there is a need for modern deep learning algorithms that allow real-time trajectory prediction on embedded IoT devices. This article presents DeepTrack, a novel deep learning algorithm customized for real-time vehicle trajectory prediction and monitoring applications in arterial management, freeway management, traffic incident management, and work zone management for high-speed incoming traffic. In contrast to previous methods, the vehicle dynamics are encoded using Temporal Convolutional Networks (TCNs) to provide more robust time prediction with less computation. DeepTrack also uses depthwise convolution, which reduces the complexity of models compared to existing approaches in terms of model size and operations. Overall, our experimental results demonstrate that DeepTrack achieves comparable accuracy to state-of-the-art trajectory prediction models but with smaller model sizes and lower computational complexity, making it more suitable for real-world deployment.
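The claim that depthwise convolution reduces model size can be checked with simple parameter counting. The sketch below compares a standard 1D convolution with a depthwise convolution followed by a 1x1 pointwise convolution. The pointwise stage and the layer sizes are assumptions made for this illustration (the abstract only mentions depthwise convolution), and the function names are hypothetical.

```python
def conv1d_params(c_in, c_out, kernel, bias=False):
    """Weights in a standard 1D convolution: every output channel sees every input channel."""
    return c_in * c_out * kernel + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, kernel, bias=False):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = c_in * kernel + (c_in if bias else 0)   # one filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)   # mixes channels with kernel size 1
    return depthwise + pointwise


# Assumed layer: 64 input channels, 64 output channels, kernel size 3.
standard = conv1d_params(64, 64, 3)
separable = depthwise_separable_params(64, 64, 3)
print(standard, separable, standard / separable)
```

For this assumed layer the separable variant needs roughly a third of the weights, and the multiply-accumulate count per output position shrinks by the same factor.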