CV Dec 18, 2022
Performance Analysis of YOLO-based Architectures for Vehicle Detection from Traffic Images in Bangladesh
Refaat Mohammad Alamgir, Ali Abir Shuvro, Mueeze Al Mushabbir et al.
The task of locating and classifying different types of vehicles has become a vital element in numerous applications of automation and intelligent systems, ranging from traffic surveillance to vehicle identification. In recent times, Deep Learning models have dominated the field of vehicle detection, yet Bangladeshi vehicle detection has remained a relatively unexplored area. One of the main goals of vehicle detection is real-time application, where 'You Only Look Once' (YOLO) models have proven to be the most effective architecture. In this work, to find the best-suited YOLO architecture for fast and accurate vehicle detection from traffic images in Bangladesh, we conduct a performance analysis of different YOLO-based variants such as YOLOv3, YOLOv5s, and YOLOv5x. The models were trained on a dataset of 7390 images covering 21 types of vehicles, comprising samples from the DhakaAI dataset, the Poribohon-BD dataset, and our self-collected images. After thorough quantitative and qualitative analysis, we found the YOLOv5x variant to be the best-suited model, outperforming YOLOv3 and YOLOv5s by 7 and 4 percent in mAP, and by 12 and 8.5 percent in accuracy, respectively.
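The mAP figures quoted above rest on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The abstract does not describe the authors' evaluation code, so the following is only a minimal sketch of the IoU computation, assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; the function name `iou` is illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an assumption
    made for the sketch, not something stated in the abstract.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

A detection is typically counted as a true positive when its IoU with a ground-truth box of the same class exceeds a chosen threshold; the threshold used in the paper is not given in the abstract.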
AI Nov 15, 2020
Automated Intersection Management with MiniZinc
Md. Mushfiqur Rahman, Nahian Muhtasim Zahin, Kazi Raiyan Mahmud et al.
Ill-managed intersections are a primary reason behind the increasing traffic problem in urban areas, leading to non-optimal traffic flow and unnecessary deadlocks. In this paper, we propose an automated intersection management system that extracts data from a well-defined grid of sensors and optimizes traffic flow by controlling traffic signals. The data extraction mechanism is independent of the optimization algorithm, and this paper primarily emphasizes the latter. We use the MiniZinc modeling language to define our system as a constraint satisfaction problem, which can be solved using any off-the-shelf solver. The proposed system performs much better than the systems currently in use: it reduces the mean and the standard deviation of vehicle waiting time and avoids deadlocks.
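To give a feel for signal control framed as a constraint problem, here is a toy sketch in Python rather than MiniZinc. It is not the authors' model: the approach names, the `conflicts` set, and the objective (serve the most queued vehicles with one green phase) are all assumptions made for illustration.

```python
from itertools import combinations


def best_green_set(queues, conflicts):
    """Pick the conflict-free set of approaches that serves the most vehicles.

    queues:    dict mapping approach name -> number of waiting vehicles
               (hypothetical sensor readings).
    conflicts: set of frozenset pairs of approaches whose paths cross and
               therefore must not be green at the same time.
    """
    approaches = sorted(queues)
    best, best_served = (), 0
    # Exhaustively enumerate subsets; fine for a four-way toy intersection.
    for size in range(1, len(approaches) + 1):
        for subset in combinations(approaches, size):
            # Constraint: no two green approaches may conflict.
            if any(frozenset(p) in conflicts for p in combinations(subset, 2)):
                continue
            served = sum(queues[a] for a in subset)
            if served > best_served:
                best, best_served = subset, served
    return set(best)
```

A dedicated solver would replace the exhaustive loop; the point here is only the split between constraints (no conflicting greens) and objective (vehicles served).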
AI Nov 15, 2020
Automated Large-scale Class Scheduling in MiniZinc
Md. Mushfiqur Rahman, Sabah Binte Noor, Fazlul Hasan Siddiqui
Class scheduling is a highly constrained task. Educational institutes spend a lot of resources, in the form of time and manual computation, to find a satisficing schedule that fulfills all the requirements. A satisficing class schedule accommodates all students in all their desired courses at convenient times. The scheduler also needs to take into account the availability of course teachers in the given slots. With the added limitation of available classrooms, the number of solutions satisfying all constraints in this huge search space decreases further. This paper proposes an efficient system to generate class schedules that can fulfill every possible need of a typical university. Though it is primarily a fixed-credit scheduler, it can be adjusted for open-credit systems as well. The model is designed in MiniZinc and solved using various off-the-shelf solvers. The proposed scheduling system can find a balanced schedule for a moderate-sized educational institute in less than a minute.
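The constraints named above (teacher availability, limited classrooms) can be illustrated with a small brute-force search. This is a sketch in Python, not the paper's MiniZinc model; the function names and data layout are hypothetical, and exhaustive enumeration only works for tiny instances.

```python
from itertools import product


def is_valid(assignment, teacher_of, unavailable):
    """Check one candidate assignment of course -> (slot, room)."""
    used_rooms = set()
    used_teachers = set()
    for course, (slot, room) in assignment.items():
        teacher = teacher_of[course]
        # Teacher must be available in the chosen slot.
        if slot in unavailable.get(teacher, set()):
            return False
        # A room or a teacher cannot be double-booked in the same slot.
        if (slot, room) in used_rooms or (slot, teacher) in used_teachers:
            return False
        used_rooms.add((slot, room))
        used_teachers.add((slot, teacher))
    return True


def schedule(courses, slots, rooms, teacher_of, unavailable):
    """Return the first assignment satisfying all constraints, or None."""
    options = list(product(slots, rooms))
    for combo in product(options, repeat=len(courses)):
        assignment = dict(zip(courses, combo))
        if is_valid(assignment, teacher_of, unavailable):
            return assignment
    return None
```

A constraint solver prunes this search instead of enumerating it, which is what makes institute-sized instances tractable.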
CV Sep 15, 2020
Video captioning with stacked attention and semantic hard pull
Md. Mushfiqur Rahman, Thasin Abedin, Khondokar S. S. Prottoy et al.
Video captioning, i.e., the task of generating captions from video sequences, creates a bridge between the Natural Language Processing and Computer Vision domains of computer science. The task of generating a semantically accurate description of a video is quite complex. Considering the complexity of the problem, the results obtained in recent research works are praiseworthy. However, there is plenty of scope for further investigation. This paper addresses this scope and proposes a novel solution. Most video captioning models comprise two sequential/recurrent layers: one as a video-to-context encoder and the other as a context-to-caption decoder. This paper proposes a novel architecture, namely Semantically Sensible Video Captioning (SSVC), which modifies the context generation mechanism by using two novel approaches: "stacked attention" and "spatial hard pull". As there are no exclusive metrics for evaluating video captioning models, we emphasize both quantitative and qualitative analysis of our model. Hence, we have used the BLEU scoring metric for quantitative analysis and have proposed a human evaluation metric for qualitative analysis, namely the Semantic Sensibility (SS) scoring metric. The SS score overcomes the shortcomings of common automated scoring metrics. This paper reports that the use of the aforementioned novelties improves the performance of state-of-the-art architectures.
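BLEU, the quantitative metric mentioned above, is built on clipped (modified) n-gram precision. The sketch below shows only that one component for a single reference caption; full BLEU additionally combines several n-gram orders and applies a brevity penalty. The function names are illustrative, and tokenization by whitespace is an assumption.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate caption against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference, which penalizes repeated words.
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total
```

For example, `modified_precision("a man is cooking".split(), "a man is cooking food".split())` credits all four candidate words, which hints at why n-gram overlap alone can miss semantic errors that a human-judged score would catch.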