LGJun 3Code
Enhancing the MADDPG Algorithm for Multi-Agent Learning via Action Inference and Importance SamplingMarc Walden, Jason Liu, Shaashwath Sivakumar et al.
We investigate multi-agent deep reinforcement learning and propose two enhancements to the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. First, we introduce a novel Action Inference mechanism that enables each agent to predict other agents' intended actions, thereby improving the accuracy and stability of its own policy. Second, we apply an importance sampling strategy, using geometric distribution, in the replay buffer to prioritize more recent and informative experiences, which helps mitigate the non-stationarity inherent in multi-agent environments. We evaluate both modifications on the discrete-action Predator-Prey task provided by the PettingZoo library, a flexible Python interface for general multi-agent reinforcement learning benchmarks. Our results indicate that Action Inference is effective in improving learning stability and inter-agent cooperation and that importance sampling using geometric distribution can lead to significant improvements in exploration efficiency over standard MADDPG. Code available at https://github.com/shaashwathsivakumar/MARL_Proj
CVJun 30, 2022
Timestamp-Supervised Action Segmentation with Graph Convolutional NetworksHamza Khan, Sanjay Haresh, Awais Ahmed et al.
We introduce a novel approach for temporal activity segmentation with timestamp supervision. Our main contribution is a graph convolutional network, which is learned in an end-to-end manner to exploit both frame features and connections between neighboring frames to generate dense framewise labels from sparse timestamp labels. The generated dense framewise labels can then be used to train the segmentation model. In addition, we propose a framework for alternating learning of both the segmentation model and the graph convolutional model, which first initializes and then iteratively refines the learned models. Detailed experiments on four public datasets, including 50 Salads, GTEA, Breakfast, and Desktop Assembly, show that our method is superior to the multi-layer perceptron baseline, while performing on par with or better than the state of the art in temporal activity segmentation with timestamp supervision.
IVJul 18, 2024
A review of handcrafted and deep radiomics in neurological diseases: transitioning from oncology to clinical neuroimagingElizaveta Lavrova, Henry C. Woodruff, Hamza Khan et al.
Medical imaging technologies have undergone extensive development, enabling non-invasive visualization of clinical information. The traditional review of medical images by clinicians remains subjective, time-consuming, and prone to human error. With the recent availability of medical imaging data, quantification have become important goals in the field. Radiomics, a methodology aimed at extracting quantitative information from imaging data, has emerged as a promising approach to uncover hidden biological information and support decision-making in clinical practice. This paper presents a review of the radiomic pipeline from the clinical neuroimaging perspective, providing a detailed overview of each step with practical advice. It discusses the application of handcrafted and deep radiomics in neuroimaging, stratified by neurological diagnosis. Although radiomics shows great potential for increasing diagnostic precision and improving treatment quality in neurology, several limitations hinder its clinical implementation. Addressing these challenges requires collaborative efforts, advancements in image harmonization methods, and the establishment of reproducible and standardized pipelines with transparent reporting. By overcoming these obstacles, radiomics can significantly impact clinical neurology and enhance patient care.
CVDec 4, 2024
Video LLMs for Temporal Reasoning in Long VideosFawad Javed Fateh, Umer Ahmed, Hamza Khan et al.
This paper introduces TemporalVLM, a video large language model (video LLM) capable of effective temporal reasoning and fine-grained understanding in long videos. At the core, our approach includes a visual encoder for mapping a long-term input video into features which are time-aware and contain both local and global cues. In particular, it first divides the input video into short-term clips, which are jointly encoded with their timestamps and fused across overlapping temporal windows into time-sensitive local features. Next, the local features are passed through a bidirectional long short-term memory (BiLSTM) module for global feature aggregation. The extracted time-aware and multi-level features are important for accurate temporal reasoning and fine-grained understanding in long videos. Moreover, to facilitate the evaluation of TemporalVLM, we present a large-scale long video dataset of industry assembly processes, namely IndustryASM, which consists of videos recorded on factory floors with actions and timestamps annotated by industrial engineers for time and motion studies and temporal action segmentation evaluation. Finally, extensive experiments on datasets of long videos, including TimeIT and IndustryASM, show that TemporalVLM achieves superior performance than previous methods across temporal reasoning and fine-grained understanding tasks, namely dense video captioning, temporal video grounding, video highlight detection, and temporal action segmentation. To the best of our knowledge, our work is the first to incorporate LSTMs into video LLMs.
NISep 26, 2025
Leveraging Wireless Sensor Networks for Real-Time Monitoring and Control of Industrial EnvironmentsMuhammad Junaid Asif, Shazia Saqib, Rana Fayyaz Ahmad et al.
This research proposes an extensive technique for monitoring and controlling the industrial parameters using Internet of Things (IoT) technology based on wireless communication. We proposed a system based on NRF transceivers to establish a strong Wireless Sensor Network (WSN), enabling transfer of real-time data from multiple sensors to a central setup that is driven by ARDUINO microcontrollers. Different key parameters, crucial for industrial setup such as temperature, humidity, soil moisture and fire detection, are monitored and displayed on an LCD screen, enabling factory administration to oversee the industrial operations remotely over the internet. Our proposed system bypasses the need for physical presence for monitoring by addressing the shortcomings of conventional wired communication systems. Other than monitoring, there is an additional feature to remotely control these parameters by controlling the speed of DC motors through online commands. Given the rising incidence of industrial fires over the worldwide between 2020 and 2024 due to an array of hazards, this system with dual functionality boosts the overall operational efficiency and safety. This overall integration of IoT and Wireless Sensor Network (WSN) reduces the potential risks linked with physical monitoring, providing rapid responses in emergency scenarios, including the activation of firefighting equipment. The results show that innovations in wireless communication perform an integral part in industrial process automation and safety, paving the way to more intelligent and responsive operating environments. Overall, this study highlights the potential for change of IoT-enabled systems to revolutionize monitoring and control in a variety of industrial applications, resulting in increased productivity and safety.
CVAug 26, 2025
Advancements in Crop Analysis through Deep Learning and Explainable AIHamza Khan
Rice is a staple food of global importance in terms of trade, nutrition, and economic growth. Among Asian nations such as China, India, Pakistan, Thailand, Vietnam and Indonesia are leading producers of both long and short grain varieties, including basmati, jasmine, arborio, ipsala, and kainat saila. To ensure consumer satisfaction and strengthen national reputations, monitoring rice crops and grain quality is essential. Manual inspection, however, is labour intensive, time consuming and error prone, highlighting the need for automated solutions for quality control and yield improvement. This study proposes an automated approach to classify five rice grain varieties using Convolutional Neural Networks (CNN). A publicly available dataset of 75000 images was used for training and testing. Model evaluation employed accuracy, recall, precision, F1-score, ROC curves, and confusion matrices. Results demonstrated high classification accuracy with minimal misclassifications, confirming the model effectiveness in distinguishing rice varieties. In addition, an accurate diagnostic method for rice leaf diseases such as Brown Spot, Blast, Bacterial Blight, and Tungro was developed. The framework combined explainable artificial intelligence (XAI) with deep learning models including CNN, VGG16, ResNet50, and MobileNetV2. Explainability techniques such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) revealed how specific grain and leaf features influenced predictions, enhancing model transparency and reliability. The findings demonstrate the strong potential of deep learning in agricultural applications, paving the way for robust, interpretable systems that can support automated crop quality inspection and disease diagnosis, ultimately benefiting farmers, consumers, and the agricultural economy.
CVMay 7, 2025
Exploring Convolutional Neural Networks for Rice Grain Classification: An Explainable AI ApproachMuhammad Junaid Asif, Hamza Khan, Rabia Tehseen et al.
Rice is an essential staple food worldwide that is important in promoting international trade, economic growth, and nutrition. Asian countries such as China, India, Pakistan, Thailand, Vietnam, and Indonesia are notable for their significant contribution to the cultivation and utilization of rice. These nations are also known for cultivating different rice grains, including short and long grains. These sizes are further classified as basmati, jasmine, kainat saila, ipsala, arborio, etc., catering to diverse culinary preferences and cultural traditions. For both local and international trade, inspecting and maintaining the quality of rice grains to satisfy customers and preserve a country's reputation is necessary. Manual quality check and classification is quite a laborious and time-consuming process. It is also highly prone to mistakes. Therefore, an automatic solution must be proposed for the effective and efficient classification of different varieties of rice grains. This research paper presents an automatic framework based on a convolutional neural network (CNN) for classifying different varieties of rice grains. We evaluated the proposed model based on performance metrics such as accuracy, recall, precision, and F1-Score. The CNN model underwent rigorous training and validation, achieving a remarkable accuracy rate and a perfect area under each class's Receiver Operating Characteristic (ROC) curve. The confusion matrix analysis confirmed the model's effectiveness in distinguishing between the different rice varieties, indicating minimal misclassifications. Additionally, the integration of explainability techniques such as LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) provided valuable insights into the model's decision-making process, revealing how specific features of the rice grains influenced classification outcomes.
NIMar 12, 2020
Deep Learning Assisted CSI Estimation for Joint URLLC and eMBB Resource AllocationHamza Khan, M. Majid Butt, Sumudu Samarakoon et al.
Multiple-input multiple-output (MIMO) is a key for the fifth generation (5G) and beyond wireless communication systems owing to higher spectrum efficiency, spatial gains, and energy efficiency. Reaping the benefits of MIMO transmission can be fully harnessed if the channel state information (CSI) is available at the transmitter side. However, the acquisition of transmitter side CSI entails many challenges. In this paper, we propose a deep learning assisted CSI estimation technique in highly mobile vehicular networks, based on the fact that the propagation environment (scatterers, reflectors) is almost identical thereby allowing a data driven deep neural network (DNN) to learn the non-linear CSI relations with negligible overhead. Moreover, we formulate and solve a dynamic network slicing based resource allocation problem for vehicular user equipments (VUEs) requesting enhanced mobile broadband (eMBB) and ultra-reliable low latency (URLLC) traffic slices. The formulation considers a threshold rate violation probability minimization for the eMBB slice while satisfying a probabilistic threshold rate criterion for the URLLC slice. Simulation result shows that an overhead reduction of 50% can be achieved with 12% increase in threshold violations compared to an ideal case with perfect CSI knowledge.
NIJan 22, 2020
Reinforcement Learning Based Vehicle-cell Association Algorithm for Highly Mobile Millimeter Wave CommunicationHamza Khan, Anis Elgabli, Sumudu Samarakoon et al.
Vehicle-to-everything (V2X) communication is a growing area of communication with a variety of use cases. This paper investigates the problem of vehicle-cell association in millimeter wave (mmWave) communication networks. The aim is to maximize the time average rate per vehicular user (VUE) while ensuring a target minimum rate for all VUEs with low signaling overhead. We first formulate the user (vehicle) association problem as a discrete non-convex optimization problem. Then, by leveraging tools from machine learning, specifically distributed deep reinforcement learning (DDRL) and the asynchronous actor critic algorithm (A3C), we propose a low complexity algorithm that approximates the solution of the proposed optimization problem. The proposed DDRL-based algorithm endows every road side unit (RSU) with a local RL agent that selects a local action based on the observed input state. Actions of different RSUs are forwarded to a central entity, that computes a global reward which is then fed back to RSUs. It is shown that each independently trained RL performs the vehicle-RSU association action with low control overhead and less computational complexity compared to running an online complex algorithm to solve the non-convex optimization problem. Finally, simulation results show that the proposed solution achieves up to 15\% gains in terms of sum rate and 20\% reduction in VUE outages compared to several baseline designs.