MM Sep 3, 2022
Deep Live Video Ad Placement on the 5G Edge
Mohammad Hosseini
The video broadcasting industry has grown significantly in recent years, especially in delivering personalized content to end users. While video broadcasting has continued to grow beyond TV, video advertising has become a key marketing tool for delivering targeted messages directly to the audience. For broadband TV, however, a key problem is that TV commercials target a broad audience and therefore lack user-specific, personalized ad content. In this paper, we propose a deep edge-cloud ad-placement system, and briefly describe our methodology and the architecture of the system for delivering both Video on Demand (VoD) and live broadcast TV content over the MMT streaming protocol. Our aim is to showcase how to enable targeted, personalized, and user-specific advertising services deployed on future 5G MEC platforms, which in turn have high potential to increase ad revenues for the mobile operator industry.
CV Sep 10, 2023
Faster, Lighter, More Accurate: A Deep Learning Ensemble for Content Moderation
Mohammad Hosseini, Mahmudul Hasan
To address the increasing need for efficient and accurate content moderation, we propose an efficient and lightweight deep classification ensemble structure. Our approach is based on a combination of simple visual features, designed for high-accuracy classification of violent content with low false positives. Our ensemble architecture utilizes a set of lightweight models with narrowed-down color features, and we apply it to both images and videos. We evaluated our approach using a large dataset of explosion and blast contents and compared its performance to popular deep learning models such as ResNet-50. Our evaluation results demonstrate significant improvements in prediction accuracy, while benefiting from 7.64x faster inference and lower computation cost. While our approach is tailored to explosion detection, it can be applied to other similar content moderation and violence detection use cases as well. Based on our experiments, we propose a "think small, think many" philosophy in classification scenarios. We argue that transforming a single, large, monolithic deep model into a verification-based step model ensemble of multiple small, simple, and lightweight models with narrowed-down visual features can possibly lead to predictions with higher accuracy.
CV Mar 4, 2025
Revolutionizing Traffic Management with AI-Powered Machine Vision: A Step Toward Smart Cities
Seyed Hossein Hosseini DolatAbadi, Sayyed Mohammad Hossein Hashemi, Mohammad Hosseini et al.
The rapid urbanization of cities and increasing vehicular congestion have posed significant challenges to traffic management and safety. This study explores the transformative potential of artificial intelligence (AI) and machine vision technologies in revolutionizing traffic systems. By leveraging advanced surveillance cameras and deep learning algorithms, this research proposes a system for real-time detection of vehicles, traffic anomalies, and driver behaviors. The system integrates geospatial and weather data to adapt dynamically to environmental conditions, ensuring robust performance in diverse scenarios. Using YOLOv8 and YOLOv11 models, the study achieves high accuracy in vehicle detection and anomaly recognition, optimizing traffic flow and enhancing road safety. These findings contribute to the development of intelligent traffic management solutions and align with the vision of creating smart cities with sustainable and efficient urban infrastructure.
CL Sep 25, 2025
PerHalluEval: Persian Hallucination Evaluation Benchmark for Large Language Models
Mohammad Hosseini, Kimia Hosseini, Shayan Bali et al.
Hallucination is a persistent issue affecting all large language models (LLMs), particularly within low-resource languages such as Persian. PerHalluEval (Persian Hallucination Evaluation) is the first dynamic hallucination evaluation benchmark tailored for the Persian language. Our benchmark leverages a three-stage LLM-driven pipeline, augmented with human validation, to generate plausible answers and summaries for QA and summarization tasks, focusing on detecting extrinsic and intrinsic hallucinations. Moreover, we used the log probabilities of generated tokens to select the most believable hallucinated instances. In addition, we engaged human annotators to highlight Persian-specific contexts in the QA dataset in order to evaluate LLMs' performance on content specifically related to Persian culture. Our evaluation of 12 LLMs, including open- and closed-source models, using PerHalluEval revealed that the models generally struggle to detect hallucinated Persian text. We showed that providing external knowledge, i.e., the original document for the summarization task, could partially mitigate hallucination. Furthermore, there was no significant difference in terms of hallucination when comparing LLMs specifically trained for Persian with others.
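The log-probability selection step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how token log probabilities are aggregated, so the length-normalized mean used here is an assumption, and the function names `mean_logprob` and `most_believable` are hypothetical.

```python
from typing import Dict, List


def mean_logprob(token_logprobs: List[float]) -> float:
    """Length-normalized score (assumed aggregation, not taken from the paper)."""
    return sum(token_logprobs) / len(token_logprobs)


def most_believable(candidates: Dict[str, List[float]]) -> str:
    """Pick the candidate text whose tokens the generator found most likely."""
    return max(candidates, key=lambda text: mean_logprob(candidates[text]))


# Hypothetical candidates: text -> per-token log probabilities.
candidates = {
    "hallucinated answer A": [-0.1, -0.2],
    "hallucinated answer B": [-1.0, -2.0],
}
selected = most_believable(candidates)
```

Normalizing by length avoids favoring short candidates, which a plain sum of log probabilities would do.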
NC Sep 23, 2025
Dynamical Modeling of Behaviorally Relevant Spatiotemporal Patterns in Neural Imaging Data
Mohammad Hosseini, Maryam M. Shanechi
High-dimensional imaging of neural activity, such as widefield calcium and functional ultrasound imaging, provides a rich source of information for understanding the relationship between brain activity and behavior. Accurately modeling neural dynamics in these modalities is crucial for understanding this relationship but is hindered by the high dimensionality, complex spatiotemporal dependencies, and prevalent behaviorally irrelevant dynamics in these modalities. Existing dynamical models often employ preprocessing steps to obtain low-dimensional representations from neural image modalities. However, this process can discard behaviorally relevant information and miss spatiotemporal structure. We propose SBIND, a novel data-driven deep learning framework to model spatiotemporal dependencies in neural images and disentangle their behaviorally relevant dynamics from other neural dynamics. We validate SBIND on widefield imaging datasets, and show its extension to functional ultrasound imaging, a recent modality whose dynamical modeling has largely remained unexplored. We find that our model effectively identifies both local and long-range spatial dependencies across the brain while also dissociating behaviorally relevant neural dynamics. In doing so, SBIND outperforms existing models in neural-behavioral prediction. Overall, SBIND provides a versatile tool for investigating the neural mechanisms underlying behavior using imaging modalities.
DL Nov 7, 2024
GREI Data Repository AI Taxonomy
John Chodacki, Mark Hanhel, Stefano Iacus et al.
The Generalist Repository Ecosystem Initiative (GREI), funded by the NIH, developed an AI taxonomy tailored to data repository roles to guide AI integration across repository management. It categorizes the roles into stages, including acquisition, validation, organization, enhancement, analysis, sharing, and user support, providing a structured framework for implementing AI in repository workflows.
CV Mar 18, 2021
The Case for High-Accuracy Classification: Think Small, Think Many!
Mohammad Hosseini, Mahmudul Hasan
To facilitate the implementation of high-accuracy deep neural networks, especially on resource-constrained devices, maintaining low computation requirements is crucial. Using very deep models for classification not only decreases training speed and increases inference time, but also requires more data to achieve higher prediction accuracy and to mitigate false positives. In this paper, we propose an efficient and lightweight deep classification ensemble structure based on a combination of simple color features, which is particularly designed for "high-accuracy" image classification with low false positives. We designed, implemented, and evaluated our approach for an explosion detection use case applied to images and videos. Our evaluation results based on a large test set show considerable improvements in prediction accuracy compared to the popular ResNet-50 model, while benefiting from 7.64x faster inference and lower computation cost. While we applied our approach to explosion detection, it is general and can be applied to other similar classification use cases as well. Given the insight gained from our experiments, we propose a "think small, think many" philosophy in classification scenarios: transforming a single, large, monolithic deep model into a verification-based step model ensemble of multiple small, simple, lightweight models with narrowed-down color spaces can possibly lead to predictions with higher accuracy.
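One possible reading of a "verification-based step model ensemble" is sketched below: an input is labeled positive only if every small model in a chain confirms it, which tends to lower false positives. This is an assumption about the structure, not the authors' implementation; the `Model` type, the 0.5 threshold, and the stand-in models are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical interface: each small model maps an input to P(positive class).
Model = Callable[[object], float]


def step_ensemble_predict(models: Sequence[Model], x: object,
                          threshold: float = 0.5) -> bool:
    """Accept the positive label only if every model in the chain verifies it."""
    for model in models:
        if model(x) < threshold:
            return False  # early exit: later models are never evaluated
    return True


# Stand-in models (constant scores) in place of trained color-feature classifiers.
confident = lambda x: 0.9
doubtful = lambda x: 0.2

accepted = step_ensemble_predict([confident, confident], x="frame")
rejected = step_ensemble_predict([confident, doubtful], x="frame")
```

Because a rejection at any step stops the chain, most negative inputs cost only one or two small forward passes.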
MM Nov 3, 2019
Adaptive Rate Allocation for View-Aware Point-Cloud Streaming
Mohammad Hosseini
In the context of view-dependent point-cloud streaming in a scene, our rate allocation is "adaptive" in the sense that it prioritizes the point-cloud models depending on the camera view, the visibility of the objects, and their distance. The algorithm delivers a higher bitrate to point-cloud models that are inside the user's viewport, are more likely to be looked at, or are closer to the view camera, while delivering a lower quality level to point-cloud models outside the user's immediate viewport or farther away from the camera. To that end, we explain the rate allocation problem within the context of multi-point-cloud streaming, where multiple point-cloud models are to be streamed to the target device, and propose a heuristic rate allocation algorithm to enable adaptation within this context. To the best of our knowledge, this is the first work to mathematically model rate allocation, and to propose a heuristic rate allocation algorithm, within the context of point-cloud streaming.
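A greedy heuristic in the spirit of this description might look like the sketch below. It is not the paper's algorithm: the `PointCloudModel` fields, the `priority` weighting (0.25 for out-of-viewport models, inverse-distance scaling), and the bitrate ladder are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PointCloudModel:
    name: str
    in_viewport: bool
    distance: float        # distance to the camera, arbitrary units
    bitrates: List[float]  # available representations, ascending (Mbps)


def priority(m: PointCloudModel) -> float:
    """Assumed weighting: visible and nearby models score higher."""
    visibility = 1.0 if m.in_viewport else 0.25
    return visibility / (1.0 + m.distance)


def allocate(models: List[PointCloudModel], budget: float) -> Dict[str, float]:
    """Start every model at its lowest bitrate, then upgrade by priority."""
    level = {m.name: 0 for m in models}
    used = sum(m.bitrates[0] for m in models)
    for m in sorted(models, key=priority, reverse=True):
        while level[m.name] + 1 < len(m.bitrates):
            extra = m.bitrates[level[m.name] + 1] - m.bitrates[level[m.name]]
            if used + extra > budget:
                break  # the next step up does not fit in the budget
            level[m.name] += 1
            used += extra
    return {m.name: m.bitrates[level[m.name]] for m in models}


scene = [
    PointCloudModel("A", in_viewport=True, distance=1.0, bitrates=[1, 2, 4]),
    PointCloudModel("B", in_viewport=False, distance=1.0, bitrates=[1, 2, 4]),
]
plan = allocate(scene, budget=6.0)
```

In this sketch every model keeps at least its lowest representation, even if the sum of the lowest bitrates already exceeds the budget; a real system would need a policy for that case.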
MM Apr 29, 2018
Dynamic Adaptive Point Cloud Streaming
Mohammad Hosseini, Christian Timmerer
High-quality point clouds have recently gained interest as an emerging form of representing immersive 3D graphics. Unfortunately, these 3D media are bulky and severely bandwidth intensive, which makes them difficult to stream to resource-limited and mobile devices. This has prompted researchers to propose efficient and adaptive approaches for streaming high-quality point clouds. In this paper, we run a pilot study towards dynamic adaptive point cloud streaming, and extend the concept of dynamic adaptive streaming over HTTP (DASH) towards DASH-PC, a dynamic adaptive, bandwidth-efficient, and view-aware point cloud streaming system. DASH-PC can tackle the huge bandwidth demands of dense point cloud streaming while at the same time semantically linking to human visual acuity to maintain high visual quality when needed. In order to describe the various quality representations, we propose multiple thinning approaches to spatially sub-sample point clouds in the 3D space, and design a DASH Media Presentation Description manifest specific to point cloud streaming. Our initial evaluations show that we can achieve significant bandwidth and performance improvements on dense point cloud streaming with minor negative quality impacts compared to the baseline scenario in which no adaptation is applied.
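The abstract does not specify which thinning approaches are used. As one illustrative possibility, the sketch below shows voxel-grid thinning, a common way to spatially sub-sample a point cloud: space is divided into cubic cells and one point is kept per occupied cell. The function name `voxel_thin` and the cell sizes are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def voxel_thin(points: List[Point], cell: float) -> List[Point]:
    """Keep the first point that falls in each cubic cell of side `cell`."""
    kept: Dict[Tuple[int, ...], Point] = {}
    for p in points:
        key = tuple(math.floor(c / cell) for c in p)  # integer cell index
        kept.setdefault(key, p)
    return list(kept.values())


cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.5, 0.0, 0.0)]
coarse = voxel_thin(cloud, cell=1.0)   # larger cells: fewer points, lower bitrate
fine = voxel_thin(cloud, cell=0.15)    # smaller cells: more points retained
```

Running the same cloud through several cell sizes yields a ladder of density levels, which is the kind of quality representation a manifest could then describe.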
SE Nov 25, 2017
Communication and Synchronization of Distributed Medical Models: Design, Development, and Performance Analysis
Mohammad Hosseini, Richard Berlin, Lui Sha et al.
Model-based development is a widely used method for describing complex systems that enables rapid prototyping. Advances in the science of distributed systems have led to the development of large-scale statechart models that are distributed among multiple locations. Taking medicine as an example, models of best-practice guidelines during rural ambulance transport are distributed across hospital settings, from a rural hospital, to an ambulance, to a central tertiary hospital. These medical models require continuous, real-time communication across individual models in physically distributed treatment locations, which provides vital assistance to clinicians and physicians. This makes it necessary to offer methods for model-driven communication and synchronization in a distributed environment. In this paper, we describe ModelSink, a middleware that addresses the problem of communication and synchronization of heterogeneous distributed models. Motivated by the synchronization requirements during emergency ambulance transport, we use medical best-practice models as a case study to illustrate the notion of distributed models. Through ModelSink, we achieve an efficient communication architecture, an open-loop-safe protocol, and queuing and mapping mechanisms compliant with the semantics of statechart-based model-driven development. We evaluated the performance of ModelSink on distributed sets of medical models that we have developed, to assess how ModelSink performs under various loads. Our work is intended to help clinicians, EMTs, and medical staff prevent unintended deviations from medical best practices and overcome the connectivity and coordination challenges that exist in a distributed hospital network. Our experience suggests that there are additional potential domains beyond medicine where our middleware can provide needed utility.
NI Jul 16, 2017
Towards Physiology-Aware DASH: Bandwidth-Compliant Prioritized Clinical Multimedia Communication in Ambulances
Mohammad Hosseini, Yu Jiang, Richard R. Berlin et al.
The ultimate objective of medical cyber-physical systems is to enhance the safety and effectiveness of patient care. To ensure safe and effective care during emergency patient transfer from rural areas to central tertiary hospitals, reliable and real-time communication is essential. Unfortunately, real-time monitoring of patients involves the transmission of various clinical multimedia data, including videos, medical images, and vital signs, which requires a mobile network with high-fidelity communication bandwidth. However, the wireless networks along the roads in rural areas range from 4G to 2G to low-speed satellite links, which poses a significant challenge to transmitting critical patient information. In this paper, we present a bandwidth-compliant, criticality-aware system for the transmission of massive clinical multimedia data that adapts to varying bandwidths during patient transport. Model-based clinical automata are used to determine the criticality of clinical multimedia data. We borrow concepts from DASH, and propose physiology-aware adaptation techniques to transmit more critical clinical data with higher fidelity in response to changes in disease, clinical states, and bandwidth conditions. In collaboration with Carle's ambulance service center, we develop a bandwidth profiler, and use it as a proof of concept to support our experiments. Our preliminary evaluation results show that our solutions ensure that a patient's most critical clinical data are communicated with higher fidelity.
MM Jan 23, 2017
Adaptive 360 VR Video Streaming based on MPEG-DASH SRD
Mohammad Hosseini, Viswanathan Swaminathan
We demonstrate an adaptive bandwidth-efficient 360 VR video streaming system based on MPEG-DASH SRD. We extend MPEG-DASH SRD to the 3D space of 360 VR videos, and showcase a dynamic view-aware adaptation technique to tackle the high bandwidth demands of streaming 360 VR videos to wireless VR headsets. We spatially partition the underlying 3D mesh into multiple 3D sub-meshes, and construct an efficient 3D geometry mesh called hexaface sphere to optimally represent tiled 360 VR videos in the 3D space. We then spatially divide the 360 videos into multiple tiles while encoding and packaging, use MPEG-DASH SRD to describe the spatial relationship of tiles in the 3D space, and prioritize the tiles in the Field of View (FoV) for view-aware adaptation. Our initial evaluation results show that we can save up to 72% of the required bandwidth on 360 VR video streaming with minor negative quality impacts compared to the baseline scenario in which no adaptation is applied.
MM Sep 28, 2016
Adaptive 360 VR Video Streaming: Divide and Conquer!
Mohammad Hosseini, Viswanathan Swaminathan
While traditional multimedia applications such as games and videos are still popular, there has been significant interest in recent years in new 3D media such as 3D immersion and Virtual Reality (VR) applications, especially 360 VR videos. A 360 VR video is an immersive spherical video in which the user can look around during playback. Unfortunately, 360 VR videos are extremely bandwidth intensive, and are therefore difficult to stream at acceptable quality levels. In this paper, we propose an adaptive bandwidth-efficient 360 VR video streaming system using a divide and conquer approach. In our approach, we propose a dynamic view-aware adaptation technique to tackle the huge streaming bandwidth demands of 360 VR videos. We spatially divide the videos into multiple tiles while encoding and packaging, use MPEG-DASH SRD to describe the spatial relationship of tiles in the 360-degree space, and prioritize the tiles in the Field of View (FoV). In order to describe such tiled representations, we extend MPEG-DASH SRD to the 3D space of 360 VR videos. We spatially partition the underlying 3D mesh, and construct an efficient 3D geometry mesh called hexaface sphere to optimally represent a tiled 360 VR video in the 3D space. Our initial evaluation results report up to 72% bandwidth savings on 360 VR video streaming with minor negative quality impacts compared to the baseline scenario in which no adaptation is applied.
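The FoV-based tile prioritization can be sketched as follows. This is a simplified illustration, not the system described in the paper: tiles are reduced to a single center direction, the viewport is treated as a cone, and the tile names, the 100-degree FoV, and the two quality labels are assumptions.

```python
import math
from typing import Dict, Tuple

Vec = Tuple[float, float, float]


def direction(yaw_deg: float, pitch_deg: float) -> Vec:
    """Unit vector for a viewing direction given yaw and pitch in degrees."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.cos(pitch) * math.cos(yaw),
            math.cos(pitch) * math.sin(yaw),
            math.sin(pitch))


def angle_between(a: Vec, b: Vec) -> float:
    """Angle in degrees between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def pick_qualities(tile_centers: Dict[str, Tuple[float, float]],
                   view: Tuple[float, float],
                   fov_deg: float = 100.0) -> Dict[str, str]:
    """Request high quality for tiles whose center lies inside the FoV cone."""
    v = direction(*view)
    return {
        name: "high" if angle_between(direction(yaw, pitch), v) <= fov_deg / 2
        else "low"
        for name, (yaw, pitch) in tile_centers.items()
    }


# Hypothetical tile centers as (yaw, pitch) in degrees.
tiles = {"front": (0.0, 0.0), "back": (180.0, 0.0), "top": (0.0, 90.0)}
choice = pick_qualities(tiles, view=(0.0, 0.0))
```

A production player would test tile extent rather than only the center, and would re-evaluate the choice at each segment boundary as the head orientation changes.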
MM Mar 19, 2016
Towards Coordinated Bandwidth Adaptations for Hundred-Scale 3D Tele-Immersive Systems
Mohammad Hosseini, Gregorij Kurillo, Seyed Rasoul Etesami et al.
3D tele-immersion improves the state of collaboration among geographically distributed participants. Unlike traditional 2D video, a 3D tele-immersive system employs multiple 3D cameras at each physical site to cover a much larger field of view, generating a very large amount of stream data. One of the major challenges is how to efficiently transmit these bulky 3D streaming data to bandwidth-constrained sites. In this paper, we study an adaptive Human Visual System (HVS)-compliant bandwidth management framework for efficient delivery of hundred-scale streams produced from distributed 3D tele-immersive sites to a receiver site with a limited bandwidth budget. Our adaptation framework exploits the semantic link between the HVS and multiple 3D streams in the 3D tele-immersive environment. We developed TELEVIS, a visual simulation tool to showcase an HVS-aware tele-immersive system for realistic cases. Our evaluation results show that the proposed adaptation can improve the total quality per unit of bandwidth used to deliver streams in 3D tele-immersive systems.
SE Oct 19, 2015
SINk: A Middleware for Synchronization of Heterogeneous Software Interfaces
Mohammad Hosseini, Yu Jiang, Poliang Wu et al.
Software is everywhere. The growing need to support a wide variety of domains has rapidly increased the complexity of software systems, making them harder to maintain and harder for end-users to learn. This in turn has created demand for user-friendly application software with simple interfaces that are easy to use and interact with, especially for non-professional personnel. However, due to the lack of source code access for third-party software and the lack of non-graphical interfaces, such as web services or RMI (Remote Method Invocation) access to application functionality, synchronization between heterogeneous closed-box software interfaces and a user-friendly version of those interfaces has become a major challenge. In this paper, we design SINk, a middleware that enables synchronization of multiple heterogeneous software applications using only the graphical interface, without the need for source code access or control over the entire platform. SINk helps with synchronization of closed-box industry software, where the only possible way of communication is through software interfaces. It leverages an efficient client-server architecture, a socket-based protocol, adaptation to resolution changes, and parameter mapping mechanisms to transfer control events, ensuring that the real-time requirements of synchronization among multiple interfaces are met. Our proof-of-concept evaluation shows that our middleware has potential uses in a wide variety of domains.