Reza Rawassizadeh

LG
h-index: 43
25 papers
281 citations
Novelty: 37%
AI Score: 53

25 Papers

CV · Apr 15, 2023 · Code
Beta-Rank: A Robust Convolutional Filter Pruning Method For Imbalanced Medical Image Analysis

Morteza Homayounfar, Mohamad Koohi-Moghadam, Reza Rawassizadeh et al.

Because deep neural networks contain a large number of parameters and operations, deploying them on devices with limited computational resources is challenging. Despite the development of novel pruning methods toward resource-efficient models, it has become evident that these models cannot handle imbalanced data or a limited number of data points. We propose a novel filter pruning method that considers the input and output of each filter along with the filter's values, and that deals with imbalanced datasets better than other methods. Our pruning method builds on the observation that the importance of a filter may not be fully reflected in the filter's values. Instead, it is reflected in the changes made to the data after the filter is applied to it. In this work, three methods are compared under the same training conditions except for the ranking values of each method, and 14 methods are compared from other papers. We demonstrate that our model performs significantly better than other methods on imbalanced medical datasets. For example, when we removed up to 58% of FLOPs for the IDRID dataset and up to 45% for the ISIC dataset, our model yielded a result equivalent (or even superior) to the baseline model. To evaluate FLOP and parameter reduction using our model in real-world settings, we built a smartphone app, with which we demonstrated a reduction of up to 79% in memory usage and 72% in prediction time. All code and parameters for training the different models are available at https://github.com/mohofar/Beta-Rank

CV · Nov 16, 2022 · Code
LightDepth: A Resource Efficient Depth Estimation Approach for Dealing with Ground Truth Sparsity via Curriculum Learning

Fatemeh Karimi, Amir Mehrpanah, Reza Rawassizadeh

Advances in neural networks enable tackling complex computer vision tasks, such as depth estimation of outdoor scenes, at unprecedented accuracy. Promising research has been done on depth estimation. However, current efforts are computationally resource-intensive and do not consider the resource constraints of autonomous devices, such as robots and drones. In this work, we present a fast and battery-efficient approach for depth estimation. Our approach devises model-agnostic curriculum-based learning for depth estimation. Our experiments show that the accuracy of our model is on par with state-of-the-art models, while its response time outperforms other models by 71%. All code is available online at https://github.com/fatemehkarimii/LightDepth.

SD · Aug 24, 2023
A Survey of AI Music Generation Tools and Models

Yueyue Zhu, Jared Baca, Banafsheh Rekabdar et al.

In this work, we provide a comprehensive survey of AI music generation tools, including both research projects and commercialized applications. To conduct our analysis, we classified music generation approaches into three categories: parameter-based, text-based, and visual-based classes. Our survey highlights the diverse possibilities and functional features of these tools, which cater to a wide range of users, from regular listeners to professional musicians. We observed that each tool has its own set of advantages and limitations. As a result, we have compiled a comprehensive list of these factors that should be considered during the tool selection process. Moreover, our survey offers critical insights into the underlying mechanisms and challenges of AI music generation.

CV · Oct 9, 2023
Augmenting Vision-Based Human Pose Estimation with Rotation Matrix

Milad Vazan, Fatemeh Sadat Masoumi, Ruizhi Ou et al.

Fitness applications are commonly used to monitor gym activities, but they often fail to track indoor activities automatically. This study proposes a model that utilizes pose estimation combined with a novel data augmentation method, i.e., rotation matrix. We aim to enhance the classification accuracy of activity recognition based on pose estimation data. In our experiments, we evaluate different classification algorithms along with image augmentation approaches. Our findings demonstrate that the SVM with SGD optimization, using data augmentation with the Rotation Matrix, yields the most accurate results, achieving a 96% accuracy rate in classifying five physical activities. Conversely, without the data augmentation techniques, the baseline accuracy remains at a modest 64%.
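
The rotation-matrix augmentation mentioned above can be illustrated with a minimal sketch. It assumes 2D keypoints, rotation about a fixed center, and a hand-picked set of angles; none of these details are given in the abstract, and the function names `rotate_keypoints` and `augment_pose` are hypothetical.

```python
import math

def rotate_keypoints(keypoints, angle_deg, center=(0.0, 0.0)):
    """Rotate 2D (x, y) keypoints around `center` by `angle_deg` degrees."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in keypoints:
        dx, dy = x - cx, y - cy
        # Apply the 2D rotation matrix [[cos, -sin], [sin, cos]] to (dx, dy).
        rotated.append((cx + cos_t * dx - sin_t * dy,
                        cy + sin_t * dx + cos_t * dy))
    return rotated

def augment_pose(keypoints, angles=(-15, -5, 5, 15)):
    """Return one rotated copy of the pose per angle (angles are an assumption)."""
    return [rotate_keypoints(keypoints, a) for a in angles]

# Toy pose with two keypoints; real pose estimators emit many more.
pose = [(1.0, 0.0), (0.0, 1.0)]
augmented = augment_pose(pose)
```

Each rotated copy keeps the original activity label, so a small labeled set of poses yields several training samples per recording.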

CL · Oct 24, 2022
Speeding Up Question Answering Task of Language Models via Inverted Index

Xiang Ji, Yesim Sungu-Eryilmaz, Elaheh Momeni et al.

Natural language processing applications, such as conversational agents and their question-answering capabilities, are widely used in the real world. Despite the wide popularity of large language models (LLMs), few real-world conversational agents take advantage of LLMs. The extensive resources consumed by LLMs prevent developers from integrating them into end-user applications. In this study, we leverage an inverted indexing mechanism combined with LLMs to improve the efficiency of question-answering models for closed-domain questions. Our experiments show that using the index improves the average response time by 97.44%. In addition, due to the reduced search scope, the average BLEU score improved by 0.23 when using the inverted index.
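
As a rough illustration of how an inverted index can narrow the search scope before a language model is queried, the sketch below maps tokens to passage ids and ranks passages by token overlap with the question. The tokenizer, the overlap scoring, and the names `build_inverted_index` and `candidate_passages` are assumptions made for illustration; the abstract does not describe the paper's actual pipeline.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(passages):
    """Map each token to the set of passage ids that contain it."""
    index = defaultdict(set)
    for pid, text in enumerate(passages):
        for tok in tokenize(text):
            index[tok].add(pid)
    return index

def candidate_passages(index, question, top_k=2):
    """Rank passages by how many distinct question tokens they share."""
    scores = defaultdict(int)
    for tok in set(tokenize(question)):
        for pid in index.get(tok, ()):
            scores[pid] += 1
    ranked = sorted(scores, key=lambda pid: (-scores[pid], pid))
    return ranked[:top_k]

passages = [
    "The clinic opens at nine in the morning.",
    "Parking is free for visitors on weekends.",
    "The pharmacy closes at six in the evening.",
]
index = build_inverted_index(passages)
hits = candidate_passages(index, "When does the pharmacy close?", top_k=1)
```

Only the passages in `hits` would then be handed to the question-answering model, which is where the response-time saving would come from in a setup like this.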

LG · Jul 5, 2024
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models

Heng Lu, Mehdi Alemi, Reza Rawassizadeh

Deep reinforcement learning (DRL) has achieved remarkable success across various domains, such as video games, robotics, and, recently, large language models. However, the computational costs and memory requirements of DRL models often limit their deployment in resource-constrained environments. This challenge underscores the urgent need to explore neural network compression methods to make DRL models more practical and broadly applicable. Our study investigates the impact of two prominent compression methods, quantization and pruning, on DRL models. We examine how these techniques influence four performance factors: average return, memory, inference time, and battery utilization across various DRL algorithms and environments. Although these compression techniques decrease model size, we find that they generally do not improve the energy efficiency of DRL models. We provide insights into the trade-offs between model compression and DRL performance, offering guidelines for deploying efficient DRL models in resource-constrained settings.
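
To make the quantization side of this concrete, here is a minimal sketch of symmetric per-tensor 8-bit quantization on a plain list of weights. The abstract does not say which quantization schemes were evaluated, so the scheme and the helper names `quantize_int8` and `dequantize` are illustrative assumptions only.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Map the integers back to approximate float weights."""
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight differs from the original by at most about scale / 2.
```

Storing one byte per weight instead of four shrinks the model, but whether that lowers energy use depends on the hardware and runtime, which is the question the study measures.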

LG · Jul 9, 2025 · Code
Attentions Under the Microscope: A Comparative Study of Resource Utilization for Variants of Self-Attention

Zhengyu Tian, Anantha Padmanaban Krishna Kumar, Hemant Krishnakumar et al.

As large language models (LLMs) and visual language models (VLMs) grow in scale and application, attention mechanisms have become a central computational bottleneck due to their high memory and time complexity. While many efficient attention variants have been proposed, there remains a lack of rigorous evaluation of their actual energy usage and hardware resource demands during training. In this work, we benchmark eight attention mechanisms in training the GPT-2 architecture, measuring key metrics including training time, GPU memory usage, FLOPS, CPU usage, and power consumption. Our results reveal that attention mechanisms with optimized kernel implementations, including Flash Attention, Locality-Sensitive Hashing (LSH) Attention, and Multi-Head Latent Attention (MLA), achieve the best energy efficiency. We further show that lower GPU power alone does not guarantee reduced energy use, as training time plays an equally important role. Our study highlights the importance of energy-aware benchmarking in attention design and provides practical insight for selecting resource-efficient mechanisms. All our code is available on GitHub.
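
The point that lower GPU power does not guarantee lower energy follows from energy being average power multiplied by time. The sketch below works this through with made-up numbers; the power draws and training durations are hypothetical and are not measurements from the paper.

```python
def energy_wh(avg_power_watts, hours):
    """Energy in watt-hours = average power (W) x duration (h)."""
    return avg_power_watts * hours

# Hypothetical runs: variant A draws less power but trains for longer.
variant_a = energy_wh(avg_power_watts=250, hours=10)  # 2500 Wh
variant_b = energy_wh(avg_power_watts=300, hours=7)   # 2100 Wh

# B uses less total energy despite its higher power draw.
more_efficient = "B" if variant_b < variant_a else "A"
```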

IV · May 6, 2020 · Code
CovidCTNet: An Open-Source Deep Learning Approach to Identify Covid-19 Using CT Image

Tahereh Javaheri, Morteza Homayounfar, Zohreh Amoozgar et al.

Coronavirus disease 2019 (Covid-19) is highly contagious with limited treatment options. Early and accurate diagnosis of Covid-19 is crucial in reducing the spread of the disease and its associated mortality. Currently, detection by reverse transcriptase polymerase chain reaction (RT-PCR) is the gold standard of outpatient and inpatient detection of Covid-19. RT-PCR is a rapid method; however, its accuracy in detection is only ~70-75%. Another approved strategy is computed tomography (CT) imaging. CT imaging has a much higher sensitivity of ~80-98%, but a similar accuracy of 70%. To enhance the accuracy of CT imaging detection, we developed an open-source set of algorithms called CovidCTNet that successfully differentiates Covid-19 from community-acquired pneumonia (CAP) and other lung diseases. CovidCTNet increases the accuracy of CT imaging detection to 90% compared to radiologists (70%). The model is designed to work with heterogeneous and small sample sizes independent of the CT imaging hardware. In order to facilitate the detection of Covid-19 globally and assist radiologists and physicians in the screening process, we are releasing all algorithms and parametric details in an open-source format. Open-source sharing of our CovidCTNet enables developers to rapidly improve and optimize services, while preserving user privacy and data ownership.

CV · Mar 19
Interpretable Prostate Cancer Detection using a Small Cohort of MRI Images

Vahid Monfared, Mohammad Hadi Gharib, Ali Sabri et al.

Prostate cancer is a leading cause of mortality in men, yet interpretation of T2-weighted prostate MRI remains challenging due to subtle and heterogeneous lesions. We developed an interpretable framework for automatic cancer detection using a small dataset of 162 T2-weighted images (102 cancer, 60 normal), addressing data scarcity through transfer learning and augmentation. We performed a comprehensive comparison of Vision Transformers (ViT, Swin), CNNs (ResNet18), and classical methods (Logistic Regression, SVM, HOG+SVM). Transfer-learned ResNet18 achieved the best performance (90.9% accuracy, 95.2% sensitivity, AUC 0.905) with only 11M parameters, while Vision Transformers showed lower performance despite substantially higher complexity. Notably, HOG+SVM achieved comparable accuracy (AUC 0.917), highlighting the effectiveness of handcrafted features in small datasets. Unlike state-of-the-art approaches relying on biparametric MRI (T2+DWI) and large cohorts, our method achieves competitive performance using only T2-weighted images, reducing acquisition complexity and computational cost. In a reader study of 22 cases, five radiologists achieved a mean sensitivity of 67.5% (Fleiss Kappa = 0.524), compared to 95.2% for the AI model, suggesting potential for AI-assisted screening to reduce missed cancers and improve consistency. Code and data are publicly available.

LG · May 3
NeuroViz: Real-time Interactive Visualization of Forward and Backward Passes in Neural Network Training

Reza Rawassizadeh, Tanvi Sharma

Training neural networks is difficult to interpret, particularly for newcomers. We introduce NeuroViz, an interactive visualization tool that supports real-time exploration of fully connected neural network training. Users can configure network architecture, activation functions, learning rates, and datasets, then observe activations, weight updates, and loss progression. NeuroViz visualizes weight changes in direct correspondence with activation signals in both forward and backward passes, enabling users to distinguish pre- and post-update states within individual epochs and view dynamically updating per-neuron equations. We conducted a comparative user study with 31 participants against six established visualization tools, in which NeuroViz achieved the highest usability score (SUS 80.97, in the 'excellent' range), with mean rankings of 2.47 for clarity and 2.23 for usefulness (lower is better). Over 70% of participants reported that the visualizations substantially increased their perception of neural network training transparency. The implemented instance is accessible at https://neuroviz.org.

DB · Apr 12, 2024
Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases

Xiang Zhang, Khatoon Khedri, Reza Rawassizadeh

Large Language Models (LLMs) can automate or substitute different types of tasks in the software engineering process. This study evaluates the resource utilization and accuracy of LLMs in interpreting and executing natural language queries against traditional SQL within relational database management systems. We empirically examine the resource utilization and accuracy of nine LLMs varying from 7 to 34 billion parameters, including Llama2 7B, Llama2 13B, Mistral, Mixtral, Optimus-7B, SUS-chat-34B, platypus-yi-34b, NeuralHermes-2.5-Mistral-7B and Starling-LM-7B-alpha, using a small transaction dataset. Our findings indicate that using LLMs for database queries incurs significant energy overhead (even for small and quantized models), making it an environmentally unfriendly approach. Therefore, we advise against replacing relational databases with LLMs due to their substantial resource utilization.

SE · Feb 7, 2025
Analyzing the Resource Utilization of Lambda Functions on Mobile Devices: Case Studies on Kotlin and Swift

Chibundom U. Ejimuda, Gaston Longhitano, Reza Rawassizadeh

With billions of smartphones in use globally, the daily time spent on these devices contributes significantly to overall electricity consumption. Given this scale, even minor reductions in smartphone power use could result in substantial energy savings. This study explores the impact of Lambda functions on resource consumption in mobile programming. While Lambda functions are known for enhancing code readability and conciseness, their use does not add to the functional capabilities of a programming language. Our research investigates the implications of using Lambda functions in terms of battery utilization, memory usage, and execution time compared to equivalent code structures without Lambda functions. Our findings reveal that Lambda functions impose a considerable resource overhead on mobile devices without offering additional functionalities.

LG · Oct 24, 2025
Pruning and Quantization Impact on Graph Neural Networks

Khatoon Khedri, Reza Rawassizadeh, Qifu Wen et al.

Graph neural networks (GNNs) are known to operate with high accuracy on learning from graph-structured data, but they suffer from high computational and resource costs. Neural network compression methods are used to reduce the model size while maintaining reasonable accuracy. Two common neural network compression techniques are pruning and quantization. In this research, we empirically examine the effects of three pruning methods and three quantization methods on different GNN models across graph classification, node classification, and link prediction tasks. We conducted all experiments on three graph datasets: Cora, Proteins, and BBBP. Our findings demonstrate that unstructured fine-grained and global pruning can significantly reduce the model's size (by 50%) while maintaining or even improving precision after fine-tuning the pruned model. The evaluation of different quantization methods on GNNs shows diverse impacts on accuracy, inference time, and model size across different datasets.
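
A minimal sketch of unstructured global magnitude pruning is given below. It assumes that "global" means ranking individual weights by absolute value across all layers together and zeroing the smallest fraction; the abstract does not define the three pruning methods, and the layer names and the function `global_magnitude_prune` are hypothetical.

```python
def global_magnitude_prune(layers, sparsity=0.5):
    """Zero the `sparsity` fraction of weights with the smallest |w| across all layers."""
    entries = [(abs(w), name, i)
               for name, weights in layers.items()
               for i, w in enumerate(weights)]
    entries.sort()  # smallest magnitudes first, ranked over every layer at once
    num_to_prune = int(len(entries) * sparsity)
    pruned = {name: list(weights) for name, weights in layers.items()}
    for _, name, i in entries[:num_to_prune]:
        pruned[name][i] = 0.0
    return pruned

# Toy two-layer model with flattened weight lists (hypothetical values).
layers = {"gcn1": [0.9, -0.05, 0.4], "gcn2": [-0.01, 0.7, 0.02]}
pruned = global_magnitude_prune(layers, sparsity=0.5)
# pruned == {"gcn1": [0.9, 0.0, 0.4], "gcn2": [0.0, 0.7, 0.0]}
```

Because the ranking is global, layers with many small weights lose more of them than layers with large weights, as `gcn2` does here. In practice the pruned model would then be fine-tuned, as the abstract notes.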

LG · Sep 1, 2025
GradES: Significantly Faster Training in Transformers with Gradient-Based Early Stopping

Qifu Wen, Xi Zeng, Zihan Zhou et al.

Early stopping monitors global validation loss and halts all parameter updates simultaneously, which is computationally costly for large transformers due to the extended time required for validation inference. We propose GradES, a novel gradient-based early stopping approach that operates within transformer components (attention projections and feed-forward layer matrices). We found that different components converge at varying rates during fine-tuning for both language and vision-language models. GradES tracks the magnitude of gradient changes in backpropagation for these matrices during training. When the magnitude of gradient changes for a projection matrix falls below a convergence threshold τ, we exclude that projection matrix from further updates individually, eliminating costly validation passes while allowing slow-converging matrices to continue learning. GradES speeds up training by 1.57–7.22× while simultaneously enhancing generalization through early prevention of overfitting, resulting in 1.2% higher average accuracy on language tasks and 3.88% on multimodal benchmarks.
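
The per-component freezing rule can be sketched as follows. This is only an illustration under stated assumptions: gradients are flat Python lists, the "magnitude of gradient changes" is taken to be the L1 norm of the difference between consecutive steps, and a component is frozen the first time that value drops below τ. The class `ComponentFreezer` and the component names are hypothetical; the abstract does not specify the norm or any warm-up period.

```python
def l1_change(prev_grad, grad):
    """L1 norm of the change between two gradient snapshots (assumed metric)."""
    return sum(abs(g - p) for g, p in zip(grad, prev_grad))

class ComponentFreezer:
    """Tracks gradient change per component and freezes converged ones."""

    def __init__(self, tau):
        self.tau = tau
        self.prev = {}       # last seen gradient per component
        self.frozen = set()  # components excluded from further updates

    def update(self, grads):
        """Take {name: gradient list} for one step; return names still trainable."""
        for name, grad in grads.items():
            if name in self.frozen:
                continue
            if name in self.prev and l1_change(self.prev[name], grad) < self.tau:
                self.frozen.add(name)
            self.prev[name] = grad
        return [name for name in grads if name not in self.frozen]

freezer = ComponentFreezer(tau=0.01)
freezer.update({"q_proj": [0.5, 0.5], "ffn_up": [1.0, -1.0]})
active = freezer.update({"q_proj": [0.5, 0.501], "ffn_up": [0.6, -0.7]})
# "q_proj" changed by about 0.001 (< tau) and is frozen; "ffn_up" keeps training.
```

In a real training loop, the frozen components would have their gradients disabled or be dropped from the optimizer, so no validation pass is needed to decide when each one stops.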

SD · Aug 31, 2025
TinyMusician: On-Device Music Generation with Knowledge Distillation and Mixed Precision Quantization

Hainan Wang, Mehdi Hosseinzadeh, Reza Rawassizadeh

The success of generative models has drawn unprecedented attention to the music generation area. Transformer-based architectures have set new benchmarks for model performance. However, their practical adoption is hindered by critical challenges: the demand for massive computational resources and long inference times, due to their large number of parameters. These obstacles make them infeasible to deploy on edge devices with limited computational resources, such as smartphones and wearables. In this work, we present TinyMusician, a lightweight music generation model distilled from MusicGen (a state-of-the-art music generation model). TinyMusician integrates two innovations: (i) Stage-mixed Bidirectional and Skewed KL-Divergence and (ii) Adaptive Mixed-Precision Quantization. The experimental results demonstrate that TinyMusician retains 93% of the MusicGen-Small performance with a 55% smaller model size. TinyMusician is the first mobile-deployable music generation model that eliminates cloud dependency while maintaining high audio fidelity and efficient resource usage.

IR · Jan 31, 2022
ODSearch: Fast and Resource Efficient On-device Natural Language Search for Fitness Trackers' Data

Reza Rawassizadeh, Yi Rong

Mobile and wearable technologies have promised significant changes to the healthcare industry. Although cutting-edge communication and cloud-based technologies have allowed for these upgrades, their implementation and popularization in low-income countries have been challenging. We propose "ODSearch", an On-device Search framework equipped with a natural language interface for mobile and wearable devices. To implement search, "ODSearch" employs compression and a Bloom filter, and it provides near real-time search query responses without network dependency. In particular, the Bloom filter reduces the temporal scope of the search and compression reduces the size of the data to be searched. Our experiments were conducted on a mobile phone and a smartwatch. We compared "ODSearch" with current state-of-the-art search mechanisms, and it outperformed them on average by 53 times in execution time, 26 times in energy usage, and 2.3% in memory utilization.
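
One way a Bloom filter can reduce the temporal scope of a search is to keep one small filter per day and skip any day whose filter reports that the queried item is definitely absent. The sketch below illustrates that idea only; the per-day layout, the filter size, the hash construction, and the class name `BloomFilter` are assumptions and do not describe the ODSearch implementation.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# Hypothetical fitness log: activity types recorded on each day.
log = {"2022-01-29": ["walking", "sleep"], "2022-01-30": ["running"]}
filters = {}
for day, activities in log.items():
    filters[day] = BloomFilter()
    for activity in activities:
        filters[day].add(activity)

# Only days whose filter may contain "running" need a full scan.
days_to_scan = [day for day, f in filters.items() if f.might_contain("running")]
```

A day can occasionally appear in `days_to_scan` without containing the item (a false positive), but a day that does contain it is never skipped.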

LG · Feb 2, 2021
FEDZIP: A Compression Framework for Communication-Efficient Federated Learning

Amirhossein Malekijoo, Mohammad Javad Fadaeieslam, Hanieh Malekijou et al.

Federated Learning marks a turning point in the implementation of decentralized machine learning (especially deep learning) for wireless devices by protecting users' privacy and safeguarding raw data from third-party access. It assigns the learning process independently to each client. First, clients locally train a machine learning model based on local data. Next, clients transfer local updates of model weights and biases (training data) to a server. Then, the server aggregates updates (received from clients) to create a global learning model. However, the continuous transfer between clients and the server increases communication costs and is inefficient from a resource utilization perspective due to the large number of parameters (weights and biases) used by deep learning models. The cost of communication becomes a greater concern when the number of contributing clients and communication rounds increases. In this work, we propose a novel framework, FedZip, that significantly decreases the size of updates while transferring weights from the deep learning model between clients and their servers. FedZip implements Top-z sparsification, uses quantization with clustering, and implements compression with three different encoding methods. FedZip outperforms state-of-the-art compression frameworks and reaches compression rates up to 1085x, and preserves up to 99% of bandwidth and 99% of energy for clients during communication.
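
The first stage of the pipeline, Top-z sparsification, can be sketched as keeping only the z largest-magnitude entries of a client's update and zeroing the rest. This reading of "Top-z" is an assumption, as are the helper names `top_z_sparsify` and `to_sparse_pairs`; the clustering-based quantization and the encoding stages are left out.

```python
def top_z_sparsify(update, z):
    """Keep the z entries with the largest |value|; set all others to 0.0."""
    by_magnitude = sorted(range(len(update)),
                          key=lambda i: abs(update[i]), reverse=True)
    keep = set(by_magnitude[:z])
    return [v if i in keep else 0.0 for i, v in enumerate(update)]

def to_sparse_pairs(sparse_update):
    """Represent the sparse update as (index, value) pairs for transmission."""
    return [(i, v) for i, v in enumerate(sparse_update) if v != 0.0]

update = [0.02, -0.9, 0.1, 0.5, -0.03]   # hypothetical flattened weight update
sparse = top_z_sparsify(update, z=2)     # [0.0, -0.9, 0.0, 0.5, 0.0]
payload = to_sparse_pairs(sparse)        # [(1, -0.9), (3, 0.5)]
```

Sending only the surviving index-value pairs is what makes the later quantization and encoding steps operate on a much smaller payload.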

HC · Nov 12, 2020
Immediate or Reflective?: Effects of Real-time Feedback on Group Discussions over Videochat

Samiha Samrose, Reza Rawassizadeh, Ehsan Hoque

Having a group discussion with members holding conflicting viewpoints is difficult. It is especially challenging for machine-mediated discussions in which subtle social cues are hard to notice. We present a fully automated videochat framework that can automatically analyze audio-video data of the participants and provide real-time feedback on participation, interruption, volume, and facial emotion. In a heated discourse, these features are especially aligned with the undesired characteristics of dominating the conversation without taking turns, interrupting constantly, raising one's voice, and expressing negative emotion. We conduct a treatment-control user study with 40 participants across 20 sessions in total. We analyze the immediate and the reflective effects of real-time feedback on participants. Our findings show that while real-time feedback can make the ongoing discussion significantly less spontaneous, its effects propagate to successive sessions, bringing significantly more expressiveness to the team. Our explorations of the instant and propagated impacts of real-time feedback can be useful for developing design strategies for various collaborative environments.

LG · Jun 8, 2020
SEFR: A Fast Linear-Time Classifier for Ultra-Low Power Devices

Hamidreza Keshavarz, Mohammad Saniee Abadeh, Reza Rawassizadeh

A fundamental challenge for running machine learning algorithms on battery-powered devices is their time and energy limitations, as these devices have constrained resources. There are resource-efficient classifier algorithms that can run on these devices, but their accuracy is often sacrificed for resource efficiency. Here, we propose an ultra-low power classifier, SEFR, with linear time complexity in both the training and the testing phases. SEFR is comparable to state-of-the-art classifiers in terms of classification accuracy, but it is 63 times faster and 70 times more energy efficient than the average of state-of-the-art and baseline classifiers on binary class datasets. The energy and memory consumption of SEFR is negligible, and it can even perform both the training and testing phases on microcontrollers. To our knowledge, this is the first multipurpose classification algorithm specifically designed to perform both training and testing on ultra-low power devices.

CY · May 5, 2019
Public vs Media Opinion on Robots

Alireza Javaheri, Navid Moghadamnejad, Hamidreza Keshavarz et al.

The fast proliferation of robots in people's everyday lives during recent years calls for a profound examination of public consensus, which is the ultimate determinant of the future of this industry. This paper investigates text corpora, consisting of posts on Twitter, Google News, Bing News, and Kickstarter, over an 8-year period to quantify public and media opinion about this emerging technology. Results demonstrate that the news platforms and the public take an overall positive position on robots. However, there is a deviation between news coverage and people's attitude. Among various robot types, sex robots raise the fiercest debate. In addition, our evaluation reveals that the public and news media conceptualization of robotics has changed over recent years. More specifically, a shift from solely industrial-purpose machines towards more social, assistive, and multi-purpose gadgets is visible.

HC · Nov 7, 2018
A Virtual Conversational Agent for Teens with Autism: Experimental Results and Design Lessons

Mohammad Rafayet Ali, Zahra Razavi, Abdullah Al Mamun et al.

We present the design of an online social skills development interface for teenagers with autism spectrum disorder (ASD). The interface is intended to enable private conversation practice anywhere, anytime using a web browser. Users converse informally with a virtual agent, receiving feedback on nonverbal cues in real time, as well as summary feedback. The prototype was developed in consultation with an expert UX designer, two psychologists, and a pediatrician. Using the data from 47 individuals, feedback and dialogue generation were automated using a hidden Markov model and a schema-driven dialogue manager capable of handling multi-topic conversations. We conducted a study with nine high-functioning ASD teenagers. Through a thematic analysis of post-experiment interviews, we identified several key design considerations, notably: 1) Users should be fully briefed at the outset about the purpose and limitations of the system, to avoid unrealistic expectations. 2) An interface should incorporate positive acknowledgment of behavior change. 3) Realistic appearance of a virtual agent and responsiveness are important in engaging users. 4) Conversation personalization, for instance in prompting laconic users for more input and reciprocal questions, would help the teenagers engage for longer terms and increase the system's utility.

HC · Nov 22, 2016
A Natural Language Query Interface for Searching Personal Information on Smartwatches

Reza Rawassizadeh, Chelsea Dobbins, Manouchehr Nourizadeh et al.

Currently, personal assistant systems run on smartphones and use natural language interfaces. However, these systems rely mostly on the web for finding information. Mobile and wearable devices can collect an enormous amount of contextual personal data such as sleep and physical activities. These information objects and their applications are known as quantified-self, mobile health, or personal informatics, and they can be used to provide a deeper insight into our behavior. To our knowledge, existing personal assistant systems do not support all types of quantified-self queries. In response to this, we have undertaken a user study to analyze a set of "textual questions/queries" that users have used to search their quantified-self or mobile health data. Through analyzing these questions, we have constructed a lightweight natural-language-based query interface, including a text parser algorithm and a user interface, to process the users' queries that have been used for searching quantified-self information. This query interface has been designed to operate on small devices, i.e., smartwatches, as well as to augment personal assistant systems by allowing them to process end users' natural language queries about their quantified-self data.

AI · Aug 11, 2016
Learning Mobile App Usage Routine through Learning Automata

Ramin Rahnamoun, Reza Rawassizadeh, Arash Maskooki

Since its conception, the smart app market has grown exponentially. Success in the app market depends on many factors, among which the quality of the app, such as its energy use, is a significant contributor. Nevertheless, smartphones, as a subset of mobile computing devices, inherit the limited power resource constraint. Therefore, there is a challenge of maintaining the resource while increasing the target app quality. This paper introduces Learning Automata (LA) as an online learning method to learn and predict the app usage routines of the users. Such prediction can leverage the app cache functionality of the operating system and thus (i) decrease app launch time and (ii) preserve battery. Our algorithm, which is an online learning approach, temporally updates and improves its internal states. In particular, it learns the transition probabilities between app launches. Each app launch instance updates the transition probabilities related to that app, which improves the prediction. We benefit from a real-world lifelogging dataset, and our experimental results show considerable success with respect to the two baseline methods that are currently used for smartphone app prediction.
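
The idea of updating transition probabilities on every app launch can be sketched with a standard linear reward-inaction learning automaton kept per source app. The choice of this particular update scheme, the learning rate, and the names `reward_update` and `AppTransitionLearner` are assumptions for illustration; the abstract does not state which automaton the paper uses.

```python
def reward_update(probs, chosen, rate=0.1):
    """Linear reward-inaction step: shift probability mass toward `chosen`.

    The chosen action gains rate * (1 - p); every other action is scaled by
    (1 - rate), so the probabilities still sum to 1.
    """
    return {action: (p + rate * (1.0 - p)) if action == chosen else p * (1.0 - rate)
            for action, p in probs.items()}

class AppTransitionLearner:
    """One automaton per app, holding probabilities for the next app launched."""

    def __init__(self, apps, rate=0.1):
        self.rate = rate
        uniform = 1.0 / len(apps)
        self.table = {src: {dst: uniform for dst in apps} for src in apps}

    def observe(self, prev_app, next_app):
        """Update the row of `prev_app` after seeing `next_app` launched."""
        self.table[prev_app] = reward_update(self.table[prev_app], next_app, self.rate)

    def predict(self, current_app):
        """Return the most probable next app, e.g. as a candidate for pre-caching."""
        row = self.table[current_app]
        return max(row, key=row.get)

learner = AppTransitionLearner(["mail", "maps", "music"])
for _ in range(3):
    learner.observe("mail", "maps")
prediction = learner.predict("mail")
```

Each observed launch costs one pass over a single row, which is what makes this kind of online update cheap enough to run on the phone itself.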

HC · Dec 20, 2014
Micro-Navigation for Urban Bus Passengers: Using the Internet of Things to Improve the Public Transport Experience

Stefan Foell, Gerd Kortuem, Reza Rawassizadeh et al.

Public bus services are widely deployed in cities around the world because they provide cost-effective public transportation. However, from a passenger's point of view, urban bus systems can be complex and difficult to navigate, especially for disadvantaged users, e.g., tourists, novice users, older people, and people with impaired cognitive or physical abilities. We present Urban Bus Navigator (UBN), a reality-aware urban navigation system for bus passengers with the ability to recognize and track the physical public transport infrastructure such as buses. Unlike traditional location-aware mobile transport applications, UBN acts as a true navigation assistant for public transport users. Insights from a six-month trial in Madrid indicate that UBN removes barriers to public transport usage and has a positive impact on how people feel about public transport journeys.

HC · Nov 18, 2014
Scalable Mining of Daily Behavioral Patterns in Context Sensing Life-Log Data

Reza Rawassizadeh, Elaheh Momeni, Prajna Shetty

Despite the advent of wearable devices and the proliferation of smartphones, there still is no ideal platform that can continuously sense and precisely collect all available contextual information. Ideally, mobile sensing data collection approaches should deal with uncertainty and data loss originating from software and hardware restrictions. We have conducted life logging data collection experiments from 35 users and created a rich dataset (9.26 million records) to represent the real-world deployment issues of mobile sensing systems. We create a novel set of algorithms to identify human behavioral motifs while considering the uncertainty of collected data objects. Our work benefits from combinations of sensors available on a device and identifies behavioral patterns with a temporal granularity similar to human time perception. Employing a combination of sensors rather than focusing on only one sensor can handle uncertainty by neglecting sensor data that is not available and focusing instead on available data. Moreover, by experimenting on two real, large datasets, we demonstrate that using a sliding window significantly improves the scalability of our algorithms, which can be used by applications for small devices, such as smartphones and wearables.