Aram Avetisyan

LG
h-index111
8 papers
20 citations
Novelty 52%
AI Score 43

8 Papers

CV Dec 29, 2025
Multi-Track Multimodal Learning on iMiGUE: Micro-Gesture and Emotion Recognition

Arman Martirosyan, Shahane Tigranyan, Maria Razzhivina et al.

Micro-gesture recognition and behavior-based emotion prediction are both highly challenging tasks that require modeling subtle, fine-grained human behaviors, primarily leveraging video and skeletal pose data. In this work, we present two multimodal frameworks designed to tackle both problems on the iMiGUE dataset. For micro-gesture classification, we explore the complementary strengths of RGB and 3D pose-based representations to capture nuanced spatio-temporal patterns. To comprehensively represent gestures, video and skeletal embeddings are extracted using MViTv2-S and 2s-AGCN, respectively. They are then integrated through a Cross-Modal Token Fusion module to combine spatial and pose information. For emotion recognition, our framework extends to behavior-based emotion prediction, a binary classification task identifying emotional states based on visual cues. We leverage facial and contextual embeddings extracted using SwinFace and MViTv2-S models and fuse them through an InterFusion module designed to capture emotional expressions and body gestures. Experiments conducted on the iMiGUE dataset, within the scope of the MiGA 2025 Challenge, demonstrate the robust performance and accuracy of our method in the behavior-based emotion prediction task, where our approach secured 2nd place.

OC Jan 13
Accelerated Methods with Complexity Separation Under Data Similarity for Federated Learning Problems

Dmitry Bylinkin, Sergey Skorik, Dmitriy Bystrov et al.

Heterogeneity within data distribution poses a challenge in many modern federated learning tasks. We formalize it as an optimization problem involving a computationally heavy composite under data similarity. By employing different sets of assumptions, we present several approaches to develop communication-efficient methods. An optimal algorithm is proposed for the convex case. The constructed theory is validated through a series of experiments across various problems.

SP Jul 15, 2024
SSSD-ECG-nle: New Label Embeddings with Structured State-Space Models for ECG generation

Sergey Skorik, Aram Avetisyan

An electrocardiogram (ECG) is vital for identifying cardiac diseases, offering crucial insights for diagnosing heart conditions and informing potentially life-saving treatments. However, like other types of medical data, ECGs are subject to privacy concerns when distributed and analyzed. Diffusion models have made significant progress in recent years, making it possible to synthesize data comparable to real data that can be widely adopted without privacy concerns. In this paper, we use diffusion models with structured state spaces to generate digital 10-second 12-lead ECG signals. We propose the SSSD-ECG-nle architecture, based on SSSD-ECG with a modified conditioning mechanism, and demonstrate its efficiency on downstream tasks. We conduct quantitative and qualitative evaluations, including analyzing convergence speed, the impact of adding positive samples, and assessment with physicians' expert knowledge. Finally, we share the results of physician evaluations and make synthetic data available to ensure the reproducibility of the experiments described.

LG Aug 19, 2025
Communication-Efficient Federated Learning with Adaptive Number of Participants

Sergey Skorik, Vladislav Dorofeev, Gleb Molodtsov et al.

Rapid scaling of deep learning models has enabled performance gains across domains, yet it has also introduced several challenges. Federated Learning (FL) has emerged as a promising framework to address these concerns by enabling decentralized training. Nevertheless, communication efficiency remains a key bottleneck in FL, particularly under heterogeneous and dynamic client participation. Existing methods, such as FedAvg and FedProx, and other approaches, including client selection strategies, attempt to mitigate communication costs. However, the problem of choosing the number of clients in a training round remains extremely underexplored. We introduce Intelligent Selection of Participants (ISP), an adaptive mechanism that dynamically determines the optimal number of clients per round to enhance communication efficiency without compromising model accuracy. We validate the effectiveness of ISP across diverse setups, including vision transformers, real-world ECG classification, and training with gradient compression. Our results show consistent communication savings of up to 30% without loss of final quality. Applying ISP to different real-world ECG classification setups highlighted the selection of the number of clients as a separate task of federated learning.

LG May 12, 2025
Trial and Trust: Addressing Byzantine Attacks with Comprehensive Defense Strategy

Gleb Molodtsov, Daniil Medyakov, Sergey Skorik et al.

Recent advancements in machine learning have improved performance while also increasing computational demands. While federated and distributed setups address these issues, their structure is vulnerable to malicious influences. In this paper, we address a specific threat, Byzantine attacks, where compromised clients inject adversarial updates to derail global convergence. We combine the trust scores concept with trial function methodology to dynamically filter outliers. Our methods address the critical limitations of previous approaches, remaining functional even when Byzantine nodes are in the majority. Moreover, our algorithms adapt to widely used scaled methods like Adam and RMSProp, as well as practical scenarios, including local training and partial participation. We validate the robustness of our methods by conducting extensive experiments on both synthetic and real ECG data collected from medical institutions. Furthermore, we provide a broad theoretical analysis of our algorithms and their extensions to the aforementioned practical setups. The convergence guarantees of our methods are comparable to those of classical algorithms developed without Byzantine interference.

LG Jun 4, 2024
Self-Trained Model for ECG Complex Delineation

Aram Avetisyan, Nikolas Khachaturov, Ariana Asatryan et al.

Electrocardiogram (ECG) delineation plays a crucial role in assisting cardiologists with accurate diagnoses. Prior research studies have explored various methods, including the application of deep learning techniques, to achieve precise delineation. However, existing approaches face limitations primarily related to dataset size and robustness. In this paper, we introduce a dataset for ECG delineation and propose a novel self-trained method aimed at leveraging a vast amount of unlabeled ECG data. Our approach involves the pseudolabeling of unlabeled data using a neural network trained on our dataset. Subsequently, we train the model on the newly labeled samples to enhance the quality of delineation. We conduct experiments demonstrating that our dataset is a valuable resource for training robust models and that our proposed self-trained method improves the prediction quality of ECG delineation.
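
The pseudolabeling loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a toy one-dimensional nearest-centroid classifier stands in for the neural network, and the function names `fit_centroids`, `predict`, and `self_train` are hypothetical. The sketch only shows the order of steps: train a teacher on labeled data, pseudolabel the unlabeled pool, then retrain on the union.

```python
def fit_centroids(xs, ys):
    """Toy stand-in for model training: one mean value per class label."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def self_train(labeled_x, labeled_y, unlabeled_x):
    """One round of self-training (assumed single round, no confidence filter)."""
    # Step 1: train a teacher on the labeled dataset only.
    teacher = fit_centroids(labeled_x, labeled_y)
    # Step 2: pseudolabel the unlabeled pool with the teacher.
    pseudo_y = [predict(teacher, x) for x in unlabeled_x]
    # Step 3: retrain on labeled plus pseudolabeled samples.
    student = fit_centroids(labeled_x + unlabeled_x, labeled_y + pseudo_y)
    return teacher, student


if __name__ == "__main__":
    labeled_x, labeled_y = [0.0, 4.0], [0, 1]
    unlabeled_x = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
    teacher, student = self_train(labeled_x, labeled_y, unlabeled_x)
    print("teacher centroids:", teacher)
    print("student centroids:", student)
```

In this toy setup the unlabeled points shift the class centroids, so the student's decision boundary differs from the teacher's. Whether the paper iterates this loop or filters pseudolabels by confidence is not stated in the abstract.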

LG Jun 2, 2024
Local Methods with Adaptivity via Scaling

Savelii Chezhegov, Sergey Skorik, Nikolas Khachaturov et al.

The rapid development of machine learning and deep learning has introduced increasingly complex optimization challenges that must be addressed. Indeed, training modern, advanced models has become difficult to implement without leveraging multiple computing nodes in a distributed environment. Distributed optimization is also fundamental to emerging fields such as federated learning. Specifically, there is a need to organize the training process to minimize the time lost due to communication. A widely used and extensively researched technique to mitigate the communication bottleneck involves performing local training before communication. This approach is the focus of our paper. Concurrently, adaptive methods that incorporate scaling, notably led by Adam, have gained significant popularity in recent years. Therefore, this paper aims to merge the local training technique with the adaptive approach to develop efficient distributed learning methods. We consider the classical Local SGD method and enhance it with a scaling feature. A crucial aspect is that the scaling is described generically, allowing us to analyze various approaches, including Adam, RMSProp, and OASIS, in a unified manner. In addition to theoretical analysis, we validate the performance of our methods in practice by training a neural network.
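
As a rough illustration of the idea in this abstract, the sketch below combines local training with a diagonal scaling of the gradient. It is a sketch under stated assumptions, not the paper's algorithm: the scaling is an RMSProp-style running average of squared gradients, each client has a toy quadratic objective, and both the iterate and the scaling state are averaged at every communication round. The names `local_scaled_sgd` and `global_loss` are hypothetical.

```python
import math


def local_scaled_sgd(targets, curvature, rounds=50, local_steps=5,
                     lr=0.05, beta=0.9, eps=1e-8):
    """Local SGD with RMSProp-style diagonal scaling on toy quadratics.

    Client m minimizes 0.5 * sum_i curvature[i] * (x[i] - targets[m][i])**2.
    """
    dim = len(curvature)
    x_global = [0.0] * dim
    v_global = [0.0] * dim  # running average of squared gradients
    for _ in range(rounds):
        xs, vs = [], []
        for b in targets:
            # Each client starts from the current global state.
            x, v = list(x_global), list(v_global)
            for _ in range(local_steps):
                g = [curvature[i] * (x[i] - b[i]) for i in range(dim)]
                v = [beta * v[i] + (1 - beta) * g[i] ** 2 for i in range(dim)]
                # Scaled step: the gradient is divided by sqrt(v) coordinate-wise.
                x = [x[i] - lr * g[i] / (math.sqrt(v[i]) + eps) for i in range(dim)]
            xs.append(x)
            vs.append(v)
        # Communication: average iterates and scaling states (an assumption).
        m = len(targets)
        x_global = [sum(x[i] for x in xs) / m for i in range(dim)]
        v_global = [sum(v[i] for v in vs) / m for i in range(dim)]
    return x_global


def global_loss(x, targets, curvature):
    """Average of the client objectives at point x."""
    total = sum(0.5 * curvature[i] * (x[i] - b[i]) ** 2
                for b in targets for i in range(len(x)))
    return total / len(targets)


if __name__ == "__main__":
    targets = [[1.0, -2.0], [3.0, 0.0]]
    curvature = [1.0, 10.0]
    x = local_scaled_sgd(targets, curvature)
    print("final point:", x)
    print("loss at start:", global_loss([0.0, 0.0], targets, curvature))
    print("loss at end:", global_loss(x, targets, curvature))
```

Swapping the update of `v` for a different rule would model other scaling schemes. The abstract says the scaling is described generically so that Adam, RMSProp, and OASIS are covered in a unified manner.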

SP May 19, 2023
Deep Neural Networks Generalization and Fine-Tuning for 12-lead ECG Classification

Aram Avetisyan, Shahane Tigranyan, Ariana Asatryan et al.

Numerous studies aim to diagnose heart diseases from 12-lead electrocardiographic (ECG) records using deep learning methods. These studies usually use specific datasets that differ in size and parameters, such as patient metadata, number of doctors annotating ECGs, types of devices for ECG recording, data preprocessing techniques, etc. It is well known that high-quality deep neural networks trained on one ECG dataset do not necessarily perform well on another dataset or in other clinical settings. In this paper, we propose a methodology to improve the quality of heart disease prediction regardless of the dataset by training neural networks on a variety of datasets with further fine-tuning for the specific dataset. To show its applicability, we train different neural networks on a large private dataset TIS containing various ECG records from multiple hospitals and on a relatively small public dataset PTB-XL. We demonstrate that training the networks on a large dataset and fine-tuning them on a small dataset from another source outperforms the networks trained only on one small dataset. We also show how the ability of deep neural networks to generalize makes it possible to improve classification quality for more diseases.