Soorin Yim

LG
h-index: 5
6 papers
9 citations
Novelty: 46%
AI Score: 39

6 Papers

LG · Sep 25, 2024
Task Addition in Multi-Task Learning by Geometrical Alignment

Soorin Yim, Dae-Woong Jeong, Sung Moon Ko et al.

Training deep learning models on limited data while maintaining generalization is one of the fundamental challenges in molecular property prediction. One effective solution is transferring knowledge extracted from abundant datasets to those with scarce data. Recently, a novel algorithm called the Geometrically Aligned Transfer Encoder (GATE) has been introduced, which uses soft parameter sharing by aligning the geometrical shapes of task-specific latent spaces. However, GATE faces limitations in scaling to multiple tasks due to computational costs. In this study, we propose a task addition approach for GATE to improve performance on target tasks with limited data while minimizing computational complexity. This is achieved through supervised multi-task pre-training on a large dataset, followed by the addition and training of task-specific modules for each target task. Our experiments demonstrate the superior performance of the task addition strategy for GATE over conventional multi-task methods, with comparable computational costs.
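The "pre-train, then add and train a task-specific module" pattern from this abstract can be illustrated with a toy sketch. It is not the GATE implementation and omits the geometric alignment entirely: `frozen_encoder` is a hypothetical stand-in for a pre-trained shared encoder, and `train_new_head` is an assumed name for fitting a new linear head while the encoder stays untouched.

```python
from typing import List, Sequence, Tuple


def frozen_encoder(x: float) -> List[float]:
    """Hypothetical stand-in for a pre-trained shared encoder.

    Its "parameters" are fixed: nothing in this file ever updates it.
    """
    return [x, x * x]


def train_new_head(
    data: Sequence[Tuple[float, float]],
    lr: float = 0.05,
    epochs: int = 500,
) -> Tuple[List[float], float]:
    """Fit a linear head (weights, bias) on top of the frozen encoder.

    Plain stochastic gradient descent on squared error; only the head's
    weights and bias are updated, mirroring the idea of adding a
    task-specific module without retraining the shared part.
    """
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, y in data:
            features = frozen_encoder(x)
            pred = sum(w * f for w, f in zip(weights, features)) + bias
            err = pred - y
            weights = [w - lr * err * f for w, f in zip(weights, features)]
            bias -= lr * err
    return weights, bias


# Toy "new task" with few labels: y = 2x + 1.
small_task = [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
head_weights, head_bias = train_new_head(small_task)
```

Because only the small head is trained, the cost of adding a task in this sketch scales with the head size rather than with the encoder.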

LG · May 3, 2024
Multitask Extension of Geometrically Aligned Transfer Encoder

Sung Moon Ko, Sumin Lee, Dae-Woong Jeong et al.

Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transfer Encoder (GATE), to a multi-task setup. Thus, we connect multiple molecular tasks by aligning the curved coordinates onto locally flat coordinates, ensuring the flow of information from source tasks to support performance on target data.

LG · Oct 27, 2025
Towards a Generalizable AI for Materials Discovery: Validation through Immersion Coolant Screening

Hyunseung Kim, Dae-Woong Jeong, Changyoung Park et al.

Artificial intelligence (AI) has emerged as a powerful accelerator of materials discovery, yet most existing models remain problem-specific, requiring additional data collection and retraining for each new property. Here we introduce and validate GATE (Geometrically Aligned Transfer Encoder), a generalizable AI framework that jointly learns 34 physicochemical properties spanning thermal, electrical, mechanical, and optical domains. By aligning these properties within a shared geometric space, GATE captures cross-property correlations that reduce disjoint-property bias, a key factor causing false positives in multi-criteria screening. To demonstrate its generalizable utility, GATE was applied, without any problem-specific model reconfiguration, to the discovery of immersion cooling fluids for data centers, a stringent real-world challenge defined by the Open Compute Project (OCP). Screening billions of candidates, GATE identified 92,861 molecules as promising for practical deployment. Four were validated experimentally or against the literature, showing strong agreement with wet-lab measurements and performance comparable to or exceeding that of a commercial coolant. These results establish GATE as a generalizable AI platform readily applicable across diverse materials discovery tasks.

LG · Sep 25, 2025
Robust Multi-Omics Integration from Incomplete Modalities Significantly Improves Prediction of Alzheimer's Disease

Sungjoon Park, Kyungwook Lee, Soorin Yim et al.

Multi-omics data capture complex biomolecular interactions and provide insights into metabolism and disease. However, missing modalities hinder integrative analysis across heterogeneous omics. To address this, we present MOIRA (Multi-Omics Integration with Robustness to Absent modalities), an early integration method enabling robust learning from incomplete omics data via representation alignment and adaptive aggregation. MOIRA leverages all samples, including those with missing modalities, by projecting each omics dataset onto a shared embedding space where a learnable weighting mechanism fuses them. Evaluated on the Religious Order Study and Memory and Aging Project (ROSMAP) dataset for Alzheimer's Disease (AD), MOIRA outperformed existing approaches, and further ablation studies confirmed modality-wise contributions. Feature importance analysis revealed AD-related biomarkers consistent with prior literature, highlighting the biological relevance of our approach.
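The idea of fusing modality embeddings in a shared space while tolerating absent modalities can be sketched as follows. This is an illustration under stated assumptions, not MOIRA's actual mechanism: the softmax weighting, the name `fuse_modalities`, and the fixed `scores` (which a real model would learn) are all hypothetical.

```python
import math
from typing import List, Optional, Sequence


def fuse_modalities(
    embeddings: Sequence[Optional[Sequence[float]]],
    scores: Sequence[float],
) -> List[float]:
    """Weighted average of the modality embeddings that are present.

    `embeddings[i]` is None when modality i is missing for this sample.
    `scores[i]` is a per-modality logit (assumed learnable in a real
    model, fixed here). The softmax is taken over present modalities
    only, so missing ones get zero weight and the rest renormalize.
    """
    present = [(e, s) for e, s in zip(embeddings, scores) if e is not None]
    if not present:
        raise ValueError("at least one modality must be present")
    max_score = max(s for _, s in present)  # subtract max for stability
    exps = [math.exp(s - max_score) for _, s in present]
    total = sum(exps)
    weights = [x / total for x in exps]
    dim = len(present[0][0])
    return [
        sum(w * e[i] for w, (e, _) in zip(weights, present))
        for i in range(dim)
    ]


# The second modality is missing; its large score is ignored.
fused = fuse_modalities([[1.0, 0.0], None, [0.0, 1.0]], [0.0, 5.0, 0.0])
```

Masking before the softmax, rather than imputing a placeholder embedding, is one simple way to let every sample contribute regardless of which modalities it has.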

LG · Jun 16, 2025
Geometric Embedding Alignment via Curvature Matching in Transfer Learning

Sung Moon Ko, Jaewan Lee, Sumin Lee et al.

Geometrical interpretations of deep learning models offer insightful perspectives into their underlying mathematical structures. In this work, we introduce a novel approach that leverages differential geometry, particularly concepts from Riemannian geometry, to integrate multiple models into a unified transfer learning framework. By aligning the Ricci curvature of the latent spaces of individual models, we construct an interrelated architecture, namely Geometric Embedding Alignment via cuRvature matching in transfer learning (GEAR), which ensures comprehensive geometric representation across datapoints. This framework enables the effective aggregation of knowledge from diverse sources, thereby improving performance on target tasks. We evaluate our model on 23 molecular task pairs sourced from various domains and demonstrate significant performance gains over existing benchmark models under both random (14.4%) and scaffold (8.3%) data splits.

QM · Jul 5, 2021
An in silico drug repurposing pipeline to identify drugs with the potential to inhibit SARS-CoV-2 replication

Méabh MacMahon, Woochang Hwang, Soorin Yim et al.

Drug repurposing provides an opportunity to redeploy drugs, which ideally are already approved for use in humans, for the treatment of other diseases. For example, the repurposing of dexamethasone and baricitinib has played a crucial role in saving patient lives during the ongoing SARS-CoV-2 pandemic. There remains a need to expand therapeutic approaches to prevent life-threatening complications in patients with COVID-19. Using an in silico approach based on structural similarity to drugs already in clinical trials for COVID-19, potential drugs were predicted for repurposing. For a subset of identified drugs with different targets to their corresponding COVID-19 clinical trial drug, a mechanism of action analysis was applied to establish whether they might have a role in inhibiting the replication of SARS-CoV-2. Of sixty drugs predicted in this study, two with the potential to inhibit SARS-CoV-2 replication were identified using mechanism of action analysis. Triamcinolone is a corticosteroid that is structurally similar to dexamethasone; gallopamil is a calcium channel blocker that is structurally similar to verapamil. In silico approaches indicate possible mechanisms of action for both drugs in inhibiting SARS-CoV-2 replication. The identification of these drugs as potentially useful for patients with COVID-19 who are at a higher risk of developing severe disease supports the use of in silico approaches to facilitate quick and cost-effective drug repurposing. Such drugs could expand the number of treatments available to patients who are not protected by vaccination.
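Screening by structural similarity to a reference drug can be sketched with a set-based similarity score. The abstract does not state which similarity measure the pipeline uses; the Tanimoto (Jaccard) coefficient below is an assumption, and the feature sets, candidate names, and threshold are made up purely for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def tanimoto(a: Iterable[int], b: Iterable[int]) -> float:
    """Tanimoto coefficient: |A intersect B| / |A union B|.

    Inputs are sets of structural feature identifiers (for example,
    the "on" bits of a fingerprint). Two empty sets score 0.0.
    """
    set_a, set_b = set(a), set(b)
    union = len(set_a | set_b)
    if union == 0:
        return 0.0
    return len(set_a & set_b) / union


def rank_candidates(
    reference: Set[int],
    candidates: Dict[str, Set[int]],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return candidates at or above `threshold`, most similar first."""
    scored = [(name, tanimoto(reference, feats))
              for name, feats in candidates.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature sets; not real fingerprints of any drug.
reference_features = {1, 2, 3, 4}
library = {
    "cand_a": {1, 2, 3, 5},
    "cand_b": {7, 8},
    "cand_c": {1, 2, 3, 4},
}
ranked = rank_candidates(reference_features, library, threshold=0.5)
```

In a pipeline like the one described, hits from such a ranking would then go to the separate mechanism of action analysis; similarity alone does not establish activity.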