Kyuhan Lee

SI
h-index: 8
5 papers
71 citations
Novelty: 60%
AI Score: 32

5 Papers

AO-PH · Oct 20, 2022
Deep-Learning-Based Precipitation Nowcasting with Ground Weather Station Data and Radar Data

Jihoon Ko, Kyuhan Lee, Hyunjin Hwang et al.

Recently, many deep-learning techniques have been applied to various weather-related prediction tasks, including precipitation nowcasting (i.e., predicting precipitation levels and locations in the near future). Most existing deep-learning-based approaches for precipitation nowcasting, however, consider only radar and/or satellite images as inputs, and meteorological observations collected from ground weather stations, which are sparsely located, are relatively unexplored. In this paper, we propose ASOC, a novel attentive method for effectively exploiting ground-based meteorological observations from multiple weather stations. ASOC is designed to capture temporal dynamics of the observations and also contextual relationships between them. ASOC is easily combined with existing image-based precipitation nowcasting models without changing their architectures. We show that such a combination improves the average critical success index (CSI) of predicting heavy (at least 10 mm/hr) and light (at least 1 mm/hr) rainfall events at 1-6 hr lead times by 5.7%, compared to the original image-based model, using the radar images and ground-based observations around South Korea collected from 2014 to 2020.
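The critical success index (CSI) cited in this abstract is conventionally defined as hits / (hits + misses + false alarms) for events at or above a rainfall threshold. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name and the sample rainfall values are illustrative assumptions.

```python
def critical_success_index(observed, predicted, threshold):
    """CSI = hits / (hits + misses + false alarms) for events >= threshold."""
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        obs_event = obs >= threshold
        pred_event = pred >= threshold
        if obs_event and pred_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif pred_event:
            false_alarms += 1
        # Correct negatives (neither observed nor predicted) are ignored by CSI.
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


# Made-up rainfall rates in mm/hr, for illustration only.
observed = [0.0, 0.5, 2.0, 12.0, 15.0, 0.2]
predicted = [0.0, 1.5, 3.0, 8.0, 11.0, 0.0]

light_csi = critical_success_index(observed, predicted, 1.0)   # at least 1 mm/hr
heavy_csi = critical_success_index(observed, predicted, 10.0)  # at least 10 mm/hr
print(light_csi, heavy_csi)
```

Because correct negatives do not enter the score, CSI is not inflated by the many rain-free cells, which is why it suits rare events such as heavy rainfall.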

SI · Oct 18, 2024
A Persuasion-Based Prompt Learning Approach to Improve Smishing Detection through Data Augmentation

Ho Sung Shim, Hyoungjun Park, Kyuhan Lee et al.

Smishing, which aims to illicitly obtain personal information from unsuspecting victims, holds significance due to its negative impacts on our society. In prior studies, as a tool to counteract smishing, machine learning (ML) has been widely adopted, which filters and blocks smishing messages before they reach potential victims. However, a number of challenges remain in ML-based smishing detection, with the scarcity of annotated datasets being one major hurdle. Specifically, given the sensitive nature of smishing-related data, there is a lack of publicly accessible data that can be used for training and evaluating ML models. Additionally, the nuanced similarities between smishing messages and other types of social engineering attacks such as spam messages exacerbate the challenge of smishing classification with limited resources. To tackle this challenge, we introduce a novel data augmentation method utilizing a few-shot prompt learning approach. What sets our approach apart from extant methods is the use of the principles of persuasion, a psychology theory which explains the underlying mechanisms of smishing. By designing prompts grounded in the persuasion principles, our augmented dataset could effectively capture various important aspects of smishing messages, enabling ML models to be effectively trained. Our evaluation within a real-world context demonstrates that our augmentation approach produces more diverse and higher-quality smishing data instances compared to other cutting-edge approaches, leading to substantial improvements in the ability of ML models to detect the subtle characteristics of smishing messages. Moreover, our additional analyses reveal that the performance improvement provided by our approach is more pronounced when used with ML models that have a larger number of parameters, demonstrating its effectiveness in training large-scale ML models.

DB · Apr 1, 2025
MARIOH: Multiplicity-Aware Hypergraph Reconstruction

Kyuhan Lee, Geon Lee, Kijung Shin

Hypergraphs offer a powerful framework for modeling higher-order interactions that traditional pairwise graphs cannot fully capture. However, practical constraints often lead to their simplification into projected graphs, resulting in substantial information loss and ambiguity in representing higher-order relationships. In this work, we propose MARIOH, a supervised approach for reconstructing the original hypergraph from its projected graph by leveraging edge multiplicity. To overcome the difficulties posed by the large search space, MARIOH integrates several key ideas: (a) identifying provable size-2 hyperedges, which reduces the candidate search space, (b) predicting the likelihood of candidates being hyperedges by utilizing both structural and multiplicity-related features, and (c) not only targeting promising hyperedge candidates but also examining less confident ones to explore alternative possibilities. Together, these ideas enable MARIOH to efficiently and effectively explore the search space. In our experiments using 10 real-world datasets, MARIOH achieves up to 74.51% higher reconstruction accuracy compared to state-of-the-art methods.
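The information loss described in this abstract can be seen in a small example. The sketch below projects a hypergraph onto a pairwise graph and records, for each edge, how many hyperedges contain both endpoints (its multiplicity). It only illustrates the projection step and the resulting ambiguity, not MARIOH's reconstruction procedure; the function name and toy hyperedges are assumptions.

```python
from collections import Counter
from itertools import combinations


def project_hypergraph(hyperedges):
    """Return {(u, v): multiplicity} for the projected (clique-expanded) graph."""
    multiplicity = Counter()
    for hyperedge in hyperedges:
        # Every pair of nodes inside a hyperedge becomes one pairwise edge.
        for u, v in combinations(sorted(set(hyperedge)), 2):
            multiplicity[(u, v)] += 1
    return multiplicity


# Two different toy hypergraphs (hypothetical data).
original = [{"a", "b", "c"}, {"a", "b"}, {"c", "d"}]
alternative = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"c", "d"}]

projected = project_hypergraph(original)
print(dict(projected))
# Both hypergraphs collapse to the same weighted graph, so the projection
# alone cannot tell them apart.
print(project_hypergraph(alternative) == projected)
```

Even with multiplicities kept, distinct hypergraphs can share a projection, which is why reconstruction has to search over candidate hyperedges rather than invert the projection directly.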

CV · Feb 17, 2022
Effective Training Strategies for Deep-learning-based Precipitation Nowcasting and Estimation

Jihoon Ko, Kyuhan Lee, Hyunjin Hwang et al.

Deep learning has been successfully applied to precipitation nowcasting. In this work, we propose a pre-training scheme and a new loss function for improving deep-learning-based nowcasting. First, we adapt U-Net, a widely-used deep-learning model, for the two problems of interest here: precipitation nowcasting and precipitation estimation from radar images. We formulate the former as a classification problem with three precipitation intervals and the latter as a regression problem. For these tasks, we propose to pre-train the model to predict radar images in the near future without requiring ground-truth precipitation, and we also propose the use of a new loss function for fine-tuning to mitigate the class imbalance problem. We demonstrate the effectiveness of our approach using radar images and precipitation datasets collected from South Korea over seven years. Our pre-training scheme and new loss function improve the critical success index (CSI) of nowcasting of heavy rainfall (at least 10 mm/hr) by up to 95.7% and 43.6%, respectively, at a 5-hr lead time. We also demonstrate that our approach reduces the precipitation estimation error by up to 10.7%, compared to the conventional approach, for light rainfall (between 1 and 10 mm/hr). Lastly, we report the sensitivity of our approach to different resolutions and a detailed analysis of four cases of heavy rainfall.
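To make the classification framing concrete, the sketch below bins rainfall rates into three classes. It assumes the three intervals are split at the 1 mm/hr and 10 mm/hr thresholds mentioned in the abstract; the class names, function name, and sample rates are illustrative and are not taken from the paper.

```python
from bisect import bisect_right
from collections import Counter

# Assumed class boundaries in mm/hr (taken from the thresholds in the abstract).
THRESHOLDS_MM_PER_HR = (1.0, 10.0)
CLASS_NAMES = ("none", "light", "heavy")


def precipitation_class(rate_mm_per_hr):
    """Map a rate to 0 (< 1), 1 (at least 1 and < 10), or 2 (at least 10)."""
    return bisect_right(THRESHOLDS_MM_PER_HR, rate_mm_per_hr)


# Made-up rates; most samples fall in the rain-free class.
rates = [0.0, 0.0, 0.3, 1.0, 4.2, 0.0, 12.5, 0.8]
counts = Counter(CLASS_NAMES[precipitation_class(r)] for r in rates)
print(counts)
```

Even in this toy sample the heavy class is the rarest, which hints at the class imbalance that the proposed loss function is meant to mitigate.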

SI · Jan 24, 2020
MONSTOR: An Inductive Approach for Estimating and Maximizing Influence over Unseen Networks

Jihoon Ko, Kyuhan Lee, Kijung Shin et al.

Influence maximization (IM) is one of the most important problems in social network analysis. Its objective is to find a given number of seed nodes that maximize the spread of information through a social network. Since it is an NP-hard problem, many approximate/heuristic methods have been developed, and a number of them repeat Monte Carlo (MC) simulations over and over to reliably estimate the influence (i.e., the number of infected nodes) of a seed set. In this work, we present an inductive machine learning method, called Monte Carlo Simulator (MONSTOR), for estimating the influence of given seed nodes in social networks unseen during training. To the best of our knowledge, MONSTOR is the first inductive method for this purpose. MONSTOR can greatly accelerate existing IM algorithms by replacing repeated MC simulations. In our experiments, MONSTOR provided highly accurate estimates, achieving 0.998 or higher Pearson and Spearman correlation coefficients in unseen real-world social networks. Moreover, IM algorithms equipped with MONSTOR are more accurate than state-of-the-art competitors in 63% of IM use cases.
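For context, the repeated Monte Carlo estimation that MONSTOR is meant to replace can be sketched as follows. This is a minimal illustration that assumes the independent cascade diffusion model with a single uniform activation probability; the abstract does not specify the diffusion model, and the function names, toy graph, and parameters are all hypothetical.

```python
import random


def simulate_cascade(graph, seeds, prob, rng):
    """Run one independent-cascade simulation; return the number of active nodes."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # Each newly active node gets one chance to activate each neighbor.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def estimate_influence(graph, seeds, prob, num_simulations=1000, seed=0):
    """Average the cascade size over many simulations (the costly repeated step)."""
    rng = random.Random(seed)
    total = sum(simulate_cascade(graph, seeds, prob, rng)
                for _ in range(num_simulations))
    return total / num_simulations


# Toy directed graph as an adjacency list (hypothetical data).
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(estimate_influence(graph, seeds=[0], prob=0.5))
```

Greedy IM algorithms call an estimator like this for many candidate seed sets, so the number of simulations multiplies quickly; a learned estimator replaces each such call with a single model evaluation.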