LG | Apr 7, 2022 | Code
Explicit Feature Interaction-aware Graph Neural Networks
Minkyu Kim, Hyun-Soo Choi, Jinho Kim
Graph neural networks (GNNs) are powerful tools for handling graph-structured data. However, their design often limits them to learning only higher-order feature interactions, leaving low-order feature interactions overlooked. To address this problem, we introduce a novel GNN method called explicit feature interaction-aware graph neural network (EFI-GNN). Unlike conventional GNNs, EFI-GNN is a multilayer linear network designed to model arbitrary-order feature interactions explicitly within graphs. To validate the efficacy of EFI-GNN, we conduct experiments on various datasets. The experimental results demonstrate that EFI-GNN performs competitively with existing GNNs and that jointly training a GNN with EFI-GNN improves predictive performance. Furthermore, the predictions made by EFI-GNN are interpretable, owing to its linear construction. The source code of EFI-GNN is available at https://github.com/gim4855744/EFI-GNN
LG | Sep 30, 2022 | Code
Higher-order Neural Additive Models: An Interpretable Machine Learning Model with Feature Interactions
Minkyu Kim, Hyun-Soo Choi, Jinho Kim
Neural Additive Models (NAMs) have recently demonstrated promising predictive performance while maintaining interpretability. However, their capacity is limited to capturing only first-order feature interactions, which restricts their effectiveness on real-world datasets. To address this limitation, we propose Higher-order Neural Additive Models (HONAMs), an interpretable machine learning model that effectively and efficiently captures feature interactions of arbitrary orders. HONAMs improve predictive accuracy without compromising interpretability, an essential requirement in high-stakes applications. This advantage means that HONAMs can help analyze and extract the high-order interactions present in datasets. The source code for HONAM is publicly available at https://github.com/gim4855744/HONAM/.
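To make the distinction between first-order and higher-order terms concrete, here is a minimal sketch of an additive predictor. It is not the HONAM construction, which the abstract does not specify. The function `additive_predict` and its arguments are hypothetical names, and plain Python functions stand in for the small per-feature networks a NAM would learn.

```python
from typing import Callable, Dict, List, Optional, Tuple

ShapeFn = Callable[[float], float]
PairFn = Callable[[float, float], float]


def additive_predict(
    x: List[float],
    shape_fns: List[ShapeFn],
    pair_fns: Optional[Dict[Tuple[int, int], PairFn]] = None,
    bias: float = 0.0,
) -> Tuple[float, Dict[Tuple[int, ...], float]]:
    """Sum per-feature terms and optional pairwise terms.

    Returns the prediction and the contribution of every term, which is
    what makes an additive model readable term by term.
    """
    contributions: Dict[Tuple[int, ...], float] = {}
    # First-order terms: each depends on exactly one feature.
    for i, f in enumerate(shape_fns):
        contributions[(i,)] = f(x[i])
    # Second-order terms (illustrative): each depends on a pair of features.
    for (i, j), g in (pair_fns or {}).items():
        contributions[(i, j)] = g(x[i], x[j])
    return bias + sum(contributions.values()), contributions


# Stand-in shape functions (assumptions, not learned networks).
shape_fns = [lambda v: 2.0 * v, lambda v: v * v]
pair_fns = {(0, 1): lambda a, b: a * b}

x = [1.0, 3.0]
first_order_only, _ = additive_predict(x, shape_fns)
with_pairs, terms = additive_predict(x, shape_fns, pair_fns)
print(first_order_only, with_pairs, terms)
```

With only the first-order terms, no setting of the per-feature functions can represent the product of the two inputs; the pairwise term adds that capacity while keeping each contribution separately inspectable.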
LG | May 25, 2023
On the Impact of Knowledge Distillation for Model Interpretability
Hyeongrok Han, Siwon Kim, Hyun-Soo Choi et al.
Several recent studies have elucidated why knowledge distillation (KD) improves model performance. However, few have examined the advantages of KD beyond improved performance. In this study, we attempted to show that KD enhances the interpretability as well as the accuracy of models. For a quantitative comparison of model interpretability, we measured the number of concept detectors identified by network dissection. We attributed the improvement in interpretability to the class-similarity information transferred from the teacher to the student model. First, we confirmed that logit distillation transfers class-similarity information from the teacher to the student model. Then, we analyzed how class-similarity information affects model interpretability in terms of its presence or absence and the degree of similarity. We conducted various quantitative and qualitative experiments and examined the results across different datasets, different KD methods, and different measures of interpretability. Our research showed that models distilled from large models could be used more reliably in various fields.
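As a rough illustration of how logit distillation can carry class-similarity information, the sketch below computes the commonly used temperature-softened KL divergence between teacher and student outputs. This is the standard formulation, assumed here for illustration; the abstract does not state the exact loss used. The names `softmax` and `kd_loss`, the example logits, and the temperature of 4.0 are all assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    temperature: float = 4.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        t * math.log(t / s)
        for t, s in zip(p_teacher, p_student)
        if t > 0.0
    )
    return temperature ** 2 * kl


# Hypothetical logits for the classes [cat, dog, truck].
teacher = [5.0, 3.0, -2.0]
student = [4.0, 0.0, 0.5]

hard = softmax(teacher, temperature=1.0)
soft = softmax(teacher, temperature=4.0)
loss = kd_loss(student, teacher)
print(hard, soft, loss)
```

Raising the temperature flattens the teacher distribution, so the relative probabilities of the non-target classes (here, "dog" being closer to "cat" than "truck" is) contribute more to the loss the student minimizes.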
LG | Sep 11, 2021
Towards a Rigorous Evaluation of Time-series Anomaly Detection
Siwon Kim, Kukjin Choi, Hyun-Soo Choi et al.
In recent years, studies on time-series anomaly detection (TAD) have reported high F1 scores on benchmark TAD datasets, giving the impression of clear improvements in TAD. However, most studies apply a peculiar evaluation protocol called point adjustment (PA) before scoring. In this paper, we theoretically and experimentally reveal that the PA protocol is highly likely to overestimate detection performance; that is, even a random anomaly score can easily turn into a state-of-the-art TAD method. Therefore, comparing TAD methods after applying the PA protocol can lead to misguided rankings. Furthermore, we question the potential of existing TAD methods by showing that an untrained model obtains detection performance comparable to that of existing methods even when PA is not applied. Based on our findings, we propose a new baseline and an evaluation protocol. We expect that our study will support rigorous evaluation of TAD and lead to further improvement in future research.
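The abstract does not spell out the PA protocol. A minimal sketch, assuming the commonly used definition (if any point inside a contiguous ground-truth anomaly segment is flagged, every point in that segment is counted as detected), shows how it can inflate F1. The helper names and the toy labels and predictions below are illustrative assumptions, not data from the paper.

```python
from typing import List


def point_adjust(pred: List[int], label: List[int]) -> List[int]:
    """Apply point adjustment to binary predictions.

    For each contiguous run of 1s in `label`, if `pred` contains at least
    one 1 inside the run, mark the whole run as predicted anomalous.
    """
    adjusted = list(pred)
    n = len(label)
    i = 0
    while i < n:
        if label[i] == 1:
            j = i
            while j < n and label[j] == 1:
                j += 1  # j ends one past the anomaly segment
            if any(pred[i:j]):
                for k in range(i, j):
                    adjusted[k] = 1
            i = j
        else:
            i += 1
    return adjusted


def f1_score(pred: List[int], label: List[int]) -> float:
    """Point-wise F1 for binary sequences."""
    tp = sum(1 for p, l in zip(pred, label) if p == 1 and l == 1)
    fp = sum(1 for p, l in zip(pred, label) if p == 1 and l == 0)
    fn = sum(1 for p, l in zip(pred, label) if p == 0 and l == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: one 4-point anomaly segment, one hit inside it, one false alarm.
label = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
pred = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]

raw_f1 = f1_score(pred, label)
pa_f1 = f1_score(point_adjust(pred, label), label)
print(f"F1 without PA: {raw_f1:.3f}, F1 with PA: {pa_f1:.3f}")
```

In this toy case a single hit inside the segment raises recall from 0.25 to 1.0, so F1 climbs from about 0.333 to about 0.889 without any change to the detector.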
BM | Nov 25, 2019
Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information
Seonwoo Min, Seunghyun Park, Siwon Kim et al.
To bridge the exponentially growing gap between the numbers of unlabeled and labeled protein sequences, several studies have adopted semi-supervised learning for protein sequence modeling. In these studies, models were pre-trained with a substantial amount of unlabeled data, and the representations were transferred to various downstream tasks. Most pre-training methods rely solely on language modeling and often exhibit limited performance. In this paper, we introduce a novel pre-training scheme called PLUS, which stands for Protein sequence representations Learned Using Structural information. PLUS consists of masked language modeling and a complementary protein-specific pre-training task, namely same-family prediction. PLUS can be used to pre-train various model architectures. In this work, we use PLUS to pre-train a bidirectional recurrent neural network and refer to the resulting model as PLUS-RNN. Our experimental results demonstrate that PLUS-RNN outperforms other models of similar size pre-trained solely with language modeling in six out of seven widely used protein biology tasks. Furthermore, we present the results of our qualitative interpretation analyses to illustrate the strengths of PLUS-RNN. PLUS provides a novel way to exploit evolutionary relationships among unlabeled proteins and is broadly applicable across a variety of protein biology tasks. We expect that the gap between the numbers of unlabeled and labeled proteins will continue to grow exponentially and that the proposed pre-training method will play a larger role.
MM | Feb 28, 2019
PixelSteganalysis: Pixel-wise Hidden Information Removal with Low Visual Degradation
Dahuin Jung, Ho Bae, Hyun-Soo Choi et al.
Recently, the field of steganography has developed rapidly on the basis of deep learning (DL). DL-based steganography distributes secret information over all the available bits of the cover image, which makes it difficult for conventional steganalysis methods to detect, extract, or remove hidden secret images. Our proposed framework is the first to effectively disable covert communications and transactions that use DL-based steganography. We propose a DL-based steganalysis technique that effectively removes secret images by restoring the distribution of the original images. We formulate the problem and address it by using a deep neural network to exploit sophisticated pixel distributions and an edge distribution of images. Based on this information, we remove the hidden secret information at the pixel level. We evaluate our technique by comparing it with conventional steganalysis methods on three public benchmarks. As the decoding method of DL-based steganography is approximate (lossy) and differs from the decoding method of conventional steganography, we also introduce a new quantitative metric called the destruction rate (DT). The experimental results demonstrate performance improvements of 10-20% in both the decoded rate and the DT.
MM | Jan 30, 2019
PixelSteganalysis: Destroying Hidden Information with a Low Degree of Visual Degradation
Dahuin Jung, Ho Bae, Hyun-Soo Choi et al.
Steganography is the science of unnoticeably concealing a secret message within an image, called a cover image. The cover image with the secret message is called a stego image. Steganography is commonly used for illegal purposes such as terrorist activities and pornography. Attacking algorithms against steganography, called steganalysis, exist to thwart covert communications and transactions. Many recent studies apply deep learning to steganography algorithms, and conventional steganalysis is no longer effective against these deep learning-based algorithms. Our framework is the first to disrupt covert communications and transactions conducted via recent deep learning-based steganography algorithms. We first extract a sophisticated pixel distribution of the potential stego image from an auto-regressive model induced by deep learning. Using the extracted pixel distributions, we detect, at the pixel level, whether an image is a stego image. Each pixel value is adjusted as required, and the adjustment effectively removes the secret image. Because the decoding method of deep learning-based steganography algorithms is approximate (lossy), unlike that of conventional steganography, we propose a new quantitative metric that is more suitable for measuring the effect accurately. We evaluate our method on three public benchmarks in comparison with a conventional steganalysis method and show up to a 20% improvement in terms of decoding rate.