43.4OSApr 17
NetCAS: Dynamic Cache and Backend Device Management in Networked EnvironmentsJoon Yong Hwang, Chanseo Park, Younghoon Kim
Modern storage systems often combine fast cache with slower backend devices to accelerate I/O. As performance gaps narrow, concurrently accessing both devices, rather than relying solely on cache hits, can improve throughput. However, in data centers, remote backend storage accessed over networks suffers from unpredictable contention, complicating this split. We present NetCAS, a framework that dynamically splits I/O between cache and backend devices based on real-time network feedback and a precomputed Perf Profile. Unlike traditional hit-rate-based policies, NetCAS adapts split ratios to workload configuration and networking performance. NetCAS employs a low-overhead batched round-robin scheduler to enforce splits, avoiding per-request costs. It achieves up to 174% higher performance than traditional caching in remote storage environments and outperforms converging schemes like Orthus by up to 3.5X under fluctuating network conditions.
MLAug 9, 2022
Multiple Instance Neural Networks Based on Sparse Attention for Cancer Detection using T-cell Receptor SequencesYounghoon Kim, Tao Wang, Danyi Xiong et al.
Early detection of cancers has been much explored due to its paramount importance in biomedical fields. Among different types of data used to answer this biological question, studies based on T cell receptors (TCRs) are under recent spotlight due to the growing appreciation of the roles of the host immunity system in tumor biology. However, the one-to-many correspondence between a patient and multiple TCR sequences hinders researchers from simply adopting classical statistical/machine learning methods. There were recent attempts to model this type of data in the context of multiple instance learning (MIL). Despite the novel application of MIL to cancer detection using TCR sequences and the demonstrated adequate performance in several tumor types, there is still room for improvement, especially for certain cancer types. Furthermore, explainable neural network models are not fully investigated for this application. In this article, we propose multiple instance neural networks based on sparse attention (MINN-SA) to enhance the performance in cancer detection and explainability. The sparse attention structure drops out uninformative instances in each bag, achieving both interpretability and better predictive performance in combination with the skip connection. Our experiments show that MINN-SA yields the highest area under the ROC curve (AUC) scores on average measured across 10 different types of cancers, compared to existing MIL approaches. Moreover, we observe from the estimated attentions that MINN-SA can identify the TCRs that are specific for tumor antigens in the same T cell repertoire.
QUANT-PHSep 26, 2023
Investigation of factors regarding the effects of COVID-19 pandemic on college students' depression by quantum annealerJunggu Choi, Kion Kim, Soohyun Park et al.
Diverse cases regarding the impact, with its related factors, of the COVID-19 pandemic on mental health have been reported in previous studies. College student groups have been frequently selected as the target population in previous studies because they are easily affected by pandemics. In this study, multivariable datasets were collected from 751 college students based on the complex relationships between various mental health factors. We utilized quantum annealing (QA)-based feature selection algorithms that were executed by commercial D-Wave quantum computers to determine the changes in the relative importance of the associated factors before and after the pandemic. Multivariable linear regression (MLR) and XGBoost models were also applied to validate the QA-based algorithms. Based on the experimental results, we confirm that QA-based algorithms have comparable capabilities in factor analysis research to the MLR models that have been widely used in previous studies. Furthermore, the performance of the QA-based algorithms was validated through the important factor results from the algorithms. Pandemic-related factors (e.g., confidence in the social system) and psychological factors (e.g., decision-making in uncertain situations) were more important in post-pandemic conditions. We believe that our study will serve as a reference for researchers studying similar topics.
CLMay 23, 2024Code
Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMsJaewoo Yang, Hayun Kim, Younghoon Kim
Modern large language models (LLMs) have established state-of-the-art performance through architectural improvements, but still require significant computational cost for inference. In an effort to reduce the inference cost, post-training quantization (PTQ) has become a popular approach, quantizing weights and activations to lower precision, such as INT8. In this paper, we reveal the challenges of activation quantization in GLU variants, which are widely used in feed-forward network (FFN) of modern LLMs, such as LLaMA family. The problem is that severe local quantization errors, caused by excessive magnitudes of activation in GLU variants, significantly degrade the performance of the quantized LLM. We denote these activations as activation spikes. Our further observations provide a systematic pattern of activation spikes: 1) The activation spikes occur in the FFN of specific layers, particularly in the early and late layers, 2) The activation spikes are dedicated to a couple of tokens, rather than being shared across a sequence. Based on our observations, we propose two empirical methods, Quantization-free Module (QFeM) and Quantization-free Prefix (QFeP), to isolate the activation spikes during quantization. Our extensive experiments validate the effectiveness of the proposed methods for the activation quantization, especially with coarse-grained scheme, of latest LLMs with GLU variants, including LLaMA-2/3, Mistral, Mixtral, SOLAR, and Gemma. In particular, our methods enhance the current alleviation techniques (e.g., SmoothQuant) that fail to control the activation spikes. Code is available at https://github.com/onnoo/activation-spikes.
LGMar 6, 2023
Testing the Channels of Convolutional Neural NetworksKang Choi, Donghyun Son, Younghoon Kim et al.
Neural networks have complex structures, and thus it is hard to understand their inner workings and ensure correctness. To understand and debug convolutional neural networks (CNNs) we propose techniques for testing the channels of CNNs. We design FtGAN, an extension to GAN, that can generate test data with varying the intensity (i.e., sum of the neurons) of a channel of a target CNN. We also proposed a channel selection algorithm to find representative channels for testing. To efficiently inspect the target CNN's inference computations, we define unexpectedness score, which estimates how similar the inference computation of the test data is to that of the training data. We evaluated FtGAN with five public datasets and showed that our techniques successfully identify defective channels in five different CNN models.
HCAug 9, 2021
Gemini2: Generating Keyframe-Oriented Animated Transitions Between Statistical GraphicsYounghoon Kim, Jeffrey Heer
Complex animated transitions may be easier to understand when divided into separate, consecutive stages. However, effective staging requires careful attention to both animation semantics and timing parameters. We present Gemini2, a system for creating staged animations from a sequence of chart keyframes. Given only a start state and an end state, Gemini2 can automatically recommend intermediate keyframes for designers to consider. The Gemini2 recommendation engine leverages Gemini, our prior work, and GraphScape to itemize the given complex change into semantic edit operations and to recombine operations into stages with a guided order for clearly conveying the semantics. To evaluate Gemini2's recommendations, we conducted a human-subject study in which participants ranked recommended animations from both Gemini2 and Gemini. We find that Gemini2's animation recommendation ranking is well aligned with subjects' preferences, and Gemini2 can recommend favorable animations that Gemini cannot support.
LGMay 24, 2021
GMAC: A Distributional Perspective on Actor-Critic FrameworkDaniel Wontae Nam, Younghoon Kim, Chan Y. Park
In this paper, we devise a distributional framework on actor-critic as a solution to distributional instability, action type restriction, and conflation between samples and statistics. We propose a new method that minimizes the Cramér distance with the multi-step Bellman target distribution generated from a novel Sample-Replacement algorithm denoted SR($λ$), which learns the correct value distribution under multiple Bellman operations. Parameterizing a value distribution with Gaussian Mixture Model further improves the efficiency and the performance of the method, which we name GMAC. We empirically show that GMAC captures the correct representation of value distributions and improves the performance of a conventional actor-critic method with low computational cost, in both discrete and continuous action spaces using Arcade Learning Environment (ALE) and PyBullet environment.
IVMay 7, 2021
Deep reinforcement learning-designed radiofrequency waveform in MRIDongmyung Shin, Younghoon Kim, Chungseok Oh et al.
Carefully engineered radiofrequency (RF) pulses play a key role in a number of systems such as mobile phone, radar, and magnetic resonance imaging. The design of an RF waveform, however, is often posed as an inverse problem with no general solution. As a result, various design methods each with a specific purpose have been developed based on the intuition of human experts. In this work, we propose an artificial intelligence (AI)-powered RF pulse design framework, DeepRF, which utilizes the self-learning characteristics of deep reinforcement learning to generate a novel RF pulse. The effectiveness of DeepRF is demonstrated using four types of RF pulses that are commonly used. The DeepRF-designed pulses successfully satisfy the design criteria while reporting reduced energy. Analyses demonstrate the pulses utilize new mechanisms of magnetization manipulation, suggesting the potentials of DeepRF in discovering unseen design dimensions beyond human intuition. This work may lay the foundation for an emerging field of AI-driven RF waveform design.
CVOct 22, 2020
Learning Loss for Test-Time AugmentationIldoo Kim, Younghoon Kim, Sungwoong Kim
Data augmentation has been actively studied for robust neural networks. Most of the recent data augmentation methods focus on augmenting datasets during the training phase. At the testing phase, simple transformations are still widely used for test-time augmentation. This paper proposes a novel instance-level test-time augmentation that efficiently selects suitable transformations for a test input. Our proposed method involves an auxiliary module to predict the loss of each possible transformation given the input. Then, the transformations having lower predicted losses are applied to the input. The network obtains the results by averaging the prediction results of augmented inputs. Experimental results on several image classification benchmarks show that the proposed instance-aware test-time augmentation improves the model's robustness against various corruptions.
HCSep 3, 2020
Gemini: A Grammar and Recommender System for AnimatedTransitions in Statistical GraphicsYounghoon Kim, Jeffrey Heer
Animated transitions help viewers follow changes between related visualizations. Specifying effective animations demands significant effort: authors must select the elements and properties to animate, provide transition parameters, and coordinate the timing of stages. To facilitate this process, we present Gemini, a declarative grammar and recommendation system for animated transitions between single-view statistical graphics. Gemini specifications define transition "steps" in terms of high-level visual components (marks, axes, legends) and composition rules to synchronize and concatenate steps. With this grammar, Gemini can recommend animation designs to augment and accelerate designers' work. Gemini enumerates staged animation designs for given start and end states, and ranks those designs using a cost function informed by prior perceptual studies. To evaluate Gemini, we conduct both a formative study on Mechanical Turk to assess and tune our ranking function, and a summative study in which 8 experienced visualization developers implement animations in D3 that we then compare to Gemini's suggestions. We find that most designs (9/11) are exactly replicable in Gemini, with many (8/11) achievable via edits to suggestions, and that Gemini suggestions avoid multiple participant errors.
MEJul 9, 2020
Penalized Estimation and Forecasting of Multiple Subject Intensive Longitudinal DataZachary F. Fisher, Younghoon Kim, Barbara Fredrickson et al.
Intensive Longitudinal Data (ILD) is increasingly available to social and behavioral scientists. With this increased availability come new opportunities for modeling and predicting complex biological, behavioral, and physiological phenomena. Despite these new opportunities psychological researchers have not taken full advantage of promising opportunities inherent to this data, the potential to forecast psychological processes at the individual level. To address this gap in the literature we present a novel modeling framework that addresses a number of topical challenges and open questions in the psychological literature on modeling dynamic processes. First, how can we model and forecast ILD when the length of individual time series and the number of variables collected are roughly equivalent, or when time series lengths are shorter than what is typically required for time series analyses? Second, how can we best take advantage of the cross-sectional (between-person) information inherent to most ILD scenarios while acknowledging individuals differ both quantitatively (e.g. in parameter magnitude) and qualitatively (e.g. in structural dynamics)? Despite the acknowledged between-person heterogeneity in many psychological processes is it possible to leverage group-level information to support improved forecasting at the individual level? In the remainder of the manuscript, we attempt to address these and other pressing questions relevant to the forecasting of multiple-subject ILD.