SD, Nov 5, 2023
Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency
Sungho Jeon, Ching-Feng Yeh, Hakan Inan et al.
In this paper, we show that a simple self-supervised pre-trained audio model can achieve inference efficiency comparable to that of more complicated pre-trained models with speech transformer encoders. These speech transformers mix convolutional modules with self-attention modules, and they achieve state-of-the-art performance on ASR with top efficiency. We first show that employing these speech transformers as an encoder significantly improves the efficiency of pre-trained audio models as well. However, our study shows that we can achieve comparable efficiency with advanced self-attention alone. We demonstrate that this simpler approach is particularly beneficial when low-bit weight quantization is applied to the neural network to improve efficiency. We hypothesize that it prevents errors from propagating between different quantized modules, in contrast to recent speech transformers that mix quantized convolution modules with quantized self-attention modules.
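As a rough illustration of what low-bit weight quantization involves, the sketch below applies symmetric uniform quantization to a list of weights. The abstract does not say which quantization scheme the authors use, so the scheme, the function names `quantize_symmetric` and `dequantize`, and the choice of 4 bits are all assumptions made for illustration only.

```python
def quantize_symmetric(weights, bits):
    """Map float weights to signed integer codes (hypothetical helper).

    Assumes symmetric uniform quantization with one scale for the whole
    list; this is a generic scheme, not necessarily the paper's.
    """
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed code")
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for 4-bit codes
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to [-qmax, qmax].
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_symmetric(weights, bits=4)
restored = dequantize(codes, scale)
# Per-weight rounding error is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each quantized module introduces a rounding error of this kind, which gives some intuition for the abstract's hypothesis about error propagation between different quantized modules.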
LG, Jun 7, 2024
REP: Resource-Efficient Prompting for Rehearsal-Free Continual Learning
Sungho Jeon, Xinyue Ma, Kwang In Kim et al.
Recent rehearsal-free continual learning (CL) methods guided by prompts achieve strong performance on vision tasks with non-stationary data but remain resource-intensive, hindering real-world edge deployment. We introduce resource-efficient prompting (REP), which improves the computational and memory efficiency of prompt-based rehearsal-free continual learning methods while minimizing accuracy trade-offs. Our approach employs swift prompt selection to refine input data using a carefully provisioned model and introduces adaptive token merging (AToM) and adaptive layer dropping (ALD) for efficient prompt updates. AToM and ALD selectively skip data and model layers while preserving task-specific features during the learning of new tasks. Extensive experiments on multiple image classification datasets demonstrate REP's superior resource efficiency over state-of-the-art rehearsal-free CL methods.
SD, Jan 20, 2017
Empirical Study of Drone Sound Detection in Real-Life Environment with Deep Neural Networks
Sungho Jeon, Jong-Woo Shin, Young-Jun Lee et al.
This work investigates the use of deep neural networks to detect commercial hobby drones in real-life environments by analyzing their sound data. The purpose of this work is to contribute to a system for detecting drones used for malicious purposes, such as terrorism. Specifically, we present a method that detects the presence of commercial hobby drones, framed as a binary classification problem based on sound event detection. We recorded the sound produced by a few popular commercial hobby drones and then augmented this data with diverse environmental sound data to remedy the scarcity of drone sound data in diverse environments. We investigated the effectiveness of state-of-the-art sound event classification methods, i.e., a Gaussian Mixture Model (GMM), a Convolutional Neural Network (CNN), and a Recurrent Neural Network (RNN), for drone sound detection. Our empirical results, obtained with a testing dataset collected on an urban street, confirmed the effectiveness of these models in a real environment. In summary, our RNN models showed the best detection performance, with an F-Score of 0.8009 on 240 ms of input audio and a short processing time, indicating their applicability to real-time detection systems.
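For readers unfamiliar with the metric, the sketch below shows how an F-score is computed for a binary detector such as the drone/no-drone classifier described above. The function name `f_score`, the label encoding (1 for drone present, 0 for absent), and the toy labels are assumptions for illustration; they are not the paper's data or evaluation code.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels (1 = drone present, 0 = absent).

    With beta=1.0 this is the harmonic mean of precision and recall.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed drones
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy example: six audio frames, hypothetical labels and predictions.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f_score(y_true, y_pred)
```

In this toy case there are two true positives, one false alarm, and one missed detection, so precision and recall are both 2/3 and the F-score is also 2/3.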
CR, Aug 27, 2016
Passive Fingerprinting of SCADA in Critical Infrastructure Network without Deep Packet Inspection
Sungho Jeon, Jeong-Han Yun, Seungoh Choi et al.
We present the first passive fingerprinting technique for Supervisory Control And Data Acquisition (SCADA) networks that does not use Deep Packet Inspection (DPI), and we report our experience applying it in a real environment. Unlike existing work, our method does not rely on the functions of a specific product or on DPI of the SCADA protocol. Our inference method, which is based on the intrinsic characteristics of SCADA, first identifies the network port used for the SCADA protocol and then infers, in turn, the field devices and the master server. We evaluated the effectiveness of our method using two network traces collected from real environments at different critical infrastructure (CI) sites, covering a month and a half and three days, respectively. The evaluation confirmed that our method captures most of the SCADA components with an F-score of nearly 1, except for HMIs connected to the master server, and demonstrated the practical applicability of the method.