CY · Apr 27, 2022
Identifying Critical LMS Features for Predicting At-risk Students
Ying Guo, Cengiz Gunay, Sairam Tangirala et al.
Learning management systems (LMSs) have become essential in higher education and play an important role in helping educational institutions promote student success. Traditionally, LMSs have been used by postsecondary institutions in administration, reporting, and delivery of educational content. In this paper, we present an additional use of an LMS: using its data logs to perform data analytics and identify academically at-risk students. The data-driven insights would allow educational institutions and educators to develop and implement pedagogical interventions targeting academically at-risk students. We used anonymized data logs created by the Brightspace LMS during the fall 2019, spring 2020, and fall 2020 semesters at our college. Supervised machine learning algorithms were used to predict the final course performance of students, and several algorithms were found to perform well, with accuracy above 90%. The SHAP value method was used to assess the relative importance of the features used in the predictive models. Unsupervised learning was also used to group students into clusters based on similarities in their interaction and involvement with the LMS. In both supervised and unsupervised learning, we identified the two most important features (Number_Of_Assignment_Submissions and Content_Completed). More importantly, our study lays a foundation and provides a framework for developing a real-time data analytics metric that may be incorporated into an LMS.
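The clustering step described above can be illustrated with a small sketch. The abstract does not say which unsupervised algorithm was used, so plain k-means is swapped in here as an assumption. The student values, the starting centroids, and the function name `kmeans` are all hypothetical; only the two feature names come from the abstract.

```python
import math
from typing import List, Sequence, Tuple

# Each point is (Number_Of_Assignment_Submissions, Content_Completed).
Point = Tuple[float, float]


def kmeans(
    points: Sequence[Point],
    initial_centroids: Sequence[Point],
    max_iter: int = 100,
) -> Tuple[List[int], List[Point]]:
    """Plain k-means with caller-supplied starting centroids (deterministic)."""
    centroids = list(initial_centroids)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids


# Hypothetical toy data: two low-engagement and two high-engagement students.
students = [(1.0, 2.0), (2.0, 1.0), (9.0, 10.0), (10.0, 9.0)]
labels, centroids = kmeans(students, initial_centroids=[students[0], students[-1]])
```

With this toy input the two low-engagement students share one cluster and the two high-engagement students share the other. A real analysis would scale the features first and choose the number of clusters from the data.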
AS · Dec 24, 2024
Text-Aware Adapter for Few-Shot Keyword Spotting
Youngmoon Jung, Jinyoung Lee, Seungjin Lee et al.
Recent advances in flexible keyword spotting (KWS) with text enrollment allow users to personalize keywords without uttering them during enrollment. However, there is still room for improvement in target keyword performance. In this work, we propose a novel few-shot transfer learning method, called text-aware adapter (TA-adapter), designed to enhance a pre-trained flexible KWS model for specific keywords with limited speech samples. To adapt the acoustic encoder, we leverage a jointly pre-trained text encoder to generate a text embedding that acts as a representative vector for the keyword. By fine-tuning only a small portion of the network while keeping the core components' weights intact, the TA-adapter proves highly efficient for few-shot KWS, enabling a seamless return to the original pre-trained model. In our experiments, the TA-adapter demonstrated significant performance improvements across 35 distinct keywords from the Google Speech Commands V2 dataset, with only a 0.14% increase in the total number of parameters.
CL · Aug 5, 2025
Beyond Hard Sharing: Efficient Multi-Task Speech-to-Text Modeling with Supervised Mixture of Experts
Hojun Jin, Eunsoo Hong, Ziwon Hyung et al.
Hard-parameter sharing is a common strategy to train a single model jointly across diverse tasks. However, this often leads to task interference, impeding overall model performance. To address the issue, we propose a simple yet effective Supervised Mixture of Experts (S-MoE). Unlike traditional Mixture of Experts models, S-MoE eliminates the need for training gating functions by utilizing special guiding tokens to route each task to its designated expert. By assigning each task to a separate feedforward network, S-MoE overcomes the limitations of hard-parameter sharing. We further apply S-MoE to a speech-to-text model, enabling the model to process mixed-bandwidth input while jointly performing automatic speech recognition (ASR) and speech translation (ST). Experimental results demonstrate the effectiveness of the proposed S-MoE, achieving a 6.35% relative improvement in Word Error Rate (WER) when applied to both the encoder and decoder.
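The routing idea in this abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the token names `<asr>` and `<st>`, the `SupervisedMoE` class, and the toy scaling "experts" standing in for feedforward networks are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical guiding tokens; the abstract does not name the actual tokens.
ASR_TOKEN = "<asr>"
ST_TOKEN = "<st>"

Vector = List[float]
Expert = Callable[[Vector], Vector]


def make_scaling_expert(scale: float) -> Expert:
    """Toy stand-in for a feedforward network: scales every element."""
    def expert(x: Vector) -> Vector:
        return [scale * v for v in x]
    return expert


class SupervisedMoE:
    """Routes each input to the expert designated by its guiding token.

    No gating function is trained: the token alone decides the route.
    """

    def __init__(self, experts: Dict[str, Expert]) -> None:
        self.experts = experts

    def forward(self, guiding_token: str, x: Vector) -> Vector:
        if guiding_token not in self.experts:
            raise KeyError(f"no expert registered for token {guiding_token!r}")
        return self.experts[guiding_token](x)


layer = SupervisedMoE({
    ASR_TOKEN: make_scaling_expert(2.0),   # expert used for recognition
    ST_TOKEN: make_scaling_expert(-1.0),   # expert used for translation
})
asr_out = layer.forward(ASR_TOKEN, [1.0, 2.0])
st_out = layer.forward(ST_TOKEN, [1.0, 2.0])
```

Because the route is a dictionary lookup on the task token, each task updates only its own expert, which is the property the abstract contrasts with hard-parameter sharing.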
SD · Jun 12, 2024
CTC-aligned Audio-Text Embedding for Streaming Open-vocabulary Keyword Spotting
Sichen Jin, Youngmoon Jung, Seungjin Lee et al.
This paper introduces a novel approach for streaming open-vocabulary keyword spotting (KWS) with text-based keyword enrollment. For every input frame, the proposed method finds the optimal alignment ending at the frame using connectionist temporal classification (CTC) and aggregates the frame-level acoustic embedding (AE) to obtain higher-level (i.e., character, word, or phrase) AE that aligns with the text embedding (TE) of the target keyword text. After that, we calculate the similarity of the aggregated AE and the TE. To the best of our knowledge, this is the first attempt to dynamically align the audio and the keyword text on-the-fly to attain the joint audio-text embedding for KWS. Despite operating in a streaming fashion, our approach achieves competitive performance on the LibriPhrase dataset compared to the non-streaming methods with a mere 155K model parameters and a decoding algorithm with time complexity O(U), where U is the length of the target keyword at inference time.
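The aggregate-then-compare step can be sketched as follows. This covers only the last two stages and assumes the alignment is already given as a list of frame indices; the CTC alignment search itself is omitted. Mean pooling and cosine similarity are assumptions, since the abstract names neither the aggregation nor the similarity function, and all function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_pool(frames: Sequence[Vector]) -> List[float]:
    """Average frame-level embeddings into one higher-level embedding."""
    if not frames:
        raise ValueError("need at least one frame to pool")
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_score(
    frame_embeddings: Sequence[Vector],
    aligned_frames: Sequence[int],
    text_embedding: Vector,
) -> float:
    """Pool the aligned frames, then compare them with the text embedding."""
    selected = [frame_embeddings[i] for i in aligned_frames]
    return cosine_similarity(mean_pool(selected), text_embedding)


# Hypothetical 2-D embeddings for four frames and one keyword.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
keyword = [1.0, 1.0]
score_match = keyword_score(frames, aligned_frames=[2, 3], text_embedding=keyword)
score_partial = keyword_score(frames, aligned_frames=[0], text_embedding=keyword)
```

In a streaming setting this score would be recomputed as each new frame arrives, with the alignment ending at that frame.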
AS · Jun 8, 2024
Relational Proxy Loss for Audio-Text based Keyword Spotting
Youngmoon Jung, Seungjin Lee, Joon-Young Yang et al.
In recent years, there has been an increasing focus on user convenience, leading to increased interest in text-based keyword enrollment systems for keyword spotting (KWS). Since the system utilizes text input during the enrollment phase and audio input during actual usage, we call this task audio-text based KWS. To enable this task, both acoustic and text encoders are typically trained using deep metric learning loss functions, such as triplet- and proxy-based losses. This study aims to improve existing methods by leveraging the structural relations within acoustic embeddings and within text embeddings. Unlike previous studies that only compare acoustic and text embeddings on a point-to-point basis, our approach focuses on the relational structures within the embedding space by introducing the concept of Relational Proxy Loss (RPL). By incorporating RPL, we demonstrated improved performance on the Wall Street Journal (WSJ) corpus.
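The contrast between point-to-point and relational comparison can be made concrete with a small sketch. This is one possible reading and not the paper's definition of RPL: the loss below matches the pairwise Euclidean distances within the acoustic embeddings to those within the text embeddings. The function names and the toy embeddings are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def pairwise_distances(embeddings: Sequence[Vector]) -> List[List[float]]:
    """Euclidean distance between every pair of embeddings in one modality."""
    return [[math.dist(a, b) for b in embeddings] for a in embeddings]


def relational_distance_loss(
    acoustic: Sequence[Vector], text: Sequence[Vector]
) -> float:
    """Mean squared gap between the two modalities' distance structures.

    Row i of `acoustic` and row i of `text` are assumed to belong to the
    same keyword. Individual points are never compared across modalities;
    only the distances inside each modality are.
    """
    if len(acoustic) != len(text):
        raise ValueError("both modalities need the same number of keywords")
    dist_a = pairwise_distances(acoustic)
    dist_t = pairwise_distances(text)
    n = len(acoustic)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum((dist_a[i][j] - dist_t[i][j]) ** 2 for i, j in pairs) / len(pairs)


# Hypothetical embeddings for two keywords.
acoustic = [[0.0, 0.0], [3.0, 4.0]]
text_shifted = [[1.0, 1.0], [4.0, 5.0]]   # same geometry, translated
text_squeezed = [[0.0, 0.0], [0.0, 3.0]]  # keywords sit closer together

loss_shifted = relational_distance_loss(acoustic, text_shifted)
loss_squeezed = relational_distance_loss(acoustic, text_squeezed)
```

The shifted text embeddings differ from the acoustic ones point by point, yet the relational loss is zero because the internal geometry is identical. The squeezed embeddings change that geometry and are penalized.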
CR · Aug 14, 2021
A Policy-based Versioning SSD with Intel SGX
Jinwoo Ahn, Seungjin Lee, Jinhoon Lee et al.
Privileged malware neutralizes software-based versioning systems and destroys data. To counter this threat, a versioning solid-state drive (SSD) that performs versioning inside the SSD has been studied. An SSD is a suitable candidate for data versioning because it can preserve previous versions without additional copying and provide high security with a very small trusted computing base (TCB). However, the versioning SSDs studied so far commonly use a full-disk versioning method that preserves all file versions in a batch. This paper demonstrates that SSDs that provide full-disk versioning can be exposed to data tampering attacks when the retention time of data is less than the malware's dwell time. To deal with this threat, we propose SGX-SSD, a policy-based per-file versioning SSD that keeps a deeper history for only the user's important files. However, since the SSD is not aware of file semantics, and the versioning policy information must be securely received from the untrusted host computer, implementing per-file versioning in the SSD is a major challenge. To solve this problem, SGX-SSD utilizes Intel SGX and has a secure host interface to securely receive policy information (configuration values) from the user. Also, to solve the SSD's lack of awareness of file semantics, a piggyback module is designed to give a file hint at the host layer, and an algorithm for selective versioning based on the policy is implemented in the SSD. To validate our system, we prototyped SGX-SSD on the Jasmine OpenSSD platform in a Linux environment. In the experimental evaluation, we showed that SGX-SSD provides strong security with little additional overhead for selective per-file versioning.
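The retention-versus-dwell-time argument can be sketched with a toy model. This illustrates the reasoning only and is not the SGX-SSD design: the `VersioningPolicy` class, the path prefixes, and every number below are hypothetical. The model assumes a clean version is recoverable only if it is retained at least as long as the malware stays undetected.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VersioningPolicy:
    """Hypothetical per-file policy: longer retention for important paths."""
    default_retention_days: int
    important_retention_days: int
    important_prefixes: Tuple[str, ...]

    def retention_for(self, path: str) -> int:
        # str.startswith accepts a tuple and matches any of its prefixes.
        if path.startswith(self.important_prefixes):
            return self.important_retention_days
        return self.default_retention_days


def clean_version_recoverable(retention_days: int, dwell_days: int) -> bool:
    """A pre-infection version survives only if retention covers the dwell time."""
    return retention_days >= dwell_days


policy = VersioningPolicy(
    default_retention_days=7,
    important_retention_days=180,
    important_prefixes=("/home/user/docs/",),
)
dwell_days = 30  # hypothetical time the malware stays undetected

cache_ok = clean_version_recoverable(policy.retention_for("/tmp/cache.bin"), dwell_days)
thesis_ok = clean_version_recoverable(
    policy.retention_for("/home/user/docs/thesis.tex"), dwell_days
)
```

Under these assumed numbers, a uniform 7-day retention loses the clean copy of every file, while the per-file policy still recovers the files marked important.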
CR · Apr 28, 2020
SGX-SSD: A Policy-based Versioning SSD with Intel SGX
Jinwoo Ahn, Seungjin Lee, Jinhoon Lee et al.
This paper demonstrates that SSDs that perform device-level versioning can be exposed to data tampering attacks when the retention time of data is less than the malware's dwell time. To deal with that threat, we propose SGX-SSD, an SGX-based versioning SSD that selectively preserves file history based on a given policy. The proposed system adopts Intel SGX to implement a version policy management system that is safe from high-privileged malware. Based on the policy, only the necessary data is preserved in the SSD, which prevents lower-priority files from wasting space and ensures the integrity of important files.
IR · Jan 24, 2019
Sequential Skip Prediction with Few-shot in Streamed Music Contents
Sungkyun Chang, Seungjin Lee, Kyogu Lee
This paper provides an outline of the algorithms submitted for the WSDM Cup 2019 Spotify Sequential Skip Prediction Challenge (team name: mimbres). In the challenge, complete information, including acoustic features and user interaction logs, is provided for the first half of a listening session. Our goal is to predict whether the individual tracks in the second half of the session will be skipped, given only acoustic features. We proposed two kinds of algorithms, based on metric learning and sequence learning. The experimental results showed that the sequence learning approach performed significantly better than the metric learning approach. Moreover, additional experiments showed that a significant performance gain can be achieved using complete user log information.
IR · Aug 31, 2018
Content-based feature exploration for transparent music recommendation using self-attentive genre classification
Seungjin Lee, Juheon Lee, Kyogu Lee
Interpretation of retrieved results is an important issue in music recommender systems, particularly from a user perspective. In this study, we investigate the methods for providing interpretability of content features using self-attention. We extract lyric features with the self-attentive genre classification model trained on 140,000 tracks of lyrics. Likewise, we extract acoustic features using the acoustic model with self-attention trained on 120,000 tracks of acoustic signals. The experimental results show that the proposed methods provide the characteristics that are interpretable in terms of both lyrical and musical contents. We demonstrate this by visualizing the attention weights, and by presenting the most similar songs found using lyric or audio features.