LG | Aug 22, 2023 | Code
Class Label-aware Graph Anomaly Detection
Junghoon Kim, Yeonjun In, Kanghoon Yoon et al.
Unsupervised graph anomaly detection (GAD) methods assume that anomaly labels, i.e., whether a node is anomalous or not, are unavailable. We observe that previous unsupervised methods assume the absence not only of such anomaly labels but also of class labels (the class a node belongs to, as used in a general node classification task). In this work, we study the utility of class labels for unsupervised GAD; in particular, how they enhance the detection of structural anomalies. To this end, we propose a Class Label-aware Graph Anomaly Detection framework (CLAD) that utilizes a limited number of labeled nodes to enhance the performance of unsupervised GAD. Extensive experiments on ten datasets demonstrate the superior performance of CLAD in comparison to existing unsupervised GAD methods, even in the absence of ground-truth class label information. The source code for CLAD is available at https://github.com/jhkim611/CLAD.
AS | Jun 27, 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak, Junmo Lee, Hanbin Bae et al.
Neural vocoders based on the generative adversarial network (GAN) have been widely used due to their fast inference speed and lightweight networks while generating high-quality speech waveforms. Since the perceptually important speech components are primarily concentrated in the low-frequency bands, most GAN-based vocoders perform multi-scale analysis that evaluates downsampled speech waveforms. This multi-scale analysis helps the generator improve speech intelligibility. However, in preliminary experiments, we discovered that multi-scale analysis that focuses on the low-frequency bands causes unintended artifacts, e.g., aliasing and imaging artifacts, which degrade the quality of the synthesized speech waveform. Therefore, in this paper, we investigate the relationship between these artifacts and GAN-based vocoders and propose a GAN-based vocoder, called Avocodo, that allows the synthesis of high-fidelity speech with reduced artifacts. We introduce two kinds of discriminators to evaluate speech waveforms from various perspectives: a collaborative multi-band discriminator and a sub-band discriminator. We also utilize a pseudo quadrature mirror filter bank to obtain downsampled multi-band speech waveforms while avoiding aliasing. According to experimental results, Avocodo outperforms baseline GAN-based vocoders, both objectively and subjectively, while reproducing speech with fewer artifacts.
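The aliasing mentioned in this abstract can be illustrated with a small, self-contained sketch. It is not Avocodo's pseudo quadrature mirror filter bank; it only shows, under assumed example frequencies (3000 Hz tone, 8000 Hz sample rate), what happens when a waveform is downsampled by simply dropping samples without an anti-aliasing filter. The function names are hypothetical.

```python
import math

def sine(freq_hz, sample_rate_hz, num_samples):
    """Sample a unit-amplitude sine wave (illustrative helper)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def naive_downsample(signal, factor):
    """Keep every `factor`-th sample with no low-pass filtering."""
    return signal[::factor]

# A 3000 Hz tone sampled at 8000 Hz (assumed example values).
original = sine(3000, 8000, 64)

# Dropping every other sample gives a 4000 Hz sample rate, whose
# Nyquist limit (2000 Hz) is below the tone's frequency.
decimated = naive_downsample(original, 2)

# The tone folds back to 4000 - 3000 = 1000 Hz with inverted sign.
alias = sine(1000, 4000, len(decimated))
max_err = max(abs(d + a) for d, a in zip(decimated, alias))
print(f"max deviation from an inverted 1000 Hz tone: {max_err:.2e}")
```

The decimated signal is indistinguishable from a 1000 Hz tone that was never in the input, which is the kind of artifact a band-splitting filter bank is meant to avoid.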
SD | Nov 20, 2023
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Jungil Kong, Junmo Lee, Jeongmin Kim et al.
In this work, we propose a novel method for modeling numerous speakers that expresses the overall characteristics of a speaker in detail, as a trained multi-speaker model does, without additional training on the target speaker's dataset. Although various works with similar purposes have been actively studied, their performance has not yet reached that of trained multi-speaker models due to their fundamental limitations. To overcome these limitations, we propose effective methods for learning features and for representing target speakers' speech characteristics by discretizing the features and conditioning a speech synthesis model on them. Even for unseen speakers, our method obtained a significantly higher similarity mean opinion score (SMOS) in subjective similarity evaluation than a high-performance multi-speaker model obtained for its seen speakers. The proposed method also outperforms a zero-shot method by significant margins. Furthermore, our method shows remarkable performance in generating new artificial speakers. In addition, we demonstrate that the encoded latent features are sufficiently informative to reconstruct an original speaker's speech completely. This implies that our method can be used as a general methodology to encode and reconstruct speakers' characteristics in various tasks.
SI | Nov 19, 2024 | Code
Revisiting Fake News Detection: Towards Temporality-aware Evaluation by Leveraging Engagement Earliness
Junghoon Kim, Junmo Lee, Yeonjun In et al.
Social graph-based fake news detection aims to identify news articles containing false information by utilizing social contexts, e.g., user information, tweets and comments. However, conventional methods are evaluated under less realistic scenarios, where the model has access to future knowledge of article-related and context-related data during training. In this work, we formalize a more realistic evaluation scheme that mimics real-world scenarios, where the data is temporality-aware and the detection model can only be trained on data collected up to a certain point in time. We show that the discriminative capabilities of conventional methods decrease sharply under this new setting, and further propose DAWN, a method more applicable to such scenarios. Our empirical findings indicate that later engagements (e.g., consuming or reposting news) contribute more to noisy edges that link real news-fake news pairs in the social graph. Motivated by this, we utilize feature representations of engagement earliness to guide an edge weight estimator to suppress the weights of such noisy edges, thereby enhancing the detection performance of DAWN. Through extensive experiments, we demonstrate that DAWN outperforms existing fake news detection methods under real-world environments. The source code is available at https://github.com/LeeJunmo/DAWN.
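The temporality-aware setting described in this abstract can be sketched as a simple cutoff-based split. This is a minimal illustration and not the DAWN evaluation protocol: the `Engagement` fields, the `temporal_split` helper, and the integer timestamps are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Engagement:
    """One user interaction with an article (hypothetical schema)."""
    user_id: str
    article_id: str
    timestamp: int

def temporal_split(article_times, engagements, cutoff):
    """Split so that training sees nothing created after `cutoff`.

    article_times maps article_id -> publication timestamp.
    """
    train_articles = {a for a, t in article_times.items() if t <= cutoff}
    test_articles = set(article_times) - train_articles
    # Engagements after the cutoff are future knowledge and are hidden,
    # even when they refer to an article already in the training set.
    train_engagements = [
        e for e in engagements
        if e.timestamp <= cutoff and e.article_id in train_articles
    ]
    return train_articles, test_articles, train_engagements

article_times = {"a1": 10, "a2": 20, "a3": 35}
engagements = [
    Engagement("u1", "a1", 12),
    Engagement("u2", "a1", 40),  # late engagement, hidden from training
    Engagement("u1", "a3", 36),
]
train_a, test_a, train_e = temporal_split(article_times, engagements, cutoff=30)
print(sorted(train_a), sorted(test_a), len(train_e))
```

A random split, by contrast, would let the late engagement on `a1` leak into training.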
CR | Dec 10, 2025
BugSweeper: Function-Level Detection of Smart Contract Vulnerabilities Using Graph Neural Networks
Uisang Lee, Changhoon Chung, Junmo Lee et al.
The rapid growth of Ethereum has made it more important to quickly and accurately detect smart contract vulnerabilities. While machine-learning-based methods have shown some promise, many still rely on rule-based preprocessing designed by domain experts. Rule-based preprocessing methods often discard crucial context from the source code, potentially causing certain vulnerabilities to be overlooked and limiting adaptability to newly emerging threats. We introduce BugSweeper, an end-to-end deep learning framework that detects vulnerabilities directly from the source code without manual engineering. BugSweeper represents each Solidity function as a Function-Level Abstract Syntax Graph (FLAG), a novel graph that combines its Abstract Syntax Tree (AST) with enriched control-flow and data-flow semantics. Then, our two-stage Graph Neural Network (GNN) analyzes these graphs. The first-stage GNN filters noise from the syntax graphs, while the second-stage GNN conducts high-level reasoning to detect diverse vulnerabilities. Extensive experiments on real-world contracts show that BugSweeper significantly outperforms all state-of-the-art detection methods. By removing the need for handcrafted rules, our approach offers a robust, automated, and scalable solution for securing smart contracts without any dependence on security experts.
DC | May 6, 2023
A Blockchain-based Platform for Reliable Inference and Training of Large-Scale Models
Sanghyeon Park, Junmo Lee, Soo-Mook Moon
As artificial intelligence (AI) continues to permeate various domains, concerns surrounding trust and transparency in AI-driven inference and training processes have emerged, particularly with respect to potential biases and traceability challenges. Decentralized solutions such as blockchain have been proposed to tackle these issues, but they often struggle when dealing with large-scale models, leading to time-consuming inference and inefficient training verification. To overcome these limitations, we introduce BRAIN, a Blockchain-based Reliable AI Network, a novel platform specifically designed to ensure reliable inference and training of large models. BRAIN harnesses a unique two-phase transaction mechanism, allowing real-time processing via pipelining by separating request and response transactions. A randomly selected inference committee commits and reveals its inference results and, once agreement is reached through a smart contract, the requested operation is executed using the consensus result. Additionally, BRAIN carries out training by employing a randomly selected training committee. Its members submit commit and reveal transactions along with their respective scores, enabling local model aggregation based on the median value of the scores. Experimental results demonstrate that BRAIN delivers considerably higher inference throughput at reasonable gas fees. In particular, BRAIN's tasks-per-second performance is 454.4293 times greater than that of a naive single-phase implementation.
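The commit-and-reveal step and the median-based scoring described in this abstract can be sketched in a few lines. This is a generic commit-reveal illustration under stated assumptions, not BRAIN's smart contract: the salted SHA-256 commitment, the simple majority `quorum` rule, and the sample values are all hypothetical choices.

```python
import hashlib
import statistics
from collections import Counter

def commit(value, salt):
    """Commit phase: publish only a hash of the salted value."""
    return hashlib.sha256(f"{salt}:{value}".encode()).hexdigest()

def verify_reveal(commitment, value, salt):
    """Reveal phase: check the disclosed value against the commitment."""
    return commit(value, salt) == commitment

def agree(revealed_values, quorum):
    """Return the most common revealed value if it reaches the quorum."""
    if not revealed_values:
        return None
    value, votes = Counter(revealed_values).most_common(1)[0]
    return value if votes >= quorum else None

# Three hypothetical committee members: (inference result, secret salt).
members = {"m1": ("cat", "s1"), "m2": ("cat", "s2"), "m3": ("dog", "s3")}
commitments = {m: commit(v, s) for m, (v, s) in members.items()}

# Only reveals that match their earlier commitment are counted.
valid = [v for m, (v, s) in members.items()
         if verify_reveal(commitments[m], v, s)]
result = agree(valid, quorum=2)

# Training side: the median is robust to a single outlying score.
median_score = statistics.median([0.71, 0.90, 0.75])
print(result, median_score)
```

Committing before revealing prevents a member from copying another member's answer, since the hash discloses nothing about the value until the salt is published.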
CR | Dec 8, 2020
RouTEE: A Secure Payment Network Routing Hub using Trusted Execution Environments
Junmo Lee, Seongjun Kim, Sanghyeon Park et al.
Cryptocurrencies such as Bitcoin and Ethereum have made payment transactions possible without a trusted third party, but they have a scalability issue due to their consensus mechanisms. Payment networks have emerged to overcome this limitation by executing transactions outside of the blockchain, which is why these are referred to as off-chain transactions. To establish a payment channel, two users lock their deposits in the blockchain, after which they can pay each other through the channel. Furthermore, payment networks support multi-hop payments that allow users to transfer their balances to other users who are connected to them via multiple channels. However, multi-hop payments are hard to accomplish, as they depend heavily on the routing users on a payment path from a sender to a receiver. Although routing hubs can make multi-hop payments more practical and efficient, they need a lot of collateral locked for a long period and have privacy issues in terms of payment history. We propose RouTEE, a secure payment routing hub that is fully feasible without the hub's deposit. Unlike existing payment networks, RouTEE provides high balance liquidity, and details about payments are concealed from hosts by leveraging trusted execution environments (TEEs). RouTEE is designed to make rational hosts behave honestly by introducing a new routing fee scheme and a secure settlement method. Moreover, users do not need to monitor the blockchain in real time or run full nodes. They can participate in RouTEE by simply verifying block headers through light clients; furthermore, having only one channel with RouTEE is sufficient to interact with other users. Our implementation demonstrates that RouTEE is highly efficient and outperforms the Lightning Network, the state-of-the-art payment network.
AS | Jul 30, 2020
Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning
Jae-Sung Bae, Hanbin Bae, Young-Sun Joo et al.
This paper proposes a controllable end-to-end text-to-speech (TTS) system (speed-controllable TTS; SCTTS) that controls the speaking speed of synthesized speech by taking a sentence-level speaking-rate value as an additional input. The speaking-rate value, the ratio of the number of input phonemes to the length of input speech, is adopted in the proposed system to control the speaking speed. Furthermore, the proposed SCTTS system can control the speaking speed while retaining other speech attributes, such as the pitch, by adopting the global style token-based style encoder. The proposed SCTTS does not require any additional well-trained model or an external speech database to extract phoneme-level duration information and can be trained in an end-to-end manner. In addition, our listening tests on fast-, normal-, and slow-speed speech showed that the SCTTS can generate more natural speech than other phoneme duration control approaches, which increase or decrease duration at the same rate for the entire sentence, especially in the case of slow-speed speech.
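The speaking-rate value defined in this abstract is a plain ratio, which the following sketch computes. The abstract does not state the unit of speech length, so seconds are assumed here (frames would work the same way); the function name and the example numbers are hypothetical.

```python
def speaking_rate(num_phonemes, speech_length):
    """Number of input phonemes divided by the length of the speech.

    `speech_length` is assumed to be in seconds; the source does not
    specify the unit.
    """
    if speech_length <= 0:
        raise ValueError("speech_length must be positive")
    return num_phonemes / speech_length

# The same 30-phoneme sentence spoken at two speeds (made-up numbers).
fast = speaking_rate(30, 2.5)
slow = speaking_rate(30, 4.0)
print(fast, slow)
```

Because the value is computed per sentence, it can be obtained from the training pairs themselves without a separate phoneme-level duration model.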
AS | Jul 30, 2020
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Jinhyeok Yang, Junmo Lee, Youngik Kim et al.
We present a novel high-fidelity real-time neural vocoder called VocGAN. A recently developed GAN-based vocoder, MelGAN, produces speech waveforms in real time. However, it often produces a waveform that is insufficient in quality or inconsistent with the acoustic characteristics of the input mel spectrogram. VocGAN is nearly as fast as MelGAN, but it significantly improves the quality and consistency of the output waveform. VocGAN applies a multi-scale waveform generator and a hierarchically-nested discriminator to learn multiple levels of acoustic properties in a balanced way. It also applies the joint conditional and unconditional objective, which has shown successful results in high-resolution image synthesis. In experiments, VocGAN synthesizes speech waveforms 416.7x faster than real time on a GTX 1080Ti GPU and 3.24x faster than real time on a CPU. Compared with MelGAN, it also exhibits significantly improved quality in multiple evaluation metrics, including mean opinion score (MOS), with minimal additional overhead. Additionally, compared with Parallel WaveGAN, another recently developed high-fidelity vocoder, VocGAN is 6.98x faster on a CPU and exhibits higher MOS.
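To make the "faster than real time" figures in this abstract concrete, the sketch below converts a speedup factor into wall-clock synthesis time. It assumes the usual reading that the factor is audio duration divided by synthesis time; the helper names and the 10-second clip length are illustrative choices, not values from the paper.

```python
def synthesis_seconds(audio_seconds, speedup_vs_real_time):
    """Wall-clock time needed to generate `audio_seconds` of audio."""
    if speedup_vs_real_time <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup_vs_real_time

def speedup(audio_seconds, elapsed_seconds):
    """Inverse relation: audio duration per second of compute."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return audio_seconds / elapsed_seconds

clip = 10.0  # seconds of audio, an arbitrary example length
gpu_time = synthesis_seconds(clip, 416.7)  # reported GPU factor
cpu_time = synthesis_seconds(clip, 3.24)   # reported CPU factor
print(f"GPU: {gpu_time * 1000:.1f} ms, CPU: {cpu_time:.2f} s")
```

Any factor above 1 means audio is produced faster than it plays back, which is the requirement for streaming use.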