CV · Nov 4, 2022
Unsupervised Visual Representation Learning via Mutual Information Regularized Assignment
Dong Hoon Lee, Sungik Choi, Hyunwoo Kim et al.
This paper proposes Mutual Information Regularized Assignment (MIRA), a pseudo-labeling algorithm for unsupervised representation learning inspired by information maximization. We formulate online pseudo-labeling as an optimization problem to find pseudo-labels that maximize the mutual information between the label and data while being close to a given model probability. We derive a fixed-point iteration method and prove its convergence to the optimal solution. In contrast to baselines, MIRA combined with pseudo-label prediction enables simple yet effective clustering-based representation learning without incorporating extra training techniques or artificial constraints such as sampling strategies or equipartition constraints. With relatively few training epochs, the representation learned by MIRA achieves state-of-the-art performance on various downstream tasks, including linear/k-NN evaluation and transfer learning. In particular, with only 400 epochs, our method applied to the ImageNet dataset with a ResNet-50 architecture achieves 75.6% linear evaluation accuracy.
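The quantity being maximized can be illustrated with a small sketch. This is not MIRA's fixed-point iteration, which the abstract does not spell out; it only computes the mutual information between the pseudo-label and the data index for a batch of soft assignments, assuming a uniform distribution over the data points. The function names and the toy assignment matrices are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def mutual_information(assignments):
    """Mutual information between label and data index for soft pseudo-labels.

    Assumes each row is a probability vector over K clusters and that the N
    data points are equally likely, so that
    I = H(mean of rows) - mean of H(row).
    """
    n = len(assignments)
    k = len(assignments[0])
    marginal = [sum(row[j] for row in assignments) / n for j in range(k)]
    conditional = sum(entropy(row) for row in assignments) / n
    return entropy(marginal) - conditional


# Toy batches (hypothetical): two data points, two clusters.
confident_balanced = [[1.0, 0.0], [0.0, 1.0]]  # confident and evenly spread
collapsed = [[1.0, 0.0], [1.0, 0.0]]           # everything in one cluster
uncertain = [[0.5, 0.5], [0.5, 0.5]]           # no point is assigned confidently
```

Under these assumptions, the mutual information is largest when assignments are both confident and balanced across clusters, and it drops to zero when the assignments collapse to one cluster or carry no information about the data point.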
CV · Dec 13, 2024 · Code
Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers
Dong Hoon Lee, Seunghoon Hong
Recent token reduction methods for Vision Transformers (ViTs) incorporate token merging, which measures the similarities between token embeddings and combines the most similar pairs. However, their merging policies are directly dependent on intermediate features in ViTs, which prevents exploiting features tailored for merging and requires end-to-end training to improve token merging. In this paper, we propose Decoupled Token Embedding for Merging (DTEM) that enhances token merging through a decoupled embedding learned via a continuously relaxed token merging process. Our method introduces a lightweight embedding module decoupled from the ViT forward pass to extract dedicated features for token merging, thereby addressing the restriction from using intermediate features. The continuously relaxed token merging, applied during training, enables us to learn the decoupled embeddings in a differentiable manner. Thanks to the decoupled structure, our method can be seamlessly integrated into existing ViT backbones and trained either modularly by learning only the decoupled embeddings or end-to-end by fine-tuning. We demonstrate the applicability of DTEM on various tasks, including classification, captioning, and segmentation, with consistent improvement in token merging. In particular, on ImageNet-1k classification, DTEM achieves a 37.2% reduction in FLOPs while maintaining a top-1 accuracy of 79.85% with DeiT-small. Code is available at https://github.com/movinghoon/dtem.
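The basic merging step described above (measure similarity, combine the most similar pair) can be sketched as follows. This is a deliberately simplified illustration and not DTEM itself: it merges a single pair greedily, uses plain averaging, and takes the similarity embeddings as a fixed input rather than learning them through a relaxed merging process. The names `merge_most_similar` and `sim_embed` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def merge_most_similar(tokens, sim_embed):
    """Merge the single most similar pair of tokens by averaging them.

    tokens:    per-token feature vectors that are actually merged.
    sim_embed: per-token vectors used only to score similarity; passing a
               separate list here mimics the idea of a decoupled embedding.
    Returns the reduced token list and the indices of the merged pair.
    """
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to merge")
    best, pair = -2.0, None
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens)):
            s = cosine(sim_embed[i], sim_embed[j])
            if s > best:
                best, pair = s, (i, j)
    i, j = pair
    merged = [(a + b) / 2 for a, b in zip(tokens[i], tokens[j])]
    kept = [t for idx, t in enumerate(tokens) if idx not in (i, j)]
    return kept + [merged], pair


# Toy input (hypothetical): tokens 0 and 2 point in nearly the same direction.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
reduced, pair = merge_most_similar(tokens, sim_embed=tokens)
```

Here the same vectors are passed for both arguments, which corresponds to the coupled setting the abstract criticizes; supplying different `sim_embed` vectors changes which pair is merged without touching the token features.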
LG · Oct 24, 2025 · Code
Disentangled Representation Learning via Modular Compositional Bias
Whie Jung, Dong Hoon Lee, Seunghoon Hong
Recent disentangled representation learning (DRL) methods heavily rely on factor-specific strategies (either learning objectives for attributes or model architectures for objects) to embed inductive biases. Such divergent approaches result in significant overhead when novel factors of variation do not align with prior assumptions, such as statistical independence or spatial exclusivity, or when multiple factors coexist, as practitioners must redesign architectures or objectives. To address this, we propose a compositional bias, a modular inductive bias decoupled from both objectives and architectures. Our key insight is that different factors obey distinct recombination rules in the data distribution: global attributes are mutually exclusive, e.g., a face has one nose, while objects share a common support (any subset of objects can co-exist). We therefore randomly remix latents according to factor-specific rules, i.e., a mixing strategy, and force the encoder to discover whichever factor structure the mixing strategy reflects through two complementary objectives: (i) a prior loss that ensures every remix decodes into a realistic image, and (ii) the compositional consistency loss introduced by Wiedemer et al. (arXiv:2310.05327), which aligns each composite image with its corresponding composite latent. Under this general framework, simply adjusting the mixing strategy enables disentanglement of attributes, objects, and even both, without modifying the objectives or architectures. Extensive experiments demonstrate that our method shows competitive performance in both attribute and object disentanglement, and uniquely achieves joint disentanglement of global style and objects. Code is available at https://github.com/whieya/Compositional-DRL.
CV · Sep 9, 2025 · Code
Universal Few-Shot Spatial Control for Diffusion Models
Kiet T. Nguyen, Chanhuyk Lee, Donggyun Kim et al.
Spatial conditioning in pretrained text-to-image diffusion models has significantly improved fine-grained control over the structure of generated images. However, existing control adapters exhibit limited adaptability and incur high training costs when encountering novel spatial control conditions that differ substantially from the training tasks. To address this limitation, we propose Universal Few-Shot Control (UFC), a versatile few-shot control adapter capable of generalizing to novel spatial conditions. Given a few image-condition pairs of an unseen task and a query condition, UFC leverages the analogy between query and support conditions to construct task-specific control features, instantiated by a matching mechanism and an update on a small set of task-specific parameters. Experiments on six novel spatial control tasks show that UFC, fine-tuned with only 30 annotated examples of novel tasks, achieves fine-grained control consistent with the spatial conditions. Notably, when fine-tuned with 0.1% of the full training data, UFC achieves competitive performance with the fully supervised baselines in various control tasks. We also show that UFC is applicable agnostically to various diffusion backbones and demonstrate its effectiveness on both UNet and DiT architectures. Code is available at https://github.com/kietngt00/UFC.
CR · Dec 20, 2021
Forensic Issues and Techniques to Improve Security in SSD with Flex Capacity Feature
Na Young Ahn, Dong Hoon Lee
Over-provisioning technology is typically introduced as a means to improve the performance of storage systems, such as databases. The over-provisioning area is both hidden and difficult for normal users to access. This paper focuses on attack models for such hidden areas. Malicious hackers use advanced over-provisioning techniques that vary capacity according to workload, and as such, our focus is on attack models that use variable over-provisioning technology. According to these attack models, it is possible to scan for invalid blocks containing original data or malware code that is hidden in the over-provisioning area. In this paper, we outline the different forensic processes performed for each memory cell type of the over-provisioning area and disclose security enhancement techniques that increase immunity to these attack models. This leads to a discussion of forensic possibilities and countermeasures for SSDs that can change the over-provisioning area. We also present information-hiding attacks and information-exposing attacks on the invalidation area of the SSD. Our research provides a good foundation upon which the performance and security of SSD-based databases can be further improved.
CV · Jun 22, 2021
Unsupervised Embedding Adaptation via Early-Stage Feature Reconstruction for Few-Shot Classification
Dong Hoon Lee, Sae-Young Chung
We propose unsupervised embedding adaptation for the downstream few-shot classification task. Based on findings that deep neural networks learn to generalize before memorizing, we develop Early-Stage Feature Reconstruction (ESFR) -- a novel adaptation scheme with feature reconstruction and dimensionality-driven early stopping that finds generalizable features. Incorporating ESFR consistently improves the performance of baseline methods on all standard settings, including the recently proposed transductive method. ESFR used in conjunction with the transductive method further achieves state-of-the-art performance on mini-ImageNet, tiered-ImageNet, and CUB; especially with 1.2%~2.0% improvements in accuracy over the previous best performing method on 1-shot setting.
CY · Apr 29, 2020
Balancing Personal Privacy and Public Safety during COVID-19: The Case of South Korea
Na Young Ahn, Jun Eun Park, Dong Hoon Lee et al.
There has been vigorous debate on how different countries responded to the COVID-19 pandemic. To secure public safety, South Korea actively used personal information at the risk of personal privacy whereas France encouraged voluntary cooperation at the risk of public safety. In this article, after a brief comparison of contextual differences with France, we focus on South Korea's approaches to epidemiological investigations. To evaluate the issues pertaining to personal privacy and public health, we examine the usage patterns of original data, de-identification data, and encrypted data. Our specific proposal discusses the COVID index, which considers collective infection, outbreak intensity, availability of medical infrastructure, and the death rate. Finally, we summarize the findings and lessons for future research and the policy implications.
CR · Mar 30, 2020
Hold the Door! Fingerprinting Your Car Key to Prevent Keyless Entry Car Theft
Kyungho Joo, Wonsuk Choi, Dong Hoon Lee
Recently, the traditional way to unlock car doors has been replaced with a keyless entry system, which proves more convenient for automobile owners. When a driver with a key fob is in the vicinity of the vehicle, doors automatically unlock on user command. Unfortunately, it has been shown that these keyless entry systems are vulnerable to signal relaying attacks. While it is evident that automobile manufacturers incorporate preventative methods to secure these keyless entry systems, they continue to be vulnerable to a range of attacks. Relayed signals result in valid packets that are verified as legitimate, which makes it difficult to distinguish a legitimate door unlock request from a malicious signal. In response to this vulnerability, this paper presents an RF fingerprinting method (coined HOld the DOoR, HODOR) to detect attacks on keyless entry systems, the first attempt to exploit the RF fingerprint technique in the automotive domain. HODOR is designed as a sub-authentication method that supports existing authentication systems for keyless entry systems and does not require any modification of the main system to perform. Through a series of experiments, the results demonstrate that HODOR competently and reliably detects attacks on keyless entry systems. HODOR achieves an average false positive rate (FPR) of 0.27 percent and a false negative rate (FNR) of 0 percent for the detection of simulated attacks, corresponding to current research on keyless entry car theft.
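For readers unfamiliar with the two reported metrics, the sketch below shows how FPR and FNR are conventionally computed for a binary attack detector. The labels and predictions are made-up toy data, not results from the paper, and the function name `error_rates` is hypothetical.

```python
def error_rates(labels, predictions):
    """Return (FPR, FNR) for a binary detector.

    labels / predictions: sequences of booleans where True means "attack".
    FPR = legitimate requests flagged as attacks / all legitimate requests.
    FNR = attacks accepted as legitimate / all attacks.
    """
    false_pos = sum(1 for y, p in zip(labels, predictions) if not y and p)
    false_neg = sum(1 for y, p in zip(labels, predictions) if y and not p)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


# Toy data (hypothetical): four legitimate unlock requests, two relayed ones.
labels = [False, False, False, False, True, True]
predictions = [False, False, False, True, True, True]
fpr, fnr = error_rates(labels, predictions)
```

In this toy run one of four legitimate requests is rejected and both attacks are caught. An FNR of 0 percent, as reported in the abstract, means that no simulated attack was accepted as legitimate.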
CR · Dec 28, 2019
Schemes for Privacy Data Destruction in a NAND Flash Memory
Na-Young Ahn, Dong Hoon Lee
We propose schemes for efficiently destroying privacy data in a NAND flash memory. Generally, even if privacy data is discarded from NAND flash memories, there is a high probability that the data will remain in an invalid block. This is a management problem that arises from the specificity of the program and erase operations of NAND flash memories. When updating pages or performing garbage collection, valid data can remain in at least one unmapped memory block. Is it possible to apply the obligation to delete privacy data to existing NAND flash memory? This paper answers this question. We propose a partial overwriting scheme, an SLC programming scheme, and a deletion duty pulse application scheme for invalid pages to effectively solve privacy data destruction issues due to the remaining data. Such privacy data destruction schemes basically utilize at least one state in which data can be written to the programmed cells based on a multi-level cell program operation. Our privacy data destruction schemes have advantages in terms of block management as compared with conventional erase schemes, and are very economical in terms of time and cost. The proposed privacy data destruction schemes can be easily applied to many storage devices and data centers using NAND flash memories.
CR · Sep 5, 2018
Multi-Client Order-Revealing Encryption
Jieun Eom, Dong Hoon Lee, Kwangsu Lee
Order-revealing encryption is a useful cryptographic primitive that provides range queries on encrypted data since anyone can compare the order of plaintexts by running a public comparison algorithm. Most studies on order-revealing encryption focus only on comparing ciphertexts generated by a single client, and there is no study on comparing ciphertexts generated by multiple clients. In this paper, we propose the concept of multi-client order-revealing encryption that supports comparisons not only on ciphertexts generated by one client but also on ciphertexts generated by multiple clients. We also define a simulation-based security model for multi-client order-revealing encryption. The security model is defined with respect to the leakage function which quantifies how much information is leaked from the scheme. Next, we present two specific multi-client order-revealing encryption schemes with different leakage functions in bilinear maps and prove their security in the random oracle model. Finally, we give the implementation of the proposed schemes and suggest methods to improve the performance of ciphertext comparisons.
CR · May 20, 2017
Countermeasure against Side-Channel Attack in Shared Memory of TrustZone
Na-Young Ahn, Dong Hoon Lee
In this paper, we introduce countermeasures against side-channel attacks in the shared memory of TrustZone. We propose a zero-contention cache memory or policy between the REE and TEE to prevent TruSpy attacks in TrustZone. We also suggest making the delay time of the REE data path equal or similar to that of the TEE data path to prevent timing side-channel attacks. In addition, we propose security information flow control based on the Clark-Wilson model and build the information flow control mechanism using an Authentication Tokenization Program (ATP). Accordingly, we can expect improved integrity of the information content between the REE and TEE on mobile devices.
CR · Oct 25, 2016
Revocable Hierarchical Identity-Based Encryption from Multilinear Maps
Seunghwan Park, Dong Hoon Lee, Kwangsu Lee
In identity-based encryption (IBE) systems, an efficient key delegation method to manage a large number of users and an efficient key revocation method to handle the dynamic credentials of users are needed. Revocable hierarchical IBE (RHIBE) can provide these two methods by organizing the identities of users as a hierarchy and broadcasting an update key for non-revoked users per each time period. To provide the key revocation functionality, previous RHIBE schemes use a tree-based revocation scheme. However, this approach has an inherent limitation such that the number of update key elements depends on the number of revoked users. In this paper, we propose two new RHIBE schemes in multilinear maps that use the public-key broadcast encryption scheme instead of using the tree-based revocation scheme to overcome the mentioned limitation. In our first RHIBE scheme, the number of private key elements and update key elements is reduced to $O(\ell)$ and $O(\ell)$ respectively where $\ell$ is the depth of a hierarchical identity. In our second RHIBE scheme, we can further reduce the number of private key elements from $O(\ell)$ to $O(1)$.
CR · Jul 2, 2016
Identifying ECUs Using Inimitable Characteristics of Signals in Controller Area Networks
Wonsuk Choi, Hyo Jin Jo, Samuel Woo et al.
In the last several decades, the automotive industry has come to incorporate the latest Information and Communications Technology (ICT), increasingly replacing mechanical components of vehicles with electronic components. These electronic control units (ECUs) communicate with each other in an in-vehicle network that makes the vehicle both safer and easier to drive. Controller Area Networks (CANs) are the current standard for such high quality in-vehicle communication. Unfortunately, CANs do not currently offer protection against security attacks. In particular, they do not allow for message authentication and hence are open to attacks that replay ECU messages for malicious purposes. Applying the classic cryptographic method of message authentication code (MAC) is not feasible since the CAN data frame is not long enough to include a sufficiently long MAC to provide effective authentication. In this paper, we propose a novel identification method, which works in the physical layer of an in-vehicle CAN network. Our method identifies ECUs using inimitable characteristics of signals, enabling detection of a compromised or alien ECU being used in a replay attack. Unlike previous attempts to address security issues in the in-vehicle CAN network, our method works by simply adding a monitoring unit to the existing network, making it deployable in current systems and compliant with required CAN standards. Our experimental results show that the bit string and classification algorithm that we utilized yielded more accurate identification of compromised ECUs than any other method proposed to date. The false positive rate is more than 2 times lower than that of the method proposed by P.-S. Murvay et al. This paper is also the first to identify potential attack models that systems should be able to detect.
CR · Feb 27, 2015
Anonymous HIBE with Short Ciphertexts: Full Security in Prime Order Groups
Kwangsu Lee, Jong Hwan Park, Dong Hoon Lee
Anonymous Hierarchical Identity-Based Encryption (HIBE) is an extension of Identity-Based Encryption (IBE), and it provides not only a message hiding property but also an identity hiding property. Anonymous HIBE schemes are applicable to anonymous communication systems and public key encryption systems with keyword searching. However, previous anonymous HIBE schemes have disadvantages: the security was proven in a weaker model, the ciphertexts are not short, or the construction is based on composite order bilinear groups. In this paper, we propose the first efficient anonymous HIBE scheme with short ciphertexts in prime order (asymmetric) bilinear groups, and prove its security in the full model with an efficient reduction. To achieve this, we use the dual system encryption methodology of Waters. We also present the benchmark results of our scheme by measuring the performance of our implementation.
CR · Feb 24, 2015
Sequential Aggregate Signatures with Short Public Keys without Random Oracles
Kwangsu Lee, Dong Hoon Lee, Moti Yung
The notion of aggregate signature has been motivated by applications and it enables any user to compress different signatures signed by different signers on different messages into a short signature. Sequential aggregate signature, in turn, is a special kind of aggregate signature that only allows a signer to add his signature into an aggregate signature in sequential order. This latter scheme has applications in diversified settings such as in reducing bandwidth of certificate chains and in secure routing protocols. Lu, Ostrovsky, Sahai, Shacham, and Waters (EUROCRYPT 2006) presented the first sequential aggregate signature scheme in the standard model. The size of their public key, however, is quite large (i.e., the number of group elements is proportional to the security parameter), and therefore, they suggested as an open problem the construction of such a scheme with short keys. In this paper, we propose the first sequential aggregate signature schemes with short public keys (i.e., a constant number of group elements) in prime order (asymmetric) bilinear groups that are secure under static assumptions in the standard model. Furthermore, our schemes employ a constant number of pairing operations per message signing and message verification operation. Technically, we start with a public-key signature scheme based on the recent dual system encryption technique of Lewko and Waters (TCC 2010). This technique cannot directly provide an aggregate signature scheme since, as we observed, additional elements should be published in a public key to support aggregation. Thus, our constructions are careful augmentation techniques for the dual system technique to allow it to support sequential aggregate signature schemes. We also propose a multi-signature scheme with short public parameters in the standard model.
CR · Nov 18, 2014
Security Analysis of the Unrestricted Identity-Based Aggregate Signature Scheme
Kwangsu Lee, Dong Hoon Lee
Aggregate signatures allow anyone to combine different signatures signed by different signers on different messages into a single short signature. An ideal aggregate signature scheme is an identity-based aggregate signature (IBAS) scheme that supports full aggregation since it can reduce the total transmitted data by using an identity string as a public key and anyone can freely aggregate different signatures. Constructing a secure IBAS scheme that supports full aggregation in bilinear maps is an important open problem. Recently, Yuan et al. proposed an IBAS scheme with full aggregation in bilinear maps and claimed its security in the random oracle model under the computational Diffie-Hellman assumption. In this paper, we show that there exists an efficient forgery attacker on their IBAS scheme and their security proof has a serious flaw.