Taeho Lee

LG
h-index8
7papers
46citations
Novelty61%
AI Score46

7 Papers

LGJun 13, 2025Code
TruncQuant: Truncation-Ready Quantization for DNNs with Flexible Weight Bit Precision

Jinhee Kim, Seoyeon Yoon, Taeho Lee et al.

The deployment of deep neural networks on edge devices is a challenging task due to the increasing complexity of state-of-the-art models, requiring efforts to reduce model size and inference latency. Recent studies explore models operating at diverse quantization settings to find the optimal point that balances computational efficiency and accuracy. Truncation, an effective approach for achieving lower bit precision mapping, enables a single model to adapt to various hardware platforms with little to no cost. However, formulating a training scheme for deep neural networks to withstand the associated errors introduced by truncation remains a challenge, as the current quantization-aware training schemes are not designed for the truncation process. We propose TruncQuant, a novel truncation-ready training scheme allowing flexible bit precision through bit-shifting in runtime. We achieve this by aligning TruncQuant with the output of the truncation process, demonstrating strong robustness across bit-width settings, and offering an easily implementable training scheme within existing quantization-aware frameworks. Our code is released at https://github.com/a2jinhee/TruncQuant.

20.1LGMar 12
Taming the Adversary: Stable Minimax Deep Deterministic Policy Gradient via Fractional Objectives

Taeho Lee, Donghwan Lee

Reinforcement learning (RL) has achieved remarkable success in a wide range of control and decision-making tasks. However, RL agents often exhibit unstable or degraded performance when deployed in environments subject to unexpected external disturbances and model uncertainties. Consequently, ensuring reliable performance under such conditions remains a critical challenge. In this paper, we propose minimax deep deterministic policy gradient (MMDDPG), a framework for learning disturbance-resilient policies in continuous control tasks. The training process is formulated as a minimax optimization problem between a user policy and an adversarial disturbance policy. In this problem, the user learns a robust policy that minimizes the objective function, while the adversary generates disturbances that maximize it. To stabilize this interaction, we introduce a fractional objective that balances task performance and disturbance magnitude. This objective prevents excessively aggressive disturbances and promotes robust learning. Experimental evaluations in MuJoCo environments demonstrate that the proposed MMDDPG achieves significantly improved robustness against both external force perturbations and model parameter variations.

LGMar 20, 2025
Deep Q-Learning with Gradient Target Tracking

Bum Geun Park, Taeho Lee, Donghwan Lee

This paper introduces Q-learning with gradient target tracking, a novel reinforcement learning framework that provides a learned continuous target update mechanism as an alternative to the conventional hard update paradigm. In the standard deep Q-network (DQN), the target network is a copy of the online network's weights, held fixed for a number of iterations before being periodically replaced via a hard update. While this stabilizes training by providing consistent targets, it introduces a new challenge: the hard update period must be carefully tuned to achieve optimal performance. To address this issue, we propose two gradient-based target update methods: DQN with asymmetric gradient target tracking (AGT2-DQN) and DQN with symmetric gradient target tracking (SGT2-DQN). These methods replace the conventional hard target updates with continuous and structured updates using gradient descent, which effectively eliminates the need for manual tuning. We provide a theoretical analysis proving the convergence of these methods in tabular settings. Additionally, empirical evaluations demonstrate their advantages over standard DQN baselines, which suggest that gradient-based target updates can serve as an effective alternative to conventional target update mechanisms in Q-learning.

ROFeb 28, 2025
Robust Deterministic Policy Gradient for Disturbance Attenuation and Its Application to Quadrotor Control

Taeho Lee, Donghwan Lee

Practical control systems pose significant challenges in identifying optimal control policies due to uncertainties in the system model and external disturbances. While $H_\infty$ control techniques are commonly used to design robust controllers that mitigate the effects of disturbances, these methods often require complex and computationally intensive calculations. To address this issue, this paper proposes a reinforcement learning algorithm called robust deterministic policy gradient (RDPG), which formulates the $H_\infty$ control problem as a two-player zero-sum dynamic game. In this formulation, one player (the user) aims to minimize the cost, while the other player (the adversary) seeks to maximize it. We then employ deterministic policy gradient (DPG) and its deep reinforcement learning counterpart to train a robust control policy with effective disturbance attenuation. In particular, for practical implementation, we introduce an algorithm called robust deep deterministic policy gradient (RDDPG), which employs a deep neural network architecture and integrates techniques from the twin-delayed deep deterministic policy gradient (TD3) to enhance stability and learning efficiency. To evaluate the proposed algorithm, we implement it on an unmanned aerial vehicle (UAV) tasked with following a predefined path in a disturbance-prone environment. The experimental results demonstrate that the proposed method outperforms other control approaches in terms of robustness against disturbances, enabling precise real-time tracking of moving targets even under severe disturbance conditions.

NIOct 3, 2016
Source Accountability with Domain-brokered Privacy

Taeho Lee, Christos Pappas, David Barrera et al.

In an ideal network, every packet would be attributable to its sender, while host identities and transmitted content would remain private. Designing such a network is challenging because source accountability and communication privacy are typically viewed as conflicting properties. In this paper, we propose an architecture that guarantees source accountability and privacy-preserving communication by enlisting ISPs as accountability agents and privacy brokers. While ISPs can link every packet in their network to their customers, customer identity remains unknown to the rest of the Internet. In our architecture, network communication is based on Ephemeral Identifiers (EphIDs)---cryptographic tokens that can be linked to a source only by the source's ISP. We demonstrate that EphIDs can be generated and processed efficiently, and we analyze the practical considerations for deployment.

CRApr 28, 2016
RITM: Revocation in the Middle

Pawel Szalachowski, Laurent Chuat, Taeho Lee et al.

Although TLS is used on a daily basis by many critical applications, the public-key infrastructure that it relies on still lacks an adequate revocation mechanism. An ideal revocation mechanism should be inexpensive, efficient, secure, and privacy-preserving. Moreover, rising trends in pervasive encryption pose new scalability challenges that a modern revocation system should address. In this paper, we investigate how network nodes can deliver certificate-validity information to clients. We present RITM, a framework in which middleboxes (as opposed to clients, servers, or certification authorities) store revocation-related data. RITM provides a secure revocation-checking mechanism that preserves user privacy. We also propose to take advantage of content-delivery networks (CDNs) and argue that they would constitute a fast and cost-effective way to disseminate revocations. Additionally, RITM keeps certification authorities accountable for the revocations that they have issued, and it minimizes overhead at clients and servers, as they have to neither store nor download any messages. We also describe feasible deployment models and present an evaluation of RITM to demonstrate its feasibility and benefits in a real-world deployment.

CRSep 8, 2015
A Practical System for Guaranteed Access in the Presence of DDoS Attacks and Flash Crowds

Yi-Hsuan Kung, Taeho Lee, Po-Ning Tseng et al.

With the growing incidents of flash crowds and sophisticated DDoS attacks mimicking benign traffic, it becomes challenging to protect Internet-based services solely by differentiating attack traffic from legitimate traffic. While fair-sharing schemes are commonly suggested as a defense when differentiation is difficult, they alone may suffer from highly variable or even unbounded waiting times. We propose RainCheck Filter (RCF), a lightweight primitive that guarantees bounded waiting time for clients despite server flooding without keeping per-client state on the server. RCF achieves strong waiting time guarantees by prioritizing clients based on how long the clients have waited-as if the server maintained a queue in which the clients lined up waiting for service. To avoid keeping state for every incoming client request, the server sends to the client a raincheck, a timestamped cryptographic token that not only informs the client to retry later but also serves as a proof of the client's priority level within the virtual queue. We prove that every client complying with RCF can access the server in bounded time, even under a flash crowd incident or a DDoS attack. Our large-scale simulations confirm that RCF provides a small and predictable maximum waiting time while existing schemes cannot. To demonstrate its deployability, we implement RCF as a Python module such that web developers can protect a critical server resource by adding only three lines of code.