CLApr 2, 2024
HyperCLOVA X Technical ReportKang Min Yoo, Jaegeun Han, Sookyo In et al.
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
48.5LGApr 4
Delayed Homomorphic Reinforcement Learning for Environments with Delayed FeedbackJongsoo Lee, Jangwon Kim, Soohee Han
Reinforcement learning in real-world systems is often accompanied by delayed feedback, which breaks the Markov assumption and impedes both learning and control. Canonical state augmentation approaches cause the state-space explosion, which introduces a severe sample-complexity burden. Despite recent progress, the state-of-the-art augmentation-based baselines remain incomplete: they either predominantly reduce the burden on the critic or adopt non-unified treatments for the actor and critic. To provide a structured and sample-efficient solution, we propose delayed homomorphic reinforcement learning (DHRL), a framework grounded in MDP homomorphisms that collapses belief-equivalent augmented states and enables efficient policy learning on the resulting abstract MDP without loss of optimality. We provide theoretical analyses of state-space compression bounds and sample complexity, and introduce a practical algorithm. Experiments on continuous control tasks in MuJoCo benchmark confirm that our algorithm outperforms strong augmentation-based baselines, particularly under long delays.
LGJul 25, 2025
Reinforcement Learning via Conservative Agent for Environments with Random DelaysJongsoo Lee, Jangwon Kim, Jiseok Jeong et al.
Real-world reinforcement learning applications are often hindered by delayed feedback from environments, which violates the Markov assumption and introduces significant challenges. Although numerous delay-compensating methods have been proposed for environments with constant delays, environments with random delays remain largely unexplored due to their inherent variability and unpredictability. In this study, we propose a simple yet robust agent for decision-making under random delays, termed the conservative agent, which reformulates the random-delay environment into its constant-delay equivalent. This transformation enables any state-of-the-art constant-delay method to be directly extended to the random-delay environments without modifying the algorithmic structure or sacrificing performance. We evaluate the conservative agent-based algorithm on continuous control tasks, and empirical results demonstrate that it significantly outperforms existing baseline algorithms in terms of asymptotic performance and sample efficiency.
IVJul 5, 2025
EdgeSRIE: A hybrid deep learning framework for real-time speckle reduction and image enhancement on portable ultrasound systemsHyunwoo Cho, Jongsoo Lee, Jinbum Kang et al.
Speckle patterns in ultrasound images often obscure anatomical details, leading to diagnostic uncertainty. Recently, various deep learning (DL)-based techniques have been introduced to effectively suppress speckle; however, their high computational costs pose challenges for low-resource devices, such as portable ultrasound systems. To address this issue, EdgeSRIE, which is a lightweight hybrid DL framework for real-time speckle reduction and image enhancement in portable ultrasound imaging, is introduced. The proposed framework consists of two main branches: an unsupervised despeckling branch, which is trained by minimizing a loss function between speckled images, and a deblurring branch, which restores blurred images to sharp images. For hardware implementation, the trained network is quantized to 8-bit integer precision and deployed on a low-resource system-on-chip (SoC) with limited power consumption. In the performance evaluation with phantom and in vivo analyses, EdgeSRIE achieved the highest contrast-to-noise ratio (CNR) and average gradient magnitude (AGM) compared with the other baselines (different 2-rule-based methods and other 4-DL-based methods). Furthermore, EdgeSRIE enabled real-time inference at over 60 frames per second while satisfying computational requirements (< 20K parameters) on actual portable ultrasound hardware. These results demonstrated the feasibility of EdgeSRIE for real-time, high-quality ultrasound imaging in resource-limited environments.