SD Aug 28, 2023
Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech
Hyungchan Yoon, Changhwan Kim, Eunwoo Song et al.
For personalized speech generation, a neural text-to-speech (TTS) model must be successfully implemented with limited data from a target speaker. To this end, the baseline TTS model needs to generalize well to out-of-domain data (i.e., the target speaker's speech). However, approaches to address this out-of-domain generalization problem in TTS have yet to be thoroughly studied. In this work, we propose an effective pruning method for transformers, known as sparse attention, to improve the TTS model's generalization abilities. In particular, we prune redundant connections from self-attention layers whose attention weights are below a threshold. To flexibly determine the pruning strength when searching for the optimal degree of generalization, we also propose a new differentiable pruning method that allows the model to learn the thresholds automatically. Evaluations on zero-shot multi-speaker TTS verify the effectiveness of our method in terms of voice quality and speaker similarity.
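The threshold-based pruning described in this abstract can be illustrated with a minimal sketch. It assumes a single attention row, a fixed hand-picked threshold, and renormalization of the surviving weights; none of these details come from the abstract, and the differentiable, learned-threshold variant is not shown. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over one row of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def prune_attention(weights, threshold):
    """Zero out weights below `threshold`, then renormalize the rest.

    Renormalization is an assumption made for this sketch.
    """
    kept = [w if w >= threshold else 0.0 for w in weights]
    total = sum(kept)
    if total == 0.0:
        # Nothing survived the threshold: fall back to the unpruned row.
        return list(weights)
    return [w / total for w in kept]

weights = softmax([2.0, 1.0, -1.0, -3.0])
pruned = prune_attention(weights, threshold=0.05)
```

With these scores, the two smallest weights fall below 0.05 and are removed, so the remaining two connections share the full attention mass.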
CL Nov 7, 2023
Improving Korean NLP Tasks with Linguistically Informed Subword Tokenization and Sub-character Decomposition
Taehee Jeon, Bongseok Yang, Changhwan Kim et al.
We introduce a morpheme-aware subword tokenization method that utilizes sub-character decomposition to address the challenges of applying Byte Pair Encoding (BPE) to Korean, a language characterized by its rich morphology and unique writing system. Our approach balances linguistic accuracy with computational efficiency in Pre-trained Language Models (PLMs). Our evaluations show that this technique performs well overall, notably improving results on the syntactic task of NIKL-CoLA. This suggests that integrating morpheme type information can enhance language models' syntactic and semantic capabilities, indicating that adopting more linguistic insights can further improve performance beyond standard morphological analysis.
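Sub-character decomposition of Korean text can be sketched with the standard Unicode arithmetic for precomposed Hangul syllables. This is only an illustration of splitting syllables into conjoining jamo; the abstract does not specify the exact decomposition scheme or how it is combined with morpheme-aware tokenization, so the function below is a hypothetical helper, not the authors' tokenizer.

```python
import unicodedata

HANGUL_BASE = 0xAC00   # first precomposed syllable, U+AC00
HANGUL_COUNT = 11172   # 19 leads * 21 vowels * 28 tails (including "no tail")
LEAD_BASE, VOWEL_BASE, TAIL_BASE = 0x1100, 0x1161, 0x11A7

def decompose(text):
    """Split each Hangul syllable into lead, vowel, and optional tail jamo."""
    out = []
    for ch in text:
        index = ord(ch) - HANGUL_BASE
        if 0 <= index < HANGUL_COUNT:
            lead, rest = divmod(index, 21 * 28)
            vowel, tail = divmod(rest, 28)
            out.append(chr(LEAD_BASE + lead))
            out.append(chr(VOWEL_BASE + vowel))
            if tail:  # tail index 0 means the syllable has no final consonant
                out.append(chr(TAIL_BASE + tail))
        else:
            out.append(ch)  # non-Hangul characters pass through unchanged
    return "".join(out)

jamo = decompose("한글")
# Cross-check against the canonical decomposition from the standard library.
agrees_with_nfd = jamo == unicodedata.normalize("NFD", "한글")
```

A subword algorithm such as BPE could then be trained on the decomposed sequence instead of on whole syllables.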
RO Sep 30, 2021
Coordination of two robotic manipulators for object retrieval in clutter
Jeeho Ahn, ChangHwan Kim, Changjoo Nam
We consider the problem of retrieving a target object from a confined space by two robotic manipulators where overhand grasps are not allowed. If other movable obstacles occlude the target, more than one object should be relocated to clear the path to reach the target object. With two robots, the relocation could be done efficiently by performing relocation tasks simultaneously. However, the precedence constraint between the tasks (e.g., some objects at the front should be removed to manipulate the objects in the back) makes simultaneous task execution difficult. We propose a coordination method that determines which robot relocates which object so as to perform tasks simultaneously. Given a set of objects to be relocated, the objective is to maximize the number of turn-takings of the robots in performing relocation tasks. Thus, one robot can pick an object in the clutter while the other robot places an object in hand outside the clutter. However, the object to be relocated may not be accessible to all robots, so taking turns cannot always be achieved. Our method is based on optimal uniform-cost search, so the number of turn-takings is provably maximized. We also propose a greedy variant whose computation time is shorter. From experiments, we show that our method reduces the completion time of the mission by at least 22.9% (and at most 27.3%) compared to methods that do not consider turn-taking.
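The turn-taking objective can be made concrete with a small sketch. It assumes the relocation order is already fixed and that each object comes with the set of robots able to reach it, and it uses exhaustive enumeration rather than the uniform-cost search or the greedy variant from the abstract. The robot labels and function names are hypothetical.

```python
from itertools import product

def count_turn_takings(assignment):
    """Number of consecutive relocations handled by different robots."""
    return sum(1 for a, b in zip(assignment, assignment[1:]) if a != b)

def best_assignment(accessible):
    """Brute-force the robot-per-object assignment with the most turn-takings.

    accessible[i] is the set of robots that can reach the i-th object
    in the given relocation order (an assumption of this sketch).
    """
    best = None
    for candidate in product(*[sorted(robots) for robots in accessible]):
        score = count_turn_takings(candidate)
        if best is None or score > best[0]:
            best = (score, candidate)
    return best

# The second object is reachable only by R1 and the fourth only by R2.
accessible = [{"R1", "R2"}, {"R1"}, {"R1", "R2"}, {"R2"}]
score, plan = best_assignment(accessible)
```

In this instance the accessibility constraints rule out a fully alternating plan, which is the situation the abstract refers to when it notes that taking turns cannot always be achieved.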
LG Aug 20, 2020
BOIL: Towards Representation Change for Few-shot Learning
Jaehoon Oh, Hyungjun Yoo, ChangHwan Kim et al.
Model Agnostic Meta-Learning (MAML) is one of the most representative gradient-based meta-learning algorithms. MAML learns new tasks with a few data samples using inner updates from a meta-initialization point and learns the meta-initialization parameters with outer updates. It has recently been hypothesized that representation reuse, which makes little change in efficient representations, is the dominant factor in the performance of the meta-initialized model through MAML, in contrast to representation change, which causes a significant change in representations. In this study, we investigate the necessity of representation change for the ultimate goal of few-shot learning, which is solving domain-agnostic tasks. To this end, we propose a novel meta-learning algorithm, called BOIL (Body Only update in Inner Loop), which updates only the body (extractor) of the model and freezes the head (classifier) during inner loop updates. BOIL leverages representation change rather than representation reuse, because feature vectors (representations) have to move quickly to their corresponding frozen head vectors. We visualize this property using cosine similarity, CKA, and empirical results without the head. BOIL empirically shows significant performance improvement over MAML, particularly on cross-domain tasks. The results imply that representation change in gradient-based meta-learning approaches is a critical component.
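The body-only inner update can be sketched on a toy model. The example assumes a scalar body and a scalar head with prediction `head * body * x`, a squared-error loss, and plain gradient descent; it covers only the inner loop and omits the outer meta-update. The model, learning rate, and function name are illustrative assumptions, not the paper's setup.

```python
def inner_loop_boil(body, head, support, lr=0.1, steps=1):
    """Adapt only the body on the support set; the head stays frozen."""
    for _ in range(steps):
        grad_body = 0.0
        for x, y in support:
            err = head * body * x - y
            # d/d(body) of (head * body * x - y)^2
            grad_body += 2.0 * err * head * x
        grad_body /= len(support)
        body -= lr * grad_body
        # No gradient step is applied to `head` in the inner loop.
    return body, head

support = [(1.0, 2.0)]
new_body, new_head = inner_loop_boil(body=1.0, head=1.0, support=support)
```

Because the head cannot move, all task adaptation has to come from changing the features the body produces.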
RO Mar 24, 2020
Where to relocate?: Object rearrangement inside cluttered and confined environments for robotic manipulation
Sang Hun Cheong, Brian Y. Cho, Jinhwi Lee et al.
We present an algorithm that determines where to relocate objects inside a cluttered and confined space while rearranging objects to retrieve a target object. Although methods that decide what to remove have been proposed, planning for the placement of removed objects inside a workspace has not received much attention. Rather, removed objects are often placed outside the workspace, which incurs additional laborious work (e.g., motion planning and execution of the manipulator and the mobile base, perception of other areas). Some other methods manipulate objects only inside the workspace but without a guiding principle, so the rearrangement becomes inefficient. In this work, we consider both monotone (each object is moved only once) and non-monotone arrangement problems, which have been shown to be NP-hard. Once the sequence of objects to be relocated is given by any existing algorithm, our method aims to minimize the number of pick-and-place actions needed to place the objects until the target becomes accessible. From extensive experiments, we show that our method reduces the number of pick-and-place actions and the total execution time (by up to 23.1% and 28.1%, respectively) compared to baseline methods while achieving higher success rates.
RO Mar 24, 2020
Fast and resilient manipulation planning for target retrieval in clutter
Changjoo Nam, Jinhwi Lee, Sang Hun Cheong et al.
This paper presents a task and motion planning (TAMP) framework for a robotic manipulator to retrieve a target object from clutter. We consider a configuration of objects in a confined space with a high density, so no collision-free path to the target exists. The robot must relocate some objects to retrieve the target without collisions. For fast completion of object rearrangement, the robot aims to optimize the number of pick-and-place actions, which often determines the efficiency of a TAMP framework. We propose a task planner incorporating motion planning to generate executable plans that aim to minimize the number of pick-and-place actions. In addition to fully known and static environments, our method can deal with uncertain and dynamic situations incurred by occluded views. Our method is shown to reduce the number of pick-and-place actions compared to baseline methods (e.g., a reduction of at least 28.0% in a known static environment with 20 objects).
RO Jul 9, 2019
Planning for target retrieval using a robotic manipulator in cluttered and occluded environments
Changjoo Nam, Jinhwi Lee, Younggil Cho et al.
This paper presents planning algorithms for a robotic manipulator with a fixed base to grasp a target object in cluttered environments. We consider a configuration of objects in a confined space with a high density, so no collision-free path to the target exists. The robot must relocate some objects to retrieve the target while avoiding collisions. For fast completion of the retrieval task, the robot needs to compute a plan optimizing an appropriate objective value directly related to the execution time of the relocation plan. We propose planning algorithms that aim to minimize the number of objects to be relocated. Our objective value is appropriate for the object retrieval task because grasping and releasing objects often dominate the total running time. In addition to the algorithm working in fully known and static environments, we propose algorithms that can deal with uncertain and dynamic situations incurred by occluded views. The proposed algorithms are shown to be complete and run in polynomial time. Our methods reduce the total running time significantly compared to a baseline method (e.g., a 25.1% reduction in a known static environment with 10 objects).
RO Feb 19, 2019
Efficient Obstacle Rearrangement for Object Manipulation Tasks in Cluttered Environments
Jinhwi Lee, Younggil Cho, Changjoo Nam et al.
We present an algorithm that produces a plan for relocating obstacles so that a robotic manipulator can grasp a target in clutter without collisions. We consider configurations where objects are densely populated in a constrained and confined space. Thus, there exists no collision-free path for the manipulator without relocating obstacles. Since the problem of planning for object rearrangement has been shown to be NP-hard, it is difficult to efficiently perform manipulation tasks that occur frequently in service domains (e.g., taking out a target from a shelf or a fridge). Our proposed planner employs a collision avoidance scheme that has been widely used in mobile robot navigation. The planner quickly determines an obstacle to be removed in real time. It can also deal with dynamic changes in the configuration (e.g., changes in object poses). Our method is shown to be complete and to run in polynomial time. Experimental results in a realistic simulated environment show that our method reduces the execution time by up to 31% compared to competing methods.