Xiao Li

h-index 35

4 papers

494 citations

Novelty 51%

AI Score 33

Ranked #117,030 of 194,257 authors (top 60%); #7,164 in AI (top 57%)

4 Papers

32.0 SD Oct 27, 2022 Code

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Jingyi Li, Weiping Tu, Li Xiao

Voice conversion (VC) can be achieved by first extracting source content information and target speaker information, and then reconstructing the waveform from this information. However, current approaches typically either extract dirty content information with speaker information leaked in, or demand a large amount of annotated data for training. Moreover, the quality of the reconstructed waveform can be degraded by the mismatch between the conversion model and the vocoder. In this paper, we adopt the end-to-end framework of VITS for high-quality waveform reconstruction, and propose strategies for clean content information extraction without text annotation. We disentangle content information by imposing an information bottleneck on WavLM features, and propose spectrogram-resize based data augmentation to improve the purity of the extracted content information. Experimental results show that the proposed method outperforms the latest VC models trained with annotated data and has greater robustness.

5.6 OC Jun 30, 2022

Randomized Coordinate Subgradient Method for Nonsmooth Composite Optimization

Lei Zhao, Ding Chen, Daoli Zhu et al.

Coordinate-type subgradient methods for addressing nonsmooth optimization problems are relatively underexplored due to the set-valued nature of the subdifferential. In this work, we focus on nonsmooth composite optimization problems, encompassing a wide class of convex and weakly convex (nonconvex nonsmooth) problems. By utilizing the chain rule of the composite structure properly, we introduce the Randomized Coordinate Subgradient method (RCS) for tackling this problem class. To the best of our knowledge, this is the first coordinate subgradient method for solving general nonsmooth composite optimization problems. In theory, we consider the linearly bounded subgradients assumption for the objective function, which is more general than the traditional Lipschitz continuity assumption, to account for practical scenarios. We then conduct convergence analysis for RCS in both convex and weakly convex cases based on this generalized Lipschitz-type assumption. Specifically, we establish the $\widetilde{\mathcal{O}}(1/\sqrt{k})$ convergence rate in expectation and the $\tilde o(1/\sqrt{k})$ almost sure asymptotic convergence rate in terms of the suboptimality gap when $f$ is convex. For the case when $f$ is weakly convex and its subdifferential satisfies the global metric subregularity property, we derive the $\mathcal{O}(\varepsilon^{-4})$ iteration complexity in expectation. We also establish an asymptotic convergence result. To justify the global metric subregularity property utilized in the analysis, we establish this error bound condition for the concrete (real-valued) robust phase retrieval problem. We also provide a convergence lemma and the relationship between the global metric subregularity properties of a weakly convex function and its Moreau envelope. Finally, we conduct several experiments to demonstrate the possible superiority of RCS over the subgradient method.
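To make the idea of a coordinate subgradient step concrete, here is a minimal sketch on one convex composite problem, minimizing $\|Ax - b\|_1$. It is an illustration only and not necessarily the paper's RCS: the function names (`rcs_l1`, `objective`), the uniform coordinate sampling, the step size `c / sqrt(k + 1)`, and the choice `sign(0) = 0` are all assumptions made for this example. The partial subgradient for coordinate `i` is obtained through the chain rule as the `i`-th entry of $A^\top \operatorname{sign}(Ax - b)$.

```python
import random


def sign(v):
    """Sign of v, with sign(0) = 0 (one valid subgradient choice for |.|)."""
    return (v > 0) - (v < 0)


def objective(A, b, x):
    """l1 residual ||A x - b||_1 for A given as a list of rows."""
    return sum(abs(sum(a * xi for a, xi in zip(row, x)) - bj)
               for row, bj in zip(A, b))


def rcs_l1(A, b, x0, iters=5000, c=0.5, seed=0):
    """Randomized coordinate subgradient sketch for min_x ||A x - b||_1.

    Hypothetical helper: sampling rule and step size are illustrative choices.
    """
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    # Residual r = A x - b, kept up to date after every coordinate move.
    r = [sum(a * xi for a, xi in zip(row, x)) - bj for row, bj in zip(A, b)]
    for k in range(iters):
        i = rng.randrange(n)  # pick one coordinate uniformly at random
        # Chain rule: i-th entry of A^T sign(r) is a partial subgradient.
        g_i = sum(row[i] * sign(rj) for row, rj in zip(A, r))
        delta = -(c / (k + 1) ** 0.5) * g_i  # diminishing step size
        x[i] += delta
        for j, row in enumerate(A):
            r[j] += row[i] * delta  # only column i of A is touched
    return x


A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, -2.0, -1.0]  # consistent system with solution (1, -2)
x = rcs_l1(A, b, [0.0, 0.0])
print(objective(A, b, [0.0, 0.0]), objective(A, b, x))
```

Each iteration reads and updates a single column of `A`, which is the usual motivation for coordinate-type methods when full subgradients are expensive.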

12.3 LG Oct 24, 2023 Code

Neural Collapse in Multi-label Learning with Pick-all-label Loss

Pengyu Li, Xiao Li, Yutong Wang et al.

We study deep neural networks for the multi-label classification (MLab) task through the lens of neural collapse (NC). Previous works have been restricted to the multi-class classification setting and discovered a prevalent NC phenomenon comprising the following properties for the last-layer features: (i) the variability of features within every class collapses to zero, (ii) the set of feature means forms an equi-angular tight frame (ETF), and (iii) the last-layer classifiers collapse to the feature means up to some scaling. We generalize the study to multi-label learning, and prove for the first time that a generalized NC phenomenon holds with the "pick-all-label" formulation, which we term MLab NC. While the ETF geometry remains consistent for features with a single label, multi-label scenarios introduce a unique combinatorial aspect we term the "tag-wise average" property, where the means of features with multiple labels are the scaled averages of means for single-label instances. Theoretically, under proper assumptions on the features, we establish that the only global optimizer of the pick-all-label cross-entropy loss satisfies the multi-label NC. In practice, we demonstrate that our findings can lead to better test performance with more efficient training techniques for MLab learning.
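The ETF geometry and the "tag-wise average" property mentioned above can be illustrated with a small sketch. It builds the standard simplex ETF, $\sqrt{K/(K-1)}\,(I - \tfrac{1}{K}\mathbf{1}\mathbf{1}^\top)$, whose $K$ vectors have unit norm and pairwise inner product $-1/(K-1)$. The helper names (`simplex_etf`, `tagwise_average`) are hypothetical, and the scaling factor that the abstract attaches to multi-label means is left out because the abstract does not specify it.

```python
import math


def simplex_etf(K):
    """K unit vectors in R^K forming a simplex ETF (one vector per row)."""
    scale = math.sqrt(K / (K - 1))
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / K) for j in range(K)]
            for i in range(K)]


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def tagwise_average(means, labels):
    """Unscaled average of the single-label means for a set of labels."""
    dim = len(means[0])
    return [sum(means[k][j] for k in labels) / len(labels) for j in range(dim)]


K = 4
means = simplex_etf(K)
print(dot(means[0], means[0]))  # squared norm of one class mean
print(dot(means[0], means[1]))  # inner product between two class means
print(tagwise_average(means, [0, 1]))  # direction for label set {0, 1}
```

Under this construction every pair of class means subtends the same angle, which is the "equi-angular" part of the ETF description.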

36.2 AI Dec 11, 2016

Reinforcement Learning With Temporal Logic Rewards

Xiao Li, Cristian-Ioan Vasile, Calin Belta

Reinforcement learning (RL) depends critically on the choice of reward functions used to capture the desired behavior and constraints of a robot. Usually, these are handcrafted by an expert designer and represent heuristics for relatively simple tasks. Real-world applications typically involve more complex tasks with rich temporal and logical structure. In this paper we take advantage of the expressive power of temporal logic (TL) to specify complex rules the robot should follow, and incorporate domain knowledge into learning. We propose Truncated Linear Temporal Logic (TLTL) as a specification language that is arguably well suited for robotics applications, together with quantitative semantics, i.e., robustness degree. We propose an RL approach to learn tasks expressed as TLTL formulae that uses their associated robustness degree as reward functions, instead of manually crafted heuristics trying to capture the same specifications. We show in simulated trials that learning is faster and policies obtained using the proposed approach outperform the ones learned using heuristic rewards in terms of the robustness degree, i.e., how well the tasks are satisfied. Furthermore, we demonstrate the proposed RL approach in a toast-placing task learned by a Baxter robot.
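As a rough illustration of using a robustness degree as a reward, the sketch below scores a one-dimensional trajectory against the specification "eventually reach s > 0.9 and always keep s < 1.5". It assumes the common min/max quantitative semantics for temporal logic (conjunction and "always" as a minimum, "eventually" as a maximum) and does not reproduce the exact TLTL definitions; all function names and thresholds are hypothetical.

```python
def rob_greater_than(traj, c):
    """Per-step robustness of the predicate s_t > c (positive iff satisfied)."""
    return [s - c for s in traj]


def rob_less_than(traj, c):
    """Per-step robustness of the predicate s_t < c (positive iff satisfied)."""
    return [c - s for s in traj]


def eventually(per_step):
    """'Eventually' over the whole trajectory: best step counts."""
    return max(per_step)


def always(per_step):
    """'Always' over the whole trajectory: worst step counts."""
    return min(per_step)


def conjunction(*values):
    """Logical AND: the weakest sub-formula determines the robustness."""
    return min(values)


def reward(traj):
    """Robustness of: eventually (s > 0.9) AND always (s < 1.5)."""
    return conjunction(eventually(rob_greater_than(traj, 0.9)),
                       always(rob_less_than(traj, 1.5)))


print(reward([0.0, 0.5, 1.0]))  # goal reached, bound respected: positive
print(reward([0.0, 0.5, 2.0]))  # bound violated at the last step: negative
```

A positive value means the trajectory satisfies the specification, and its magnitude indicates the margin, which is what makes the score usable as a graded reward signal.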