Hao-Chung Cheng

QUANT-PH
h-index11
14papers
189citations
Novelty61%
AI Score59

14 Papers

QUANT-PHJun 4
Error Exponents for Quantum Packing Problems via An Operator Layer Cake Theorem

Hao-Chung Cheng, Po-Chieh Liu

In this work, we prove a one-shot random coding bound for classical-quantum channel coding, a problem conjectured by Burnashev and Holevo in 1998. By choosing the optimal input distribution, the bound implies the optimal error exponent (i.e., the reliability function) of classical-quantum channels for rates above the critical rate, even in infinite-dimensional Hilbert spaces. Our result extends to various quantum packing-type problems, including classical communication over any fully quantum channel with or without entanglement-assistance, constant composition codes, and classical data compression with quantum side information via fixed-length or variable-length coding. Our technical ingredient is to establish an operator layer cake theorem - the directional derivative of an operator logarithm admits an integral representation of certain projections. This shows that a kind of pretty-good measurement is equivalent to a randomized Holevo-Helstrom measurement, which provides an operational explanation of why the pretty-good measurement is pretty good.

QUANT-PHJun 4
Multiple Quantum Hypothesis Testing: One-Shot Pairwise Bounds and Sharp Asymptotics

Hao-Chung Cheng, Po-Chieh Liu

We consider Bayesian discrimination among multiple quantum states and establish a dimension-free one-shot upper bound on the minimum probability of error in terms of the sum of pairwise errors. This resolves a conjecture of Audenaert and Mosonyi [J. Math. Phys. 55 (2014)] and improves the multiple quantum Chernoff bound of Li [Ann. Statist. 44 (2016)] by removing its dimension-dependent prefactor. In the asymptotic many-copy regime, our bound proves the achievability of the multiple quantum Chernoff distance for arbitrary separable Hilbert spaces, thereby settling the previously open infinite-dimensional case, and further yields constant-factor sharp asymptotics for the optimal error probability. In binary quantum hypothesis testing, we prove that the minimum error probability is characterized, up to universal constants, by a trace harmonic-mean quantity. Consequently, the optimal binary quantum error probability is within a factor of two of the optimal classical error probability for the associated Nussbaum-Szkoła distributions, complementing the lower bound of Nussbaum and Szkoła [Ann. Statist. 37 (2009)].

SDMay 5Code
MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline

Fang-Duo Tsai, Yi-An Lai, Fei-Yueh Chen et al.

While end-to-end lyrics-to-song models offer convenience for casual users, professional songwriters require score-to-song systems that allow them to retain authorship over the core melody. However, existing score-to-song methods are limited to short-form snippets and fail to maintain coherence in long-form generation, particularly during vocal-silent sections like intros and bridges. To address this long-form bottleneck, we propose MIDI-informed singing accompaniment generation (MIDI-SAG). Unlike conventional audio-only models, MIDI-SAG utilizes symbolic timing and chord information derived from the vocal MIDI to provide a stable musical roadmap. By incorporating structure planning, which defines temporal boundaries and semantic labels, our framework facilitates consistent generation across both vocal and non-vocal sections. We demonstrate the feasibility of this compositional pipeline by leveraging specialized pre-trained modules, enabling data-efficient training on a single GPU. Our experiments show the potential of this approach for both professional score-to-song and general lyrics-to-song tasks. While an early exploration, MIDI-SAG suggests a promising direction for structured, long-form music synthesis. Audio demos are available, and the code will be open-sourced at https://composerflow.github.io/web_revealed/.

SDJul 21, 2024Code
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation

Yun-Han Lan, Wen-Yi Hsiao, Hao-Chung Cheng et al.

Existing text-to-music models can produce high-quality audio with great diversity. However, textual prompts alone cannot precisely control temporal musical features such as chords and rhythm of the generated music. To address this challenge, we introduce MusiConGen, a temporally-conditioned Transformer-based text-to-music model that builds upon the pretrained MusicGen framework. Our innovation lies in an efficient finetuning mechanism, tailored for consumer-grade GPUs, that integrates automatically-extracted rhythm and chords as the condition signal. During inference, the condition can either be musical features extracted from a reference audio signal, or be user-defined symbolic chord sequence, BPM, and textual prompts. Our performance evaluation on two datasets -- one derived from extracted features and the other from user-created inputs -- demonstrates that MusiConGen can generate realistic backing track music that aligns well with the specified conditions. We open-source the code and model checkpoints, and provide audio examples online, https://musicongen.github.io/musicongen_demo/.

MLOct 3, 2022
Online Self-Concordant and Relatively Smooth Minimization, With Applications to Online Portfolio Selection and Learning Quantum States

Chung-En Tsai, Hao-Chung Cheng, Yen-Huan Li

Consider an online convex optimization problem where the loss functions are self-concordant barriers, smooth relative to a convex function $h$, and possibly non-Lipschitz. We analyze the regret of online mirror descent with $h$. Then, based on the result, we prove the following in a unified manner. Denote by $T$ the time horizon and $d$ the parameter dimension. 1. For online portfolio selection, the regret of $\widetilde{\text{EG}}$, a variant of exponentiated gradient due to Helmbold et al., is $\tilde{O} ( T^{2/3} d^{1/3} )$ when $T > 4 d / \log d$. This improves on the original $\tilde{O} ( T^{3/4} d^{1/2} )$ regret bound for $\widetilde{\text{EG}}$. 2. For online portfolio selection, the regret of online mirror descent with the logarithmic barrier is $\tilde{O}(\sqrt{T d})$. The regret bound is the same as that of Soft-Bayes due to Orseau et al. up to logarithmic terms. 3. For online learning quantum states with the logarithmic loss, the regret of online mirror descent with the log-determinant function is also $\tilde{O} ( \sqrt{T d} )$. Its per-iteration time is shorter than all existing algorithms we know.

QUANT-PHApr 28
Distributed Quantum Hypothesis Testing under Zero-rate Communication Constraints

Sreejith Sreekumar, Christoph Hirche, Hao-Chung Cheng et al.

The trade-offs between error probabilities in quantum hypothesis testing are by now well-understood in the centralized setting, but much less is known for distributed settings. Here, we study a distributed binary hypothesis testing problem to infer a bipartite quantum state shared between two remote parties, where one of these parties communicates to the tester at (asymptotic) zero-rate, while the other party communicates to the tester at zero-rate or higher. As our main contribution, we derive an efficiently computable single-letter formula for the Stein's exponent of this problem, when the state under the alternative is the product of their marginals. For proving the converse direction of our result, we utilize a novel technique based on reverse hypercontractivity of a quantum markov semigroup combined with the pinching method. For the general case with vanishing type I error probability, we show that the Stein's exponent when (at least) one of the parties communicates classically at zero-rate is given by a multi-letter expression involving regularized measured relative entropy maximized over a sub-class of binary outcome separable measurements. When the state under the alternative commutes with the product of marginals under the null and has a larger support, we show that the exponent is characterized as a max-min optimization of regularized measured relative entropy over a class of local binary outcome projective measurements. While this expression becomes single-letter for the fully classical case, we further prove that this already does not happen in the same way for classical-quantum states in general. The converse proof of the max-min characterization relies on an extension of the classical blowing-up lemma to bipartite quantum states whose marginals commute, which could be of independent interest.

QUANT-PHNov 23, 2022
Faster Stochastic First-Order Method for Maximum-Likelihood Quantum State Tomography

Chung-En Tsai, Hao-Chung Cheng, Yen-Huan Li

In maximum-likelihood quantum state tomography, both the sample size and dimension grow exponentially with the number of qubits. It is therefore desirable to develop a stochastic first-order method, just like stochastic gradient descent for modern machine learning, to compute the maximum-likelihood estimate. To this end, we propose an algorithm called stochastic mirror descent with the Burg entropy. Its expected optimization error vanishes at a $O ( \sqrt{ ( 1 / t ) d \log t } )$ rate, where $d$ and $t$ denote the dimension and number of iterations, respectively. Its per-iteration time complexity is $O ( d^3 )$, independent of the sample size. To the best of our knowledge, this is currently the computationally fastest stochastic first-order method for maximum-likelihood quantum state tomography.

SDJul 23, 2024
Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning

Fang-Duo Tsai, Shih-Lun Wu, Haven Kim et al.

Text-to-music models allow users to generate nearly realistic musical audio with textual commands. However, editing music audios remains challenging due to the conflicting desiderata of performing fine-grained alterations on the audio while maintaining a simple user interface. To address this challenge, we propose Audio Prompt Adapter (or AP-Adapter), a lightweight addition to pretrained text-to-music models. We utilize AudioMAE to extract features from the input audio, and construct attention-based adapters to feedthese features into the internal layers of AudioLDM2, a diffusion-based text-to-music model. With 22M trainable parameters, AP-Adapter empowers users to harness both global (e.g., genre and timbre) and local (e.g., melody) aspects of music, using the original audio and a short text as inputs. Through objective and subjective studies, we evaluate AP-Adapter on three tasks: timbre transfer, genre transfer, and accompaniment generation. Additionally, we demonstrate its effectiveness on out-of-domain audios containing unseen instruments during training.

OCNov 5, 2023
Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging

Chung-En Tsai, Hao-Chung Cheng, Yen-Huan Li

Consider the problem of minimizing an expected logarithmic loss over either the probability simplex or the set of quantum density matrices. This problem includes tasks such as solving the Poisson inverse problem, computing the maximum-likelihood estimate for quantum state tomography, and approximating positive semi-definite matrix permanents with the currently tightest approximation ratio. Although the optimization problem is convex, standard iteration complexity guarantees for first-order methods do not directly apply due to the absence of Lipschitz continuity and smoothness in the loss function. In this work, we propose a stochastic first-order algorithm named $B$-sample stochastic dual averaging with the logarithmic barrier. For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\smash{\tilde{O}}(d^2/\varepsilon^2)$ time, matching the state of the art, where $d$ denotes the dimension. When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\smash{\tilde{O}}(d^3/\varepsilon^2)$ time. This improves on the time complexities of existing stochastic first-order methods by a factor of $d^{ω-2}$ and those of batch methods by a factor of $d^2$, where $ω$ denotes the matrix multiplication exponent. Numerical experiments demonstrate that empirically, our algorithm outperforms existing methods with explicit complexity guarantees.

QUANT-PHMar 26, 2024
An invitation to the sample complexity of quantum hypothesis testing

Hao-Chung Cheng, Nilanjana Datta, Nana Liu et al.

Quantum hypothesis testing (QHT) has been traditionally studied from the information-theoretic perspective, wherein one is interested in the optimal decay rate of error probabilities as a function of the number of samples of an unknown state. In this paper, we study the sample complexity of QHT, wherein the goal is to determine the minimum number of samples needed to reach a desired error probability. By making use of the wealth of knowledge that already exists in the literature on QHT, we characterize the sample complexity of binary QHT in the symmetric and asymmetric settings, and we provide bounds on the sample complexity of multiple QHT. In more detail, we prove that the sample complexity of symmetric binary QHT depends logarithmically on the inverse error probability and inversely on the negative logarithm of the fidelity. As a counterpart of the quantum Stein's lemma, we also find that the sample complexity of asymmetric binary QHT depends logarithmically on the inverse type II error probability and inversely on the quantum relative entropy, provided that the type II error probability is sufficiently small. We then provide lower and upper bounds on the sample complexity of multiple QHT, with it remaining an intriguing open question to improve these bounds. The final part of our paper outlines and reviews how sample complexity of QHT is relevant to a broad swathe of research areas and can enhance understanding of many fundamental concepts, including quantum algorithms for simulation and search, quantum learning and classification, and foundations of quantum mechanics. As such, we view our paper as an invitation to researchers coming from different communities to study and contribute to the problem of sample complexity of QHT, and we outline a number of open directions for future research.

SDJun 23, 2025
MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners

Fang-Duo Tsai, Shih-Lun Wu, Weijaw Lee et al.

We propose MuseControlLite, a lightweight mechanism designed to fine-tune text-to-music generation models for precise conditioning using various time-varying musical attributes and reference audio signals. The key finding is that positional embeddings, which have been seldom used by text-to-music generation models in the conditioner for text conditions, are critical when the condition of interest is a function of time. Using melody control as an example, our experiments show that simply adding rotary positional embeddings to the decoupled cross-attention layers increases control accuracy from 56.6% to 61.1%, while requiring 6.75 times fewer trainable parameters than state-of-the-art fine-tuning mechanisms, using the same pre-trained diffusion Transformer model of Stable Audio Open. We evaluate various forms of musical attribute control, audio inpainting, and audio outpainting, demonstrating improved controllability over MusicGen-Large and Stable Audio Open ControlNet at a significantly lower fine-tuning cost, with only 85M trainble parameters. Source code, model checkpoints, and demo examples are available at: https://musecontrollite.github.io/web/.

QUANT-PHFeb 23, 2022
Optimal Second-Order Rates for Quantum Soft Covering and Privacy Amplification

Yu-Chen Shen, Li Gao, Hao-Chung Cheng

We study quantum soft covering and privacy amplification against quantum side information. The former task aims to approximate a quantum state by sampling from a prior distribution and querying a quantum channel. The latter task aims to extract uniform and independent randomness against quantum adversaries. For both tasks, we use trace distance to measure the closeness between the processed state and the ideal target state. We show that the minimal amount of samples for achieving an $\varepsilon$-covering is given by the $(1-\varepsilon)$-hypothesis testing information (with additional logarithmic additive terms), while the maximal extractable randomness for an $\varepsilon$-secret extractor is characterized by the conditional $(1-\varepsilon)$-hypothesis testing entropy. When performing independent and identical repetitions of the tasks, our one-shot characterizations lead to tight asymptotic expansions of the above-mentioned operational quantities. We establish their second-order rates given by the quantum mutual information variance and the quantum conditional information variance, respectively. Moreover, our results extend to the moderate deviation regime, which are the optimal asymptotic rates when the trace distances vanish at sub-exponential speed. Our proof technique is direct analysis of trace distance without smoothing.

QUANT-PHFeb 21, 2022
Strong Converse for Privacy Amplification against Quantum Side Information

Yu-Chen Shen, Li Gao, Hao-Chung Cheng

We establish a one-shot strong converse bound for privacy amplification against quantum side information using trace distance as a security criterion. This strong converse bound implies that in the independent and identical scenario, the trace distance exponentially converges to one in every finite blocklength when the rate of the extracted randomness exceeds the quantum conditional entropy. The established one-shot bound has an application to bounding the information leakage of classical-quantum wiretap channel coding and private communication over quantum channels. That is, the trace distance between Alice and Eavesdropper's joint state and its decoupled state vanishes as the rate of randomness used in hashing exceeds the quantum mutual information. On the other hand, the trace distance converges to one when the rate is below the quantum mutual information, resulting in an exponential strong converse. Our result also leads to an exponential strong converse for entropy accumulation, which complements a recent result by Dupuis [arXiv:2105.05342]. Lastly, our result and its applications apply to the moderate deviation regime. Namely, we characterize the asymptotic behaviors of the trace distances when the associated rates approach the fundamental thresholds with speeds slower than $O(1/\sqrt{n})$.

QUANT-PHJan 3, 2015
The Learnability of Unknown Quantum Measurements

Hao-Chung Cheng, Min-Hsiu Hsieh, Ping-Cheng Yeh

Quantum machine learning has received significant attention in recent years, and promising progress has been made in the development of quantum algorithms to speed up traditional machine learning tasks. In this work, however, we focus on investigating the information-theoretic upper bounds of sample complexity - how many training samples are sufficient to predict the future behaviour of an unknown target function. This kind of problem is, arguably, one of the most fundamental problems in statistical learning theory and the bounds for practical settings can be completely characterised by a simple measure of complexity. Our main result in the paper is that, for learning an unknown quantum measurement, the upper bound, given by the fat-shattering dimension, is linearly proportional to the dimension of the underlying Hilbert space. Learning an unknown quantum state becomes a dual problem to ours, and as a byproduct, we can recover Aaronson's famous result [Proc. R. Soc. A 463:3089-3144 (2007)] solely using a classical machine learning technique. In addition, other famous complexity measures like covering numbers and Rademacher complexities are derived explicitly. We are able to connect measures of sample complexity with various areas in quantum information science, e.g. quantum state/measurement tomography, quantum state discrimination and quantum random access codes, which may be of independent interest. Lastly, with the assistance of general Bloch-sphere representation, we show that learning quantum measurements/states can be mathematically formulated as a neural network. Consequently, classical ML algorithms can be applied to efficiently accomplish the two quantum learning tasks.