SPMar 13, 2012
On some properties of nonnegative weakly irreducible tensorsYuning Yang, Qingzhi Yang
In this paper, we mainly focus on how to generalize some conclusions from nonnegative irreducible tensors to nonnegative weakly irreducible tensors. To do so, a basic and important lemma is proven using new tools. First, we give the definition of stochastic tensors. Then we show that every nonnegative weakly irreducible tensor with spectral radius being one is diagonally similar to a unique weakly irreducible stochastic tensor. Based on it, we prove some important lemmas, which help us to generalize the results related. Some counterexamples are provided to show that some conclusions for nonnegative irreducible tensors do not hold for nonnegative weakly irreducible tensors.
LGJun 23, 2024Code
TimeAutoDiff: A Unified Framework for Generation, Imputation, Forecasting, and Time-Varying Metadata Conditioning of Heterogeneous Time Series Tabular DataNamjoon Suh, Yuning Yang, Din-Yin Hsieh et al.
We present TimeAutoDiff, a unified latent-diffusion framework for four fundamental time-series tasks: unconditional generation, missing-data imputation, forecasting, and time-varying-metadata conditional generation. The model natively supports heterogeneous features including continuous, binary, and categorical variables. We unify all tasks using a masked-modeling strategy in which a binary mask specifies which time-series cells are observed and which must be generated. TimeAutoDiff combines a lightweight variational autoencoder, which maps mixed-type features into a continuous latent sequence, with a diffusion model that learns temporal dynamics in this latent space. Two architectural choices provide strong speed and scalability benefits. The diffusion model samples an entire latent trajectory at once rather than denoising one timestep at a time, greatly reducing reverse-diffusion calls. In addition, the VAE compresses along the feature axis, enabling efficient modeling of wide tables in a low-dimensional latent space. Empirical evaluation shows that TimeAutoDiff matches or surpasses strong baselines in synthetic sequence fidelity and consistently improves imputation and forecasting performance. Metadata conditioning enables realistic scenario exploration, allowing users to edit metadata sequences and produce coherent counterfactual trajectories that preserve cross-feature dependencies. Ablation studies highlight the importance of the VAE's feature encoding and key components of the denoiser. A distance-to-closest-record audit further indicates that the model generalizes without excessive memorization. Code is available at https://github.com/namjoonsuh/TimeAutoDiff
CLNov 18, 2024Code
Value-Spectrum: Quantifying Preferences of Vision-Language Models via Value Decomposition in Social Media ContextsJingxuan Li, Yuning Yang, Shengqi Yang et al.
The recent progress in Vision-Language Models (VLMs) has broadened the scope of multimodal applications. However, evaluations often remain limited to functional tasks, neglecting abstract dimensions such as personality traits and human values. To address this gap, we introduce Value-Spectrum, a novel Visual Question Answering (VQA) benchmark aimed at assessing VLMs based on Schwartz's value dimensions that capture core human values guiding people's preferences and actions. We design a VLM agent pipeline to simulate video browsing and construct a vector database comprising over 50,000 short videos from TikTok, YouTube Shorts, and Instagram Reels. These videos span multiple months and cover diverse topics, including family, health, hobbies, society, technology, etc. Benchmarking on Value-Spectrum highlights notable variations in how VLMs handle value-oriented content. Beyond identifying VLMs' intrinsic preferences, we also explore the ability of VLM agents to adopt specific personas when explicitly prompted, revealing insights into the adaptability of the model in role-playing scenarios. These findings highlight the potential of Value-Spectrum as a comprehensive evaluation set for tracking VLM preferences in value-based tasks and abilities to simulate diverse personas. The complete code and data are available at: https://github.com/Jeremyyny/Value-Spectrum.
16.5NAMar 27
Improving Sketching Algorithms for Low-Rank Matrix Approximation via Sketch-Power IterationsChao Chang, Yuning Yang
Power iteration can improve the accuracy of randomized SVD, but requires multiple data passes, making it impractical in streaming or memory-constrained settings. We introduce a lightweight yet effective sketch-power iteration, allowing power-like iterations with only a single pass of the data, which can be incorporated into one-pass algorithms for low-rank approximation. As an example, we integrate the sketch-power iteration into a one-pass algorithm proposed by Tropp et al., and introduce strategies to reduce its storage cost. We establish meaningful error bounds: given a fixed storage budget, the sketch sizes derived from the bounds closely match the optimal ones observed in reality. This allows one to preselect reasonable parameters. Numerical experiments on both synthetic and real-world datasets indicate that, under the same storage constraints, applying one or two sketch-power iterations can substantially improve the approximation accuracy of the considered one-pass algorithms. In particular, experiments on real data with flat spectrum show that the method can approximate the dominant singular vectors well.
LGMay 15, 2024
SPD-CFL: Stepwise Parameter Dropout for Efficient Continual Federated LearningYuning Yang, Han Yu, Chuan Sun et al.
Federated Learning (FL) is a collaborative machine learning paradigm for training models on local sensitive data with privacy protection. Pre-trained transformer-based models have emerged as useful foundation models (FMs) to be fine-tuned for a wide range of downstream tasks. However, large-scale pre-trained models make it challenging for traditional FL due to high communication overhead in the resource-constrained IoT. This has inspired the field of parameter-efficient fine-tuning (PEFT) research. Existing PEFT methods attempt to optimize model performance at the given dropout level. Such an approach places the burden on human users to find a dropout rate that provides a satisfactory level of performance through trial-and-error, which is time consuming and resource intensive. To address this limitation, we propose the Step-wise Parameter Dropout for Continual Federated Learning (SPD-CFL) approach. Instead of pre-defining a desired dropout rate, it allows users to specify the target level of performance and then attempts to find the most suitable dropout rate for the given FL model. Specifically, on the server side, SPD-CFL drops trainable parameters in a stepwise manner to improve communication efficiency by reducing the rank of low-rank adaptation (LoRA). The sensitivity-based gradient consistency (SGC) measure is designed to facilitate the adaptive adjustment of parameter dropout. In addition, SPD-CFL introduces continual learning (CL) on the client side to mitigate performance degradation due to the inconsistent optima with distinct parameter dropout rates under heterogeneous FL. Extensive experiments on the public benchmark dataset CIFAR-10 and a real-world medical Face dataset demonstrate significant superiority of SPD-CFL over state-of-the-art methods. Compared to the best-performing baseline, it achieves a 2.07% higher test AUC while reducing communication overhead by 29.53%.
LGJun 23, 2025
Towards Group Fairness with Multiple Sensitive Attributes in Federated Foundation ModelsYuning Yang, Han Yu, Tianrun Gao et al.
The deep integration of foundation models (FM) with federated learning (FL) enhances personalization and scalability for diverse downstream tasks, making it crucial in sensitive domains like healthcare. Achieving group fairness has become an increasingly prominent issue in the era of federated foundation models (FFMs), since biases in sensitive attributes might lead to inequitable treatment for under-represented demographic groups. Existing studies mostly focus on achieving fairness with respect to a single sensitive attribute. This renders them unable to provide clear interpretability of dependencies among multiple sensitive attributes which is required to achieve group fairness. Our paper takes the first attempt towards a causal analysis of the relationship between group fairness across various sensitive attributes in the FFM. We extend the FFM structure to trade off multiple sensitive attributes simultaneously and quantify the causal effect behind the group fairness through causal discovery and inference. Extensive experiments validate its effectiveness, offering insights into interpretability towards building trustworthy and fair FFM systems.
NADec 5, 2020
Several Approximation Algorithms for Sparse Best Rank-1 Approximation to Higher-Order TensorsXianpeng Mao, Yuning Yang
Sparse tensor best rank-1 approximation (BR1Approx), which is a sparsity generalization of the dense tensor BR1Approx, and is a higher-order extension of the sparse matrix BR1Approx, is one of the most important problems in sparse tensor decomposition and related problems arising from statistics and machine learning. By exploiting the multilinearity as well as the sparsity structure of the problem, four approximation algorithms are proposed, which are easily implemented, of low computational complexity, and can serve as initial procedures for iterative algorithms. In addition, theoretically guaranteed worst-case approximation lower bounds are proved for all the algorithms. We provide numerical experiments on synthetic and real data to illustrate the effectiveness of the proposed algorithms.
MLMar 7, 2015
Higher order Matching Pursuit for Low Rank Tensor LearningYuning Yang, Siamak Mehrkanoon, Johan A. K. Suykens
Low rank tensor learning, such as tensor completion and multilinear multitask learning, has received much attention in recent years. In this paper, we propose higher order matching pursuit for low rank tensor learning problems with a convex or a nonconvex cost function, which is a generalization of the matching pursuit type methods. At each iteration, the main cost of the proposed methods is only to compute a rank-one tensor, which can be done efficiently, making the proposed methods scalable to large scale problems. Moreover, storing the resulting rank-one tensors is of low storage requirement, which can help to break the curse of dimensionality. The linear convergence rate of the proposed methods is established in various circumstances. Along with the main methods, we also provide a method of low computational complexity for approximately computing the rank-one tensors, with provable approximation ratio, which helps to improve the efficiency of the main methods and to analyze the convergence rate. Experimental results on synthetic as well as real datasets verify the efficiency and effectiveness of the proposed methods.