Likun Zhang

CV
h-index: 2
Papers: 9
Citations: 25
Novelty: 53%
AI Score: 42

9 Papers

ML Jul 16, 2023
Flexible and efficient emulation of spatial extremes processes via variational autoencoders

Likun Zhang, Xiaoyu Ma, Christopher K. Wikle et al.

Many real-world processes have complex tail dependence structures that cannot be characterized using classical Gaussian processes. More flexible spatial extremes models exhibit appealing extremal dependence properties but are often prohibitively expensive to fit and simulate from in high dimensions. In this paper, we aim to push the boundaries on computation and modeling of high-dimensional spatial extremes by integrating a new spatial extremes model with flexible, non-stationary dependence properties into the encoding-decoding structure of a variational autoencoder, which we call the XVAE. The XVAE can emulate spatial observations and produce outputs that have the same statistical properties as the inputs, especially in the tail. Our approach also provides a novel way of making fast inference with complex extreme-value processes. Through extensive simulation studies, we show that our XVAE is substantially more time-efficient than traditional Bayesian inference while outperforming many spatial extremes models with a stationary dependence structure. Lastly, we analyze a high-resolution satellite-derived dataset of sea surface temperature in the Red Sea, which includes 30 years of daily measurements at 16703 grid cells. We demonstrate how to use XVAE to identify regions susceptible to marine heatwaves under climate change and examine the spatial and temporal variability of the extremal dependence structure.

DC Apr 8
Nexus: Transparent I/O Offloading for High-Density Serverless Computing

JooYoung Park, Kevin Nguetchouang, Jovan Stojkovic et al.

Serverless computing relies on extreme multi-tenancy to remain economically viable, driving providers to rely on virtual machines (VMs) that ensure strong isolation and seamless ecosystem compatibility with the FaaS programming model. However, current architectures tightly couple application processing logic with I/O processing, forcing every VM to duplicate a heavy communication fabric (cloud SDK, RPC, and TCP/IP). Our analysis reveals this duplication consumes over 25% of a function's memory footprint, and may double the CPU cycles in VMs compared to bare-metal execution. While prior systems attempt to solve this using WebAssembly or library OSes, they naively sacrifice ecosystem compatibility, forcing developers to migrate code and dependencies to new languages. We introduce Nexus, a serverless-native KVM-based hypervisor that transparently decouples compute from I/O. Nexus shifts the execution model by intercepting communication fabric at the API boundary and offloading it to an always-on host shared backend via zero-copy shared memory. This removes the heavyweight communication fabric from the guest VM, while preserving the conventional serverless programming model. By structurally separating these domains, Nexus unlocks asynchronous I/O optimizations: overlapping input payload prefetching with VM restoration from a snapshot and writing output payloads back to storage off the critical path. Compared to the production baseline, Nexus reduces overall node-level CPU and memory consumption by up to 44% and 31%, respectively, thus increasing deployment density by 37%. Also, Nexus reduces warm- and cold-start latency by 39% and 10%, respectively, bringing the response time within 20% of that of a WASM-based, ecosystem-incompatible hypervisor.

LG Sep 4, 2024
CoAst: Validation-Free Contribution Assessment for Federated Learning based on Cross-Round Valuation

Hao Wu, Likun Zhang, Shucheng Li et al.

In the federated learning (FL) process, because each participant holds different data, it is necessary to determine which participants contribute more to model performance. Effective contribution assessment can help motivate data owners to participate in FL training. Research in this field can be divided into two directions based on whether a validation dataset is required. Validation-based methods need representative validation data to measure model accuracy, and such data is difficult to obtain in practical FL scenarios. Existing validation-free methods assess the contribution based on the parameters and gradients of local models and the global model in a single training round, which is easily compromised by the stochasticity of model training. In this work, we propose CoAst, a practical method to assess the FL participants' contribution without access to any validation data. The core idea of CoAst involves two aspects: one is to count only the most important part of the model parameters through weight quantization, and the other is a cross-round valuation based on the similarity between the current local parameters and the global parameter updates in several subsequent communication rounds. Extensive experiments show that CoAst has comparable assessment reliability to existing validation-based methods and outperforms existing validation-free methods.
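The two ideas named in the abstract (quantizing to the most important parameters, then scoring by similarity to later global updates) can be illustrated with a small sketch. This is not the CoAst algorithm: the top-k sign quantizer, the cosine similarity, the averaging over rounds, and the function names `ternarize_top_k` and `cross_round_score` are all assumptions chosen for illustration.

```python
import math


def ternarize_top_k(vec, k):
    """Keep the k largest-magnitude entries as their signs (+1/-1); zero the rest.

    Assumption: a stand-in for the paper's weight quantization.
    """
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    out = []
    for i, v in enumerate(vec):
        if i in keep and v != 0:
            out.append(1.0 if v > 0 else -1.0)
        else:
            out.append(0.0)
    return out


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def cross_round_score(local_update, later_global_updates, k):
    """Average similarity between one quantized local update and the
    quantized global updates of several subsequent rounds (hypothetical scoring rule)."""
    q_local = ternarize_top_k(local_update, k)
    sims = [cosine(q_local, ternarize_top_k(g, k)) for g in later_global_updates]
    return sum(sims) / len(sims) if sims else 0.0


# Toy data: one client pushes in the direction the global model later moves,
# the other pushes the opposite way.
later_rounds = [[1.0, -1.0, 0.0, 0.1], [0.7, -0.6, 0.02, 0.0]]
aligned = cross_round_score([0.9, -0.8, 0.05, 0.01], later_rounds, k=2)
opposed = cross_round_score([-0.9, 0.8, 0.05, 0.01], later_rounds, k=2)
```

In this toy setup the aligned client receives the higher score, which is the qualitative behavior a cross-round valuation is meant to capture.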

CV Sep 24, 2024
Training Data Attribution: Was Your Model Secretly Trained On Data Created By Mine?

Likun Zhang, Hao Wu, Lingcui Zhang et al.

The emergence of text-to-image models has recently sparked significant interest, but it is accompanied by a looming risk of infringement through violation of user terms. Specifically, an adversary may exploit data created by a commercial model to train their own without proper authorization. To address such risk, it is crucial to investigate the attribution of a suspicious model's training data by determining whether its training data originates, wholly or partially, from a specific source model. To trace the generated data, existing methods require applying extra watermarks during either the training or inference phases of the source model. However, these methods are impractical for pre-trained models that have been released, especially when model owners lack security expertise. To tackle this challenge, we propose an injection-free training data attribution method for text-to-image models. It can identify whether a suspicious model's training data stems from a source model, without additional modifications to the source model. The crux of our method lies in the inherent memorization characteristic of text-to-image models. Our core insight is that the memorization of the training dataset is passed down through the data generated by the source model to the model trained on that data, making the source model and the infringing model exhibit consistent behaviors on specific samples. Therefore, our approach involves developing algorithms to uncover these distinct samples and using them as inherent watermarks to verify whether a suspicious model originates from the source model. Our experiments demonstrate that our method achieves an accuracy of over 80% in identifying the source of a suspicious model's training data, without interfering with the original training or generation process of the source model.

ML Jan 12
Covariance-Driven Regression Trees: Reducing Overfitting in CART

Likun Zhang, Wei Ma

Decision trees are powerful machine learning algorithms, widely used in fields such as economics and medicine for their simplicity and interpretability. However, decision trees such as CART are prone to overfitting, especially when they are grown deep or the sample size is small. Conventional methods to reduce overfitting include pre-pruning and post-pruning, which constrain the growth of uninformative branches. In this paper, we propose a complementary approach by introducing a covariance-driven splitting criterion for regression trees (CovRT). This method is more robust to overfitting than the empirical risk minimization criterion used in CART, as it produces more balanced and stable splits and more effectively identifies covariates with true signals. We establish an oracle inequality for CovRT and prove that its predictive accuracy is comparable to that of CART in high-dimensional settings. We find that CovRT achieves superior prediction accuracy compared to CART in both simulations and real-world tasks.

CV Mar 23, 2024
Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content

Zhicheng Du, Zhaotian Xie, Huazhang Ying et al.

This study explores the ability of Image Captioning (IC) models to decode masked visual content sourced from diverse datasets. Our findings reveal that IC models can generate captions from masked images that closely resemble the original content. Notably, even in the presence of masks, the model adeptly crafts descriptive textual information that goes beyond what is observable in the captions generated from the original image. While the decoding performance of the IC model declines as the area of the masked region increases, the model still performs well as long as important regions of the image are not masked at high coverage.

CV Feb 21, 2025
AutoMR: A Universal Time Series Motion Recognition Pipeline

Likun Zhang, Sicheng Yang, Zhuo Wang et al.

In this paper, we present an end-to-end automated motion recognition (AutoMR) pipeline designed for multimodal datasets. The proposed framework seamlessly integrates data preprocessing, model training, hyperparameter tuning, and evaluation, enabling robust performance across diverse scenarios. Our approach addresses two primary challenges: 1) variability in sensor data formats and parameters across datasets, which traditionally requires task-specific machine learning implementations, and 2) the complexity and time consumption of hyperparameter tuning for optimal model performance. Our library features an all-in-one solution incorporating QuartzNet as the core model, automated hyperparameter tuning, and comprehensive metrics tracking. Extensive experiments demonstrate its effectiveness on 10 diverse datasets, achieving state-of-the-art performance. This work lays a solid foundation for deploying motion-capture solutions across varied real-world applications.

LG Feb 20, 2020
Input Perturbation: A New Paradigm between Central and Local Differential Privacy

Yilin Kang, Yong Liu, Ben Niu et al.

Traditionally, there are two models of differential privacy: the central model and the local model. The central model focuses on the machine learning model and the local model focuses on the training data. In this paper, we study the input perturbation method in differentially private empirical risk minimization (DP-ERM), preserving privacy of the central model. By adding noise to the original training data and training with the 'perturbed data', we achieve (ε, δ)-differential privacy on the final model, along with some degree of privacy on the original data. We observe that there is an interesting connection between the local model and the central model: the perturbation of the original data causes a perturbation of the gradient and, ultimately, of the model parameters. This observation means that our method builds a bridge between the local and central models, protecting the data, the gradient and the model simultaneously, which makes it superior to previous central methods. Detailed theoretical analysis and experiments show that our method achieves almost the same (or even better) performance as some of the best previous central methods with more protections on privacy, which is an attractive result. Moreover, we extend our method to a more general case in which the loss function satisfies the Polyak-Lojasiewicz condition, which is more general than strong convexity, the constraint imposed on the loss function in most previous work.
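The mechanics of input perturbation (add noise to the training data once, then train as usual on the noisy copy) can be sketched as follows. This is only an illustration under stated assumptions: the noise scale `sigma` is a free parameter here and is not calibrated to any (ε, δ) guarantee, only the features are perturbed, and the helper names `perturb_inputs` and `fit_linear_gd` are hypothetical.

```python
import random


def perturb_inputs(rows, sigma, seed=None):
    """Return a copy of the feature rows with i.i.d. Gaussian noise added.

    Assumption: sigma is chosen by the caller; this sketch does not derive
    the scale required for a formal differential privacy guarantee.
    """
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in rows]


def fit_linear_gd(features, targets, lr=0.1, epochs=500):
    """Least-squares linear model (no intercept) via full-batch gradient descent."""
    n = len(features)
    d = len(features[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for x, y in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(d):
                grad[j] += 2.0 * err * x[j] / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Toy data following y = 2x. Training never touches the clean features
# once they have been perturbed.
clean_x = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
noisy_x = perturb_inputs(clean_x, sigma=0.1, seed=0)
w_clean = fit_linear_gd(clean_x, y)
w_noisy = fit_linear_gd(noisy_x, y)
```

Because the noise enters through the data, every gradient computed during training is perturbed as well, which is the connection between data, gradient, and model that the abstract describes.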

NA Jul 23, 2016
Oscillation-free method for semilinear diffusion equations under noisy initial conditions

R. C. Harwood, Likun Zhang, V. S. Manoranjan

Noise in initial conditions from measurement errors can create unwanted oscillations which propagate in numerical solutions. We present a technique for preventing such oscillation errors when solving initial-boundary-value problems of semilinear diffusion equations. Symmetric Strang splitting is applied to the equation so that the linear diffusion and the nonlinear remainder are solved separately. An oscillation-free scheme is developed for overcoming any oscillatory behavior when numerically solving the linear diffusion portion. To demonstrate the ills of stable oscillations, we compare our method, which uses a weighted implicit Euler scheme, with the Crank-Nicolson method. The oscillation-free feature and stability of our method are analyzed through a local linearization. The accuracy of our oscillation-free method is proved and its usefulness is further verified through solving a Fisher-type equation where oscillation-free solutions are successfully produced in spite of random errors in the initial conditions.
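A minimal sketch of symmetric Strang splitting on a Fisher-type problem may help make the idea concrete. It assumes the equation u_t = u_xx + u(1 - u) with zero-flux boundaries, and it uses plain backward Euler for the diffusion part as a stand-in for the paper's weighted implicit Euler scheme; the grid, step sizes, and function names are illustrative choices, not taken from the paper.

```python
import math
import random


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def reaction_step(u, dt):
    """Exact flow of the logistic ODE u' = u(1 - u) over time dt."""
    g = math.exp(dt)
    return [v * g / (1.0 - v + v * g) for v in u]


def diffusion_step(u, dt, dx):
    """Backward Euler for u_t = u_xx with zero-flux boundaries.

    Each Fourier mode is multiplied by a factor in (0, 1], so grid-scale
    noise is damped rather than sign-flipped.
    """
    n = len(u)
    r = dt / dx ** 2
    lower = [-r] * n
    upper = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    diag[0] = 1.0 + r   # ghost point mirrors the first interior value
    diag[-1] = 1.0 + r  # same at the right boundary
    return solve_tridiagonal(lower, diag, upper, u)


def strang_step(u, dt, dx):
    """Half reaction step, full diffusion step, half reaction step."""
    u = reaction_step(u, 0.5 * dt)
    u = diffusion_step(u, dt, dx)
    return reaction_step(u, 0.5 * dt)


def simulate(u0, dt, dx, steps):
    u = list(u0)
    for _ in range(steps):
        u = strang_step(u, dt, dx)
    return u


def noisy_step_profile(n, noise, seed):
    """Step from 1 to 0 with uniform noise, clipped to [0, 1]."""
    rng = random.Random(seed)
    profile = []
    for i in range(n):
        base = 1.0 if i < n // 2 else 0.0
        profile.append(min(1.0, max(0.0, base + rng.uniform(-noise, noise))))
    return profile


u_final = simulate(noisy_step_profile(50, 0.05, seed=0), dt=0.1, dx=0.2, steps=50)
```

With these choices both sub-steps keep values inside [0, 1]: the backward Euler matrix yields a weighted average of the input values, and the logistic flow maps [0, 1] to itself. A Crank-Nicolson diffusion step, by contrast, has a negative amplification factor for high-frequency modes at large dt/dx², which is where stable but persistent oscillations come from.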