Jiashi Gao

CV
4papers
16citations
Novelty51%
AI Score24

4 Papers

CVOct 25, 2022
Deep Boosting Robustness of DNN-based Image Watermarking via DBMark

Guanhui Ye, Jiashi Gao, Wei Xie et al.

Image watermarking is a technique for hiding information into images that can withstand distortions while requiring the encoded image to be perceptually identical to the original image. Recent work based on deep neural networks (DNN) has achieved impressive progression in digital watermarking. Higher robustness under various distortions is the eternal pursuit of digital image watermarking approaches. In this paper, we propose DBMARK, a novel end-to-end digital image watermarking framework to deep boost the robustness of DNN-based image watermarking. The key novelty is the synergy of invertible neural networks (INN) and effective watermark features generation. The framework generates watermark features with redundancy and error correction ability through the effective neural network based message processor, synergized with the powerful information embedding and extraction abilities of INN to achieve higher robustness and invisibility. The powerful learning ability of neural networks enables the message processor to adapt to various distortions. In addition, we propose to embed the watermark information in the discrete wavelet transform (DWT) domain and design low-low (LL) sub-band loss to enhance invisibility. Extensive experiment results demonstrate the superiority of the proposed framework compared with the state-of-the-art ones under various distortions such as dropout, cropout, crop, Gaussian filter, and JPEG compression.

CVOct 13, 2023
CopyScope: Model-level Copyright Infringement Quantification in the Diffusion Workflow

Junlei Zhou, Jiashi Gao, Ziwei Wang et al.

Web-based AI image generation has become an innovative art form that can generate novel artworks with the rapid development of the diffusion model. However, this new technique brings potential copyright infringement risks as it may incorporate the existing artworks without the owners' consent. Copyright infringement quantification is the primary and challenging step towards AI-generated image copyright traceability. Previous work only focused on data attribution from the training data perspective, which is unsuitable for tracing and quantifying copyright infringement in practice because of the following reasons: (1) the training datasets are not always available in public; (2) the model provider is the responsible party, not the image. Motivated by this, in this paper, we propose CopyScope, a new framework to quantify the infringement of AI-generated images from the model level. We first rigorously identify pivotal components within the AI image generation pipeline. Then, we propose to take advantage of Fréchet Inception Distance (FID) to effectively capture the image similarity that fits human perception naturally. We further propose the FID-based Shapley algorithm to evaluate the infringement contribution among models. Extensive experiments demonstrate that our work not only reveals the intricacies of infringement quantification but also effectively depicts the infringing models quantitatively, thus promoting accountability in AI image-generation tasks.

LGSep 28, 2023
Anti-Matthew FL: Bridging the Performance Gap in Federated Learning to Counteract the Matthew Effect

Jiashi Gao, Xin Yao, Xuetao Wei

Federated learning (FL) stands as a paradigmatic approach that facilitates model training across heterogeneous and diverse datasets originating from various data providers. However, conventional FLs fall short of achieving consistent performance, potentially leading to performance degradation for clients who are disadvantaged in data resources. Influenced by the Matthew effect, deploying a performance-imbalanced global model in applications further impedes the generation of high-quality data from disadvantaged clients, exacerbating the disparities in data resources among clients. In this work, we propose anti-Matthew fairness for the global model at the client level, requiring equal accuracy and equal decision bias across clients. To balance the trade-off between achieving anti-Matthew fairness and performance optimality, we formalize the anti-Matthew effect federated learning (anti-Matthew FL) as a multi-constrained multi-objectives optimization (MCMOO) problem and propose a three-stage multi-gradient descent algorithm to obtain the Pareto optimality. We theoretically analyze the convergence and time complexity of our proposed algorithms. Additionally, through extensive experimentation, we demonstrate that our proposed anti-Matthew FL outperforms other state-of-the-art FL algorithms in achieving a high-performance global model while effectively bridging performance gaps among clients. We hope this work provides valuable insights into the manifestation of the Matthew effect in FL and other decentralized learning scenarios and can contribute to designing fairer learning mechanisms, ultimately fostering societal welfare.

CVFeb 8, 2022
Social-DualCVAE: Multimodal Trajectory Forecasting Based on Social Interactions Pattern Aware and Dual Conditional Variational Auto-Encoder

Jiashi Gao, Xinming Shi, James J. Q. Yu

Pedestrian trajectory forecasting is a fundamental task in multiple utility areas, such as self-driving, autonomous robots, and surveillance systems. The future trajectory forecasting is multi-modal, influenced by physical interaction with scene contexts and intricate social interactions among pedestrians. The mainly existing literature learns representations of social interactions by deep learning networks, while the explicit interaction patterns are not utilized. Different interaction patterns, such as following or collision avoiding, will generate different trends of next movement, thus, the awareness of social interaction patterns is important for trajectory forecasting. Moreover, the social interaction patterns are privacy concerned or lack of labels. To jointly address the above issues, we present a social-dual conditional variational auto-encoder (Social-DualCVAE) for multi-modal trajectory forecasting, which is based on a generative model conditioned not only on the past trajectories but also the unsupervised classification of interaction patterns. After generating the category distribution of the unlabeled social interaction patterns, DualCVAE, conditioned on the past trajectories and social interaction pattern, is proposed for multi-modal trajectory prediction by latent variables estimating. A variational bound is derived as the minimization objective during training. The proposed model is evaluated on widely used trajectory benchmarks and outperforms the prior state-of-the-art methods.