Tianyun Yang

CV
h-index21
9papers
445citations
Novelty49%
AI Score47

9 Papers

CVMar 13, 2023
Progressive Open Space Expansion for Open-Set Model Attribution

Tianyun Yang, Danding Wang, Fan Tang et al.

Despite the remarkable progress in generative technology, the Janus-faced issues of intellectual property protection and malicious content supervision have arisen. Efforts have been paid to manage synthetic images by attributing them to a set of potential source models. However, the closed-set classification setting limits the application in real-world scenarios for handling contents generated by arbitrary models. In this study, we focus on a challenging task, namely Open-Set Model Attribution (OSMA), to simultaneously attribute images to known models and identify those from unknown ones. Compared to existing open-set recognition (OSR) tasks focusing on semantic novelty, OSMA is more challenging as the distinction between images from known and unknown models may only lie in visually imperceptible traces. To this end, we propose a Progressive Open Space Expansion (POSE) solution, which simulates open-set samples that maintain the same semantics as closed-set samples but embedded with different imperceptible traces. Guided by a diversity constraint, the open space is simulated progressively by a set of lightweight augmentation models. We consider three real-world scenarios and construct an OSMA benchmark dataset, including unknown models trained with different random seeds, architectures, and datasets from known ones. Extensive experiments on the dataset demonstrate POSE is superior to both existing model attribution methods and off-the-shelf OSR methods.

CVJul 29, 2023
Model Synthesis for Zero-Shot Model Attribution

Tianyun Yang, Juan Cao, Danding Wang et al.

Nowadays, generative models are shaping various fields such as art, design, and human-computer interaction, yet accompanied by challenges related to copyright infringement and content management. In response, existing research seeks to identify the unique fingerprints on the images they generate, which can be leveraged to attribute the generated images to their source models. Existing methods, however, are constrained to identifying models within a static set included in the classifier training, failing to adapt to newly emerged unseen models dynamically. To bridge this gap, we aim to develop a generalized model fingerprint extractor capable of zero-shot attribution, effectively attributes unseen models without exposure during training. Central to our method is a model synthesis technique, which generates numerous synthetic models mimicking the fingerprint patterns of real-world generative models. The design of the synthesis technique is motivated by observations on how the basic generative model's architecture building blocks and parameters influence fingerprint patterns, and it is validated through two designed metrics that examine synthetic models' fidelity and diversity. Our experiments demonstrate that this fingerprint extractor, trained solely on synthetic models, achieves impressive zero-shot generalization on a wide range of real-world generative models, improving model identification and verification accuracy on unseen models by over 40% and 15%, respectively, compared to existing approaches.

LGMar 24
Off-Policy Value-Based Reinforcement Learning for Large Language Models

Peng-Yuan Wang, Ziniu Li, Tian Xu et al.

Improving data utilization efficiency is critical for scaling reinforcement learning (RL) for long-horizon tasks where generating trajectories is expensive. However, the dominant RL methods for LLMs are largely on-policy: they update each batch of data only once, discard it, and then collect fresh samples, resulting in poor sample efficiency. In this work, we explore an alternative value-based RL framework for LLMs that naturally enables off-policy learning. We propose ReVal, a Bellman-update-based method that combines stepwise signals capturing internal consistency with trajectory-level signals derived from outcome verification. ReVal naturally supports replay-buffer-based training, allowing efficient reuse of past trajectories. Experiments on standard mathematical reasoning benchmarks show that ReVal not only converges faster but also outperforms GRPO in final performance. On DeepSeek-R1-Distill-1.5B, ReVal improves training efficiency and achieves improvement of 2.7% in AIME24 and 4.5% in out-of-domain benchmark GPQA over GRPO. These results suggest that value-based RL is a practical alternative to policy-based methods for LLM training.

LGSep 30, 2025
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Ziniu Li, Congliang Chen, Tianyun Yang et al.

Large Language Models (LLMs) can self-improve through reinforcement learning, where they generate trajectories to explore and discover better solutions. However, this exploration process is computationally expensive, often forcing current methods to assign limited exploration budgets to each task. This uniform allocation creates problematic edge cases: easy tasks consistently succeed while difficult tasks consistently fail, both producing zero gradients during training updates for the widely used Group Relative Policy Optimization (GRPO). We address this problem from the lens of exploration budget allocation. Viewing each task's exploration as an "item" with a distinct "value" and "cost", we establish a connection to the classical knapsack problem. This formulation allows us to derive an optimal assignment rule that adaptively distributes resources based on the model's current learning status. When applied to GRPO, our method increases the effective ratio of non-zero policy gradients by 20-40% during training. Acting as a computational "free lunch", our approach could reallocate exploration budgets from tasks where learning is saturated to those where it is most impactful. This enables significantly larger budgets (e.g., 93 rollouts) for especially challenging problems, which would be computationally prohibitive under a uniform allocation. These improvements translate to meaningful gains on mathematical reasoning benchmarks, with average improvements of 2-4 points and peak gains of 9 points on specific tasks. Notably, achieving comparable performance with traditional homogeneous allocation would require about 2x the computational resources.

LGAug 12, 2025
Bridging Formal Language with Chain-of-Thought Reasoning to Geometry Problem Solving

Tianyun Yang, Yunwen Li, Ziniu Li et al.

Large vision language models exhibit notable limitations on Geometry Problem Solving (GPS) because of their unreliable diagram interpretation and pure natural-language reasoning. A recent line of work mitigates this by using symbolic solvers: the model directly generates a formal program that a geometry solver can execute. However, this direct program generation lacks intermediate reasoning, making the decision process opaque and prone to errors. In this work, we explore a new approach that integrates Chain-of-Thought (CoT) with formal language. The model interleaves natural language reasoning with incremental emission of solver-executable code, producing a hybrid reasoning trace in which critical derivations are expressed in formal language. To teach this behavior at scale, we combine (1) supervised fine-tuning on an 11K newly developed synthetic dataset with interleaved natural language reasoning and automatic formalization, and (2) solver-in-the-loop reinforcement learning that jointly optimizes both the CoT narrative and the resulting program through outcome-based rewards. Built on Qwen2.5-VL-7B, our new model, named GF-Reasoner, achieves up to 15% accuracy improvements on standard GPS benchmarks, surpassing both 7B-scale peers and the much larger model Qwen2.5-VL-72B. By exploiting high-order geometric knowledge and offloading symbolic computation to the solver, the generated reasoning traces are noticeably shorter and cleaner. Furthermore, we present a comprehensive analysis of method design choices (e.g., reasoning paradigms, data synthesis, training epochs, etc.), providing actionable insights for future research.

CVFeb 28, 2022
Deepfake Network Architecture Attribution

Tianyun Yang, Ziyao Huang, Juan Cao et al.

With the rapid progress of generation technology, it has become necessary to attribute the origin of fake images. Existing works on fake image attribution perform multi-class classification on several Generative Adversarial Network (GAN) models and obtain high accuracies. While encouraging, these works are restricted to model-level attribution, only capable of handling images generated by seen models with a specific seed, loss and dataset, which is limited in real-world scenarios when fake images may be generated by privately trained models. This motivates us to ask whether it is possible to attribute fake images to the source models' architectures even if they are finetuned or retrained under different configurations. In this work, we present the first study on Deepfake Network Architecture Attribution to attribute fake images on architecture-level. Based on an observation that GAN architecture is likely to leave globally consistent fingerprints while traces left by model weights vary in different regions, we provide a simple yet effective solution named DNA-Det for this problem. Extensive experiments on multiple cross-test setups and a large-scale dataset demonstrate the effectiveness of DNA-Det.

CVJun 16, 2021
Learning to Disentangle GAN Fingerprint for Fake Image Attribution

Tianyun Yang, Juan Cao, Qiang Sheng et al.

Rapid pace of generative models has brought about new threats to visual forensics such as malicious personation and digital copyright infringement, which promotes works on fake image attribution. Existing works on fake image attribution mainly rely on a direct classification framework. Without additional supervision, the extracted features could include many content-relevant components and generalize poorly. Meanwhile, how to obtain an interpretable GAN fingerprint to explain the decision remains an open question. Adopting a multi-task framework, we propose a GAN Fingerprint Disentangling Network (GFD-Net) to simultaneously disentangle the fingerprint from GAN-generated images and produce a content-irrelevant representation for fake image attribution. A series of constraints are provided to guarantee the stability and discriminability of the fingerprint, which in turn helps content-irrelevant feature extraction. Further, we perform comprehensive analysis on GAN fingerprint, providing some clues about the properties of GAN fingerprint and which factors dominate the fingerprint in GAN architecture. Experiments show that our GFD-Net achieves superior fake image attribution performance in both closed-world and open-world testing. We also apply our method in binary fake image detection and exhibit a significant generalization ability on unseen generators.

MMMar 11, 2020
Exploring the Role of Visual Content in Fake News Detection

Juan Cao, Peng Qi, Qiang Sheng et al.

The increasing popularity of social media promotes the proliferation of fake news, which has caused significant negative societal effects. Therefore, fake news detection on social media has recently become an emerging research area of great concern. With the development of multimedia technology, fake news attempts to utilize multimedia content with images or videos to attract and mislead consumers for rapid dissemination, which makes visual content an important part of fake news. Despite the importance of visual content, our understanding of the role of visual content in fake news detection is still limited. This chapter presents a comprehensive review of the visual content in fake news, including the basic concepts, effective visual features, representative detection methods and challenging issues of multimedia fake news detection. This chapter can help readers to understand the role of visual content in fake news detection, and effectively utilize visual content to assist in detecting multimedia fake news.

MMAug 13, 2019
Exploiting Multi-domain Visual Information for Fake News Detection

Peng Qi, Juan Cao, Tianyun Yang et al.

The increasing popularity of social media promotes the proliferation of fake news. With the development of multimedia technology, fake news attempts to utilize multimedia contents with images or videos to attract and mislead readers for rapid dissemination, which makes visual contents an important part of fake news. Fake-news images, images attached in fake news posts,include not only fake images which are maliciously tampered but also real images which are wrongly used to represent irrelevant events. Hence, how to fully exploit the inherent characteristics of fake-news images is an important but challenging problem for fake news detection. In the real world, fake-news images may have significantly different characteristics from real-news images at both physical and semantic levels, which can be clearly reflected in the frequency and pixel domain, respectively. Therefore, we propose a novel framework Multi-domain Visual Neural Network (MVNN) to fuse the visual information of frequency and pixel domains for detecting fake news. Specifically, we design a CNN-based network to automatically capture the complex patterns of fake-news images in the frequency domain; and utilize a multi-branch CNN-RNN model to extract visual features from different semantic levels in the pixel domain. An attention mechanism is utilized to fuse the feature representations of frequency and pixel domains dynamically. Extensive experiments conducted on a real-world dataset demonstrate that MVNN outperforms existing methods with at least 9.2% in accuracy, and can help improve the performance of multimodal fake news detection by over 5.2%.