Daniel Zeng

LG
h-index: 20
17 papers
293 citations
Novelty: 53%
AI Score: 41

17 Papers

CV · Jul 4, 2022
ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy

Daniel Zeng, Tailin Wu, Jure Leskovec · mit

Visual relations form the basis of understanding our compositional world, as relationships between visual objects capture key information in a scene. It is then advantageous to learn relations automatically from the data, as learning with predefined labels cannot capture all possible relations. However, current relation learning methods typically require supervision, and are not designed to generalize to scenes with more complicated relational structures than those seen during training. Here, we introduce ViRel, a method for unsupervised discovery and learning of Visual Relations with graph-level analogy. In a setting where scenes within a task share the same underlying relational subgraph structure, our learning method of contrasting isomorphic and non-isomorphic graphs discovers the relations across tasks in an unsupervised manner. Once the relations are learned, ViRel can then retrieve the shared relational graph structure for each task by parsing the predicted relational structure. Using a dataset based on grid-world and the Abstract Reasoning Corpus, we show that our method achieves above 95% accuracy in relation classification, discovers the relation graph structure for most tasks, and further generalizes to unseen tasks with more complicated relational structures.

CV · May 23, 2022 · Code
Boosting Multi-Label Image Classification with Complementary Parallel Self-Distillation

Jiazhi Xu, Sheng Huang, Fengtao Zhou et al.

Multi-Label Image Classification (MLIC) approaches usually exploit label correlations to achieve good performance. However, emphasizing correlation like co-occurrence may overlook discriminative features of the target itself and lead to model overfitting, thus undermining the performance. In this study, we propose a generic framework named Parallel Self-Distillation (PSD) for boosting MLIC models. PSD decomposes the original MLIC task into several simpler MLIC sub-tasks via two elaborated complementary task decomposition strategies named Co-occurrence Graph Partition (CGP) and Dis-occurrence Graph Partition (DGP). Then, the MLIC models of fewer categories are trained with these sub-tasks in parallel for respectively learning the joint patterns and the category-specific patterns of labels. Finally, knowledge distillation is leveraged to learn a compact global ensemble of full categories with these learned patterns for reconciling the label correlation exploitation and model overfitting. Extensive results on MS-COCO and NUS-WIDE datasets demonstrate that our framework can be easily plugged into many MLIC approaches and improve performances of recent state-of-the-art approaches. The explainable visual study also further validates that our method is able to learn both the category-specific and co-occurring features. The source code is released at https://github.com/Robbie-Xu/CPSD.

AI · Mar 4, 2022
Aggregate effects of advertising decisions: a complex systems look at search engine advertising via an experimental study

Yanwu Yang, Xin Li, Bernard J. Jansen et al.

Purpose: We model group advertising decisions, which are the collective decisions of every single advertiser within the set of advertisers who are competing in the same auction or vertical industry, and examine resulting market outcomes, via a proposed simulation framework named EXP-SEA (Experimental Platform for Search Engine Advertising) supporting experimental studies of collective behaviors in the context of search engine advertising. Design: We implement EXP-SEA to validate the proposed simulation framework, and conduct three experimental studies on the aggregate impact of electronic word-of-mouth (eWOM), the competition level, and strategic bidding behaviors. EXP-SEA supports heterogeneous participants, various auction mechanisms, and ranking and pricing algorithms. Findings: Our three experiments show that (a) both the market profit and advertising indexes such as the number of impressions and number of clicks are larger when the eWOM effect is present, meaning that social media has an effect on search engine advertising outcomes, (b) the competition level has a monotonically increasing effect on market performance, so search engines have an incentive to encourage both eWOM among search users and competition among advertisers, and (c) given the market-level effect of the percentage of advertisers employing a dynamic greedy bidding strategy, there is a cut-off point for strategic bidding behaviors. Originality: This is one of the first research works to explore collective group decisions and resulting phenomena in the complex context of search engine advertising by developing and validating a simulation framework that supports assessments of various advertising strategies and estimations of the impact of mechanisms on the search market.

LG · Aug 2, 2023
Wasserstein Diversity-Enriched Regularizer for Hierarchical Reinforcement Learning

Haorui Li, Jiaqi Liang, Linjing Li et al.

Hierarchical reinforcement learning composes subpolicies in different hierarchies to accomplish complex tasks. Automated subpolicy discovery, which does not depend on domain knowledge, is a promising approach to generating subpolicies. However, the degradation problem is a challenge that existing methods can hardly deal with, because they either do not consider diversity or employ weak regularizers. In this paper, we propose a novel task-agnostic regularizer called the Wasserstein Diversity-Enriched Regularizer (WDER), which enlarges the diversity of subpolicies by maximizing the Wasserstein distances among action distributions. The proposed WDER can be easily incorporated into the loss function of existing methods to boost their performance further. Experimental results demonstrate that our WDER improves performance and sample efficiency in comparison with prior work without modifying hyperparameters, which indicates the applicability and robustness of the WDER.
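
The idea of rewarding subpolicies whose action distributions lie far apart can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes discrete action distributions over a shared, ordered, unit-spaced action set (so the 1-D Wasserstein-1 distance reduces to a sum of absolute CDF differences), and the names `wasserstein_1d`, `diversity_bonus`, `regularized_loss`, and the `weight` parameter are hypothetical.

```python
from itertools import accumulate, combinations


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two discrete distributions.

    Assumes both distributions share the same ordered support with unit
    spacing, so the distance is the sum of absolute CDF differences.
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def diversity_bonus(action_dists):
    """Mean pairwise Wasserstein distance among subpolicy action distributions."""
    pairs = list(combinations(action_dists, 2))
    if not pairs:
        return 0.0
    return sum(wasserstein_1d(p, q) for p, q in pairs) / len(pairs)


def regularized_loss(task_loss, action_dists, weight=0.1):
    """Subtract the weighted diversity term so that minimizing the loss
    pushes the subpolicies' action distributions apart (hypothetical form)."""
    return task_loss - weight * diversity_bonus(action_dists)


if __name__ == "__main__":
    subpolicies = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
    print(diversity_bonus(subpolicies))
    print(regularized_loss(1.0, subpolicies, weight=0.5))
```

Subtracting the term, rather than adding it, is what turns "maximize distance" into part of a loss that is minimized; the weighting and the exact distance used in the paper may differ.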

LG · Oct 28, 2025 · Code
Pearl: A Foundation Model for Placing Every Atom in the Right Location

Genesis Research Team, Alejandro Dobles, Nina Jovic et al. · cmu

Accurately predicting the three-dimensional structures of protein-ligand complexes remains a fundamental challenge in computational drug discovery that limits the pace and success of therapeutic design. Deep learning methods have recently shown strong potential as structural prediction tools, achieving promising accuracy across diverse biomolecular systems. However, their performance and utility are constrained by scarce experimental data, inefficient architectures, physically invalid poses, and the limited ability to exploit auxiliary information available at inference. To address these issues, we introduce Pearl (Placing Every Atom in the Right Location), a foundation model for protein-ligand cofolding at scale. Pearl addresses these challenges with three key innovations: (1) training recipes that include large-scale synthetic data to overcome data scarcity; (2) architectures that incorporate an SO(3)-equivariant diffusion module to inherently respect 3D rotational symmetries, improving generalization and sample efficiency; and (3) controllable inference, including a generalized multi-chain templating system supporting both protein and non-polymeric components as well as dual unconditional/conditional modes. Pearl establishes a new state of the art in protein-ligand cofolding. On the key metric of generating accurate (RMSD < 2 Å) and physically valid poses, Pearl surpasses AlphaFold 3 and other open source baselines on the public Runs N' Poses and PoseBusters benchmarks, delivering 14.5% and 14.2% improvements, respectively, over the next best model. In the pocket-conditional cofolding regime, Pearl delivers a 3.6× improvement on a proprietary set of challenging, real-world drug targets at the more rigorous RMSD < 1 Å threshold. Finally, we demonstrate that model performance correlates directly with synthetic dataset size used in training.
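
For readers unfamiliar with the RMSD thresholds quoted above, the following minimal sketch shows how a pose success rate at a given cutoff can be computed. It is an illustration only: it assumes predicted and reference atoms are already matched one-to-one and expressed in the same coordinate frame (no alignment or symmetry correction), it omits the physical-validity checks the abstract also requires, and the function names are hypothetical.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between matched 3D coordinates (in Å)."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("need two non-empty, equally sized coordinate lists")
    squared = sum(
        (a - b) ** 2
        for p_atom, r_atom in zip(pred, ref)
        for a, b in zip(p_atom, r_atom)
    )
    return math.sqrt(squared / len(pred))


def success_rate(pose_pairs, threshold=2.0):
    """Fraction of (predicted, reference) poses whose RMSD is below threshold."""
    hits = sum(1 for pred, ref in pose_pairs if rmsd(pred, ref) < threshold)
    return hits / len(pose_pairs)


if __name__ == "__main__":
    close = ([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
    far = ([(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)])
    print(success_rate([close, far], threshold=2.0))
```

Tightening `threshold` from 2.0 to 1.0 corresponds to the stricter RMSD < 1 Å regime mentioned in the abstract.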

CL · Dec 24, 2023
YAYI-UIE: A Chat-Enhanced Instruction Tuning Framework for Universal Information Extraction

Xinglin Xiao, Yijie Wang, Nan Xu et al.

The difficulty of the information extraction task lies in dealing with the task-specific label schemas and heterogeneous data structures. Recent work has proposed methods based on large language models to uniformly model different information extraction tasks. However, these existing methods are deficient in their information extraction capabilities for languages other than English, such as Chinese. In this paper, we propose an end-to-end chat-enhanced instruction tuning framework for universal information extraction (YAYI-UIE), which supports both Chinese and English. Specifically, we utilize dialogue data and information extraction data to enhance the information extraction performance jointly. Experimental results show that our proposed framework achieves state-of-the-art performance on Chinese datasets while also achieving comparable performance on English datasets under both supervised settings and zero-shot settings.

LG · Dec 13, 2023
LDM²: A Large Decision Model Imitating Human Cognition with Dynamic Memory Enhancement

Xingjin Wang, Linjing Li, Daniel Zeng

With the rapid development of large language models (LLMs), there is strong demand for adopting LLMs as decision makers, as a step toward artificial general intelligence. Most approaches leverage manually crafted examples to prompt the LLMs to imitate the human decision process. However, designing optimal prompts is difficult and the patterned prompts can hardly be generalized to more complex environments. In this paper, we propose a novel model named Large Decision Model with Memory (LDM²), which leverages a dynamic memory mechanism to construct dynamic prompts, guiding the LLMs in making proper decisions according to the current state. LDM² consists of two stages: memory formation and memory refinement. In the former stage, human behaviors are decomposed into state-action tuples utilizing the powerful summarizing ability of LLMs. Then, these tuples are stored in the memory, whose indices are generated by the LLMs, to facilitate the retrieval of the most relevant subset of memorized tuples based on the current state. In the latter stage, LDM² employs tree exploration to discover more suitable decision processes and enrich the memory by adding valuable state-action tuples. The dynamic cycle of exploration and memory enhancement gives LDM² a better understanding of the global environment. Extensive experiments conducted in two interactive environments have shown that LDM² outperforms the baselines in terms of both score and success rate, which demonstrates its effectiveness.

CL · May 27, 2025
Uncertainty Unveiled: Can Exposure to More In-context Examples Mitigate Uncertainty for Large Language Models?

Yifei Wang, Yu Sheng, Linjing Li et al.

Recent advances in handling long sequences have facilitated the exploration of long-context in-context learning (ICL). While much of the existing research emphasizes performance improvements driven by additional in-context examples, the influence on the trustworthiness of generated responses remains underexplored. This paper addresses this gap by investigating how increased examples influence predictive uncertainty, an essential aspect in trustworthiness. We begin by systematically quantifying the uncertainty of ICL with varying shot counts, analyzing the impact of example quantity. Through uncertainty decomposition, we introduce a novel perspective on performance enhancement, with a focus on epistemic uncertainty (EU). Our results reveal that additional examples reduce total uncertainty in both simple and complex tasks by injecting task-specific knowledge, thereby diminishing EU and enhancing performance. For complex tasks, these advantages emerge only after addressing the increased noise and uncertainty associated with longer inputs. Finally, we explore the evolution of internal confidence across layers, unveiling the mechanisms driving the reduction in uncertainty.

CV · Jan 11, 2024
Short-Form Videos and Mental Health: A Knowledge-Guided Neural Topic Model

Jiaheng Xie, Ruicheng Liang, Yidong Chai et al.

Along with the rise of short-form videos, their mental impacts on viewers have led to widespread consequences, prompting platforms to predict videos' impact on viewers' mental health. Subsequently, they can take intervention measures according to their community guidelines. Nevertheless, applicable predictive methods lack relevance to well-established medical knowledge, which outlines clinically proven external and environmental factors of mental disorders. To account for such medical knowledge, we resort to an emergent methodological discipline, seeded Neural Topic Models (NTMs). However, existing seeded NTMs suffer from the limitations of single-origin topics, unknown topic sources, unclear seed supervision, and suboptimal convergence. To address those challenges, we develop a novel Knowledge-Guided NTM to predict a short-form video's suicidal thought impact on viewers. Extensive empirical analyses using TikTok and Douyin datasets prove that our method outperforms state-of-the-art benchmarks. Our method also discovers medically relevant topics from videos that are linked to suicidal thought impact. We contribute to IS with a novel video analytics method that is generalizable to other video classification problems. Practically, our method can help platforms understand videos' suicidal thought impacts, thus moderating videos that violate their community guidelines.

LG · May 21, 2023
PRODIGY: Enabling In-context Learning Over Graphs

Qian Huang, Hongyu Ren, Peng Chen et al.

In-context learning is the ability of a pretrained model to adapt to novel and diverse downstream tasks by conditioning on prompt examples, without optimizing any parameters. While large language models have demonstrated this ability, how in-context learning could be performed over graphs is unexplored. In this paper, we develop Pretraining Over Diverse In-Context Graph Systems (PRODIGY), the first pretraining framework that enables in-context learning over graphs. The key idea of our framework is to formulate in-context learning over graphs with a novel prompt graph representation, which connects prompt examples and queries. We then propose a graph neural network architecture over the prompt graph and a corresponding family of in-context pretraining objectives. With PRODIGY, the pretrained model can directly perform novel downstream classification tasks on unseen graphs via in-context learning. We provide empirical evidence of the effectiveness of our framework by showcasing its strong in-context learning performance on tasks involving citation networks and knowledge graphs. Our approach outperforms the in-context learning accuracy of contrastive pretraining baselines with hard-coded adaptation by 18% on average across all setups. Moreover, it also outperforms standard finetuning with limited data by 33% on average with in-context learning.

AI · Feb 28, 2022
Learning Parameters for a Generalized Vidale-Wolfe Response Model with Flexible Ad Elasticity and Word-of-Mouth

Yanwu Yang, Baozhu Feng, Daniel Zeng

In this research, we investigate a generalized form of the Vidale-Wolfe (GVW) model. One key element of our modeling work is that the GVW model contains two useful indexes representing the advertiser's elasticity and the word-of-mouth (WoM) effect, respectively. Moreover, we discuss some desirable properties of the GVW model, and present a deep neural network (DNN)-based estimation method to learn its parameters. Furthermore, based on three real-world datasets, we conduct computational experiments to validate the GVW model and the identified properties. In addition, we discuss potential advantages of the GVW model over econometric models. The research outcome shows that both the ad elasticity index and the WoM index have significant influences on advertising responses, and the GVW model has potential advantages over econometric models of advertising, in terms of several interesting phenomena drawn from practical advertising situations. The GVW model and its deep learning-based estimation method provide a basis to support big data-driven advertising analytics and decision making; meanwhile, the identified properties and experimental findings of this research offer critical managerial insights for advertisers in various advertising forms.

IR · Feb 28, 2022
Keyword Optimization in Sponsored Search Advertising: A Multi-Level Computational Framework

Yanwu Yang, Bernard J. Jansen, Yinghui Yang et al.

In sponsored search advertising, keywords serve as an essential bridge linking advertisers, search users and search engines. Advertisers have to deal with a series of keyword decisions throughout the entire lifecycle of search advertising campaigns. This paper proposes a multi-level and closed-form computational framework for keyword optimization (MKOF) to support various keyword decisions. Based on this framework, we develop corresponding optimization strategies for keyword targeting, keyword assignment and keyword grouping at different levels (e.g., market, campaign and adgroup). With two real-world datasets obtained from past search advertising campaigns, we conduct computational experiments to evaluate our keyword optimization framework and instantiated strategies. Experimental results show that our method can approach the optimal solution in a steady way, and it outperforms two baseline keyword strategies commonly used in practice. The proposed MKOF framework also provides a valid experimental environment to implement and assess various keyword strategies in sponsored search advertising.

LG · Sep 30, 2019
Spread-gram: A spreading-activation schema of network structural learning

Jie Bai, Linjing Li, Daniel Zeng

Network representation learning has exploded recently. However, existing studies usually reconstruct networks as sequences or matrices, which may cause information bias or sparsity problems during model training. Inspired by a cognitive model of human memory, we propose a network representation learning scheme. In this scheme, we learn node embeddings by adjusting the proximity of nodes traversing the spreading structure of the network. Our proposed method shows a significant improvement in multiple analysis tasks based on various real-world networks, ranging from semantic networks to protein interaction networks, international trade networks, human behavior networks, etc. In particular, our model can effectively discover the hierarchical structures in networks. The well-organized model training speeds up the convergence to only a small number of iterations, and the training time is linear in the number of edges.

ST · Aug 26, 2018
Evolutionary dynamics of cryptocurrency transaction networks: An empirical study

Jiaqi Liang, Linjing Li, Daniel Zeng

Cryptocurrency is a well-developed application of blockchain technology and currently a hot topic throughout the world. The public availability of transaction histories offers an opportunity to analyze and compare different cryptocurrencies. In this paper, we present a dynamic network analysis of three representative blockchain-based cryptocurrencies: Bitcoin, Ethereum, and Namecoin. By analyzing the accumulated network growth, we find that, unlike most other networks, these cryptocurrency networks do not always densify over time, and they are changing all the time with relatively low node and edge repetition ratios. Therefore, we then construct separate networks on a monthly basis, trace the changes of typical network characteristics (including degree distribution, degree assortativity, clustering coefficient, and the largest connected component) over time, and compare the three. We find that the degree distribution of these monthly transaction networks cannot be well fitted by the well-known power-law distribution. At the same time, the currencies differ in their network properties: both Bitcoin and Ethereum networks are heavy-tailed with disassortative mixing, but only the former can be treated as a small world. These network properties reflect the evolutionary characteristics and competitive power of these three cryptocurrencies and provide a foundation for future research.

IR · Sep 27, 2013
A Random Walk Model for Item Recommendation in Folksonomies

Zhu Zhang, Daniel Zeng, Ahmed Abbasi et al.

Social tagging, as a novel approach to information organization and discovery, has been widely adopted in many Web 2.0 applications. The tags provide a new type of information that can be exploited by recommender systems. Nevertheless, the sparsity of ternary <user, tag, item> interaction data limits the performance of tag-based collaborative filtering. This paper proposes a random-walk-based algorithm to deal with the sparsity problem in social tagging data, which captures the potential transitive associations between users and items through their interaction with tags. In particular, two smoothing strategies are presented from both the user-centric and item-centric perspectives. Experiments on real-world data sets empirically demonstrate the efficacy of the proposed algorithm.
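
The notion of transitive user-item associations through tags can be illustrated with a minimal two-step random walk over <user, tag, item> triples. This is a simplified sketch, not the paper's algorithm: it walks user → tag → item with count-proportional transition probabilities, omits the user-centric and item-centric smoothing strategies, and the function name `two_step_scores` is hypothetical.

```python
from collections import defaultdict


def two_step_scores(triples, user):
    """Score items by the probability of reaching them from `user`
    in a two-step walk: user -> tag -> item."""
    user_tag = defaultdict(float)                       # how often the user applied each tag
    tag_item = defaultdict(lambda: defaultdict(float))  # how often each tag labels each item
    for u, tag, item in triples:
        if u == user:
            user_tag[tag] += 1.0
        tag_item[tag][item] += 1.0

    total_user = sum(user_tag.values())
    scores = defaultdict(float)
    for tag, count in user_tag.items():
        total_tag = sum(tag_item[tag].values())
        for item, n in tag_item[tag].items():
            # P(tag | user) * P(item | tag), summed over all of the user's tags
            scores[item] += (count / total_user) * (n / total_tag)
    return dict(scores)


if __name__ == "__main__":
    data = [
        ("alice", "python", "book1"),
        ("bob", "python", "book2"),
        ("bob", "ml", "book3"),
    ]
    print(two_step_scores(data, "alice"))
```

In this toy data, alice never interacted with book2, yet it receives a nonzero score because bob used the same tag; this is the kind of transitive association that helps when direct interaction data is sparse.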

IR · Sep 27, 2013
Evaluating the Usefulness of Sentiment Information for Focused Crawlers

Tianjun Fu, Ahmed Abbasi, Daniel Zeng et al.

Despite the prevalence of sentiment-related content on the Web, there has been limited work on focused crawlers capable of effectively collecting such content. In this study, we evaluated the efficacy of using sentiment-related information for enhanced focused crawling of opinion-rich web content regarding a particular topic. We also assessed the impact of using sentiment-labeled web graphs to further improve collection accuracy. Experimental results on a large test bed encompassing over half a million web pages revealed that focused crawlers utilizing sentiment information as well as sentiment-labeled web graphs are capable of gathering more holistic collections of opinion-related content regarding a particular topic. The results have important implications for business and marketing intelligence gathering efforts in the Web 2.0 era.

IR · Mar 9, 2013
The Powerful Model Adpredictor for Search Engine Switching Detection Challenge

Heng Gao, Yongbao Li, Qiudan Li et al.

The purpose of the Switching Detection Challenge in the 2013 WSCD workshop was to predict users' search engine switching actions given records about search sessions and logs. Our solution adopted the powerful prediction model Adpredictor together with feature engineering. We successfully applied the click-through rate (CTR) prediction model Adpredictor in our solution framework; the discovery of effective features and the multi-class classification of different switching types then allowed our model to outperform many competitors. We achieved an AUC score of 0.84255 on the private leaderboard and ranked 5th among all competitors.