LGSep 25, 2023
Follow-ups Also Matter: Improving Contextual Bandits via Post-serving ContextsChaoqi Wang, Ziyu Ye, Zhe Feng et al. · utoronto
Standard contextual bandit problem assumes that all the relevant contexts are observed before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short when dealing with problems in which valuable additional context can be observed after arm selection. For example, content recommendation platforms like Youtube, Instagram, Tiktok also observe valuable follow-up information pertinent to the user's reward after recommendation (e.g., how long the user stayed, what is the user's watch speed, etc.). To improve online learning efficiency in these applications, we study a novel contextual bandit problem with post-serving contexts and design a new algorithm, poLinUCB, that achieves tight regret under standard assumptions. Core to our technical proof is a robustified and generalized version of the well-known Elliptical Potential Lemma (EPL), which can accommodate noise in data. Such robustification is necessary for tackling our problem, and we believe it could also be of general interest. Extensive empirical tests on both synthetic and real-world datasets demonstrate the significant benefit of utilizing post-serving contexts as well as the superior performance of our algorithm over the state-of-the-art approaches.
GTJul 10, 2024
Deep Reinforcement Learning for Sequential Combinatorial AuctionsSai Srivatsa Ravindranath, Zhe Feng, Di Wang et al.
Revenue-optimal auction design is a challenging problem with significant theoretical and practical implications. Sequential auction mechanisms, known for their simplicity and strong strategyproofness guarantees, are often limited by theoretical results that are largely existential, except for certain restrictive settings. Although traditional reinforcement learning methods such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) are applicable in this domain, they struggle with computational demands and convergence issues when dealing with large and continuous action spaces. In light of this and recognizing that we can model transitions differentiable for our settings, we propose using a new reinforcement learning framework tailored for sequential combinatorial auctions that leverages first-order gradients. Our extensive evaluations show that our approach achieves significant improvement in revenue over both analytical baselines and standard reinforcement learning algorithms. Furthermore, we scale our approach to scenarios involving up to 50 agents and 50 items, demonstrating its applicability in complex, real-world auction settings. As such, this work advances the computational tools available for auction design and contributes to bridging the gap between theoretical results and practical implementations in sequential auction design.
LGAug 29, 2022
Online Bidding Algorithms for Return-on-Spend Constrained AdvertisersZhe Feng, Swati Padmanabhan, Di Wang
Online advertising has recently grown into a highly competitive and complex multi-billion-dollar industry, with advertisers bidding for ad slots at large scales and high frequencies. This has resulted in a growing need for efficient "auto-bidding" algorithms that determine the bids for incoming queries to maximize advertisers' targets subject to their specified constraints. This work explores efficient online algorithms for a single value-maximizing advertiser under an increasingly popular constraint: Return-on-Spend (RoS). We quantify efficiency in terms of regret relative to the optimal algorithm, which knows all queries a priori. We contribute a simple online algorithm that achieves near-optimal regret in expectation while always respecting the specified RoS constraint when the input sequence of queries are i.i.d. samples from some distribution. We also integrate our results with the previous work of Balseiro, Lu, and Mirrokni [BLM20] to achieve near-optimal regret while respecting both RoS and fixed budget constraints. Our algorithm follows the primal-dual framework and uses online mirror descent (OMD) for the dual updates. However, we need to use a non-canonical setup of OMD, and therefore the classic low-regret guarantee of OMD, which is for the adversarial setting in online learning, no longer holds. Nonetheless, in our case and more generally where low-regret dynamics are applied in algorithm design, the gradients encountered by OMD can be far from adversarial but influenced by our algorithmic choices. We exploit this key insight to show our OMD setup achieves low regret in the realm of our algorithm.
CVApr 22
LaplacianFormer:Rethinking Linear Attention with Laplacian KernelZhe Feng, Sen Lian, Changwei Wang et al.
The quadratic complexity of softmax attention presents a major obstacle for scaling Transformers to high-resolution vision tasks. Existing linear attention variants often replace the softmax with Gaussian kernels to reduce complexity, but such approximations lack theoretical grounding and tend to oversuppress mid-range token interactions. We propose LaplacianFormer, a Transformer variant that employs a Laplacian kernel as a principled alternative to softmax, motivated by empirical observations and theoretical analysis. To address expressiveness degradation under low-rank approximations, we introduce a provably injective feature map that retains fine-grained token information. For efficient computation, we adopt a Nyström approximation of the kernel matrix and solve the resulting system using Newton--Schulz iteration, avoiding costly matrix inversion and SVD. We further develop custom CUDA implementations for both the kernel and solver, enabling high-throughput forward and backward passes suitable for edge deployment. Experiments on ImageNet show that LaplacianFormer achieves strong performance-efficiency trade-offs while improving attention expressiveness.
GTMay 19
On the Coordination of Value-Maximizing BiddersYanru Guan, Jiahao Zhang, Zhe Feng et al.
While the auto-bidding literature predominantly considers independent bidding, we investigate the coordination problem among multiple auto-bidders in online advertising platforms. Two motivating scenarios are: collaborative bidding among multiple bidders managed by a third-party bidding agent, and strategic bid selection for multiple ad campaigns managed by a single advertiser. We formalize this coordination problem as a theoretical model and investigate the coordination mechanism where only the highest-value bidder competes with outside bidders, while other coordinated bidders refrain from competing. We demonstrate that such a coordination mechanism dominates independent bidding, improving both Return-on-Spend (RoS) compliance and the total value accrued for the participating auto-bidders or ad campaigns, for a broad class of auto-bidding algorithms. Additionally, our simulations on synthetic and real-world datasets support the theoretical result that coordination outperforms independent bidding. These findings highlight both the theoretical potential and the practical robustness of coordinated auto-bidding in online auctions.
GTJun 1, 2023
Pairwise Ranking Losses of Click-Through Rates Prediction for Welfare Maximization in Ad AuctionsBoxiang Lyu, Zhe Feng, Zachary Robertson et al.
We study the design of loss functions for click-through rates (CTR) to optimize (social) welfare in advertising auctions. Existing works either only focus on CTR predictions without consideration of business objectives (e.g., welfare) in auctions or assume that the distribution over the participants' expected cost-per-impression (eCPM) is known a priori, then use various additional assumptions on the parametric form of the distribution to derive loss functions for predicting CTRs. In this work, we bring back the welfare objectives of ad auctions into CTR predictions and propose a novel weighted rankloss to train the CTR model. Compared to existing literature, our approach provides a provable guarantee on welfare but without assumptions on the eCPMs' distribution while also avoiding the intractability of naively applying existing learning-to-rank methods. Further, we propose a theoretically justifiable technique for calibrating the losses using labels generated from a teacher network, only assuming that the teacher network has bounded $\ell_2$ generalization error. Finally, we demonstrate the advantages of the proposed loss on synthetic and real-world data.
LGJun 2, 2022
Incrementality Bidding via Reinforcement Learning under Mixed and Delayed RewardsAshwinkumar Badanidiyuru, Zhe Feng, Tianxi Li et al.
Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user in an internet platform) versus not, is a central object for advertisers in online advertising platforms. This paper investigates the problem of how an advertiser can learn to optimize the bidding sequence in an online manner \emph{without} knowing the incrementality parameters in advance. We formulate the offline version of this problem as a specially structured episodic Markov Decision Process (MDP) and then, for its online learning counterpart, propose a novel reinforcement learning (RL) algorithm with regret at most $\widetilde{O}(H^2\sqrt{T})$, which depends on the number of rounds $H$ and number of episodes $T$, but does not depend on the number of actions (i.e., possible bids). A fundamental difference between our learning problem from standard RL problems is that the realized reward feedback from conversion incrementality is \emph{mixed} and \emph{delayed}. To handle this difficulty we propose and analyze a novel pairwise moment-matching algorithm to learn the conversion incrementality, which we believe is of independent of interest.
GTFeb 16, 2023
User Response in Ad Auctions: An MDP Formulation of Long-Term Revenue OptimizationYang Cai, Zhe Feng, Christopher Liaw et al.
We propose a new Markov Decision Process (MDP) model for ad auctions to capture the user response to the quality of ads, with the objective of maximizing the long-term discounted revenue. By incorporating user response, our model takes into consideration all three parties involved in the auction (advertiser, auctioneer, and user). The state of the user is modeled as a user-specific click-through rate (CTR) with the CTR changing in the next round according to the set of ads shown to the user in the current round. We characterize the optimal mechanism for this MDP as a Myerson's auction with a notion of modified virtual value, which relies on the value distribution of the advertiser, the current user state, and the future impact of showing the ad to the user. Leveraging this characterization, we design a sample-efficient and computationally-efficient algorithm which outputs an approximately optimal policy that requires only sample access to the true MDP and the value distributions of the bidders. Finally, we propose a simple mechanism built upon second price auctions with personalized reserve prices and show it can achieve a constant-factor approximation to the optimal long term discounted revenue.
CLAug 30, 2023
Knowledge-grounded Natural Language Recommendation ExplanationAnthony Colas, Jun Araki, Zhengyu Zhou et al.
Explanations accompanied by a recommendation can assist users in understanding the decision made by recommendation systems, which in turn increases a user's confidence and trust in the system. Recently, research has focused on generating natural language explanations in a human-readable format. Thus far, the proposed approaches leverage item reviews written by users, which are often subjective, sparse in language, and unable to account for new items that have not been purchased or reviewed before. Instead, we aim to generate fact-grounded recommendation explanations that are objectively described with item features while implicitly considering a user's preferences, based on the user's purchase history. To achieve this, we propose a knowledge graph (KG) approach to natural language explainable recommendation. Our approach draws on user-item features through a novel collaborative filtering-based KG representation to produce fact-grounded, personalized explanations, while jointly learning user-item representations for recommendation scoring. Experimental results show that our approach consistently outperforms previous state-of-the-art models on natural language explainable recommendation.
ROApr 3, 2025Code
Multimodal Fusion and Vision-Language Models: A Survey for Robot VisionXiaofeng Han, Shunpeng Chen, Zenghuang Fu et al.
Robot vision has greatly benefited from advancements in multimodal fusion techniques and vision-language models (VLMs). We adopt a task-oriented perspective to systematically review the applications and advancements of multimodal fusion methods and VLMs in the field of robot vision. For semantic scene understanding tasks, we categorize fusion approaches into encoder-decoder frameworks, attention-based architectures, and graph neural networks. Meanwhile, we also analyze the architectural characteristics and practical implementations of these fusion strategies in key tasks such as simultaneous localization and mapping (SLAM), 3D object detection, navigation, and manipulation. We compare the evolutionary paths and applicability of VLMs based on large language models (LLMs) with traditional multimodal fusion methods.Additionally, we conduct an in-depth analysis of commonly used datasets, evaluating their applicability and challenges in real-world robotic scenarios. Building on this analysis, we identify key challenges in current research, including cross-modal alignment, efficient fusion, real-time deployment, and domain adaptation. We propose future directions such as self-supervised learning for robust multimodal representations, structured spatial memory and environment modeling to enhance spatial intelligence, and the integration of adversarial robustness and human feedback mechanisms to enable ethically aligned system deployment. Through a comprehensive review, comparative analysis, and forward-looking discussion, we provide a valuable reference for advancing multimodal perception and interaction in robotic vision. A comprehensive list of studies in this survey is available at https://github.com/Xiaofeng-Han-Res/MF-RV.
IRMar 23
One Model, Two Markets: Bid-Aware Generative RecommendationYanchen Jiang, Zhe Feng, Christopher P. Mah et al.
Generative Recommender Systems using semantic ids, such as TIGER (Rajput et al., 2023), have emerged as a widely adopted competitive paradigm in sequential recommendation. However, existing architectures are designed solely for semantic retrieval and do not address concerns such as monetization via ad revenue and incorporation of bids for commercial retrieval. We propose GEM-Rec, a unified framework that integrates commercial relevance and monetization objectives directly into the generative sequence. We introduce control tokens to decouple the decision of whether to show an ad from which item to show. This allows the model to learn valid placement patterns directly from interaction logs, which inherently reflect past successful ad placements. Complementing this, we devise a Bid-Aware Decoding mechanism that handles real-time pricing, injecting bids directly into the inference process to steer the generation toward high-value items. We prove that this approach guarantees allocation monotonicity, ensuring that higher bids weakly increase an ad's likelihood of being shown without requiring model retraining. Experiments demonstrate that GEM-Rec allows platforms to dynamically optimize for semantic relevance and platform revenue.
CLDec 8, 2023
DelucionQA: Detecting Hallucinations in Domain-specific Question AnsweringMobashir Sadat, Zhengyu Zhou, Lukas Lange et al.
Hallucination is a well-known phenomenon in text generated by large language models (LLMs). The existence of hallucinatory responses is found in almost all application scenarios e.g., summarization, question-answering (QA) etc. For applications requiring high reliability (e.g., customer-facing assistants), the potential existence of hallucination in LLM-generated text is a critical problem. The amount of hallucination can be reduced by leveraging information retrieval to provide relevant background information to the LLM. However, LLMs can still generate hallucinatory content for various reasons (e.g., prioritizing its parametric knowledge over the context, failure to capture the relevant information from the context, etc.). Detecting hallucinations through automated methods is thus paramount. To facilitate research in this direction, we introduce a sophisticated dataset, DelucionQA, that captures hallucinations made by retrieval-augmented LLMs for a domain-specific QA task. Furthermore, we propose a set of hallucination detection methods to serve as baselines for future works from the research community. Analysis and case study are also provided to share valuable insights on hallucination phenomena in the target scenario.
GTApr 11, 2024
Auctions with LLM SummariesKumar Avinava Dubey, Zhe Feng, Rahul Kidambi et al.
We study an auction setting in which bidders bid for placement of their content within a summary generated by a large language model (LLM), e.g., an ad auction in which the display is a summary paragraph of multiple ads. This generalizes the classic ad settings such as position auctions to an LLM generated setting, which allows us to handle general display formats. We propose a novel factorized framework in which an auction module and an LLM module work together via a prediction model to provide welfare maximizing summary outputs in an incentive compatible manner. We provide a theoretical analysis of this framework and synthetic experiments to demonstrate the feasibility and validity of the system together with welfare comparisons.
AIJan 15
FilDeep: Learning Large Deformations of Elastic-Plastic Solids with Multi-Fidelity DataJianheng Tang, Shilong Tao, Zhe Feng et al.
The scientific computation of large deformations in elastic-plastic solids is crucial in various manufacturing applications. Traditional numerical methods exhibit several inherent limitations, prompting Deep Learning (DL) as a promising alternative. The effectiveness of current DL techniques typically depends on the availability of high-quantity and high-accuracy datasets, which are yet difficult to obtain in large deformation problems. During the dataset construction process, a dilemma stands between data quantity and data accuracy, leading to suboptimal performance in the DL models. To address this challenge, we focus on a representative application of large deformations, the stretch bending problem, and propose FilDeep, a Fidelity-based Deep Learning framework for large Deformation of elastic-plastic solids. Our FilDeep aims to resolve the quantity-accuracy dilemma by simultaneously training with both low-fidelity and high-fidelity data, where the former provides greater quantity but lower accuracy, while the latter offers higher accuracy but in less quantity. In FilDeep, we provide meticulous designs for the practical large deformation problem. Particularly, we propose attention-enabled cross-fidelity modules to effectively capture long-range physical interactions across MF data. To the best of our knowledge, our FilDeep presents the first DL framework for large deformation problems using MF data. Extensive experiments demonstrate that our FilDeep consistently achieves state-of-the-art performance and can be efficiently deployed in manufacturing.
IRFeb 29, 2024
Improved Online Learning Algorithms for CTR Prediction in Ad AuctionsZhe Feng, Christopher Liaw, Zixin Zhou
In this work, we investigate the online learning problem of revenue maximization in ad auctions, where the seller needs to learn the click-through rates (CTRs) of each ad candidate and charge the price of the winner through a pay-per-click manner. We focus on two models of the advertisers' strategic behaviors. First, we assume that the advertiser is completely myopic; i.e.~in each round, they aim to maximize their utility only for the current round. In this setting, we develop an online mechanism based on upper-confidence bounds that achieves a tight $O(\sqrt{T})$ regret in the worst-case and negative regret when the values are static across all the auctions and there is a gap between the highest expected value (i.e.~value multiplied by their CTR) and second highest expected value ad. Next, we assume that the advertiser is non-myopic and cares about their long term utility. This setting is much more complex since an advertiser is incentivized to influence the mechanism by bidding strategically in earlier rounds. In this setting, we provide an algorithm to achieve negative regret for the static valuation setting (with a positive gap), which is in sharp contrast with the prior work that shows $O(T^{2/3})$ regret when the valuation is generated by adversary.
LGMar 5, 2025
Online Bidding under RoS Constraints without Knowing the ValueSushant Vijayan, Zhe Feng, Swati Padmanabhan et al.
We consider the problem of bidding in online advertising, where an advertiser aims to maximize value while adhering to budget and Return-on-Spend (RoS) constraints. Unlike prior work that assumes knowledge of the value generated by winning each impression ({e.g.,} conversions), we address the more realistic setting where the advertiser must simultaneously learn the optimal bidding strategy and the value of each impression opportunity. This introduces a challenging exploration-exploitation dilemma: the advertiser must balance exploring different bids to estimate impression values with exploiting current knowledge to bid effectively. To address this, we propose a novel Upper Confidence Bound (UCB)-style algorithm that carefully manages this trade-off. Via a rigorous theoretical analysis, we prove that our algorithm achieves $\widetilde{O}(\sqrt{T\log(|\mathcal{B}|T)})$ regret and constraint violation, where $T$ is the number of bidding rounds and $\mathcal{B}$ is the domain of possible bids. This establishes the first optimal regret and constraint violation bounds for bidding in the online setting with unknown impression values. Moreover, our algorithm is computationally efficient and simple to implement. We validate our theoretical findings through experiments on synthetic data, demonstrating that our algorithm exhibits strong empirical performance compared to existing approaches.
CVFeb 28, 2025
PreMind: Multi-Agent Video Understanding for Advanced Indexing of Presentation-style VideosKangda Wei, Zhengyu Zhou, Bingqing Wang et al.
In recent years, online lecture videos have become an increasingly popular resource for acquiring new knowledge. Systems capable of effectively understanding/indexing lecture videos are thus highly desirable, enabling downstream tasks like question answering to help users efficiently locate specific information within videos. This work proposes PreMind, a novel multi-agent multimodal framework that leverages various large models for advanced understanding/indexing of presentation-style videos. PreMind first segments videos into slide-presentation segments using a Vision-Language Model (VLM) to enhance modern shot-detection techniques. Each segment is then analyzed to generate multimodal indexes through three key steps: (1) extracting slide visual content, (2) transcribing speech narratives, and (3) consolidating these visual and speech contents into an integrated understanding. Three innovative mechanisms are also proposed to improve performance: leveraging prior lecture knowledge to refine visual understanding, detecting/correcting speech transcription errors using a VLM, and utilizing a critic agent for dynamic iterative self-reflection in vision analysis. Compared to traditional video indexing methods, PreMind captures rich, reliable multimodal information, allowing users to search for details like abbreviations shown only on slides. Systematic evaluations on the public LPM dataset and an internal enterprise dataset are conducted to validate PreMind's effectiveness, supported by detailed analyses.
LGApr 6
MAVEN: A Mesh-Aware Volumetric Encoding Network for Simulating 3D Flexible DeformationZhe Feng, Shilong Tao, Haonan Sun et al.
Deep learning-based approaches, particularly graph neural networks (GNNs), have gained prominence in simulating flexible deformations and contacts of solids, due to their ability to handle unstructured physical fields and nonlinear regression on graph structures. However, existing GNNs commonly represent meshes with graphs built solely from vertices and edges. These approaches tend to overlook higher-dimensional spatial features, e.g., 2D facets and 3D cells, from the original geometry. As a result, it is challenging to accurately capture boundary representations and volumetric characteristics, though this information is critically important for modeling contact interactions and internal physical quantity propagation, particularly under sparse mesh discretization. In this paper, we introduce MAVEN, a mesh-aware volumetric encoding network for simulating 3D flexible deformation, which explicitly models geometric mesh elements of higher dimension to achieve a more accurate and natural physical simulation. MAVEN establishes learnable mappings among 3D cells, 2D facets, and vertices, enabling flexible mutual transformations. Explicit geometric features are incorporated into the model to alleviate the burden of implicitly learning geometric patterns. Experimental results show that MAVEN consistently achieves state-of-the-art performance across established datasets and a novel metal stretch-bending task featuring large deformations and prolonged contacts.
LGDec 7, 2023
Learning Thresholds with Latent Values and Censored FeedbackJiahao Zhang, Tao Lin, Weiqiang Zheng et al. · harvard, pku
In this paper, we investigate a problem of actively learning threshold in latent space, where the unknown reward $g(γ, v)$ depends on the proposed threshold $γ$ and latent value $v$ and it can be $only$ achieved if the threshold is lower than or equal to the unknown latent value. This problem has broad applications in practical scenarios, e.g., reserve price optimization in online auctions, online task assignments in crowdsourcing, setting recruiting bars in hiring, etc. We first characterize the query complexity of learning a threshold with the expected reward at most $ε$ smaller than the optimum and prove that the number of queries needed can be infinitely large even when $g(γ, v)$ is monotone with respect to both $γ$ and $v$. On the positive side, we provide a tight query complexity $\tildeΘ(1/ε^3)$ when $g$ is monotone and the CDF of value distribution is Lipschitz. Moreover, we show a tight $\tildeΘ(1/ε^3)$ query complexity can be achieved as long as $g$ satisfies one-sided Lipschitzness, which provides a complete characterization for this problem. Finally, we extend this model to an online learning setting and demonstrate a tight $Θ(T^{2/3})$ regret bound using continuous-arm bandit techniques and the aforementioned query complexity results.
DSOct 24, 2025
A Unified Approach to Submodular Maximization Under NoiseKshipra Bhawalkar, Yang Cai, Zhe Feng et al.
We consider the problem of maximizing a submodular function with access to a noisy value oracle for the function instead of an exact value oracle. Similar to prior work, we assume that the noisy oracle is persistent in that multiple calls to the oracle for a specific set always return the same value. In this model, Hassidim and Singer (2017) design a $(1-1/e)$-approximation algorithm for monotone submodular maximization subject to a cardinality constraint, and Huang et al (2022) design a $(1-1/e)/2$-approximation algorithm for monotone submodular maximization subject to any arbitrary matroid constraint. In this paper, we design a meta-algorithm that allows us to take any "robust" algorithm for exact submodular maximization as a black box and transform it into an algorithm for the noisy setting while retaining the approximation guarantee. By using the meta-algorithm with the measured continuous greedy algorithm, we obtain a $(1-1/e)$-approximation (resp. $1/e$-approximation) for monotone (resp. non-monotone) submodular maximization subject to a matroid constraint under noise. Furthermore, by using the meta-algorithm with the double greedy algorithm, we obtain a $1/2$-approximation for unconstrained (non-monotone) submodular maximization under noise.
CLOct 22, 2025
ToolDreamer: Instilling LLM Reasoning Into Tool RetrieversSaptarshi Sengupta, Zhengyu Zhou, Jun Araki et al.
Tool calling has become increasingly popular for Large Language Models (LLMs). However, for large tool sets, the resulting tokens would exceed the LLM's context window limit, making it impossible to include every tool. Hence, an external retriever is used to provide LLMs with the most relevant tools for a query. Existing retrieval models rank tools based on the similarity between a user query and a tool description (TD). This leads to suboptimal retrieval as user requests are often poorly aligned with the language of TD. To remedy the issue, we propose ToolDreamer, a framework to condition retriever models to fetch tools based on hypothetical (synthetic) TD generated using an LLM, i.e., description of tools that the LLM feels will be potentially useful for the query. The framework enables a more natural alignment between queries and tools within the language space of TD's. We apply ToolDreamer on the ToolRet dataset and show that our method improves the performance of sparse and dense retrievers with and without training, thus showcasing its flexibility. Through our proposed framework, our aim is to offload a portion of the reasoning burden to the retriever so that the LLM may effectively handle a large collection of tools without inundating its context window.
CLSep 29, 2025
Incentive-Aligned Multi-Source LLM SummariesYanchen Jiang, Zhe Feng, Aranyak Mehta
Large language models (LLMs) are increasingly used in modern search and answer systems to synthesize multiple, sometimes conflicting, texts into a single response, yet current pipelines offer weak incentives for sources to be accurate and are vulnerable to adversarial content. We introduce Truthful Text Summarization (TTS), an incentive-aligned framework that improves factual robustness without ground-truth labels. TTS (i) decomposes a draft synthesis into atomic claims, (ii) elicits each source's stance on every claim, (iii) scores sources with an adapted multi-task peer-prediction mechanism that rewards informative agreement, and (iv) filters unreliable sources before re-summarizing. We establish formal guarantees that align a source's incentives with informative honesty, making truthful reporting the utility-maximizing strategy. Experiments show that TTS improves factual accuracy and robustness while preserving fluency, aligning exposure with informative corroboration and disincentivizing manipulation.
GTJun 7, 2024
How to Strategize Human Content Creation in the Era of GenAI?Seyed A. Esmaeili, Kevin Lim, Kshipra Bhawalkar et al.
Generative AI (GenAI) will have significant impact on content creation platforms. In this paper, we study the dynamic competition between a GenAI and a human contributor. Unlike the human, the GenAI's content only improves when more contents are created by the human over time; however, GenAI has the advantage of generating content at a lower cost. We study the algorithmic problem in this dynamic competition model about how the human contributor can maximize her utility when competing against the GenAI for content generation over a set of topics. In time-sensitive content domains (e.g., news or pop music creation) where contents' value diminishes over time, we show that there is no polynomial time algorithm for finding the human's optimal (dynamic) strategy, unless the randomized exponential time hypothesis is false. Fortunately, we are able to design a polynomial time algorithm that naturally cycles between myopically optimizing over a short time window and pausing and provably guarantees an approximation ratio of $\frac{1}{2}$. We then turn to time-insensitive content domains where contents do not lose their value (e.g., contents on history facts). Interestingly, we show that this setting permits a polynomial time algorithm that maximizes the human's utility in the long run. Finally, we conduct simulations that demonstrate the advantage of our algorithms in comparison to a collection of baselines.
AIFeb 22, 2022
Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement LearningJibang Wu, Zixuan Zhang, Zhe Feng et al.
In today's economy, it becomes important for Internet platforms to consider the sequential information design problem to align its long term interest with incentives of the gig service providers. This paper proposes a novel model of sequential information design, namely the Markov persuasion processes (MPPs), where a sender, with informational advantage, seeks to persuade a stream of myopic receivers to take actions that maximizes the sender's cumulative utilities in a finite horizon Markovian environment with varying prior and utility functions. Planning in MPPs thus faces the unique challenge in finding a signaling policy that is simultaneously persuasive to the myopic receivers and inducing the optimal long-term cumulative utilities of the sender. Nevertheless, in the population level where the model is known, it turns out that we can efficiently determine the optimal (resp. $ε$-optimal) policy with finite (resp. infinite) states and outcomes, through a modified formulation of the Bellman equation. Our main technical contribution is to study the MPP under the online reinforcement learning (RL) setting, where the goal is to learn the optimal signaling policy by interacting with with the underlying MPP, without the knowledge of the sender's utility functions, prior distributions, and the Markov transition kernels. We design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of both optimism and pessimism principles. Our algorithm enjoys sample efficiency by achieving a sublinear $\sqrt{T}$-regret upper bound. Furthermore, both our algorithm and theory can be applied to MPPs with large space of outcomes and states via function approximation, and we showcase such a success under the linear setting.
GTJan 29, 2022
A Context-Integrated Transformer-Based Neural Network for Auction DesignZhijian Duan, Jingwu Tang, Yutong Yin et al.
One of the central problems in auction design is developing an incentive-compatible mechanism that maximizes the auctioneer's expected revenue. While theoretical approaches have encountered bottlenecks in multi-item auctions, recently, there has been much progress on finding the optimal mechanism through deep learning. However, these works either focus on a fixed set of bidders and items, or restrict the auction to be symmetric. In this work, we overcome such limitations by factoring \emph{public} contextual information of bidders and items into the auction learning framework. We propose $\mathtt{CITransNet}$, a context-integrated transformer-based neural network for optimal auction design, which maintains permutation-equivariance over bids and contexts while being able to find asymmetric solutions. We show by extensive experiments that $\mathtt{CITransNet}$ can recover the known optimal solutions in single-item settings, outperform strong baselines in multi-item auctions, and generalize well to cases other than those in training.
CLOct 15, 2021
Modeling Endorsement for Multi-Document Abstractive SummarizationLogan Lebanoff, Bingqing Wang, Zhe Feng et al.
A crucial difference between single- and multi-document summarization is how salient content manifests itself in the document(s). While such content may appear at the beginning of a single document, essential information is frequently reiterated in a set of documents related to a particular topic, resulting in an endorsement effect that increases information salience. In this paper, we model the cross-document endorsement effect and its utilization in multiple document summarization. Our method generates a synopsis from each document, which serves as an endorser to identify salient content from other documents. Strongly endorsed text segments are used to enrich a neural encoder-decoder model to consolidate them into an abstractive summary. The method has a great potential to learn from fewer examples to identify salient content, which alleviates the need for costly retraining when the set of documents is dynamically adjusted. Through extensive experiments on benchmark multi-document summarization datasets, we demonstrate the effectiveness of our proposed method over strong published baselines. Finally, we shed light on future research directions and discuss broader challenges of this task using a case study.
LGSep 7, 2021
Learning to Bid in Contextual First Price AuctionsAshwinkumar Badanidiyuru, Zhe Feng, Guru Guruganesh
In this paper, we investigate the problem about how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in the first price auctions: at each time $t$, the learner observes a context $x_t\in \mathbb{R}^d$ and decides the bid based on historical information and $x_t$. We assume a structured linear model of the maximum bid of all the others $m_t = α_0\cdot x_t + z_t$, where $α_0\in \mathbb{R}^d$ is unknown to the learner and $z_t$ is randomly sampled from a noise distribution $\mathcal{F}$ with log-concave density function $f$. We consider both \emph{binary feedback} (the learner can only observe whether she wins or not) and \emph{full information feedback} (the learner can observe $m_t$) at the end of each time $t$. For binary feedback, when the noise distribution $\mathcal{F}$ is known, we propose a bidding algorithm, by using maximum likelihood estimation (MLE) method to achieve at most $\widetilde{O}(\sqrt{\log(d) T})$ regret. Moreover, we generalize this algorithm to the setting with binary feedback and the noise distribution is unknown but belongs to a parametrized family of distributions. For the full information feedback with \emph{unknown} noise distribution, we provide an algorithm that achieves regret at most $\widetilde{O}(\sqrt{dT})$. Our approach combines an estimator for log-concave density functions and then MLE method to learn the noise distribution $\mathcal{F}$ and linear weight $α_0$ simultaneously. We also provide a lower bound result such that any bidding policy in a broad class must achieve regret at least $Ω(\sqrt{T})$, even when the learner receives the full information feedback and $\mathcal{F}$ is known.
GTJul 7, 2021
Deep Learning for Two-Sided MatchingSai Srivatsa Ravindranath, Zhe Feng, Shira Li et al.
We initiate the study of deep learning for the automated design of two-sided matching mechanisms. What is of most interest is to use machine learning to understand the possibility of new tradeoffs between strategy-proofness and stability. These properties cannot be achieved simultaneously, but the efficient frontier is not understood. We introduce novel differentiable surrogates for quantifying ordinal strategy-proofness and stability and use them to train differentiable matching mechanisms that map discrete preferences to valid randomized matchings. We demonstrate that the efficient frontier characterized by these learned mechanisms is substantially better than that achievable through a convex combination of baselines of deferred acceptance (stable and strategy-proof for only one side of the market), top trading cycles (strategy-proof for one side, but not stable), and randomized serial dictatorship (strategy-proof for both sides, but not stable). This gives a new target for economic theory and opens up new possibilities for machine learning pipelines in matching market design.
CLJul 5, 2021
Weakly Supervised Named Entity Tagging with Learnable Logical RulesJiacheng Li, Haibo Ding, Jingbo Shang et al.
We study the problem of building entity tagging systems by using a few rules as weak supervision. Previous methods mostly focus on disambiguation entity types based on contexts and expert-provided rules, while assuming entity spans are given. In this work, we propose a novel method TALLOR that bootstraps high-quality logical rules to train a neural tagger in a fully automated manner. Specifically, we introduce compound rules that are composed from simple rules to increase the precision of boundary detection and generate more diverse pseudo labels. We further design a dynamic label selection strategy to ensure pseudo label quality and therefore avoid overfitting the neural tagger. Experiments on three datasets demonstrate that our method outperforms other weakly supervised methods and even rivals a state-of-the-art distantly supervised tagger with a lexicon of over 2,000 terms when starting from only 20 simple rules. Our method can serve as a tool for rapidly building taggers in emerging domains and tasks. Case studies show that learned rules can potentially explain the predicted entities.
CLApr 13, 2021
GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity RecognitionXinyan Zhao, Haibo Ding, Zhe Feng
Instead of using expensive manual annotations, researchers have proposed to train named entity recognition (NER) systems using heuristic labeling rules. However, devising labeling rules is challenging because it often requires a considerable amount of manual effort and domain expertise. To alleviate this problem, we propose \textsc{GLaRA}, a graph-based labeling rule augmentation framework, to learn new labeling rules from unlabeled data. We first create a graph with nodes representing candidate rules extracted from unlabeled data. Then, we design a new graph neural network to augment labeling rules by exploring the semantic relations between rules. We finally apply the augmented rules on unlabeled data to generate weak labels and train a NER model using the weakly labeled data. We evaluate our method on three NER datasets and find that we can achieve an average improvement of +20\% F1 score over the best baseline when given a small set of seed rules.
CLApr 5, 2021
A New Approach to Overgenerating and Scoring Abstractive SummariesKaiqiang Song, Bingqing Wang, Zhe Feng et al.
We propose a new approach to generate multiple variants of the target summary with diverse content and varying lengths, then score and select admissible ones according to users' needs. Abstractive summarizers trained on single reference summaries may struggle to produce outputs that achieve multiple desirable properties, i.e., capturing the most important information, being faithful to the original, grammatical and fluent. In this paper, we propose a two-staged strategy to generate a diverse set of candidate summaries from the source text in stage one, then score and select admissible ones in stage two. Importantly, our generator gives a precise control over the length of the summary, which is especially well-suited when space is limited. Our selectors are designed to predict the optimal summary length and put special emphasis on faithfulness to the original text. Both stages can be effectively trained, optimized and evaluated. Our experiments on benchmark summarization datasets suggest that this paradigm can achieve state-of-the-art performance.
GTJun 11, 2020
Reserve Price Optimization for First Price AuctionsZhe Feng, Sébastien Lahaie, Jon Schneider et al.
The display advertising industry has recently transitioned from second- to first-price auctions as its primary mechanism for ad allocation and pricing. In light of this, publishers need to re-evaluate and optimize their auction parameters, notably reserve prices. In this paper, we propose a gradient-based algorithm to adaptively update and optimize reserve prices based on estimates of bidders' responsiveness to experimental shocks in reserves. Our key innovation is to draw on the inherent structure of the revenue objective in order to reduce the variance of gradient estimates and improve convergence rates in both theory and practice. We show that revenue in a first-price auction can be usefully decomposed into a \emph{demand} component and a \emph{bidding} component, and introduce techniques to reduce the variance of each component. We characterize the bias-variance trade-offs of these techniques and validate the performance of our proposed algorithm through experiments on synthetic data and real display ad auctions data from Google ad exchange.
CLNov 23, 2019
Controlling the Amount of Verbatim Copying in Abstractive SummarizationKaiqiang Song, Bingqing Wang, Zhe Feng et al.
An abstract must not change the meaning of the original text. A single most effective way to achieve that is to increase the amount of copying while still allowing for text abstraction. Human editors can usually exercise control over copying, resulting in summaries that are more extractive than abstractive, or vice versa. However, it remains poorly understood whether modern neural abstractive summarizers can provide the same flexibility, i.e., learning from single reference summaries to generate multiple summary hypotheses with varying degrees of copying. In this paper, we present a neural summarization model that, by learning from single human abstracts, can produce a broad spectrum of summaries ranging from purely extractive to highly generative ones. We frame the task of summarization as language modeling and exploit alternative mechanisms to generate summary hypotheses. Our method allows for control over copying during both training and decoding stages of a neural summarization model. Through extensive experiments we illustrate the significance of our proposed method on controlling the amount of verbatim copying and achieve competitive results over strong baselines. Our analysis further reveals interesting and unobvious facts.
LGJun 4, 2019
The Intrinsic Robustness of Stochastic Bandits to Strategic ManipulationZhe Feng, David C. Parkes, Haifeng Xu
Motivated by economic applications such as recommender systems, we study the behavior of stochastic bandits algorithms under \emph{strategic behavior} conducted by rational actors, i.e., the arms. Each arm is a \emph{self-interested} strategic player who can modify its own reward whenever pulled, subject to a cross-period budget constraint, in order to maximize its own expected number of times of being pulled. We analyze the robustness of three popular bandit algorithms: UCB, $\varepsilon$-Greedy, and Thompson Sampling. We prove that all three algorithms achieve a regret upper bound $\mathcal{O}(\max \{ B, K\ln T\})$ where $B$ is the total budget across arms, $K$ is the total number of arms and $T$ is length of the time horizon. This regret guarantee holds under \emph{arbitrary adaptive} manipulation strategy of arms. Our second set of main results shows that this regret bound is \emph{tight} -- in fact for UCB it is tight even when we restrict the arms' manipulation strategies to form a \emph{Nash equilibrium}. The lower bound makes use of a simple manipulation strategy, the same for all three algorithms, yielding a bound of $Ω(\max \{ B, K\ln T\})$. Our results illustrate the robustness of classic bandits algorithms against strategic manipulations as long as $B=o(T)$.
GTJan 21, 2019
Online Learning for Measuring Incentive Compatibility in Ad AuctionsZhe Feng, Okke Schrijvers, Eric Sodomka
In this paper we investigate the problem of measuring end-to-end Incentive Compatibility (IC) regret given black-box access to an auction mechanism. Our goal is to 1) compute an estimate for IC regret in an auction, 2) provide a measure of certainty around the estimate of IC regret, and 3) minimize the time it takes to arrive at an accurate estimate. We consider two main problems, with different informational assumptions: In the \emph{advertiser problem} the goal is to measure IC regret for some known valuation $v$, while in the more general \emph{demand-side platform (DSP) problem} we wish to determine the worst-case IC regret over all possible valuations. The problems are naturally phrased in an online learning model and we design $Regret-UCB$ algorithms for both problems. We give an online learning algorithm where for the advertiser problem the error of determining IC shrinks as $O\Big(\frac{|B|}{T}\cdot\Big(\frac{\ln T}{n} + \sqrt{\frac{\ln T}{n}}\Big)\Big)$ (where $B$ is the finite set of bids, $T$ is the number of time steps, and $n$ is number of auctions per time step), and for the DSP problem it shrinks as $O\Big(\frac{|B|}{T}\cdot\Big( \frac{|B|\ln T}{n} + \sqrt{\frac{|B|\ln T}{n}}\Big)\Big)$. For the DSP problem, we also consider stronger IC regret estimation and extend our $Regret-UCB$ algorithm to achieve better IC regret error. We validate the theoretical results using simulations with Generalized Second Price (GSP) auctions, which are known to not be incentive compatible and thus have strictly positive IC regret.
CVJun 18, 2018
An Ensemble of Transfer, Semi-supervised and Supervised Learning Methods for Pathological Heart Sound ClassificationAhmed Imtiaz Humayun, Md. Tauhiduzzaman Khan, Shabnam Ghaffarzadegan et al.
In this work, we propose an ensemble of classifiers to distinguish between various degrees of abnormalities of the heart using Phonocardiogram (PCG) signals acquired using digital stethoscopes in a clinical setting, for the INTERSPEECH 2018 Computational Paralinguistics (ComParE) Heart Beats SubChallenge. Our primary classification framework constitutes a convolutional neural network with 1D-CNN time-convolution (tConv) layers, which uses features transferred from a model trained on the 2016 Physionet Heart Sound Database. We also employ a Representation Learning (RL) approach to generate features in an unsupervised manner using Deep Recurrent Autoencoders and use Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA) classifiers. Finally, we utilize an SVM classifier on a high-dimensional segment-level feature extracted using various functionals on short-term acoustic features, i.e., Low-Level Descriptors (LLD). An ensemble of the three different approaches provides a relative improvement of 11.13% compared to our best single sub-system in terms of the Unweighted Average Recall (UAR) performance metric on the evaluation dataset.
CVJun 15, 2018
Learning Front-end Filter-bank Parameters using Convolutional Neural Networks for Abnormal Heart Sound DetectionAhmed Imtiaz Humayun, Shabnam Ghaffarzadegan, Zhe Feng et al.
Automatic heart sound abnormality detection can play a vital role in the early diagnosis of heart diseases, particularly in low-resource settings. The state-of-the-art algorithms for this task utilize a set of Finite Impulse Response (FIR) band-pass filters as a front-end followed by a Convolutional Neural Network (CNN) model. In this work, we propound a novel CNN architecture that integrates the front-end bandpass filters within the network using time-convolution (tConv) layers, which enables the FIR filter-bank parameters to become learnable. Different initialization strategies for the learnable filters, including random parameters and a set of predefined FIR filter-bank coefficients, are examined. Using the proposed tConv layers, we add constraints to the learnable FIR filters to ensure linear and zero phase responses. Experimental evaluations are performed on a balanced 4-fold cross-validation task prepared using the PhysioNet/CinC 2016 dataset. Results demonstrate that the proposed models yield superior performance compared to the state-of-the-art system, while the linear phase FIR filterbank method provides an absolute improvement of 9.54% over the baseline in terms of an overall accuracy metric.
GTNov 3, 2017
Learning to Bid Without Knowing your ValueZhe Feng, Chara Podimata, Vasilis Syrgkanis
We address online learning in complex auction settings, such as sponsored search auctions, where the value of the bidder is unknown to her, evolving in an arbitrary manner and observed only if the bidder wins an allocation. We leverage the structure of the utility of the bidder and the partial feedback that bidders typically receive in auctions, in order to provide algorithms with regret rates against the best fixed bid in hindsight, that are exponentially faster in convergence in terms of dependence on the action space, than what would have been derived by applying a generic bandit algorithm and almost equivalent to what would have been achieved in the full information setting. Our results are enabled by analyzing a new online learning setting with outcome-based feedback, which generalizes learning with feedback graphs. We provide an online learning algorithm for this setting, of independent interest, with regret that grows only logarithmically with the number of actions and linearly only in the number of potential outcomes (the latter being very small in most auction settings). Last but not least, we show that our algorithm outperforms the bandit approach experimentally and that this performance is robust to dropping some of our theoretical assumptions or introducing noise in the feedback that the bidder receives.
GTJun 12, 2017
Optimal Auctions through Deep Learning: Advances in Differentiable EconomicsPaul Dütting, Zhe Feng, Harikrishna Narasimhan et al.
Designing an incentive compatible auction that maximizes expected revenue is an intricate task. The single-item case was resolved in a seminal piece of work by Myerson in 1981, but more than 40 years later a full analytical understanding of the optimal design still remains elusive for settings with two or more items. In this work, we initiate the exploration of the use of tools from deep learning for the automated design of optimal auctions. We model an auction as a multi-layer neural network, frame optimal auction design as a constrained learning problem, and show how it can be solved using standard machine learning pipelines. In addition to providing generalization bounds, we present extensive experimental results, recovering essentially all known solutions that come from the theoretical analysis of optimal auction design problems and obtaining novel mechanisms for settings in which the optimal mechanism is unknown.