Yin Sun

LG
h-index8
16papers
205citations
Novelty52%
AI Score55

16 Papers

IRJun 3
Rethinking Sales Lead Scoring with LLM-based Hierarchical Preference Ranking

Chenyu Zhang, Yiwen Liu, Yin Sun et al.

Sales lead conversion in high-stakes domains (e.g., automotive, real estate) differs fundamentally from e-commerce recommendation due to prolonged decision cycles and multi-stage funnels. Traditional lead scoring methods rule-based scorecards, machine learning, or pointwise CTR models face severe challenges: sparse supervision, a semantic gap in unstructured CRM logs, and inability to capture relative lead priority. While Large Language Models(LLMs) offer superior semantic understanding of customer interactions, general-purpose LLMs are ill-suited for lead ranking: they generate text rather than comparable scores, and lack alignment with the hierarchical priorities of sales funnels. We introduce an LLM-based discriminative framework for sales lead scoring, which supports joint modeling of structured CRM features and unstructured customer interactions. On top of this framework, we propose HPRO (Hierarchical Preference Ranking Optimization), which augments sales lead scoring with a hierarchical preference ranking objective. HPRO employs a margin-aware Bradley-Terry formulation to transform sparse binary labels into dense, funnel-aware preference pairs, enabling lead scoring to leverage both pointwise and pairwise supervision. Experiments on large-scale data from a leading NEV brand demonstrate state-of-the-art classification (AUC 0.8161) and ranking performance (+39.7% precision among top-ranked leads). A 132-day online A/B test validates 9.5% sales volume uplift, confirming real-world commercial impact.

ITMay 22, 2019
Optimal Status Updating with a Finite-Battery Energy Harvesting Source

Baran Tan Bacinoglu, Yin Sun, Elif Uysal et al.

We consider an energy harvesting source equipped with a finite battery, which needs to send timely status updates to a remote destination. The timeliness of status updates is measured by a non-decreasing penalty function of the Age of Information (AoI). The problem is to find a policy for generating updates that achieves the lowest possible time-average expected age penalty among all online policies. We prove that one optimal solution of this problem is a monotone threshold policy, which satisfies (i) each new update is sent out only when the age is higher than a threshold and (ii) the threshold is a non-increasing function of the instantaneous battery level. Let $τ_B$ denote the optimal threshold corresponding to the full battery level $B$, and $p(\cdot)$ denote the age-penalty function, then we can show that $p(τ_B)$ is equal to the optimum objective value, i.e., the minimum achievable time-average expected age penalty. These structural properties are used to develop an algorithm to compute the optimal thresholds. Our numerical analysis indicates that the improvement in average age with added battery capacity is largest at small battery sizes; specifically, more than half the total possible reduction in age is attained when battery storage increases from one transmission's worth of energy to two. This encourages further study of status update policies for sensors with small battery storage.

MLOct 4, 2023
Hoeffding's Inequality for Markov Chains under Generalized Concentrability Condition

Hao Chen, Abhishek Gupta, Yin Sun et al.

This paper studies Hoeffding's inequality for Markov chains under the generalized concentrability condition defined via integral probability metric (IPM). The generalized concentrability condition establishes a framework that interpolates and extends the existing hypotheses of Markov chain Hoeffding-type inequalities. The flexibility of our framework allows Hoeffding's inequality to be applied beyond the ergodic Markov chains in the traditional sense. We demonstrate the utility by applying our framework to several non-asymptotic analyses arising from the field of machine learning, including (i) a generalization bound for empirical risk minimization with Markovian samples, (ii) a finite sample guarantee for Ployak-Ruppert averaging of SGD, and (iii) a new regret bound for rested Markovian bandits with general state space.

NIAug 15, 2022
How Does Data Freshness Affect Real-time Supervised Learning?

Md Kamran Chowdhury Shisher, Yin Sun

In this paper, we analyze the impact of data freshness on real-time supervised learning, where a neural network is trained to infer a time-varying target (e.g., the position of the vehicle in front) based on features (e.g., video frames) observed at a sensing node (e.g., camera or lidar). One might expect that the performance of real-time supervised learning degrades monotonically as the feature becomes stale. Using an information-theoretic analysis, we show that this is true if the feature and target data sequence can be closely approximated as a Markov chain; it is not true if the data sequence is far from Markovian. Hence, the prediction error of real-time supervised learning is a function of the Age of Information (AoI), where the function could be non-monotonic. Several experiments are conducted to illustrate the monotonic and non-monotonic behaviors of the prediction error. To minimize the inference error in real-time, we propose a new "selection-from-buffer" model for sending the features, which is more general than the "generate-at-will" model used in earlier studies. By using Gittins and Whittle indices, low-complexity scheduling strategies are developed to minimize the inference error, where a new connection between the Gittins index theory and Age of Information (AoI) minimization is discovered. These scheduling results hold (i) for minimizing general AoI functions (monotonic or non-monotonic) and (ii) for general feature transmission time distributions. Data-driven evaluations are presented to illustrate the benefits of the proposed scheduling algorithms.

SYOct 26, 2016
Checks and Balances: A Low-complexity High-gain Uplink Power Controller for CoMP

Fangzhou Chen, Yin Sun, Yiping Qin et al.

Coordinated Multipoint (CoMP) promised substantial throughput gain for next-generation cellular systems. However, realizing this gain is costly in terms of pilots and backhaul bandwidth, and may require substantial modifications in physicallayer hardware. Targeting efficient throughput gain, we develop a novel coordinated power control scheme for uplink cellular networks called Checks and Balances (C&B), which checks the received signal strength of one user and its generated interference to neighboring base stations, and balances the two. C&B has some highly attractive advantages: C&B (i) can be implemented easily in software, (ii) does not require to upgrade non-CoMP physicallayer hardware, (iii) allows for fully distributed implementation for each user equipment (UE), and (iv) does not need extra pilots or backhaul communications. We evaluate the throughput performance of C&B on an uplink LTE system-level simulation platform, which is carefully calibrated with Huawei. Our simulation results show that C&B achieves much better throughput performance, compared to several widely-used power control schemes.

NIMar 11
Goal-Oriented Status Updating for Real-time Remote Inference over Networks with Two-Way Delay

Cagri Ari, Md Kamran Chowdhury Shisher, Yin Sun et al.

We study a setting where an intelligent model (e.g., a pre-trained neural network) infers the real-time value of a target signal using data samples transmitted from a remote source. The transmission scheduler decides (i) the freshness of packets, (ii) their length (i.e., the number of samples they contain), and (iii) when they should be transmitted. The freshness is quantified using the Age of Information (AoI), and the inference quality for a given packet length is a general function of AoI. Previous works assumed i.i.d. transmission delays with immediate feedback or were restricted to the case where inference performance degrades as the input data ages. Our formulation, in addition to capturing non-monotone age dependence, also covers Markovian delay on both forward and feedback links. We model this as an infinite-horizon average-cost Semi-Markov Decision Process. We obtain a closed-form solution that decides on (i) and (iii) for any constant packet length. The solution for when to transmit is an index-based threshold policy, where the index function is expressed in terms of the delay state and AoI at the receiver. In contrast, the freshness of the selected packet is a function of only the delay state. We then separately optimize the value of the constant packet length. Moreover, we also develop an index-based threshold policy for the time-variable packet length case, which allows a complexity reduction. In simulation results, we observe that our goal-oriented scheduler drops inference error down to one-sixth with respect to the age-based scheduling of unit-length packets.

NIFeb 23
Adaptive Underwater Acoustic Communications with Limited Feedback: An AoI-Aware Hierarchical Bandit Approach

Fabio Busacca, Andrea Panebianco, Yin Sun

Underwater Acoustic (UWA) networks are vital for remote sensing and ocean exploration but face inherent challenges such as limited bandwidth, long propagation delays, and highly dynamic channels. These constraints hinder real-time communication and degrade overall system performance. To address these challenges, this paper proposes a bilevel Multi-Armed Bandit (MAB) framework. At the fast inner level, a Contextual Delayed MAB (CD-MAB) jointly optimizes adaptive modulation and transmission power based on both channel state feedback and its Age of Information (AoI), thereby maximizing throughput. At the slower outer level, a Feedback Scheduling MAB dynamically adjusts the channel-state feedback interval according to throughput dynamics: stable throughput allows longer update intervals, while throughput drops trigger more frequent updates. This adaptive mechanism reduces feedback overhead and enhances responsiveness to varying network conditions. The proposed bilevel framework is computationally efficient and well-suited to resource-constrained UWA networks. Simulation results using the DESERT Underwater Network Simulator demonstrate throughput gains of up to 20.61% and energy savings of up to 36.60% compared with Deep Reinforcement Learning (DRL) baselines reported in the existing literature.

CLOct 22, 2025Code
Balancing Rewards in Text Summarization: Multi-Objective Reinforcement Learning via HyperVolume Optimization

Junjie Song, Yiwen Liu, Dapeng Li et al.

Text summarization is a crucial task that requires the simultaneous optimization of multiple objectives, including consistency, coherence, relevance, and fluency, which presents considerable challenges. Although large language models (LLMs) have demonstrated remarkable performance, enhanced by reinforcement learning (RL), few studies have focused on optimizing the multi-objective problem of summarization through RL based on LLMs. In this paper, we introduce hypervolume optimization (HVO), a novel optimization strategy that dynamically adjusts the scores between groups during the reward process in RL by using the hypervolume method. This method guides the model's optimization to progressively approximate the pareto front, thereby generating balanced summaries across multiple objectives. Experimental results on several representative summarization datasets demonstrate that our method outperforms group relative policy optimization (GRPO) in overall scores and shows more balanced performance across different dimensions. Moreover, a 7B foundation model enhanced by HVO performs comparably to GPT-4 in the summarization task, while maintaining a shorter generation length. Our code is publicly available at https://github.com/ai4business-LiAuto/HVO.git

NIApr 25, 2024
Timely Communications for Remote Inference

Md Kamran Chowdhury Shisher, Yin Sun, I-Hong Hou

In this paper, we analyze the impact of data freshness on remote inference systems, where a pre-trained neural network blue infers a time-varying target (e.g., the locations of vehicles and pedestrians) based on features (e.g., video frames) observed at a sensing node (e.g., a camera). One might expect that the performance of a remote inference system degrades monotonically as the feature becomes stale. Using an information-theoretic analysis, we show that this is true if the feature and target data sequence can be closely approximated as a Markov chain, whereas it is not true if the data sequence is far from being Markovian. Hence, the inference error is a function of Age of Information (AoI), where the function could be non-monotonic. To minimize the inference error in real-time, we propose a new "selection-from-buffer" model for sending the features, which is more general than the "generate-at-will" model used in earlier studies. In addition, we design low-complexity scheduling policies to improve inference performance. For single-source, single-channel systems, we provide an optimal scheduling policy. In multi-source, multi-channel systems, the scheduling problem becomes a multi-action restless multi-armed bandit problem. For this setting, we design a new scheduling policy by integrating Whittle index-based source selection and duality-based feature selection-from-buffer algorithms. This new scheduling policy is proven to be asymptotically optimal. These scheduling results hold for minimizing general AoI functions (monotonic or non-monotonic). Data-driven evaluations demonstrate the significant advantages of our proposed scheduling policies.

LGMar 5, 2024
Learning-augmented Online Minimization of Age of Information and Transmission Costs

Zhongdong Liu, Keyuan Zhang, Bin Li et al.

We consider a discrete-time system where a resource-constrained source (e.g., a small sensor) transmits its time-sensitive data to a destination over a time-varying wireless channel. Each transmission incurs a fixed transmission cost (e.g., energy cost), and no transmission results in a staleness cost represented by the Age-of-Information. The source must balance the tradeoff between transmission and staleness costs. To address this challenge, we develop a robust online algorithm to minimize the sum of transmission and staleness costs, ensuring a worst-case performance guarantee. While online algorithms are robust, they are usually overly conservative and may have a poor average performance in typical scenarios. In contrast, by leveraging historical data and prediction models, machine learning (ML) algorithms perform well in average cases. However, they typically lack worst-case performance guarantees. To achieve the best of both worlds, we design a learning-augmented online algorithm that exhibits two desired properties: (i) consistency: closely approximating the optimal offline algorithm when the ML prediction is accurate and trusted; (ii) robustness: ensuring worst-case performance guarantee even ML predictions are inaccurate. Finally, we perform extensive simulations to show that our online algorithm performs well empirically and that our learning-augmented algorithm achieves both consistency and robustness.

LGAug 11, 2025
Multimodal Remote Inference

Keyuan Zhang, Yin Sun, Bo Ji

We consider a remote inference system with multiple modalities, where a multimodal machine learning (ML) model performs real-time inference using features collected from remote sensors. When sensor observations evolve dynamically over time, fresh features are critical for inference tasks. However, timely delivery of features from all modalities is often infeasible because of limited network resources. Towards this end, in this paper, we study a two-modality scheduling problem that seeks to minimize the ML model's inference error, expressed as a penalty function of the Age of Information (AoI) vector of the two modalities. We develop an index-based threshold policy and prove its optimality. Specifically, the scheduler switches to the other modality once the current modality's index function exceeds a predetermined threshold. We show that both modalities share the same threshold and that the index functions and the threshold can be computed efficiently. Our optimality results hold for general AoI functions (which could be non-monotonic and non-separable) and heterogeneous transmission times across modalities. To demonstrate the importance of considering a task-oriented AoI function, we conduct numerical experiments based on robot state prediction and compare our policy with round-robin and uniform random policies (both are oblivious to the AoI and the inference error).n The results show that our policy reduces inference error by up to 55% compared with these baselines.

IRSep 10, 2025
asLLR: LLM based Leads Ranking in Auto Sales

Yin Sun, Yiwen Liu, Junjie Song et al.

In the area of commercial auto sales system, high-quality lead score sequencing determines the priority of a sale's work and is essential for optimizing the efficiency of the sales system. Since CRM (Customer Relationship Management) system contains plenty of textual interaction features between sales and customers, traditional techniques such as Click Through Rate (CTR) prediction struggle with processing the complex information inherent in natural language features, which limits their effectiveness in sales lead ranking. Bridging this gap is critical for enhancing business intelligence and decision-making. Recently, the emergence of large language models (LLMs) has opened new avenues for improving recommendation systems, this study introduces asLLR (LLM-based Leads Ranking in Auto Sales), which integrates CTR loss and Question Answering (QA) loss within a decoder-only large language model architecture. This integration enables the simultaneous modeling of both tabular and natural language features. To verify the efficacy of asLLR, we constructed an innovative dataset derived from the customer lead pool of a prominent new energy vehicle brand, with 300,000 training samples and 40,000 testing samples. Our experimental results demonstrate that asLLR effectively models intricate patterns in commercial datasets, achieving the AUC of 0.8127, surpassing traditional CTR estimation methods by 0.0231. Moreover, asLLR enhances CTR models when used for extracting text features by 0.0058. In real-world sales scenarios, after rigorous online A/B testing, asLLR increased the sales volume by about 9.5% compared to the traditional method, providing a valuable tool for business intelligence and operational decision-making.

LGFeb 9, 2022
A Local Geometric Interpretation of Feature Extraction in Deep Feedforward Neural Networks

Md Kamran Chowdhury Shisher, Tasmeen Zaman Ornee, Yin Sun

In this paper, we present a local geometric analysis to interpret how deep feedforward neural networks extract low-dimensional features from high-dimensional data. Our study shows that, in a local geometric region, the optimal weight in one layer of the neural network and the optimal feature generated by the previous layer comprise a low-rank approximation of a matrix that is determined by the Bayes action of this layer. This result holds (i) for analyzing both the output layer and the hidden layers of the neural network, and (ii) for neuron activation functions with non-vanishing gradients. We use two supervised learning problems to illustrate our results: neural network based maximum likelihood classification (i.e., softmax regression) and neural network based minimum mean square estimation. Experimental validation of these theoretical results will be conducted in our future work.

LGFeb 27, 2021
The Age of Correlated Features in Supervised Learning based Forecasting

Md Kamran Chowdhury Shisher, Heyang Qin, Lei Yang et al.

In this paper, we analyze the impact of information freshness on supervised learning based forecasting. In these applications, a neural network is trained to predict a time-varying target (e.g., solar power), based on multiple correlated features (e.g., temperature, humidity, and cloud coverage). The features are collected from different data sources and are subject to heterogeneous and time-varying ages. By using an information-theoretic approach, we prove that the minimum training loss is a function of the ages of the features, where the function is not always monotonic. However, if the empirical distribution of the training data is close to the distribution of a Markov chain, then the training loss is approximately a non-decreasing age function. Both the training loss and testing loss depict similar growth patterns as the age increases. An experiment on solar power prediction is conducted to validate our theory. Our theoretical and experimental results suggest that it is beneficial to (i) combine the training data with different age values into a large training dataset and jointly train the forecasting decisions for these age values, and (ii) feed the age value as a part of the input feature to the neural network.

LGOct 13, 2020
Multi-Armed Bandits with Dependent Arms

Rahul Singh, Fang Liu, Yin Sun et al.

We study a variant of the classical multi-armed bandit problem (MABP) which we call as Multi-Armed Bandits with dependent arms. More specifically, multiple arms are grouped together to form a cluster, and the reward distributions of arms belonging to the same cluster are known functions of an unknown parameter that is a characteristic of the cluster. Thus, pulling an arm $i$ not only reveals information about its own reward distribution, but also about all those arms that share the same cluster with arm $i$. This "correlation" amongst the arms complicates the exploration-exploitation trade-off that is encountered in the MABP because the observation dependencies allow us to test simultaneously multiple hypotheses regarding the optimality of an arm. We develop learning algorithms based on the UCB principle which utilize these additional side observations appropriately while performing exploration-exploitation trade-off. We show that the regret of our algorithms grows as $O(K\log T)$, where $K$ is the number of clusters. In contrast, for an algorithm such as the vanilla UCB that is optimal for the classical MABP and does not utilize these dependencies, the regret scales as $O(M\log T)$ where $M$ is the number of arms.

SYMay 3, 2019
Sampling for Data Freshness Optimization: Non-linear Age Functions

Yin Sun, Benjamin Cyr

In this paper, we study how to take samples at a data source for improving the freshness of received data samples at a remote receiver. We use non-linear functions of the age of information to measure data freshness, and provide a survey of non-linear age functions and their applications. The sampler design problem is studied to optimize these data freshness metrics, even when there is a sampling rate constraint. This sampling problem is formulated as a constrained Markov decision process (MDP) with a possibly uncountable state space. We present a complete characterization of the optimal solution to this MDP: The optimal sampling policy is a deterministic or randomized threshold policy, where the threshold and the randomization probabilities are characterized based on the optimal objective value of the MDP and the sampling rate constraint. The optimal sampling policy can be computed by bisection search, and the curse of dimensionality is circumvented. These age optimality results hold for (i) general data freshness metrics represented by monotonic functions of the age of information, (ii) general service time distributions of the queueing server, (iii) both continuoustime and discrete-time sampling problems, and (iv) sampling problems both with and without the sampling rate constraint. Numerical results suggest that the optimal sampling policies can be much better than zero-wait sampling and the classic uniform sampling.