SIJul 18, 2022
A Community-Aware Framework for Social Influence MaximizationAbhishek K. Umrawal, Christopher J. Quinn, Vaneet Aggarwal
We consider the problem of Influence Maximization (IM), the task of selecting $k$ seed nodes in a social network such that the expected number of nodes influenced is maximized. We propose a community-aware divide-and-conquer framework that involves (i) learning the inherent community structure of the social network, (ii) generating candidate solutions by solving the influence maximization problem for each community, and (iii) selecting the final set of seed nodes using a novel progressive budgeting scheme. Our experiments on real-world social networks show that the proposed framework outperforms the standard methods in terms of run-time and the heuristic methods in terms of influence. We also study the effect of the community structure on the performance of the proposed framework. Our experiments show that the community structures with higher modularity lead the proposed framework to perform better in terms of run-time and influence.
SIJul 8, 2024
Fractional Budget Allocation for Influence Maximization under General Marketing StrategiesAkhil Bhimaraju, Eliot W. Robson, Lav R. Varshney et al.
We consider the fractional influence maximization problem, i.e., identifying users on a social network to be incentivized with potentially partial discounts to maximize the influence on the network. The larger the discount given to a user, the higher the likelihood of its activation (adopting a new product or innovation), who then attempts to activate its neighboring users, causing a cascade effect of influence through the network. Our goal is to devise efficient algorithms that assign initial discounts to the network's users to maximize the total number of activated users at the end of the cascade, subject to a constraint on the total sum of discounts given. In general, the activation likelihood could be any non-decreasing function of the discount, whereas, our focus lies on the case when the activation likelihood is an affine function of the discount, potentially varying across different users. As this problem is shown to be NP-hard, we propose and analyze an efficient (1-1/e)-approximation algorithm. Furthermore, we run experiments on real-world social networks to show the performance and scalability of our method.
COSep 1, 2022
A Genetic Algorithm-based Framework for Learning Statistical Power ManifoldAbhishek K. Umrawal, Sean P. Lane, Erin P. Hennes
Statistical power is a measure of the replicability of a categorical hypothesis test. Formally, it is the probability of detecting an effect, if there is a true effect present in the population. Hence, optimizing statistical power as a function of some parameters of a hypothesis test is desirable. However, for most hypothesis tests, the explicit functional form of statistical power for individual model parameters is unknown; but calculating power for a given set of values of those parameters is possible using simulated experiments. These simulated experiments are usually computationally expensive. Hence, developing the entire statistical power manifold using simulations can be very time-consuming. We propose a novel genetic algorithm-based framework for learning statistical power manifolds. For a multiple linear regression $F$-test, we show that the proposed algorithm/framework learns the statistical power manifold much faster as compared to a brute-force approach as the number of queries to the power oracle is significantly reduced. We also show that the quality of learning the manifold improves as the number of iterations increases for the genetic algorithm. Such tools are useful for evaluating statistical power trade-offs when researchers have little information regarding a priori best guesses of primary effect sizes of interest or how sampling variability in non-primary effects impacts power for primary ones.
LGJan 2
LOFA: Online Influence Maximization under Full-Bandit Feedback using Lazy Forward SelectionJinyu Xu, Abhishek K. Umrawal
We study the problem of influence maximization (IM) in an online setting, where the goal is to select a subset of nodes$\unicode{x2014}$called the seed set$\unicode{x2014}$at each time step over a fixed time horizon, subject to a cardinality budget constraint, to maximize the expected cumulative influence. We operate under a full-bandit feedback model, where only the influence of the chosen seed set at each time step is observed, with no additional structural information about the network or diffusion process. It is well-established that the influence function is submodular, and existing algorithms exploit this property to achieve low regret. In this work, we leverage this property further and propose the Lazy Online Forward Algorithm (LOFA), which achieves a lower empirical regret. We conduct experiments on a real-world social network to demonstrate that LOFA achieves superior performance compared to existing bandit algorithms in terms of cumulative regret and instantaneous reward.
SIDec 30, 2025
A Community-Aware Framework for Influence Maximization with Explicit Accounting for Inter-Community InfluenceEliot W. Robson, Abhishek K. Umrawal
Influence Maximization (IM) seeks to identify a small set of seed nodes in a social network to maximize expected information spread under a diffusion model. While community-based approaches improve scalability by exploiting modular structure, they typically assume independence between communities, overlooking inter-community influence$\unicode{x2014}$a limitation that reduces effectiveness in real-world networks. We introduce Community-IM++, a scalable framework that explicitly models cross-community diffusion through a principled heuristic based on community-based diffusion degree (CDD) and a progressive budgeting strategy. The algorithm partitions the network, computes CDD to prioritize bridging nodes, and allocates seeds adaptively across communities using lazy evaluation to minimize redundant computations. Experiments on large real-world social networks under different edge weight models show that Community-IM++ achieves near-greedy influence spread at up to 100 times lower runtime, while outperforming Community-IM and degree heuristics across budgets and structural conditions. These results demonstrate the practicality of Community-IM++ for large-scale applications such as viral marketing, misinformation control, and public health campaigns, where efficiency and cross-community reach are critical.
SISep 16, 2025Code
A Pressure-Based Diffusion Model for Influence Maximization on Social NetworksCurt Stutsman, Eliot W. Robson, Abhishek K. Umrawal
In many real-world scenarios, an individual's local social network carries significant influence over the opinions they form and subsequently propagate to others. In this paper, we propose a novel diffusion model -- the Pressure Threshold model (PT) -- for dynamically simulating the spread of influence through a social network. This new model extends the popular Linear Threshold Model (LT) by adjusting a node's outgoing influence proportional to the influence it receives from its activated neighbors. We address the Influence Maximization (IM) problem, which involves selecting the most effective seed nodes to achieve maximal graph coverage after a diffusion process, and how the problem manifests with the PT Model. Experiments conducted on real-world networks, facilitated by enhancements to the open-source network-diffusion Python library, CyNetDiff, demonstrate unique seed node selection for the PT Model when compared to the LT Model. Moreover, analyses demonstrate that densely connected networks amplify pressure effects more significantly than sparse networks.
LGMar 5, 2021Code
DeepFreight: Integrating Deep Reinforcement Learning and Mixed Integer Programming for Multi-transfer Truck Freight DeliveryJiayu Chen, Abhishek K. Umrawal, Tian Lan et al.
With the freight delivery demands and shipping costs increasing rapidly, intelligent control of fleets to enable efficient and cost-conscious solutions becomes an important problem. In this paper, we propose DeepFreight, a model-free deep-reinforcement-learning-based algorithm for multi-transfer freight delivery, which includes two closely-collaborative components: truck-dispatch and package-matching. Specifically, a deep multi-agent reinforcement learning framework called QMIX is leveraged to learn a dispatch policy, with which we can obtain the multi-step joint vehicle dispatch decisions for the fleet with respect to the delivery requests. Then an efficient multi-transfer matching algorithm is executed to assign the delivery requests to the trucks. Also, DeepFreight is integrated with a Mixed-Integer Linear Programming optimizer for further optimization. The evaluation results show that the proposed system is highly scalable and ensures a 100\% delivery success while maintaining low delivery-time and fuel consumption. The codes are available at https://github.com/LucasCJYSDL/DeepFreight.
SIApr 25, 2024
CyNetDiff -- A Python Library for Accelerated Implementation of Network Diffusion ModelsEliot W. Robson, Dhemath Reddy, Abhishek K. Umrawal
In recent years, there has been increasing interest in network diffusion models and related problems. The most popular of these are the independent cascade and linear threshold models. Much of the recent experimental work done on these models requires a large number of simulations conducted on large graphs, a computationally expensive task suited for low-level languages. However, many researchers prefer the use of higher-level languages (such as Python) for their flexibility and shorter development times. Moreover, in many research tasks, these simulations are the most computationally intensive task, so it would be desirable to have a library for these with an interface to a high-level language with the performance of a low-level language. To fill this niche, we introduce CyNetDiff, a Python library with components written in Cython to provide improved performance for these computationally intensive diffusion tasks.
IRJul 3, 2025
Content filtering methods for music recommendation: A reviewTerence Zeng, Abhishek K. Umrawal
Recommendation systems have become essential in modern music streaming platforms, shaping how users discover and engage with songs. One common approach in recommendation systems is collaborative filtering, which suggests content based on the preferences of users with similar listening patterns to the target user. However, this method is less effective on media where interactions are sparse. Music is one such medium, since the average user of a music streaming service will never listen to the vast majority of tracks. Due to this sparsity, there are several challenges that have to be addressed with other methods. This review examines the current state of research in addressing these challenges, with an emphasis on the role of content filtering in mitigating biases inherent in collaborative filtering approaches. We explore various methods of song classification for content filtering, including lyrical analysis using Large Language Models (LLMs) and audio signal processing techniques. Additionally, we discuss the potential conflicts between these different analysis methods and propose avenues for resolving such discrepancies.
CLFeb 28, 2025
JAM: Controllable and Responsible Text Generation via Causal Reasoning and Latent Vector ManipulationYingbing Huang, Deming Chen, Abhishek K. Umrawal
While large language models (LLMs) have made significant strides in generating coherent and contextually relevant text, they often function as opaque black boxes, trained on vast unlabeled datasets with statistical objectives, lacking an interpretable framework for responsible control. In this paper, we introduce JAM (Just A Move), a novel framework that interprets and controls text generation by integrating cause-effect analysis within the latent space of LLMs. Based on our observations, we uncover the inherent causality in LLM generation, which is critical for producing responsible and realistic outputs. Moreover, we explore latent vectors as fundamental components in LLM architectures, aiming to understand and manipulate them for more effective and efficient controllable text generation. We evaluate our framework using a range of tools, including the HHH criteria, toxicity reduction benchmarks, and GPT-4 alignment measures. Our results show that JAM achieves up to a 22% improvement over previous Controllable Text Generation (CTG) methods across multiple quantitative metrics and human-centric evaluations. Furthermore, JAM demonstrates greater computational efficiency compared to other CTG methods. These results highlight the effectiveness and efficiency of JAM for responsible and realistic text generation, paving the way for more interpretable and controllable models.
EMOct 23, 2021
On Parameter Estimation in Unobserved Components Models subject to Linear Inequality ConstraintsAbhishek K. Umrawal, Joshua C. C. Chan
We propose a new \textit{quadratic programming-based} method of approximating a nonstandard density using a multivariate Gaussian density. Such nonstandard densities usually arise while developing posterior samplers for unobserved components models involving inequality constraints on the parameters. For instance, Chan et al. (2016) provided a new model of trend inflation with linear inequality constraints on the stochastic trend. We implemented the proposed quadratic programming-based method for this model and compared it to the existing approximation. We observed that the proposed method works as well as the existing approximation in terms of the final trend estimates while achieving gains in terms of sample efficiency.
AIJul 27, 2020
FlexPool: A Distributed Model-Free Deep Reinforcement Learning Algorithm for Joint Passengers & Goods TransportationKaushik Manchella, Abhishek K. Umrawal, Vaneet Aggarwal
The growth in online goods delivery is causing a dramatic surge in urban vehicle traffic from last-mile deliveries. On the other hand, ride-sharing has been on the rise with the success of ride-sharing platforms and increased research on using autonomous vehicle technologies for routing and matching. The future of urban mobility for passengers and goods relies on leveraging new methods that minimize operational costs and environmental footprints of transportation systems. This paper considers combining passenger transportation with goods delivery to improve vehicle-based transportation. Even though the problem has been studied with a defined dynamics model of the transportation system environment, this paper considers a model-free approach that has been demonstrated to be adaptable to new or erratic environment dynamics. We propose FlexPool, a distributed model-free deep reinforcement learning algorithm that jointly serves passengers & goods workloads by learning optimal dispatch policies from its interaction with the environment. The proposed algorithm pools passengers for a ride-sharing service and delivers goods using a multi-hop transit method. These flexibilities decrease the fleet's operational cost and environmental footprint while maintaining service levels for passengers and goods. Through simulations on a realistic multi-agent urban mobility platform, we demonstrate that FlexPool outperforms other model-free settings in serving the demands from passengers & goods. FlexPool achieves 30% higher fleet utilization and 35% higher fuel efficiency in comparison to (i) model-free approaches where vehicles transport a combination of passengers & goods without the use of multi-hop transit, and (ii) model-free approaches where vehicles exclusively transport either passengers or goods.
LGNov 29, 2018
Stochastic Top-$K$ Subset Bandits with Linear Space and Non-Linear FeedbackMridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn et al.
Many real-world problems like Social Influence Maximization face the dilemma of choosing the best $K$ out of $N$ options at a given time instant. This setup can be modeled as a combinatorial bandit which chooses $K$ out of $N$ arms at each time, with an aim to achieve an efficient trade-off between exploration and exploitation. This is the first work for combinatorial bandits where the feedback received can be a non-linear function of the chosen $K$ arms. The direct use of multi-armed bandit requires choosing among $N$-choose-$K$ options making the state space large. In this paper, we present a novel algorithm which is computationally efficient and the storage is linear in $N$. The proposed algorithm is a divide-and-conquer based strategy, that we call CMAB-SM. Further, the proposed algorithm achieves a \textit{regret bound} of $\tilde O(K^{\frac{1}{2}}N^{\frac{1}{3}}T^{\frac{2}{3}})$ for a time horizon $T$, which is \textit{sub-linear} in all parameters $T$, $N$, and $K$. %When applied to the problem of Social Influence Maximization, the performance of the proposed algorithm surpasses the UCB algorithm and some more sophisticated domain-specific methods.