Peeyush Kumar

h-index: 7

Papers: 9

Citations: 64

Novelty: 48%

AI Score: 35

Ranked #102,922 of 194,257 authors (top 53%); #22,656 in LG (top 56%)

9 Papers

2.3 · SP · Mar 4, 2023

Affordable Artificial Intelligence -- Augmenting Farmer Knowledge with AI

Peeyush Kumar, Andrew Nelson, Zerina Kapetanovic et al.

Farms produce hundreds of thousands of data points on the ground daily. Precision farming is the technique that combines farming practices with the insights uncovered in these data points using AI technology. Precision farming technology augments and extends farmers' deep knowledge about their land, making production more sustainable and profitable. As part of the larger effort at Microsoft to empower the agricultural labor force to be more productive and sustainable, this paper presents AI technology for predicting micro-climate conditions on the farm. This article is a chapter in a publication by the Food and Agriculture Organization of the United Nations and the International Telecommunication Union, Bangkok, 2021. This publication on artificial intelligence (AI) for agriculture is the fifth in the E-agriculture in Action series, launched in 2016 and jointly produced by FAO and ITU. It aims to raise awareness about existing AI applications in agriculture and to inspire stakeholders to develop and replicate new ones. Improved capacity and tools for capturing and processing data, together with substantial advances in machine learning, open new horizons for data-driven solutions that can support decision-making, facilitate supervision and monitoring, improve the timeliness and effectiveness of safety measures (e.g. use of pesticides), and support automation of many resource-consuming tasks in agriculture. This publication presents the reader with a collection of informative applications highlighting various ways AI is used in agriculture and offering valuable insights on the implementation process, success factors, and lessons learnt.

3.3 · LG · May 5, 2022

General sum stochastic games with networked information flows

Sarah H. Q. Li, Lillian J. Ratliff, Peeyush Kumar

Inspired by applications such as supply chain management, epidemics, and social networks, we formulate a stochastic game model that addresses three key features common across these domains: 1) network-structured player interactions, 2) pair-wise mixed cooperation and competition among players, and 3) limited global information toward individual decision-making. In combination, these features pose significant challenges for the black-box approaches taken by deep learning-based multi-agent reinforcement learning (MARL) algorithms and deserve more detailed analysis. We formulate a networked stochastic game with pair-wise general-sum objectives and an asymmetrical information structure, and empirically explore the effects of information availability on the outcomes of different MARL paradigms such as individual learning and centralized learning with decentralized execution.

3.8 · LG · Jun 13, 2023

Multi-market Energy Optimization with Renewables via Reinforcement Learning

Lucien Werner, Peeyush Kumar

This paper introduces a deep reinforcement learning (RL) framework for optimizing the operations of power plants pairing renewable energy with storage. The objective is to maximize revenue from energy markets while minimizing storage degradation costs and renewable curtailment. The framework handles complexities such as time coupling by storage devices, uncertainty in renewable generation and energy prices, and non-linear storage models. The study treats the problem as a hierarchical Markov Decision Process (MDP) and uses component-level simulators for storage. It utilizes RL to incorporate complex storage models, overcoming restrictions of optimization-based methods that require convex and differentiable component models. A significant aspect of this approach is ensuring policy actions respect system constraints, achieved via a novel method of projecting potentially infeasible actions onto a safe state-action set. The paper demonstrates the efficacy of this approach through extensive experiments using data from US and Indian electricity markets, comparing the learned RL policies with a baseline control policy and a retrospective optimal control policy. It validates the adaptability of the learning framework with various storage models and shows the effectiveness of RL in a complex energy optimization setting, in the context of multi-market bidding, probabilistic forecasts, and accurate storage component models.
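As a rough illustration of what projecting a potentially infeasible action onto a safe set can mean for a storage device, the sketch below clips a requested charge or discharge power to what a simple battery can physically do in one time step. This is not the paper's method: the function name `project_action`, its parameters, and the lossless battery model are all assumptions made for illustration.

```python
def project_action(power_kw, soc_kwh, capacity_kwh, max_power_kw, dt_h=1.0):
    """Clip a requested power (+ charge, - discharge) to the feasible range.

    Hypothetical, lossless battery model: the only constraints are the
    power rating and the energy that fits in (or is left in) the battery.
    """
    # Largest charge power that does not overfill the battery this step.
    max_charge = min(max_power_kw, (capacity_kwh - soc_kwh) / dt_h)
    # Largest discharge power that does not drain it below empty.
    max_discharge = min(max_power_kw, soc_kwh / dt_h)
    # In one dimension, Euclidean projection onto an interval is clipping.
    return max(-max_discharge, min(power_kw, max_charge))


if __name__ == "__main__":
    # A policy asks to charge at 50 kW, but only 10 kWh of headroom remains.
    print(project_action(50, soc_kwh=90, capacity_kwh=100, max_power_kw=40))
```

A learned policy can then propose any action, and the environment applies the projected one, so the constraints hold regardless of what the policy outputs.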

3.8 · LG · Jun 20, 2023

Reward Shaping via Diffusion Process in Reinforcement Learning

Peeyush Kumar

Reinforcement Learning (RL) models have continually evolved to navigate the exploration-exploitation trade-off in uncertain Markov Decision Processes (MDPs). In this study, I leverage the principles of stochastic thermodynamics and system dynamics to explore reward shaping via diffusion processes. This provides an elegant framework for thinking about the exploration-exploitation trade-off. This article sheds light on the relationships between information entropy, stochastic system dynamics, and their influences on entropy production. This exploration allows us to construct a dual-pronged framework that can be interpreted as either a maximum entropy program for deriving efficient policies or a modified cost optimization program accounting for informational costs and benefits. This work presents a novel perspective on the physical nature of information and its implications for online learning in MDPs, consequently providing a better understanding of information-oriented formulations in RL.

4.1 · LG · Oct 19, 2025

Resolution-Aware Retrieval Augmented Zero-Shot Forecasting

Iman Deznabi, Peeyush Kumar, Madalina Fiterau

Zero-shot forecasting aims to predict outcomes for previously unseen conditions without direct historical data, posing a significant challenge for traditional forecasting methods. We introduce a Resolution-Aware Retrieval-Augmented Forecasting model that enhances predictive accuracy by leveraging spatial correlations and temporal frequency characteristics. By decomposing signals into different frequency components, our model employs resolution-aware retrieval, where lower-frequency components rely on broader spatial context, while higher-frequency components focus on local influences. This allows the model to dynamically retrieve relevant data and adapt to new locations with minimal historical context. Applied to microclimate forecasting, our model significantly outperforms traditional forecasting methods, numerical weather prediction models, and modern foundation time series models, achieving 71% lower MSE than HRRR and 34% lower MSE than Chronos on the ERA5 dataset. Our results highlight the effectiveness of retrieval-augmented and resolution-aware strategies, offering a scalable and data-efficient solution for zero-shot forecasting in microclimate modeling and beyond.
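To make the idea of decomposing a signal into frequency components concrete, here is a minimal sketch that splits a series into a smooth low-frequency part and a high-frequency residual. The abstract does not say which decomposition the model uses; the centered moving average, the function name `decompose`, and the `window` parameter are assumptions chosen for simplicity.

```python
def decompose(series, window=3):
    """Split a series into (low, high) parts with low[i] + high[i] == series[i].

    Assumption: a centered moving average stands in for the low-pass
    filter; the window shrinks at the edges of the series.
    """
    half = window // 2
    low = []
    for i in range(len(series)):
        start = max(0, i - half)
        stop = min(len(series), i + half + 1)
        low.append(sum(series[start:stop]) / (stop - start))
    # Whatever the smoothing removed is the high-frequency component.
    high = [x - l for x, l in zip(series, low)]
    return low, high


if __name__ == "__main__":
    low, high = decompose([1.0, 2.0, 3.0, 4.0, 5.0])
    print(low)
    print(high)
```

In a resolution-aware setup along the lines the abstract describes, the `low` component would be matched against data from a broad set of locations, while the `high` component would be matched against nearby locations only.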

4.6 · LG · Jan 5, 2024

Zero-shot Microclimate Prediction with Deep Learning

Iman Deznabi, Peeyush Kumar, Madalina Fiterau

Weather station data is a valuable resource for climate prediction; however, its reliability can be limited in remote locations. To compound the issue, making local predictions often relies on sensor data that may not be accessible for a new, previously unmonitored location. In response to these challenges, we propose a novel zero-shot learning approach designed to forecast various climate measurements at new and unmonitored locations. Our method surpasses conventional weather forecasting techniques in predicting microclimate variables by leveraging knowledge extracted from other geographic locations.

3.2 · LG · Mar 19, 2017

Near Optimal Hamiltonian-Control and Learning via Chattering

Peeyush Kumar, Wolf Kohn, Zelda B. Zabinsky

Many applications require solving non-linear control problems that are classically not well behaved. This paper develops a simple and efficient chattering algorithm that learns near optimal decision policies through an open-loop feedback strategy. The optimal control problem reduces to a series of linear optimization programs that can be easily solved to recover a relaxed optimal trajectory. This algorithm is implemented on a real-time enterprise scheduling and control process.

1.7 · AI · Mar 19, 2017

Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options

Peeyush Kumar, Doina Precup

Deliberating on large or continuous state spaces has been a long-standing challenge in reinforcement learning. Temporal abstraction has made this somewhat possible, but planning efficiently with temporal abstraction remains an issue. Moreover, using spatial abstractions to learn policies for various situations at once while using temporal abstraction models is an open problem. We propose an efficient algorithm that is convergent under linear function approximation while planning with temporally abstract actions. We show how this algorithm can be used along with randomly generated option models over multiple time scales to plan for agents that need to act in real time. Using these randomly generated option models over multiple time scales is shown to reduce the number of decision epochs required to solve the given task, hence effectively reducing the time needed for deliberation.

13.4 · LG · May 17, 2016

Option Discovery in Hierarchical Reinforcement Learning using Spatio-Temporal Clustering

Aravind Srinivas, Ramnandan Krishnamurthy, Peeyush Kumar et al.

This paper introduces an automated skill acquisition framework in reinforcement learning which involves identifying a hierarchical description of the given task in terms of abstract states and extended actions between abstract states. Identifying such structures present in the task provides ways to simplify and speed up reinforcement learning algorithms. These structures also help to generalize such algorithms over multiple tasks without relearning policies from scratch. We use ideas from dynamical systems to find metastable regions in the state space and associate them with abstract states. The spectral clustering algorithm PCCA+ is used to identify suitable abstractions aligned to the underlying structure. Skills are defined in terms of the sequence of actions that lead to transitions between such abstract states. The connectivity information from PCCA+ is used to generate these skills or options. These skills are independent of the learning task and can be efficiently reused across a variety of tasks defined over the same model. This approach works well even without the exact model of the environment by using sample trajectories to construct an approximate estimate. We also present our approach to scaling the skill acquisition framework to complex tasks with large state spaces for which we perform state aggregation using the representation learned from an action conditional video prediction network and use the skill acquisition framework on the aggregated state space.