Daigo Shishika

RO
11papers
90citations
Novelty43%
AI Score52

11 Papers

MAMar 24
Dynamic Adversarial Resource Allocation: the dDAB Game

Yue Guan, Daigo Shishika, Jason R. Marden et al. · gatech

This work introduces the dynamic Defender-Attacker Blotto (dDAB) game, extending the classical static Blotto game to a dynamic resource allocation setting over graphs. In the dDAB game, a defender is required to maintain numerical superiority against attacker resources across a set of key nodes in a connected graph. The engagement unfolds as a discrete-time game, where each player reallocates its resources in turn, with resources allowed to move at most one hop per time step. The primary goal is to determine the necessary and sufficient amount of defender resources required to guarantee sustained defense, along with the corresponding strategies. To address the central challenge arising from graph-constrained resource reallocation, we conduct a reachability analysis, starting with simplified settings where attacker resources act as a single cohesive group. We then extend the framework to allow attacker resources to split and merge arbitrarily, and construct defender strategies using superposition principles. A set-based dynamic programming algorithm is developed to compute the optimal strategies, as well as the minimum amount of defender resources to ensure successful defense. The effectiveness of our approach is demonstrated through numerical simulations and hardware experiments on the Georgia Tech Robotarium platform.

GTApr 27
Asymmetric-Information Resource Allocation Games: An LP Approach to Purposeful Deception

Longxu Pan, Yue Guan, Daigo Shishika et al.

In this work, we introduce the Deceptive Resource Allocation Game (DRAG), which studies purposeful deception within a Bayesian game framework. In DRAG, a Defender allocates resources across the true asset and several decoys to influence an Attacker's beliefs and actions, with the goal of diverting the Attacker away from the true asset. We seek to characterize purposeful deception, whereby the Defender deceives only when doing so improves its performance. To this end, we solve for the Perfect Bayesian Nash Equilibrium (PBNE) of the corresponding game. We show that, despite the coupled belief-policy interdependence, the problem admits an efficient, non-iterative linear programming formulation. Numerical results demonstrate that the resulting policies naturally balance effective allocation and belief manipulation, giving rise to purposeful and emergent deceptive behaviors.

SYMay 22
Deception and Counter Deception in Adversarial Graph Traversal Game

Violetta Rostobaya, James Berneburg, Daigo Shishika

We study deception in adversarial graph traversal, where a mobile agent seeks to reach a goal with minimum cost while an adversary alters edge costs to increase the total traversal cost. Unlike prior works that assume fixed observer-deceiver roles, we model this problem with two-sided incomplete information in which both players possess private information and update beliefs from observed actions. To solve the resulting indefinite-horizon game, we develop an adaptation of the Extensive-Form Double Oracle (XDO) algorithm. While the standard XDO algorithm is designed for finite games, the proposed adaptation ensures bounded computation despite endogenous game termination. We show that the proposed algorithm terminates in finite time and returns an epsilon-Nash equilibrium. Finally, we use Value of Information to characterize the deceptive and counter-deceptive behaviors that emerge from equilibrium strategies.

SYMar 27
Optimal Hiding with Partial Information of the Seeker's Route

Prajakta Surve, Shaunak D. Bopardikar, Daigo Shishika et al.

We consider a hide-and-seek game between a Hider and a Seeker over a finite set of locations. The Hider chooses one location to conceal a stationary treasure, while the Seeker visits the locations sequentially along a route. As the search progresses, the Hider observes a prefix of the Seeker's route. After observing this information, the Hider has the option to relocate the treasure at most once to another unvisited location by paying a switching cost. We study two seeker models. In the first, the Seeker is unaware of the fact that the Hider can relocate. In the second, the Seeker select its route while accounting for the possibility that the Hider observes its path and reallocates. For the restricted case, we define the value-of-information created by the reveal and derive upper bounds in terms of the switching cost using a worst-case evaluation over routes. We also show that seeker awareness reduces the game value, with the difference between the restricted and feedback models bounded by the entry-wise gap between the corresponding payoff matrices. Numerical examples show how this benefit decreases as the switching cost increases and as the reveal occurs later along the route.

MAMar 16
Forecast-Aware Cooperative Planning on Temporal Graphs under Stochastic Adversarial Risk

Manshi Limbu, Xuan Wang, Gregory J. Stein et al.

Cooperative multi-robot missions often require teams of robots to traverse environments where traversal risk evolves due to adversary patrols or shifting hazards with stochastic dynamics. While support coordination - where robots assist teammates in traversing risky regions - can significantly reduce mission costs, its effectiveness depends on the team's ability to anticipate future risk. Existing support-based frameworks assume static risk landscapes and therefore fail to account for predictable temporal trends in risk evolution. We propose a forecast-aware cooperative planning framework that integrates stochastic risk forecasting with anticipatory support allocation on temporal graphs. By modeling adversary dynamics as a first-order Markov stay-move process over graph edges, we propagate the resulting edge-occupancy probabilities forward in time to generate time-indexed edge-risk forecasts. These forecasts guide the proactive allocation of support positions to forecasted risky edges for effective support coordination, while also informing joint robot path planning. Experimental results demonstrate that our approach consistently reduces total expected team cost compared to non-anticipatory baselines, approaching the performance of an oracle planner.

LGMar 16
Game-Theory-Assisted Reinforcement Learning for Border Defense: Early Termination based on Analytical Solutions

Goutam Das, Michael Dorothy, Kyle Volle et al.

Game theory provides the gold standard for analyzing adversarial engagements, offering strong optimality guarantees. However, these guarantees often become brittle when assumptions such as perfect information are violated. Reinforcement learning (RL), by contrast, is adaptive but can be sample-inefficient in large, complex domains. This paper introduces a hybrid approach that leverages game-theoretic insights to improve RL training efficiency. We study a border defense game with limited perceptual range, where defender performance depends on both search and pursuit strategies, making classical differential game solutions inapplicable. Our method employs the Apollonius Circle (AC) to compute equilibrium in the post-detection phase, enabling early termination of RL episodes without learning pursuit dynamics. This allows RL to concentrate on learning search strategies while guaranteeing optimal continuation after detection. Across single- and multi-defender settings, this early termination method yields 10-20% higher rewards, faster convergence, and more efficient search trajectories. Extensive experiments validate these findings and demonstrate the overall effectiveness of our approach.

SYMay 15
Linear Programming Approach to Deceptive Path Planning Game with Goal Selection

Violetta Rostobaya, Yue Guan, James Berneburg et al.

In adversarial settings, a mobile agent may strategically plan its motion to influence an opponent's inference about its intended goal. We study deceptive path planning in a scenario where a mobile agent aims to reach a privately selected goal while an adversarial observer allocates limited defensive resources based on the observed trajectory. Unlike classical path-planning and goal-recognition approaches that model observers as passive inference process, our game-theoretic formulation models them as strategic decision-makers. For the resulting dynamic asymmetric-information game, we develop an efficient solution method that combines a linear programming formulation with the Double Oracle algorithm. To evaluate performance, we introduce metrics that quantify both the risk and the effectiveness of deception and provide illustrative numerical examples.

ROSep 7, 2021
Defending a Perimeter from a Ground Intruder Using an Aerial Defender: Theory and Practice

Elijah S. Lee, Daigo Shishika, Giuseppe Loianno et al.

The perimeter defense game has received interest in recent years as a variant of the pursuit-evasion game. A number of previous works have solved this game to obtain the optimal strategies for defender and intruder, but the derived theory considers the players as point particles with first-order assumptions. In this work, we aim to apply the theory derived from the perimeter defense problem to robots with realistic models of actuation and sensing and observe performance discrepancy in relaxing the first-order assumptions. In particular, we focus on the hemisphere perimeter defense problem where a ground intruder tries to reach the base of a hemisphere while an aerial defender constrained to move on the hemisphere aims to capture the intruder. The transition from theory to practice is detailed, and the designed system is simulated in Gazebo. Two metrics for parametric analysis and comparative study are proposed to evaluate the performance discrepancy.

RODec 29, 2020
Perimeter-defense Game between Aerial Defender and Ground Intruder

Elijah S. Lee, Daigo Shishika, Vijay Kumar

We study a variant of pursuit-evasion game in the context of perimeter defense. In this problem, the intruder aims to reach the base plane of a hemisphere without being captured by the defender, while the defender tries to capture the intruder. The perimeter-defense game was previously studied under the assumption that the defender moves on a circle. We extend the problem to the case where the defender moves on a hemisphere. To solve this problem, we analyze the strategies based on the breaching point at which the intruder tries to reach the target and predict the goal position, defined as optimal breaching point, that is achieved by the optimal strategies on both players. We provide the barrier that divides the state space into defender-winning and intruder-winning regions and prove that the optimal strategies for both players are to move towards the optimal breaching point. Simulation results are presented to demonstrate that the optimality of the game is given as a Nash equilibrium.

ROOct 22, 2020
Minimal Exposure Dubins Orienteering Problem

Douglas G. Macharet, Armando Alves Neto, Daigo Shishika

Different applications, such as environmental monitoring and military operations, demand the observation of predefined target locations, and an autonomous mobile robot can assist in these tasks. In this context, the Orienteering Problem (OP) is a well-known routing problem, in which the goal is to maximize the objective function by visiting the most rewarding locations, however, respecting a limited travel budget (e.g., length, time, energy). However, traditional formulations for routing problems generally neglect some environment peculiarities, such as obstacles or threatening zones. In this paper, we tackle the OP considering Dubins vehicles in the presence of a known deployed sensor field. We propose a novel multi-objective formulation called Minimal Exposure Dubins Orienteering Problem (MEDOP), whose main objectives are: (i) maximize the collected reward, and (ii) minimize the exposure of the agent, i.e., the probability of being detected. The solution is based on an evolutionary algorithm that iteratively varies the subset and sequence of locations to be visited, the orientations on each location, and the turning radius used to determine the paths. Results show that our approach can efficiently find a diverse set of solutions that simultaneously optimize both objectives.

ROJan 24, 2019
Decentralization of Multiagent Policies by Learning What to Communicate

James Paulos, Steven W. Chen, Daigo Shishika et al.

Effective communication is required for teams of robots to solve sophisticated collaborative tasks. In practice it is typical for both the encoding and semantics of communication to be manually defined by an expert; this is true regardless of whether the behaviors themselves are bespoke, optimization based, or learned. We present an agent architecture and training methodology using neural networks to learn task-oriented communication semantics based on the example of a communication-unaware expert policy. A perimeter defense game illustrates the system's ability to handle dynamically changing numbers of agents and its graceful degradation in performance as communication constraints are tightened or the expert's observability assumptions are broken.