AIJun 9, 2023
A Domain-Independent Agent Architecture for Adaptive Operation in Evolving Open WorldsShiwali Mohan, Wiktor Piotrowski, Roni Stern et al.
Model-based reasoning agents are ill-equipped to act in novel situations in which their model of the environment no longer sufficiently represents the world. We propose HYDRA - a framework for designing model-based agents operating in mixed discrete-continuous worlds, that can autonomously detect when the environment has evolved from its canonical setup, understand how it has evolved, and adapt the agents' models to perform effectively. HYDRA is based upon PDDL+, a rich modeling language for planning in mixed, discrete-continuous environments. It augments the planning module with visual reasoning, task selection, and action execution modules for closed-loop interaction with complex environments. HYDRA implements a novel meta-reasoning process that enables the agent to monitor its own behavior from a variety of aspects. The process employs a diverse set of computational methods to maintain expectations about the agent's own behavior in an environment. Divergences from those expectations are useful in detecting when the environment has evolved and identifying opportunities to adapt the underlying models. HYDRA builds upon ideas from diagnosis and repair and uses a heuristics-guided search over model changes such that they become competent in novel conditions. The HYDRA framework has been used to implement novelty-aware agents for three diverse domains - CartPole++ (a higher dimension variant of a classic control problem), Science Birds (an IJCAI competition problem), and PogoStick (a specific problem domain in Minecraft). We report empirical observations from these domains to demonstrate the efficacy of various components in the novelty meta-reasoning process.
AIMar 24, 2023
Learning to Operate in Open Worlds by Adapting Planning ModelsWiktor Piotrowski, Roni Stern, Yoni Sher et al.
Planning agents are ill-equipped to act in novel situations in which their domain model no longer accurately represents the world. We introduce an approach for such agents operating in open worlds that detects the presence of novelties and effectively adapts their domain models and consequent action selection. It uses observations of action execution and measures their divergence from what is expected, according to the environment model, to infer existence of a novelty. Then, it revises the model through a heuristics-guided search over model changes. We report empirical evaluations on the CartPole problem, a standard Reinforcement Learning (RL) benchmark. The results show that our approach can deal with a class of novelties very quickly and in an interpretable fashion.
AIMar 29, 2023
Heuristic Search For Physics-Based Problems: Angry Birds in PDDL+Wiktor Piotrowski, Yoni Sher, Sachin Grover et al.
This paper studies how a domain-independent planner and combinatorial search can be employed to play Angry Birds, a well established AI challenge problem. To model the game, we use PDDL+, a planning language for mixed discrete/continuous domains that supports durative processes and exogenous events. The paper describes the model and identifies key design decisions that reduce the problem complexity. In addition, we propose several domain-specific enhancements including heuristics and a search technique similar to preferred operators. Together, they alleviate the complexity of combinatorial search. We evaluate our approach by comparing its performance with dedicated domain-specific solvers on a range of Angry Birds levels. The results show that our performance is on par with these domain-specific approaches in most levels, even without using our domain-specific search enhancements.
AIApr 9
RAMP: Hybrid DRL for Online Learning of Numeric Action ModelsYarin Benyamin, Argaman Mordoch, Shahaf S. Shperberg et al.
Automated planning algorithms require an action model specifying the preconditions and effects of each action, but obtaining such a model is often hard. Learning action models from observations is feasible, but existing algorithms for numeric domains are offline, requiring expert traces as input. We propose the Reinforcement learning, Action Model learning, and Planning (RAMP) strategy for learning numeric planning action models online via interactions with the environment. RAMP simultaneously trains a Deep Reinforcement Learning (DRL) policy, learns a numeric action model from past interactions, and uses that model to plan future actions when possible. These components form a positive feedback loop: the RL policy gathers data to refine the action model, while the planner generates plans to continue training the RL policy. To facilitate this integration of RL and numeric planning, we developed Numeric PDDLGym, an automated framework for converting numeric planning problems to Gym environments. Experimental results on standard IPC numeric domains show that RAMP significantly outperforms PPO, a well-known DRL algorithm, in terms of solvability and plan quality.
AIMar 23, 2022
An Example of the SAM+ Algorithm for Learning Action Models for Stochastic WorldsBrendan Juba, Roni Stern
In this technical report, we provide a complete example of running the SAM+ algorithm, an algorithm for learning stochastic planning action models, on a simplified PPDDL version of the Coffee problem. We provide a very brief description of the SAM+ algorithm and detailed description of our simplified version of the Coffee domain, and then describe the results of running it on the simplified Coffee domain.
MAMay 13
Privacy Preserving Multi Agent Path FindingRotem Lev Lehman, Roni Stern, Guy Shani
In the multi-agent path finding (MAPF) problem, a group of agents search in a graph for a path for each agent where no two paths collide. While in all applications of MAPF the agents must not collide with each other, in some of them the agents may not wish to share their paths due to privacy constraints. In this work, we formulate two types of privacy constraints for MAPF and propose algorithms that preserve them. The first type of privacy we consider is planning-level privacy, which means that during planning, the agents cannot identify exactly the planned location of the other agents. We propose a general framework for obtaining planning-level privacy, which works by adding mock agents to the planning process. The second type of privacy we consider is execution-level privacy, which is relevant when agents have limited sensing capabilities. Execution-level privacy is preserved if none of the agents is allowed to sense the location of the other agents during execution. We show how to adapt two popular MAPF algorithms, namely PIBT and LaCAM, such that they preserve execution-level privacy. Lastly, we propose a post-processing technique that allows the agents to reduce the sum of costs of the returned solution without losing any privacy. We also implemented our algorithms and evaluated them empirically, showing that the proposed post-processing technique indeed improved cost significantly.
AIJun 22, 2023
Novelty Accommodating Multi-Agent Planning in High Fidelity Simulated Open WorldJames Chao, Wiktor Piotrowski, Roni Stern et al.
Autonomous agents operating within real-world environments often rely on automated planners to ascertain optimal actions towards desired goals or the optimization of a specified objective function. Integral to these agents are common architectural components such as schedulers, tasked with determining the timing for executing planned actions, and execution engines, responsible for carrying out these scheduled actions while monitoring their outcomes. We address the significant challenge that arises when unexpected phenomena, termed \textit{novelties}, emerge within the environment, altering its fundamental characteristics, composition, and dynamics. This challenge is inherent in all deployed real-world applications and may manifest suddenly and without prior notice or explanation. The introduction of novelties into the environment can lead to inaccuracies within the planner's internal model, rendering previously generated plans obsolete. Recent research introduced agent design aimed at detecting and adapting to such novelties. However, these designs lack consideration for action scheduling in continuous time-space, coordination of concurrent actions by multiple agents, or memory-based novelty accommodation. Additionally, the application has been primarily demonstrated in lower fidelity environments. In our study, we propose a general purpose AI agent framework designed to detect, characterize, and adapt to novelties in highly noisy, complex, and stochastic environments that support concurrent actions and external scheduling. We showcase the efficacy of our agent through experimentation within a high-fidelity simulator for realistic military scenarios.
SEMay 18, 2025Code
EvoGPT: Enhancing Test Suite Robustness via LLM-Based Generation and Genetic OptimizationLior Broide, Roni Stern
Large Language Models (LLMs) have recently emerged as promising tools for automated unit test generation. We introduce a hybrid framework called EvoGPT that integrates LLM-based test generation with evolutionary search techniques to create diverse, fault-revealing unit tests. Unit tests are initially generated with diverse temperature sampling to maximize behavioral and test suite diversity, followed by a generation-repair loop and coverage-guided assertion enhancement. The resulting test suites are evolved using genetic algorithms, guided by a fitness function prioritizing mutation score over traditional coverage metrics. This design emphasizes the primary objective of unit testing-fault detection. Evaluated on multiple open-source Java projects, EvoGPT achieves an average improvement of 10% in both code coverage and mutation score compared to LLMs and traditional search-based software testing baselines. These results demonstrate that combining LLM-driven diversity, targeted repair, and evolutionary optimization produces more effective and resilient test suites.
AISep 16, 2025Code
Toward PDDL Planning CopilotYarin Benyamin, Argaman Mordoch, Shahaf S. Shperberg et al.
Large Language Models (LLMs) are increasingly being used as autonomous agents capable of performing complicated tasks. However, they lack the ability to perform reliable long-horizon planning on their own. This paper bridges this gap by introducing the Planning Copilot, a chatbot that integrates multiple planning tools and allows users to invoke them through instructions in natural language. The Planning Copilot leverages the Model Context Protocol (MCP), a recently developed standard for connecting LLMs with external tools and systems. This approach allows using any LLM that supports MCP without domain-specific fine-tuning. Our Planning Copilot supports common planning tasks such as checking the syntax of planning problems, selecting an appropriate planner, calling it, validating the plan it generates, and simulating their execution. We empirically evaluate the ability of our Planning Copilot to perform these tasks using three open-source LLMs. The results show that the Planning Copilot highly outperforms using the same LLMs without the planning tools. We also conducted a limited qualitative comparison of our tool against Chat GPT-5, a very recent commercial LLM. Our results shows that our Planning Copilot significantly outperforms GPT-5 despite relying on a much smaller LLM. This suggests dedicated planning tools may be an effective way to enable LLMs to perform planning tasks.
SESep 4, 2019Code
Learning Test TracesEyal Hadad, Roni Stern
Modern software projects include automated tests written to check the programs' functionality. The set of functions invoked by a test is called the trace of the test, and the action of obtaining a trace is called tracing. There are many tracing tools since traces are useful for a variety of software engineering tasks such as test generation, fault localization, and test execution planning. A major drawback in using test traces is that obtaining them, i.e., tracing, can be costly in terms of computational resources and runtime. Prior work attempted to address this in various ways, e.g., by selectively tracing only some of the software components or compressing the trace on-the-fly. However, all these approaches still require building the project and executing the test in order to get its (partial, possibly compressed) trace. This is still very costly in many cases. In this work, we propose a method to predict the trace of each test without executing it, based only on static properties of the test and the tested program, as well as past experience on different tests. This prediction is done by applying supervised learning to learn the relation between various static features of test and function and the likelihood that one will include the other in its trace. Then, we show how to use the predicted traces in a recent automated troubleshooting paradigm called Learn Diagnose and plan (LDP), instead of the actual, costly-to-obtain, test traces. In a preliminary evaluation on real-world open-source projects, we observe that our prediction quality is reasonable. In addition, using our trace predictions in LDP yields almost the same results comparing to when using real traces, while requiring less overhead.
AIApr 25, 2024
Optimal and Bounded Suboptimal Any-Angle Multi-agent PathfindingKonstantin Yakovlev, Anton Andreychuk, Roni Stern
Multi-agent pathfinding (MAPF) is the problem of finding a set of conflict-free paths for a set of agents. Typically, the agents' moves are limited to a pre-defined graph of possible locations and allowed transitions between them, e.g. a 4-neighborhood grid. We explore how to solve MAPF problems when each agent can move between any pair of possible locations as long as traversing the line segment connecting them does not lead to a collision with the obstacles. This is known as any-angle pathfinding. We present the first optimal any-angle multi-agent pathfinding algorithm. Our planner is based on the Continuous Conflict-based Search (CCBS) algorithm and an optimal any-angle variant of the Safe Interval Path Planning (TO-AA-SIPP). The straightforward combination of those, however, scales poorly since any-angle path finding induces search trees with a very large branching factor. To mitigate this, we adapt two techniques from classical MAPF to the any-angle setting, namely Disjoint Splitting and Multi-Constraints. Experimental results on different combinations of these techniques show they enable solving over 30% more problems than the vanilla combination of CCBS and TO-AA-SIPP. In addition, we present a bounded-suboptimal variant of our algorithm, that enables trading runtime for solution cost in a controlled manner.
AIMar 22, 2024
Safe Learning of PDDL Domains with Conditional Effects -- Extended VersionArgaman Mordoch, Enrico Scala, Roni Stern et al.
Powerful domain-independent planners have been developed to solve various types of planning problems. These planners often require a model of the acting agent's actions, given in some planning domain description language. Manually designing such an action model is a notoriously challenging task. An alternative is to automatically learn action models from observation. Such an action model is called safe if every plan created with it is consistent with the real, unknown action model. Algorithms for learning such safe action models exist, yet they cannot handle domains with conditional or universal effects, which are common constructs in many planning problems. We prove that learning non-trivial safe action models with conditional effects may require an exponential number of samples. Then, we identify reasonable assumptions under which such learning is tractable and propose SAM Learning of Conditional Effects (Conditional-SAM), the first algorithm capable of doing so. We analyze Conditional-SAM theoretically and evaluate it experimentally. Our results show that the action models learned by Conditional-SAM can be used to solve perfectly most of the test set problems in most of the experimented domains.
MAJul 22, 2025
Budget Allocation Policies for Real-Time Multi-Agent Path FindingRaz Beck, Roni Stern
Multi-Agent Pathfinding (MAPF) is the problem of finding paths for a set of agents such that each agent reaches its desired destination while avoiding collisions with the other agents. Many MAPF solvers are designed to run offline, that is, first generate paths for all agents and then execute them. Real-Time MAPF (RT-MAPF) embodies a realistic MAPF setup in which one cannot wait until a complete path for each agent has been found before they start to move. Instead, planning and execution are interleaved, where the agents must commit to a fixed number of steps in a constant amount of computation time, referred to as the planning budget. Existing solutions to RT-MAPF iteratively call windowed versions of MAPF algorithms in every planning period, without explicitly considering the size of the planning budget. We address this gap and explore different policies for allocating the planning budget in windowed versions of standard MAPF algorithms, namely Prioritized Planning (PrP) and MAPF-LNS2. Our exploration shows that the baseline approach in which all agents draw from a shared planning budget pool is ineffective in over-constrained situations. Instead, policies that distribute the planning budget over the agents are able to solve more problems with a smaller makespan.
AIMay 28, 2025
Enhancing Lifelong Multi-Agent Path-finding by Using Artificial Potential FieldsArseniy Pertzovsky, Roni Stern, Ariel Felner et al.
We explore the use of Artificial Potential Fields (APFs) to solve Multi-Agent Path Finding (MAPF) and Lifelong MAPF (LMAPF) problems. In MAPF, a team of agents must move to their goal locations without collisions, whereas in LMAPF, new goals are generated upon arrival. We propose methods for incorporating APFs in a range of MAPF algorithms, including Prioritized Planning, MAPF-LNS2, and Priority Inheritance with Backtracking (PIBT). Experimental results show that using APF is not beneficial for MAPF but yields up to a 7-fold increase in overall system throughput for LMAPF.
AIFeb 18, 2025
Integrating Reinforcement Learning, Action Model Learning, and Numeric Planning for Tackling Complex TasksYarin Benyamin, Argaman Mordoch, Shahaf S. Shperberg et al.
Automated Planning algorithms require a model of the domain that specifies the preconditions and effects of each action. Obtaining such a domain model is notoriously hard. Algorithms for learning domain models exist, yet it remains unclear whether learning a domain model and planning is an effective approach for numeric planning environments, i.e., where states include discrete and numeric state variables. In this work, we explore the benefits of learning a numeric domain model and compare it with alternative model-free solutions. As a case study, we use two tasks in Minecraft, a popular sandbox game that has been used as an AI challenge. First, we consider an offline learning setting, where a set of expert trajectories are available to learn from. This is the standard setting for learning domain models. We used the Numeric Safe Action Model Learning (NSAM) algorithm to learn a numeric domain model and solve new problems with the learned domain model and a numeric planner. We call this model-based solution NSAM_(+p), and compare it to several model-free Imitation Learning (IL) and Offline Reinforcement Learning (RL) algorithms. Empirical results show that some IL algorithms can learn faster to solve simple tasks, while NSAM_(+p) allows solving tasks that require long-term planning and enables generalizing to solve problems in larger environments. Then, we consider an online learning setting, where learning is done by moving an agent in the environment. For this setting, we introduce RAMP. In RAMP, observations collected during the agent's execution are used to simultaneously train an RL policy and learn a planning domain action model. This forms a positive feedback loop between the RL policy and the learned domain model. We demonstrate experimentally the benefits of using RAMP, showing that it finds more efficient plans and solves more problems than several RL baselines.
MADec 5, 2024
Transient Multi-Agent Path Finding for Lifelong Navigation in Dense EnvironmentsJonathan Morag, Noy Gabay, Daniel koyfman et al.
Multi-Agent Path Finding (MAPF) deals with finding conflict-free paths for a set of agents from an initial configuration to a given target configuration. The Lifelong MAPF (LMAPF) problem is a well-studied online version of MAPF in which an agent receives a new target when it reaches its current target. The common approach for solving LMAPF is to treat it as a sequence of MAPF problems, periodically replanning from the agents' current configurations to their current targets. A significant drawback in this approach is that in MAPF the agents must reach a configuration in which all agents are at their targets simultaneously, which is needlessly restrictive for LMAPF. Techniques have been proposed to indirectly mitigate this drawback. We describe cases where these mitigation techniques fail. As an alternative, we propose to solve LMAPF problems by solving a sequence of modified MAPF problems, in which the objective is for each agent to eventually visit its target, but not necessarily for all agents to do so simultaneously. We refer to this MAPF variant as Transient MAPF (TMAPF) and propose several algorithms for solving it based on existing MAPF algorithms. A limited experimental evaluation identifies some cases where using a TMAPF algorithm instead of a MAPF algorithm with an LMAPF framework can improve the system throughput significantly.
AIJun 16, 2024
Algorithm Selection for Optimal Multi-Agent Path Finding via Graph EmbeddingCarmel Shabalin, Omri Kaduri, Roni Stern
Multi-agent path finding (MAPF) is the problem of finding paths for multiple agents such that they do not collide. This problem manifests in numerous real-world applications such as controlling transportation robots in automated warehouses, moving characters in video games, and coordinating self-driving cars in intersections. Finding optimal solutions to MAPF is NP-Hard, yet modern optimal solvers can scale to hundreds of agents and even thousands in some cases. Different solvers employ different approaches, and there is no single state-of-the-art approach for all problems. Furthermore, there are no clear, provable, guidelines for choosing when each optimal MAPF solver to use. Prior work employed Algorithm Selection (AS) techniques to learn such guidelines from past data. A major challenge when employing AS for choosing an optimal MAPF algorithm is how to encode the given MAPF problem. Prior work either used hand-crafted features or an image representation of the problem. We explore graph-based encodings of the MAPF problem and show how they can be used on-the-fly with a modern graph embedding algorithm called FEATHER. Then, we show how this encoding can be effectively joined with existing encodings, resulting in a novel AS method we call MAPF Algorithm selection via Graph embedding (MAG). An extensive experimental evaluation of MAG on several MAPF algorithm selection tasks reveals that it is either on-par or significantly better than existing methods.
LGDec 17, 2023
Learning Safe Numeric Planning Action ModelsArgaman Mordoch, Shahaf S. Shperberg, Roni Stern et al.
A significant challenge in applying planning technology to real-world problems lies in obtaining a planning model that accurately represents the problem's dynamics. Obtaining a planning model is even more challenging in mission-critical domains, where a trial-and-error approach to learning how to act is not an option. In such domains, the action model used to generate plans must be safe, in the sense that plans generated with it must be applicable and achieve their goals. % Learning safe action models for planning has been mostly explored for domains in which states are sufficiently described with Boolean variables. % In this work, we go beyond this limitation and propose the Numeric Safe Action Models Learning (N-SAM) algorithm. In this work, we present N-SAM, an action model learning algorithm capable of learning safe numeric preconditions and effects. We prove that N-SAM runs in linear time in the number of observations and, under certain conditions, is guaranteed to return safe action models. However, to preserve this safety guarantee, N-SAM must observe a substantial number of examples for each action before including it in the learned model. We address this limitation of N-SAM and propose N-SAM*, an extension to the N-SAM algorithm that always returns an action model where every observed action is applicable at least in some states, even if it was observed only once. N-SAM* does so without compromising the safety of the returned action model. We prove that N-SAM* is optimal in terms of sample complexity compared to any other algorithm that guarantees safety. N-SAM and N-SAM* are evaluated over an extensive benchmark of numeric planning domains, and their performance is compared to a state-of-the-art numeric action model learning algorithm. We also provide a discussion on the impact of numerical accuracy on the learning process.
AIJul 9, 2021
Playing Angry Birds with a Domain-Independent PDDL+ PlannerWiktor Piotrowski, Roni Stern, Matthew Klenk et al.
This demo paper presents the first system for playing the popular Angry Birds game using a domain-independent planner. Our system models Angry Birds levels using PDDL+, a planning language for mixed discrete/continuous domains. It uses a domain-independent PDDL+ planner to generate plans and executes them. In this demo paper, we present the system's PDDL+ model for this domain, identify key design decisions that reduce the problem complexity, and compare the performance of our system to model-specific methods for this domain. The results show that our system's performance is on par with other domain-specific systems for Angry Birds, suggesting the applicability of domain-independent planning to this benchmark AI challenge.
AIJul 9, 2021
Safe Learning of Lifted Action ModelsBrendan Juba, Hai S. Le, Roni Stern
Creating a domain model, even for classical, domain-independent planning, is a notoriously hard knowledge-engineering task. A natural approach to solve this problem is to learn a domain model from observations. However, model learning approaches frequently do not provide safety guarantees: the learned model may assume actions are applicable when they are not, and may incorrectly capture actions' effects. This may result in generating plans that will fail when executed. In some domains such failures are not acceptable, due to the cost of failure or inability to replan online after failure. In such settings, all learning must be done offline, based on some observations collected, e.g., by some other agents or a human. Through this learning, the task is to generate a plan that is guaranteed to be successful. This is called the model-free planning problem. Prior work proposed an algorithm for solving the model-free planning problem in classical planning. However, they were limited to learning grounded domains, and thus they could not scale. We generalize this prior work and propose the first safe model-free planning algorithm for lifted domains. We prove the correctness of our approach, and provide a statistical analysis showing that the number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model. We also present experiments on twelve IPC domains showing that our approach is able to learn the real action model in all cases with at most two trajectories.
MAFeb 14, 2021
Partial Disclosure of Private Dependencies in Privacy Preserving PlanningRotem Lev Lehman, Guy Shani, Roni Stern
In collaborative privacy preserving planning (CPPP), a group of agents jointly creates a plan to achieve a set of goals while preserving each others' privacy. During planning, agents often reveal the private dependencies between their public actions to other agents, that is, which public action facilitates the preconditions of another public action. Previous work in CPPP does not limit the disclosure of such dependencies. In this paper, we explicitly limit the amount of disclosed dependencies, allowing agents to publish only a part of their private dependencies. We investigate different strategies for deciding which dependencies to publish, and how they affect the ability to find solutions. We evaluate the ability of two solvers -- distribute forward search and centralized planning based on a single-agent projection -- to produce plans under this constraint. Experiments over standard CPPP domains show that the proposed dependency-sharing strategies enable generating plans while sharing only a small fraction of all private dependencies.
AIJan 24, 2021
Improving Continuous-time Conflict Based SearchAnton Andreychuk, Konstantin Yakovlev, Eli Boyarski et al.
Conflict-Based Search (CBS) is a powerful algorithmic framework for optimally solving classical multi-agent path finding (MAPF) problems, where time is discretized into the time steps. Continuous-time CBS (CCBS) is a recently proposed version of CBS that guarantees optimal solutions without the need to discretize time. However, the scalability of CCBS is limited because it does not include any known improvements of CBS. In this paper, we begin to close this gap and explore how to adapt successful CBS improvements, namely, prioritizing conflicts (PC), disjoint splitting (DS), and high-level heuristics, to the continuous time setting of CCBS. These adaptions are not trivial, and require careful handling of different types of constraints, applying a generalized version of the Safe interval path planning (SIPP) algorithm, and extending the notion of cardinal conflicts. We evaluate the effect of the suggested enhancements by running experiments both on general graphs and $2^k$-neighborhood grids. CCBS with these improvements significantly outperforms vanilla CCBS, solving problems with almost twice as many agents in some cases and pushing the limits of multiagent path finding in continuous-time domains.
LGJan 11, 2021
Anomaly Detection for Aggregated Data Using Multi-Graph AutoencoderTomer Meirman, Roni Stern, Gilad Katz
In data systems, activities or events are continuously collected in the field to trace their proper executions. Logging, which means recording sequences of events, can be used for analyzing system failures and malfunctions, and identifying the causes and locations of such issues. In our research we focus on creating an Anomaly detection models for system logs. The task of anomaly detection is identifying unexpected events in dataset, which differ from the normal behavior. Anomaly detection models also assist in data systems analysis tasks. Modern systems may produce such a large amount of events monitoring every individual event is not feasible. In such cases, the events are often aggregated over a fixed period of time, reporting the number of times every event has occurred in that time period. This aggregation facilitates scaling, but requires a different approach for anomaly detection. In this research, we present a thorough analysis of the aggregated data and the relationships between aggregated events. Based on the initial phase of our research we present graphs representations of our aggregated dataset, which represent the different relationships between aggregated instances in the same context. Using the graph representation, we propose Multiple-graphs autoencoder MGAE, a novel convolutional graphs-autoencoder model which exploits the relationships of the aggregated instances in our unique dataset. MGAE outperforms standard graph-autoencoder models and the different experiments. With our novel MGAE we present 60% decrease in reconstruction error in comparison to standard graph autoencoder, which is expressed in reconstructing high-degree relationships.
AIJun 1, 2020
Revisiting Bounded-Suboptimal Safe Interval Path PlanningKonstantin Yakovlev, Anton Andreychuk, Roni Stern
Safe-interval path planning (SIPP) is a powerful algorithm for finding a path in the presence of dynamic obstacles. SIPP returns provably optimal solutions. However, in many practical applications of SIPP such as path planning for robots, one would like to trade-off optimality for shorter planning time. In this paper we explore different ways to build a bounded-suboptimal SIPP and discuss their pros and cons. We compare the different bounded-suboptimal versions of SIPP experimentally. While there is no universal winner, the results provide insights into when each method should be used.
AIDec 24, 2019
Bidding in SpadesGal Cohensius, Reshef Meir, Nadav Oved et al.
We present a Spades bidding algorithm that is superior to recreational human players and to publicly available bots. Like in Bridge, the game of Spades is composed of two independent phases, \textit{bidding} and \textit{playing}. This paper focuses on the bidding algorithm, since this phase holds a precise challenge: based on the input, choose the bid that maximizes the agent's winning probability. Our \emph{Bidding-in-Spades} (BIS) algorithm heuristically determines the bidding strategy by comparing the expected utility of each possible bid. A major challenge is how to estimate these expected utilities. To this end, we propose a set of domain-specific heuristics, and then correct them via machine learning using data from real-world players. The \BIS algorithm we present can be attached to any playing algorithm. It beats rule-based bidding bots when all use the same playing component. When combined with a rule-based playing algorithm, it is superior to the average recreational human.
AIJun 19, 2019
Multi-Agent Pathfinding: Definitions, Variants, and BenchmarksRoni Stern, Nathan Sturtevant, Ariel Felner et al.
The MAPF problem is the fundamental problem of planning paths for multiple agents, where the key constraint is that the agents will be able to follow these paths concurrently without colliding with each other. Applications of MAPF include automated warehouses and autonomous vehicles. Research on MAPF has been flourishing in the past couple of years. Different MAPF research papers make different assumptions, e.g., whether agents can traverse the same road at the same time, and have different objective functions, e.g., minimize makespan or sum of agents' actions costs. These assumptions and objectives are sometimes implicitly assumed or described informally. This makes it difficult to establish appropriate baselines for comparison in research papers, as well as making it difficult for practitioners to find the papers relevant to their concrete application. This paper aims to fill this gap and support researchers and practitioners by providing a unifying terminology for describing common MAPF assumptions and objectives. In addition, we also provide pointers to two MAPF benchmarks. In particular, we introduce a new grid-based benchmark for MAPF, and demonstrate experimentally that it poses a challenge to contemporary MAPF algorithms.
AIJan 16, 2019
Multi-Agent Pathfinding with Continuous TimeAnton Andreychuk, Konstantin Yakovlev, Dor Atzmon et al.
Multi-Agent Pathfinding (MAPF) is the problem of finding paths for multiple agents such that every agent reaches its goal and the agents do not collide. Most prior work on MAPF was on grids, assumed agents' actions have uniform duration, and that time is discretized into timesteps. We propose a MAPF algorithm that does not rely on these assumptions, is complete, and provides provably optimal solutions. This algorithm is based on a novel adaptation of Safe interval path planning (SIPP), a continuous time single-agent planning algorithm, and a modified version of Conflict-based search (CBS), a state of the art multi-agent pathfinding algorithm. We analyze this algorithm, discuss its pros and cons, and evaluate it experimentally on several standard benchmarks.
AIJul 2, 2017
Modifying Optimal SAT-based Approach to Multi-agent Path-finding Problem to Suboptimal VariantsPavel Surynek, Ariel Felner, Roni Stern et al.
In multi-agent path finding (MAPF) the task is to find non-conflicting paths for multiple agents. In this paper we focus on finding suboptimal solutions for MAPF for the sum-of-costs variant. Recently, a SAT-based approached was developed to solve this problem and proved beneficial in many cases when compared to other search-based solvers. In this paper, we present SAT-based unbounded- and bounded-suboptimal algorithms and compare them to relevant algorithms. Experimental results show that in many case the SAT-based solver significantly outperforms the search-based solvers.
AIMay 24, 2017
Efficient, Safe, and Probably Approximately Complete Learning of Action ModelsRoni Stern, Brendan Juba
In this paper we explore the theoretical boundaries of planning in a setting where no model of the agent's actions is given. Instead of an action model, a set of successfully executed plans are given and the task is to generate a plan that is safe, i.e., guaranteed to achieve the goal without failing. To this end, we show how to learn a conservative model of the world in which actions are guaranteed to be applicable. This conservative model is then given to an off-the-shelf classical planner, resulting in a plan that is guaranteed to achieve the goal. However, this reduction from a model-free planning to a model-based planning is not complete: in some cases a plan will not be found even when such exists. We analyze the relation between the number of observed plans and the likelihood that our conservative approach will indeed fail to solve a solvable problem. Our analysis show that the number of trajectories needed scales gracefully.
AIMar 3, 2017
Sequential Plan RecognitionReuth Mirsky, Roni Stern, Ya'akov et al.
Plan recognition algorithms infer agents' plans from their observed actions. Due to imperfect knowledge about the agent's behavior and the environment, it is often the case that there are multiple hypotheses about an agent's plans that are consistent with the observations, though only one of these hypotheses is correct. This paper addresses the problem of how to disambiguate between hypotheses, by querying the acting agent about whether a candidate plan in one of the hypotheses matches its intentions. This process is performed sequentially and used to update the set of possible hypotheses during the recognition process. The paper defines the sequential plan recognition process (SPRP), which seeks to reduce the number of hypotheses using a minimal number of queries. We propose a number of policies for the SPRP which use maximum likelihood and information gain to choose which plan to query. We show this approach works well in practice on two domains from the literature, significantly reducing the number of hypotheses using fewer queries than a baseline approach. Our results can inform the design of future plan recognition systems that interleave the recognition process with intelligent interventions of their users.