OCFeb 15, 2013
Symbolic control of stochastic systems via approximately bisimilar finite abstractionsMajid Zamani, Peyman Mohajerin Esfahani, Rupak Majumdar et al.
Symbolic approaches to the control design over complex systems employ the construction of finite-state models that are related to the original control systems, then use techniques from finite-state synthesis to compute controllers satisfying specifications given in a temporal logic, and finally translate the synthesized schemes back as controllers for the concrete complex systems. Such approaches have been successfully developed and implemented for the synthesis of controllers over non-probabilistic control systems. In this paper, we extend the technique to probabilistic control systems modeled by controlled stochastic differential equations. We show that for every stochastic control system satisfying a probabilistic variant of incremental input-to-state stability, and for every given precision $\varepsilon>0$, a finite-state transition system can be constructed, which is $\varepsilon$-approximately bisimilar (in the sense of moments) to the original stochastic control system. Moreover, we provide results relating stochastic control systems to their corresponding finite-state transition systems in terms of probabilistic bisimulation relations known in the literature. We demonstrate the effectiveness of the construction by synthesizing controllers for stochastic control systems over rich specifications expressed in linear temporal logic. The discussed technique enables a new, automated, correct-by-construction controller synthesis approach for stochastic control systems, which are common mathematical models employed in many safety critical systems subject to structured uncertainty and are thus relevant for cyber-physical applications.
CYJun 29, 2023
Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human TutorsTung Phung, Victor-Alexandru Pădurean, José Cambronero et al.
Generative AI and large language models hold great promise in enhancing computing education by powering next-generation educational technologies for introductory programming. Recent works have studied these models for different scenarios relevant to programming education; however, these works are limited for several reasons, as they typically consider already outdated models or only specific scenario(s). Consequently, there is a lack of a systematic study that benchmarks state-of-the-art models for a comprehensive set of programming education scenarios. In our work, we systematically evaluate two models, ChatGPT (based on GPT-3.5) and GPT-4, and compare their performance with human tutors for a variety of scenarios. We evaluate using five introductory Python programming problems and real-world buggy programs from an online platform, and assess performance using expert-based annotations. Our results show that GPT-4 drastically outperforms ChatGPT (based on GPT-3.5) and comes close to human tutors' performance for several scenarios. These results also highlight settings where GPT-4 still struggles, providing exciting future directions on developing techniques to improve the performance of these models.
PLJan 24, 2023
Generating High-Precision Feedback for Programming Syntax Errors using Large Language ModelsTung Phung, José Cambronero, Sumit Gulwani et al.
Large language models (LLMs), such as Codex, hold great promise in enhancing programming education by automatically generating feedback for students. We investigate using LLMs to generate feedback for fixing syntax errors in Python programs, a key scenario in introductory programming. More concretely, given a student's buggy program, our goal is to generate feedback comprising a fixed program along with a natural language explanation describing the errors/fixes, inspired by how a human tutor would give feedback. While using LLMs is promising, the critical challenge is to ensure high precision in the generated feedback, which is imperative before deploying such technology in classrooms. The main research question we study is: Can we develop LLMs-based feedback generation techniques with a tunable precision parameter, giving educators quality control over the feedback that students receive? To this end, we introduce PyFiXV, our technique to generate high-precision feedback powered by Codex. The key idea behind PyFiXV is to use a novel run-time validation mechanism to decide whether the generated feedback is suitable for sharing with the student; notably, this validation mechanism also provides a precision knob to educators. We perform an extensive evaluation using two real-world datasets of Python programs with syntax errors and show the efficacy of PyFiXV in generating high-precision feedback.
SYMay 5, 2017
Shrinking Horizon Model Predictive Control with Signal Temporal Logic Constraints under Stochastic DisturbancesSamira S. Farahani, Rupak Majumdar, Vinayak Prabhu et al.
We present Shrinking Horizon Model Predictive Control (SHMPC) for discrete-time linear systems with Signal Temporal Logic (STL) specification constraints under stochastic disturbances. The control objective is to maximize an optimization function under the restriction that a given STL specification is satisfied with high probability against stochastic uncertainties. We formulate a general solution, which does not require precise knowledge of the probability distributions of the (possibly dependent) stochastic disturbances; only the bounded support intervals of the density functions and moment intervals are used. For the specific case of disturbances that are independent and normally distributed, we optimize the controllers further by utilizing knowledge of the disturbance probability distributions. We show that in both cases, the control law can be obtained by solving optimization problems with linear constraints at each step. We experimentally demonstrate effectiveness of this approach by synthesizing a controller for an HVAC system.
SYSep 29, 2017
Compositional Abstractions of Interconnected Discrete-Time Stochastic Control SystemsAbolfazl Lavaei, Sadegh Esmaeil Zadeh Soudjani, Rupak Majumdar et al.
This paper is concerned with a compositional approach for constructing abstractions of interconnected discrete-time stochastic control systems. The abstraction framework is based on new notions of so-called stochastic simulation functions, using which one can quantify the distance between original interconnected stochastic control systems and their abstractions in the probabilistic setting. Accordingly, one can leverage the proposed results to perform analysis and synthesis over abstract interconnected systems, and then carry the results over concrete ones. In the first part of the paper, we derive sufficient small-gain type conditions for the compositional quantification of the distance in probability between the interconnection of stochastic control subsystems and that of their abstractions. In the second part of the paper, we focus on the class of discrete-time linear stochastic control systems with independent noises in the abstract and concrete subsystems. For this class of systems, we propose a computational scheme to construct abstractions together with their corresponding stochastic simulation functions. We demonstrate the effectiveness of the proposed results by constructing an abstraction (totally 4 dimensions) of the interconnection of four discrete-time linear stochastic control subsystems (together 100 dimensions) in a compositional fashion.
SYApr 12, 2012
Synthesis of Minimal Error Control SoftwareRupak Majumdar, Indranil Saha, Majid Zamani
Software implementations of controllers for physical systems are at the core of many embedded systems. The design of controllers uses the theory of dynamical systems to construct a mathematical control law that ensures that the controlled system has certain properties, such as asymptotic convergence to an equilibrium point, while optimizing some performance criteria. However, owing to quantization errors arising from the use of fixed-point arithmetic, the implementation of this control law can only guarantee practical stability: under the actions of the implementation, the trajectories of the controlled system converge to a bounded set around the equilibrium point, and the size of the bounded set is proportional to the error in the implementation. The problem of verifying whether a controller implementation achieves practical stability for a given bounded set has been studied before. In this paper, we change the emphasis from verification to automatic synthesis. Using synthesis, the need for formal verification can be considerably reduced thereby reducing the design time as well as design cost of embedded control software. We give a methodology and a tool to synthesize embedded control software that is Pareto optimal w.r.t. both performance criteria and practical stability regions. Our technique is a combination of static analysis to estimate quantization errors for specific controller implementations and stochastic local search over the space of possible controllers using particle swarm optimization. The effectiveness of our technique is illustrated using examples of various standard control systems: in most examples, we achieve controllers with close LQR-LQG performance but with implementation errors, hence regions of practical stability, several times as small.
SYJul 26, 2019
Lazy Abstraction-Based Control for Safety SpecificationsKyle Hsu, Rupak Majumdar, Kaushik Mallik et al.
We present a lazy version of multi-layered abstraction-based controller synthesis (ABCS) for continuous-time nonlinear dynamical systems against safety specifications. State-of-the-art multi-layered ABCS uses pre-computed finite-state abstractions of different coarseness. Our new algorithm improves this technique by computing transitions on-the-fly, and only when a particular region of the state space needs to be explored by the controller synthesis algorithm for a specific coarseness. Additionally, our algorithm improves upon existing techniques by using coarser cells on a larger subset of the state space, which leads to significant computational savings.
SYOct 9, 2017
Parameter Optimization in Control Software using Statistical Fault Localization TechniquesJyotirmoy V. Deshmukh, Xiaoqing Jin, Rupak Majumdar et al.
Embedded controllers for cyber-physical systems are often parameterized by look-up maps representing discretizations of continuous functions on metric spaces. For example, a non-linear control action may be represented as a table of pre-computed values, and the output action of the controller for a given input is computed by using interpolation. For industrial-scale control systems, several man-hours of effort is spent in tuning the values within the look-up maps, and sub-optimal performance is often associated with inappropriate values in look-up maps. Suppose that during testing, the controller code is found to have sub-optimal performance. The parameter fault localization problem asks which parameter values in the code are potential causes of the sub-optimal behavior. We present a statistical parameter fault localization approach based on binary similarity coefficients and set spectra methods. Our approach extends previous work on software fault localization to a quantitative setting where the parameters encode continuous functions over a metric space and the program is reactive. We have implemented our approach in a simulation workflow for automotive control systems in Simulink. Given controller code with parameters (including look-up maps), our framework bootstraps the simulation workflow to return a ranked list of map entries which are deemed to have most impact on the performance. On a suite of industrial case studies with seeded errors, our tool was able to precisely identify the location of the errors.
SYSep 27, 2017
Compositional Construction of Finite State Abstractions for Stochastic Control SystemsKaushik Mallik, Sadegh Esmaeil Zadeh Soudjani, Anne-Kathrin Schmuck et al.
Controller synthesis techniques for continuous systems with respect to temporal logic specifications typically use a finite-state symbolic abstraction of the system. Constructing this abstraction for the entire system is computationally expensive, and does not exploit natural decompositions of many systems into interacting components. We have recently introduced a new relation, called (approximate) disturbance bisimulation for compositional symbolic abstraction to help scale controller synthesis for temporal logic to larger systems. In this paper, we extend the results to stochastic control systems modeled by stochastic differential equations. Given any stochastic control system satisfying a stochastic version of the incremental input-to-state stability property and a positive error bound, we show how to construct a finite-state transition system (if there exists one) which is disturbance bisimilar to the given stochastic control system. Given a network of stochastic control systems, we give conditions on the simultaneous existence of disturbance bisimilar abstractions to every component allowing for compositional abstraction of the network system.
SYFeb 13, 2016
Computing Distances between Reach FlowpipesRupak Majumdar, Vinayak S. Prabhu
We investigate quantifying the difference between two hybrid dynamical systems under noise and initial-state uncertainty. While the set of traces for these systems is infinite, it is possible to symbolically approximate trace sets using \emph{reachpipes} that compute upper and lower bounds on the evolution of the reachable sets with time. We estimate distances between corresponding sets of trajectories of two systems in terms of distances between the reachpipes. In case of two individual traces, the Skorokhod distance has been proposed as a robust and efficient notion of distance which captures both value and timing distortions. In this paper, we extend the computation of the Skorokhod distance to reachpipes, and provide algorithms to compute upper and lower bounds on the distance between two sets of traces. Our algorithms use new geometric insights that are used to compute the worst-case and best-case distances between two polyhedral sets evolving with time.
LGFeb 7, 2023
Online Reinforcement Learning with Uncertain Episode LengthsDebmalya Mandal, Goran Radanovic, Jiarui Gan et al.
Existing episodic reinforcement algorithms assume that the length of an episode is fixed across time and known a priori. In this paper, we consider a general framework of episodic reinforcement learning when the length of each episode is drawn from a distribution. We first establish that this problem is equivalent to online reinforcement learning with general discounting where the learner is trying to optimize the expected discounted sum of rewards over an infinite horizon, but where the discounting function is not necessarily geometric. We show that minimizing regret with this new general discounting is equivalent to minimizing regret with uncertain episode lengths. We then design a reinforcement learning algorithm that minimizes regret with general discounting but acts for the setting with uncertain episode lengths. We instantiate our general bound for different types of discounting, including geometric and polynomial discounting. We also show that we can obtain similar regret bounds even when the uncertainty over the episode lengths is unknown, by estimating the unknown distribution over time. Finally, we compare our learning algorithms with existing value-iteration based episodic RL algorithms in a grid-world environment.
SYJul 7, 2023
Neural Abstraction-Based Controller Synthesis and DeploymentRupak Majumdar, Mahmoud Salamati, Sadegh Soudjani
Abstraction-based techniques are an attractive approach for synthesizing correct-by-construction controllers to satisfy high-level temporal requirements. A main bottleneck for successful application of these techniques is the memory requirement, both during controller synthesis and in controller deployment. We propose memory-efficient methods for mitigating the high memory demands of the abstraction-based techniques using neural network representations. To perform synthesis for reach-avoid specifications, we propose an on-the-fly algorithm that relies on compressed neural network representations of the forward and backward dynamics of the system. In contrast to usual applications of neural representations, our technique maintains soundness of the end-to-end process. To ensure this, we correct the output of the trained neural network such that the corrected output representations are sound with respect to the finite abstraction. For deployment, we provide a novel training algorithm to find a neural network representation of the synthesized controller and experimentally show that the controller can be correctly represented as a combination of a neural network and a look-up table that requires a substantially smaller memory. We demonstrate experimentally that our approach significantly reduces the memory requirements of abstraction-based methods. For the selected benchmarks, our approach reduces the memory requirements respectively for the synthesis and deployment by a factor of $1.31\times 10^5$ and $7.13\times 10^3$ on average, and up to $7.54\times 10^5$ and $3.18\times 10^4$. Although this reduction is at the cost of increased off-line computations to train the neural networks, all the steps of our approach are parallelizable and can be implemented on machines with higher number of processing units to reduce the required computational time.
SYAug 17, 2011
A theory of robust software synthesisRupak Majumdar, Elaine Render, Paulo Tabuada
A key property for systems subject to uncertainty in their operating environment is robustness, ensuring that unmodelled, but bounded, disturbances have only a proportionally bounded effect upon the behaviours of the system. Inspired by ideas from robust control and dissipative systems theory, we present a formal definition of robustness and algorithmic tools for the design of optimally robust controllers for omega-regular properties on discrete transition systems. Formally, we define metric automata - automata equipped with a metric on states - and strategies on metric automata which guarantee robustness for omega-regular properties. We present fixed point algorithms to construct optimally robust strategies in polynomial time. In contrast to strategies computed by classical graph theoretic approaches, the strategies computed by our algorithm ensure that the behaviours of the controlled system gracefully degrade under the action of disturbances; the degree of degradation is parameterized by the magnitude of the disturbance. We show an application of our theory to the design of controllers that tolerate infinitely many transient errors provided they occur infrequently enough.
SYFeb 15, 2016
Dynamic Hierarchical Reactive Controller SynthesisAnne-Kathrin Schmuck, Rupak Majumdar
In the formal approach to reactive controller synthesis, a symbolic controller for a possibly hybrid system is obtained by algorithmically computing a winning strategy in a two-player game. Such game-solving algorithms scale poorly as the size of the game graph increases. However, in many applications, the game graph has a natural hierarchical structure. In this paper, we propose a modeling formalism and a synthesis algorithm that exploits this hierarchical structure for more scalable synthesis. We define local games on hierarchical graphs as a modeling formalism which decomposes a large-scale reactive synthesis problem in two dimensions. First, the construction of a hierarchical game graph introduces abstraction layers, where each layer is again a two-player game graph. Second, every such layer is decomposed into multiple local game graphs, each corresponding to a node in the higher level game graph. While local games have the potential to reduce the state space for controller synthesis, they lead to more complex synthesis problems where strategies computed for one local game can impose additional requirements on lower-level local games. Our second contribution is a procedure to construct a dynamic controller for local game graphs over hierarchies. The controller computes assume-admissible winning strategies that satisfy local specifications in the presence of environment assumptions, and dynamically updates specifications and strategies due to interactions between games at different abstraction layers at each step of the play. We show that our synthesis procedure is sound: the controller constructs a play which satisfies all local specifications. We illustrate our results through an example controlling an autonomous robot in a known, multistory building.
SYAug 9, 2019
Lazy Abstraction-Based Controller SynthesisKyle Hsu, Rupak Majumdar, Kaushik Mallik et al.
We present lazy abstraction-based controller synthesis (ABCS) for continuous-time nonlinear dynamical systems against reach-avoid and safety specifications. State-of-the-art multi-layered ABCS pre-computes multiple finite-state abstractions of varying granularity and applies reactive synthesis to the coarsest abstraction whenever feasible, but adaptively considers finer abstractions when necessary. Lazy ABCS improves this technique by constructing abstractions on demand. Our insight is that the abstract transition relation only needs to be locally computed for a small set of frontier states at the precision currently required by the synthesis algorithm. We show that lazy ABCS can significantly outperform previous multi-layered ABCS algorithms: on standard benchmarks, lazy ABCS is more than 4 times faster.
AIJul 19, 2023
Markov Decision Processes with Time-Varying Geometric DiscountingJiarui Gan, Annika Hennes, Rupak Majumdar et al.
Canonical models of Markov decision processes (MDPs) usually consider geometric discounting based on a constant discount factor. While this standard modeling approach has led to many elegant results, some recent studies indicate the necessity of modeling time-varying discounting in certain applications. This paper studies a model of infinite-horizon MDPs with time-varying discount factors. We take a game-theoretic perspective -- whereby each time step is treated as an independent decision maker with their own (fixed) discount factor -- and we study the subgame perfect equilibrium (SPE) of the resulting game as well as the related algorithmic problems. We present a constructive proof of the existence of an SPE and demonstrate the EXPTIME-hardness of computing an SPE. We also turn to the approximate notion of $ε$-SPE and show that an $ε$-SPE exists under milder assumptions. An algorithm is presented to compute an $ε$-SPE, of which an upper bound of the time complexity, as a function of the convergence property of the time-varying discount factor, is provided.
GTJun 6, 2023
Stochastic Principal-Agent Problems: Efficient Computation and LearningJiarui Gan, Rupak Majumdar, Debmalya Mandal et al.
We introduce a stochastic principal-agent model. A principal and an agent interact in a stochastic environment, each privy to observations about the state not available to the other. The principal has the power of commitment, both to elicit information from the agent and to provide signals about her own information. The players communicate with each other and then select actions independently. Each of them receives a payoff based on the state and their joint action, and the environment transitions to a new state. The interaction continues over a finite time horizon. Both players are far-sighted, aiming to maximize their total payoffs over the time horizon. The model encompasses as special cases extensive-form games (EFGs) and stochastic games of incomplete information, partially observable Markov decision processes (POMDPs), as well as other forms of sequential principal-agent interactions, including Bayesian persuasion and automated mechanism design problems. We consider both the computation and learning of the principal's optimal policy. Since the general problem, which subsumes POMDPs, is intractable, we explore algorithmic solutions under hindsight observability, where the state and the interaction history are revealed at the end of each step. Though the problem becomes more amenable under this condition, the number of possible histories remains exponential in the length of the time horizon, making approaches for EFG-based models infeasible. We present an efficient algorithm based on the inducible value sets. The algorithm computes an $ε$-approximate optimal policy in time polynomial in $1/ε$. Additionally, we show an efficient learning algorithm for an episodic reinforcement learning setting where the transition probabilities are unknown. The algorithm guarantees sublinear regret $\tilde{O}(T^{2/3})$ for both players over $T$ episodes.
OCMar 16, 2016
Coordinate-invariant incremental Lyapunov functionsMajid Zamani, Rupak Majumdar
In this note, we propose coordinate-invariant notions of incremental Lyapunov function and provide characterizations of incremental stability in terms of existence of the proposed Lyapunov functions.
SESep 2, 2024
Reward Augmentation in Reinforcement Learning for Testing Distributed SystemsAndrea Borgarelli, Constantin Enea, Rupak Majumdar et al.
Bugs in popular distributed protocol implementations have been the source of many downtimes in popular internet services. We describe a randomized testing approach for distributed protocol implementations based on reinforcement learning. Since the natural reward structure is very sparse, the key to successful exploration in reinforcement learning is reward augmentation. We show two different techniques that build on one another. First, we provide a decaying exploration bonus based on the discovery of new states -- the reward decays as the same state is visited multiple times. The exploration bonus captures the intuition from coverage-guided fuzzing of prioritizing new coverage points; in contrast to other schemes, we show that taking the maximum of the bonus and the Q-value leads to more effective exploration. Second, we provide waypoints to the algorithm as a sequence of predicates that capture interesting semantic scenarios. Waypoints exploit designer insight about the protocol and guide the exploration to ``interesting'' parts of the state space. Our reward structure ensures that new episodes can reliably get to deep interesting states even without execution caching. We have implemented our algorithm in Go. Our evaluation on three large benchmarks (RedisRaft, Etcd, and RSL) shows that our algorithm can significantly outperform baseline approaches in terms of coverage and bug finding.
49.4SYMay 21
Kernel-Based Safe Exploration in Deep Reinforcement LearningRupak Majumdar, Nikhil Singh, Sadegh Soudjani
Safety has been a major concern when deploying deep reinforcement learning algorithms in the real world. A promising direction that ensures that the learned policy does not visit unsafe regions is to learn a \emph{barrier function} along with the policy. A barrier is a function from states to reals that assigns low values to the initial states, high values to the unsafe states, and decreases in expectation on each transition; such a function can be used to bound the probability of reaching unsafe states. Previous attempts learned a barrier function directly from exploration data, but this required either large amounts of data or restrictions on the system dynamics. In this paper, we show how kernel embeddings can be used to learn barrier functions during deep reinforcement learning for stochastic systems with unknown dynamics. Our algorithm, \emph{kernel-based safe exploration (KBSE)}, learns an optimal policy and a barrier simultaneously during exploration. The barriers are computed iteratively, represented as conditional mean embeddings, and provide better probabilistic safety guarantees with more exploration. The exploration algorithm uses the learned barrier functions to identify safety violations. In the case of violation, it intervenes to modify the unsafe action to a safe action, thereby ensuring that the exploration is restricted to actions that bound the probability of reaching unsafe states. We evaluate KBSE on several complex continuous control benchmarks. Experimental results establish our new algorithm to be suitable for synthesizing control policies that are probabilistically safe without degradation in reward accumulation.
23.1PLApr 7
State Space Estimation for DPOR-based Model CheckersA. R. Balasubramanian, Mohammad Hossein Khoshechin Jorshari, Rupak Majumdar et al.
We study the estimation problem for concurrent programs: given a bounded program $P$, estimate the number of Mazurkiewicz trace-equivalence classes induced by its interleavings. This quantity informs two practical questions for enumeration-based model checking: how long a model checking run is likely to take, and what fraction of the search space has been covered so far. We first show the counting problem is #P-hard even for restricted programs and, unless $P=NP$, inapproximable within any subexponential factor, ruling out efficient exact or randomized approximation algorithms. We give a Monte Carlo approach to obtain a poly-time unbiased estimator: we convert a stateless optimal DPOR algorithm into an unbiased estimator by viewing its exploration as a bounded-depth, bounded-width tree whose leaves are the maximal Mazurkiewicz traces. A classical estimator by Knuth, when run on this tree, yields an unbiased estimate. To control the variance, we apply stochastic enumeration by maintaining a small population of partial paths per depth whose evolution is coupled. We have implemented our estimator in the JMC model checker and evaluated it on shared-memory benchmarks. With modest budgets, our estimator yields stable estimates, typically within a 20% band, within a few hundred trials, even when the state space has $10^5$--$10^6$ classes. We also show how the same machinery estimates model-checking cost by weighting all explored graphs, not only complete traces. Our algorithms provide the first provable poly-time unbiased estimators for counting traces, a problem of considerable importance when allocating model checking resources.
18.0SYMay 5
Almost Sure Reachability in Continuous-time Stochastic SystemsArash Bahari Kordabad, Rupak Majumdar, Sadegh Soudjani
We provide certificates for almost sure reachability of continuous-time stochastic systems governed by stochastic differential equations (SDEs). We first show that a standard Euler-Maruyama discretization may fail to preserve almost sure reachability property of the system using a double-well Langevin system. This observation motivates us to develop certificates for almost sure reachability directly on the continuous-time system. We introduce a pair of certificates, a drift function and a variant function, and prove necessity and sufficiency for almost sure reachability of an open bounded target set. Using these certificates, for linear SDEs, we give a characterization of almost sure reachability in terms of the spectral structure of the system matrices. For polynomial SDEs, we fix a polynomial template for the drift function and choose the variant function template as an exponential function composed with a polynomial. This allows us to translate the conditions in the certificates into sum-of-squares (SOS) constraints. We then propose an alternating scheme to resolve bilinearities. We illustrate the approach on the double-well Langevin example, showing that continuous-time SOS certificates recover almost sure reachability that is lost under time discretization. Moreover, we verify the SOS approach on a polynomial system.
ROMar 2, 2024
Optimal Integrated Task and Path Planning and Its Application to Multi-Robot Pickup and DeliveryAman Aryan, Manan Modi, Indranil Saha et al.
We propose a generic multi-robot planning mechanism that combines an optimal task planner and an optimal path planner to provide a scalable solution for complex multi-robot planning problems. The Integrated planner, through the interaction of the task planner and the path planner, produces optimal collision-free trajectories for the robots. We illustrate our general algorithm on an object pick-and-drop planning problem in a warehouse scenario where a group of robots is entrusted with moving objects from one location to another in the workspace. We solve the task planning problem by reducing it into an SMT-solving problem and employing the highly advanced SMT solver Z3 to solve it. To generate collision-free movement of the robots, we extend the state-of-the-art algorithm Conflict Based Search with Precedence Constraints with several domain-specific constraints. We evaluate our integrated task and path planner extensively on various instances of the object pick-and-drop planning problem and compare its performance with a state-of-the-art multi-robot classical planner. Experimental results demonstrate that our planning mechanism can deal with complex planning problems and outperforms a state-of-the-art classical planner both in terms of computation time and the quality of the generated plan.
AINov 18, 2024
Regret-Free Reinforcement Learning for LTL SpecificationsRupak Majumdar, Mahmoud Salamati, Sadegh Soudjani
Learning to control an unknown dynamical system with respect to high-level temporal specifications is an important problem in control theory. We present the first regret-free online algorithm for learning a controller for linear temporal logic (LTL) specifications for systems with unknown dynamics. We assume that the underlying (unknown) dynamics is modeled by a finite-state and action Markov decision process (MDP). Our core technical result is a regret-free learning algorithm for infinite-horizon reach-avoid problems on MDPs. For general LTL specifications, we show that the synthesis problem can be reduced to a reach-avoid problem once the graph structure is known. Additionally, we provide an algorithm for learning the graph structure, assuming knowledge of a minimum transition probability, which operates independently of the main regret-free algorithm. Our LTL controller synthesis algorithm provides sharp bounds on how close we are to achieving optimal behavior after a finite number of learning episodes. In contrast, previous algorithms for LTL synthesis only provide asymptotic guarantees, which give no insight into the transient performance during the learning phase.
LOSep 1, 2023
Satisfiability Checking of Multi-Variable TPTL with Unilateral Intervals Is PSPACE-CompleteShankara Narayanan Krishna, Khushraj Nanik Madnani, Rupak Majumdar et al.
We investigate the decidability of the ${0,\infty}$ fragment of Timed Propositional Temporal Logic (TPTL). We show that the satisfiability checking of TPTL$^{0,\infty}$ is PSPACE-complete. Moreover, even its 1-variable fragment (1-TPTL$^{0,\infty}$) is strictly more expressive than Metric Interval Temporal Logic (MITL) for which satisfiability checking is EXPSPACE complete. Hence, we have a strictly more expressive logic with computationally easier satisfiability checking. To the best of our knowledge, TPTL$^{0,\infty}$ is the first multi-variable fragment of TPTL for which satisfiability checking is decidable without imposing any bounds/restrictions on the timed words (e.g. bounded variability, bounded time, etc.). The membership in PSPACE is obtained by a reduction to the emptiness checking problem for a new "non-punctual" subclass of Alternating Timed Automata with multiple clocks called Unilateral Very Weak Alternating Timed Automata (VWATA$^{0,\infty}$) which we prove to be in PSPACE. We show this by constructing a simulation equivalent non-deterministic timed automata whose number of clocks is polynomial in the size of the given VWATA$^{0,\infty}$.
SENov 4, 2020
Probabilistic Bisimulation for Parameterized Systems (Technical Report)Chih-Duo Hong, Anthony W. Lin, Rupak Majumdar et al.
Probabilistic bisimulation is a fundamental notion of process equivalence for probabilistic systems. Among others, it has important applications including formalizing the anonymity property of several communication protocols. There is a lot of work on verifying probabilistic bisimulation for finite systems. This is however not the case for parameterized systems, where the problem is in general undecidable. In this paper we provide a generic framework for reasoning about probabilistic bisimulation for parameterized systems. Our approach is in the spirit of software verification, wherein we encode proof rules for probabilistic bisimulation and use a decidable first-order theory to specify systems and candidate bisimulation relations, which can then be checked automatically against the proof rules. As a case study, we show that our framework is sufficiently expressive for proving the anonymity property of the parameterized dining cryptographers protocol and the parameterized grades protocol, when supplied with a candidate regular bisimulation relation. Both of these protocols hitherto could not be verified by existing automatic methods. Moreover, with the help of standard automata learning algorithms, we show that the candidate relations can be synthesized fully automatically, making the verification fully automated.
ROOct 12, 2020
Multiparty Motion Coordination: From Choreographies to Robotics ProgramsRupak Majumdar, Nobuko Yoshida, Damien Zufferey
We present a programming model and typing discipline for complex multi-robot coordination programming. Our model encompasses both synchronisation through message passing and continuous-time dynamic motion primitives in physical space. We specify \emph{continuous-time motion primitives} in an assume-guarantee logic that ensures compatibility of motion primitives as well as collision freedom. We specify global behaviour of programs in a \emph{choreographic} type system that extends multiparty session types with jointly executed motion primitives, predicated refinements, as well as a \emph{separating conjunction} that allows reasoning about subsets of interacting robots. We describe a notion of \emph{well-formedness} for global types that ensures motion and communication can be correctly synchronised and provide algorithms for checking well-formedness, projecting a type, and local type checking. A well-typed program is \emph{communication safe}, \emph{motion compatible}, and \emph{collision free}. Our type system provides a compositional approach to ensuring these properties. We have implemented our model on top of the ROS framework. This allows us to program multi-robot coordination scenarios on top of commercial and custom robotics hardware platforms. We show through case studies that we can model and statically verify quite complex manoeuvres involving multiple manipulators and mobile robots---such examples are beyond the scope of previous approaches.
AISep 12, 2019
Joint Inference of Reward Machines and Policies for Reinforcement LearningZhe Xu, Ivan Gavran, Yousef Ahmad et al.
Incorporating high-level knowledge is an effective way to expedite reinforcement learning (RL), especially for complex tasks with sparse rewards. We investigate an RL problem where the high-level knowledge is in the form of reward machines, i.e., a type of Mealy machine that encodes the reward functions. We focus on a setting in which this knowledge is a priori not available to the learning agent. We develop an iterative algorithm that performs joint inference of reward machines and policies for RL (more specifically, q-learning). In each iteration, the algorithm maintains a hypothesis reward machine and a sample of RL episodes. It derives q-functions from the current hypothesis reward machine, and performs RL to update the q-functions. While performing RL, the algorithm updates the sample by adding RL episodes along which the obtained rewards are inconsistent with the rewards based on the current hypothesis reward machine. In the next iteration, the algorithm infers a new hypothesis reward machine from the updated sample. Based on an equivalence relationship we defined between states of reward machines, we transfer the q-functions between the hypothesis reward machines in consecutive iterations. We prove that the proposed algorithm converges almost surely to an optimal policy in the limit if a minimal reward machine can be inferred and the maximal length of each RL episode is sufficiently long. The experiments show that learning high-level knowledge in the form of reward machines can lead to fast convergence to optimal policies in RL, while standard RL methods such as q-learning and hierarchical RL methods fail to converge to optimal policies after a substantial number of training steps in many tasks.
SEFeb 4, 2019
Paracosm: A Language and Tool for Testing Autonomous Driving SystemsRupak Majumdar, Aman Mathur, Marcus Pirron et al.
Systematic testing of autonomous vehicles operating in complex real-world scenarios is a difficult and expensive problem. We present Paracosm, a reactive language for writing test scenarios for autonomous driving systems. Paracosm allows users to programmatically describe complex driving situations with specific visual features, e.g., road layout in an urban environment, as well as reactive temporal behaviors of cars and pedestrians. Paracosm programs are executed on top of a game engine that provides realistic physics simulation and visual rendering. The infrastructure allows systematic exploration of the state space, both for visual features (lighting, shadows, fog) and for reactive interactions with the environment (pedestrians, other traffic). We define a notion of test coverage for Paracosm configurations based on combinatorial testing and low dispersion sequences. Paracosm comes with an automatic test case generator that uses random sampling for discrete parameters and deterministic quasi-Monte Carlo generation for continuous parameters. Through an empirical evaluation, we demonstrate the modeling and testing capabilities of Paracosm on a suite of autonomous driving systems implemented using deep neural networks developed in research and education. We show how Paracosm can expose incorrect behaviors or degraded performance.
LGJan 21, 2019
Perception-in-the-Loop Adversarial ExamplesMahmoud Salamati, Sadegh Soudjani, Rupak Majumdar
We present a scalable, black box, perception-in-the-loop technique to find adversarial examples for deep neural network classifiers. Black box means that our procedure only has input-output access to the classifier, and not to the internal structure, parameters, or intermediate confidence values. Perception-in-the-loop means that the notion of proximity between inputs can be directly queried from human participants rather than an arbitrarily chosen metric. Our technique is based on covariance matrix adaptation evolution strategy (CMA-ES), a black box optimization approach. CMA-ES explores the search space iteratively in a black box manner, by generating populations of candidates according to a distribution, choosing the best candidates according to a cost function, and updating the posterior distribution to favor the best candidates. We run CMA-ES using human participants to provide the fitness function, using the insight that the choice of best candidates in CMA-ES can be naturally modeled as a perception task: pick the top $k$ inputs perceptually closest to a fixed input. We empirically demonstrate that finding adversarial examples is feasible using small populations and few iterations. We compare the performance of CMA-ES on the MNIST benchmark with other black-box approaches using $L_p$ norms as a cost function, and show that it performs favorably both in terms of success in finding adversarial examples and in minimizing the distance between the original and the adversarial input. In experiments on the MNIST, CIFAR10, and GTSRB benchmarks, we demonstrate that CMA-ES can find perceptually similar adversarial inputs with a small number of iterations and small population sizes when using perception-in-the-loop. Finally, we show that networks trained specifically to be robust against $L_\infty$ norm can still be susceptible to perceptually similar adversarial examples.
FLJan 3, 2019
Causality Analysis for Concurrent Reactive Systems (Extended Abstract)Rayna Dimitrova, Rupak Majumdar, Vinayak S. Prabhu
We present a comprehensive language theoretic causality analysis framework for explaining safety property violations in the setting of concurrent reactive systems. Our framework allows us to uniformly express a number of causality notions studied in the areas of artificial intelligence and formal methods, as well as define new ones that are of potential interest in these areas. Furthermore, our formalization provides means for reasoning about the relationships between individual notions which have mostly been considered independently in prior work; and allows us to judge the appropriateness of the different definitions for various applications in system design. In particular, we consider causality analysis notions for debugging, error resilience, and liability resolution in concurrent reactive systems. Finally, we present automata-based algorithms for computing various causal sets based on our language-theoretic encoding, and derive the algorithmic complexities.
ROMar 6, 2018
Precise but Natural Specification for Robot TasksIvan Gavran, Brendon Boldt, Eva Darulova et al.
We present Flipper, a natural language interface for describing high-level task specifications for robots that are compiled into robot actions. Flipper starts with a formal core language for task planning that allows expressing rich temporal specifications and uses a semantic parser to provide a natural language interface. Flipper provides immediate visual feedback by executing an automatically constructed plan of the task in a graphical user interface. This allows the user to resolve potentially ambiguous interpretations. Flipper extends itself via naturalization: its users can add definitions for utterances, from which Flipper induces new rules and adds them to the core language, gradually growing a more and more natural task specification language. Flipper improves the naturalization by generalizing the definition provided by users. Unlike other task-specification systems, Flipper enables natural language interactions while maintaining the expressive power and formal precision of a programming language. We show through an initial user study that natural language interactions and generalization can considerably ease the description of tasks. Moreover, over time, users employ more and more concepts outside of the initial core language. Such extensions are available to the Flipper community, and users can use concepts that others have defined.
SYAug 9, 2017
Compositional Abstraction-Based Controller Synthesis for Continuous-Time SystemsKaushik Mallik, Anne-Kathrin Schmuck, Sadegh Soudjani et al.
Controller synthesis techniques for continuous systems with respect to temporal logic specifications typically use a finite-state symbolic abstraction of the system model. Constructing this abstraction for the entire system is computationally expensive, and does not exploit natural decompositions of many systems into interacting components. We describe a methodology for compositional symbolic abstraction to help scale controller synthesis for temporal logic to larger systems. We introduce a new relation, called (approximate) disturbance bisimulation, as the basis for compositional symbolic abstractions. Disturbance bisimulation strengthens the standard approximate alternating bisimulation relation used in control. It extends naturally to systems which are composed of weakly interconnected sub-components possibly connected in feedback, and models the coupling signals as disturbances. After proving this composability of disturbance bisimulation for metric systems we apply this result to the compositional abstraction of networks of input-to-state stable deterministic non-linear control systems. We give conditions that allow to construct finite-state abstractions compositionally for each component in such a network, so that the abstractions are simultaneously disturbance bisimilar to their continuous counterparts. Combining these two results, we show conditions under which one can compositionally abstract a network of non-linear control systems in a modular way while ensuring that the final composed abstraction is disturbance bisimilar to the original system. We discuss how we get a compositional abstraction-based controller synthesis methodology for networks of such systems against local temporal specifications as a by-product of our construction.
SYJul 17, 2017
The Robot Routing Problem for Collecting Aggregate Stochastic RewardsRayna Dimitrova, Ivan Gavran, Rupak Majumdar et al.
We propose a new model for formalizing reward collection problems on graphs with dynamically generated rewards which may appear and disappear based on a stochastic model. The *robot routing problem* is modeled as a graph whose nodes are stochastic processes generating potential rewards over discrete time. The rewards are generated according to the stochastic process, but at each step, an existing reward disappears with a given probability. The edges in the graph encode the (unit-distance) paths between the rewards' locations. On visiting a node, the robot collects the accumulated reward at the node at that time, but traveling between the nodes takes time. The optimization question asks to compute an optimal (or epsilon-optimal) path that maximizes the expected collected rewards. We consider the finite and infinite-horizon robot routing problems. For finite-horizon, the goal is to maximize the total expected reward, while for infinite horizon we consider limit-average objectives. We study the computational and strategy complexity of these problems, establish NP-lower bounds and show that optimal strategies require memory in general. We also provide an algorithm for computing epsilon-optimal infinite paths for arbitrary epsilon > 0.
SYJun 26, 2017
Multilevel Monte Carlo Method for Statistical Model Checking of Hybrid SystemsSadegh Esmaeil Zadeh Soudjani, Rupak Majumdar, Tigran Nagapetyan
We study statistical model checking of continuous-time stochastic hybrid systems. The challenge in applying statistical model checking to these systems is that one cannot simulate such systems exactly. We employ the multilevel Monte Carlo method (MLMC) and work on a sequence of discrete-time stochastic processes whose executions approximate and converge weakly to that of the original continuous-time stochastic hybrid system with respect to satisfaction of the property of interest. With focus on bounded-horizon reachability, we recast the model checking problem as the computation of the distribution of the exit time, which is in turn formulated as the expectation of an indicator function. This latter computation involves estimating discontinuous functionals, which reduces the bound on the convergence rate of the Monte Carlo algorithm. We propose a smoothing step with tunable precision and formally quantify the error of the MLMC approach in the mean-square sense, which is composed of smoothing error, bias, and variance. We formulate a general adaptive algorithm which balances these error terms. Finally, we describe an application of our technique to verify a model of thermostatically controlled loads.
DMFeb 11, 2016
Hitting Families of Schedules for Asynchronous ProgramsDmitry Chistikov, Rupak Majumdar, Filip Niksic
We consider the following basic task in the testing of concurrent systems. The input to the task is a partial order of events, which models actions performed on or by the system and specifies ordering constraints between them. The task is to determine if some scheduling of these events can result in a bug. The number of schedules to be explored can, in general, be exponential. Empirically, many bugs in concurrent programs have been observed to have small bug depth; that is, these bugs are exposed by every schedule that orders $d$ specific events in a particular way, irrespective of how the other events are ordered, and $d$ is small compared to the total number of events. To find all bugs of depth $d$, one needs to only test a $d$-hitting family of schedules: we call a set of schedules a $d$-hitting family if for each set of $d$ events, and for each allowed ordering of these events, there is some schedule in the family that executes these events in this ordering. The size of a $d$-hitting family may be much smaller than the number of all possible schedules, and a natural question is whether one can find $d$-hitting families of schedules that have small size. In general, finding the size of optimal $d$-hitting families is hard, even for $d=2$. We show, however, that when the partial order is a tree, one can explicitly construct $d$-hitting families of schedules of small size. When the tree is balanced, our constructions are polylogarithmic in the number of events.
AIOct 29, 2015
Automatic Synthesis of Geometry Problems for an Intelligent Tutoring SystemChris Alvin, Sumit Gulwani, Rupak Majumdar et al.
This paper presents an intelligent tutoring system, GeoTutor, for Euclidean Geometry that is automatically able to synthesize proof problems and their respective solutions given a geometric figure together with a set of properties true of it. GeoTutor can provide personalized practice problems that address student deficiencies in the subject matter.
SYJul 2, 2015
Dynamic Bayesian Networks as Formal Abstractions of Structured Stochastic ProcessesSadegh Esmaeil Zadeh Soudjani, Alessandro Abate, Rupak Majumdar
We study the problem of finite-horizon probabilistic invariance for discrete-time Markov processes over general (uncountable) state spaces. We compute discrete-time, finite-state Markov chains as formal abstractions of general Markov processes. Our abstraction differs from existing approaches in two ways. First, we exploit the structure of the underlying Markov process to compute the abstraction separately for each dimension. Second, we employ dynamic Bayesian networks (DBN) as compact representations of the abstraction. In contrast, existing approaches represent and store the (exponentially large) Markov chain explicitly, which leads to heavy memory requirements limiting the application to models of dimension less than half, according to our experiments. We show how to construct a DBN abstraction of a Markov process satisfying an independence assumption on the driving process noise. We compute a guaranteed bound on the error in the abstraction w.r.t.\ the probabilistic invariance property; the dimension-dependent abstraction makes the error bounds more precise than existing approaches. Additionally, we show how factor graphs and the sum-product algorithm for DBNs can be used to solve the finite-horizon probabilistic invariance problem. Together, DBN-based representations and algorithms can be significantly more efficient than explicit representations of Markov chains for abstracting and model checking structured Markov processes.
SYMay 21, 2015
Quantifying Conformance using the Skorokhod Metric (full version)Jyotirmoy V. Deshmukh, Rupak Majumdar, Vinayak S. Prabhu
The conformance testing problem for dynamical systems asks, given two dynamical models (e.g., as Simulink diagrams), whether their behaviors are "close" to each other. In the semi-formal approach to conformance testing, the two systems are simulated on a large set of tests, and a metric, defined on pairs of real-valued, real-timed trajectories, is used to determine a lower bound on the distance. We show how the Skorkhod metric on continuous dynamical systems can be used as the foundation for conformance testing of complex dynamical models. The Skorokhod metric allows for both state value mismatches and timing distortions, and is thus well suited for checking conformance between idealized models of dynamical systems and their implementations. We demonstrate the robustness of the system conformance quantification by proving a \emph{transference theorem}: trajectories close under the Skorokhod metric satisfy "close" logical properties. Specifically, we show the result for the timed linear time logic \TLTL augmented with a rich class of temporal and spatial constraint predicates. We provide a window-based streaming algorithm to compute the Skorokhod metric, and use it as a basis for a conformance testing tool for Simulink. We experimentally demonstrate the effectiveness of our tool in finding discrepant behaviors on a set of control system benchmarks, including an industrial challenge problem.
LONov 3, 2014
Approximate Counting in SMT and Value Estimation for Probabilistic ProgramsDmitry Chistikov, Rayna Dimitrova, Rupak Majumdar
#SMT, or model counting for logical theories, is a well-known hard problem that generalizes such tasks as counting the number of satisfying assignments to a Boolean formula and computing the volume of a polytope. In the realm of satisfiability modulo theories (SMT) there is a growing need for model counting solvers, coming from several application domains (quantitative information flow, static analysis of probabilistic programs). In this paper, we show a reduction from an approximate version of #SMT to SMT. We focus on the theories of integer arithmetic and linear real arithmetic. We propose model counting algorithms that provide approximate solutions with formal bounds on the approximation error. They run in polynomial time and make a polynomial number of queries to the SMT solver for the underlying theory, exploiting "for free" the sophisticated heuristics implemented within modern SMT solvers. We have implemented the algorithms and used them to solve the value problem for a model of loop-free probabilistic programs with nondeterminism.
SYOct 22, 2014
Computing the Skorokhod Distance between Polygonal Traces (Full Paper)Rupak Majumdar, Vinayak S. Prabhu
The \emph{Skorokhod distance} is a natural metric on traces of continuous and hybrid systems. For two traces, from $[0,T]$ to values in a metric space $O$, it measures the best match between the traces when allowed continuous bijective timing distortions. Formally, it computes the infimum, over all timing distortions, of the maximum of two components: the first component quantifies the {\em timing discrepancy} of the timing distortion, and the second quantifies the mismatch (in the metric space $O$) of the values under the timing distortion. Skorokhod distances appear in various fundamental hybrid systems analysis concerns: from definitions of hybrid systems semantics and notions of equivalence, to practical problems such as checking the closeness of models or the quality of simulations. Despite its popularity and extensive theoretical use, the \emph{computation} problem for the Skorokhod distance between two finite sampled-time hybrid traces has remained open. We address in this work the problem of computing the Skorokhod distance between two polygonal traces (these traces arise when sampled-time traces are completed by linear interpolation between sample points). We provide the first algorithm to compute the exact Skorokhod distance when trace values are in $\reals^n$ for the $L_1$, $L_2$, and $L_{\infty}$ norms. Our algorithm, based on a reduction to Frechet distances, is fully polynomial-time, and incorporates novel polynomial-time procedures for a set of geometric primitives in $\reals^n$ over the three norms.
SENov 20, 2013
Dynamic Package Interfaces - Extended VersionShahram Esmaeilsabzali, Rupak Majumdar, Thomas Wies et al.
A hallmark of object-oriented programming is the ability to perform computation through a set of interacting objects. A common manifestation of this style is the notion of a package, which groups a set of commonly used classes together. A challenge in using a package is to ensure that a client follows the implicit protocol of the package when calling its methods. Violations of the protocol can cause a runtime error or latent invariant violations. These protocols can extend across different, potentially unboundedly many, objects, and are specified informally in the documentation. As a result, ensuring that a client does not violate the protocol is hard. We introduce dynamic package interfaces (DPI), a formalism to explicitly capture the protocol of a package. The DPI of a package is a finite set of rules that together specify how any set of interacting objects of the package can evolve through method calls and under what conditions an error can happen. We have developed a dynamic tool that automatically computes an approximation of the DPI of a package, given a set of abstraction predicates. A key property of DPI is that the unbounded number of configurations of objects of a package are summarized finitely in an abstract domain. This uses the observation that many packages behave monotonically: the semantics of a method call over a configuration does not essentially change if more objects are added to the configuration. We have exploited monotonicity and have devised heuristics to obtain succinct yet general DPIs. We have used our tool to compute DPIs for several commonly used Java packages with complex protocols, such as JDBC, HashSet, and ArrayList.
SENov 19, 2013
A Notion of Dynamic Interface for Depth-Bounded Object-Oriented PackagesShahram Esmaeilsabzali, Rupak Majumdar, Thomas Wies et al.
Programmers using software components have to follow protocols that specify when it is legal to call particular methods with particular arguments. For example, one cannot use an iterator over a set once the set has been changed directly or through another iterator. We formalize the notion of dynamic package interfaces (DPI), which generalize state-machine interfaces for single objects, and give an algorithm to statically compute a sound abstraction of a DPI. States of a DPI represent (unbounded) sets of heap configurations and edges represent the effects of method calls on the heap. We introduce a novel heap abstract domain based on depth-bounded systems to deal with potentially unboundedly many objects and the references among them. We have implemented our algorithm and show that it is effective in computing representations of common patterns of package usage, such as relationships between viewer and label, container and iterator, and JDBC statements and cursors.
AIJul 4, 2012
Counterexample-guided PlanningKrishnendu Chatterjee, Thomas A. Henzinger, Ranjit Jhala et al.
Planning in adversarial and uncertain environments can be modeled as the problem of devising strategies in stochastic perfect information games. These games are generalizations of Markov decision processes (MDPs): there are two (adversarial) players, and a source of randomness. The main practical obstacle to computing winning strategies in such games is the size of the state space. In practice therefore, one typically works with abstractions of the model. The diffculty is to come up with an abstraction that is neither too coarse to remove all winning strategies (plans), nor too fine to be intractable. In verification, the paradigm of counterexample-guided abstraction refinement has been successful to construct useful but parsimonious abstractions automatically. We extend this paradigm to probabilistic models (namely, perfect information games and, as a special case, MDPs). This allows us to apply the counterexample-guided abstraction paradigm to the AI planning problem. As special cases, we get planning algorithms for MDPs and deterministic systems that automatically construct system abstractions.