Archie C. Chapman

GT
h-index: 29
8 papers
21 citations
Novelty: 49%
AI Score: 28

8 Papers

SY Jul 17, 2020
Probabilistic assessment of the impact of flexible loads under network tariffs in low voltage distribution networks

Donald Azuatalam, Archie C. Chapman, Gregor Verbič

Given the historically static nature of low-voltage networks, distribution network companies do not possess tools for dealing with an increasingly variable demand due to the high penetration of distributed energy resources (DER). Within this context, this paper proposes a probabilistic framework for tariff design that minimises the impact of DER on network performance, stabilises network company revenue, and improves the equity of network cost allocation. To address the lack of customer response, we also show how DER-specific tariffs can be complemented with an automated home energy management system (HEMS) that reduces peak demand while retaining the desired comfort level. The proposed framework comprises a nonparametric Bayesian model that statistically generates synthetic load and PV traces, a hot-water-use statistical model, a novel HEMS to schedule customers' controllable devices, and a probabilistic power-flow model. Test cases using both energy- and demand-based network tariffs show that flat tariffs with a peak demand component reduce customers' costs and alleviate network constraints. This demonstrates, first, the efficacy of the proposed tool for developing tariffs that are beneficial for networks with a high DER penetration, and second, how customers' HEM systems can be part of the solution.

RO Mar 14, 2025
Training Directional Locomotion for Quadrupedal Low-Cost Robotic Systems via Deep Reinforcement Learning

Peter Böhm, Archie C. Chapman, Pauline Pounds

In this work we present Deep Reinforcement Learning (DRL) training of directional locomotion for low-cost quadrupedal robots in the real world. In particular, we exploit randomization of the heading that the robot must follow to foster exploration of the action-state transitions most useful for learning both forward locomotion and course adjustments. Changing the heading at episode resets to the current yaw plus a random value drawn from a normal distribution yields policies able to follow complex trajectories involving frequent turns in both directions as well as long straight-line stretches. By repeatedly changing the heading, this method keeps the robot moving within the training platform and thus reduces human involvement and the need for manual resets during training. Real-world experiments on a custom-built, low-cost quadruped demonstrate the efficacy of our method, with the robot successfully navigating all validation tests. When trained with other approaches, the robot succeeds only in the forward locomotion test and fails when turning is required.
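The heading reset described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the zero mean of the Gaussian, the value of `sigma`, and the wrapping of angles to [-pi, pi) are all assumptions made for illustration.

```python
import math
import random


def wrap_angle(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def sample_target_heading(current_yaw, sigma, rng):
    """Return a new target heading: current yaw plus Gaussian noise.

    Assumption: the noise is zero-mean with standard deviation `sigma`
    (radians); the abstract only states that the offset is drawn from
    a normal distribution.
    """
    return wrap_angle(current_yaw + rng.gauss(0.0, sigma))


def heading_error(target_heading, current_yaw):
    """Signed smallest angular difference between target and current yaw."""
    return wrap_angle(target_heading - current_yaw)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    yaw = 0.3
    target = sample_target_heading(yaw, sigma=0.5, rng=rng)
    print("target heading:", target, "error:", heading_error(target, yaw))
```

Because the new heading is centred on the robot's current yaw, small offsets (near-straight walking) are sampled most often while larger turns in either direction still occur, which matches the mix of straight stretches and turns the abstract describes.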

LG Mar 14, 2025
Low-cost Real-world Implementation of the Swing-up Pendulum for Deep Reinforcement Learning Experiments

Peter Böhm, Pauline Pounds, Archie C. Chapman

Deep reinforcement learning (DRL) has had success in virtual and simulated domains, but due to key differences between simulated and real-world environments, DRL-trained policies have had limited success in real-world applications. To help researchers bridge the sim-to-real gap, this paper describes a low-cost physical inverted pendulum apparatus and software environment for exploring sim-to-real DRL methods. In particular, the design of our apparatus enables detailed examination of the delays that arise in physical systems when sensing, communicating, learning, inferring and actuating. Moreover, we wish to improve access to educational systems, so our apparatus uses readily available materials and parts to reduce cost and logistical barriers. Our design shows how commercial, off-the-shelf electronics and electromechanical and sensor systems, combined with common metal extrusions, dowel and 3D-printed couplings, provide a pathway to an affordable physical DRL apparatus. The physical apparatus is complemented with a simulated environment implemented using a high-fidelity physics engine and an OpenAI Gym interface.

LG Jun 11, 2019
Macro-action Multi-time scale Dynamic Programming for Energy Management in Buildings with Phase Change Materials

Zahra Rahimpour, Gregor Verbic, Archie C. Chapman

This paper focuses on energy management in buildings with phase change material (PCM), which is primarily used to improve thermal performance, but can also serve as an energy storage system. In this setting, optimal scheduling of an HVAC system is challenging because of the nonlinear and non-convex characteristics of the PCM, which make solving the corresponding optimization problem using conventional optimization techniques impractical. Instead, we use dynamic programming (DP) to deal with the nonlinear nature of the PCM. To overcome DP's curse of dimensionality, this paper proposes a novel methodology to reduce the computational burden while maintaining the quality of the solution. Specifically, the method incorporates approaches from sequential decision making in artificial intelligence, including macro actions and multi-time scale Markov decision processes, coupled with an underlying state-space approximation to reduce the state-space and action-space size. The performance of the method is demonstrated on an energy management problem for a typical residential building located in Sydney, Australia. The results demonstrate that the proposed method performs well, with a computational speed-up of up to 12,900 times compared to the direct application of DP.

SY Apr 13, 2019
A Novel Probabilistic Framework to Study the Impact of PV-battery Systems on Low-Voltage Distribution Networks

Yiju Ma, Donald Azuatalam, Thomas Power et al.

Battery storage, particularly residential battery storage coupled with rooftop PV, is emerging as an essential component of the smart grid technology mix. However, including battery storage and other flexible resources like electric vehicles and loads with thermal inertia into a probabilistic analysis based on Monte Carlo (MC) simulation is challenging, because their operational profiles are determined by computationally intensive optimization. Additionally, MC analysis requires a large pool of statistically-representative demand profiles to sample from. As a result, the analysis of the network impact of PV-battery systems has attracted little attention in the existing literature. To fill these knowledge gaps, this paper proposes a novel probabilistic framework to study the impact of PV-battery systems on low-voltage distribution networks. Specifically, the framework incorporates home energy management (HEM) operational decisions within the MC time series power flow analysis. First, using available smart meter data, we use a Bayesian nonparametric model to generate statistically-representative synthetic demand and PV profiles. Second, a policy function approximation that emulates battery scheduling decisions is used to make the simulation of optimization-based HEM feasible within the MC framework. The efficacy of our method is demonstrated on three representative low-voltage feeders, where the computation time to execute our MC framework is 5% of that when using explicit optimization methods in each MC sample. The assessment results show that uncoordinated battery scheduling has a limited beneficial impact, which is against the conjecture that batteries will serendipitously mitigate the technical problems induced by PV generation.

SY Jun 1, 2015
An Iterative On-Line Mechanism for Demand-Side Aggregation

Archie C. Chapman, Gregor Verbic

This paper considers a demand-side aggregation scheme specifically for large numbers of small loads, such as households and small and medium-sized businesses. We introduce a novel auction format, called a staggered clock-proxy auction (SCPA), for on-line scheduling of these loads. This is a two-phase format, consisting of: a sequence of overlapping iterative ascending-price clock auctions, one for each time-slot over a finite decision horizon; and a set of proxy auctions that begin at the termination of each individual clock auction, and which determine the final price and allocation for each time-slot. The overlapping design of the clock phases grants bidders the ability to effectively bid on inter-temporal bundles of electricity use, thereby focusing on the most relevant parts of the price-quantity space. Since electricity is a divisible good, the proxy auction uses demand-schedule bids, which the aggregator uses to compute a uniform-price partial competitive equilibrium for each time-slot. We show that, under mild assumptions on the bidders' utility functions, the proxy phase implements the Vickrey-Clarke-Groves outcome, which makes straightforward bidding in the proxy phase a Bayes-Nash equilibrium. Furthermore, we demonstrate the SCPA in a scenario comprised of household agents with three different utility function types, and show how the mechanism enables efficient on-line energy use scheduling.
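As a rough illustration of the ascending-price clock idea only, the sketch below runs a single clock auction for one time-slot. It is not the SCPA: it omits the staggering across time-slots and the proxy phase. The function names, the linear demand schedules, the fixed supply, and the price increment are all hypothetical.

```python
def linear_demand(intercept, slope):
    """Build a hypothetical demand schedule q(p) = max(0, intercept - slope * p)."""
    def demand(price):
        return max(0.0, intercept - slope * price)
    return demand


def ascending_clock_auction(demand_fns, supply, start_price=0.0,
                            increment=0.1, max_rounds=10_000):
    """Raise a single uniform price until aggregate demand fits within supply.

    Returns the final clock price and each bidder's demand at that price.
    """
    price = start_price
    for _ in range(max_rounds):
        demands = [demand(price) for demand in demand_fns]
        if sum(demands) <= supply:
            return price, demands
        price += increment  # excess demand: tick the clock upward
    raise RuntimeError("clock auction did not clear within max_rounds")


if __name__ == "__main__":
    bidders = [linear_demand(5.0, 1.0), linear_demand(3.0, 1.0)]
    price, allocation = ascending_clock_auction(bidders, supply=4.0, increment=0.5)
    print("clearing price:", price, "allocation:", allocation)
```

With a coarse increment the clock can overshoot the exact market-clearing price, which is one reason a second phase that refines the final price and allocation is useful.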

GT Mar 15, 2012
Automated Planning in Repeated Adversarial Games

Enrique Munoz de Cote, Archie C. Chapman, Adam M. Sykulski et al.

Game theory's prescriptive power typically relies on full rationality and/or self-play interactions. In contrast, this work sets aside these fundamental premises and focuses instead on heterogeneous autonomous interactions between two or more agents. Specifically, we introduce a new and concise representation for repeated adversarial (constant-sum) games that highlights the necessary features that enable an automated planning agent to reason about how to score above the game's Nash equilibrium when facing heterogeneous adversaries. To this end, we present TeamUP, a model-based RL algorithm designed for learning and planning such an abstraction. In essence, it is somewhat similar to R-max with a cleverly engineered reward shaping that treats exploration as an adversarial optimization problem. In practice, it attempts to find an ally with which to tacitly collude (in more than two-player games) and then collaborates on a joint plan of actions that can consistently score a high utility in adversarial repeated games. We use the inaugural Lemonade Stand Game Tournament to demonstrate the effectiveness of our approach, and find that TeamUP is the best performing agent, demoting the Tournament's actual winning strategy into second place. In our experimental analysis, we show that our strategy successfully and consistently builds collaborations with many different heterogeneous (and sometimes very sophisticated) adversaries.

GT Feb 14, 2012
Filtered Fictitious Play for Perturbed Observation Potential Games and Decentralised POMDPs

Archie C. Chapman, Simon A. Williamson, Nicholas R. Jennings

Potential games and decentralised partially observable MDPs (Dec-POMDPs) are two commonly used models of multi-agent interaction, for static optimisation and sequential decision-making settings, respectively. In this paper we introduce filtered fictitious play for solving repeated potential games in which each player's observations of others' actions are perturbed by random noise, and use this algorithm to construct an online learning method for solving Dec-POMDPs. Specifically, we prove that noise in observations prevents standard fictitious play from converging to Nash equilibrium in potential games, which also makes fictitious play impractical for solving Dec-POMDPs. To combat this, we derive filtered fictitious play, and provide conditions under which it converges to a Nash equilibrium in potential games with noisy observations. We then use filtered fictitious play to construct a solver for Dec-POMDPs, and demonstrate our new algorithm's performance in a box pushing problem. Our results show that we consistently outperform the state-of-the-art Dec-POMDP solver by an average of 100% across the range of noise in the observation function.
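For context, the baseline that this abstract builds on can be sketched as follows. This is standard fictitious play with noise-free observations in a two-player identical-interest game (a simple kind of potential game), not the filtered variant the paper derives. The payoff matrix, the uniform initial counts, and the function names are assumptions chosen for illustration.

```python
def best_response(payoff, opponent_counts):
    """Pick the action maximising expected payoff against empirical beliefs.

    payoff[own][other] is the player's payoff; ties go to the lowest index.
    """
    total = sum(opponent_counts)
    beliefs = [count / total for count in opponent_counts]
    expected = [sum(u * b for u, b in zip(row, beliefs)) for row in payoff]
    return max(range(len(expected)), key=expected.__getitem__)


def fictitious_play(shared_payoff, rounds):
    """Run standard fictitious play in a two-player identical-interest game.

    shared_payoff[a][b] is the common payoff when the row player picks a
    and the column player picks b. Observations here are noise-free.
    """
    n_row, n_col = len(shared_payoff), len(shared_payoff[0])
    # The column player's view of the same payoffs, indexed [own][other].
    col_payoff = [[shared_payoff[a][b] for a in range(n_row)]
                  for b in range(n_col)]
    counts_of_col = [1] * n_col  # row player's counts of column actions
    counts_of_row = [1] * n_row  # column player's counts of row actions
    history = []
    for _ in range(rounds):
        a = best_response(shared_payoff, counts_of_col)
        b = best_response(col_payoff, counts_of_row)
        counts_of_col[b] += 1
        counts_of_row[a] += 1
        history.append((a, b))
    return history


if __name__ == "__main__":
    coordination_game = [[2, 0], [0, 1]]
    print(fictitious_play(coordination_game, rounds=5))
```

In this toy game both players settle on the joint action with the higher shared payoff, which is a pure Nash equilibrium. The abstract's point is that this convergence can fail once each player's observations of the other's actions are perturbed by noise.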