LG · Sep 8, 2017
Mirror Descent Search and its Acceleration
Megumi Miyashita, Shiro Yano, Toshiyuki Kondo
In recent years, the relationship between black-box optimization and reinforcement learning has attracted attention. In this research, we propose the Mirror Descent Search (MDS) algorithm, which is applicable to both black-box optimization problems and reinforcement learning problems. Our method is based on the mirror descent method, which is a general optimization algorithm. The contribution of this research is roughly twofold. We propose two essential algorithms, MDS and Accelerated Mirror Descent Search (AMDS), and two approximate algorithms, Gaussian Mirror Descent Search (G-MDS) and Gaussian Accelerated Mirror Descent Search (G-AMDS). This research shows that advanced methods developed in the context of mirror descent research can be applied to reinforcement learning problems. We also clarify the relationship between an existing reinforcement learning algorithm and our method. In two evaluation experiments, we show that our proposed algorithms converge faster than some state-of-the-art methods.
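For readers unfamiliar with the underlying mirror descent method, the following is a minimal sketch of textbook entropic mirror descent (exponentiated gradient) on the probability simplex. It is not the authors' MDS, AMDS, or Gaussian variants; the function name `mirror_descent_simplex`, the step size, and the toy cost vector are all assumptions chosen for illustration.

```python
import math


def mirror_descent_simplex(grad, x0, step=0.1, iters=200):
    """Textbook entropic mirror descent on the probability simplex.

    grad: callable returning the gradient (list of floats) at x.
    x0:   starting point, a list of positive floats summing to 1.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Gradient step taken in the dual (log) space of the entropy mirror map.
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        # Renormalizing is the KL (Bregman) projection back onto the simplex.
        x = [wi / total for wi in w]
    return x


# Hypothetical objective: expected cost <costs, x> over three discrete choices.
costs = [0.9, 0.2, 0.5]
x_final = mirror_descent_simplex(lambda x: costs, [1 / 3, 1 / 3, 1 / 3])
```

With this linear toy objective, the iterate concentrates its mass on the lowest-cost choice (index 1) while remaining a valid probability vector at every step.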
SY · Jun 9, 2015
Automated Linear Function Submission-based Double Auction as Bottom-up Real-Time Pricing in a Regional Prosumers' Electricity Network
Tadahiro Taniguchi, Koki Kawasaki, Yoshiro Fukui et al.
A linear function submission-based double-auction (LFS-DA) mechanism for a regional electricity network is proposed in this paper. Each agent in the network is equipped with a battery and a generator. Each agent is simultaneously a producer and a consumer of electricity, i.e., a prosumer, and trades electricity in the regional market at a variable price. In the LFS-DA, each agent uses linear demand and supply functions when submitting bids and asks to an auctioneer in the regional market. The LFS-DA can achieve an exact balance between electricity demand and supply for each time slot throughout the learning phase and was shown to be capable of solving the primal problem of maximizing the social welfare of the network without any central price setter, e.g., a utility or a large electricity company, in contrast with conventional real-time pricing (RTP). This paper clarifies the relationship between the RTP algorithm derived on the basis of a dual decomposition framework and the LFS-DA. Specifically, we proved that the changes in the price profile of the LFS-DA mechanism are equal to those achieved by the RTP mechanism derived from the dual decomposition framework, up to a constant factor.
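The idea of clearing a market from submitted linear functions can be sketched as follows. This is only an illustration under a simplifying assumption that is not taken from the paper: each agent submits a single linear net-demand function q(p) = a - b * p with b > 0 (positive q means buying, negative means selling). The function name `clear_market` and the sample bids are hypothetical, and the paper's actual bid format and learning rule may differ.

```python
def clear_market(bids):
    """Find the price at which total net demand is zero.

    bids: list of (a, b) pairs, each describing an assumed linear
          net-demand function q(p) = a - b * p with b > 0.
    Returns the clearing price and each agent's traded quantity.
    """
    total_a = sum(a for a, _ in bids)
    total_b = sum(b for _, b in bids)
    if total_b <= 0:
        raise ValueError("the summed slope must be positive")
    # Solve sum_i (a_i - b_i * p) = 0 for p.
    price = total_a / total_b
    trades = [a - b * price for a, b in bids]
    return price, trades


# Hypothetical bids from three prosumers for one time slot.
bids = [(4.0, 1.0), (-2.0, 0.5), (1.0, 0.5)]
price, trades = clear_market(bids)
```

Because the price is chosen so that the submitted functions sum to zero, the traded quantities balance exactly in this sketch, which mirrors the demand-supply balance property described in the abstract.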