Ung-il Chung

STAT-MECH
h-index: 9
4 papers
9 citations
Novelty: 54%
AI Score: 28

4 Papers

LG · Aug 28, 2023 · Code
Simple Modification of the Upper Confidence Bound Algorithm by Generalized Weighted Averages

Nobuhito Manome, Shuji Shinohara, Ung-il Chung

The multi-armed bandit (MAB) problem is a classical problem that models sequential decision-making under uncertainty in reinforcement learning. In this study, we propose a new generalized upper confidence bound (UCB) algorithm (GWA-UCB1) by extending UCB1, which is a representative algorithm for MAB problems, using generalized weighted averages, and present an effective algorithm for various problem settings. GWA-UCB1 is a two-parameter generalization of the balance between exploration and exploitation in UCB1 and can be implemented with a simple modification of the UCB1 formula. Therefore, this algorithm can be easily applied to UCB-based reinforcement learning models. In preliminary experiments, we investigated the optimal parameters of GWA-UCB1 and of a simple generalized UCB1 (G-UCB1), prepared for comparison, in a stochastic MAB problem with two arms. Subsequently, we confirmed the performance of the algorithms with the investigated parameters on stochastic MAB problems when arm reward probabilities were sampled from uniform or normal distributions and on survival MAB problems assuming more realistic situations. GWA-UCB1 outperformed G-UCB1, UCB1-Tuned, and Thompson sampling in most problem settings and can be useful in many situations. The code is available at https://github.com/manome/python-mab.
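
For orientation, the baseline that the abstract extends can be sketched as follows. This is a minimal sketch of standard UCB1 on a Bernoulli bandit, not the GWA-UCB1 formula, which the abstract does not state. The function names `ucb1_select` and `run_ucb1`, the reward probabilities, and the horizon are illustrative assumptions and are not taken from the linked repository.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Return the arm chosen by standard UCB1 at round t (1-based)."""
    # Play every arm once before the confidence bound is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean (exploitation) plus confidence radius (exploration).
    scores = [m + math.sqrt(2.0 * math.log(t) / n)
              for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(probs, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit with hypothetical arm probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)
    total_reward = 0
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, means, t)
        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the empirical mean for the pulled arm.
        means[arm] += (reward - means[arm]) / counts[arm]
        total_reward += reward
    return counts, total_reward


if __name__ == "__main__":
    counts, total_reward = run_ucb1([0.2, 0.8], horizon=2000)
    print("pulls per arm:", counts, "total reward:", total_reward)
```

The score in `ucb1_select` is the single place where exploration and exploitation are combined, which is presumably the part a "simple modification of the UCB1 formula" would change.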

MA · May 13, 2024
Walk model that continuously generates Brownian walks to Lévy walks depending on destination attractiveness

Shuji Shinohara, Daiki Morita, Hayato Hirai et al.

The Lévy walk, a type of random walk characterized by linear step lengths that follow a power-law distribution, is observed in the migratory behaviors of various organisms, ranging from bacteria to humans. Notably, Lévy walks with power exponents close to two, also known as Cauchy walks, are frequently observed, though their underlying causes remain elusive. This study proposes a walk model in which agents move toward a destination in multi-dimensional space and their movement strategy is parameterized by the extent to which they pursue the shortest path to the destination. This parameter is taken to represent the attractiveness of the destination to the agents. Our findings reveal that if the destination is very attractive, agents intensively search the area around it using Brownian walks, whereas if the destination is unattractive, they explore a distant region away from the point using Lévy walks with power exponents less than two. In the case where agents are unable to determine whether the destination is attractive or unattractive, Cauchy walks emerge. The Cauchy walker searches the region with a probability inversely proportional to the distance from the destination. This suggests that it preferentially searches the area close to the destination, while concurrently having the potential to extend the search area much further. Our model, which can change the search method and search area depending on the attractiveness of the destination, has the potential to be utilized for exploring the parameter space of optimization problems.
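
The power-law step lengths mentioned in the abstract can be illustrated with a short sketch. It draws step lengths from p(l) ∝ l^(-mu) by inverse-transform sampling and takes each step in a uniformly random direction in two dimensions. It does not reproduce the destination-attractiveness model of the paper; the names `power_law_step` and `levy_walk_2d`, the minimum step length `l_min`, and the isotropic direction choice are assumptions made for illustration.

```python
import math
import random


def power_law_step(mu, l_min=1.0, rng=random):
    """Draw a step length from p(l) proportional to l**(-mu), l >= l_min."""
    if mu <= 1.0:
        raise ValueError("mu must exceed 1 for a normalizable power law")
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    # Inverse of the CDF F(l) = 1 - (l / l_min) ** (-(mu - 1)).
    return l_min * (1.0 - u) ** (-1.0 / (mu - 1.0))


def levy_walk_2d(n_steps, mu, l_min=1.0, seed=0):
    """Generate a 2D walk with power-law step lengths (illustrative only)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        length = power_law_step(mu, l_min, rng)
        angle = rng.uniform(0.0, 2.0 * math.pi)  # isotropic direction
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        path.append((x, y))
    return path


if __name__ == "__main__":
    # mu = 2 corresponds to the Cauchy walk described in the abstract.
    path = levy_walk_2d(1000, mu=2.0)
    print("final position:", path[-1])
```

Smaller values of `mu` make very long steps more frequent, which matches the abstract's description of exponents below two producing exploration far from the starting region.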

STAT-MECH · May 23, 2023
Inverse square Levy walk emerging universally in goal-oriented tasks

Shuji Shinohara, Daiki Morita, Nobuhito Manome et al.

The Levy walk, in which the frequency of occurrence of step lengths follows a power-law distribution, can be observed in the migratory behavior of organisms at various levels. Levy walks with power exponents close to 2 are observed, but the reasons are unclear. This study aims to propose a model that universally generates inverse square Levy walks (called Cauchy walks) and to identify the conditions under which Cauchy walks appear. We demonstrate that Cauchy walks emerge universally in goal-oriented tasks. We use the term "goal-oriented" when the goal is clear but can be achieved in different ways, so that the means of achieving it cannot be uniquely determined. We performed a simulation in which an agent observed the data generated from a probability distribution in a two-dimensional space and successively estimated the central coordinates of that probability distribution. The agent has a model of probability distribution as a hypothesis for the data-generating distribution and modifies the model each time a data point is observed so as to increase the estimated probability of occurrence of the observed data. To achieve this, the center coordinates of the model must be moved closer to those of the observed data. However, in the case of a two-dimensional space, arbitrariness arises in the direction of correction of the center; in this sense, the task is goal-oriented. We analyze two cases: a strategy that allocates the amount of modification randomly in the x- and y-directions, and a strategy that determines allocation such that movement is minimized. The results reveal that when the random strategy is used, the Cauchy walk appears, whereas when the minimum strategy is used, the Brownian walk appears. The presence or absence of the constraint of minimizing the amount of movement may be a factor that causes the difference between Brownian and Levy walks.

AI · Dec 16, 2020
Lévy walks derived from a Bayesian decision-making model in non-stationary environments

Shuji Shinohara, Nobuhito Manome, Yoshihiro Nakajima et al.

Lévy walks are found in the migratory behaviour patterns of various organisms, and the reason for this phenomenon has been much discussed. We use simulations to demonstrate that learning causes changes in confidence level during decision-making in non-stationary environments, resulting in Lévy-walk-like patterns. One inference algorithm involving confidence is Bayesian inference. We propose an algorithm that introduces the effects of learning and forgetting into Bayesian inference, and simulate an imitation game in which two decision-making agents incorporating the algorithm estimate each other's internal models from their opponent's observational data. For forgetting without learning, agent confidence levels remained low due to a lack of information on the counterpart, and Brownian walks occurred for a wide range of forgetting rates. Conversely, when learning was introduced, high confidence levels occasionally occurred even at high forgetting rates, and Brownian walks universally became Lévy walks through a mixture of high- and low-confidence states.