LG · Jul 10, 2024

CM-DQN: A Value-Based Deep Reinforcement Learning Model to Simulate Confirmation Bias

arXiv:2407.07454v3 · 2.6 · h-index: 4 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of modeling human cognitive biases in AI decision-making, but it is incremental as it adapts existing DQN methods to incorporate bias simulation.

The study simulates human confirmation bias in decision-making with CM-DQN, a deep reinforcement learning algorithm that applies different update strategies to positive and negative prediction errors. The authors report that confirmatory bias led to better learning effects in both the Lunar Lander and multi-armed bandit environments.

In human decision-making tasks, individuals learn through trials and prediction errors. While learning a task, some individuals are more influenced by good outcomes, while others weigh bad outcomes more heavily. Such confirmation bias can lead to different learning effects. In this study, we propose CM-DQN, a new deep reinforcement learning algorithm that applies different update strategies to positive and negative prediction errors, to simulate the human decision-making process when the task's states are continuous and its actions are discrete. We test it in the Lunar Lander environment under confirmatory bias, disconfirmatory bias, and no bias to observe the learning effects. As a contrast experiment, we also apply the confirmation model, which uses the same idea as our proposed algorithm, to a multi-armed bandit problem (an environment with discrete states and discrete actions) to algorithmically simulate the impact of different confirmation biases on the decision-making process. In both experiments, confirmatory bias shows a better learning effect.
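The idea of updating differently on positive and negative prediction errors can be illustrated on the bandit case with a minimal Python sketch. This is not the authors' implementation: the function names, the epsilon-greedy policy, the initial value of 0.5, and the specific learning rates are all assumptions made for illustration, and the sketch assumes that "confirmatory" means a larger learning rate for positive prediction errors on the chosen arm.

```python
import random


def asymmetric_update(q, reward, alpha_pos, alpha_neg):
    """Move the value estimate q toward the reward.

    The learning rate depends on the sign of the prediction error:
    alpha_pos for positive errors, alpha_neg otherwise.
    (Hypothetical helper, not taken from the paper's code.)
    """
    delta = reward - q  # prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


def run_bandit(probs, alpha_pos, alpha_neg, steps=2000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli multi-armed bandit.

    probs holds the reward probability of each arm. Returns the final
    value estimates and the average reward per step.
    """
    rng = random.Random(seed)
    q = [0.5] * len(probs)  # assumed neutral initial estimates
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(probs))  # explore
        else:
            action = max(range(len(probs)), key=lambda i: q[i])  # exploit
        reward = 1.0 if rng.random() < probs[action] else 0.0
        q[action] = asymmetric_update(q[action], reward, alpha_pos, alpha_neg)
        total_reward += reward
    return q, total_reward / steps


if __name__ == "__main__":
    arms = [0.3, 0.7]
    # Assumed settings: (alpha_pos, alpha_neg) for each bias condition.
    settings = {
        "confirmatory": (0.2, 0.05),
        "disconfirmatory": (0.05, 0.2),
        "unbiased": (0.1, 0.1),
    }
    for name, (a_pos, a_neg) in settings.items():
        q_values, avg_reward = run_bandit(arms, a_pos, a_neg)
        print(name, [round(v, 3) for v in q_values], round(avg_reward, 3))
```

Running the script prints the final value estimates and average reward for each condition. A single seed on a two-armed bandit is only a toy comparison and should not be read as reproducing the paper's results.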
