Quantum POMDPs
This work addresses the theoretical complexity of decision-making in quantum systems for researchers in quantum computing and AI, establishing foundational differences from classical models.
The authors introduce quantum observable Markov decision processes (QOMDPs) as the quantum analogues of POMDPs. They show that deciding whether a policy of at least a given value exists has the same complexity for both models in the polynomial and infinite horizon cases, and they prove that the existence of a policy that reaches a goal state is decidable for goal POMDPs but undecidable for goal QOMDPs.
We present quantum observable Markov decision processes (QOMDPs), the quantum analogues of partially observable Markov decision processes (POMDPs). In a QOMDP, an agent's state is represented as a quantum state and the agent can choose a superoperator to apply. This is similar to the POMDP belief state, which is a probability distribution over world states and evolves via a stochastic matrix. We show that the existence of a policy of at least a certain value has the same complexity for QOMDPs and POMDPs in the polynomial and infinite horizon cases. However, we also prove that the existence of a policy that can reach a goal state is decidable for goal POMDPs and undecidable for goal QOMDPs.
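The parallel between the two state updates can be made concrete with a small sketch. It is illustrative only and is not taken from the paper: the function names (`belief_step`, `superoperator_step`), the column-stochastic convention for the transition matrix, the Kraus-operator form of the superoperator, and all numerical values are assumptions chosen for the example. The classical belief is advanced as b' = T b, and the quantum state, written as a density matrix, is advanced as rho' = sum_i K_i rho K_i^dagger.

```python
# Illustrative sketch: all names and numbers here are hypothetical.
# Pure standard-library Python; matrices are lists of rows.

def mat_mul(a, b):
    """Matrix product of a (n x m) and b (m x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def belief_step(T, b):
    """Classical update b' = T b, assuming T is column-stochastic."""
    return [sum(T[i][j] * b[j] for j in range(len(b))) for i in range(len(T))]

def superoperator_step(kraus, rho):
    """Quantum update rho' = sum_i K_i rho K_i^dagger (Kraus form, assumed)."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for K in kraus:
        term = mat_mul(mat_mul(K, rho), dagger(K))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

# Classical belief: certain of world state 0, then one stochastic step.
T = [[0.9, 0.2],
     [0.1, 0.8]]          # each column sums to 1
b_next = belief_step(T, [1.0, 0.0])

# Quantum state: |1><1| passed through an amplitude-damping channel
# with damping probability 0.36 (so sqrt(0.36) = 0.6, sqrt(0.64) = 0.8).
K0 = [[1.0, 0.0], [0.0, 0.8]]
K1 = [[0.0, 0.6], [0.0, 0.0]]
rho = [[0.0, 0.0], [0.0, 1.0]]
rho_next = superoperator_step([K0, K1], rho)

print(b_next)    # still a probability distribution
print(rho_next)  # still a unit-trace density matrix
```

In both cases the update preserves normalization: the belief entries still sum to 1, and the density matrix still has trace 1. The difference is that the quantum state also carries off-diagonal (coherence) terms that have no counterpart in a probability vector.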