Tsetlin Machine for Solving Contextual Bandit Problems
The paper provides an interpretable method for contextual bandit problems. The contribution is incremental, as it adapts an existing Tsetlin Machine approach to a new application area.
The paper tackles contextual bandit learning by introducing an interpretable algorithm based on Tsetlin Machines, which outperforms other popular base learners on eight out of nine datasets.
This paper introduces an interpretable contextual bandit algorithm using Tsetlin Machines, which solve complex pattern recognition tasks using propositional logic. The proposed bandit learning algorithm relies on straightforward bit manipulation, thus simplifying computation and interpretation. We then present a mechanism for performing Thompson sampling with the Tsetlin Machine, given its non-parametric nature. Our empirical analysis shows that the Tsetlin Machine, as a base contextual bandit learner, outperforms other popular base learners on eight out of nine datasets. We further analyze the interpretability of our learner, investigating how arms are selected based on propositional expressions that model the context.
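To make the idea of selecting arms from propositional expressions via bit manipulation concrete, the following is a minimal sketch and not the paper's implementation. It assumes a binarized context packed into an integer, one set of conjunctive clauses per arm, and greedy selection by summed clause votes. The names `Clause`, `arm_score`, `select_arm`, and `ARMS` are hypothetical, the clauses are fixed by hand rather than learned, and no Thompson sampling is performed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    """A conjunction of literals over a bit-packed context (illustrative only)."""
    include_pos: int  # bitmask of features that must be 1
    include_neg: int  # bitmask of features that must be 0
    polarity: int     # +1 votes for the arm, -1 votes against it

    def fires(self, context: int) -> bool:
        # True when every required-1 bit is set and no required-0 bit is set.
        pos_ok = (context & self.include_pos) == self.include_pos
        neg_ok = (context & self.include_neg) == 0
        return pos_ok and neg_ok


def arm_score(clauses: List[Clause], context: int) -> int:
    # Sum the votes of all clauses that fire on this context.
    return sum(c.polarity for c in clauses if c.fires(context))


def select_arm(arms: List[List[Clause]], context: int) -> int:
    # Greedy choice: the arm with the highest vote sum (ties go to the lowest index).
    scores = [arm_score(clauses, context) for clauses in arms]
    return max(range(len(arms)), key=scores.__getitem__)


# Hand-written clauses over three features: bit 0 = x0, bit 1 = x1, bit 2 = x2.
ARMS = [
    [Clause(include_pos=0b001, include_neg=0b010, polarity=+1)],  # arm 0: x0 AND NOT x1
    [
        Clause(include_pos=0b010, include_neg=0, polarity=+1),    # arm 1: x1 votes for
        Clause(include_pos=0b100, include_neg=0, polarity=-1),    # arm 1: x2 votes against
    ],
]

if __name__ == "__main__":
    for ctx in (0b001, 0b010):
        print(f"context={ctx:03b} -> arm {select_arm(ARMS, ctx)}")
```

Because each clause is an explicit conjunction of literals, the reason an arm was chosen can be read directly from the clauses that fired, which is the sense of interpretability the abstract describes.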