LG AI Feb 18, 2022

Can Interpretable Reinforcement Learning Manage Prosperity Your Way?

arXiv:2202.09064v2 6.97 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for transparent and personalized financial advice for customers and regulators, though it is incremental as it adapts an existing reinforcement learning algorithm to a new domain.

The paper tackles the challenge of providing interpretable investment advice in financial decision-making. It trains inherently interpretable reinforcement learning agents aligned with prototype financial personality traits; the resulting agents adhere to their intended characteristics, learn the concepts of compound growth and risk, and show improved policy convergence.

Personalisation of products and services is fast becoming the driver of success in banking and commerce. Machine learning holds the promise of gaining a deeper understanding of and tailoring to customers' needs and preferences. Whereas traditional solutions to financial decision problems frequently rely on model assumptions, reinforcement learning is able to exploit large amounts of data to improve customer modelling and decision-making in complex financial environments with fewer assumptions. Model explainability and interpretability present challenges from a regulatory perspective, which demands transparency for acceptance; they also offer the opportunity for improved insight into and understanding of customers. Post-hoc approaches are typically used for explaining pretrained reinforcement learning models. Based on our previous modelling of customer spending behaviour, we adapt our recent reinforcement learning algorithm, which intrinsically characterizes desirable behaviours, and we transition to the problem of asset management. We train inherently interpretable reinforcement learning agents to give investment advice that is aligned with prototype financial personality traits, and the agents' advice is combined to make a final recommendation. We observe that the trained agents' advice adheres to their intended characteristics, that the agents learn the value of compound growth and, without any explicit reference, the notion of risk, and that policy convergence improves.
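
The abstract says that advice from agents aligned with prototype personality traits is combined into a final recommendation, but it does not say how. As a rough illustration only, the sketch below assumes the combination is a weighted average of per-prototype portfolio allocations, with weights reflecting how strongly a customer matches each prototype. The function name `combine_advice`, the three prototypes, the asset classes, and all numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def combine_advice(trait_weights, prototype_allocations):
    """Blend per-prototype allocations into a single recommendation.

    ASSUMPTION: a convex combination is used here for illustration;
    the paper's actual combination rule is not given in the abstract.

    trait_weights: shape (n_prototypes,), non-negative affinities of one
        customer to each prototype (normalised inside the function).
    prototype_allocations: shape (n_prototypes, n_assets), one row per
        prototype agent, each row a portfolio allocation summing to 1.
    """
    weights = np.asarray(trait_weights, dtype=float)
    allocations = np.asarray(prototype_allocations, dtype=float)

    if weights.ndim != 1 or allocations.ndim != 2:
        raise ValueError("expected a weight vector and an allocation matrix")
    if allocations.shape[0] != weights.shape[0]:
        raise ValueError("need exactly one allocation row per prototype")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")

    # Normalise so the blended allocation still sums to 1.
    weights = weights / weights.sum()
    return weights @ allocations


# Hypothetical example: three prototype agents advising on
# (stocks, property, savings).
allocations = [
    [0.7, 0.2, 0.1],  # hypothetical risk-seeking prototype
    [0.2, 0.6, 0.2],  # hypothetical balanced prototype
    [0.1, 0.1, 0.8],  # hypothetical cautious prototype
]
customer_affinity = [0.6, 0.3, 0.1]
recommendation = combine_advice(customer_affinity, allocations)
```

Under this assumption the final recommendation stays traceable to its parts: each prototype's contribution is simply its weight times its allocation, which is one way a combined recommendation can remain interpretable.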
