GPT has become financially literate: Insights from financial literacy tests of GPT and a preliminary test of how people use it as a source of advice
The study addresses the problem of providing accessible financial advice to the general public, though its assessment of model capabilities is incremental.
The study evaluated GPT models as financial robo-advisors using a financial literacy test. The GPT-3.5-based models, Davinci and ChatGPT, scored 66% and 65% respectively, and ChatGPT based on GPT-4 achieved 99%, compared to a 33% baseline, indicating that financial literacy is emerging as an ability of state-of-the-art models.
We assess the ability of GPT -- a large language model -- to serve as a financial robo-advisor for the masses, by using a financial literacy test. Davinci and ChatGPT based on GPT-3.5 score 66% and 65% on the financial literacy test, respectively, compared to a baseline of 33%. However, ChatGPT based on GPT-4 achieves a near-perfect 99% score, pointing to financial literacy becoming an emergent ability of state-of-the-art models. We use the Judge-Advisor System and a savings dilemma to illustrate how researchers might assess advice-utilization from large language models. We also present a number of directions for future research.
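As a rough illustration of how advice-utilization might be quantified in a Judge-Advisor System study, the sketch below computes the weight of advice, a measure commonly used in that paradigm: the fraction of the distance from the judge's initial estimate to the advice that the judge actually moves. The abstract does not state which measure the study uses, so the choice of this measure, the function name `weight_of_advice`, and the savings figures are all assumptions made for illustration.

```python
from typing import Optional


def weight_of_advice(initial: float, advice: float, final: float) -> Optional[float]:
    """Return (final - initial) / (advice - initial).

    0.0 means the advice was ignored, 1.0 means it was adopted in full.
    Returns None when the advice equals the initial estimate, because
    the ratio is undefined in that case.
    """
    if advice == initial:
        return None
    return (final - initial) / (advice - initial)


# Hypothetical savings dilemma (illustrative numbers, not study data):
# a participant first plans to save 100 per month, the model advises 200,
# and the participant settles on 150.
woa = weight_of_advice(initial=100.0, advice=200.0, final=150.0)
print(woa)  # 0.5
```

A value of 0.5 here would mean the participant moved halfway toward the model's recommendation. Values below 0 or above 1 are possible when a judge moves away from the advice or overshoots it, and studies differ in how they treat such cases.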