TR ML · Dec 4, 2017

Inferring agent objectives at different scales of a complex adaptive system

Dieter Hendricks, Adam Cobb, Richard Everett, Jonathan Downing, Stephen J. Roberts

arXiv:1712.01137v15.14 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of modeling hierarchical agent interactions in financial markets, which could improve learning algorithms in this domain, though it appears incremental in applying existing methods to a specific problem.

The authors study agent objectives at different time scales of financial market microstructure. They apply Inverse Reinforcement Learning to estimated state-action trajectories to compute a reward function for each scale; differences between these reward functions could indicate distinct objectives and help identify the scale boundaries between agent classes.

We introduce a framework to study the effective objectives at different time scales of financial market microstructure. The financial market can be regarded as a complex adaptive system, where purposeful agents collectively and simultaneously create and perceive their environment as they interact with it. It has been suggested that multiple agent classes operate in this system, with a non-trivial hierarchy of top-down and bottom-up causation classes with different effective models governing each level. We conjecture that agent classes may in fact operate at different time scales and thus act differently in response to the same perceived market state. Given scale-specific temporal state trajectories and action sequences estimated from aggregate market behaviour, we use Inverse Reinforcement Learning to compute the effective reward function for the aggregate agent class at each scale, allowing us to assess the relative attractiveness of feature vectors across different scales. Differences in reward functions for feature vectors may indicate different objectives of market participants, which could assist in finding the scale boundary for agent classes. This has implications for learning algorithms operating in this domain.
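To make the per-scale reward comparison concrete, here is a minimal sketch of tabular maximum-entropy Inverse Reinforcement Learning applied to two sets of trajectories. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which IRL variant is used, and the discrete states, the one-reward-per-state parameterisation, the toy dynamics, and the names `estimate_transitions`, `maxent_irl`, and `synthetic_trajs` are all hypothetical. The trajectories are synthetic stand-ins for the scale-specific state and action sequences described above.

```python
import numpy as np

def estimate_transitions(trajs, n_states, n_actions):
    """Count-based estimate of P[a, s, s'] with add-one smoothing."""
    counts = np.ones((n_actions, n_states, n_states))
    for traj in trajs:
        for (s, a), (s_next, _) in zip(traj[:-1], traj[1:]):
            counts[a, s, s_next] += 1
    return counts / counts.sum(axis=2, keepdims=True)

def maxent_irl(trajs, n_states, n_actions, n_iters=200, lr=0.1):
    """Return one reward value per discrete state (tabular MaxEnt IRL)."""
    horizon = len(trajs[0])  # assumption: all trajectories have equal length
    P = estimate_transitions(trajs, n_states, n_actions)

    # Empirical state visitation counts and initial-state distribution.
    mu_demo = np.zeros(n_states)
    p0 = np.zeros(n_states)
    for traj in trajs:
        p0[traj[0][0]] += 1
        for s, _ in traj:
            mu_demo[s] += 1
    mu_demo /= len(trajs)
    p0 /= len(trajs)

    reward = np.zeros(n_states)
    for _ in range(n_iters):
        # Backward pass: finite-horizon soft value iteration.
        V = np.zeros(n_states)
        policies = []
        for _ in range(horizon):
            Q = reward[:, None] + np.einsum("asn,n->sa", P, V)
            q_max = Q.max(axis=1, keepdims=True)
            V = (q_max + np.log(np.exp(Q - q_max).sum(axis=1, keepdims=True)))[:, 0]
            policies.append(np.exp(Q - V[:, None]))  # softmax over actions
        policies.reverse()  # policies[t] is now the policy at time t

        # Forward pass: expected state visitation under the soft policy.
        d = p0.copy()
        mu = np.zeros(n_states)
        for pi in policies:
            mu += d
            d = np.einsum("s,sa,asn->n", d, pi, P)

        # Gradient of the MaxEnt log-likelihood for one-hot state features.
        reward += lr * (mu_demo - mu)
    return reward

def synthetic_trajs(preferred, n_states=3, n_trajs=50, length=10, seed=0):
    """Toy data: action a moves to state a; the agent always picks `preferred`."""
    rng = np.random.default_rng(seed)
    trajs = []
    for _ in range(n_trajs):
        s = int(rng.integers(n_states))
        traj = []
        for _ in range(length):
            traj.append((s, preferred))
            s = preferred
        trajs.append(traj)
    return trajs

# Hypothetical "fine" and "coarse" scales with different preferred states.
rewards = {
    "fine": maxent_irl(synthetic_trajs(preferred=2), 3, 3),
    "coarse": maxent_irl(synthetic_trajs(preferred=0, seed=1), 3, 3),
}
gap = rewards["fine"] - rewards["coarse"]  # per-state difference across scales
```

In this toy setting the recovered reward peaks at a different state for each scale, and `gap` shows which states are relatively more attractive at one scale than the other. With real market data, the states would be discretised feature vectors and the trajectories would come from whatever aggregation defines each time scale.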
