ML LG

Oct 9, 2020

Discussion of Kallus (2020) and Mo, Qi, and Liu (2020): New Objectives for Policy Learning

arXiv:2010.04805v12.71 citations

Originality: Synthesis-oriented

AI Analysis

This work discusses methodological improvements in policy learning that are relevant to researchers and practitioners. It appears incremental, since it builds on existing frameworks.

The paper discusses new objective functions for policy learning, highlighting the importance of accounting for the curvature of the value function in retargeting frameworks and introducing two methods to do so, along with more efficient approaches for using calibration data in distributionally robust policy learning.

We discuss the thought-provoking new objective functions for policy learning that were proposed in "More efficient policy learning via optimal retargeting" by Nathan Kallus and "Learning optimal distributionally robust individualized treatment rules" by Weibin Mo, Zhengling Qi, and Yufeng Liu. We show that it is important to take the curvature of the value function into account when working within the retargeting framework, and we introduce two ways to do so. We also describe more efficient approaches for leveraging calibration data when learning distributionally robust policies.
