Harry Potter and the Action Prediction Challenge from Natural Language
This work addresses action prediction from text, a task relevant to natural language processing applications, but its contribution is incremental: it applies existing methods to a new dataset.
The paper tackles the problem of predicting actions from textual scene descriptions, using Harry Potter spells as a case study, and reports that an LSTM-based model performs best for frequent actions with large descriptions, while logistic regression works well for infrequent actions.
We explore the challenge of action prediction from textual descriptions of scenes, a testbed to approximate whether text inference can be used to predict upcoming actions. As a case study, we consider the world of the Harry Potter fantasy novels and the task of inferring which spell will be cast next given a fragment of a story. Spells act as keywords that abstract actions (e.g. 'Alohomora' to open a door) and denote a response to the environment. This idea is used to automatically build HPAC, a corpus containing 82,836 samples and 85 actions. We then evaluate different baselines. Among the tested models, an LSTM-based approach obtains the best performance for frequent actions and long scene descriptions, while approaches such as logistic regression perform well on infrequent actions.
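The corpus-building idea described above (treat each spell mention as an action label and the preceding text as the scene) can be illustrated with a minimal sketch. This is not the authors' pipeline: the spell inventory `SPELLS`, the function `build_samples`, the regex tokenizer and the fixed `window` size are all assumptions made for illustration.

```python
import re

# Hypothetical spell inventory for illustration; the actual corpus covers 85 actions.
SPELLS = {"alohomora", "lumos", "expelliarmus"}

def build_samples(text, window=8):
    """Pair each spell mention with the `window` tokens that precede it.

    Returns a list of (scene_tokens, spell) tuples. The tokenizer and the
    fixed-size window are simplifying assumptions, not the paper's method.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    samples = []
    for i, tok in enumerate(tokens):
        if tok in SPELLS:
            scene = tokens[max(0, i - window):i]
            if scene:  # skip spells with no preceding context
                samples.append((scene, tok))
    return samples

story = ("The door was locked and Harry raised his wand. Alohomora! "
         "The corridor beyond was dark, so Hermione whispered Lumos.")
samples = build_samples(story, window=5)
for scene, spell in samples:
    print(" ".join(scene), "->", spell)
```

Each resulting pair could then be fed to a classifier such as logistic regression or an LSTM. The sketch only matches single-token spells; multi-word spells would need phrase matching.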