Discovering material information using hierarchical Reformer model on financial regulatory filings
This work addresses the need to better understand financial markets through automated analysis of regulatory filings, which benefits investors and regulators; it is incremental, however, in that it adapts existing NLP methods to a new domain.
The authors tackle the problem of extracting material information from financial regulatory filings by constructing a hierarchical Reformer model. The model successfully predicts trade volume changes and detects some indications of material information without being explicitly trained to do so, which suggests that it captures market-relevant signals.
Most applications of machine learning in finance are forecasting tasks that support investment decisions. Instead, we aim to promote a better understanding of financial markets with machine learning techniques. Leveraging the tremendous progress in deep learning models for natural language processing, we construct a hierarchical Reformer ([15]) model capable of processing SEDAR, a large document-level dataset of Canadian financial regulatory filings. Using this model, we show that it is possible to predict trade volume changes from regulatory filings. We adapt the pretraining task of HiBERT ([36]) to obtain good sentence-level representations from a large unlabelled document dataset. The finetuned model successfully predicts trade volume changes, which indicates that it captures a view from financial markets and that processing regulatory filings is beneficial. Analyzing the attention patterns of our model reveals that it is able to detect some indications of material information without explicit training, which is highly relevant for investors and also for the market surveillance mandate of financial regulators.
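To make the hierarchical idea concrete, the following toy sketch encodes a document in two levels: token vectors are first pooled into sentence vectors, and a document-level attention step then weights the sentences. It is an illustration under strong assumptions, not the model described above: mean pooling and a single softmax attention step stand in for the Reformer layers, and the names `encode_document` and `query`, as well as the 2-dimensional embeddings, are hypothetical.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length vectors into a single vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def encode_document(doc_tokens, query):
    """Two-level encoding of a document.

    doc_tokens: list of sentences, each a list of token vectors.
    query: hypothetical vector that scores how relevant a sentence is
           (in a trained model this would be learned, here it is fixed).
    Returns the document vector and the per-sentence attention weights.
    """
    # Level 1: one vector per sentence (stand-in for a sentence encoder).
    sentence_vecs = [mean_pool(sentence) for sentence in doc_tokens]
    # Level 2: attention over sentences (stand-in for a document encoder).
    weights = softmax([dot(query, s) for s in sentence_vecs])
    doc_vec = [
        sum(w * s[i] for w, s in zip(weights, sentence_vecs))
        for i in range(len(query))
    ]
    return doc_vec, weights


# Toy document with three sentences and made-up 2-d token embeddings.
doc = [
    [[0.1, 0.0], [0.0, 0.1]],
    [[1.0, 2.0], [3.0, 2.0]],
    [[0.2, 0.1]],
]
query = [1.0, 1.0]

doc_vec, weights = encode_document(doc, query)
# Index of the sentence that receives the largest attention weight.
top_sentence = max(range(len(weights)), key=weights.__getitem__)
```

Inspecting `weights` per sentence is the toy analogue of analyzing attention patterns: sentences with large weights are the ones the document-level step relies on most.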