Corporate Bankruptcy Prediction with Domain-Adapted BERT
This work addresses bankruptcy prediction for corporate stakeholders by improving input data quality, though it is incremental as it adapts an existing method to a specific domain.
The study tackled corporate bankruptcy prediction by using BERT for sentiment analysis on MD&A disclosures, achieving an accuracy rate of 91.56% and outperforming dictionary-based and Word2Vec-based methods in adjusted R-square across logistic regression, kNN-5, and SVM.
This study applies BERT, a representative contextualized language model, to corporate disclosure data to predict impending bankruptcies. Prior literature on bankruptcy prediction mainly focuses on developing more sophisticated prediction methodologies based on financial variables. In contrast, we focus on improving the quality of the input data. Specifically, we employ a BERT model to perform sentiment analysis on Management's Discussion and Analysis (MD&A) disclosures. We show that BERT outperforms dictionary-based and Word2Vec-based predictions in terms of adjusted R-square across logistic regression, k-nearest neighbor (kNN-5), and linear-kernel support vector machine (SVM) models. Further, instead of pre-training the BERT model from scratch, we adapt it to corporate disclosure data (10-K) through self-learning with confidence-based filtering. We achieve an accuracy rate of 91.56% and demonstrate that the domain adaptation procedure brings a significant improvement in prediction accuracy.
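To make the confidence-based filtering step concrete, the following is a minimal sketch of how pseudo-labels might be selected during self-learning. It is not the study's implementation: the function name `filter_confident`, the 0.9 threshold, and the example probabilities are all hypothetical, and the abstract does not state which threshold or selection rule was used.

```python
from typing import List, Sequence, Tuple


def filter_confident(
    probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed value; the source does not specify one
) -> List[Tuple[int, int, float]]:
    """Return (sample_index, pseudo_label, confidence) for confident predictions.

    `probs` holds one row of class probabilities per unlabeled sample,
    e.g. the softmax output of a sentiment classifier.
    """
    kept = []
    for i, row in enumerate(probs):
        # The pseudo-label is the class with the highest predicted probability.
        label = max(range(len(row)), key=lambda k: row[k])
        confidence = row[label]
        # Keep the sample only if the model is sufficiently sure of its label.
        if confidence >= threshold:
            kept.append((i, label, confidence))
    return kept


# Hypothetical (negative, positive) probabilities for four unlabeled 10-K sentences.
probs = [[0.97, 0.03], [0.55, 0.45], [0.08, 0.92], [0.40, 0.60]]
pseudo_labeled = filter_confident(probs, threshold=0.9)
print(pseudo_labeled)
```

In a generic self-training loop, the retained samples would be added to the training set with their pseudo-labels, the model would be fine-tuned again, and the process would repeat. Low-confidence samples (the second and fourth above) are left out so that noisy labels do not accumulate.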