Zero-Shot Text Matching for Automated Auditing using Sentence Transformers
The work addresses the scarcity of annotated data in industrial auditing settings, though the contribution is incremental: it applies an existing method to a new domain.
The paper tackles automated auditing by applying Sentence-BERT to zero-shot text matching of financial passages, and reports that the model is robust to both in-domain and out-of-domain data.
Natural language processing methods have several applications in automated auditing, including document or passage classification, information retrieval, and question answering. However, training such models requires a large amount of annotated data, which is scarce in industrial settings. At the same time, techniques like zero-shot and unsupervised learning allow models pre-trained on general-domain data to be applied to unseen domains. In this work, we study the efficiency of unsupervised text matching using Sentence-BERT, a transformer-based model, by applying it to the semantic similarity of financial passages. Experimental results show that this model is robust to documents from both in-domain and out-of-domain data.
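To make the matching step concrete, the following is a minimal sketch of zero-shot text matching by cosine similarity over sentence embeddings. It is an illustration under stated assumptions, not the pipeline used in the paper: the function names (`match_passages`, `make_toy_encoder`), the example passages, and the vocabulary are all hypothetical, and a toy bag-of-words encoder stands in for Sentence-BERT so that the sketch runs without downloading a model. No labeled training data is used at any point, which is what makes the setup zero-shot.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Encoder = Callable[[List[str]], Sequence[Vector]]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_passages(
    queries: List[str], candidates: List[str], encode: Encoder
) -> List[Tuple[int, float]]:
    """For each query, return (index of best candidate, similarity score)."""
    query_vecs = encode(queries)
    cand_vecs = encode(candidates)
    results = []
    for q in query_vecs:
        scores = [cosine(q, c) for c in cand_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results


def make_toy_encoder(vocabulary: List[str]) -> Encoder:
    """Hypothetical stand-in encoder: word counts over a fixed vocabulary.

    This is NOT Sentence-BERT; it only makes the sketch runnable offline.
    """
    def encode(texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            counts = Counter(text.lower().split())
            vectors.append([float(counts[word]) for word in vocabulary])
        return vectors
    return encode


# Illustrative data (invented for this sketch, not taken from the paper).
requirements = [
    "the report must disclose revenue by segment",
    "the entity must describe credit risk exposure",
]
report_passages = [
    "Revenue by segment is presented in note 4",
    "Exposure to credit risk is discussed in the risk report",
    "The board met six times during the year",
]
toy_encode = make_toy_encoder(
    ["revenue", "segment", "credit", "risk", "exposure", "board"]
)

for requirement, (idx, score) in zip(
    requirements, match_passages(requirements, report_passages, toy_encode)
):
    print(f"{requirement!r} -> {report_passages[idx]!r} (score={score:.2f})")
```

With the `sentence-transformers` package installed, the toy encoder could be swapped for a pre-trained model by passing `SentenceTransformer("all-MiniLM-L6-v2").encode` as the `encode` argument. The checkpoint name here is an assumption chosen for illustration; the abstract does not state which Sentence-BERT checkpoint was used.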