Neural translation and automated recognition of ICD10 medical entities from natural language
The work addresses the time-consuming and expensive need for human expert intervention in medical entity recognition, though the contribution is incremental, since it applies existing deep learning methods to a new dataset.
The paper tackles the problem of recognizing medical entities from natural language, such as for coding medical acts, by applying deep neural sequence models to a large French database of death certificates, achieving efficient automated recognition.
The recognition of medical entities from natural language is a ubiquitous problem in the medical field, with applications ranging from medical act coding to the analysis of electronic health data for public health. It is, however, a complex task that usually requires human expert intervention, which makes it expensive and time-consuming. Recent advances in artificial intelligence, specifically the rise of deep learning methods, have enabled computers to make efficient decisions on a number of complex problems, a notable example being neural sequence models and their powerful applications in natural language processing. These models require a considerable amount of data to learn from, which is typically their main limiting factor. The CépiDc, however, stores an exhaustive database of death certificates at the French national scale, amounting to several million natural language examples, each provided with its associated human-coded medical entities and available to the machine learning practitioner. This article investigates the application of deep neural sequence models to the problem of medical entity recognition from natural language.