CL, Mar 15, 2024

Identifying Health Risks from Family History: A Survey of Natural Language Processing Techniques

Xiang Dai, Sarvnaz Karimi, Nathan O'Callaghan

arXiv:2403.09997v11.03 citations, h-index: 14

Originality: Synthesis-oriented

AI Analysis

This is an incremental survey: it summarizes existing NLP methods for extracting family history information, with the aim of helping healthcare professionals apply precision health.

The paper surveys NLP techniques for extracting family history information from electronic health records to identify hereditary disease risks, highlighting that rule-based methods are widely used and neural models based on pre-trained language models are emerging.

Electronic health records include information on patients' status and medical history, which can cover diseases and disorders that may be hereditary. One important use of family history information is in precision health, where the goal is to keep the population healthy with preventative measures. Natural Language Processing (NLP) and machine learning techniques can help health professionals identify health risks before a condition develops in a patient's later years, saving lives and reducing healthcare costs. We survey the literature on the techniques from the NLP field that have been developed to utilise digital health records to identify risks of familial diseases. We highlight that rule-based methods are heavily investigated and are still actively used for family history extraction. Still, more recent efforts have been put into building neural models based on large-scale pre-trained language models. In addition to the areas where NLP has successfully been utilised, we also identify the areas where more research is needed to unlock the value of patients' records regarding data collection, task formulation and downstream applications.
