Identifying social isolation themes in NVDRS text narratives using topic modeling and text-classification methods
This work addresses the need for better surveillance of social isolation in US suicide prevention. Its contribution is incremental, since it applies existing NLP methods to a new dataset.
The researchers identified social isolation themes in NVDRS suicide narratives by developing classifiers with an average F1 score of 0.86 and accuracy of 0.82. Across more than 300,000 suicides from 2002 to 2020, they found 1,198 that mentioned chronic social isolation, with higher odds for men, gay individuals, and divorced persons.
Social isolation and loneliness, which have been increasing in recent years, contribute strongly to suicide rates. Although social isolation and loneliness are not currently recorded within the structured variables of the US National Violent Death Reporting System (NVDRS), natural language processing (NLP) techniques can be used to identify these constructs in law enforcement and coroner/medical examiner narratives. Using topic modeling for lexicon development, together with supervised learning, we developed high-quality classifiers (average F1: .86, accuracy: .82). Evaluating over 300,000 suicides from 2002 to 2020, we identified 1,198 mentioning chronic social isolation. Decedents had higher odds of chronic social isolation classification if they were men (OR = 1.44; CI: 1.24, 1.69, p<.0001), gay (OR = 3.68; CI: 1.97, 6.33, p<.0001), or divorced (OR = 3.34; CI: 2.68, 4.19, p<.0001). We also found significant predictors for other social isolation topics: recent or impending divorce, child custody loss, eviction or recent move, and break-up. Our methods can improve surveillance and prevention of social isolation and loneliness in the United States.
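
To make the reported odds ratios easier to read, the following is a minimal sketch of how an unadjusted odds ratio and a Wald confidence interval can be computed from a 2x2 table. It is an illustration only: the abstract does not state how the ORs were estimated (they may come from an adjusted regression model), the function name `odds_ratio_wald_ci` is hypothetical, and the counts in the example are invented, not NVDRS data.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: group of interest, classified as chronic social isolation
    b: group of interest, not classified
    c: comparison group, classified
    d: comparison group, not classified
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
or_, lo, hi = odds_ratio_wald_ci(a=30, b=970, c=20, d=1980)
print(f"OR = {or_:.2f}; CI: {lo:.2f}, {hi:.2f}")
```

An interval that excludes 1 corresponds to the kind of statistically significant association reported above. With rare outcomes such as 1,198 cases in over 300,000 records, small cell counts widen the interval considerably.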