Claas Flint

NC, Dec 13, 2019

Systematic Misestimation of Machine Learning Performance in Neuroimaging Studies of Depression

Claas Flint, Micah Cearns, Nils Opel et al.

We currently observe a disconcerting phenomenon in machine learning studies in psychiatry: while we would expect larger samples to yield better results because more data are available, larger machine learning studies consistently show much weaker performance than the numerous small-scale studies. Here, we systematically investigated this effect, focusing on one of the most heavily studied questions in the field, namely the classification of patients suffering from major depressive disorder (MDD) versus healthy controls (HC) based on neuroimaging data. Drawing upon structural magnetic resonance imaging (MRI) data from a balanced sample of $N = 1,868$ MDD patients and HC from our recent international Predictive Analytics Competition (PAC), we first trained and tested a classification model on the full dataset, which yielded an accuracy of $61\,\%$. Next, we mimicked the process by which researchers would draw samples of various sizes ($N = 4$ to $N = 150$) from the population and showed a strong risk of misestimation. Specifically, for small sample sizes ($N = 20$), we observed accuracies of up to $95\,\%$. For medium sample sizes ($N = 100$), accuracies of up to $75\,\%$ were found. Importantly, further investigation showed that sufficiently large test sets effectively protect against performance misestimation, whereas larger datasets per se do not. While these results question the validity of a substantial part of the current literature, we outline the relatively low-cost remedy of larger test sets, which is readily available in most cases.
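The role of test-set size described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a classifier whose true accuracy is fixed at 61 % and treats every test prediction as an independent coin flip, so only the sampling noise of the test set is modelled. The function names, the number of simulated studies, and the seed are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_test_accuracies(true_accuracy, test_size, n_studies, seed=0):
    """Simulate the accuracy observed in many independent studies.

    Assumption: each test prediction is correct with probability
    `true_accuracy`, independently of all others (a binomial model).
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_studies):
        # Count correct predictions on one simulated test set.
        correct = sum(rng.random() < true_accuracy for _ in range(test_size))
        accuracies.append(correct / test_size)
    return accuracies


def summarize(accuracies):
    """Return mean, standard deviation, and maximum of observed accuracies."""
    return {
        "mean": statistics.mean(accuracies),
        "sd": statistics.pstdev(accuracies),
        "max": max(accuracies),
    }


if __name__ == "__main__":
    # Hypothetical settings: 2000 simulated studies per test-set size.
    for test_size in (20, 100, 1000):
        stats = summarize(simulate_test_accuracies(0.61, test_size, 2000))
        print(
            f"test size {test_size:5d}: "
            f"mean={stats['mean']:.3f} sd={stats['sd']:.3f} max={stats['max']:.3f}"
        )
```

Under these assumptions the mean observed accuracy stays close to the true value for every test-set size, while the spread and the best-case result shrink as the test set grows. This is one way to see why a few small studies can report accuracies far above what a large test set would support.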

NC, Nov 24, 2019

Biological sex classification with structural MRI data shows increased misclassification in transgender women

Claas Flint, Katharina Förster, Sophie A. Koser et al.

Transgender individuals (TIs) show brain structural alterations that differ from their biological sex as well as their perceived gender. To substantiate evidence that the brain structure of TIs differs from that of both males and females, we use a combined multivariate and univariate approach. Gray matter segments resulting from voxel-based morphometry preprocessing of $N = 1753$ cisgender (CG) healthy participants were used to train ($N = 1402$) and validate (20 % hold-out; $N = 351$) a support-vector machine classifying biological sex. As a second validation, we classified $N = 1104$ patients with depression. A third validation was performed using the matched CG sample of the transgender women (TWs) application-sample. Subsequently, the classifier was applied to $N = 26$ TWs. Finally, we compared brain volumes of CG-men, CG-women, and TWs pre/post cross-sex hormone treatment in a univariate analysis controlling for sexual orientation, age, and total brain volume. The application of our biological sex classifier to the transgender sample resulted in a significantly lower true positive rate (TPR) (TPR-male = 56.0 %). The TPR did not differ between CG-individuals with (TPR-male = 86.9 %) and without depression (TPR-male = 88.5 %). The univariate analysis of the transgender application-sample revealed that TWs pre/post treatment show brain structural differences from CG-women and CG-men in the putamen and insula, as well as in the whole-brain analysis. Our results support the hypothesis that brain structure in TWs differs from the brain structure of their biological sex (male) as well as their perceived gender (female). This finding substantiates evidence that TIs show specific brain structural alterations leading to a different pattern of brain structure than CG-individuals.
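The two quantities this abstract relies on, the hold-out split sizes and the true positive rate for the male class, can be written out as a short sketch. It assumes the hold-out size is obtained by rounding 20 % of the sample to the nearest integer, which reproduces the reported 1402/351 split but is not stated in the abstract. The function names and the toy labels are hypothetical and do not come from the study.

```python
def holdout_sizes(n_total, holdout_fraction):
    """Split a sample size into (train, hold-out) counts.

    Assumption: the hold-out count is rounded to the nearest integer.
    """
    n_holdout = round(n_total * holdout_fraction)
    return n_total - n_holdout, n_holdout


def true_positive_rate(y_true, y_pred, positive_label):
    """Fraction of samples with the positive label that were predicted as such."""
    predictions_for_positives = [
        pred for true, pred in zip(y_true, y_pred) if true == positive_label
    ]
    if not predictions_for_positives:
        raise ValueError("no samples with the positive label")
    hits = sum(pred == positive_label for pred in predictions_for_positives)
    return hits / len(predictions_for_positives)


if __name__ == "__main__":
    print(holdout_sizes(1753, 0.2))

    # Toy labels, purely illustrative: four male and two female participants.
    y_true = ["male", "male", "male", "male", "female", "female"]
    y_pred = ["male", "male", "male", "female", "female", "male"]
    print(true_positive_rate(y_true, y_pred, positive_label="male"))
```

With "male" as the positive class, TPR-male is the share of biologically male participants whom the classifier labels as male, which is the quantity compared across the cisgender and transgender samples.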
