Machine learning models for diagnosis and risk prediction in eating disorders, depression, and alcohol use disorder

Published on Research Square (preprint)

Authors: Sylvane Desrivières, Zuo Zhang, Lauren Robinson, Robert Whelan, Lee Jollans, Zijian Wang, Frauke Nees, Congying Chu, Marina Bobou, Dongping Du, Ilinca Cristea, Tobias Banaschewski, Gareth Barker, Arun Bokde, Antoine Grigis, Hugh Garavan, Andreas Heinz, Rudiger Bruhl, Jean-Luc Martinot, Marie-Laure Paillère Martinot, Eric Artiges, Dimitri Papadopoulos Orfanos, Luise Poustka, Sarah Hohmann, Sabina Millenet, Juliane Fröhner, Michael Smolka, Nilakshi Vaidya, Henrik Walter, Jeanne Winterer, M. Broulidakis, Betteke van Noort, Argyris Stringaris, Jani Penttilä, Yvonne Grimmer, Corinna Insensee, Andreas Becker, Yuning Zhang, Sinead King, Julia Sinclair, Gunter Schumann, Ulrike Schmidt

Abstract This study uses machine learning models to identify diagnostic and risk prediction markers for eating disorders (EDs), major depressive disorder (MDD), and alcohol use disorder (AUD). Using case-control samples (ages 18-25 years) and a longitudinal population-based sample (n=1,851), models incorporating diverse data domains distinguished individuals with EDs, MDD, and AUD from healthy controls with high accuracy. The area under the receiver operating characteristic curve (AUC-ROC [95% CI]) reached 0.92 [0.86-0.97] for anorexia nervosa (AN) and 0.91 [0.85-0.96] for bulimia nervosa (BN), without relying on body mass index as a predictor. Classification performance for MDD (0.91 [0.88-0.94]) and AUD (0.80 [0.74-0.85]) was also high. Each data domain was an accurate classifier on its own; personality alone distinguished AN, BN, and their controls with AUC-ROCs ranging from 0.77 to 0.89. The models showed high transdiagnostic potential: those trained for EDs also accurately distinguished AUD and MDD from healthy controls, and vice versa (AUC-ROCs, 0.75-0.93). Shared predictors, such as neuroticism, hopelessness, and symptoms of attention-deficit/hyperactivity disorder, were identified as reliable classifiers. For risk prediction in the longitudinal population sample, the models showed moderate performance (AUC-ROCs, 0.64-0.71). Together, these results highlight the potential of combining multi-domain data for precise diagnostic and risk prediction applications in psychiatry.
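
The abstract reports each result as an AUC-ROC with a 95% confidence interval. As a reading aid, the sketch below shows one common way such a figure can be obtained: the AUC-ROC computed as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with a percentile bootstrap interval around it. The abstract does not say how the intervals were derived, so the bootstrap, the function names (auc_roc, bootstrap_ci), and the toy labels and scores are all assumptions made for illustration, not the authors' method or data.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC as P(score of a case > score of a control); ties count as 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC-ROC (an assumed, generic method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains only one class, so the AUC is undefined
        stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = case, 0 = healthy control; scores are model outputs.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

print(round(auc_roc(toy_labels, toy_scores), 3))  # 8 of 9 case-control pairs ranked correctly -> 0.889
print(bootstrap_ci(toy_labels, toy_scores))
```

With only six observations the bootstrap interval is very wide; it narrows as the sample grows, which is why intervals around a moderate AUC-ROC in a large population sample can still be informative.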