Neurons and Cognition

Automatic rating of incomplete hippocampal inversions evaluated across multiple cohorts

Published in: Journal of Machine Learning for Biomedical Imaging

Authors: Lisa Hemforth, Baptiste Couvy-Duchesne, Kevin de Matos, Camille Brianceau, Matthieu Joulot, Tobias Banaschewski, Arun L W Bokde, Sylvane Desrivières, Herta Flor, Antoine Grigis, Hugh Garavan, Penny Gowland, Andreas Heinz, Rüdiger Brühl, Jean-Luc Martinot, Marie-Laure Paillère Martinot, Eric Artiges, Dimitri Papadopoulos, Herve Lemaitre, Tomas Paus, Luise Poustka, Sarah Hohman, Nathalie Holz, Juliane H. Fröhner, Michael N Smolka, Nilakshi Vaidya, Henrik Walter, Robert Whelan, Gunter Schumann, Christian Büchel, Jean-Baptiste Poline, Bernd Itterman, Vincent Frouin, Alexandre Martin, Claire Cury, Olivier Colliot

Incomplete Hippocampal Inversion (IHI), sometimes called hippocampal malrotation, is an atypical anatomical pattern of the hippocampus found in about 20% of the general population. IHI can be visually assessed on coronal slices of T1-weighted MR images, using a composite score that combines four anatomical criteria. IHI has been associated with several brain disorders (epilepsy, schizophrenia). However, these studies were based on small samples. Furthermore, the factors (genetic or environmental) that contribute to the genesis of IHI are largely unknown. Large-scale studies are thus needed to further understand IHI and its potential relationships to neurological and psychiatric disorders. However, visual evaluation is time-consuming and tedious, justifying the need for an automatic method. In this paper, we propose, for the first time, to automatically rate IHI. We proceed by predicting four anatomical criteria, which are then summed to form the IHI score, providing the advantage of an interpretable score. We provide an extensive experimental investigation of different machine learning methods and training strategies. We performed automatic rating using a variety of deep learning models ("conv5-FC3", ResNet and "SECNN") as well as a ridge regression. We studied the generalization of our models using different cohorts and performed multi-cohort learning. We relied on a large population of 2,008 participants from the IMAGEN study, 993 and 403 participants from the QTIM and QTAB studies, respectively, as well as 985 subjects from the UK Biobank. We showed that deep learning models outperformed a ridge regression. We demonstrated that the performance of the "conv5-FC3" network was at least as good as that of more complex networks while maintaining a low complexity and computation time.
We showed that training on a single cohort may lack variability, while training on several cohorts improves generalization (acceptable performance on all tested cohorts, including some that are not included in training). The trained models will be made publicly available should the manuscript be accepted.
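
The scoring step described in the abstract (four predicted anatomical criteria summed into one IHI score) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `CriteriaPrediction`, the field names `c1` to `c4`, the helper `rate_hippocampi`, and the example values are all hypothetical placeholders, and the abstract does not specify the value ranges of the individual criteria.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class CriteriaPrediction:
    """Predicted values of the four anatomical criteria for one hippocampus.

    The field names c1..c4 are placeholders, not the paper's terminology.
    """
    c1: float
    c2: float
    c3: float
    c4: float

    def ihi_score(self) -> float:
        # The composite score is the plain sum of the four criteria,
        # so each term remains individually inspectable.
        return self.c1 + self.c2 + self.c3 + self.c4


def rate_hippocampi(predictions: Sequence[CriteriaPrediction]) -> List[float]:
    """Return one composite IHI score per hippocampus."""
    return [p.ihi_score() for p in predictions]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    example = CriteriaPrediction(c1=1.5, c2=0.5, c3=1.0, c4=0.0)
    print(example.ihi_score())
```

Keeping the four criteria as separate fields, and only summing at the end, mirrors the interpretability argument made above: a reader can see which criterion drives a given score.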