Sam PEROCHON
Computational behavioral phenotyping for the screening and monitoring of autism spectrum disorder and dysexecutive syndromes
Summary
Early screening, diagnosis, and monitoring of neurological conditions, such as autism spectrum disorder (ASD) and dysexecutive syndromes (DS), are crucial for improving individuals' outcomes and quality of life. However, despite recent progress in molecular and imaging-based diagnostic tools, behavioral ratings based on clinical observations remain the gold standard for screening and diagnosing neurodevelopmental and neurophysiological disorders. These methods rely on subjective rather than objective measurements, suffer from limited precision, and require highly trained professionals, resulting in long waiting periods and significant disparities in access to care. Concurrently, machine learning and machine perception - including vision, audio, and sensor data processing - have advanced significantly, driven by theoretical, technical, and empirical breakthroughs, the growing availability of large datasets and computational resources, and the collaborative efforts of interdisciplinary teams. As a result, novel computational tools have emerged as a promising solution for analyzing and extracting insights from the massive amounts of complex multimodal data recorded by ubiquitous high-resolution digital and biometric sensors, such as cameras, microphones, and wearable devices. These rich data streams, while promising, are equally complex to analyze owing to their heterogeneity, volume, variability, and high dimensionality. As such, they remain tedious to study and pose significant analytical challenges, from capturing interdependencies across modalities to achieving efficient multimodal integration. The goal of this thesis is to address key technical and methodological challenges of computational behavioral analysis (CBA) that hinder the development of digital-age behavioral assessment tools that are more accurate, objective, and scalable. The thesis is structured into three parts that explore complementary dimensions of CBA:
For the early screening of autism spectrum disorder using a digital assessment device developed at Duke University, the first part focuses on developing algorithms that can automatically and objectively measure key autism-related behaviors in response to developmentally appropriate stimuli designed to elicit such behaviors. In a first step, we describe novel methods for computing two subsets of behavioral markers from multimodal signals: first, by combining computer vision and audio signal processing tools to detect and quantify toddlers' responses when called by their name - a cardinal early symptom of autism - and, second, by modeling and quantifying the child's finger-touchscreen interactions during a gameplay assessment designed to elicit visual-motor integration, repetitive behaviors, and fine motor skills. We then present a method to combine all digital biomarkers - each capturing different aspects of autism-related behaviors - into an interpretable and informative behavioral phenotype associated with the child's likelihood of autism, demonstrating that digital assessment can compete with traditional gold standards for autism screening.
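To make the first marker subset concrete, the sketch below shows one plausible way to quantify a response to a name call: given the time of the call and a per-frame head-yaw time series from a face-tracking model, measure the latency to the first orienting head turn. All names, thresholds, and the yaw convention are hypothetical illustrations, not the thesis implementation.

```python
import numpy as np

def response_latency(call_time, times, head_yaw, turn_thresh=25.0, max_wait=3.0):
    """Latency (s) between a name call and the first head turn toward the
    examiner, or None if no response occurs within max_wait seconds.

    times       : 1-D array of frame timestamps (s)
    head_yaw    : head yaw angle per frame (deg), 0 = facing the screen
    turn_thresh : yaw magnitude counted as an orienting response (assumed)
    """
    times = np.asarray(times)
    head_yaw = np.asarray(head_yaw)
    # Restrict to the response window following the call.
    window = (times >= call_time) & (times <= call_time + max_wait)
    idx = np.flatnonzero(window & (np.abs(head_yaw) >= turn_thresh))
    if idx.size == 0:
        return None  # absence of response is itself an informative marker
    return float(times[idx[0]] - call_time)

# Toy example: the child turns ~0.8 s after a call at t = 2.0 s.
t = np.arange(0.0, 6.0, 0.1)
yaw = np.where(t >= 2.8, 40.0, 0.0)
lat = response_latency(2.0, t, yaw)
```

The resulting latency (or the absence of a response) can then feed into the combined phenotype alongside the touchscreen-derived markers.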
For the ecological assessment of sensorimotor disorders and dysexecutive syndromes (DS) in brain-injured patients, the second part develops a computational method to automatically characterize behavioral trajectories during the execution of a prescribed task: cooking a chocolate cake. Leveraging egocentric videos, we first estimate an optimal set of behaviorally interpretable egocentric vision "atoms", using a simple yet efficient procedure that iteratively accumulates high-quality prototypes by recursively (i) exploring the misfitted residual egocentric subspace to search for optimal candidates and (ii) incorporating human feedback to judge the validity of the prototypes. We show that these prototypes summarize participants' visual experience during the task and generalize to a degree that depends on (i) the reproducibility of the visual environment of the test and (ii) how effectively video foundation models represent short video segments (~640 ms). The atoms are then used to derive fine-grained, interpretable, and compact symbolic representations of the participants' vision that capture key aspects of their behavior in interaction with the environment. Finally, we investigate the computation of the barycentric average of these symbolic representations, adapting state-of-the-art techniques. Altogether, these methods enable the registration of participants' vision trajectories, supporting diagnostic group comparisons and reference-based evaluation, and easing multimodal integration through context incorporation.
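The iterative accumulation of prototypes described above can be sketched as a human-in-the-loop greedy loop: at each round, propose the video-segment embedding least well covered by the current prototypes (the most "misfitted" residual point) and keep it only if the human judge validates it. This is an illustrative reading of the procedure, not the thesis code; the distance-based candidate selection and the `is_valid` callable standing in for human feedback are assumptions.

```python
import numpy as np

def accumulate_prototypes(embeddings, is_valid, n_max=10, seed=0):
    """Greedily accumulate behaviorally interpretable prototypes.

    embeddings : (N, d) array of short-segment video embeddings
    is_valid   : callable(index) -> bool, stand-in for the human-feedback
                 step that judges whether a candidate is a valid "atom"
    """
    rng = np.random.default_rng(seed)
    protos = [int(rng.integers(len(embeddings)))]  # random initial prototype
    rejected = set()
    while len(protos) < n_max:
        # Distance from every segment to its nearest current prototype.
        d = np.linalg.norm(
            embeddings[:, None, :] - embeddings[protos][None, :, :], axis=-1
        )
        residual = d.min(axis=1)
        residual[list(rejected) + protos] = -np.inf  # skip reviewed segments
        cand = int(np.argmax(residual))              # most misfitted segment
        if residual[cand] == -np.inf:
            break  # every segment has been reviewed
        if is_valid(cand):
            protos.append(cand)    # accepted: becomes a new atom
        else:
            rejected.add(cand)     # rejected: excluded from later rounds
    return protos
```

Once accepted, each prototype labels the segments nearest to it, yielding the symbolic representation of a participant's visual trajectory.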
Finally, the third part focuses on the preemptive monitoring of cardiovascular diseases using wearable biosignals by developing a method to learn more time-aware representations of biosignals. We propose to implicitly integrate the biosignal timestamps when defining the positive pairs of joint-embedding architectures in contrastive self-supervised learning, thus enforcing physiological consistency by encouraging positive pairs to be close in time. We demonstrate that these representations can capture temporal patterns in the data that are sensitive to changes in cardiovascular health, enabling early detection and monitoring of cardiovascular risks.
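The time-aware positive-pair idea can be illustrated by a minimal sampling routine: instead of pairing two augmented views of the same segment, pair segments recorded close together in time, so the contrastive objective pulls together representations sharing the same physiological context. The window size `max_dt` and the rejection-sampling scheme below are assumptions for the sketch, not the thesis implementation.

```python
import numpy as np

def sample_time_aware_pairs(timestamps, n_pairs, max_dt=60.0, seed=0):
    """Sample positive index pairs (i, j), i != j, with |t_i - t_j| <= max_dt.

    timestamps : 1-D array, start time (s) of each recorded biosignal segment
    max_dt     : maximum temporal gap (s) allowed within a positive pair
    """
    rng = np.random.default_rng(seed)
    ts = np.asarray(timestamps)
    pairs = []
    while len(pairs) < n_pairs:
        i = int(rng.integers(len(ts)))
        # Candidate partners: other segments within the temporal window.
        near = np.flatnonzero(
            (np.abs(ts - ts[i]) <= max_dt) & (np.arange(len(ts)) != i)
        )
        if near.size:
            pairs.append((i, int(rng.choice(near))))
    return pairs
```

In a joint-embedding architecture, these pairs would replace (or complement) augmentation-based positives in the contrastive loss, implicitly injecting the timestamps into the learned representation.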
While digital phenotyping and computational behavioral analysis hold great promise for advancing the understanding and evaluation of neurodevelopmental and neurophysiological conditions through more systematic, precise, and objective assessments, the application of machine learning in healthcare also raises significant ethical, environmental, and sociotechnical challenges. These include issues of interpretability, data privacy, the environmental impact of large-scale computation, and the critical need for robust validation and generalization across demographically diverse populations.
Direction
Jury
- Mohamed DAOUDI, IMT Nord Europe: Rapporteur
- Germain FORESTIER, Université Rennes 2: Rapporteur
- David CARLSON, Duke University: Examinateur
- Geraldine DAWSON, Duke University: Examinateur
- Gabriele FACCIOLO, Université Paris-Saclay: Examinateur