Machine learning and high-dimensional statistics

This theme deals with inference, predictive modeling and sequential optimization from complex data such as time series, functional data and network data.

Scientific Referents

Alumni:

  • Ioannis Bargiotas
  • Alain Durmus
  • Vianney Perchet

Thematic overview

The research developed in this area draws on our mastery of the theoretical and algorithmic foundations of machine learning and our knowledge of high-dimensional statistical methods. It also draws on our familiarity with a wide range of applications, mainly in the industrial and biomedical fields (health, human factors).

The scientific approach adopted leads to the introduction of a priori knowledge (from underlying physics or resource constraints, for example) into learning techniques. Scientific contributions also largely materialize through interaction with other research themes developed at the Centre Borelli.

For illustration purposes, here are some examples of specific research topics:

  • Transfer Learning – Learning problems in an industrial context are constrained by the small number of samples and by poorly represented stationarity regimes. Several transfer learning techniques have been proposed to provide usable methods and tools. In addition, operational applications require fine quantification of uncertainties for risk control, which is an important direction of research work.
  • Physically motivated learning – The hybridization of knowledge-based models (in the form of partial differential equations, for example) and empirical models is one of the major scientific challenges in engineering and the life sciences. Developing such hybrid models is a challenge for both the simulation and the monitoring of complex systems. In particular, the Centre Borelli's work explores PINNs (Physics-Informed Neural Networks) and uses them to specify experimental designs that exploit the physical structure of solutions.
  • High-dimensional statistical methods – The use of machine learning methods to address high-dimensional statistical problems raises a number of issues that are being studied at the Centre Borelli. These concern information representations (e.g. identifying local stationarity regimes for time series, possibly on graphs), but also the synthesis of dependency patterns for high-dimensional observations (e.g. comparing such samples, or sequentially optimizing a function).
  • Process modeling on networks – Spatio-temporal processes on discrete structures modeled by graphs cover many concrete modeling applications, for example in epidemiology, economics, physics, information dissemination and telecommunications. One model studied is epidemic competition, where two infections spread across a given graph while mutually excluding each other when they occupy a node. Beyond simply characterizing a phenomenon, these models also raise operations research questions, such as controlling a process with a limited set of resources.
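
As a rough illustration of the epidemic competition idea mentioned in the last topic, the sketch below simulates two infections spreading on a graph, where a node that is already occupied can never be taken over by the other infection. It is not the model studied at the Centre Borelli: the discrete-time update rule, the ring graph, the transmission probabilities and all function names (ring_graph, step, simulate) are assumptions made here for illustration only.

```python
import random

# Node states (illustrative encoding, not taken from the source).
SUSCEPTIBLE, A, B = 0, 1, 2


def ring_graph(n):
    """Adjacency lists of a cycle on n nodes (an arbitrary example graph)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def step(graph, state, beta_a, beta_b, rng):
    """One synchronous update of the competing infections.

    Each infected node tries to infect each susceptible neighbour with
    probability beta_a or beta_b. Occupied nodes are never overwritten.
    """
    attempts = {}  # susceptible node -> infections that reached it this step
    for node, neighbours in graph.items():
        infection = state[node]
        if infection == SUSCEPTIBLE:
            continue
        beta = beta_a if infection == A else beta_b
        for nb in neighbours:
            if state[nb] == SUSCEPTIBLE and rng.random() < beta:
                attempts.setdefault(nb, []).append(infection)

    new_state = dict(state)
    for node, infections in attempts.items():
        # Mutual exclusion: keep exactly one infection per node, chosen
        # uniformly among the successful attempts of this step.
        new_state[node] = rng.choice(infections)
    return new_state


def simulate(graph, seeds_a, seeds_b, beta_a=0.5, beta_b=0.5, n_steps=50, seed=0):
    """Run the competition from the given seed nodes and return the final state."""
    rng = random.Random(seed)
    state = {node: SUSCEPTIBLE for node in graph}
    for node in seeds_a:
        state[node] = A
    for node in seeds_b:
        state[node] = B
    for _ in range(n_steps):
        state = step(graph, state, beta_a, beta_b, rng)
    return state


final = simulate(ring_graph(20), seeds_a=[0], seeds_b=[10],
                 beta_a=1.0, beta_b=1.0, n_steps=20)
print(sum(1 for s in final.values() if s == A), "nodes hold infection A")
```

In this toy setting there is no recovery, so each infection keeps every node it reaches. A control question of the kind mentioned above could be phrased, under the same assumptions, as choosing a limited number of seed nodes for one infection so as to maximize its final share of the graph.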

Keywords

Machine learning; high-dimensional statistics; graphML; responsible ML; human-machine interfaces.

Key facts

  • The Python ruptures library implements and unifies numerous change-point detection algorithms for time series, both state-of-the-art methods and algorithms developed at the Centre Borelli. Launched in 2019, it had been downloaded more than 20 million times by 2024 and is used in many fields (industry, advertising, medicine, biology, etc.).
  • Mathilde Mougeot is involved in the management of various institutions: she is deputy director of the FMJH and a member of the INSMI Scientific Council, and has been in charge of the INSMI's “valorisation” project. She is also Deputy Director of the Graduate School of Mathematics at Université Paris-Saclay. She was a member of the HCERES committee for the CNRS AMIES unit. Finally, she was involved in the white paper drawn up for the Assises des Mathématiques 2022.
  • Nicolas Vayatis has been elected ELLIS Fellow. He was an expert member of two interministerial commissions for the evaluation of COVID-19 epidemiological models in 2020. He is also a member of the scientific council for the committee of directors of the SNCF group and a scientific advisor to the Fondation MAIF for research.
  • Vianney Perchet is an associate editor of the Journal of Dynamic Games and Applications and of Operations Research Letters. He regularly serves on the scientific committees of major machine learning conferences (COLT, NeurIPS, ICML).

Applications

  • Industry (manufacturing, transport, energy),
  • Digital health,
  • Human factors.

Portfolio

Publications

Interactions with other themes of the Centre Borelli