Machine Learning

KFC: A clusterwise supervised learning procedure based on the aggregation of distances

Published in: Journal of Statistical Computation and Simulation

Authors: Sothea Has, Aurélie Fisher, Mathilde Mougeot

Many off-the-shelf machine learning procedures are now available and can easily be used to calibrate predictive models on supervised data. However, when the input data consist of several unknown clusters, each linked to a different underlying predictive model, fitting a single model becomes more challenging. In this paper, we propose a three-step procedure that solves this problem automatically. The first step captures the clustering structure of the input data, which may be characterized by several statistical distributions, and yields several candidate partitions. The second step fits, for each partition, a specific predictive model on the data in each cluster. The third step computes the overall model by a consensual aggregation of the models corresponding to the different partitions. A comparison on several simulated and real data sets demonstrates the excellent performance of our method across a wide variety of prediction problems.
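The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the inputs are positive scalars, the candidate partitions come from a Lloyd-style clustering run under two dissimilarities (squared Euclidean and a generalized Kullback-Leibler divergence), each cluster gets a simple least-squares line, and a plain average across partitions stands in for the consensual aggregation described in the abstract. All function names (`kmeans_1d`, `fit_kfc_sketch`, `predict`) are hypothetical.

```python
import math

def sq_euclidean(a, c):
    return (a - c) ** 2

def gen_kl(a, c):
    # Generalized Kullback-Leibler divergence; requires a > 0 and c > 0.
    return a * math.log(a / c) - a + c

def assign(x, cents, dist):
    """Index of the centroid closest to x under the given dissimilarity."""
    return min(range(len(cents)), key=lambda j: dist(x, cents[j]))

def kmeans_1d(xs, k, dist, n_iter=20):
    """Lloyd-style clustering of scalars (k >= 2); returns the k centroids."""
    lo, hi = min(xs), max(xs)
    # Deterministic initialisation: centroids spread evenly over [lo, hi].
    cents = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[assign(x, cents, dist)].append(x)
        # Mean update (the minimiser for both dissimilarities used here);
        # an empty cluster keeps its previous centroid.
        cents = [sum(g) / len(g) if g else c for g, c in zip(groups, cents)]
    return cents

def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on scalar inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return a, my - a * mx

def fit_kfc_sketch(xs, ys, k, dists):
    """Steps 1 and 2: one partition per dissimilarity, one line per cluster."""
    models = []
    for dist in dists:
        cents = kmeans_1d(xs, k, dist)
        labels = [assign(x, cents, dist) for x in xs]
        lines = []
        for j in range(k):
            idx = [i for i, lab in enumerate(labels) if lab == j]
            if idx:
                lines.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
            else:
                lines.append((0.0, 0.0))  # placeholder for an empty cluster
        models.append((dist, cents, lines))
    return models

def predict(models, x):
    """Step 3 (simplified): average the per-partition predictions."""
    preds = []
    for dist, cents, lines in models:
        a, b = lines[assign(x, cents, dist)]
        preds.append(a * x + b)
    return sum(preds) / len(preds)

# Toy data: two well-separated clusters, each with its own linear model.
xs = [0.5 + 0.1 * i for i in range(10)] + [4.5 + 0.1 * i for i in range(10)]
ys = [2 * x + 1 for x in xs[:10]] + [-3 * x + 20 for x in xs[10:]]
models = fit_kfc_sketch(xs, ys, k=2, dists=[sq_euclidean, gen_kl])
print(predict(models, 1.0), predict(models, 5.0))
```

On this toy data both dissimilarities recover the same two clusters, so the averaged prediction coincides with each cluster's own line. On harder data the partitions would differ, and the choice of aggregation rule in the third step would matter.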