Machine Learning
Semi-supervised Latent Block Model with pairwise constraints
Published in: Machine Learning
Co-clustering aims to partition both dimensions of a data matrix simultaneously. It has demonstrated better performance than one-sided clustering on high-dimensional data. The Latent Block Model (LBM) is a probabilistic co-clustering model based on mixture models that has proven useful for a broad class of data. In this paper, we propose to leverage prior knowledge in the form of pairwise semi-supervision in both the row and column spaces to improve the clustering performance of the algorithms derived from this model. We present a general probabilistic framework, based on Hidden Markov Random Fields (HMRF), for incorporating must-link and cannot-link relationships into the LBM. We instantiate this framework on a model for count data and present two inference algorithms based on Variational EM and Classification EM. Extensive experiments on simulated data and on real-world attributed networks confirm the value of our approach and demonstrate the effectiveness of our algorithms.
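To make the idea of pairwise constraints in a block model concrete, the following is a minimal illustrative sketch, not the formulation used in the paper. It assumes hard row and column assignments, a Poisson rate per block for count data, and a single penalty weight `lam` charged for each violated must-link or cannot-link pair, in the spirit of an HMRF prior. The function name `penalized_objective` and all parameter names are hypothetical.

```python
import numpy as np

def penalized_objective(X, z, w, gamma,
                        ml_rows=(), cl_rows=(), ml_cols=(), cl_cols=(),
                        lam=1.0):
    """Poisson block log-likelihood (up to a constant) minus constraint penalties.

    X      : (n, d) count matrix
    z, w   : hard row and column cluster labels
    gamma  : (g, m) matrix of positive block rates (assumed parameterization)
    ml_*   : must-link pairs (i, j); cl_* : cannot-link pairs (i, j)
    lam    : assumed cost of one violated constraint
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z)
    w = np.asarray(w)
    gamma = np.asarray(gamma, dtype=float)

    # Each cell's rate is the parameter of the block it falls in.
    rates = gamma[z][:, w]
    # Poisson log-likelihood without the -log(x!) term, which does not
    # depend on the partitions.
    loglik = np.sum(X * np.log(rates) - rates)

    def violations(labels, must, cannot):
        # A must-link is broken when the labels differ,
        # a cannot-link when they agree.
        broken_ml = sum(int(labels[i] != labels[j]) for i, j in must)
        broken_cl = sum(int(labels[i] == labels[j]) for i, j in cannot)
        return broken_ml + broken_cl

    penalty = lam * (violations(z, ml_rows, cl_rows)
                     + violations(w, ml_cols, cl_cols))
    return loglik - penalty

# Toy data: rows 0 and 1 look alike, row 2 differs.
X = np.array([[5, 6, 0, 1],
              [4, 5, 1, 0],
              [0, 1, 6, 5]])
gamma = np.array([[5.0, 0.5],
                  [0.5, 5.0]])
w = [0, 0, 1, 1]
z_good = [0, 0, 1]   # satisfies the must-link (0, 1)
z_bad = [0, 1, 1]    # violates it

good = penalized_objective(X, z_good, w, gamma, ml_rows=[(0, 1)], lam=2.0)
bad = penalized_objective(X, z_bad, w, gamma, ml_rows=[(0, 1)], lam=2.0)
print(good > bad)
```

In this toy setup the partition that respects the must-link pair scores higher, both because it fits the counts better and because it avoids the penalty. A soft (variational) treatment would replace the hard labels with posterior membership probabilities; that variant is not shown here.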