Machine Learning

Boosting Subspace Co-Clustering via Bilateral Graph Convolution

Publié le - IEEE Transactions on Knowledge and Data Engineering

Auteurs : Chakib Fettal, Lazhar Labiod, Mohamed Nadif

Subspace clustering seeks to cluster high-dimensional data lying in a union of low-dimensional subspaces. It has achieved state-of-the-art results in image clustering, but text clustering of document-term matrices, has proved more impervious to advances with this approach, even though text data satisfies the assumptions of subspace clustering. We hypothesize that this is because such matrices are generally sparser and higher-dimensional than images. This, combined with the complexity of subspace clustering, which is generally cubic in the number of inputs, makes its use impractical in the context of text. Here we address these issues with a view to leveraging subspace clustering for networked (or not) text data. We first extend the concept of subspace clustering to co-clustering, which is suitable to deal with document-term matrices because of the interplay engendered between the document and word representations. We then address the sparsity problem through bilateral graph convolution, which promotes the grouping effect that has been credited for the effectiveness of some subspace clustering models. The proposed formulation results in an algorithm that is computationally/spatially efficient. Experiments using real-world datasets demonstrate the superior performance, in terms of document clustering, word clustering, and computational efficiency, of our proposed approach over the baselines and comparable methods.