Machine Learning

Subspace Co-clustering with Two-Way Graph Convolution

Published in: CIKM '22: The 31st ACM International Conference on Information and Knowledge Management

Authors: Chakib Fettal, Lazhar Labiod, Mohamed Nadif

Subspace clustering aims to cluster high-dimensional data lying in a union of low-dimensional subspaces. It has shown good results on image clustering, but text clustering based on document-term matrices has proved more resistant to advances built on this approach. We hypothesize that this is because text data is generally higher dimensional and sparser than image data, which renders subspace clustering impractical in such a context. Here, we leverage subspace clustering for text by addressing these issues. We first extend the concept of subspace clustering to co-clustering, which has been extensively used on document-term matrices due to the resulting interplay between the document and term representations. We then address the sparsity problem through a two-way graph convolution, which promotes the grouping effect that has been credited for the effectiveness of some subspace clustering models. The proposed formulation results in an algorithm that is efficient in terms of both computational and space complexity. We show the competitiveness of our model with respect to the state of the art on document-term attributed graph datasets in terms of performance and efficiency.

CCS Concepts: • Information systems → Clustering.
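To make the idea of a two-way graph convolution more concrete, here is a minimal sketch of one common way to smooth a sparse document-term matrix along both of its dimensions. It is an illustration under stated assumptions, not the formulation used in the paper: the symmetric normalization with self-loops, the `power` parameter, the helper names, and in particular the existence of a term-side graph `A_terms` are all hypothetical choices made for this example.

```python
import numpy as np


def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix A.

    Assumption: this is the normalization popular in graph-convolution
    models; the abstract does not say which one the paper uses.
    """
    A = np.asarray(A, dtype=float)
    A_hat = A + np.eye(A.shape[0])       # add self-loops
    deg = A_hat.sum(axis=1)              # >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def two_way_graph_convolution(X, A_rows, A_cols, power=1):
    """Smooth X over a row graph and a column graph: S_r^p @ X @ S_c^p.

    `power` (hypothetical parameter) is the number of propagation steps
    applied on each side.
    """
    S_rows = np.linalg.matrix_power(normalized_adjacency(A_rows), power)
    S_cols = np.linalg.matrix_power(normalized_adjacency(A_cols), power)
    return S_rows @ X @ S_cols


# Toy data: 3 documents x 4 terms, mostly zeros.
X = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])

# Hypothetical graphs: documents 0 and 1 are linked; terms 0-1 and 2-3 are linked.
A_docs = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 0]])
A_terms = np.array([[0, 1, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]])

X_smooth = two_way_graph_convolution(X, A_docs, A_terms)
print(X_smooth)
```

In this toy example the two linked documents end up with identical rows and the smoothed matrix has more non-zero entries than the input, which illustrates how propagation over both graphs can reduce sparsity and pull related rows and columns toward each other before any clustering step is applied.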