Computer Science
Cluster Insight: A Weighted Clustering Tool for Large Textual Data Exploration
Published in: The 18th ACM International Conference on Web Search and Data Mining
In unsupervised learning, the exploration of large volumes of textual data is a topic of significant interest. In this article, we present a compact, easy-to-use application for exploring large volumes of textual data using clustering and generative models. We show how to adapt the Lasso weighted k-means algorithm to textual data. We also describe in detail a user-friendly package that demonstrates how to use LLMs effectively to describe document classes.
• Computing methodologies → Cluster analysis; Probabilistic reasoning; Natural language processing; • Mathematics of computing → Probabilistic algorithms.