Computer Science

WordGraph: A Python Package for Reconstructing Interactive Causal Graphical Models from Text Data

Published at: The 17th ACM International Conference on Web Search and Data Mining

Authors: Amine Ferdjaoui, Séverine Affeldt, Mohamed Nadif

We present WordGraph, a Python package for exploring the topics of document corpora. WordGraph builds causal graphical models from the vocabulary of text data and offers interactive visualizations of term networks. Our easy-to-use package comes with a pre-built pipeline that exposes the main modules through Jupyter widgets. As a result, a whole vocabulary exploration process is encapsulated within a single Jupyter notebook cell, with straightforward parameter settings and interactive plots. The WordGraph pipeline is fully customizable by adding or removing widgets or by changing default parameters. To assist users who have no background in Python or Jupyter notebooks but wish to explore the topics of large corpora, we also provide automatic generation of a web-application-style dashboard from the customizable Jupyter notebook pipeline. WordGraph is available through a GitHub repository.

CCS Concepts: • Computing methodologies → Cluster analysis; Probabilistic reasoning; • Mathematics of computing → Causal networks.
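
To give a rough sense of what a term network built from a corpus vocabulary looks like, here is a minimal sketch using only the Python standard library. It is not WordGraph's API and does not perform causal discovery: the function name `term_network`, the whitespace tokenization, and the pointwise mutual information (PMI) threshold are all assumptions chosen for illustration, and the resulting edges express association between terms only.

```python
from collections import Counter
from itertools import combinations
from math import log


def term_network(docs, min_pmi=0.0):
    """Hypothetical helper (not part of WordGraph).

    Returns {(term_a, term_b): pmi} for term pairs whose document-level
    pointwise mutual information is strictly greater than min_pmi.
    """
    n_docs = len(docs)
    # Vocabulary of each document as a set of lowercase tokens.
    doc_terms = [set(doc.lower().split()) for doc in docs]
    # Number of documents containing each term.
    term_df = Counter(t for terms in doc_terms for t in terms)
    # Number of documents containing each (sorted) pair of terms.
    pair_df = Counter(
        pair for terms in doc_terms for pair in combinations(sorted(terms), 2)
    )
    edges = {}
    for (a, b), n_ab in pair_df.items():
        # PMI = log( P(a, b) / (P(a) * P(b)) ), with document frequencies.
        pmi = log(n_ab * n_docs / (term_df[a] * term_df[b]))
        if pmi > min_pmi:
            edges[(a, b)] = pmi
    return edges


# Toy corpus, invented for this example.
docs = [
    "graph causal model",
    "causal graph text",
    "topic text corpus",
    "topic corpus widget",
]
edges = term_network(docs)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight:.3f}")
```

Each key of `edges` is an undirected link between two terms. A package such as WordGraph goes further by inferring a causal graphical model and rendering it interactively, which this sketch does not attempt.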