Machine Learning

Unsupervised Methods for the Study of Transformer Embeddings

Published in: Advances in Intelligent Data Analysis XIX (IDA 2021), Lecture Notes in Computer Science, vol. 12695. Springer, Cham

Authors: Mira Ait Saada, François Role, Mohamed Nadif

Over the last decade, neural word embeddings have become a cornerstone of many important text mining applications such as text classification, sentiment analysis, named entity recognition, and question answering. In particular, Transformer-based contextual word embeddings have attracted much attention, with several works trying to understand how such models work through supervised probing tasks, usually focusing on BERT. In this paper, we propose a fully unsupervised way to analyze Transformer-based embedding models in their bare state, with no fine-tuning. More precisely, we focus on characterizing and identifying groups of Transformer layers across six different Transformer models.
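As a rough illustration of the kind of analysis involved, and not the paper's actual method, the sketch below extracts per-layer embeddings from one pretrained model with the Hugging Face transformers library and groups the layers by the similarity of the representations they produce. The model name, the toy sentences, the mean pooling, and the choice of three clusters are all illustrative assumptions.

```python
import torch
from sklearn.cluster import AgglomerativeClustering
from transformers import AutoModel, AutoTokenizer

# Load a pretrained model in its bare state (no fine-tuning);
# the specific checkpoint is an illustrative choice.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

# Toy input sentences, chosen only for illustration.
sentences = [
    "The bank raised its interest rates.",
    "She sat on the bank of the river.",
]

# Collect one mean-pooled embedding per layer and per sentence.
per_layer = []
with torch.no_grad():
    for s in sentences:
        inputs = tokenizer(s, return_tensors="pt")
        # hidden_states is a tuple: embedding output + one tensor per layer,
        # each of shape (1, seq_len, hidden_size).
        hidden = model(**inputs).hidden_states
        per_layer.append(torch.stack([h.mean(dim=1).squeeze(0) for h in hidden]))

# One feature vector per layer: shape (num_layers, num_sentences * hidden_size).
layer_matrix = torch.stack(per_layer, dim=1).flatten(start_dim=1).numpy()

# Unsupervised grouping of layers; three clusters is an arbitrary assumption.
clustering = AgglomerativeClustering(n_clusters=3).fit(layer_matrix)
for layer, label in enumerate(clustering.labels_):
    print(f"layer {layer}: group {label}")
```

In this toy setup, each layer is summarized by the concatenation of its mean-pooled sentence representations, so layers that transform the input in similar ways end up close together and fall into the same cluster.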