Presented at the ICLR 2020 Workshop on Machine Learning in Real Life (ML-IRL) on April 26, 2020. Link to the paper here.
Natural Language Processing relies on high-dimensional word vector representations, which may reflect biases in the training text corpus. Identifying these biases by finding the corresponding interpretable subspaces is crucial for making fair decisions. Existing works have adopted Principal Component Analysis (PCA) to identify subspaces such as gender, but they fail to generalize efficiently or to provide a principled methodology. We propose a framework that unifies existing PCA methods, along with considerations for optimizing them. We also present a novel algorithm that finds topic subspaces more efficiently, and we compare it to an existing approach.
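As a rough illustration of the PCA-based approach, the sketch below recovers a gender direction as the top principal component of pairwise-centered gendered word vectors, in the spirit of Bolukbasi et al. It assumes an `embeddings` mapping from words to unit-normalized vectors and an illustrative set of definitional pairs; neither is taken from this repository's code.

```python
import numpy as np

# Illustrative definitional pairs; the actual seed pairs may differ.
DEFINITIONAL_PAIRS = [("she", "he"), ("her", "his"), ("woman", "man"),
                      ("herself", "himself"), ("daughter", "son")]

def gender_subspace(embeddings, pairs=DEFINITIONAL_PAIRS, k=1):
    """Return the top-k principal directions spanning the gender subspace.

    embeddings: dict mapping a word to its (unit-normalized) vector.
    """
    centered = []
    for a, b in pairs:
        # Center each pair at its midpoint so only the within-pair
        # (gendered) variation remains.
        center = (embeddings[a] + embeddings[b]) / 2
        centered.append(embeddings[a] - center)
        centered.append(embeddings[b] - center)
    # PCA via SVD on the matrix of centered vectors; rows of vt are
    # principal directions ordered by explained variance.
    _, _, vt = np.linalg.svd(np.array(centered), full_matrices=False)
    return vt[:k]
```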
- `identifying_subspaces.ipynb` (TBA)
- `bolukbasi_et_al.ipynb` reproduces a subset of the results from *Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings* by Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai
- `helpers.py` implements the functions used to reproduce Bolukbasi et al.'s results
- `data/figure_7_words.txt` stores the words used to reproduce Fig. 7 from Bolukbasi et al.'s paper
- `data/gender_specific_words.txt` stores seed words for training a classifier that distinguishes gender-specific words from gender-neutral words (see the sketch after this list)
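The seed words feed a simple supervised step. Below is a minimal sketch of such a classifier, assuming a linear SVM on word vectors as in Bolukbasi et al.; `embeddings`, `specific_words`, and `neutral_words` are hypothetical inputs (the positive seed set would come from `data/gender_specific_words.txt`), not the exact training setup used in this repository.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_specificity_classifier(embeddings, specific_words, neutral_words):
    """Train a linear classifier separating gender-specific words
    (label 1) from gender-neutral words (label 0)."""
    words = specific_words + neutral_words
    X = np.array([embeddings[w] for w in words])
    y = np.array([1] * len(specific_words) + [0] * len(neutral_words))
    clf = LinearSVC(C=1.0)
    clf.fit(X, y)
    return clf

# Usage: clf.predict([embeddings["aunt"]]) would be expected to return 1,
# while clf.predict([embeddings["engineer"]]) would be expected to return 0.
```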