This repository contains the code for the experiments and algorithm from the paper "Spectral Removal of Guarded Attribute Information" (EACL 2023).
We propose to erase information from neural representations by truncating the singular value decomposition of the covariance matrix between the neural representations and the examples representing the information to be removed (the protected attributes). The truncation keeps only the principal directions with small singular values, i.e. the directions that covary least with the protected attribute.
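The linear variant can be sketched as follows. This is a minimal illustration of the idea, not the repository's ksal.py; the shapes, variable names, and the number of removed directions `k` are assumptions:

```python
import numpy as np

def sal_projection(X, Z, k):
    """Remove the k directions of X that covary most with Z.

    X: (n_samples, d) neural representations.
    Z: (n_samples, d_z) protected-attribute encodings.
    Returns debiased representations of the same shape as X.
    """
    Xc = X - X.mean(axis=0)
    Zc = Z - Z.mean(axis=0)
    # Cross-covariance between representations and protected attributes.
    cov = Xc.T @ Zc / (len(X) - 1)            # shape (d, d_z)
    U, S, Vt = np.linalg.svd(cov, full_matrices=False)
    # The leading left singular vectors covary most with Z;
    # project them out, keeping the small-singular-value directions.
    U_top = U[:, :k]                           # shape (d, k)
    P = np.eye(X.shape[1]) - U_top @ U_top.T   # orthogonal projection
    return X @ P

# Toy usage: the first coordinate of X is strongly correlated with Z.
rng = np.random.default_rng(0)
Z = rng.standard_normal((100, 1))
X = np.hstack([Z + 0.1 * rng.standard_normal((100, 1)),
               rng.standard_normal((100, 4))])
X_deb = sal_projection(X, Z, k=1)
```

After projection, the empirical cross-covariance between the debiased representations and Z along the removed directions vanishes.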
In addition, we describe a kernel method that solves the same problem. Rather than performing SVD on the covariance matrix, the kernel method performs a series of spectral operations on the kernel matrices of the input neural representations and the protected attributes.
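One simple way to approximate this idea is to map the inputs to kernel-PCA features (coordinates in the eigenbasis of the centered kernel matrix) and then apply the linear removal in that space. The sketch below computes only those features; the RBF kernel, `gamma`, and function names are assumptions, and the repository's actual kernel algorithm differs in its exact spectral operations:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between rows of A and rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def kernel_features(X, n_components, gamma=1.0):
    """Kernel-PCA features of X: coordinates along the top eigenvectors
    of the doubly centered kernel matrix, scaled by sqrt(eigenvalue),
    so that a linear removal step can be applied in this feature space."""
    K = rbf_kernel(X, X, gamma)
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)       # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.clip(vals, 0, None))
```

The resulting features can then be passed, together with the protected-attribute labels, to the same covariance-SVD truncation used in the linear case.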
We use the experimental settings from the paper "Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection" (INLP), and use the algorithm from that paper as a benchmark.
The implementation is available in Python (ksal.py) and in Matlab.
Given example representations X of shape (number of samples, number of dimensions), bias labels Z, and optional main-task labels Y, ksal.py is designed to remove the information about Z; we found it is also good at keeping the information about Y. We evaluate the bias before and after debiasing by training different classifiers on the pairs (X, Z), by measuring the TPR gap between populations (p(Y=Y'|X,Z)), and with other popular metrics such as WEAT.
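The TPR-gap metric can be sketched as follows. This is a generic illustration (the function name and the binary group encoding are assumptions), not the repository's evaluation code:

```python
import numpy as np

def tpr_gap(y_true, y_pred, groups):
    """Per-class true-positive-rate gap between two protected groups.

    y_true, y_pred: arrays of main-task labels (gold and predicted).
    groups: array of 0/1 protected-group membership (e.g. gender).
    Returns {label: TPR(group 1) - TPR(group 0)}.
    """
    gaps = {}
    for label in np.unique(y_true):
        tprs = []
        for g in (0, 1):
            # Recall of `label` restricted to examples from group g.
            mask = (y_true == label) & (groups == g)
            tprs.append((y_pred[mask] == label).mean() if mask.any() else np.nan)
        gaps[label] = tprs[1] - tprs[0]
    return gaps
```

A debiasing method is considered successful here when the gaps shrink toward zero without a large drop in overall main-task accuracy.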
Start a new virtual environment:
conda create -n SAL python=3.7 anaconda
conda activate SAL
Install jsonnet from conda-forge and the other dependencies from requirements.txt:
conda install -c conda-forge jsonnet
pip install -r requirements.txt
Use the following script to download the datasets used in this repository:
./download_data.sh
Download the English model from spaCy:
python -m spacy download en
python src/data/to_word2vec_format.py data/embeddings/glove.42B.300d.txt
python src/data/filter_vecs.py \
--input-path data/embeddings/glove.42B.300d.txt \
--output-dir data/embeddings/ \
--top-k 150000 \
--keep-inherently-gendered \
--keep-names
Then run the notebook.
To run the word similarity experiments (Table 1), please check the notebook for our method and the corresponding notebook for INLP.
export PYTHONPATH=/path_to/nullspace_projection
./run_deepmoji_debiasing.sh
To recreate the evaluation used in the paper, check out the following SAL notebook.
Assumes the Bias-in-Bios dataset from De-Arteaga et al. (2019) is saved at data/biasbios/BIOS.pkl.
python src/data/create_dataset_biasbios.py \
--input-path data/biasbios/BIOS.pkl \
--output-dir data/biasbios/ \
--vocab-size 250000
./run_bias_bios.sh
Run the BERT experiments in the BERT SAL notebook.
Run the FastText experiments in the FastText SAL notebook.