This repository contains the official code for the paper
"Interpolation can hurt robust generalization even when there is no noise" (Advances in Neural Information Processing Systems (NeurIPS), 2021)
by Konstantin Donhauser, Alexandru Țifrea, Michael Aerni, Reinhard Heckel, and Fanny Yang, as well as for the two related workshop papers "Surprising benefits of ridge regularization for noiseless regression" (pdf), from the ICML 2021 workshop Overparameterization: Pitfalls & Opportunities, and "Maximizing the robust margin provably overfits on noiseless data" (pdf), from the ICML 2021 workshop A Blessing in Disguise: The Prospects and Perils of Adversarial Machine Learning.
See the Citations section for details on how to cite our work.
We manage dependencies using Conda. The leanest way to use Conda is via the Miniconda distribution. After installing Conda, the environment can be created by running
conda env create -f environment.yml -n interpolation-robustness
from the command line.
JupyterLab requires some additional extensions that cannot be managed by Conda. Make sure to first activate the environment and then run
jupyter labextension install jupyterlab-plotly@4.14.3 @jupyter-widgets/jupyterlab-manager plotlywidget@4.14.3
from the command line.
Due to the way the CUDA dependency is implemented in jaxlib/XLA, the following workaround is required to run JAX on a GPU.
- Make sure the Conda environment is activated.
- Mimic the expected CUDA install directory structure by running

mkdir -p "${CONDA_PREFIX}/cuda/nvvm/libdevice/" && ln -s "${CONDA_PREFIX}/lib/libdevice.10.bc" "${CONDA_PREFIX}/cuda/nvvm/libdevice/"

from the command line. This essentially links the libdevice library installed by Conda into the mock directory structure expected by jaxlib/XLA.
- Export the environment variable

export XLA_FLAGS=--xla_gpu_cuda_data_dir="${CONDA_PREFIX}/cuda/"

before any JAX code is executed. The library is only searched once; hence, if you forget to export the environment variable and are using Jupyter, make sure to restart the notebook kernel after setting it.
Alternatively, you might be able to simply point --xla_gpu_cuda_data_dir to your system's CUDA installation, but version mismatches might lead to crashes.
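To verify that JAX actually picks up the GPU after the workaround, a quick check along these lines can help (a minimal sketch; it sets the flag programmatically, which is equivalent to the export above and must happen before JAX is first imported):

```python
import os

# Equivalent to the export above; must be set before JAX initializes
# its backends, i.e. before the first JAX import.
os.environ["XLA_FLAGS"] = (
    "--xla_gpu_cuda_data_dir=" + os.environ["CONDA_PREFIX"] + "/cuda/"
)

import jax

# Lists GPU devices if JAX found CUDA; otherwise JAX falls back to the CPU.
print(jax.devices())
```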
The environment can be enabled and disabled from the command line via
# enable
conda activate interpolation-robustness
# disable
conda deactivate
Mosek is a commercial convex programming solver which can be used to optimize logistic regression problems in the logistic_regression_dcp experiment. We observed Mosek to be faster and more accurate than the standard solvers supplied with CVXPY.
While Mosek is a commercial product, they provide free academic licenses for research or educational purposes at degree-granting academic institutions to faculty, students, and staff. To obtain a license, visit https://www.mosek.com/products/academic-licenses/ and register using your email address from your academic institution.
By default, the solver assumes the license file to be at $HOME/mosek/mosek.lic. The path can be changed by setting the MOSEKLM_LICENSE_FILE environment variable.
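For example (the path below is illustrative, not a repository default):

export MOSEKLM_LICENSE_FILE="${HOME}/licenses/mosek.lic"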
A license is only required to run the logistic_regression_dcp experiment with the Mosek solver. All other code, including the other solvers in logistic_regression_dcp, does not require a Mosek license file to be present anywhere on your computer.
If one of the experiment shell scripts contains MOSEK in its solver list, the experiments can still be run with MOSEK removed or replaced. However, the results might be less accurate, or the other solvers might fail to solve the optimization problems.
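For illustration, the following is a minimal sketch (not the repository's actual experiment code) of how a CVXPY logistic regression problem can be solved with Mosek, falling back to a bundled open-source solver when Mosek is unavailable:

```python
import cvxpy as cp
import numpy as np

# Toy logistic regression in DCP form on random data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = rng.choice([-1.0, 1.0], size=50)

w = cp.Variable(5)
# Sum of log(1 + exp(-y_i * x_i^T w)); convex in w.
loss = cp.sum(cp.logistic(cp.multiply(-y, X @ w)))
problem = cp.Problem(cp.Minimize(loss))

try:
    problem.solve(solver=cp.MOSEK)  # requires a Mosek installation and license
except cp.error.SolverError:
    # Fall back to an open-source solver; may be slower or less accurate.
    problem.solve(solver=cp.ECOS)

print(w.value)
```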
We provide one or more bash scripts to run each of our experiments. We use MLFlow to collect experiment results and Jupyter notebooks for evaluation.
Before running any experiments, make sure to install all dependencies as described in the "Requirements" section.
All experiments are executable Python modules found under src/interpolation_robustness/experiments/. To run an experiment, make sure that src/ is added to the PYTHONPATH environment variable and run

python -m interpolation_robustness.experiments.experiment_module

where experiment_module is the module corresponding to the experiment to execute. All experiments provide a command line interface to vary all configurable parameters.
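For example, from the repository root (experiment_module remains a placeholder; the --help flag assumes the experiments use a standard argument parser):

export PYTHONPATH="${PYTHONPATH}:${PWD}/src"
python -m interpolation_robustness.experiments.experiment_module --help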
The other directories in this repository contain various artifacts related to the different experiments, e.g. scripts to run experiments or Jupyter notebooks for evaluation.
We ran all gradient descent experiments on a single NVIDIA GeForce GTX 1080 Ti graphics card and parallelized execution on an internal compute cluster. However, each individual experiment runs within a few seconds to minutes, even on an off-the-shelf CPU.
Each experiment directory contains one or more Jupyter notebooks. We use those notebooks to evaluate experiment results and create all plots in our papers.
You can start JupyterLab by issuing
jupyter lab
on the command line. If you are using MLFlow locally, the Jupyter working directory must be the same as the experiment working directory; otherwise, MLFlow fails to locate the experiment results.
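Experiment results can also be loaded programmatically inside a notebook, roughly as follows (a minimal sketch assuming results live in MLFlow's default local mlruns directory under the current working directory; the experiment ID depends on your runs):

```python
import mlflow

# Resolves relative to the notebook's working directory, which is why
# the Jupyter and experiment working directories must match.
mlflow.set_tracking_uri("file:./mlruns")

# "0" is MLFlow's default experiment ID; adjust it to your experiment.
runs = mlflow.search_runs(experiment_ids=["0"])  # returns a pandas DataFrame
print(runs.head())
```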
We simplify working with environment variables by using the python-dotenv package. Place all environment variables in a file .env in the root of this repository. You can use template.env as a template.
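Under the hood, python-dotenv simply reads that file into the process environment; a minimal sketch of what that looks like (EXAMPLE_VAR is a hypothetical key; use the variables listed in template.env):

```python
import os

from dotenv import load_dotenv  # provided by the python-dotenv package

# Reads key=value pairs from .env in the current working directory into
# os.environ without overriding variables that are already set.
load_dotenv()

print(os.environ.get("EXAMPLE_VAR"))
```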
In addition to synthetic data, some experiments support the following image datasets:
- MNIST
- FashionMNIST (MIT License)
- CIFAR10
All datasets are stored by default in a subdirectory of data/. Logs and training artifacts are stored by default in a subdirectory of logs/.
The directory src/ contains all Python code, with the subdirectory src/interpolation_robustness being an importable Python module. This structure allows us to reuse a lot of code between experiments.
Please cite this code and our work as
@inproceedings{Donhauser21,
title={Interpolation can hurt robust generalization even when there is no noise},
author={Konstantin Donhauser and Alexandru Ţifrea and Michael Aerni and Reinhard Heckel and Fanny Yang},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2021}
}