CryptoTL

This repository contains the source code for the paper "CryptoTL: Private, efficient and secure transfer learning" [1].

[1] https://arxiv.org/pdf/2205.11935.pdf

Python Requirements

The required Python 3 libraries are listed in requirements.txt. Install them all with the following command:

pip3 install -r requirements.txt

SEAL -- Homomorphic Encryption Library

SEAL (version 4.0.0) is included as a submodule in this repository. Use the following commands (from the root directory) to pull and build SEAL:

git submodule update --init --recursive
cd cpp/SEAL
mkdir build
cd build
cmake .. # replace with the following to use Clang: CC=clang CXX=clang++ cmake ..
make -j 6
cd ../../../

To enable Intel HEXL (which gives a speedup on processors with the Intel AVX512-IFMA52 instruction set), replace the cmake command with the following:

cmake -DSEAL_USE_INTEL_HEXL=On .. # replace with the following to use Clang: CC=clang CXX=clang++ cmake -DSEAL_USE_INTEL_HEXL=On ..
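
Whether HEXL brings a speedup depends on the CPU. On Linux, one way to check is to look for the `avx512ifma` flag in /proc/cpuinfo. The following is a minimal sketch under that assumption; the helper name `has_avx512_ifma` is hypothetical and not part of this repository.

```python
from pathlib import Path


def has_avx512_ifma(cpuinfo_text):
    """Return True if any 'flags' line of the given cpuinfo text lists avx512ifma."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Only inspect the CPU feature flags, not e.g. the model name.
        if key.strip() == "flags" and "avx512ifma" in value.split():
            return True
    return False


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux; other systems need a different check
    if cpuinfo.exists():
        print("AVX512-IFMA52 available:", has_avx512_ifma(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; cannot check on this system")
```

If the check reports that the flag is missing, the default cmake command is the one to use.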

Datasets

The (preprocessed) datasets we used in our experiments are attached to the GitHub releases of this repository. Please download and extract them into the datasets folder.

Dataset
`IMDB-train-SBERT.csv`
`IMDB-test-SBERT.csv`
`SBERT-youtube-combined.csv`
`Twitter-train-SBERT.csv`
`Twitter-test-SBERT.csv`
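
After extracting the archive, it can help to confirm that all listed files ended up in the right place. The following is a small sketch, assuming the files sit directly in a folder named datasets relative to the current directory; the helper name `missing_datasets` is hypothetical.

```python
from pathlib import Path

# File names taken from the dataset list above.
EXPECTED_DATASETS = [
    "IMDB-train-SBERT.csv",
    "IMDB-test-SBERT.csv",
    "SBERT-youtube-combined.csv",
    "Twitter-train-SBERT.csv",
    "Twitter-test-SBERT.csv",
]


def missing_datasets(folder="datasets"):
    """Return the expected dataset files that are not present in `folder`."""
    base = Path(folder)
    return [name for name in EXPECTED_DATASETS if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("All listed dataset files are present.")
```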

Trained Weights

The folder weights contains models which have been trained to produce Table 3 in the paper.

Weights
`cryptotl-imdb-to-imdb.bin`
`cryptotl-imdb-to-twitter.bin`
`cryptotl-imdb-to-yelp.bin`
`cryptotl-imdb-to-youtube.bin`
`cryptotl-twitter-to-imdb.bin`
`cryptotl-twitter-to-twitter.bin`
`cryptotl-twitter-to-yelp.bin`
`cryptotl-twitter-to-youtube.bin`
`cryptotl-youtube-to-imdb.bin`
`cryptotl-youtube-to-twitter.bin`
`cryptotl-youtube-to-yelp.bin`
`cryptotl-youtube-to-youtube.bin`
-----------------------------------
`cnn-full-imdb.bin`
`cnn-full-twitter.bin`
`cnn-full-youtube.bin`
-----------------------------------
`cnn-upper-imdb.bin`
`cnn-upper-twitter.bin`
`cnn-upper-youtube.bin`
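
The CryptoTL weight files in the list appear to follow the pattern cryptotl-<source>-to-<target>.bin. The sketch below builds these names from a source and target dataset. It assumes that this pattern, inferred only from the file names above, holds for all files; the function names are hypothetical.

```python
from itertools import product

SOURCES = ["imdb", "twitter", "youtube"]          # source datasets seen in the list
TARGETS = ["imdb", "twitter", "yelp", "youtube"]  # target datasets seen in the list


def cryptotl_weight_name(source, target):
    """Build the weight file name for a source/target pair (inferred pattern)."""
    if source not in SOURCES:
        raise ValueError(f"unknown source dataset: {source}")
    if target not in TARGETS:
        raise ValueError(f"unknown target dataset: {target}")
    return f"cryptotl-{source}-to-{target}.bin"


def all_cryptotl_weight_names():
    """Return the names for every source/target combination."""
    return [cryptotl_weight_name(s, t) for s, t in product(SOURCES, TARGETS)]


if __name__ == "__main__":
    for name in all_cryptotl_weight_names():
        print(name)
```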

Training the networks

train_cryptotl.py:

| Option | Short | Value | Information | Default |
| --- | --- | --- | --- | --- |
| Epochs | -e | integer | Number of epochs to train | 300 |
| Splits | -s | integer | Splits for the K-fold Cross Validator | 10 |
| Source Dataset Train | -d | Path | Input path of the used source training dataset | "" |
| Target Dataset Train | -t | Path | Input path of the used target training dataset | "" |
| Source Dataset Test | -k | Path | Input path of the used source test dataset | "" |
| Target Dataset Test | -m | Path | Input path of the used target test dataset | "" |
| Lower Model | -l | Path | Output path for the lower model | "lower_model" |
| Upper Model | -u | Path | Output path for the upper model | "upper_model" |
| Accurate HE | -a | Flag | Optional: If present, use more accurate HE parameters | false |

Example:

python3 ./train_cryptotl.py -e 300 -s 10 -d ./datasets/IMDB-train-SBERT.csv -t ./datasets/Twitter-train-SBERT.csv -k ./datasets/IMDB-test-SBERT.csv -m ./datasets/Twitter-test-SBERT.csv
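
When running several source/target combinations, it may be convenient to assemble the command line programmatically. The sketch below builds the argument list for train_cryptotl.py using only the flags from the option table. The helper `build_train_command` is hypothetical and not part of this repository.

```python
import shlex


def build_train_command(source_train, target_train, source_test, target_test,
                        epochs=300, splits=10,
                        lower_model=None, upper_model=None, accurate_he=False):
    """Return the argument list for a train_cryptotl.py run."""
    cmd = [
        "python3", "./train_cryptotl.py",
        "-e", str(epochs),
        "-s", str(splits),
        "-d", source_train,
        "-t", target_train,
        "-k", source_test,
        "-m", target_test,
    ]
    # Only pass the output paths when they differ from the script defaults.
    if lower_model is not None:
        cmd += ["-l", lower_model]
    if upper_model is not None:
        cmd += ["-u", upper_model]
    if accurate_he:
        cmd.append("-a")
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(
        "./datasets/IMDB-train-SBERT.csv",
        "./datasets/Twitter-train-SBERT.csv",
        "./datasets/IMDB-test-SBERT.csv",
        "./datasets/Twitter-test-SBERT.csv",
    )
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to start the training.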

Classification using CryptoTL

classify_cryptotl.py

| Option | Short | Value | Information | Default |
| --- | --- | --- | --- | --- |
| Dataset | -d | Path | Input path of the dataset to classify | "" |
| Model | -m | Path | Input path for the full model | "" |
| Lower Model | -l | Path | Input path for the lower model, if -m not specified | "lower_model" |
| Upper Model | -u | Path | Input path for the upper model, if -m not specified | "upper_model" |
| Accurate HE | -a | Flag | Optional: If present, use more accurate HE parameters | false |

Example:

python3 ./classify_cryptotl.py -d ./datasets/Twitter-test-SBERT.csv
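
According to the option table, the model is given either as one full model (-m) or as a lower/upper pair (-l, -u). The sketch below assembles the command line for both cases. The helper `build_classify_command` is hypothetical, and rejecting a mix of -m with -l/-u is a choice of this sketch, not documented behavior of the script.

```python
def build_classify_command(dataset, model=None, lower_model=None,
                           upper_model=None, accurate_he=False):
    """Return the argument list for a classify_cryptotl.py run."""
    split_given = lower_model is not None or upper_model is not None
    if model is not None and split_given:
        # Sketch-level choice: avoid ambiguity instead of guessing precedence.
        raise ValueError("pass either a full model or lower/upper models, not both")

    cmd = ["python3", "./classify_cryptotl.py", "-d", dataset]
    if model is not None:
        cmd += ["-m", model]
    else:
        # Omitted paths fall back to the script defaults from the table.
        if lower_model is not None:
            cmd += ["-l", lower_model]
        if upper_model is not None:
            cmd += ["-u", upper_model]
    if accurate_he:
        cmd.append("-a")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_classify_command("./datasets/Twitter-test-SBERT.csv")))
```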

Transfer Learning in Plain

python/tl_plain.py

| Option | Short | Value | Information | Default |
| --- | --- | --- | --- | --- |
| Epochs | -e | integer | Number of epochs to train | 300 |
| Splits | -s | integer | Splits for the K-fold Cross Validator | 10 |
| Source Dataset | -d | Path | Input path of the used source training dataset | "" |
| Target Dataset Train | -t | Path | Input path of the used target training dataset | "" |
| Target Dataset Test | -l | Path | Input path of the used target test dataset | "" |
| Model | -m | Path | Input/Output path for the full model | "" |
| Classify | -c | Flag | Optional: If present, load model instead of training | false |

Example:

python3 python/tl_plain.py -e 300 -s 10 -d ./datasets/IMDB-train-SBERT.csv -t ./datasets/Twitter-train-SBERT.csv -l ./datasets/Twitter-test-SBERT.csv # training
python3 python/tl_plain.py -l ./datasets/Twitter-test-SBERT.csv -m model -c # classify
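
The -s option sets the number of splits for K-fold cross-validation. To illustrate what that means, the sketch below partitions sample indices into k folds in plain Python, so that each fold serves once as the validation set. It is not the code used by the scripts, which may rely on a library implementation and may shuffle the data; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


if __name__ == "__main__":
    for fold, (train, validation) in enumerate(k_fold_indices(10, 3)):
        print(f"fold {fold}: train={train} validation={validation}")
```

With the default of 10 splits, each model is trained on nine tenths of the training data and validated on the remaining tenth, once per fold.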

Docker

To simplify building (without Intel HEXL), we have prepared a Dockerfile that installs all dependencies, builds SEAL, and downloads the preprocessed datasets. To use it, execute the following commands:

docker build -t cryptotl .
docker run -it cryptotl

Then proceed with the CryptoTL commands stated above. To use Intel HEXL, modify the Dockerfile accordingly.

Citing our work

Please use the following BibTeX entry to cite our work in academic papers.

@article{DBLP:journals/corr/abs-2205-11935,
  author    = {Roman Walch and
               Samuel Sousa and
               Lukas Helminger and
               Stefanie N. Lindstaedt and
               Christian Rechberger and
               Andreas Tr{\"{u}}gler},
  title     = {CryptoTL: Private, efficient and secure transfer learning},
  journal   = {CoRR},
  volume    = {abs/2205.11935},
  year      = {2022}
}
