Code examples

This repo gathers a number of benchmark datasets for binary classification and runs various algorithms on them.

Installation and notes

You will need an account with Kaggle. It's free.

It's recommended that you create a fresh virtual environment first. Then run these commands in a terminal.

pip install -e .
pip install -r requirements.txt

To download the datasets from Kaggle, create an API token there and expose it through environment variables. Edit the ".env.sample" file in the repo, add your credentials as shown below, then rename the file to ".env".

KAGGLE_USERNAME="your_username"
KAGGLE_KEY="your_key"
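
Before starting a long run, it can help to confirm that both credentials are actually set. The sketch below uses only the standard library and is not taken from this repo: `load_env_file` and `missing_kaggle_credentials` are hypothetical helper names, and the repo itself may load the ".env" file differently (for example with a dotenv package).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY="value" lines from a .env-style file into os.environ.

    Hypothetical helper for illustration; not part of this repo.
    """
    loaded = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding double or single quotes around the value.
        value = value.strip().strip('"').strip("'")
        os.environ[key] = value
        loaded[key] = value
    return loaded


def missing_kaggle_credentials(env=None):
    """Return the names of the Kaggle variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in ("KAGGLE_USERNAME", "KAGGLE_KEY") if not env.get(name)]
```

Calling `load_env_file()` and then `missing_kaggle_credentials()` from the repo root should return an empty list once ".env" is filled in; any name in the returned list still needs a value.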

Running the benchmarks

Run app.py to download all the benchmark datasets and run the various pipelines. The model outputs will be saved in the model_objects folder.
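
After a run finishes, a quick way to see what was written is to list the contents of the output folder. This is a minimal sketch, assuming the outputs are stored as regular files directly inside model_objects; the file format is not specified here, and `list_model_outputs` is a hypothetical helper name, not part of this repo.

```python
from pathlib import Path


def list_model_outputs(folder="model_objects"):
    """Return the sorted file names found directly inside the output folder.

    Hypothetical helper for illustration; returns an empty list if the
    folder does not exist yet (for example, before the first run).
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())
```

An empty list after running app.py suggests the run did not finish or wrote its outputs somewhere else.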