This repository is the official implementation of anonymous 2022.
The file main_run.py contains the necessary code to re-run the benchmark.
https://www.openml.org/search?type=data&status=active&id=1461
Bank Marketing. The data relates to direct marketing campaigns of a Portuguese banking institution. The classification goal is to predict whether the client will subscribe to a term deposit (variable y).
number of instances 45211 number of classes 2 number of features 17 number of numeric features 7
Model used : Random Forest
https://www.openml.org/search?type=data&status=active&id=1471
All data is from one continuous EEG measurement with the Emotiv EEG Neuroheadset. The eye state was detected via a camera during the EEG measurement and added later manually to the file after analyzing the video frames. '1' indicates the eye-closed and '0' the eye-open state.
number of instances 14980 number of features 15 number of classes 2
Model used : Multi Layer Perceptron
https://www.openml.org/search?type=data&status=active&id=1502
number of instances 245057 number of features 4 number of classes 2
Model used : Random Forest
https://www.openml.org/search?type=data&status=active&id=1590
The prediction task is to determine whether a person makes over 50K a year.
number of instances 48842 number of features 15 number of classes 2
Model used : Gradient Boosting Classifier
https://www.openml.org/search?type=data&status=active&id=40922
Class labels: "0" is walking, "1" is running.
number of instances 88588 number of features 7 number of classes 2
Model used : Random Forest
https://www.openml.org/search?type=data&status=active&id=41138
number of instances 76000 number of features 171 number of classes 2
Model used : Gradient Boosting Classifier
https://www.openml.org/search?type=data&status=active&id=41162
Predict whether a car purchased at auction is a kick (that is, whether the vehicle has serious issues that prevent it from being sold to customers).
number of instances 72983 number of features 33 number of classes 2
Model used : Gradient Boosting Classifier
https://www.openml.org/search?type=data&status=active&id=42395
Binary classification problems such as: is a customer satisfied?
number of instances 200000 number of features 202 number of classes 2
Model used : Gradient Boosting Classifier
https://www.openml.org/search?type=data&status=active&id=42803
Predict the sex of the driver in road accidents.
number of instances 363243 number of features 67 number of classes 3
Model used : Gradient Boosting Classifier
https://www.openml.org/search?type=data&status=active&id=43439
Is it possible to predict whether someone will fail to show up for an appointment?
number of instances 110527 number of features 13 number of classes 2
Model used : Gradient Boosting Classifier
https://www.openml.org/search?type=data&status=active&id=43551
number of instances 34452 number of features 10 number of classes 2
Model used : Gradient Boosting Classifier
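The OpenML entries above can be collected into a small lookup table, which may be convenient when iterating over the benchmark. This is only a sketch: the names OPENML_TASKS, openml_url and tasks_per_model are hypothetical and are not taken from main_run.py. The dataset ids and model names are copied from the list above, and no hyperparameters are implied.

```python
from collections import Counter

# OpenML dataset id -> model named in this README for that dataset.
# Hypothetical structure for illustration; main_run.py may organise this differently.
OPENML_TASKS = {
    1461: "Random Forest",
    1471: "Multi Layer Perceptron",
    1502: "Random Forest",
    1590: "Gradient Boosting Classifier",
    40922: "Random Forest",
    41138: "Gradient Boosting Classifier",
    41162: "Gradient Boosting Classifier",
    42395: "Gradient Boosting Classifier",
    42803: "Gradient Boosting Classifier",
    43439: "Gradient Boosting Classifier",
    43551: "Gradient Boosting Classifier",
}


def openml_url(data_id: int) -> str:
    """Build the OpenML search URL in the same form used in this README."""
    return f"https://www.openml.org/search?type=data&status=active&id={data_id}"


def tasks_per_model(tasks=None) -> Counter:
    """Count how many datasets are assigned to each model."""
    if tasks is None:
        tasks = OPENML_TASKS
    return Counter(tasks.values())


if __name__ == "__main__":
    for data_id, model in OPENML_TASKS.items():
        print(f"{openml_url(data_id)}  ->  {model}")
    print(dict(tasks_per_model()))
```

Keeping the ids and model names in one place makes it easier to check that every dataset in the list is covered when the benchmark is re-run.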
https://www.tensorflow.org/datasets/catalog/mnist
number of instances 60000 number of features 28 x 28 number of classes 10
Model used : Multi Layer Perceptron
https://www.tensorflow.org/datasets/catalog/cifar10
number of instances 60000 number of features 2048 number of classes 10
Model used : Multi Layer Perceptron
We take the embeddings generated by a ResNet50 pretrained on ImageNet.
We use embeddings generated with contrastive learning. See https://github.com/google-research/simclr.
https://www.tensorflow.org/datasets/catalog/cifar100
number of instances 60000 number of features 2048 number of classes 100
Model used : Multi Layer Perceptron
We take the embeddings generated by a ResNet50 pretrained on ImageNet.
We use embeddings generated with contrastive learning. See https://github.com/google-research/simclr.