To run the main program, install the following packages:
- NumPy
- pandas
- XGBoost
- scikit-learn
- CatBoost
- pickle (part of the Python standard library, no installation needed)
- matplotlib
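A quick way to confirm the environment is ready is to check that each package can be found before launching an experiment. The snippet below is a minimal sketch, not part of the repository: the helper name missing_modules is hypothetical, and the import names are assumed to be the usual ones for these packages (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names for the packages listed above.
# Assumption: these are the standard import names (scikit-learn -> "sklearn").
REQUIRED_MODULES = [
    "numpy",
    "pandas",
    "xgboost",
    "sklearn",
    "catboost",
    "pickle",
    "matplotlib",
]


def missing_modules(names):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

If anything is reported missing, install it with your usual package manager before running run.py.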
Use run.py to run experiments for both SGBDT and PSGBDT.
Run SGBDT with python run.py
Specify the dataset with --dataset_path <dataset name>
The currently supported datasets are:
- BankNote.csv (classification)
- wine.csv (classification)
- winequality-red.csv (regression)
- higgs_0.005.csv (classification)
- covat_0.3.csv (classification)
- insurance.csv (regression)
- kc_house_data.csv (regression)
The --model_type option must also be specified, according to the type of dataset:
- binary_cf for classification datasets
- regression for regression datasets
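The pairing between dataset and model type can be kept in a small lookup table so the right --model_type value is always passed. This is only an illustrative sketch: the table is copied from the lists above, the names MODEL_TYPE_BY_DATASET and model_type_for are hypothetical, and it assumes every dataset marked "classification" uses binary_cf.

```python
# Dataset -> --model_type value, copied from the lists above.
# Assumption: datasets marked "classification" use binary_cf.
MODEL_TYPE_BY_DATASET = {
    "BankNote.csv": "binary_cf",
    "wine.csv": "binary_cf",
    "winequality-red.csv": "regression",
    "higgs_0.005.csv": "binary_cf",
    "covat_0.3.csv": "binary_cf",
    "insurance.csv": "regression",
    "kc_house_data.csv": "regression",
}


def model_type_for(dataset_name):
    """Look up the --model_type value for a supported dataset file name."""
    try:
        return MODEL_TYPE_BY_DATASET[dataset_name]
    except KeyError:
        supported = ", ".join(sorted(MODEL_TYPE_BY_DATASET))
        raise ValueError(
            f"Unsupported dataset {dataset_name!r}; expected one of: {supported}"
        ) from None


if __name__ == "__main__":
    for name in MODEL_TYPE_BY_DATASET:
        print(name, "->", model_type_for(name))
```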
Run PSGBDT by additionally specifying --pretrain_file and --pretrain_type
The pretrain files are under the folder pretrain_models
The --pretrain_type option must be set to either xgb or skr
Run SGBDT:
python run.py --dataset_path kc_house_data.csv --model_type regression
Run PSGBDT:
python run.py --dataset_path kc_house_data.csv --model_type regression --pretrain_file kc_house.pkl --pretrain_type xgb
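When running many experiments, it may be convenient to assemble these command lines programmatically. The following is a minimal sketch under stated assumptions: build_command is a hypothetical helper (it is not part of run.py), and the checks it performs, such as requiring both pretrain options together, reflect the description above rather than the script's actual validation.

```python
import shlex


def build_command(dataset_path, model_type, pretrain_file=None, pretrain_type=None):
    """Build the argument list for an SGBDT run, or a PSGBDT run if pretrain options are given."""
    if model_type not in ("binary_cf", "regression"):
        raise ValueError("model_type must be 'binary_cf' or 'regression'")
    # Assumption: PSGBDT needs both pretrain options, SGBDT needs neither.
    if (pretrain_file is None) != (pretrain_type is None):
        raise ValueError("pretrain_file and pretrain_type must be given together")

    cmd = ["python", "run.py", "--dataset_path", dataset_path, "--model_type", model_type]
    if pretrain_file is not None:
        if pretrain_type not in ("xgb", "skr"):
            raise ValueError("pretrain_type must be 'xgb' or 'skr'")
        cmd += ["--pretrain_file", pretrain_file, "--pretrain_type", pretrain_type]
    return cmd


# Reproduce the two example commands shown above.
sgbdt = build_command("kc_house_data.csv", "regression")
psgbdt = build_command("kc_house_data.csv", "regression", "kc_house.pkl", "xgb")

if __name__ == "__main__":
    print(shlex.join(sgbdt))
    print(shlex.join(psgbdt))
```

The returned list can be passed directly to subprocess.run from the directory that contains run.py.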