An XGBoost-based approach to classifying highly sparse, sampling-biased photometric stellar data with extreme class imbalance, using data generated by PySSED.
- data/
  - random_10perc PySSED-generated labels and data for a ten percent SIMBAD sample
  - ms_augmented PySSED-generated labels and data for a ten percent SIMBAD sample, augmented with additional main sequence star data
- tuning_results_5foldcv_2000_iter_ms_augmented/ Contains outputs generated by running tuning_script.py with data from data/ms_augmented
- tuning_results_5foldcv_2000_iter_no_ms/ Contains outputs generated by running tuning_script.py with data from data/random_10perc
- tuning_script.py Script used to tune XGBoost model hyperparameters
- main.py Script used to load and train XGBoost models using parameters saved from running tuning_script.py
- create_imbalance_plots.py Used in sparsification experiments
- explore_commons.py Contains helper functions
- XGBoost_Weighted.py XGBoost model with additional parameters for tuning of class weights.
- requirements.txt Libraries and versions used to generate the experiment results.
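
For readers unfamiliar with class weighting under extreme imbalance, the sketch below shows one common heuristic: weighting each class inversely to its frequency. It is an illustration only and is not taken from XGBoost_Weighted.py; the function names and the toy labels are hypothetical, and the weights this repository actually tunes may be defined differently.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    This is the common "balanced" heuristic, assumed here for illustration;
    it is not necessarily the scheme used in this repository.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def per_sample_weights(labels, class_weights):
    """Expand class weights into one weight per training example."""
    return [class_weights[y] for y in labels]


# Toy labels (hypothetical): 8 main sequence stars and 2 stars of a rare class.
labels = ["MS"] * 8 + ["rare"] * 2
class_weights = balanced_class_weights(labels)
sample_weights = per_sample_weights(labels, class_weights)

print(class_weights)
print(sample_weights)
```

Per-sample weights of this kind can be passed to XGBoost through the `sample_weight` argument of `XGBClassifier.fit`, so that errors on rare classes count for more during training.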