This project looked at two Madelon datasets.

The first one contained 500 features and the second one contained over 6,000 features.

The purpose of this project was to find the correlation between those features and the target, which meant building a model to do so.

The first step was to create benchmark models to see how each type of model performed.

I used these models for my benchmarking: logistic regression, decision tree, k nearest neighbors, and support vector classifier.
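A minimal sketch of what such a benchmark could look like, assuming scikit-learn. The synthetic data from `make_classification` is only a stand-in for the Madelon data, and the default hyperparameters, the scaling step, and the train/test split are assumptions rather than the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Madelon data (assumption, not the real dataset).
X, y = make_classification(
    n_samples=600, n_features=50, n_informative=5, n_redundant=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four benchmark models, with mostly default settings.
benchmarks = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k nearest neighbors": KNeighborsClassifier(),
    "support vector classifier": SVC(),
}

# Fit each model in a scaling pipeline and record its test accuracy.
scores = {}
for name, model in benchmarks.items():
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train, y_train)
    scores[name] = pipe.score(X_test, y_test)

# Print the models from best to worst test accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Putting every model behind the same scaler and the same split keeps the comparison fair, since k nearest neighbors and the support vector classifier are sensitive to feature scale.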

The second step was to take the model that performed the best and use it to identify important features. I then tried to adjust the pipelines to improve the model even more.
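A rough sketch of how this second step could be done, assuming scikit-learn. `KNeighborsClassifier` is only a placeholder for whichever model scored best, and the synthetic data, the `SelectKBest` step, the parameter grid, and the use of permutation importance are all assumptions for illustration, not necessarily the approach taken here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.inspection import permutation_importance
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Madelon data (assumption, not the real dataset).
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=5, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Pipeline: scale, keep the k highest-scoring features, then classify.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),
    ("model", KNeighborsClassifier()),  # placeholder for the best benchmark model
])

# Hypothetical grid: tune the number of kept features and of neighbors.
param_grid = {"select__k": [5, 10, 30], "model__n_neighbors": [3, 5, 9]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X_train, y_train)

# Permutation importance: drop in test score when one feature is shuffled.
result = permutation_importance(
    search.best_estimator_, X_test, y_test, n_repeats=5, random_state=0
)
top_features = np.argsort(result.importances_mean)[::-1][:5]

print("best params:", search.best_params_)
print("top feature indices:", top_features)
```

Permutation importance works with any fitted model, which is why it is used in this sketch; a tree-based model could instead expose `feature_importances_` directly.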
