This repository performs a classification task on the Forest CoverType dataset from the UCI KDD archive. Dataset link: https://www.kaggle.com/uciml/forest-cover-type-dataset
dbaofd/spark_forest_cover_type_classification
++++++++++++++++++++File List++++++++++++++++++++
Forest_Cover_Data_Visualization.ipynb
Forest_Cover_Decision_Tree.ipynb
Forest_Cover_Decision_Tree_Cross_Validation.ipynb
Forest_Cover_Logistic_Regression.ipynb
Forest_Cover_Multilayer_Perceptron.ipynb
plot_tool.py
load_dataset.py
+++++++++++++++++++++++++++++++++++++++++++++++++
Forest_Cover_Data_Visualization.ipynb
Details: This notebook performs data visualization. You can also plot the data using the two PCA components, pca1 and pca2.
+++++++++++++++++++++++++++++++++++++++++++++++++
Forest_Cover_Decision_Tree.ipynb
Forest_Cover_Decision_Tree_Cross_Validation.ipynb
Details: These two notebooks are almost the same. The only difference is that the second one includes a line chart of the cross validation results (10-fold cross validation). You can use either notebook to train and evaluate a decision tree model.
+++++++++++++++++++++++++++++++++++++++++++++++++
Forest_Cover_Logistic_Regression.ipynb
Details: This notebook trains a logistic regression model to classify the dataset. You can train and evaluate the model. Its performance is similar to that of the decision tree.
+++++++++++++++++++++++++++++++++++++++++++++++++
Forest_Cover_Multilayer_Perceptron.ipynb
Details: This notebook trains a multilayer perceptron to classify the dataset. You can train and evaluate the model. Its performance is the worst among the three methods.
+++++++++++++++++++++++++++++++++++++++++++++++++
plot_tool.py
Details: This Python file provides two functions: a bar chart plot and a PCA chart plot.
+++++++++++++++++++++++++++++++++++++++++++++++++
load_dataset.py
Details: This Python file provides the functions needed to load the data when preparing the training set.
+++++++++++++++++++++++++++++++++++++++++++++++++
lrm_model8.model
lrm_model9.model
lrm_model10.model
Details: These three files are trained logistic regression models. You can load them to evaluate them or to make predictions.
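+++++++++++++++++++++++++++++++++++++++++++++++++
Example: 10-fold cross validation split
Details: The 10-fold cross validation mentioned for Forest_Cover_Decision_Tree_Cross_Validation.ipynb can be illustrated with a minimal, library-free sketch. This is not the code from the notebook. The function name `k_fold_indices`, the seed, and the toy row count are assumptions made for illustration only. The sketch only shows how rows are partitioned so that each row is used for testing exactly once; training and scoring a model on each fold is left out.

```python
import random


def k_fold_indices(n_rows, k=10, seed=42):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    indices = list(range(n_rows))
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Spread rows as evenly as possible: the first n_rows % k folds get one extra row.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Toy example: 25 rows split into 10 folds.
folds = list(k_fold_indices(n_rows=25, k=10))
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

In each round the model would be trained on `train_idx` and evaluated on `test_idx`, and the ten scores could then be plotted as a line chart like the one the notebook describes.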