Neural Network Charity Analysis

A Deep Learning Algorithm with TensorFlow

Project Overview

The purpose of this project is to apply neural network machine learning algorithms to analyze how effectively the foundation's funds and donations are used.

Results:

Data Preprocessing

  • The variable considered the target for the model:

After careful investigation of the dataset, the IS_SUCCESSFUL feature was chosen as the target variable. This is what our model predicts, using the rest of the features in the dataset.

  • The variables considered features for the model:

    • NAME * (kept as a feature in the optimized model only)
    • APPLICATION_TYPE
    • AFFILIATION
    • CLASSIFICATION
    • USE_CASE
    • ORGANIZATION
    • STATUS
    • INCOME_AMT
    • SPECIAL_CONSIDERATIONS
    • ASK_AMT
  • The variables that are neither targets nor features and were removed from the input data:

For the first model, EIN and NAME were removed from the dataset before any feature engineering. However, after compiling the model and reviewing the accuracy report, it turned out that keeping NAME as a feature improved the model's accuracy. Therefore, only EIN was removed from the input data for the optimized model.
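The preprocessing described above can be sketched with Pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows are made up, the helper `bin_rare` and its `min_count` cutoff are hypothetical, and the target column is assumed to be spelled `IS_SUCCESSFUL` in the CSV.

```python
import pandas as pd


def bin_rare(series: pd.Series, min_count: int, other: str = "Other") -> pd.Series:
    """Replace values that occur fewer than `min_count` times with `other`."""
    counts = series.value_counts()
    rare = counts[counts < min_count].index
    # Keep frequent values as they are; everything else becomes `other`.
    return series.where(~series.isin(rare), other)


# Tiny made-up sample standing in for charity_data.csv (illustrative values only).
df = pd.DataFrame({
    "EIN": [1, 2, 3, 4, 5, 6],
    "NAME": ["A FUND", "A FUND", "A FUND", "B TRUST", "C CLUB", "D ORG"],
    "APPLICATION_TYPE": ["T3", "T3", "T4", "T3", "T5", "T4"],
    "ASK_AMT": [5000, 5000, 10000, 5000, 7500, 5000],
    "IS_SUCCESSFUL": [1, 1, 0, 0, 1, 0],
})

# EIN is only an identifier, so it is neither a target nor a feature.
df = df.drop(columns=["EIN"])

# Bin infrequent NAME values instead of dropping the column (cutoff is hypothetical).
df["NAME"] = bin_rare(df["NAME"], min_count=2)

# Separate the target from the features and one-hot encode the categorical columns.
y = df["IS_SUCCESSFUL"]
X = pd.get_dummies(df.drop(columns=["IS_SUCCESSFUL"]))
```

Binning keeps the number of one-hot columns manageable: without it, every distinct organization name would become its own feature.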

Compiling, Training, and Evaluating the Model

| | Initial Model | Optimized Model |
|---|---|---|
| Number of neurons | 1st Layer = 80; 2nd Layer = 30 | 1st Layer = 90; 2nd Layer = 50; 3rd Layer = 30 |
| Activation function | 1st Layer = relu; 2nd Layer = relu; Output Layer = sigmoid | 1st Layer = tanh; 2nd Layer = relu; 3rd Layer = relu; Output Layer = sigmoid |
| Number of layers | 2 | 3 |
| Model definition | [Screen Shot 2022-06-10 at 10 46 29 AM] | [Screen Shot 2022-06-10 at 10 43 31 AM] |
| Model evaluation | [Screen Shot 2022-06-10 at 10 47 53 AM] | [Screen Shot 2022-06-10 at 10 48 22 AM] |
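Since the model definitions survive here only as screenshots, the following is a hedged reconstruction of the optimized architecture from the comparison above (90 tanh, 50 relu, 30 relu, 1 sigmoid). The loss, optimizer, and metric are assumptions (common choices for binary classification), and `n_features=40` is a placeholder, because the real input width depends on the one-hot encoding.

```python
import tensorflow as tf


def build_optimized_model(n_features: int) -> tf.keras.Model:
    """Build a network with the layer sizes and activations listed for the optimized model."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(90, activation="tanh"),
        tf.keras.layers.Dense(50, activation="relu"),
        tf.keras.layers.Dense(30, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Assumed settings: the source does not state the loss or optimizer.
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


# 40 is a placeholder for the number of encoded feature columns.
model = build_optimized_model(n_features=40)
n_params = model.count_params()
```

The initial model would follow the same pattern with two hidden layers of 80 and 30 relu units.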

Achieve the target model performance

After applying a variety of optimization steps, the model's accuracy increased from 72% to 75%.

Steps to try and increase model performance

  • 1- Used NAME as a feature and binned its values instead of removing it.
  • 2- Increased the number of neurons in each layer.
  • 3- Inserted a 3rd layer into the model, then removed it, as it brought no significant improvement and decreased the model performance. Since we are using a relatively small dataset, using more than 2 layers was not advisable.
  • 4- Changed the activation function of the 1st layer from relu to tanh. In my opinion, tanh, whose output is zero-centered and bounded between -1 and 1, suits the scaled input data better than relu.
  • 5- Left the number of epochs unchanged, as training for more epochs could easily overfit the model.
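To make the difference behind step 4 concrete, here is a small NumPy sketch comparing the two activation functions on the same inputs. It only illustrates their output ranges; it does not by itself show that either one gives better accuracy on this dataset.

```python
import numpy as np


def relu(x: np.ndarray) -> np.ndarray:
    """Rectified linear unit: negative inputs become 0, positive inputs pass through."""
    return np.maximum(0.0, x)


# Seven evenly spaced sample inputs from -3 to 3, symmetric around zero.
x = np.linspace(-3.0, 3.0, 7)

tanh_out = np.tanh(x)  # bounded in (-1, 1) and symmetric around zero
relu_out = relu(x)     # never negative, unbounded above
```

On symmetric inputs, the tanh outputs average to zero, while the relu outputs are shifted toward positive values.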

Summary

Only two models were created to reach the best accuracy for this project. A few simple steps can improve model performance quickly, such as adding more neurons to each layer, changing the activation functions, and re-engineering the features. In addition, the loss score is around 0.5, which is also within an acceptable range for the model. One of the fundamental issues when optimizing the model is overfitting. To avoid overfitting, the number of layers should be kept small for small datasets, and the number of training iterations (epochs) should be kept within a limited range.

Recommendation

Since the dataset is largely categorical, a Random Forest classifier may be a more efficient choice: it typically uses fewer computational resources and requires less code.
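As a rough sketch of what this recommendation could look like with Scikit-learn, the example below trains a Random Forest on synthetic binary features. The data is randomly generated as a stand-in for one-hot encoded columns, not the charity dataset, and the hyperparameters are assumptions rather than tuned values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for one-hot encoded features (NOT the charity data).
X = rng.integers(0, 2, size=(400, 8))
# Made-up label rule: positive when either of the first two columns is set.
y = X[:, 0] | X[:, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumed hyperparameters, chosen only for illustration.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Mean accuracy on the held-out split.
accuracy = clf.score(X_test, y_test)
```

Unlike the neural network, no layer sizes, activation functions, or epoch counts need to be chosen here, which is the "less coding" part of the recommendation.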

Resources

  • Dataset: charity_data.csv
  • Software/Languages: Jupyter Notebook (Google Colab), Python
  • Libraries: Scikit-learn, TensorFlow, Pandas, Matplotlib