A small neural network designed to predict whether or not individuals in the dataset are seeking a new job. (The main file is located in main.py, surprisingly enough.)
A project for Intro to Artificial Intelligence at UMKC, Spring 2021.

Resources I referenced:
- TensorFlow video playlist
- TensorFlow, pandas, and keras guides
- pandas type conversions
- pandas API guide
- Dividing up datasets
- keras and selecting activation and loss functions
- weights and initialization functions
Installing TensorFlow (tf) went smoothly when I followed the first guide referenced above. The next natural step was to follow a classmate's suggestion to use pandas (a Python library) to clean up the .csv file and turn it into a dataframe. It worked! No more useless strings, and the formatting was lovely. Then I used a guide from tf's website to set up a simple neural network, though it lacked most of the features required by this assignment.
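For the curious, loading the data really does come down to a couple of pandas calls. A minimal sketch (the file name and the choice to drop rows with missing values are my assumptions here; the real logic lives in main.py):

```python
import pandas as pd

# Hypothetical file name -- the actual path is in main.py.
df = pd.read_csv("job_data.csv")

print(df.dtypes)  # the "object" columns are the strings that need cleaning
df = df.dropna()  # assumption: simply drop rows with missing values
```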
Then I started looking closer at the output as my program executed each epoch. It was going slowly, which surprised me considering the specs of my PC (it's a gaming PC that I just upgraded with COVID money). I decided to take a look at my task manager while the program was executing, only to find that it was using my CPU rather than my GPU. I was not satisfied with this. I have a nice graphics card and I wanted to make use of it! It's never done anything aside from gaming, so I'd like to be able to say that I used it to run a neural network.
Long story short, I spent about two hours researching and troubleshooting, but I finally got it to detect my GPU correctly. Woohoo! Now back to the task at hand... the neural network.
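If you hit the same problem, a quick sanity check along these lines will confirm whether tf can see the GPU at all:

```python
import tensorflow as tf

# Should list at least one GPU once CUDA/cuDNN are installed correctly.
print(tf.config.list_physical_devices("GPU"))

# Optional: log which device each op lands on, to confirm the GPU is actually used.
tf.debugging.set_log_device_placement(True)
```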
Initially, I re-categorized all non-numerical values into ints. But I realized later that could cause some issues in how the network would interpret the data, so I decided to get my hands dirty. I rewrote the data-cleaning function (several times, actually), and this is what it did by the end (sketched in code after the list below):
- Rescale the training hours so all values fall between 0 and 1
- One-hot encode all remaining columns that were non-numeric
- Split up dataframe into training and testing
- Further separate targets from the two new subsets
- One-hot encode the target sets
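In code, roughly (a sketch under assumptions: the `training_hours` and `target` column names and the 80/20 split ratio are mine, not necessarily what main.py uses):

```python
import pandas as pd

def clean(df: pd.DataFrame):
    # 1. Rescale training hours into [0, 1] via min-max scaling.
    hours = df["training_hours"]
    df["training_hours"] = (hours - hours.min()) / (hours.max() - hours.min())

    # 2. One-hot encode every remaining non-numeric column.
    df = pd.get_dummies(df, columns=df.select_dtypes(include="object").columns)

    # 3. Split the dataframe into training and testing subsets (80/20 here).
    train = df.sample(frac=0.8, random_state=0)
    test = df.drop(train.index)

    # 4. Separate the targets from the two new subsets.
    train_y = train.pop("target")
    test_y = test.pop("target")

    # 5. One-hot encode the target sets (two columns, matching the output layer).
    return train, pd.get_dummies(train_y), test, pd.get_dummies(test_y)
```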
At this point I had a network set up with 2 hidden layers using ReLU activation and 1 linear output layer, all based on a simple tutorial on tf's website. The guide I used didn't divide up any of the data for training, testing, or validation, and didn't really explain much of anything. As such, I had to find my own path forward.
After many hours of research, coding, and trial & error, I got to a point where I understood most of the functions provided by tf and was starting to feel confident in my ability to make TensorFlow do my bidding. The more I came to understand the different puzzle pieces, the more I changed my code. In total, I probably wrote about 600 lines and kept about 40. The final result left me reasonably satisfied with my neural network. Here are the juicy details:
- 2 hidden layers, as recommended in the project prompt, with 20 and 10 neurons respectively
- ReLU activation in both hidden layers - widely recommended for classification problems
- He normal weight initialization - also recommended for classification problems, and pairs well with ReLU
I tried using more neurons in the hidden layers at first - I started with 64 and 32. I found that while my training accuracy was very high, around 94% after 300 epochs, the evaluation scores were much lower, around 70%. Clearly my model was overfitting. So I stepped down to 20 and 10 neurons, and the evaluation results were much better. I did not experiment much with the activation and initialization functions, as these seemed to work pretty well compared to my classmates' results.
- Softmax activation in the output layer
As described in the assignment prompt. One bit that confused me, though, was the assignment's requirement that the output layer have only 1 neuron. When using softmax with a single neuron, my testing accuracy was stuck around 25% (softmax over a single neuron always outputs 1, so the network can't actually discriminate between classes). So I came to the conclusion that, in order to use softmax, the output layer needed 2 neurons and the target set needed 2 columns. Otherwise the program would throw an exception or simply fail to reach a usable degree of accuracy.
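Putting that together, the final architecture looks roughly like this (the input width `n_features` is a placeholder; it depends on how many columns the one-hot encoding produces):

```python
import tensorflow as tf
from tensorflow import keras

n_features = 100  # placeholder: the width of the one-hot-encoded dataframe

model = keras.Sequential([
    keras.layers.Dense(20, activation="relu",
                       kernel_initializer="he_normal",
                       input_shape=(n_features,)),
    keras.layers.Dense(10, activation="relu",
                       kernel_initializer="he_normal"),
    # 2 output neurons to match the 2 one-hot target columns
    keras.layers.Dense(2, activation="softmax"),
])
```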
Once I was satisfied with the layers, I started playing around with the batch sizes and number of epochs. I tried anywhere from 30 to 400 epochs, and batch sizes of 32 to 512. I had the best results with larger batches, so I stayed around 128-256 and played around more with the number of epochs. Upon inspecting the validation accuracy, I noticed that it peaked around 30 epochs. Apparently I was way overfitting my network.
I was surprised I was able to get such quick results, but I went with it and dropped down to 30 epochs. That consistently got me the best test scores, around 77%. In general that isn't a great score for a neural network with supervised training... but by my understanding, this data was too limited anyway. So with that, I decided to wrap up the project and submit what you see in my repo.
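The training setup, sketched out (the loss and optimizer aren't pinned down above, so categorical crossentropy and Adam are my assumptions; they're the usual pairing for a one-hot softmax output):

```python
# Assumptions: Adam optimizer and categorical crossentropy loss,
# the standard pairing for a one-hot softmax output layer.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# 30 epochs and a batch size in the 128-256 range, per the tuning above.
history = model.fit(train, train_y,
                    epochs=30,
                    batch_size=256,
                    validation_split=0.2)  # assumption: validation carved from training data

# Evaluate on the held-out test set.
model.evaluate(test, test_y)
```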
Here's a snippet of the console output from the final training epoch:

```
Epoch 30/30
44/44 [==============================] - 0s 3ms/step - loss: 0.4439 - accuracy: 0.7874 - val_loss: 0.4847 - val_accuracy: 0.7548
```

And the output from running the test (evaluation) function:

```
180/180 [==============================] - 0s 1ms/step - loss: 0.4705 - accuracy: 0.7738
```
Thanks for reading!