David Wallach edited this page Jul 25, 2017 · 1 revision

What is Stocker

At its heart, Stocker is a machine learning project that investigates the correlation between stock prices and the articles that popular news sources publish about those stocks. While designing the project, I found that several of its components are more widely applicable and can be abstracted out for reuse. One example is the web scraper, which collects recently posted articles about different stocks from different news sources. Stocker therefore has many possible use cases, and I added flags to make it easier for third parties to tailor the project to their specific needs.

The Project

I began by scraping the internet for news articles about specific stocks and recording my findings in an Excel file. For each article I also recorded a classification based on the change in the stock price over the following 10-minute interval (I later tried other intervals and recorded those results in separate files). The classification was 1.0 if the stock price rose, -1.0 if it fell, 0.0 if it stayed the same, and -1000 if no price was found. Because I was using Google's API for minute-level stock prices, which only provided data for the previous 14 weekdays, most of the articles I found had no associated stock prices. I continued scraping the web until I had a suitable amount of training data (data with classifications not equal to -1000). I then used this data with the NLTK and Scikit-learn packages to develop a classifier that, when passed a new article, tells me whether I should short, go long, or do nothing.
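
The labeling scheme described above can be sketched roughly as follows. This is a minimal illustration, not Stocker's actual code: the function name `classify_article`, the `NOT_FOUND` constant, and the `prices` dictionary (minute-resolution timestamps mapped to prices) are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Sentinel from the text: the article has no usable price data.
NOT_FOUND = -1000.0


def classify_article(published, prices, interval_minutes=10):
    """Label an article by the price move over the following interval.

    `prices` is a hypothetical dict mapping minute-resolution datetimes
    to stock prices; the real project may store prices differently.
    """
    # Round the publication time down to the minute so it can match a key.
    start = published.replace(second=0, microsecond=0)
    end = start + timedelta(minutes=interval_minutes)

    # Without a price at both ends of the interval, no label is possible.
    if start not in prices or end not in prices:
        return NOT_FOUND

    change = prices[end] - prices[start]
    if change > 0:
        return 1.0   # price rose
    if change < 0:
        return -1.0  # price fell
    return 0.0       # price stayed the same


# Example with made-up prices.
sample_prices = {
    datetime(2017, 7, 25, 10, 0): 100.0,
    datetime(2017, 7, 25, 10, 10): 101.5,
}
label = classify_article(datetime(2017, 7, 25, 10, 0, 30), sample_prices)
```

Rows labeled `NOT_FOUND` would then be dropped before training, which matches the note above that only data with classifications not equal to -1000 was used.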
