This repository contains all the assignments, labs, and the final project of the Principles of Programming Languages elective course, taught in spring 2019 at Ben Gurion University, Israel. The course included 2 home assignments, 2 labs, and 1 final project.
We live in an industrialized, complex, and dangerous world. The topics that interest the public most are survival, politics, and extreme events that happened just around the corner. Unfortunately, most of this is bad news. It is reported through media such as TV, radio, and news sites, and most of it is frightening, sad, or stressful. As a result, good and pleasant news is displaced from our lives.
"We want to create a platform that filters out negative news and thus gives the user a pleasant experience of reading the news."
We had 2 main challenges in the project:
- Creating a website and retrieving news.
- Machine Learning & Sentiment Analysis.
- Client side: pure HTML
- Server side: Python, using the Flask library
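The client/server split above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `GOOD_NEWS` list, the `PAGE` template, and the view function name are hypothetical, and only the `/main` route is taken from the run instructions in this README.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Hypothetical in-memory list of already-filtered headlines; in the real
# project this would be filled from News API and the classifier.
GOOD_NEWS = [
    {"title": "Example positive headline", "url": "https://example.com/story"},
]

# Plain HTML template, in the spirit of the "pure HTML" client side.
PAGE = """<!doctype html>
<html>
  <body>
    <h1>Good News</h1>
    <ul>
      {% for item in news %}
      <li><a href="{{ item.url }}">{{ item.title }}</a></li>
      {% endfor %}
    </ul>
  </body>
</html>"""


@app.route("/main")
def main():
    # Render the page with the current list of headlines.
    return render_template_string(PAGE, news=GOOD_NEWS)
```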
To get news we used News API. This API provides headlines and URLs for news, in real time, from a given website (e.g. BBC, CNN, NYTimes). We used BeautifulSoup to extract the news content, which we then pass to our classifier.
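A rough sketch of this retrieval step, assuming the `newsapi-python` client and a valid API key. The function names, the `bbc-news` source id, and the "join all `<p>` tags" heuristic are illustrative assumptions, not necessarily what the project does.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def fetch_headlines(api_key, source="bbc-news"):
    """Return (title, url) pairs for the top headlines of one source."""
    # Imported here so the HTML helpers below work without newsapi-python.
    from newsapi import NewsApiClient

    client = NewsApiClient(api_key=api_key)
    response = client.get_top_headlines(sources=source)
    return [(a["title"], a["url"]) for a in response["articles"]]


def download(url):
    """Fetch a page and return its HTML as text."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_article_text(html):
    """Join the text of all <p> tags (an assumed, crude extraction rule)."""
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = (p.get_text(" ", strip=True) for p in soup.find_all("p"))
    return "\n".join(text for text in paragraphs if text)
```

With these helpers, each headline URL could be passed through `download` and `extract_article_text`, and the resulting text handed to the classifier.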
Our dataset is composed of 5000 negative sentences and 5000 positive sentences and can be found here.
After trying different approaches, such as removing stopwords, stemming, 2-grams, and counting positive and negative words (all of which led to bad results), we decided to use the 5000 most common words in our dataset as features. We used the NLTK Naive Bayes classifier and achieved 75% accuracy.
The project runs with Python 3.6 or later. After downloading the good news website folder, install the following libraries:
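The feature extraction described above can be sketched as follows. This is an illustration under assumptions: the regex tokenizer, the function names, and the boolean "word present" feature encoding are choices made here, not confirmed details of the project. Only `nltk.NaiveBayesClassifier.train` is the actual NLTK API.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def build_vocabulary(labeled_sentences, size=5000):
    """Return the `size` most common words across all sentences."""
    counts = Counter(
        word for sentence, _label in labeled_sentences for word in tokenize(sentence)
    )
    return [word for word, _count in counts.most_common(size)]


def extract_features(sentence, vocabulary):
    """Map each vocabulary word to True/False depending on whether it occurs."""
    words = set(tokenize(sentence))
    return {word: (word in words) for word in vocabulary}


def train(labeled_sentences, size=5000):
    """Train an NLTK Naive Bayes classifier on (sentence, label) pairs."""
    import nltk  # imported lazily so the helpers above work without NLTK

    vocabulary = build_vocabulary(labeled_sentences, size)
    featuresets = [
        (extract_features(sentence, vocabulary), label)
        for sentence, label in labeled_sentences
    ]
    return nltk.NaiveBayesClassifier.train(featuresets), vocabulary
```

Given `classifier, vocabulary = train(data)`, a new text could then be labeled with `classifier.classify(extract_features(text, vocabulary))`.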
pip install flask-socketio
pip install newsapi-python
pip install nltk
pip install beautifulsoup4
pip install flask
cd good_news_website
python -m flask run
Wait a few seconds until you see the line
*Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
Now the server is running and you can access it at http://127.0.0.1:5000/main
- Flask - The web framework used
- Flask-SocketIO - Used for bi-directional communication between the clients and the server
- BeautifulSoup - Used for extracting news content
- NLTK - Used for creating our classifier
- News API - Used for retrieving news headlines data
- Omri Attiya - initial work & client side - @github/omriattiya
- Shira Ezra - machine learning classifier - @github/shiraez
For more details see contributors.