Marlowe Objectivity Dataset

An objectivity/subjectivity dataset built from Reddit comments. Other data sources may be added later, but the current focus is on classifying the Reddit comments correctly first, since that dataset is already large.

  • Source data: the Reddit Comments May 2015 dataset on Kaggle: https://www.kaggle.com/reddit/reddit-comments-may-2015

  • parse_reddit_db.py looks for database.sqlite at a custom path. If you use this script to classify the comments, change the SQLite connection path to point to wherever you put your database file. The zip archive and the database file are large, so they are not included in this repository.
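
One way to avoid editing the script for every machine is to read the database location from the environment. The following is only a minimal sketch, not the code in parse_reddit_db.py: the `REDDIT_DB_PATH` variable and the helper names `resolve_db_path`, `open_db`, and `list_tables` are assumptions made for illustration. It lists the tables in the file rather than assuming a table name, so you can check the schema before querying.

```python
import os
import sqlite3
from pathlib import Path

# Hypothetical environment variable name; parse_reddit_db.py may hard-code its path instead.
DB_ENV_VAR = "REDDIT_DB_PATH"


def resolve_db_path(default="database.sqlite"):
    """Return the database path from the environment, or fall back to `default`."""
    return Path(os.environ.get(DB_ENV_VAR, default)).expanduser()


def open_db(path):
    """Open the SQLite file, failing early if it does not exist.

    sqlite3.connect() silently creates an empty database when the file is
    missing, so the check here prevents querying an empty file by mistake.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"SQLite database not found: {path}")
    return sqlite3.connect(str(path))


def list_tables(conn):
    """Return the table names in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Example usage (assumes the database file has already been downloaded):
#   conn = open_db(resolve_db_path())
#   print(list_tables(conn))
#   conn.close()
```

With this approach, setting `REDDIT_DB_PATH` to the location of the extracted database.sqlite is enough, and the large file can stay outside the repository.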