- Rohan Bruce
- rsb76@pitt.edu
- Updated 5/1/2022
This project is an exploratory analysis of slightly over 10,000 works of fanfiction that I personally mined from https://archiveofourown.org/. In it, I used clustering and topic modeling to examine the relationship between the fandom a work belongs to and the content of its text.
- final_report.md: My final report and reflection on the project as a whole
- progress_report.md: The three main progress reports I made throughout this project
- project_plan.md: My original project plan
- Fanfiction-Classification-Analysis.pdf: A PDF of my presentation slides (presented when the project was almost complete)
- LICENSE.md: My license for the work I have done on this project
- README.md: You are here! A directory and summary of this repository
- fanfic_spider.py: The program I used to collect my data
- fanfiction_data_parsing.ipynb: The Jupyter Notebook I used to clean and process my data. Here it is on Jupyter's nbviewer.
- fanfic_clustering.ipynb: The Jupyter Notebook I used to analyze my data. Here it is on Jupyter's nbviewer.
- data_samples: A folder containing samples of the data I collected, as well as the spider program I used to collect it
- images: A folder containing all of my output graphs and charts as images
My guestbook is here: Rohan's Guestbook. Feel free to stop by and leave feedback!