Home
Austin Cullar edited this page Oct 26, 2024 · 4 revisions
Project Astro is a project I started to investigate social media behavior with a more objective, analytical approach. Anyone can look at behavior online and feel that something is off, but I wanted to see whether the data supports that feeling.
Astro works by taking in a YouTube video URL, parsing the video ID from the URL string, and then using the YouTube Data API to pull information about the video and its comments, if any. Each comment is then fed into a sentiment analysis framework to quantify its positive or negative sentiment.
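As a rough illustration of the first step, here is a minimal sketch of pulling the video ID out of a URL using only Python's standard library. The function name `extract_video_id` and the URL shapes it accepts are assumptions made for this example, not Astro's actual code.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "music.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL form, or None if not found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if host in YOUTUBE_HOSTS:
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Path-based links: /shorts/<id>, /embed/<id>, /live/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
            return parts[1]

    return None
```

Returning `None` for anything unrecognized (including URLs without a scheme, which `urlparse` cannot split into a hostname) lets the caller decide whether to reject the input before making any API request.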
The ultimate goal of this data collection is to represent the data visually in graphs. Interesting relationships might include:
- How does the sentiment of a video's comment section compare to its like/dislike ratio?
- How does comment sentiment change over time?
- How does changing the video title/thumbnail affect engagement (difference in view/comment count)?
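
For the sentiment-over-time question, one possible approach is to bucket comments by publish date and average the scores per day. This sketch assumes each comment already has a numeric sentiment score (for example, somewhere between -1 and 1) and an ISO 8601 UTC timestamp such as the `publishedAt` values the YouTube Data API returns. `daily_mean_sentiment` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_mean_sentiment(comments):
    """Average sentiment per calendar day.

    `comments` is an iterable of (published_at, score) pairs, where
    published_at is an ISO 8601 string like "2024-10-26T01:00:00Z".
    """
    buckets = defaultdict(list)
    for published_at, score in comments:
        # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
        # so normalize it to an explicit UTC offset first.
        moment = datetime.fromisoformat(published_at.replace("Z", "+00:00"))
        buckets[moment.date().isoformat()].append(score)
    # Sort by day so the result can be plotted left to right.
    return {day: mean(scores) for day, scores in sorted(buckets.items())}
```

The resulting mapping of day to mean score could be handed to whatever plotting library ends up being used for the graphs.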
As of now, here are the milestones I'd like to hit:
- Tool should collect the following data when provided a YouTube video URL
  - Comments, including replies
  - Comment author (username)
  - Comment publish date
  - Video author
- Tool should commit all collected info to a database file
  - When run multiple times against the same video, the tool should append new comments to the existing database entry
- Appropriate code quality checks established and added to GitHub Workflow
  - Unit tests for all files (`pytest`)
  - Lint tests (`flake8`)
  - Test coverage report
- Setup instructions established in README
- Tool should collect video data such as
  - Like count (with a timestamp for tracking changes)
  - Comment count (with a timestamp for tracking changes)
  - Thumbnail changes, if any
- Tool should collect user information from the comments
  - This may be used in bot identification
- Tool should be able to display data in visual/graph form
  - Maybe an interactive mode to specify which data to model
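
The database milestones above (append only new comments on repeat runs, and keep timestamped like/comment counts) could be sketched as follows. This assumes SQLite as the "database file", which the milestones do not specify. The table layout, column names, and the function names `open_db`, `save_comments`, and `record_stats` are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per comment, one row per stats snapshot.
SCHEMA = """
CREATE TABLE IF NOT EXISTS comments (
    comment_id   TEXT PRIMARY KEY,
    video_id     TEXT NOT NULL,
    parent_id    TEXT,            -- set for replies, NULL for top-level comments
    author       TEXT,
    published_at TEXT,
    body         TEXT
);
CREATE TABLE IF NOT EXISTS video_stats (
    video_id      TEXT NOT NULL,
    captured_at   TEXT NOT NULL,  -- when this snapshot was taken
    like_count    INTEGER,
    comment_count INTEGER
);
"""


def open_db(path):
    """Open (or create) the database file and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_comments(conn, video_id, comments):
    """Store comments, silently skipping any comment ID that is already present."""
    rows = [
        (c["id"], video_id, c.get("parent_id"), c.get("author"),
         c.get("published_at"), c.get("text"))
        for c in comments
    ]
    # INSERT OR IGNORE makes repeat runs against the same video append-only.
    conn.executemany("INSERT OR IGNORE INTO comments VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


def record_stats(conn, video_id, like_count, comment_count, captured_at=None):
    """Add a timestamped snapshot so changes in counts can be tracked over time."""
    captured_at = captured_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO video_stats VALUES (?, ?, ?, ?)",
        (video_id, captured_at, like_count, comment_count),
    )
    conn.commit()
```

Keying comments on their ID means a second run only adds comments that were not seen before, while the stats table grows by one row per run so that like and comment counts can later be graphed against time.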