YouTube Text Classification Project

This project consists of two Jupyter Notebook files for scraping YouTube data (youtube_scraper.ipynb) and performing text classification (text_classification.ipynb). The scraped data is stored in youtube_data.csv.

Dependencies

Ensure you have the following dependencies installed to run the Jupyter Notebooks:

google-api-python-client: Used for interacting with the YouTube Data API.
pandas: Required for handling and manipulating data.
scikit-learn: Used for machine learning and text classification tasks.
matplotlib and seaborn: Used for visualizing evaluation metrics.
python-dotenv: Used for loading environment variables from a .env file.

Install the dependencies using the following command:

pip install google-api-python-client pandas scikit-learn matplotlib seaborn python-dotenv

Setting Up API Key and Virtual Environment

Create a .env file in the root directory of the project.
Add your YouTube Data API key to the .env file:
```
API_KEY=your_api_key_here
```

Ensure there are no spaces around the equal sign.

Create and activate the virtual environment. If your virtual environment is named yt_scrape, use the following commands:
```
python -m venv yt_scrape
source yt_scrape/bin/activate
```

On Windows, use yt_scrape\Scripts\activate

Running the Jupyter Notebooks

Open and run youtube_scraper.ipynb to scrape YouTube data and save it to youtube_data.csv.
Open and run text_classification.ipynb to perform text classification on the scraped data.

Model Evaluation Metrics

Display precision, recall, and F1 scores for each model:

| Model           | Precision | Recall | F1 Score |
| --------------- | --------- | ------ | -------- |
| SVM             |   0.90    |  0.89  |   0.89   |
| Random Forest   |   0.89    |  0.89  |   0.89   |
| Naive Bayes     |   0.85    |  0.81  |   0.80   |
| Neural Network  |   0.88    |  0.86  |   0.86   |

Confusion Matrices

Display confusion matrices for each model:

SVM Confusion Matrix:
Random Forest Confusion Matrix:
Naive Bayes Confusion Matrix:
Neural Network Confusion Matrix:

Author

Vaibhav Srivastava

GitHub: ZeusSama0001

License

This project is licensed under the MIT License. See the LICENSE.txt file for more details.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
images		images
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
README.md		README.md
text_classification.ipynb		text_classification.ipynb
youtube_data.csv		youtube_data.csv
youtube_scraper.ipynb		youtube_scraper.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

YouTube Text Classification Project

Dependencies

Setting Up API Key and Virtual Environment

Running the Jupyter Notebooks

Model Evaluation Metrics

Confusion Matrices

Author

License

About

Releases

Packages

Languages

License

ZeusSama0001/youtube_classification

Folders and files

Latest commit

History

Repository files navigation

YouTube Text Classification Project

Dependencies

Setting Up API Key and Virtual Environment

Running the Jupyter Notebooks

Model Evaluation Metrics

Confusion Matrices

Author

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages