This project demonstrates how to use the HuggingFace Transformers library to perform sentiment analysis on text data. Specifically, we classify reviews as either Positive or Negative using a pre-trained model.
This is a basic example of using a non-LLM approach (i.e., not ChatGPT or another large language model) to classify reviews. You can replace the provided data file with your own data for analysis.
We use the HuggingFace Transformers model DistilBERT base uncased finetuned SST-2, which includes both the tokenizer and the model, making it straightforward to use.
This model is fine-tuned on the SST-2 dataset for sentiment analysis, making it suitable for classifying text as either Positive or Negative.
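As a minimal sketch of how such a model can be loaded and applied, the snippet below uses the `pipeline` helper from Transformers. The Hub id `distilbert-base-uncased-finetuned-sst-2-english` is an assumption about which checkpoint the name above refers to, and `load_classifier`, `classify`, and `main` are hypothetical helper names, not necessarily what this project's script uses.

```python
from typing import Callable, List

# Assumption: this Hub id is the checkpoint meant by
# "DistilBERT base uncased finetuned SST-2".
MODEL_ID = "distilbert-base-uncased-finetuned-sst-2-english"


def load_classifier():
    """Build a sentiment pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of the module works without the library.
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=MODEL_ID)


def classify(texts: List[str], classifier: Callable) -> List[str]:
    """Return one label per input text.

    The pipeline returns a list of dicts with "label" and "score" keys;
    truncation=True asks the tokenizer to cut over-long reviews.
    """
    results = classifier(texts, truncation=True)
    return [result["label"] for result in results]


def main() -> None:
    """Classify two short example reviews and print their labels."""
    classifier = load_classifier()
    print(classify(["It's usable", "Worst update"], classifier))
```

Calling `main()` requires `transformers` and a backend such as PyTorch to be installed, plus network access for the first download.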
The data we analyze comes from the Kaggle dataset Top 20 Play Store App Reviews (Daily Update).
In particular, we use the Dropbox reviews from this dataset.
Feel free to replace this data file with your own dataset for analysis.
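If you swap in your own data, a loader along these lines may help. This is a sketch using only the standard library. It assumes a CSV file with a `content` column holding the review text (the column names `reviewId`, `content`, and `score` are taken from the sample output in this README), and `load_reviews` is a hypothetical helper name.

```python
import csv
import io
from typing import Dict, List


def load_reviews(handle, text_column: str = "content") -> List[Dict[str, str]]:
    """Read CSV rows and keep only those with non-empty review text."""
    reader = csv.DictReader(handle)
    return [row for row in reader if (row.get(text_column) or "").strip()]


# In-memory stand-in for a real file. The first two rows reuse reviews
# from this README's sample output; the third is a hypothetical row with
# empty text, included to show the filtering.
SAMPLE_CSV = io.StringIO(
    "reviewId,content,score\n"
    "db22256c-ecd0-4ba6-b9d7-bce23c21ccdc,It's usable,4\n"
    "279265a3-7114-4642-b8b3-06d379cfb683,better,4\n"
    "00000000-0000-0000-0000-000000000000,,3\n"
)

reviews = load_reviews(SAMPLE_CSV)
print(len(reviews))           # 2 (the empty review is dropped)
print(reviews[0]["content"])  # It's usable
```

For a file on disk, pass a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points at your own data file.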
To install the required dependencies, ensure you have Python installed and then run:
pip install -r requirements.txt
To run the example, type:
python main.py
| | reviewId | content | score | sentiment |
|---|---|---|---|---|
| 6320 | d1c16bb5-1322-4ba0-ad09-4ef98d94fc2a | Worst update, the offline files are hard to re... | 1 | NEGATIVE |
| 5564 | db22256c-ecd0-4ba6-b9d7-bce23c21ccdc | It's usable | 4 | POSITIVE |
| 5154 | a2d4fce3-ca82-408c-b646-22c949abff32 | I deleted all my drop box files to free up spa... | 2 | NEGATIVE |
| 8719 | 279265a3-7114-4642-b8b3-06d379cfb683 | better | 4 | POSITIVE |
| 9886 | 2a6ca7de-0e96-46d8-8460-fde56a348438 | fntk app | 1 | NEGATIVE |
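A table like the one above could be produced by attaching each predicted label to its review row. The sketch below shows one way to do that in batches. `add_sentiment` is a hypothetical helper, and `stub_classifier` is a keyword-based stand-in (not a real model) so the example runs offline. In practice you would pass a Transformers sentiment pipeline, which returns the same list-of-dicts shape.

```python
from typing import Callable, Dict, List


def add_sentiment(
    rows: List[Dict[str, str]], classifier: Callable, batch_size: int = 32
) -> List[Dict[str, str]]:
    """Return copies of the rows with a 'sentiment' key added."""
    annotated = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # The classifier is expected to return one {"label": ...} dict per text.
        results = classifier([row["content"] for row in batch])
        for row, result in zip(batch, results):
            annotated.append({**row, "sentiment": result["label"]})
    return annotated


def stub_classifier(texts: List[str]) -> List[Dict[str, object]]:
    """Hypothetical offline stand-in: flags texts containing 'worst'."""
    return [
        {"label": "NEGATIVE" if "worst" in text.lower() else "POSITIVE", "score": 1.0}
        for text in texts
    ]


# Two rows copied from the sample output (text truncated as displayed there).
rows = [
    {"reviewId": "d1c16bb5-1322-4ba0-ad09-4ef98d94fc2a",
     "content": "Worst update, the offline files are hard to re...", "score": "1"},
    {"reviewId": "db22256c-ecd0-4ba6-b9d7-bce23c21ccdc",
     "content": "It's usable", "score": "4"},
]

for row in add_sentiment(rows, stub_classifier):
    print(" | ".join(row[key] for key in ("reviewId", "content", "score", "sentiment")))
```

Batching keeps memory use bounded on large review files. The input rows are copied rather than modified, so the original data stays untouched.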