This repository contains a Python machine learning model that uses logistic regression to classify news articles as real or fake. The project uses a labeled Kaggle dataset of news articles and applies natural language processing (NLP) techniques to extract features from the text.
- Data Preprocessing: Cleans and prepares the dataset by removing stop words, stemming, and converting text to a numerical representation.
- Feature Extraction: Extracts relevant features from the text data, such as TF-IDF scores and word embeddings.
- Logistic Regression Model: Trains a logistic regression model on the extracted features to classify news articles as real or fake.
- Evaluation: Evaluates the model's performance using metrics like accuracy, precision, recall, and F1-score.
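The steps above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: the tiny corpus is made up, the `preprocess` helper is a hypothetical name, and scikit-learn's built-in English stop-word list stands in for whichever stop-word list the notebook uses. Word embeddings are left out; only TF-IDF features are shown.

```python
import re

from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words, and stem (hypothetical helper)."""
    tokens = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(t) for t in tokens if t not in ENGLISH_STOP_WORDS)


# Made-up toy corpus for illustration only (1 = fake, 0 = real).
texts = [
    "Shocking miracle cure doctors hate revealed",
    "Aliens secretly control the world government",
    "Celebrity clone spotted in secret bunker",
    "Miracle pill melts fat overnight, experts stunned",
    "City council approves budget for road repairs",
    "Central bank holds interest rates steady",
    "Local school opens new science laboratory",
    "Parliament passes updated data protection law",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Data preprocessing
cleaned = [preprocess(t) for t in texts]
X_train_txt, X_test_txt, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.25, stratify=labels, random_state=0
)

# Feature extraction: fit TF-IDF on the training split only, then reuse it
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_txt)
X_test = vectorizer.transform(X_test_txt)

# Logistic regression model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluation
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", zero_division=0
)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Fitting the vectorizer on the training split alone keeps vocabulary and IDF statistics from the test articles out of the model, so the reported metrics are not inflated by leakage. With a corpus this small the scores are meaningless; the sketch only shows how the pieces connect.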
- Clone the repository:
git clone https://github.com/bhaveshGhanchi/FakeNewsPrediction.git
- Prepare data: Download the Kaggle dataset and place it in the data directory.
- Run the model: Execute the FakeNewsPrediction.ipynb notebook to train and evaluate the model.
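Once the dataset is in place, loading it might look like the sketch below. The function name `load_articles`, the column names `text` and `label`, and the file name in the usage comment are all assumptions for illustration; check the downloaded Kaggle file for its actual name and columns.

```python
import pandas as pd


def load_articles(csv_path, text_col="text", label_col="label"):
    """Read a labeled news CSV and return (texts, labels).

    Column names are hypothetical defaults; adjust them to the real dataset.
    """
    df = pd.read_csv(csv_path)
    # Rows without a label cannot be used for supervised training.
    df = df.dropna(subset=[label_col]).copy()
    # Treat missing article text as an empty string rather than NaN.
    df[text_col] = df[text_col].fillna("")
    return df[text_col].tolist(), df[label_col].astype(int).tolist()


# Example usage (hypothetical file name inside the data directory):
# texts, labels = load_articles("data/train.csv")
```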
- Python
- NumPy
- Pandas
- Scikit-learn
- NLTK
Contributions to this project are welcome! Feel free to fork the repository and submit pull requests with your improvements or new features.
This project is licensed under the MIT License.