Author: Tushar Panwar
Fare-Wise is a machine learning project designed to predict taxi fares based on various features like distance, time, and location. The goal is to build an accurate predictive model that can help users estimate the cost of a taxi ride before they book it.
- Introduction
- Features
- Installation
- Usage
- Modeling Process
- Results
- Contributing
- License
- Acknowledgments
Taxi fares can be unpredictable, and this project aims to bring more transparency by developing a predictive model. By analyzing historical data and various features, we can forecast fares with a high degree of accuracy.
- Data Collection: Collects and preprocesses data from various sources.
- Feature Engineering: Creates new features such as distance, time of day, and traffic conditions.
- Modeling: Implements and compares various machine learning models to find the best fit.
- Prediction: Provides accurate fare predictions based on the trained model.
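The feature-engineering step above can be illustrated with a minimal sketch. It is not the project's actual code: the ride field names (pickup_lat, pickup_lon, dropoff_lat, dropoff_lon, pickup_time) and the rush-hour windows are assumptions chosen for illustration, and the rush-hour flag is only a crude stand-in for real traffic data.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def time_features(pickup_time):
    """Derive simple time-of-day features from a pickup datetime."""
    hour = pickup_time.hour
    is_weekday = pickup_time.weekday() < 5
    return {
        "hour": hour,
        "is_weekend": not is_weekday,
        # Hypothetical rush-hour windows; adjust to the city being modelled.
        "is_rush_hour": is_weekday and (7 <= hour < 10 or 16 <= hour < 19),
    }


def engineer_features(ride):
    """Build a feature dict from one ride record (hypothetical field names)."""
    features = {
        "distance_km": haversine_km(
            ride["pickup_lat"], ride["pickup_lon"],
            ride["dropoff_lat"], ride["dropoff_lon"],
        )
    }
    features.update(time_features(ride["pickup_time"]))
    return features
```

Straight-line distance is used here because it needs only coordinates. Actual driving distance would require a routing service, which this sketch does not assume.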
To run this project locally, follow these steps:
- Clone the repository:
git clone https://github.com/tusharpnwar/Taxi-Fare-Prediction-using-ML
- Navigate to the project directory:
cd Taxi-Fare-Prediction-using-ML
- Install the required packages:
pip install -r requirements.txt
- Prepare your dataset: ensure it is in the format expected by the model.
- Train the model:
python train_model.py --dataset_path /path/to/your/dataset.csv
- Make predictions:
python predict.py --input_data /path/to/input/data.csv
- View the results: they are written to a specified file or displayed in the terminal.
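Because the expected dataset format is not spelled out here, a quick header check before training can catch obvious problems early. The sketch below is an assumption-laden illustration: the column names in REQUIRED_COLUMNS are hypothetical placeholders, not the columns train_model.py actually requires, so replace them with the real ones.

```python
import csv

# Hypothetical column names; replace with the columns the training script expects.
REQUIRED_COLUMNS = {
    "pickup_datetime",
    "pickup_latitude",
    "pickup_longitude",
    "dropoff_latitude",
    "dropoff_longitude",
    "fare_amount",
}


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the sorted list of required columns absent from the CSV header."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return sorted(set(required) - present)


def check_dataset(path):
    """Raise ValueError if the CSV at `path` lacks any required column."""
    with open(path, newline="") as f:
        missing = missing_columns(f)
    if missing:
        raise ValueError(f"Dataset is missing columns: {', '.join(missing)}")
```

Calling check_dataset("/path/to/your/dataset.csv") before training either returns silently or reports which columns are absent.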
- Data Preprocessing: Handling missing values, encoding categorical variables, and scaling features.
- Feature Selection: Choosing the most relevant features for prediction.
- Model Training: Using algorithms like Linear Regression, Random Forest, and Gradient Boosting.
- Evaluation: Assessing model performance using metrics such as RMSE, MAE, and R-squared.
- Hyperparameter Tuning: Optimizing model parameters to improve accuracy.
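To make the training and evaluation steps concrete, here is a minimal sketch that fits a one-feature linear regression (fare against distance) by ordinary least squares and computes RMSE, MAE, and R-squared. It uses only the standard library and made-up sample numbers. It is not the project's implementation, which would more plausibly rely on a machine learning library for Random Forest and Gradient Boosting.

```python
import math


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average size of the error, in fare units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 is a perfect fit, 0 matches the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative, made-up data: trip distance (km) and fare.
    distances = [1.0, 2.5, 4.0, 6.0, 9.0]
    fares = [4.6, 7.1, 9.4, 13.5, 18.6]
    slope, intercept = fit_simple_linear(distances, fares)
    predictions = [slope * d + intercept for d in distances]
    print(f"RMSE={rmse(fares, predictions):.3f}",
          f"MAE={mae(fares, predictions):.3f}",
          f"R2={r_squared(fares, predictions):.3f}")
```

For RMSE and MAE lower values are better, while for R-squared higher is better. In practice these metrics should be computed on held-out data rather than on the training rows used in this demo.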
- Best Model: Gradient Boosting performed best, with an RMSE of 2.34 (lower is better).
- Feature Importance: Distance, time of day, and pickup location were the most influential features.
- Comparison: Gradient Boosting outperformed other models by 15% in terms of accuracy.
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes.
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature-branch).
- Open a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Dataset Source - The dataset used in this project.
- Special thanks to the OpenAI team for providing GPT technology.
- Inspiration and guidance from the Kaggle community.