This repository contains code and resources for predicting the prices of bulldozers using regression techniques. The goal is to build a predictive model that can estimate the price of bulldozers based on various features. This project utilizes the scikit-learn library for machine learning, along with other essential data science libraries such as NumPy, Pandas, and Matplotlib.
- Clone the repository (in cmd):
  git clone https://github.com/aryanrangapur/Bulldozer-price-prediction.git
- Navigate to the project directory (in cmd). The directory name matches the repository name, including its capital letter:
  cd Bulldozer-price-prediction
- Extract the training data from the zip file (in a Jupyter notebook):
  !unzip data/Train.zip -d data/
- Install the required dependencies (in cmd):
  pip install -r requirements.txt
- Open the bulldozers_prices_regression.ipynb notebook.
- Follow the step-by-step instructions in the notebook for data exploration, preprocessing, model training, and evaluation.
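
The `!unzip` step relies on an `unzip` executable being available, which may not be the case on every system. As a possible alternative, here is a minimal sketch that extracts the archive with Python's standard `zipfile` module. The helper name `extract_archive` is hypothetical and is not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    # Create the destination directory if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

From a notebook cell in the project directory, `extract_archive("data/Train.zip", "data")` should have the same effect as the `!unzip` command in the steps above.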
The dataset used for this project is available in the data directory. The dataset includes information about various bulldozers and their corresponding sale prices.
- Train.csv: Training dataset for model development (extracted from Train.zip).
- Valid.csv: Validation dataset for model evaluation.
- Test.csv: Test dataset for final predictions.
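
As a rough illustration of how these files might be loaded, the sketch below reads a CSV with Pandas and optionally parses one column as dates. The helper `load_sales` is hypothetical, and the column name `saledate` in the usage comment is an assumption about the dataset that this README does not confirm; check the actual headers before relying on it.

```python
import pandas as pd


def load_sales(source, date_column=None):
    """Read a CSV into a DataFrame; if date_column is given, parse it as dates and sort by it."""
    parse = [date_column] if date_column else False
    df = pd.read_csv(source, low_memory=False, parse_dates=parse)
    if date_column:
        # Sorting by date keeps the rows in chronological order.
        df = df.sort_values(date_column).reset_index(drop=True)
    return df


# Example call (the "saledate" column name is an assumption):
# train = load_sales("data/Train.csv", date_column="saledate")
```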
- Data Exploration and Preprocessing: Explore the dataset, handle missing values, and preprocess the features for model training.
- Model Training: Use regression techniques from scikit-learn to train a predictive model on the training dataset.
- Model Evaluation: Evaluate the model's performance on the validation dataset and make necessary adjustments to improve accuracy.
- Predictions: Make predictions on the test dataset using the trained model.
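
The workflow above could be sketched roughly as follows. This is not necessarily what the notebook does: the choice of `RandomForestRegressor`, the median and constant imputation, the ordinal encoding, and the helper names `build_model` and `evaluate` are all assumptions made for illustration. The sketch also assumes the feature table holds only numeric and text columns, so any date column would need to be converted to numeric parts or dropped first.

```python
import numpy as np
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OrdinalEncoder


def build_model(n_estimators=100, random_state=42):
    """Build a pipeline that imputes and encodes features, then fits a random forest."""
    # Numeric columns: replace missing values with the column median.
    numeric = SimpleImputer(strategy="median")
    # Text columns: fill gaps with a placeholder, then map each category to an integer.
    # Categories not seen during training are encoded as -1.
    categorical = make_pipeline(
        SimpleImputer(strategy="constant", fill_value="missing"),
        OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
    )
    preprocess = ColumnTransformer([
        ("num", numeric, make_column_selector(dtype_include=np.number)),
        ("cat", categorical, make_column_selector(dtype_exclude=np.number)),
    ])
    regressor = RandomForestRegressor(
        n_estimators=n_estimators, random_state=random_state
    )
    return Pipeline([("preprocess", preprocess), ("regressor", regressor)])


def evaluate(model, X_valid, y_valid):
    """Return mean absolute error and R^2 of a fitted model on held-out data."""
    preds = model.predict(X_valid)
    return {
        "MAE": mean_absolute_error(y_valid, preds),
        "R^2": r2_score(y_valid, preds),
    }
```

With a training feature table `X_train` and target `y_train` (hypothetical names), `build_model().fit(X_train, y_train)` trains the model, `evaluate(model, X_valid, y_valid)` scores it on the validation set, and `model.predict(X_test)` produces the final predictions. Keeping the preprocessing inside the pipeline means the imputation values and category encodings learned from the training data are reused unchanged on the validation and test data.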
- Scikit-learn
- NumPy
- Pandas
- Matplotlib
Feel free to contribute to this project. If you find any issues or have suggestions for improvement, please open an issue or submit a pull request.