Driving Precision: Predicting Vehicle Weight with Linear Regression

Overview

This project aims to build a predictive model for estimating a vehicle's curb weight using key features derived from exploratory data analysis (EDA). Accurate prediction of curb weight is crucial for manufacturers and analysts in vehicle design, safety compliance, and performance optimization.

Business Case

The curb weight of a vehicle significantly impacts fuel efficiency, handling, and safety compliance. Accurate weight predictions allow:

- Automotive manufacturers to optimize design and production.
- Performance analysts to simulate real-world scenarios.
- Safety engineers to ensure vehicles meet regulatory standards.

This project focuses on constructing and evaluating regression models to predict curb weight, helping stakeholders make informed design and manufacturing decisions.

Key Steps

1. Data Understanding & Preprocessing

Objective: Ensure the dataset is clean, complete, and ready for analysis.
Actions Taken:
- Identified missing values and replaced placeholders (e.g., '?') with NaN.
- Used mean imputation to fill missing values during model training.
- Split data into predictors (X) and the target variable (curb_weight).
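The cleaning and splitting steps above can be sketched roughly as follows. The tiny DataFrame is made up for illustration and is not the project dataset; only the column names and the '?' placeholder come from this README. Mean imputation is deliberately left to the modeling pipeline so that it is fitted during training.

```python
import numpy as np
import pandas as pd

# Illustrative values only; these rows are NOT taken from the project dataset.
raw = pd.DataFrame({
    "width": ["64.1", "65.5", "?", "66.2"],
    "engine_size": ["130", "?", "109", "136"],
    "curb_weight": [2548, 2823, 2337, 2824],
})

# Replace the '?' placeholder with NaN, then coerce every column to a numeric dtype.
clean = raw.replace("?", np.nan)
clean = clean.apply(pd.to_numeric)

# Separate the predictors (X) from the target variable (y).
X = clean.drop(columns=["curb_weight"])
y = clean["curb_weight"]

# Count the missing values per column that imputation will have to fill later.
print(clean.isna().sum())
```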

2. Exploratory Data Analysis (EDA)

Objective: Understand relationships between features and the target variable.
Key Insights:
- engine_size, horsepower, and width each showed a strong positive correlation with curb_weight.
- Visualization techniques like pairplots and boxplots helped uncover trends and potential outliers.
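One way to rank features by their correlation with the target is shown below. This is a minimal sketch on synthetic data (the `noise` column and all numeric values are invented for illustration), so the printed correlations say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; not the project dataset.
rng = np.random.default_rng(0)
n = 200
engine_size = rng.normal(130, 40, n)
df = pd.DataFrame({
    "engine_size": engine_size,
    "noise": rng.normal(0, 1, n),  # hypothetical column unrelated to the target
    "curb_weight": 1500 + 8 * engine_size + rng.normal(0, 100, n),
})

# Pearson correlation of each feature with the target, strongest first.
corr_with_target = (
    df.corr()["curb_weight"]
    .drop("curb_weight")
    .sort_values(ascending=False)
)
print(corr_with_target)
```

With seaborn installed, `sns.pairplot(df)` or `sns.boxplot(x=df["curb_weight"])` would give the visual counterparts mentioned above.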

3. Feature Selection

Based on EDA findings, selected two sets of features for regression models:
- Model 1: width, length, engine_size
- Model 2: wheel_base, horsepower, city_mpg

4. Model Construction

Built linear regression models using scikit-learn, employing a pipeline with:
- SimpleImputer for handling missing values.
- LinearRegression for predictive modeling.
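A pipeline of this shape might look like the sketch below. The helper name `make_model`, the step names, and the toy arrays are assumptions made for illustration; the actual code in the project notebook may differ.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline


def make_model():
    """Hypothetical helper: mean imputation followed by ordinary least squares."""
    return Pipeline([
        ("imputer", SimpleImputer(strategy="mean")),
        ("regressor", LinearRegression()),
    ])


# Toy data with one missing value; not the project dataset.
X = np.array([[1.0, 2.0], [2.0, np.nan], [3.0, 6.0], [4.0, 8.0]])
y = np.array([3.0, 6.0, 9.0, 12.0])

# Fitting learns the column means for imputation and the regression coefficients.
model = make_model().fit(X, y)
preds = model.predict(X)
print(preds)
```

Keeping the imputer inside the pipeline means the column means are learned only from the data passed to `fit`, which matters once cross-validation is used.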

5. Model Evaluation

Used 5-fold cross-validation to evaluate each model on held-out data.
Metrics:
- R² Score: Measures the proportion of variance in curb_weight explained by the model.
- Mean R²: The average of the R² scores across the five folds, computed for each model.
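The evaluation described above could be sketched as follows, using the two feature sets from the Feature Selection step. The data here is randomly generated purely so the example runs on its own, so the resulting scores are not meaningful and do not reproduce the project's results.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the dataset; every value below is invented.
rng = np.random.default_rng(42)
n = 120
df = pd.DataFrame({
    "width": rng.normal(66, 2, n),
    "length": rng.normal(174, 12, n),
    "engine_size": rng.normal(127, 40, n),
    "wheel_base": rng.normal(99, 6, n),
    "horsepower": rng.normal(104, 40, n),
    "city_mpg": rng.normal(25, 6, n),
})
df["curb_weight"] = (
    30 * df["width"] + 5 * df["length"] + 6 * df["engine_size"]
    + 3 * df["horsepower"] + rng.normal(0, 50, n)
)
df.loc[::15, "horsepower"] = np.nan  # simulate a few missing predictor values

feature_sets = {
    "model_1": ["width", "length", "engine_size"],
    "model_2": ["wheel_base", "horsepower", "city_mpg"],
}

mean_r2 = {}
for name, cols in feature_sets.items():
    # A fresh pipeline per model; imputation is refitted inside each fold.
    pipeline = make_pipeline(SimpleImputer(strategy="mean"), LinearRegression())
    scores = cross_val_score(pipeline, df[cols], df["curb_weight"], cv=5, scoring="r2")
    mean_r2[name] = scores.mean()

print(mean_r2)
```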

6. Results & Selection

Model 1 demonstrated a superior average R² score (0.893) compared to Model 2, making it the preferred choice.
Of the features that EDA flagged as strongly correlated with curb_weight (engine_size, horsepower, and width), Model 1 uses engine_size and width, together with length.

Business Recommendations

- Adopt Model 1: Use it to predict curb weight; it achieved the higher average cross-validated R² (0.893) and relies on explainable variables with clear business relevance.
- Monitor and Refine: Retrain the model regularly on new data to maintain performance as vehicle design trends change.
- Outlier Investigation: Examine the outliers observed during EDA to further improve model reliability.
- Advanced Techniques: Explore non-linear models (e.g., Random Forest or Polynomial Regression) for potential performance gains.

Next Steps

- Residual Analysis: Verify assumptions of linear regression and assess areas for improvement.
- Feature Engineering: Introduce interaction terms or polynomial features to capture non-linear relationships.
- Deployment: Build an interactive dashboard for business stakeholders to input variables and obtain curb weight predictions.
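As a rough illustration of the residual analysis and polynomial feature ideas, the sketch below fits a plain linear model and a degree-2 polynomial model to synthetic, deliberately curved data and compares their residuals. The data and the choice of degree 2 are assumptions for the example, not findings from this project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a quadratic relationship; not the project dataset.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * x[:, 0] ** 2 + rng.normal(0, 1, 200)

# Baseline: straight-line fit.
linear = LinearRegression().fit(x, y)

# Alternative: add squared terms before the same linear regression.
poly = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(x, y)

# Residuals = observed minus predicted; structure left in them hints at a missed pattern.
linear_resid = y - linear.predict(x)
poly_resid = y - poly.predict(x)

print("linear residual std:", linear_resid.std())
print("poly residual std:  ", poly_resid.std())
```

Plotting the residuals against the fitted values (for example with `matplotlib`) is the usual next step for checking linearity and constant variance.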

Project Structure

Driving-Precision/
│
├── data/
│   └── dataset.csv                # Dataset used for the project
├── notebooks/
│   ├── eda.ipynb                  # Exploratory Data Analysis notebook
│   └── model_construction.ipynb   # Regression model construction and evaluation
├── src/
│   ├── preprocess.py              # Data preprocessing scripts
│   └── train_model.py             # Model training pipeline
├── outputs/
│   ├── r2_scores.csv              # R² scores from cross-validation
│   └── model_1_summary.txt        # Model 1 performance summary
├── README.md                      # Project documentation
└── requirements.txt               # Python dependencies

Requirements

Python 3.8+
Libraries:
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
