This is a linear regression problem: we build multiple models and apply several feature selection techniques to find a model that explains the variance in the data well and predicts the number of bikes that will be rented in the future, using season, weather, month, year, and similar variables as inputs.
- Problem Statement: BoomBikes wants to understand the demand for shared bikes once the ongoing nationwide Covid-19 quarantine ends. The company plans to use this understanding to meet people's needs as the situation improves, stand out from other service providers, and make large profits.
- Solution proposed: We create a linear regression model to predict future bike rentals from the factors provided.
- The notebook in this repository builds three different models:
- Model 1: Linear Regression without any feature selection technique.
- Model 2: Linear Regression, using VIF and P-Values for feature selection.
- Model 3: Linear Regression, using RFE for feature selection.
- Based on our residual analysis, we chose Model 3 (see the final section of the notebook).
- Pandas
- Statsmodels
- Sklearn
- Seaborn
- Analytics Vidhya
- Medium.com
- KDnuggets
Created by Aditya Mishra - feel free to contact me!