Data Analysis of the 2018 Stack Overflow Developer Survey
- Data Cleaning and Analysis to get valuable insights
- Hypothesis Testing
- Predictions (linear regression and regularized regression)
- ML Techniques Implementation
  - PCA
  - Clustering (KMeans)
  - Classification (Random Forest, Support Vector Classifier with polynomial, linear, and RBF kernels, Logistic Regression)
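
The PCA, clustering, and classification steps listed above could be sketched roughly as follows. This is a minimal illustration, assuming scikit-learn is installed. It runs on synthetic data from `make_classification` rather than the survey itself, and the choices of two components, three clusters, and 5-fold cross-validation are arbitrary assumptions, not settings taken from the notebooks.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an encoded survey feature matrix (NOT the real data).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Standardize first: PCA, KMeans, and SVMs are all sensitive to feature scale.
X_scaled = StandardScaler().fit_transform(X)

# PCA: project onto the two directions of highest variance.
pca = PCA(n_components=2, random_state=0)
X_2d = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# KMeans: group samples in the reduced space (k=3 is an arbitrary choice).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X_2d)

# Classification: compare the listed model families with 5-fold cross-validation.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svc_poly": SVC(kernel="poly"),
    "svc_linear": SVC(kernel="linear"),
    "svc_rbf": SVC(kernel="rbf"),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
scores = {
    name: cross_val_score(make_pipeline(StandardScaler(), model), X, y, cv=5).mean()
    for name, model in models.items()
}

print(f"Variance explained by 2 components: {explained:.2%}")
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:>20}: {score:.3f}")
```

Wrapping each classifier in a pipeline with `StandardScaler` keeps the scaling inside each cross-validation fold, so the held-out fold does not leak into the fitted scaler.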
Setup requires Anaconda or pip for environment management.

Using Anaconda:
git clone https://github.com/akashsky1994/stack-overflow-developer-survey-analysis.git
cd stack-overflow-developer-survey-analysis
conda env create -f stack-overflow-developer-survey-analysis.yml
conda activate python-yelp-sentiment-analysis
jupyter notebook

Using pip:
git clone https://github.com/akashsky1994/stack-overflow-developer-survey-analysis.git
cd stack-overflow-developer-survey-analysis
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
jupyter notebook

Dataset: https://www.kaggle.com/stackoverflow/stack-overflow-2018-developer-survey
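
Once the dataset is downloaded, a first cleaning step is often to split multi-select answers into individual options. The sketch below uses only the Python standard library and a tiny made-up sample. It assumes the survey stores multi-select answers as semicolon-separated strings in columns such as `LanguageWorkedWith` and marks unanswered questions as `NA`; the helper name `count_multiselect` is hypothetical and not part of this repository.

```python
import csv
import io
from collections import Counter


def count_multiselect(rows, column, sep=";"):
    """Count each option in a multi-select column across all respondents."""
    counts = Counter()
    for row in rows:
        value = (row.get(column) or "").strip()
        if not value or value == "NA":
            continue  # skip respondents who left the question unanswered
        counts.update(part.strip() for part in value.split(sep))
    return counts


# Tiny made-up sample in the assumed layout; these are not real survey responses.
sample = io.StringIO(
    "Respondent,LanguageWorkedWith\n"
    "1,Python;SQL\n"
    "2,NA\n"
    "3,JavaScript;Python\n"
)
language_counts = count_multiselect(csv.DictReader(sample), "LanguageWorkedWith")
print(language_counts.most_common(1))  # prints [('Python', 2)]
```

For the real data, the same function can be fed `csv.DictReader(open(path, newline="", encoding="utf-8"))`, where `path` points to the downloaded responses file (assumed here to be `survey_results_public.csv`).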