Analysis of differences between Data Analysts, Data Scientists and Data Engineers based on the Stack Overflow Developer Survey 2020 data.

felipe-takaoka/data-jobs-idiosyncrasies

Data Professionals Idiosyncrasies

Installation

A standard installation of the Anaconda distribution of Python 3 is sufficient to run the notebook. The only modules imported are NumPy, pandas and seaborn.

Project Motivation

This project analyzes the subjective differences between Data Analysts, Data Scientists and Data Engineers, as reported in the 2020 Stack Overflow Developer Survey.

File Descriptions

  • Analysis.ipynb is the main notebook used for the analysis
  • assets/ contains the files for the charts: charts.pptx is the PowerPoint file where the plots were made, and the charts/ folder contains the exported .png images.
  • data/ is the folder containing all the files downloaded from the 2020 Stack Overflow Developer Survey. survey_results_public.csv contains the survey results for the public questions, and survey_results_schema.csv maps each column name to the question posed in the survey.
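
As a rough illustration of how the two CSV files in data/ might be used, the sketch below loads them with pandas and flags respondents by role. It is not taken from Analysis.ipynb: the helper names load_survey and flag_roles are hypothetical, and the DevType column, its ";" separator and the three role labels are assumptions about the survey file that should be checked against the data.

```python
import pandas as pd

# Assumed DevType labels for the three roles; verify against the survey data.
ROLES = {
    "Data Analyst": "Data or business analyst",
    "Data Scientist": "Data scientist or machine learning specialist",
    "Data Engineer": "Engineer, data",
}


def load_survey(data_dir="data"):
    """Read the survey results and the schema from the data/ folder."""
    results = pd.read_csv(f"{data_dir}/survey_results_public.csv")
    schema = pd.read_csv(f"{data_dir}/survey_results_schema.csv")
    return results, schema


def flag_roles(results, column="DevType"):
    """Return a copy of results with one boolean column per role.

    Multi-select answers are assumed to be separated by ';'.
    Respondents with no answer are flagged False for every role.
    """
    selected = results[column].fillna("").apply(
        lambda answer: [part.strip() for part in answer.split(";")]
    )
    flagged = results.copy()
    for role, label in ROLES.items():
        flagged[role] = selected.apply(lambda parts: label in parts)
    return flagged
```

With the files in place, `results, schema = load_survey()` followed by `flag_roles(results)` would give a frame in which a respondent can belong to more than one role, since the survey question allows multiple selections.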

Results

The main findings are presented in my Medium blog post.

Acknowledgements

Credit goes to Stack Overflow for making the survey data available. The Public 2020 Stack Overflow Developer Survey Results is made available under the Open Database License (ODbL). Any rights in individual contents of the database are licensed under the Database Contents License.