Dirty Data project

Introduction

As part of the Data Analysis course at CodeClan, I was asked to complete at least 2 of the 6 tasks in the dirty data project (task 4 was mandatory). The aim of the project was to practise data cleaning skills, since it is often said that

80% of time in data science and analysis is spent on data cleaning.

Project structure

The whole project was built in RStudio. Each task was expected to have 4 folders:

raw_data
data_cleaning_scripts
clean_data
documentation_and_analysis

The analysis for each task, with comments, can be found in that task's analysis folder. The cleaning script is kept in a separate folder (data_cleaning_scripts) because it could potentially be rerun on other raw data with a similar structure and contents.

Analysis Folder    Task
Task 1             Decathlon Results
Task 2             Cake Ingredients
Task 3             Seabird Sightings
Task 4             Halloween Candy Survey
Task 5             Right Wing Authoritarianism Survey
Task 6             Dog Survey

Packages

Package      Version
assertr      2.7
janitor      2.0.1
tidyverse    1.3.0
readxl       1.3.1
plyr         1.8.6
stringr      1.4.0