This project is an example of how data engineering can be used to create a full-fledged report from raw data.

haspdecrypted/Covid-Data-Project

Covid-Data-Project


This project analyses data from a Covid dataset.

The goal is to take the daily Covid data received from the GOI website and transform it into a report in the following format:


| Date | State | Confirmed | Recovered | Deceased | Total tested |
|------|-------|-----------|-----------|----------|--------------|
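As a rough illustration of the target format, the sketch below models one report row and renders rows as CSV. The names `ReportRow`, `REPORT_HEADER`, and `render_report` are hypothetical and do not come from this repository. CSV output is also an assumption; the project itself may store the report in a Hive table instead.

```python
import csv
import io
from dataclasses import dataclass, astuple

# Column order taken from the report format described above.
REPORT_HEADER = ["Date", "State", "Confirmed", "Recovered", "Deceased", "Total tested"]


@dataclass
class ReportRow:
    """One line of the final report (hypothetical helper type)."""
    date: str
    state: str
    confirmed: int
    recovered: int
    deceased: int
    total_tested: int


def render_report(rows):
    """Render report rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(REPORT_HEADER)
    for row in rows:
        # astuple keeps the field order declared on the dataclass.
        writer.writerow(astuple(row))
    return buffer.getvalue()
```

Keeping the header in a single constant means the column order is defined in one place, whichever storage backend is used.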

Technology stack: Hadoop, Hive, and Spark


Data Pipeline:


[Figure: covid19 dp (data pipeline diagram)]


Mapping: The table below shows, for each report column, which source file and source field supply the data and which rule is applied.


| Report Fields | Source file | Source field | Rule |
|---------------|-------------|--------------|------|
| Date | Raw Data 25,26,27,28 | Date Announced | Directly |
| State | Raw Data 25,26,27,28 | Detected State | Directly |
| Confirmed | Raw Data 25,26,27,28 | Num Cases | Aggregated on state |
| Recovered | Raw Data 25,26,27,28 | Num Cases | Aggregated on state |
| Deceased | Raw Data 25,26,27,28 | Num Cases | Aggregated on state |
| Total tested | Raw Data 25,26,27,28 | Total Tested | Convert cumulative to daily count |
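The two non-trivial rules in the mapping ("Aggregated on state" and "Convert cumulative to daily count") can be sketched in plain Python as below. This is a minimal illustration, not the project's Hive or Spark code. The function names are hypothetical. The `current_status` column and its values (`Hospitalized`, `Recovered`, `Deceased`) are also assumptions, since the mapping only names Date Announced, Detected State, and Num Cases, and some status-like field is needed to split Num Cases into three report columns.

```python
from collections import defaultdict

# Assumed mapping from a hypothetical "current_status" raw column
# to the report column that its Num Cases value feeds.
STATUS_TO_COLUMN = {
    "Hospitalized": "confirmed",
    "Recovered": "recovered",
    "Deceased": "deceased",
}


def aggregate_cases(raw_rows):
    """Sum Num Cases per (date, state), one total per status column.

    raw_rows: iterable of (date_announced, detected_state,
    current_status, num_cases) tuples.
    """
    totals = defaultdict(lambda: {"confirmed": 0, "recovered": 0, "deceased": 0})
    for date, state, status, num_cases in raw_rows:
        column = STATUS_TO_COLUMN.get(status)
        if column is None:
            continue  # ignore statuses this sketch does not model
        totals[(date, state)][column] += num_cases
    return dict(totals)


def cumulative_to_daily(cumulative_by_date):
    """Turn a date-sorted list of (date, cumulative_total) into daily counts.

    The first entry has no predecessor, so its daily count is taken
    to be its cumulative value.
    """
    daily = []
    previous = 0
    for date, cumulative in cumulative_by_date:
        daily.append((date, cumulative - previous))
        previous = cumulative
    return daily
```

In Spark, the same ideas would typically be expressed as a group-by with a sum for the aggregation, and a window function comparing each row with the previous one (per state, ordered by date) for the daily count.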
