Exploring coronavirus pandemic data for Spain.
The purpose of this project is to bring clarity to the large volume of data we are exposed to daily during the pandemic.
To reach this goal, we will first create an interactive dashboard showing the live daily status by community. In addition, some relevant indicators will be plotted to monitor historical trends. The second step will be to use machine learning and deep learning to model the data and predict the evolution of the pandemic.
The project build will be explained step by step in a series of blog articles on my personal website. For more information, please check the Project Proposal.
Final comments are available in the Project End Document.
- Exploratory Data Analysis
- Data Visualization
- Dashboard building
- Machine Learning
- Random Forest Regression
- Gradient Boosting
- Deep Learning
- ARIMA
- Moving Averages
- Exponential Smoothing
- Double exponential smoothing
- DeepAR
- Facebook Prophet
- Python
- Dash
- pandas, matplotlib, plotly, seaborn, numpy
- Jupyter Notebook
- AWS for model training, deployment and batch transforming
- Heroku for dashboard deployment
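As an illustration of the simpler forecasting methods listed above (moving averages, exponential smoothing and double exponential smoothing), here is a minimal pure-Python sketch. It is not the project's actual implementation: the function names, the initialization choices (first value as the starting level, first difference as the starting trend) and the sample numbers are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def moving_average(values: List[float], window: int) -> List[Optional[float]]:
    """Trailing simple moving average; None until a full window is available."""
    if window < 1:
        raise ValueError("window must be >= 1")
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def exponential_smoothing(values: List[float], alpha: float) -> List[float]:
    """Simple exponential smoothing: s[t] = alpha*x[t] + (1-alpha)*s[t-1]."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [values[0]]  # assumption: start from the first observation
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


def double_exponential_smoothing(
    values: List[float], alpha: float, beta: float
) -> Tuple[List[float], float]:
    """Holt's linear method: returns the smoothed levels and a one-step forecast."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    level = values[0]
    trend = values[1] - values[0]  # assumption: initial trend = first difference
    levels = [level]
    for v in values[1:]:
        prev_level = level
        level = alpha * v + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        levels.append(level)
    return levels, level + trend


if __name__ == "__main__":
    # Made-up daily counts, for illustration only (not real data).
    daily_cases = [10.0, 12.0, 15.0, 11.0, 14.0, 18.0, 20.0]
    print(moving_average(daily_cases, window=3))
    print(exponential_smoothing(daily_cases, alpha=0.5))
    print(double_exponential_smoothing(daily_cases, alpha=0.5, beta=0.5))
```

In practice the same smoothers are available through pandas (`rolling(...).mean()` and `ewm(...).mean()`); the hand-written versions are shown only to make the update rules explicit.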
We will build our project on the datadista COVID 19 Spain data repository. For map visualization we will need GeoJSON data of Spain, found in this repository.
The roadmap of the project is going to be:
- Exploratory Data Analysis
- Web dashboard development
- Feature Engineering and dimensionality reduction if needed.
- Modeling
- Evaluation and parameter tuning.
- Deployment.
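The evaluation step of the roadmap above is not detailed here. One common approach for time series, sketched below as an assumption rather than as the project's actual protocol, is to hold out the most recent observations, compare a model against a naive "last value" baseline, and score both with the mean absolute error. All function names and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], test_size: int
) -> Tuple[List[float], List[float]]:
    """Split a time series without shuffling: the last test_size points are the test set."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return list(series[:-test_size]), list(series[-test_size:])


def naive_forecast(train: Sequence[float], horizon: int) -> List[float]:
    """Baseline forecast: repeat the last observed value for every future step."""
    return [train[-1]] * horizon


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Made-up values, for illustration only (not real data).
    series = [1.0, 2.0, 3.0, 4.0, 5.0]
    train, test = chronological_split(series, test_size=2)
    baseline = naive_forecast(train, horizon=len(test))
    print(mean_absolute_error(test, baseline))
```

A model is only worth tuning further if it beats this kind of baseline on the held-out period.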
- Clone this repo (for help see this tutorial).
- Install requirements
pip install -r requirements.txt
- Launch Dash application
python .\index.py
- Go to http://127.0.0.1:8050/ to see the dashboard
- For Jupyter Notebook, navigate to the notebooks folder and then run
jupyter notebook
Project owner (Contact): Juanlu RG
- Feel free to contact the repository owner by mail at juanlu.rgarcia @ gmail.com