Visualizations of the novel coronavirus using data science and machine learning techniques. Please feel free to contribute by sending issues or pull requests.
Automatic daily updates to this repo have stopped.
There have been many changes to the dataset, so the data may look incorrect or inconsistent.
To download, please use a shallow clone to improve performance:
git clone --depth 1 git@github.com:briancpark/COVID-19-Visualizations.git
This project uses Pandas, NumPy, Matplotlib, GeoPandas, Descartes, Plotly, and Selenium. All the code needed to run it is in COVID19 Visualizations.ipynb. Please make sure you have installed all the Python libraries before you run the code. Also install ffmpeg if you want to compile the graphics into video.
The NYTimes database was used for United States data, and the JHU CSSE database was used for international data. The repository is updated twice daily, as the two databases update around 12 hours apart. The notebook includes UNIX commands, so updating the visualizations only takes a restart and rerun of the kernel.
Displays the graph for a given country and type of data (confirmed, deaths, or recovered).
Displays the graphs of all the types of data for a given country.
Displays the graph of active COVID-19 cases for a given country, calculated as active = confirmed - deaths - recovered.
Displays the graphs for a given list of countries, overlaid on one plot so their statistics can be compared.
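The active-case formula above can be illustrated with a minimal Pandas sketch. The function name `active_cases` and the sample numbers are assumptions made for illustration only; they are not taken from the notebook or from either dataset.

```python
import pandas as pd


def active_cases(confirmed: pd.Series, deaths: pd.Series, recovered: pd.Series) -> pd.Series:
    """Return active cases per date: confirmed - deaths - recovered."""
    return confirmed - deaths - recovered


# Made-up cumulative counts for illustration only (not real data).
dates = pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"])
confirmed = pd.Series([100, 150, 210], index=dates)
deaths = pd.Series([2, 3, 5], index=dates)
recovered = pd.Series([10, 20, 40], index=dates)

# Subtraction aligns the three series on their shared date index.
active = active_cases(confirmed, deaths, recovered)
print(active)
```

Because the three series share one date index, the subtraction is applied date by date, which is what a per-country active-cases curve needs.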
Updates/overwrites all the graphs by country and data type (confirmed, deaths, recovered) in the cases_country_individual/ directory.
Updates/overwrites all the graphs by country and all data types in the cases_country/ directory.
Updates/overwrites all the graphs of active cases by country in the cases_country_active/ directory.
Updates/overwrites the graph of worldwide COVID-19 cases. Saved in the main directory as COVID19_worldwide.png
Updates/overwrites the graph of worldwide active COVID-19 cases. Saved in the main directory as COVID19_worldwide_active.png
These functions utilize the GeoPandas library to visualize COVID-19 cases on the map.
Uses ffmpeg to compile the graphics into video and GIF formats.
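As a rough sketch of the ffmpeg step, the snippet below builds an ffmpeg command from Python and runs it with `subprocess`. The helper names `build_ffmpeg_command` and `compile_video`, the frame naming pattern `frames/frame_%04d.png`, and the frame rate are all assumptions for illustration; the notebook may invoke ffmpeg differently.

```python
import subprocess
from typing import List


def build_ffmpeg_command(frame_pattern: str, output_path: str, fps: int = 10) -> List[str]:
    """Build an ffmpeg command that turns numbered image frames into a video or GIF."""
    cmd = [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-framerate", str(fps),  # input frame rate for the image sequence
        "-i", frame_pattern,     # printf-style pattern, e.g. frame_0001.png, frame_0002.png
    ]
    if output_path.endswith(".mp4"):
        # yuv420p keeps the MP4 playable in most players.
        cmd += ["-pix_fmt", "yuv420p"]
    cmd.append(output_path)
    return cmd


def compile_video(frame_pattern: str, output_path: str, fps: int = 10) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(frame_pattern, output_path, fps), check=True)


# Hypothetical frame pattern and output names, shown without running ffmpeg.
print(" ".join(build_ffmpeg_command("frames/frame_%04d.png", "timelapse.mp4", fps=5)))
print(" ".join(build_ffmpeg_command("frames/frame_%04d.png", "timelapse.gif", fps=5)))
```

Calling `compile_video(...)` with the same arguments would actually run ffmpeg, so it requires ffmpeg on the PATH and frames that match the pattern.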
I used the dataset provided by the NYTimes. Although the dataset provided by JHU CSSE includes international data, the NYTimes dataset has more specific metadata that is useful for analyzing United States data, such as coronavirus cases by state and city. COVID-19 cases are rising dangerously high in the United States at the time of writing. The NYTimes has already displayed useful statistics with its own database, but I decided to take it one step further and add a time dimension.
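One way to add the time dimension is to reshape the state-level table so that each row is a date and each column is a state. The sketch below assumes the column layout of the NYTimes us-states.csv file (date, state, fips, cases, deaths). The helper name `cases_over_time` and the sample rows are made up for illustration and are not real counts.

```python
import io

import pandas as pd

# Sample rows in the assumed NYTimes state-level layout; the numbers are made up.
SAMPLE_CSV = """date,state,fips,cases,deaths
2020-03-01,Washington,53,10,1
2020-03-01,California,06,5,0
2020-03-02,Washington,53,14,2
2020-03-02,California,06,9,0
"""


def cases_over_time(csv_source) -> pd.DataFrame:
    """Return cumulative cases as a date-indexed table with one column per state."""
    # Read fips as text so leading zeros (e.g. "06") are preserved.
    df = pd.read_csv(csv_source, parse_dates=["date"], dtype={"fips": str})
    return df.pivot(index="date", columns="state", values="cases").sort_index()


timeline = cases_over_time(io.StringIO(SAMPLE_CSV))
print(timeline)
```

Each row of the resulting table corresponds to one day, so it can serve as one frame of a map or chart animation.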