The COVID-19 pandemic has caused significant loss of life and widespread disruption across the world. I chose this dataset to examine which countries had the highest mortality among infected individuals, what percentage of each population was infected, and how far vaccination coverage extended in response. The goal is to gain insight into the pandemic's effects and the effectiveness of mitigation efforts on a global scale.
The data comes from https://ourworldindata.org/covid-deaths and covers the period from 3 January 2020 to 9 May 2023.
The dataset includes the number of new cases, the daily number of vaccinations, and deaths across different regions, among other fields.
After performing exploratory data analysis, I formulated specific questions and answered each one with a SQL query against the dataset.
SELECT location, MAX(total_deaths) AS TotalDeathCount FROM coviddeaths WHERE continent != '' GROUP BY location ORDER BY TotalDeathCount DESC;
SELECT continent, MAX(total_deaths) AS TotalDeathCount FROM coviddeaths WHERE continent != '' GROUP BY continent ORDER BY TotalDeathCount DESC;
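Because total_deaths is a running total per country, MAX(total_deaths) grouped by continent returns the largest single-country figure in each continent rather than a continent-wide total. The sketch below illustrates the difference on a toy in-memory SQLite table. The rows are made up, the table is reduced to the columns the query needs, and the subquery alias per_country is a hypothetical name; this is one possible way to get a continent total, not necessarily the approach used in the analysis.

```python
import sqlite3

# Toy rows standing in for coviddeaths; all values are invented for illustration.
rows = [
    # (continent, location, date, total_deaths)
    ("Europe", "Italy", "2020-03-01", 10),
    ("Europe", "Italy", "2020-03-02", 30),
    ("Europe", "Spain", "2020-03-01", 5),
    ("Europe", "Spain", "2020-03-02", 25),
    ("Asia", "India", "2020-03-01", 20),
    ("Asia", "India", "2020-03-02", 40),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coviddeaths "
    "(continent TEXT, location TEXT, date TEXT, total_deaths INTEGER)"
)
conn.executemany("INSERT INTO coviddeaths VALUES (?, ?, ?, ?)", rows)

# MAX per continent: the largest single-country running total in each continent.
max_per_continent = conn.execute("""
    SELECT continent, MAX(total_deaths) AS TotalDeathCount
    FROM coviddeaths
    WHERE continent != ''
    GROUP BY continent
    ORDER BY TotalDeathCount DESC
""").fetchall()

# Sum of per-country maxima: a continent-wide total.
continent_totals = conn.execute("""
    SELECT continent, SUM(country_deaths) AS TotalDeathCount
    FROM (
        SELECT continent, location, MAX(total_deaths) AS country_deaths
        FROM coviddeaths
        WHERE continent != ''
        GROUP BY continent, location
    ) AS per_country
    GROUP BY continent
    ORDER BY TotalDeathCount DESC
""").fetchall()

print(max_per_continent)
print(continent_totals)
```

On the toy data the two queries rank the continents differently, which is why it matters which of the two quantities a per-continent result is meant to represent.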
SELECT location, MAX(total_cases) AS HighestInfectionCount FROM coviddeaths WHERE continent = '' GROUP BY continent, location ORDER BY HighestInfectionCount DESC;
4. What is the distribution of COVID-19 cases across different regions or countries? Which regions have been most affected by the pandemic?
SELECT location, MAX(total_cases) AS total_cases FROM coviddeaths WHERE continent != '' GROUP BY location ORDER BY total_cases DESC;
SELECT location, MAX(total_deaths)/AVG(population)*100 AS Death_percentage FROM coviddeaths GROUP BY location ORDER BY Death_percentage DESC;
6. How many total cases, deaths, and recoveries have been recorded in the dataset, and what are the corresponding death and recovery rates?
SELECT SUM(new_cases) AS TotalCases, SUM(new_deaths) AS TotalDeaths, SUM(new_deaths)/SUM(new_cases)*100 AS DeathPercentage, SUM(new_cases) - SUM(new_deaths) AS TotalRecovered, COUNT(DISTINCT location) AS TotalLocations FROM coviddeaths WHERE continent != '';
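A death percentage is deaths divided by cases, times 100. The sketch below computes it on a toy in-memory SQLite table (invented rows, reduced column set) and, as a sanity check, applies the same arithmetic to the worldwide totals reported in this write-up. The NULLIF guard and the 100.0 literal are defensive additions that are assumptions of this sketch, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coviddeaths "
    "(continent TEXT, location TEXT, new_cases INTEGER, new_deaths INTEGER)"
)
# Invented rows; the last one mimics an aggregate row with an empty continent.
conn.executemany("INSERT INTO coviddeaths VALUES (?, ?, ?, ?)", [
    ("Asia", "A", 600, 6),
    ("Asia", "A", 400, 4),
    ("Europe", "B", 1000, 30),
    ("", "World", 2000, 40),  # excluded by the continent filter
])

# Deaths over cases; NULLIF avoids division by zero, 100.0 forces float math.
total_cases, total_deaths, death_pct = conn.execute("""
    SELECT SUM(new_cases),
           SUM(new_deaths),
           SUM(new_deaths) * 100.0 / NULLIF(SUM(new_cases), 0)
    FROM coviddeaths
    WHERE continent != ''
""").fetchone()

# Same formula applied to the totals reported in the findings of this write-up.
reported_pct = 6_921_601 * 100.0 / 765_222_168

print(total_cases, total_deaths, death_pct)
print(round(reported_pct, 2))
```

Applied to the reported totals, the formula gives a worldwide death percentage of roughly 0.9%.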
7. Which countries have administered at least one dose of COVID-19 vaccination, and what is the count of vaccinated individuals in each country?
SELECT location, MAX(people_vaccinated) AS vaccinated_with_at_least_one_dose FROM covidvaccinations WHERE continent != '' GROUP BY location;
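In the Our World in Data dataset, people_vaccinated is a cumulative count (the total number of people with at least one dose as of each date), so adding up the daily rows counts the same people many times. The sketch below contrasts SUM and MAX on a toy in-memory SQLite table; the rows are invented and the table is reduced to the columns involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covidvaccinations "
    "(continent TEXT, location TEXT, date TEXT, people_vaccinated INTEGER)"
)
# Invented cumulative counts; None mimics a day with no reported value.
conn.executemany("INSERT INTO covidvaccinations VALUES (?, ?, ?, ?)", [
    ("Asia", "A", "2021-01-01", 100),
    ("Asia", "A", "2021-01-02", 250),
    ("Asia", "A", "2021-01-03", 400),
    ("Europe", "B", "2021-01-01", 50),
    ("Europe", "B", "2021-01-02", None),
    ("Europe", "B", "2021-01-03", 90),
])

# SUM adds the running totals together and overcounts.
summed = dict(conn.execute("""
    SELECT location, SUM(people_vaccinated)
    FROM covidvaccinations
    WHERE continent != ''
    GROUP BY location
"""))

# MAX picks the highest running total, i.e. the cumulative count reached.
latest = dict(conn.execute("""
    SELECT location, MAX(people_vaccinated)
    FROM covidvaccinations
    WHERE continent != ''
    GROUP BY location
"""))

print(summed)
print(latest)
```

Both aggregates skip NULL values, so days without a report do not affect either result.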
8. What are the top 5 countries most affected by the COVID-19 pandemic, considering their geographical locations?
SELECT location, population, MAX(total_cases) AS HighestInfectionCount, MAX((total_cases/population))*100 AS PercentPopulationInfected FROM coviddeaths GROUP BY location, population ORDER BY PercentPopulationInfected DESC LIMIT 5;
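Whether total_cases/population yields a fraction depends on the database engine and on the column types: some engines truncate integer-by-integer division to a whole number, which would turn every percentage into 0. The sketch below shows the effect in SQLite with INTEGER columns and one way to avoid it by multiplying by 100.0 before dividing. The rows are invented, and the integer column types are an assumption of this sketch, not a statement about how the original tables were loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coviddeaths "
    "(location TEXT, population INTEGER, total_cases INTEGER)"
)
# Invented running totals for two hypothetical locations.
conn.executemany("INSERT INTO coviddeaths VALUES (?, ?, ?)", [
    ("A", 1000, 50),
    ("A", 1000, 200),
    ("B", 500, 10),
    ("B", 500, 25),
])

# Integer division in SQLite truncates, so every ratio becomes 0.
truncated = conn.execute("""
    SELECT location, MAX(total_cases / population) * 100
    FROM coviddeaths
    GROUP BY location
    ORDER BY location
""").fetchall()

# Multiplying by 100.0 first forces floating-point arithmetic.
safe = conn.execute("""
    SELECT location, MAX(total_cases * 100.0 / population) AS PercentPopulationInfected
    FROM coviddeaths
    GROUP BY location
    ORDER BY PercentPopulationInfected DESC
    LIMIT 5
""").fetchall()

print(truncated)
print(safe)
```

If the columns are stored as decimals or floats, or the engine performs decimal division by default, the original expression already returns fractional percentages and no change is needed.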
SELECT location, MAX(people_fully_vaccinated) AS full_vaccinations FROM covidvaccinations WHERE people_fully_vaccinated > 0 GROUP BY location;
SELECT location, SUM(hosp_patients) AS total_hosp_patients FROM coviddeaths GROUP BY location ORDER BY total_hosp_patients DESC LIMIT 10;
Based on the insights derived from the COVID-19 data:
- A total of 765,222,168 confirmed COVID-19 cases were recorded worldwide.
- The total number of recorded deaths was 6,921,601.
- The countries with the highest total number of confirmed cases were the United States, China, India, France, and Germany.
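- In terms of continents, ranked by the per-continent MAX(total_deaths) figure (which reflects the largest single-country death count in each continent), North America came first, followed by South America, Asia, and Europe.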
- The United States, Brazil, India, Russia, and Mexico recorded the highest numbers of deaths.