In this exercise we will use pandas, the Python library, to work with COVID-19 data and analyze it.
The URL: https://api.covid19api.com/countries
To install the libraries we use the !pip command. The exclamation mark is needed because the command runs in the system shell rather than as Python code.
!pip install pandas
Requirement already satisfied: pandas in c:\users\gabri\anaconda3\lib\site-packages (1.4.4)
Requirement already satisfied: pytz>=2020.1 in c:\users\gabri\anaconda3\lib\site-packages (from pandas) (2022.1)
Requirement already satisfied: python-dateutil>=2.8.1 in c:\users\gabri\anaconda3\lib\site-packages (from pandas) (2.8.2)
Requirement already satisfied: numpy>=1.18.5 in c:\users\gabri\anaconda3\lib\site-packages (from pandas) (1.21.5)
Requirement already satisfied: six>=1.5 in c:\users\gabri\anaconda3\lib\site-packages (from python-dateutil>=2.8.1->pandas) (1.16.0)
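Before running the install command, it can help to check whether a package is already available. The following is a minimal sketch using only the standard library; the helper names `is_installed` and `installed_version` are hypothetical and not part of the original notebook.

```python
import importlib.util
from importlib import metadata


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "json" ships with Python, so it is always found.
print(is_installed("json"))
# For pandas this prints a version string, or None if it is not installed.
print(installed_version("pandas"))
```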
To import the library, we use pd as the alias for pandas.
import pandas as pd
Variables are assigned with the = symbol, and we write links in quotes because they are character strings.
miurl = "https://api.covid19api.com/countries"
To check that the assignment worked, we just type miurl and confirm that the result matches the link.
miurl
'https://api.covid19api.com/countries'
Passing miurl to the type function shows that it is a character string (str).
type(miurl)
str
df is the usual abbreviation for DataFrame. The read_json() function reads data in JSON format; inside the parentheses we put what we want to read, for example a URL.
df = pd.read_json(miurl)
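read_json() also accepts a file-like object, which makes it possible to try the call without network access. The sketch below feeds it a small JSON sample; the three records are copied from the country outputs in this notebook, and the names `sample_json` and `df_sample` are assumptions made for illustration.

```python
from io import StringIO

import pandas as pd

# A tiny sample in the same shape as the countries endpoint
# (a JSON list of records with Country, Slug and ISO2 keys).
sample_json = """
[
  {"Country": "Fiji", "Slug": "fiji", "ISO2": "FJ"},
  {"Country": "Turkey", "Slug": "turkey", "ISO2": "TR"},
  {"Country": "Zambia", "Slug": "zambia", "ISO2": "ZM"}
]
"""

# StringIO wraps the text so read_json can treat it like a file.
df_sample = pd.read_json(StringIO(sample_json))
print(df_sample)
```

Each JSON record becomes one row, and each key becomes one column.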
To display the data, we call the object. The output is a pandas table showing the entries of the DataFrame.
df
| | Country | Slug | ISO2 |
|---|---|---|---|
| 0 | Angola | angola | AO |
| 1 | Georgia | georgia | GE |
| 2 | Ireland | ireland | IE |
| 3 | Slovenia | slovenia | SI |
| 4 | French Guiana | french-guiana | GF |
| ... | ... | ... | ... |
| 243 | Sri Lanka | sri-lanka | LK |
| 244 | Canada | canada | CA |
| 245 | Kuwait | kuwait | KW |
| 246 | Libya | libya | LY |
| 247 | Seychelles | seychelles | SC |
248 rows × 3 columns
To see the first entries of the table we use the head() function, passing 6 to see the first six.
df.head(6)
| | Country | Slug | ISO2 |
|---|---|---|---|
| 0 | Fiji | fiji | FJ |
| 1 | Hong Kong, SAR China | hong-kong-sar-china | HK |
| 2 | Palestinian Territory | palestine | PS |
| 3 | Sierra Leone | sierra-leone | SL |
| 4 | Turkey | turkey | TR |
| 5 | Uzbekistan | uzbekistan | UZ |
With df.tail() we see the last entries.
df.tail(6)
| | Country | Slug | ISO2 |
|---|---|---|---|
| 242 | Republic of Kosovo | kosovo | XK |
| 243 | Zambia | zambia | ZM |
| 244 | Argentina | argentina | AR |
| 245 | Burundi | burundi | BI |
| 246 | Monaco | monaco | MC |
| 247 | Seychelles | seychelles | SC |
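Both head() and tail() take the number of rows as an optional argument and return 5 rows when it is omitted. A small sketch on a hypothetical DataFrame `df_demo`, built from ISO2 codes that appear in the outputs above:

```python
import pandas as pd

# Hypothetical eight-row DataFrame used only to illustrate head() and tail().
df_demo = pd.DataFrame({"ISO2": ["FJ", "HK", "PS", "SL", "TR", "UZ", "ZM", "AR"]})

first_three = df_demo.head(3)   # rows with index 0, 1, 2
last_two = df_demo.tail(2)      # rows with index 6, 7
default_head = df_demo.head()   # no argument: first 5 rows

print(first_three)
print(last_two)
```

The original index labels are kept, which is why tail() shows the highest row numbers.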
To see information about the variables the df contains, we use the following function:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 248 entries, 0 to 247
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Country 248 non-null object
1 Slug 248 non-null object
2 ISO2 248 non-null object
dtypes: object(3)
memory usage: 5.9+ KB
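info() prints its report but does not return values that can be reused. When the non-null counts or the types are needed in later code, the same information can be read from isna() and dtypes. A minimal sketch on a hypothetical DataFrame with one missing value:

```python
import pandas as pd

# Hypothetical data: one country name is missing on purpose.
df_demo = pd.DataFrame(
    {"Country": ["Fiji", "Turkey", None], "Cases": [0, 10, 25]}
)

# Number of missing values per column (the complement of "Non-Null Count").
missing_per_column = df_demo.isna().sum()

# Data type of each column (the "Dtype" column of info()).
column_types = df_demo.dtypes

print(missing_per_column)
print(column_types)
```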
To display a single variable:
df['Country']
0 Fiji
1 Hong Kong, SAR China
2 Palestinian Territory
3 Sierra Leone
4 Turkey
...
243 Zambia
244 Argentina
245 Burundi
246 Monaco
247 Seychelles
Name: Country, Length: 248, dtype: object
To see a specific value of one of the variables:
df['Country'][66]
'British Indian Ocean Territory'
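Selecting the column first and then the row works for reading a value. pandas also offers .loc and .at, which take the row label and the column name in a single step. A short sketch on a hypothetical three-row DataFrame:

```python
import pandas as pd

# Hypothetical data, using country names that appear in the outputs above.
df_demo = pd.DataFrame(
    {
        "Country": ["Fiji", "Sierra Leone", "Turkey"],
        "ISO2": ["FJ", "SL", "TR"],
    }
)

chained = df_demo["Country"][1]       # column first, then row label
with_loc = df_demo.loc[1, "Country"]  # row label and column together
with_at = df_demo.at[1, "Country"]    # same idea, meant for single values

print(chained, with_loc, with_at)
```

All three expressions return the same value here; .loc and .at are the forms to prefer when assigning a value.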
df['ISO2'].head()
0 FJ
1 HK
2 PS
3 SL
4 TR
Name: ISO2, dtype: object
The URL we use now is the following: https://api.covid19api.com/country/colombia/status/confirmed/live
We save the data, this time adding co (short for Colombia) to the name, df_co, to identify it and work only with this country.
url_co = 'https://api.covid19api.com/country/colombia/status/confirmed/live'
df_co = pd.read_json(url_co)
df_co
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1037 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-24 00:00:00+00:00 |
| 1038 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-25 00:00:00+00:00 |
| 1039 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-26 00:00:00+00:00 |
| 1040 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-27 00:00:00+00:00 |
| 1041 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-28 00:00:00+00:00 |
1042 rows × 10 columns
Columns:
df_co.columns
Index(['Country', 'CountryCode', 'Province', 'City', 'CityCode', 'Lat', 'Lon',
'Cases', 'Status', 'Date'],
dtype='object')
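Once the column names are known, a list of names selects just the columns of interest. The sketch below keeps only Date and Cases; `df_demo` is a hypothetical two-row DataFrame whose values are taken from the first and last rows shown above.

```python
import pandas as pd

# Hypothetical two-row frame with a few of the columns listed above.
df_demo = pd.DataFrame(
    {
        "Country": ["Colombia", "Colombia"],
        "Cases": [0, 6312657],
        "Status": ["confirmed", "confirmed"],
        "Date": pd.to_datetime(["2020-01-22", "2022-11-28"], utc=True),
    }
)

# A list of names inside the brackets returns a DataFrame with those columns.
subset = df_demo[["Date", "Cases"]]

# A single name returns a Series instead.
cases_only = df_demo["Cases"]

print(subset)
```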
First rows:
df_co.head(10)
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| 5 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-27 00:00:00+00:00 |
| 6 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-28 00:00:00+00:00 |
| 7 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-29 00:00:00+00:00 |
| 8 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-30 00:00:00+00:00 |
| 9 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-31 00:00:00+00:00 |
df_co.tail(10)
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 1032 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-19 00:00:00+00:00 |
| 1033 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-20 00:00:00+00:00 |
| 1034 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-21 00:00:00+00:00 |
| 1035 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-22 00:00:00+00:00 |
| 1036 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-23 00:00:00+00:00 |
| 1037 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-24 00:00:00+00:00 |
| 1038 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-25 00:00:00+00:00 |
| 1039 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-26 00:00:00+00:00 |
| 1040 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-27 00:00:00+00:00 |
| 1041 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-28 00:00:00+00:00 |
df_co.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1042 entries, 0 to 1041
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Country 1042 non-null object
1 CountryCode 1042 non-null object
2 Province 1042 non-null object
3 City 1042 non-null object
4 CityCode 1042 non-null object
5 Lat 1042 non-null float64
6 Lon 1042 non-null float64
7 Cases 1042 non-null int64
8 Status 1042 non-null object
9 Date 1042 non-null datetime64[ns, UTC]
dtypes: datetime64[ns, UTC](1), float64(2), int64(1), object(6)
memory usage: 81.5+ KB
To get a statistical description of the numeric variables of the df (count, mean, standard deviation, minimum, maximum and quartiles):
df_co.describe()
| | Lat | Lon | Cases |
|---|---|---|---|
| count | 1.042000e+03 | 1.042000e+03 | 1.042000e+03 |
| mean | 4.570000e+00 | -7.430000e+01 | 3.430246e+06 |
| std | 2.043791e-14 | 1.464421e-12 | 2.436522e+06 |
| min | 4.570000e+00 | -7.430000e+01 | 0.000000e+00 |
| 25% | 4.570000e+00 | -7.430000e+01 | 8.882092e+05 |
| 50% | 4.570000e+00 | -7.430000e+01 | 4.109543e+06 |
| 75% | 4.570000e+00 | -7.430000e+01 | 6.076698e+06 |
| max | 4.570000e+00 | -7.430000e+01 | 6.312657e+06 |
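For numeric data, describe() reports the count, mean, standard deviation, minimum, quartiles and maximum; the mode is not part of that summary and can be computed separately with mode(). A small sketch on a hypothetical Series of made-up values:

```python
import pandas as pd

# Made-up values, chosen so that the median and the mode are easy to check.
cases = pd.Series([0, 0, 2, 4, 4, 4, 10], name="Cases")

summary = cases.describe()   # count, mean, std, min, 25%, 50%, 75%, max
most_common = cases.mode()   # Series with the most frequent value(s)

print(summary)
print(most_common.tolist())
```

The 50% row of the summary is the median of the values.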
To build the chart, we want the dates on the X axis and the cases on the Y axis.
We set the date as the index.
df_co.set_index('Date')
| Date | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status |
|---|---|---|---|---|---|---|---|---|---|
| 2020-01-22 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed |
| 2020-01-23 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed |
| 2020-01-24 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed |
| 2020-01-25 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed |
| 2020-01-26 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-11-24 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed |
| 2022-11-25 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed |
| 2022-11-26 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed |
| 2022-11-27 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed |
| 2022-11-28 00:00:00+00:00 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed |
1042 rows × 9 columns
df_co
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Colombia | CO | | | | 4.57 | -74.3 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1037 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-24 00:00:00+00:00 |
| 1038 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-25 00:00:00+00:00 |
| 1039 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-26 00:00:00+00:00 |
| 1040 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-27 00:00:00+00:00 |
| 1041 | Colombia | CO | | | | 4.57 | -74.3 | 6312657 | confirmed | 2022-11-28 00:00:00+00:00 |
1042 rows × 10 columns
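The two outputs above differ because set_index() returns a new DataFrame and leaves the original untouched, so df_co still has its numeric index and its Date column. To keep the result, it has to be assigned to a name. A minimal sketch with a hypothetical two-row DataFrame:

```python
import pandas as pd

# Hypothetical two-row frame with the same Date and Cases columns.
df_demo = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-01-22", "2020-01-23"], utc=True),
        "Cases": [0, 0],
    }
)

# set_index returns a new object; df_demo itself does not change.
indexed = df_demo.set_index("Date")

print(list(df_demo.columns))   # Date is still a regular column here
print(indexed.index.name)      # Date is the index of the new object
```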
Since the chart must also reflect the cases, we create a view that shows the cases by date.
df_co.set_index('Date')['Cases']
Date
2020-01-22 00:00:00+00:00 0
2020-01-23 00:00:00+00:00 0
2020-01-24 00:00:00+00:00 0
2020-01-25 00:00:00+00:00 0
2020-01-26 00:00:00+00:00 0
...
2022-11-24 00:00:00+00:00 6312657
2022-11-25 00:00:00+00:00 6312657
2022-11-26 00:00:00+00:00 6312657
2022-11-27 00:00:00+00:00 6312657
2022-11-28 00:00:00+00:00 6312657
Name: Cases, Length: 1042, dtype: int64
On this view we create the chart with the plot function.
df_co.set_index('Date')['Cases'].plot()
Matplotlib is building the font cache; this may take a moment.
<AxesSubplot:xlabel='Date'>
To give the chart a title we use the title parameter:
df_co.set_index('Date')['Cases'].plot(title= "Casos de Covid19 en Colombia")
<AxesSubplot:title={'center':'Casos de Covid19 en Colombia'}, xlabel='Date'>
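plot() returns a Matplotlib Axes object, which can be used to label the axes or save the figure. The sketch below assumes Matplotlib is installed and uses the non-interactive Agg backend so it also runs outside a notebook; the three data points are taken from the Colombia output above, and the axis labels are an addition for illustration.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no window is opened

import pandas as pd

# Three (date, cases) pairs copied from the Colombia table above.
dates = pd.to_datetime(["2020-01-22", "2020-01-23", "2022-11-28"], utc=True)
cases = pd.Series([0, 0, 6312657], index=dates, name="Cases")

# plot() returns the Axes, so the chart can be adjusted afterwards.
ax = cases.plot(title="Casos de Covid19 en Colombia")
ax.set_xlabel("Date")
ax.set_ylabel("Cases")

# Save the figure to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
ax.figure.savefig(buffer, format="png")
print(buffer.getbuffer().nbytes > 0)
```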
We repeat the process with the data for Spain, the Dominican Republic and Ecuador.
The URL we use now is the following: https://api.covid19api.com/country/spain/status/confirmed/live
We save the data, this time adding es (short for España) to the name, df_es, to identify it and work only with this country.
url_es = 'https://api.covid19api.com/country/spain/status/confirmed/live'
df_es = pd.read_json(url_es)
df_es
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1037 | Spain | ES | | | | 40.46 | -3.75 | 13573721 | confirmed | 2022-11-24 00:00:00+00:00 |
| 1038 | Spain | ES | | | | 40.46 | -3.75 | 13595504 | confirmed | 2022-11-25 00:00:00+00:00 |
| 1039 | Spain | ES | | | | 40.46 | -3.75 | 13595504 | confirmed | 2022-11-26 00:00:00+00:00 |
| 1040 | Spain | ES | | | | 40.46 | -3.75 | 13595504 | confirmed | 2022-11-27 00:00:00+00:00 |
| 1041 | Spain | ES | | | | 40.46 | -3.75 | 13595504 | confirmed | 2022-11-28 00:00:00+00:00 |
1042 rows × 10 columns
df_es.set_index('Date')['Cases'].plot(title= "Casos de Covid19 en España")
<AxesSubplot:title={'center':'Casos de Covid19 en España'}, xlabel='Date'>
The URL we use now is the following: https://api.covid19api.com/country/Dominican%20Republic/status/confirmed/live
We save the data, this time adding do (short for Dominican Republic) to the name, df_do, to identify it and work only with this country.
url_do = 'https://api.covid19api.com/country/Dominican%20Republic/status/confirmed/live'
df_do = pd.read_json(url_do)
df_do
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Dominican Republic | DO | | | | 18.74 | -70.16 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Dominican Republic | DO | | | | 18.74 | -70.16 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Dominican Republic | DO | | | | 18.74 | -70.16 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Dominican Republic | DO | | | | 18.74 | -70.16 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Dominican Republic | DO | | | | 18.74 | -70.16 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1037 | Dominican Republic | DO | | | | 18.74 | -70.16 | 648456 | confirmed | 2022-11-24 00:00:00+00:00 |
| 1038 | Dominican Republic | DO | | | | 18.74 | -70.16 | 649150 | confirmed | 2022-11-25 00:00:00+00:00 |
| 1039 | Dominican Republic | DO | | | | 18.74 | -70.16 | 649150 | confirmed | 2022-11-26 00:00:00+00:00 |
| 1040 | Dominican Republic | DO | | | | 18.74 | -70.16 | 649834 | confirmed | 2022-11-27 00:00:00+00:00 |
| 1041 | Dominican Republic | DO | | | | 18.74 | -70.16 | 649834 | confirmed | 2022-11-28 00:00:00+00:00 |
1042 rows × 10 columns
df_do.set_index('Date')['Cases'].plot(title= "Casos de Covid19 en República Dominicana")
<AxesSubplot:title={'center':'Casos de Covid19 en República Dominicana'}, xlabel='Date'>
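The country URLs used in this notebook differ only in the country segment, and the space in "Dominican Republic" appears as %20. A sketch of how such a URL could be built with the standard library; the helper `confirmed_url` and the constant `BASE_URL` are hypothetical names, not part of the API or the original notebook.

```python
from urllib.parse import quote

# Common prefix of the per-country URLs used in this notebook.
BASE_URL = "https://api.covid19api.com/country"


def confirmed_url(country: str) -> str:
    """Build the confirmed-cases URL for a country name or slug.

    quote() percent-encodes characters that are not URL-safe,
    so a space becomes %20.
    """
    return f"{BASE_URL}/{quote(country)}/status/confirmed/live"


print(confirmed_url("Dominican Republic"))
print(confirmed_url("ecuador"))
```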
The URL we use now is the following: https://api.covid19api.com/country/ecuador/status/confirmed/live
We save the data, this time adding ec (short for Ecuador) to the name, df_ec, to identify it and work only with this country.
url_ec = 'https://api.covid19api.com/country/ecuador/status/confirmed/live'
df_ec = pd.read_json(url_ec)
df_ec
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ecuador | EC | | | | -1.83 | -78.18 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Ecuador | EC | | | | -1.83 | -78.18 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Ecuador | EC | | | | -1.83 | -78.18 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Ecuador | EC | | | | -1.83 | -78.18 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Ecuador | EC | | | | -1.83 | -78.18 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1037 | Ecuador | EC | | | | -1.83 | -78.18 | 1009958 | confirmed | 2022-11-24 00:00:00+00:00 |
| 1038 | Ecuador | EC | | | | -1.83 | -78.18 | 1009958 | confirmed | 2022-11-25 00:00:00+00:00 |
| 1039 | Ecuador | EC | | | | -1.83 | -78.18 | 1009958 | confirmed | 2022-11-26 00:00:00+00:00 |
| 1040 | Ecuador | EC | | | | -1.83 | -78.18 | 1009958 | confirmed | 2022-11-27 00:00:00+00:00 |
| 1041 | Ecuador | EC | | | | -1.83 | -78.18 | 1011132 | confirmed | 2022-11-28 00:00:00+00:00 |
1042 rows × 10 columns
df_ec.set_index('Date')['Cases'].plot(title= "Casos de Covid19 en Ecuador")
<AxesSubplot:title={'center':'Casos de Covid19 en Ecuador'}, xlabel='Date'>
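Since the same steps are repeated for every country, they could be wrapped in a small function and the resulting series placed side by side. The following is only a sketch: `cases_by_date` is a hypothetical helper, and instead of calling the API it uses the last two rows of each country table above as inline sample records.

```python
import pandas as pd


def cases_by_date(records):
    """Return the Cases column indexed by Date from API-style records."""
    frame = pd.DataFrame(records)
    frame["Date"] = pd.to_datetime(frame["Date"], utc=True)
    return frame.set_index("Date")["Cases"]


# Sample records: values for 2022-11-27 and 2022-11-28 from the tables above.
samples = {
    "Colombia": [
        {"Date": "2022-11-27T00:00:00Z", "Cases": 6312657},
        {"Date": "2022-11-28T00:00:00Z", "Cases": 6312657},
    ],
    "Spain": [
        {"Date": "2022-11-27T00:00:00Z", "Cases": 13595504},
        {"Date": "2022-11-28T00:00:00Z", "Cases": 13595504},
    ],
    "Dominican Republic": [
        {"Date": "2022-11-27T00:00:00Z", "Cases": 649834},
        {"Date": "2022-11-28T00:00:00Z", "Cases": 649834},
    ],
    "Ecuador": [
        {"Date": "2022-11-27T00:00:00Z", "Cases": 1009958},
        {"Date": "2022-11-28T00:00:00Z", "Cases": 1011132},
    ],
}

# One column per country, aligned on the shared Date index.
comparison = pd.DataFrame(
    {name: cases_by_date(records) for name, records in samples.items()}
)
print(comparison)
```

Calling comparison.plot() on a frame like this would draw one line per country on the same axes.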