diff --git a/.nojekyll b/.nojekyll
index 41709d3..ae84950 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-81464063
\ No newline at end of file
+cc801407
\ No newline at end of file
diff --git a/content/labs/Lab_7/IM939_Lab7-Part3.html b/content/labs/Lab_7/IM939_Lab7-Part3.html
index 0b6324e..6ca9cf4 100644
--- a/content/labs/Lab_7/IM939_Lab7-Part3.html
+++ b/content/labs/Lab_7/IM939_Lab7-Part3.html
@@ -606,7 +606,7 @@

Let’s have a look at our dataset

df_profession.tail()
-
+
@@ -664,7 +664,7 @@

df_profession_category.tail()
-
+
@@ -722,7 +722,7 @@

df_age
-
+
@@ -786,7 +786,7 @@

df_geography.tail()
-
+
@@ -847,7 +847,7 @@

indices_to_drop = df_profession[df_profession['Code'] < 10].index
 df_profession.drop(indices_to_drop, inplace=True)
 df_profession

-
+
@@ -1117,7 +1117,7 @@

df_profession.isna().sum()
 df_profession_category.isna().sum()
 df_age.isna().sum()

-
+
age_group    0
 GPGmedian    0
 GPGmean      0
@@ -1128,7 +1128,7 @@ 

# Let's plot the mean and median Gender Pay Gap (GPG)
 df_profession.boxplot(column=['GPGmedian', 'GPGmean'])
-
+
<Axes: >
@@ -1139,7 +1139,7 @@

# Let's look at the distribution of the values in the columns
 df_profession.describe()
-
+
@@ -1210,7 +1210,7 @@

# Let's try to visualise what's going on with a histogram - what type of skew do you notice?
 df_profession[['GPGmedian']].plot(kind='hist', ec='black')
-
+
<Axes: ylabel='Frequency'>
@@ -1230,26 +1230,26 @@

width=600, height=400 )

-
+
-
+
-

What’s that?!

+

Wait, what’s that?! That’s not what we were expecting!

@@ -1687,9 +1699,9 @@

-

Because the Earth is round, and maps are flat, geospatial data needs to be “projected”. There are many types of projecting geospatial data, and all of them come with some tradeoff in terms of distorting area and/or distance (in other words, none of them are perfect). You can read more here.

-

Now, the geospatial dataset that we are using for this notebook was downloaded from and uses a Coordinate Reference System (CRS) known as EPSG:27700 - OSGB36 / British National Grid. Regretfully, Altair works with a different CRS: WGS 84 (also known as epsg:4326), and this is creating the conflict.

-

We have two options: either reproject our data using geopandas, or according to Altair documentation try using the project configuration `(type: ‘identity’, reflectY’: True)``. It draws the geometries without applying a projection.

+

Because the Earth is round, and maps are flat, geospatial data needs to be “projected”. There are many types of projecting geospatial data, and all of them come with some tradeoff in terms of distorting area and/or distance (in other words, none of them are perfect). You can read more here.

+

Now, the geospatial dataset that we are using for this notebook was downloaded from the Office for National Statistics’ Geoportal and uses a Coordinate Reference System (CRS) known as EPSG:27700 - OSGB36 / British National Grid. Regretfully, Altair works with a different CRS: WGS 84 (also known as epsg:4326), and this is creating the conflict.

+

We have two options: either reproject our data using geopandas or, following the Altair documentation, use the projection configuration (type='identity', reflectY=True), which draws the geometries without applying a projection.

@@ -1704,26 +1716,26 @@

reflectY=True )
 pre_GPG_England

-
+
-
+