The data this week comes from The Pudding. They have a corresponding article related to this data.
First Place. Lead Role. Number One. When things are arranged in a sequence, we have a mild obsession with being the "first." You want the blue ribbon. To be on the first page of search results. To have your story above the fold. After all, we prioritize the things that come first.
When beauty brands label their foundation shades with sequential numbers, they are implicitly prioritizing those at the beginning of the sequence. These products become more accessible to customers because they are often higher on store shelves and are not hidden behind the "See More" button on websites.
We found 130 products on Sephora's and Ulta's websites that use a sequential number system to label their shades. Of those, 97% put their lighter shades, and thus the customers that use those shades, first.
This is an interesting dataset, and many thanks to Ofunne Amaka and Amber Thomas for sharing the article, the data, and the code behind the article. The data collection itself involved a good deal of regex, data cleaning, and web scraping.
You can work with the text data, the counts, or try to recreate some of the plots from The Pudding.
Another note is that Ofunne and Amber have made the "scrollytelling" optional, so it can be turned off to make the article more accessible.
# Get the Data
# Read in with tidytuesdayR package
# Install from CRAN via: install.packages("tidytuesdayR")
# This loads the readme and all the datasets for the week of interest
# Either ISO-8601 date or year/week works!
tuesdata <- tidytuesdayR::tt_load('2021-03-30')
tuesdata <- tidytuesdayR::tt_load(2021, week = 14)
sephora <- tuesdata$sephora
# Or read in the data manually
sephora <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2021/2021-03-30/sephora.csv')
ulta <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2021/2021-03-30/ulta.csv')
allCategories <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2021/2021-03-30/allCategories.csv')
allShades <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2021/2021-03-30/allShades.csv')
allNumbers <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2021/2021-03-30/allNumbers.csv')

# `sephora.csv`

Header | Description | Data Type |
---|---|---|
brand | The brand of foundation | character |
product | The product name | character |
url | URL to the product page | character |
description | The description associated with a particular swatch (e.g., "Shade 1 (fair cool)") as displayed on the product page | character |
imgSrc | The incomplete URL to the image displaying a swatch of this particular foundation shade (note: to complete the URL, https://sephora.com needs to be prepended) | character |
imgAlt | The alt text attribute for a particular swatch, as is made available to assistive technology | character |
name | The programmatically extracted word-based name of this particular shade | character |
specific | The number or number/letter combination (e.g., "12CN") used to label a particular shade | character |
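
The note on `imgSrc` above can be sketched as a one-line fix. This is a minimal illustration using made-up rows rather than the real file; it assumes each `imgSrc` value is a path that begins with `/`, which the data dictionary does not confirm. The names `sephora_toy` and `imgUrl` are hypothetical.

```r
# Hypothetical rows: these paths are invented for illustration,
# they are not taken from sephora.csv.
sephora_toy <- data.frame(
  brand  = c("Brand A", "Brand B"),
  imgSrc = c("/productimages/example-1.jpg", "/productimages/example-2.jpg"),
  stringsAsFactors = FALSE
)

# Assumption: imgSrc starts with "/", so the host can be prepended directly.
sephora_toy$imgUrl <- paste0("https://sephora.com", sephora_toy$imgSrc)

print(sephora_toy$imgUrl)
```

If the real paths turn out not to start with a slash, a separator would need to be added before pasting.
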
# `ulta.csv`

Header | Description | Data Type |
---|---|---|
brand | The brand of foundation | character |
product | The product name | character |
url | URL to the product page | character |
description | The description associated with a particular swatch (e.g., "Shade 1 (fair cool)") as displayed on the product page | character |
imgSrc | The complete url to the image displaying a swatch of this particular foundation shade | character |
imgAlt | The alt text attribute for a particular swatch, as is made available to assistive technology | character |
name | The programmatically extracted word-based name of this particular shade | character |
specific | The number or number/letter combination (e.g., "12CN") used to label a particular shade | character |
# `allShades.csv`

Header | Description | Data Type |
---|---|---|
brand | The brand of foundation | character |
product | The product name | character |
url | URL to the product page | character |
description | The description associated with a particular swatch (e.g., "Shade 1 (fair cool)") as displayed on the product page | character |
imgSrc | The url to the image displaying a swatch of this particular foundation shade | character |
imgAlt | The alt text attribute for a particular swatch, as is made available to assistive technology | character |
name | The programmatically extracted word-based name of this particular shade | character |
specific | The number or number/letter combination (e.g., "12CN") used to label this particular shade | character |
colorspace | The colorspace used to analyze the shade (e.g., "RGB") | character |
hex | The hexadecimal color code for the most prevalent color in the imgSrc swatch image (e.g., #4F322C) | character |
hue | The hue value from the HSL color space. This is represented as a number from 0 to 360 degrees around the color wheel | numeric |
sat | The saturation value from the HSL color space. This represents the amount of gray in a color from 0 to 100 percent (Note: here, it is represented as a decimal from 0 to 1) | numeric |
lightness | The lightness value from the HSL color space. This is represented as a decimal from 0 to 1 where 0 is pure black and 1 is pure white | numeric |
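
To make the `hue`, `sat`, and `lightness` columns concrete, here is a rough sketch of the standard RGB-to-HSL conversion applied to a hex code, in base R. The helper `hex_to_hsl` is hypothetical and is not the authors' code; values in the dataset may have been computed with a different tool and could differ slightly.

```r
# Hypothetical helper: standard RGB -> HSL conversion for hex color codes.
# hue is in degrees (0 to 360); sat and lightness are decimals from 0 to 1.
hex_to_hsl <- function(hex) {
  rgb <- grDevices::col2rgb(hex) / 255        # 3 x n matrix scaled to 0..1
  r <- rgb[1, ]; g <- rgb[2, ]; b <- rgb[3, ]
  mx <- pmax(r, g, b)
  mn <- pmin(r, g, b)
  d  <- mx - mn                               # chroma
  l  <- (mx + mn) / 2                         # lightness
  s  <- ifelse(d == 0, 0, d / (1 - abs(2 * l - 1)))
  h  <- ifelse(d == 0, 0,
        ifelse(mx == r, ((g - b) / d) %% 6,
        ifelse(mx == g, (b - r) / d + 2,
                        (r - g) / d + 4))) * 60
  data.frame(hex = hex, hue = unname(h), sat = unname(s), lightness = unname(l))
}

# The example hex code from the table above, plus pure black and pure white.
hsl <- hex_to_hsl(c("#4F322C", "#000000", "#FFFFFF"))
print(hsl)
```

Pure black and pure white land at lightness 0 and 1 respectively, matching the scale described for the `lightness` column.
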
# `allCategories.csv`

Header | Description | Data Type |
---|---|---|
brand | The brand of foundation | character |
product | The product name | character |
url | URL to the product page | character |
imgSrc | The url to the image displaying a swatch of this particular foundation shade | character |
name | The programmatically extracted word-based name of this particular shade | character |
categories | Comma separated categories that were assigned to a given label (e.g., food, color) | character |
specific | The number or number/letter combination (e.g., "12CN") used to label this particular shade | character |
hex | The hexadecimal color code for the most prevalent color in the imgSrc swatch image (e.g., #4F322C) | character |
lightness | The lightness value from the HSL color space. This is represented as a decimal from 0 to 1 where 0 is pure black and 1 is pure white | numeric |
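
Because `categories` holds several comma separated values in one cell, counting categories usually means splitting that column into one row per category first. The sketch below does this in base R on made-up rows; the shade names and category assignments are invented for illustration, and `toy_categories`, `long`, and `category_counts` are hypothetical names.

```r
# Hypothetical rows: names and categories are invented, not from allCategories.csv.
toy_categories <- data.frame(
  name       = c("Honey", "Ivory", "Mocha"),
  categories = c("food, color", "color", "food, drink"),
  stringsAsFactors = FALSE
)

# Split each cell on commas (allowing optional whitespace after the comma).
split_cats <- strsplit(toy_categories$categories, ",\\s*")

# Repeat each shade name once per category to get one row per (name, category).
long <- data.frame(
  name     = rep(toy_categories$name, lengths(split_cats)),
  category = trimws(unlist(split_cats)),
  stringsAsFactors = FALSE
)

# Count how many shades fall into each category.
category_counts <- sort(table(long$category), decreasing = TRUE)
print(category_counts)
```
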
# `allNumbers.csv`

Header | Description | Data Type |
---|---|---|
brand | The brand of foundation | character |
product | The product name | character |
name | The programmatically extracted word-based name of this particular shade | character |
specific | The number or number/letter combination (e.g., "12CN") used to label a particular shade | character |
lightness | The lightness value from the HSL color space. This is represented as a decimal from 0 to 1 | numeric |
hex | The hexadecimal color code for the most prevalent color in the imgSrc swatch image (e.g., #4F322C) | character |
lightToDark | Whether this product line organizes their colors from light to dark (Note: a value of NA indicates that a product uses a number-based naming system, but not a sequential numbering system) | logical |
numbers | The numbers associated with a particular shade | numeric |
id | A generated ID number assigned to each individual product | numeric |
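
The `lightToDark` and `id` columns suggest one way to estimate the share of sequentially numbered products that put lighter shades first. The following is a sketch on made-up rows, assuming `lightToDark` is constant within each product `id` and that `NA` products should be excluded; whether this reproduces the article's 97% figure depends on how the authors filtered products. The names `toy_numbers`, `per_product`, and `share_light_first` are hypothetical.

```r
# Hypothetical rows: values are invented, not from allNumbers.csv.
toy_numbers <- data.frame(
  id          = c(1, 1, 1, 2, 2, 3, 3),
  numbers     = c(1, 2, 3, 10, 20, 100, 200),
  lightness   = c(0.9, 0.7, 0.4, 0.3, 0.8, 0.6, 0.5),
  lightToDark = c(TRUE, TRUE, TRUE, FALSE, FALSE, NA, NA)
)

# Collapse to one row per product (assumes lightToDark does not vary within an id).
per_product <- unique(toy_numbers[, c("id", "lightToDark")])

# Share of products ordered light to dark, ignoring products flagged NA
# (number-based names without a sequential numbering system).
share_light_first <- mean(per_product$lightToDark, na.rm = TRUE)
print(share_light_first)
```
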
The actual cleaning script from Amber Thomas is available on The Pudding's GitHub.