
A 15-minute talk about R data science unicorns from rstudio:conf 2020

tgerke/unicoRns-are-real

UnicoRns are real

Travis Gerke & Donna Evans

This is a 15-minute talk most recently given at rstudio:conf 2020.

📺 View slides

🎥 Watch recording

✍️ Abstract:

Learning objective: This talk argues that “data science unicorns” are common among the R user base, and suggests next-generation job descriptions that improve matchmaking between R job seekers and hiring organizations.

Experienced data scientists commonly advise job-seekers to avoid job postings that describe a "data science unicorn": someone who has experience performing an unrealistically large array of technical and business-related job duties. Seeking a unicorn is viewed as a potential indicator that the company fails to understand its data science needs, and that new hires will not be poised for success because they lack support and resources [Robinson & Nolis, 2019].

The R language, particularly when used with RStudio products, has evolved to enable production-level activities in the areas of data wrangling, reporting/dashboarding, database/software engineering, machine learning, and web application development. It is increasingly plausible that a data scientist will be able to efficiently perform a wide variety of job functions with experience in only a single language (R). Indeed, even entry-level R users may tread into "unicorn" territory. Current standards for data scientist job descriptions and salaries do not accommodate this nuance, leaving both job-seekers and hiring managers unable to distinguish job requirements that should be read as warning signs from listings that are ideal matches for the modern R unicorn.

In this talk, we present data aggregated from several large compensation analytics companies that summarize current benchmarks for data science job descriptions and corresponding salary ranges. We then suggest job description language to target modern R users, considering both job duty compatibility and job post findability. These descriptions are presented with likely salary range pairings. Attention is given to deviations from traditional degree requirements, years of experience, and demands for literacy in multiple programming languages, which may lack relevance for the R unicorn. Our overarching goal is to provide job description templates that encourage optimal matchmaking between R job seekers and organizations in need of their talents.

🧮 Methods for the unicoRn map:

The aim of the unicoRn map is to present Data Scientist I-V salary estimates across a variety of cities/metropolitan regions in the US. We have salary estimates for the Tampa region, and to span the country we need to scale these estimates by region-specific multipliers that account for cost of living and regional data science demand.

Salary data across a broad selection of occupations are available from the Bureau of Labor Statistics, and were downloaded on 2020-01-07 from the "All data" source here. Data Scientist is not an occupational title in these data. Furthermore, for our purposes, data from this federal resource may be less precise than data collected by compensation survey companies. Occupations were filtered to the job titles assumed to most closely match Data Scientist: Computer and Mathematical Occupations, Computer and Information Research Scientists, and Computer and Information Analysts. Region-specific multipliers were generated by dividing each region's average across these three occupations by the Tampa-specific average.
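The multiplier step described above can be sketched as follows. This is a minimal illustration in Python (standard library only), not the code used for the talk. The region names, wage figures, Tampa salary figures, and the function names `region_multipliers` and `scale_salaries` are all hypothetical, and the sketch assumes one mean annual wage per region for each of the three proxy occupations.

```python
from statistics import mean

# Hypothetical mean annual wages (USD) per region for the three proxy
# occupations. These numbers are made up for illustration; they are NOT
# Bureau of Labor Statistics values.
PROXY_WAGES = {
    "Tampa": {
        "Computer and Mathematical Occupations": 80000,
        "Computer and Information Research Scientists": 100000,
        "Computer and Information Analysts": 90000,
    },
    "Region A": {
        "Computer and Mathematical Occupations": 96000,
        "Computer and Information Research Scientists": 120000,
        "Computer and Information Analysts": 108000,
    },
    "Region B": {
        "Computer and Mathematical Occupations": 72000,
        "Computer and Information Research Scientists": 90000,
        "Computer and Information Analysts": 81000,
    },
}


def region_multipliers(wages, reference="Tampa"):
    """Average the proxy occupations per region, then divide by the reference average."""
    averages = {region: mean(by_occ.values()) for region, by_occ in wages.items()}
    reference_average = averages[reference]
    return {region: avg / reference_average for region, avg in averages.items()}


def scale_salaries(reference_salaries, multiplier):
    """Scale reference-region salary estimates to another region, rounded to whole dollars."""
    return {level: round(salary * multiplier) for level, salary in reference_salaries.items()}


if __name__ == "__main__":
    # Hypothetical Tampa estimates for two of the five levels (illustrative only).
    tampa_salaries = {"Data Scientist I": 70000, "Data Scientist II": 85000}
    multipliers = region_multipliers(PROXY_WAGES)
    for region, multiplier in multipliers.items():
        print(region, round(multiplier, 3), scale_salaries(tampa_salaries, multiplier))
```

By construction the reference region gets a multiplier of exactly 1, so the Tampa estimates pass through unchanged and every other region is expressed relative to them.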

The lookup table for geographic areas (which resolves CBSA area codes that do not map directly to city names) was downloaded on 2020-01-09 here (credit to Steven.Rosenberg@fcc.gov).
