Every cyclist and stage of the Tour de France (up to and including 2024) in four CSV files.

thomascamminady/LeTourDataSet

Le Tour de France Data Set & Le Tour de France Femmes avec Zwift Data Set

Figure: Distance and winner average pace

TL;DR

If you use pandas, just get the data via:

import pandas as pd 
df_men = pd.read_csv("https://raw.githubusercontent.com/thomascamminady/LeTourDataSet/master/data/TDF_Riders_History.csv")
df_women = pd.read_csv("https://raw.githubusercontent.com/thomascamminady/LeTourDataSet/master/data/TDFF_Riders_History.csv")

If you use R instead of Python, you can run:

library(readr)
df_men <- read_csv("https://raw.githubusercontent.com/thomascamminady/LeTourDataSet/master/data/TDF_Riders_History.csv")
df_women <- read_csv("https://raw.githubusercontent.com/thomascamminady/LeTourDataSet/master/data/TDFF_Riders_History.csv")

Le Tour de France Femmes avec Zwift

As of 2023, the data for Le Tour de France Femmes avec Zwift is available on the official tour website, and it is now included here as well. To ensure backward compatibility, the data for the men's and women's versions of Le Tour are stored in separate files.

Data

Every cyclist of the Tour de France is listed in a single CSV file, data/TDF_Riders_History.csv. Data on every stage is in data/TDF_Stages_History.csv.

The women's tour data is stored in files with the prefix TDFF (Tour de France Femmes).
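The naming scheme above can be sketched as a small helper that builds the raw download URL for each file. This is only an illustration: `dataset_url` and `load_all` are hypothetical names, not part of this repository, and the file name TDFF_Stages_History.csv is inferred from the prefix rule rather than confirmed here.

```python
# Base URL taken from the TL;DR section of this README.
BASE_URL = (
    "https://raw.githubusercontent.com/thomascamminady/LeTourDataSet/master/data"
)

RACES = ("TDF", "TDFF")      # men's tour, women's tour
KINDS = ("Riders", "Stages")  # one file per rider table, one per stage table


def dataset_url(race: str, kind: str) -> str:
    """Return the raw URL for one CSV file, e.g. TDF_Riders_History.csv."""
    if race not in RACES:
        raise ValueError(f"race must be one of {RACES}, got {race!r}")
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}, got {kind!r}")
    return f"{BASE_URL}/{race}_{kind}_History.csv"


def load_all():
    """Download every file into a dict keyed by (race, kind).

    Assumes all four files exist under the inferred names; a missing file
    would raise an HTTP error from pandas.
    """
    import pandas as pd  # imported lazily so dataset_url works without pandas

    return {
        (race, kind): pd.read_csv(dataset_url(race, kind))
        for race in RACES
        for kind in KINDS
    }
```

Calling `load_all()[("TDFF", "Riders")]` would then give the same data frame as `df_women` in the TL;DR.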

How to run

In your shell, just run these commands:

poetry install # to install the environment
poetry run python letourdataset/Downloader.py # get the data

Disclaimer

For issues with this data set, see the Issues tab. Some entries are incorrect. So far, however, it seems that the mistakes stem from wrong data on the letour.fr website. Looking back, I should probably have scraped another website.
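Because some entries are known to be wrong, a quick sanity check before analysis may be worthwhile. The sketch below flags rows that contain empty fields, using only the standard library. It is a minimal example and not part of this repository: `rows_with_missing_fields` is a hypothetical helper, and the column names and values in the sample are made up for illustration and may not match the real CSV headers.

```python
import csv
import io


def rows_with_missing_fields(csv_text: str):
    """Return (line_number, [empty column names]) for each incomplete row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        # A field counts as missing if it is absent (None) or blank.
        empty = [
            name
            for name, value in row.items()
            if value is None or not str(value).strip()
        ]
        if empty:
            # reader.line_num is the source line the current row ended on.
            flagged.append((reader.line_num, empty))
    return flagged


# Made-up sample with hypothetical column names.
sample = (
    "Year,Rider,Distance (km)\n"
    "2001,Rider A,3000\n"
    "2002,,3100\n"
    "2003,Rider C,\n"
)
print(rows_with_missing_fields(sample))
```

A check like this only finds gaps. Values that are present but wrong, as described above, still have to be compared against another source.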

Legacy code

This code has been completely rewritten. The previous code, including its output, is in the legacy repository. In particular, read legacy/README.txt.
