
Releases: probberechts/soccerdata

v1.5.2

14 Oct 16:24

Changes

🚀 Features

🪲 Fixes

👷 Continuous Integration

📚 Documentation

📦 Dependencies

v1.5.1

28 Jul 10:55

Changes

🪲 Fixes

👷 Continuous Integration

📚 Documentation

📦 Dependencies

v1.5.0

23 Jul 22:16

Changes

🚀 Features

🪲 Fixes

📦 Dependencies

v1.4.0

30 May 09:46

🚀 Features

  • [FBref] Add read_team_match_stats() (#195)
  • [FBref] Add read_events() to retrieve the timing of goals, cards and substitutions in a game
  • [FBref] Extend read_lineup() function with "position" and "minutes played" columns

💥 Breaking Changes

  • [SoFIFA] Major fixes and API changes
  • [FBref] Standardize column names

🪲 Fixes

  • [FBref] Handle missing match shots data

👷 Continuous Integration

  • Automate future releases using Release Drafter

📦 Dependencies

Add support for scraping World Cup data

26 Nov 19:37

New features

Add support for scraping World Cup data

The World Cup was added to the default available leagues for the WhoScored and FBref readers. Other tournaments can be added by modifying the league_dict.json config file.

from soccerdata import WhoScored, FBref

ws = WhoScored(leagues="INT-World Cup", seasons="2022")
fb = FBref(leagues="INT-World Cup", seasons="2022")
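Adding another tournament or league could look like the sketch below. The entry layout (one key per reader, mapped to the league name used by that source, plus season_start and season_end) and the default location ~/soccerdata/config/league_dict.json are assumptions here, and the Eredivisie names are illustrative values that must match the names used on each site. The add_league helper is hypothetical and not part of soccerdata.

```python
import json
import tempfile
from pathlib import Path


def add_league(config_file: Path, league_id: str, entry: dict) -> dict:
    """Merge one league entry into a league_dict.json-style file and return the result."""
    leagues = json.loads(config_file.read_text()) if config_file.exists() else {}
    leagues[league_id] = entry
    config_file.parent.mkdir(parents=True, exist_ok=True)
    config_file.write_text(json.dumps(leagues, indent=2))
    return leagues


# Assumed entry layout: reader name -> league name as used by that source.
eredivisie = {
    "FBref": "Dutch Eredivisie",
    "WhoScored": "Netherlands - Eredivisie",
    "season_start": "Aug",
    "season_end": "May",
}

# A temporary directory stands in for the real config directory
# (assumed default: ~/soccerdata/config/league_dict.json).
config_file = Path(tempfile.mkdtemp()) / "config" / "league_dict.json"
leagues = add_league(config_file, "NED-Eredivisie", eredivisie)
```

Once the entry is in the real config file, the new league ID should be usable in the leagues argument of the readers that have a key in the entry.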

Changes

  • The WhoScored reader now runs in non-headless mode by default, because scraping in headless mode typically gets blocked quickly. The old behaviour can be restored by initializing the reader as WhoScored(..., headless=True).

Fixes

  • The WhoScored reader can now deal with an empty match schedule, which can occur before the start of a season or tournament round.

Faster scraping of Big 5 leagues stats from FBref

23 Oct 12:59

New features

Faster scraping of Big 5 leagues stats (by @andrewRowlinson)

FBref has pages for the big five European leagues that allow you to more efficiently get team and player data from multiple leagues. This commit adds a special "Big 5 European Leagues Combined" league option to get data from these pages.

import soccerdata as sd
fbref = sd.FBref(leagues="Big 5 European Leagues Combined", seasons="20-21")
team_season_stats = fbref.read_team_season_stats(stat_type="standard")
player_season_stats = fbref.read_player_season_stats(stat_type="standard")

1.1.0: Improvements for FBref scraper

27 Sep 21:38

New features

FBref

Faster scraping of player season stats (#69)

Previously, the fbref.read_player_season_stats method visited the page of each individual team in a league to obtain stats for the players in that league. FBref now has a single page for each league/season where stats can be obtained for every player in the league (e.g., https://fbref.com/en/comps/9/stats/Premier-League-Stats). Due to this change, the fbref.read_player_season_stats(...) method now uses 15-20x fewer requests, leading to a large speed-up.

Support retrieving "Opponent Stats" (#78)

An "opponent_stats" flag was added to the fbref.read_season_stats(...) function, which enables retrieving the "Opponent Stats" table of a team.

Always group "MP" under "Playing Time" (#79)

FBref is inconsistent in how it displays the "MP" (Matches Played) column. For some seasons, it is displayed as a separate category, while for other seasons it is grouped under "Playing Time". This results in a column with NaN values when two seasons are merged. Therefore, the "MP" column is now always put under "Playing Time".
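The effect of this normalization can be illustrated with plain Python, without pandas. The sketch below represents two-level column labels as (category, stat) tuples; the ("MP", "") label for a stand-alone "MP" column and the normalize_mp helper are assumptions made for illustration and do not reflect how soccerdata implements the fix.

```python
def normalize_mp(columns):
    """Move a stand-alone "MP" column under the "Playing Time" category."""
    return [("Playing Time", "MP") if col == ("MP", "") else col for col in columns]


# Two-level column labels as (category, stat) tuples.
season_a = [("Playing Time", "MP"), ("Playing Time", "Min"), ("Performance", "Gls")]
season_b = [("MP", ""), ("Playing Time", "Min"), ("Performance", "Gls")]  # assumed label

# Without normalization, merging the seasons yields two different "MP" columns,
# each of which would be NaN for the season that lacks it.
before = set(season_a) | set(season_b)

# With normalization, both seasons share a single ("Playing Time", "MP") column.
after = set(normalize_mp(season_a)) | set(normalize_mp(season_b))
```

Here before contains four distinct columns while after contains three, so the merged table no longer has a half-empty "MP" column.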

Docs

Add docs for specifying custom proxy (#83)

Not all Tor distributions use the same default port of 9050. The docs now describe how to configure a custom port.
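As a rough sketch, a custom port could be passed through a proxy mapping like the one below. The tor_proxy helper is hypothetical, and it is an assumption that use_proxy accepts socks5:// URLs in this form; the docs remain the authoritative reference.

```python
def tor_proxy(port: int = 9050, host: str = "127.0.0.1") -> dict:
    """Build a proxy mapping that points at a local Tor SOCKS5 listener."""
    url = f"socks5://{host}:{port}"
    return {"http": url, "https": url}


# Tor Browser listens on port 9150 instead of 9050.
proxies = tor_proxy(port=9150)

# Assumed usage, not executed here:
# ws = soccerdata.WhoScored(use_proxy=proxies)
```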

1.0.0

23 Apr 17:56

Breaking Changes

  • Several columns were renamed, added and dropped in the output dataframes to increase uniformity between data sources.

New features

WhoScored

The WhoScored reader can now return event data in various output formats. The following formats are supported:

  • A dataframe with all events.
  • A dict with the original unformatted WhoScored JSON.
  • A dataframe with the SPADL representation of the original events.
  • A dataframe with the Atomic-SPADL representation of the original events.
  • A socceraction.data.opta.OptaLoader instance.
  • No data. This is useful for caching data.

See https://soccerdata.readthedocs.io/en/latest/datasources/WhoScored.html for examples.

0.1.0: Custom proxies and FBref rate limit policy

22 Apr 15:11

Breaking Changes

  • The use_tor parameter was replaced by a use_proxy='tor' parameter in all readers

New features

  • You can specify a custom proxy using the use_proxy parameter for all readers.
ws = soccerdata.WhoScored(use_proxy={'http': 'http://126.352.12.3:5471'})

Fixes

FBref

  • FBref has implemented a new rate-limiting policy that allows only one request every two seconds. The FBref reader is now configured to comply with this.
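For custom scraping code that needs to respect the same limit, a minimal throttle could look like the sketch below. The Throttle class is hypothetical and is not the reader's actual implementation; it only shows one way to space consecutive requests at least two seconds apart.

```python
import time


class Throttle:
    """Block so that consecutive calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, None before the first

    def wait(self):
        """Sleep just long enough to respect the interval; return the delay used."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay


throttle = Throttle(min_interval=2.0)
# for url in urls:
#     throttle.wait()
#     fetch(url)  # hypothetical download function
```

The clock and sleep functions are injectable so the behaviour can be checked without actually waiting.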

0.0.3: Fixes for WhoScored and MatchHistory readers

20 Mar 19:33

Bugfixes

WhoScored

  • The summary tab is now used as a backup for retrieving the schedule when the fixtures tab is empty. This often occurs for multi-stage tournaments. (#15)
  • Fixed incorrect resolver rules for the Tor proxy. (#23)

MatchHistory

  • Football-data.co.uk switched from HTTP to HTTPS only; the reader now uses HTTPS.

Docs

  • Added example notebooks for reading data from each supported data source.