Revdep checks
Revdep (reverse dependency) checks are required by CRAN, to ensure that any new version of data.table
does not break other CRAN packages that depend on it.
https://github.com/orgs/Rdatatable/teams/revdep-managers
To run revdep checks on your local machine, you can use the code in https://github.com/Rdatatable/data.table/blob/master/.dev/revdep.R but note that a full run may take a long time (10-20 days) if it is not parallelized.
Toby Dylan Hocking @tdhock maintains a revdep check system which publishes its results on web pages linked in this directory: https://rcdata.nau.edu/genomic-ml/data.table-revdeps/analyze/
This system runs each of the 1400+ revdep checks in parallel on the NAU Monsoon compute cluster, so all results are available in less than 12 hours. Every day at 00:01 MST (1 minute past midnight, Mountain Standard Time) a check is started with current R-release, R-devel, data.table master, and data.table CRAN release.
The code is in the git repo https://github.com/tdhock/data.table-revdeps and, as of 28 Nov 2022, the checks cover all dependency types ("Depends", "Imports", "LinkingTo", "Suggests", "Enhances").
The top of a typical result web page is shown below. It shows what versions of R and data.table were used for the checks.
For each version of R, each revdep is checked with data.table master and release.
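As a rough illustration of the check matrix described above, the following Python sketch enumerates one job per combination of revdep, R version, and data.table version. The function name, the tuple layout, and the package names are assumptions made for this example; the actual system in the data.table-revdeps repo may organize its jobs differently.

```python
from itertools import product

# Version labels come from the text above; everything else is hypothetical.
R_VERSIONS = ["R-release", "R-devel"]
DT_VERSIONS = ["master", "release"]


def check_matrix(revdeps):
    """Return one (package, R version, data.table version) job per combination."""
    return [
        (package, r_version, dt_version)
        for package in revdeps
        for r_version, dt_version in product(R_VERSIONS, DT_VERSIONS)
    ]


# Hypothetical package names, used only to show the shape of the result.
jobs = check_matrix(["pkgA", "pkgB"])
print(len(jobs))  # 2 packages x 2 R versions x 2 data.table versions = 8
```

With 1400+ revdeps this gives several thousand independent jobs, which is why running them in parallel on a cluster shortens the total time so much.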
If there are any differences found in the check results, then there will be a row in the "significant differences" table, example below:
The significant differences table is sorted by its first column, first.bad.commit, which is the first commit that git bisect identified as causing the problem. This makes it easy to see whether several revdeps have similar issues resulting from the same data.table commit/PR.
Links are:
- first.bad.commit: commit on github -- this is useful for determining the commit/PR where the problem started.
- Package: log file from running the revdep checks on monsoon -- search this log for the new bad check to see additional details.
- CRAN: current check results on CRAN using data.table release on a Linux machine, for comparison (these should ideally match the release column, which was computed on Monsoon).
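The way the significant differences table is put together can be sketched in Python as follows. This is only a sketch under stated assumptions: the function names, the status strings, the package names, and the commit labels are hypothetical, and the binary search stands in for what git bisect does rather than reproducing the real system's code.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first bad commit, in the spirit of git bisect.

    Assumes commits are ordered oldest to newest, that commits[0] is known
    good ("old") and that commits[-1] is known bad ("new").
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


def significant_differences(results, bisect):
    """Keep rows where master and release disagree, sorted by first.bad.commit.

    results maps (package, R version) to {"master": status, "release": status};
    bisect maps a package name to its first bad commit.
    """
    rows = [
        {"first.bad.commit": bisect(package), "Package": package,
         "R": r_version, **status}
        for (package, r_version), status in results.items()
        if status["master"] != status["release"]
    ]
    return sorted(rows, key=lambda row: row["first.bad.commit"])


# Hypothetical history and results, used only for illustration.
commits = ["c1", "c2", "c3", "c4", "c5"]
breaks_at = {"pkgA": "c4", "pkgB": "c2"}


def bisect(package):
    start = commits.index(breaks_at[package])
    return first_bad_commit(commits, lambda c: commits.index(c) >= start)


results = {
    ("pkgA", "R-release"): {"master": "1 ERROR", "release": "OK"},
    ("pkgB", "R-release"): {"master": "1 WARNING", "release": "OK"},
    ("pkgC", "R-release"): {"master": "OK", "release": "OK"},
}
table = significant_differences(results, bisect)
```

Because the rows are sorted by first.bad.commit, revdeps broken by the same commit end up next to each other, which is what makes related issues easy to spot.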
Also see below for an example of how it looks when a significant difference in the previous check has disappeared in the current check:
- First make sure that the issue/difference is real, by checking (1) whether it was found in other recent checks (for example, the previous day), (2) whether it occurs in both R-devel and R-release, (3) whether the result for data.table release equals the result from CRAN, (4) whether git bisect found a non-trivial commit (trivial means the commit/parent are the same as git bisect new/old, as in exDE above), and (5) whether the issue is in master (not release; see https://github.com/Rdatatable/data.table/issues/5733 for an example of an issue which only happened with data.table release and R-devel, after making a fix in master).
- Then search the data.table issue tracker, https://github.com/Rdatatable/data.table/issues for the package name and for the commit/PR where the problem started (we group revdep issues by the commit/PR that caused them), to make sure there is no existing issue already. If an issue already exists, just add a new comment on that issue. Otherwise, create a new issue.
- Describe in issue comments at least (1) a brief description of the problem, (2) how to reproduce it, and (3) a link to the commit/PR where git bisect says the problem started happening (first.bad.commit column).
- Optionally, add (4) @mentions of the people who authored the commit/PR where the problem started happening, and (5) a minimal reproducible example (MRE). Sometimes it is not easy to create an MRE, but if you can, it would likely be useful as a test case for data.table.
- Example with minimal info and a mention: https://github.com/Rdatatable/data.table/issues/5544
- Example with more info/analysis and a minimal reproducible example: https://github.com/Rdatatable/data.table/issues/5536
- If the issue should be fixed by the package which depends on data.table, then please look on CRAN for how to contact the maintainer (github, email, etc), and ask them nicely for a fix using this revdep issue template.
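The five checks in the first step above can be written down as a small triage helper. The Python sketch below is one possible encoding, not part of the actual revdep system: the class, its field names, and the example values are all hypothetical, and a real triage still needs a person to read the log files.

```python
from dataclasses import dataclass


@dataclass
class Difference:
    """One significant difference; every field name here is hypothetical."""
    seen_in_previous_check: bool
    in_r_devel: bool
    in_r_release: bool
    release_matches_cran: bool
    first_bad_commit: str
    first_bad_parent: str
    bisect_new: str
    bisect_old: str
    present_in_master: bool


def triage(diff):
    """Return the checks (1)-(5) that suggest the difference may not be real."""
    failed = []
    if not diff.seen_in_previous_check:
        failed.append("(1) not found in other recent checks")
    if not (diff.in_r_devel and diff.in_r_release):
        failed.append("(2) does not occur in both R-devel and R-release")
    if not diff.release_matches_cran:
        failed.append("(3) release result differs from CRAN")
    # Trivial bisect: commit/parent are the same as git bisect new/old.
    trivial = (diff.first_bad_commit == diff.bisect_new
               and diff.first_bad_parent == diff.bisect_old)
    if trivial:
        failed.append("(4) git bisect result is trivial")
    if not diff.present_in_master:
        failed.append("(5) issue is not in master")
    return failed


# Hypothetical example: bisect just returned its own new/old endpoints.
suspect = Difference(
    seen_in_previous_check=True, in_r_devel=True, in_r_release=True,
    release_matches_cran=True, first_bad_commit="new1", first_bad_parent="old1",
    bisect_new="new1", bisect_old="old1", present_in_master=True,
)
print(triage(suspect))
```

An empty list means all five checks passed and the difference is worth reporting; a non-empty list points at what to look into before opening or commenting on an issue.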
Here are some historical examples of breaking changes that have been allowed:
- editing exports in a way that affects blanket importers, with plenty of time to fix, https://github.com/Rdatatable/data.table/issues/6000#issuecomment-2040178462
- names(.SD) with downstream PR filed https://github.com/Rdatatable/data.table/issues/6033#issuecomment-2257477441
- change to index-related attributes, with downstream PR filed https://github.com/Rdatatable/data.table/issues/6349
- adding an argument which broke a revdep using partial matching, with downstream PR filed https://github.com/Rdatatable/data.table/issues/6098
A common trend in the examples above is that we create PRs for revdeps, and give plenty of time to revdeps to merge/fix, before we submit new data.table to CRAN.
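To see why adding an argument can break a revdep that relies on partial matching, the Python sketch below emulates a simplified form of R's argument name matching (exact match first, then a unique prefix). It is an emulation written for illustration only: the function and all argument names are hypothetical, they are not the ones involved in the issue linked above, and R's real rules have more cases (for example, arguments after `...` are not partially matched).

```python
def match_arg(supplied, formals):
    """Simplified emulation of R-style argument name matching."""
    if supplied in formals:
        return supplied  # an exact match always wins
    candidates = [name for name in formals if name.startswith(supplied)]
    if len(candidates) == 1:
        return candidates[0]  # a unique prefix is accepted
    if not candidates:
        raise TypeError(f"unused argument: {supplied!r}")
    raise TypeError(f"{supplied!r} matches multiple formal arguments: {candidates}")


# Hypothetical signatures before and after a new argument is added.
old_formals = ["x", "verbose"]
new_formals = ["x", "verbose", "verbatim"]

resolved = match_arg("verb", old_formals)  # unique prefix of "verbose"
try:
    match_arg("verb", new_formals)
    error_message = None
except TypeError as err:
    error_message = str(err)  # "verb" is now ambiguous
```

A downstream fix in such a case is typically to spell out the full argument name, which is the kind of small PR that can be filed against the revdep.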
- increasing consistency, which results in a breaking change, without at least one release that has a warning, https://github.com/Rdatatable/data.table/issues/6071#issuecomment-2258784483
- bringing code up to date with docs, which results in a breaking change, without at least one release that has a warning, https://github.com/Rdatatable/data.table/issues/6032