github: switch the python checker CI to use something that is maintained #14

Open · wants to merge 2 commits into master

Conversation

eli-schwartz (Member)

The python-lint action was last updated 3 years ago. It contains very old versions of the lint tools, including pre-1.0 versions of mypy. It doesn't allow installing dependencies for mypy to robustly typecheck.

Simply installing the tools ourselves and running them directly is simpler, shorter, easier, and provides better linting analysis. So, do so.

In the process, we can remove a # type: ignore from lddtree.py; argcomplete has typing info. So does pyelftools, but only in git master.

@vapier (Member) left a comment

i went with the existing actions even if they were a bit stale because i didn't want to screw with setting things up myself. actions/setup-python might simplify things enough though, and assuming pip takes too long, or costs too much, i wonder if using the GH disk cache would help with that. is there not a pre-existing action for running/caching pip on top of setup-python?

.github/workflows/python.yml
extra-black-options: ""
extra-mypy-options: ""
extra-isort-options: ""
python-version: '3.x'
Member

if we have control over versions, then we'd want to test a range of versions, starting with 3.6 if possible

Member Author

Lint tools are usually python version agnostic: they'll provide the same style violation reports regardless of which python version the linter tool is installed for.

The linter tools will, in some cases, implement different checks depending on the version that you want to be compatible with. For example:

mypy --python-version=3.11 lddtree.py

will tell mypy to type check against the 3.11 version of the standard library type checking stubs.

pylint --py-version=3.5 lddtree.py

will tell pylint to not e.g. suggest using an f-string.

Member

if that's reliable, that's fine, but it's also something we need in here

Member Author

So to be clear, do you want:

  • linters that support checking against multiple python versions, to do so?
    • lower and upper version bounds only? e.g. 3.6 and 3.12?
  • linters that support checking against multiple versions, to check against the lowest supported version to avoid suggesting issues that cannot be fixed until bumping the minimum version?

Member

for black/isort, we just need to tell them the min version to use as the syntax, so setting it to py3.6 alone is OK

for pylint/mypy, yes, we'd want to check multiple versions. ideally it'd be 3.6 as the min, but if that's onerous, i'd be OK raising this to py3.8. but not beyond that atm.

i guess i'd be OK with a min/max version, but seems like if we could do multiple versions, then it'd be easy to list them all inclusive. i.e. use github actions matrix settings.
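
For reference, a matrix along those lines might look roughly like the sketch below. The job layout, action versions, and version list here are illustrative, not taken from the actual workflow:

jobs:
  lint:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # illustrative range; the real list is whatever versions the project decides to support
        python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install black isort mypy pylint
      - run: pylint lddtree.py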

Member

i updated the requirement to python 3.8 now

i also added requirements.txt files to test here

@eli-schwartz (Member Author)

actions/setup-python might simplify things enough though, and assuming pip takes too long, or costs too much, i wonder if using the GH disk cache would help with that. is there not a pre-existing action for running/caching pip on top of setup-python?

I experimented with this. On average, installing dependencies with pip takes about 9 seconds, and adding a cache drops that to 5 seconds. We're in luck because all of them have available wheels on PyPI, so pip doesn't build anything from source.

If you'd like me to update the patch to use caching I can, but the benefits are negligible. On the other hand, there's no promise that in the future everything will still be available as a wheel.
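
(For context, the uncached install being timed above is essentially just a couple of steps along these lines; the package list is the set of tools discussed in this PR and is illustrative:)

- uses: actions/setup-python@v4
  with:
    python-version: '3.x'
# roughly 9 seconds uncached, about 5 seconds with a warm pip cache, per the numbers above
- run: pip install black isort mypy pylint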

@vapier (Member) commented Jan 7, 2024

If you'd like me to update the patch to use caching I can, but the benefits are negligible. On the other hand, there's no promise that in the future everything will still be available as a wheel.

i like being a good citizen when possible and not hammering OSS resources, even if it is SOP in the wider ecosystem.

plus, it seems fairly trivial to enable this ?
https://github.com/actions/setup-python#caching-packages-dependencies
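
(The built-in variant from that page is roughly the sketch below; note that, per the reply that follows, it expects a dependency file such as requirements.txt to already exist:)

- uses: actions/setup-python@v4
  with:
    python-version: '3.x'
    # built-in pip caching from actions/setup-python
    cache: 'pip'
- run: pip install -r requirements.txt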

@eli-schwartz (Member Author)

If you look at the above link: https://github.com/eli-schwartz/pax-utils/actions/runs/7435810723/job/20231562901

I included caching in there. The setup-python action does have caching, but it simply doesn't work because it will fatally error out if you don't have e.g. a requirements.txt file.

Using the cache action directly was trivial too. ;)

i like being a good citizen when possible and not hammering OSS resources, even if it is SOP in the wider ecosystem.

Sure, makes sense.

The python-lint action was last updated 3 years ago. It contains very
old versions of the lint tools, including pre-1.0 versions of mypy. It
doesn't allow installing dependencies for mypy to robustly typecheck.

Simply installing the tools ourselves and running them directly is
simpler, shorter, easier, and provides better linting analysis. So, do
so.

Another advantage is that we can now run all the linting tools, even if
an earlier one failed. As long as at least one step fails, the overall
status is still a failure.

In the process, we can remove a `# type: ignore` from lddtree.py;
argcomplete has typing info. So does pyelftools, but only in git master.
@eli-schwartz (Member Author)

I implemented most of the review comments so far, still waiting for clarity on the python versions one.

@vapier (Member) commented Jan 8, 2024

If you look at the above link: https://github.com/eli-schwartz/pax-utils/actions/runs/7435810723/job/20231562901

I included caching in there. The setup-python action does have caching, but it simply doesn't work because it will fatally error out if you don't have e.g. a requirements.txt file.

ideally we'd do everything through pyproject.toml now, but iiuc, we can't do the same thing as "just install build deps" with it yet. so i'm fine adding a requirements.txt file if that gets us caching. i think that'd also handle your desire to interleave packages & comments.

@eli-schwartz (Member Author)

A requirements file could be used to move information out of the pip install call. But the reason it's tied to cache: pip in setup-python is that they assume you have frozen all the versions and only test with those. Which means you don't get bugfixes to your linters, nor support for newer versions of python, etc.

That's why I preferred the manual cache, which:

  • downloads the most recently created cache and expands it to fill ~/.cache/pip
  • installs fresh packages that most likely come from the cache, but may come from PyPI if there's a new version since the last successful workflow run
  • when the workflow succeeds, creates a new cache during the post-run actions

This IMO is the right way to do caching... it's essentially a persistent directory across workflow runs, rather than just a private copy of pinned offline dependencies.
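
(A minimal sketch of that manual approach, assuming actions/cache and the lint-pip-${{ github.run_number }} key that shows up in the diff below; the action version is illustrative:)

- uses: actions/cache@v3
  with:
    path: ~/.cache/pip
    # run_number makes the primary key unique, so it never hits; restore-keys
    # then pulls in the most recent previous cache, and a fresh cache is saved
    # in the post-run step once the job succeeds.
    key: lint-pip-${{ github.run_number }}
    restore-keys: lint-pip-
- run: pip install black isort mypy pylint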

This verifies that lddtree.py etc. do not have any detectable lint
violations that are specific to only certain versions of python.

We use python 3.7 here, since that is the oldest version that mypy
supports.
@vapier (Member) commented Jan 8, 2024

isn't this an argument for having tighter pinning ?

pylint is notorious for churn across even minor versions (e.g. 2.x). patch versions might churn too, but i think it's much more uncommon, so i'd let that go (e.g. 2.7.x). so letting it float from e.g. 2.7 to 2.8 or even 3.0 means we'd have an unstable set of CI checks that we'd often be chasing down whenever someone wanted to make a new change. yes, it's a bit of a problem locally too when one dev has 2.7 and another has 3.0 and they see diff behavior, but this is what the CI helps settle -- if it's clean, then the code is OK. i'd rather not push these failures onto new people who show up with a drive-by PR. keeping these tools up-to-date is really just the pax-utils maintainers' problem and no one else's.

black is much more stable, but not guaranteed by design across their yearly releases. so i'd want to keep that from floating too much.

not so sure about isort or mypy. haven't had as much experience with those.

so requirements.txt would read like:

pylint == 3.0.*
black == 23.*

@eli-schwartz (Member Author)

My experience with keeping pylint, flake8, and mypy running across 77k lines of code without pinning is that most of the time upgraded versions don't change the results, and occasionally mypy reveals new bugs we care about.

It's usually very easy to apply fixes. It really only happens maybe a couple times a year? And usually there's at least one contributor that asks whether the CI failure is their fault -- we tell them "no, it fails on git master too, don't worry -- your PR is fine, we will separately fix the lint CI". A couple times we've waited a day or two to merge PRs until the CI is fixed, and a couple times we've just said "you know what, we have manually inspected the failure and verified that the only lint errors are the same on master and in the PR -- the PR is fine since it doesn't introduce new errors, let's merge it".

Overall I've come to the conclusion that pinning the versions doesn't actually help, no one is bothered by CI failures that aren't their fault. The only thing that using pinned versions would accomplish is end up adding tons of really noisy churn. For example when bots add weekly PRs that update the lockfile, so when you git log you have to page through dozens of entries in order to find actual code changes -- and forget all about viewing the log in github's UI. Half the time, projects like that are effectively being incredibly dishonest about their maintenance commitment, because the project is dead and unmaintained other than a bot that automerges lockfile updates.

@vapier (Member) commented Jan 8, 2024

since we're basically discussing anecdotes at this point, i'm not saying you're right or wrong, just that we have diff experiences, and thus we weigh our project maintenance priorities differently.

My experience with keeping pylint, flake8, and mypy running across 77k lines of code without pinning is that most of the time upgraded versions don't change the results, and occasionally mypy reveals new bugs we care about.

i agree that upgrades often catch new issues which is good, but my experience with my personal projects, various open source projects, and Chromium & ChromiumOS projects is the exact opposite: i've never had an upgrade that was "clean", and instead we had to do a lot of work to clean code up before we enabled the new version so that people weren't hit with a ton of new warnings that weren't their fault. this is pretty well documented in crbug.com & the CrOS issue tracker with O(100's) of bugs over the years, and that's just when we filed bugs to track things vs shaving the yaks directly.

It's usually very easy to apply fixes. It really only happens maybe a couple times a year? And usually there's at least one contributor that asks whether the CI failure is their fault -- we tell them "no, it fails on git master too, don't worry -- your PR is fine, we will separately fix the lint CI". A couple times we've waited a day or two to merge PRs until the CI is fixed, and a couple times we've just said "you know what, we have manually inspected the failure and verified that the only lint errors are the same on master and in the PR -- the PR is fine since it doesn't introduce new errors, let's merge it".

when things turn red and you start ignoring them, then it's trivial & common for real issues to start slipping in, and no one does anything about it because "it's someone else's problem". i see this constantly (weekly) in CrOS, and i've seen it in open source projects i've worked on. manual inspection does not save you from this.

asking people to fix things they didn't cause is risky as it's not uncommon for them to just walk away (i don't blame them). asking them to please wait while someone else addresses issues only works if someone else is available to jump on things immediately. that's not an SLO i have cycles to commit to.

Overall I've come to the conclusion that pinning the versions doesn't actually help, no one is bothered by CI failures that aren't their fault. The only thing that using pinned versions would accomplish is end up adding tons of really noisy churn. For example when bots add weekly PRs that update the lockfile, so when you git log you have to page through dozens of entries in order to find actual code changes -- and forget all about viewing the log in github's UI. Half the time, projects like that are effectively being incredibly dishonest about their maintenance commitment, because the project is dead and unmaintained other than a bot that automerges lockfile updates.

i'm not suggesting we unleash the automation to constantly bump pins. this would be something we do manually from time to time, and basically just the major versions. so O(yearly) for black, and like O(quarterly) for pylint.

this is why i suggested doing the pinning in e.g. requirements.txt, and binding to a known-to-be-stablish series. there are no lockfiles to commit, no bots to spam history, and the versions in the cache can semi-float when it expires.

@vapier (Member) commented Jan 9, 2024

i was talking with some teammates today about this, and @jackrosenthal was of the opinion that the --python-version flags were not that reliable with linters/typers, and we should go with installing & running against the real runtime. the matrix variable would make this easy enough to pull off.

i'm on the fence. i spot checked List[] vs list[] with mypy, and it seems to work correctly. my default python3 is 3.11, and mypy --python-version 3.8 flags list[] as unhashable. not sure how much that expands to other packages.

i trust running black just once as that parses & emits the syntax itself vs relying on the active runtime to do it all for it.

i'm open to merging with one ("current") version with --python-version flags and see how it works in practice.

@eli-schwartz (Member Author)

i'm on the fence. i spot checked List[] vs list[] with mypy, and it seems to work correctly. my default python3 is 3.11, and mypy --python-version 3.8 flags list[] as unhashable. not sure how much that expands to other packages.

This is because mypy is:

a) an implementation of type theory
b) a data bundle containing type stubs versioned against each version of the standard library, to adapt to the fact that the standard library adds new functions and removes old ones, or changes which types a given function permits (e.g. enhancing a function so that it allows more types of input)

The data bundle in question is the thing that changes between different --python-version values. The implementation of type theory doesn't have to be implemented in python at all, and in fact https://github.com/microsoft/pyright implements an alternative that happens to be written in typescript instead of python. Yet, pyright still uses the same data bundle that mypy uses (it is called typeshed, you can find it at https://github.com/python/typeshed).

- run: pylint --py-version=$PYTHON_MIN --output-format colorized $CHECK_FILES
- run: |
    mypy $CHECK_FILES
    # Also check against the oldest supported python
Member

nit: period at the end

env:
TERM: xterm-color
MYPY_FORCE_COLOR: 1
- run: black --check --diff --color $CHECK_FILES
Member

let's add --verbose so the log contains config info

MYPY_FORCE_COLOR: 1
- run: black --check --diff --color $CHECK_FILES
if: success() || failure()
- run: isort --check --diff --color $CHECK_FILES
Member

let's add --verbose so the log contains config info

extra-mypy-options: ""
extra-isort-options: ""
path: ~/.cache/pip
key: lint-pip-${{ github.run_number }}
Member

doesn't tying this to the run_number defeat the cache ?

wouldn't we want to cache based on the python version and hash of the relevant inputs ? in this case, requirements*.txt ?
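
(For illustration, that might look like the following; hashFiles is the standard workflow expression function, and matrix.python-version assumes the version matrix discussed earlier:)

- uses: actions/cache@v3
  with:
    path: ~/.cache/pip
    # cache key derived from the interpreter version and the pinned inputs
    key: lint-pip-${{ matrix.python-version }}-${{ hashFiles('requirements*.txt') }}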


@@ -5,24 +5,43 @@ name: Python

on: [push, pull_request]

env:
PYTHON_MIN: '3.7'
Member

i added a pyproject.toml file and moved tool versioning info to that, so i think we can drop this now
