Setup github action to aggregate info for account-recovery requests #4389

Open
wants to merge 9 commits into
base: main

@djwooten djwooten commented Jul 13, 2024

I recently opened an issue to recover my PyPI account, but I understand that there are very limited resources for dealing with a large volume of support requests.

I wanted to help by setting up a github action that can aggregate relevant info from public sources for account-recovery issues.

Specifically, my account maintains only one package, synergy, and the source code repository for that package is owned by my GitHub account: https://github.com/djwooten/synergy. It seems to me that requests like this could be easy to triage, since it's clear that I'm the owner of all of the packages my PyPI account manages.

So I set up an action that:

  1. Aggregates all packages maintained by the PyPI user.
  2. Determines the source code repository (or homepage) for each package using the PyPI API.
  3. Determines whether the GitHub user who opened the issue owns each repository.
  4. Writes a comment on the issue with a table summarizing the results.
  5. Adds a fasttrack label if the GitHub user owns all of the repositories for packages maintained by the PyPI user.
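
As a rough sketch of step 2, assuming the metadata comes from the `info` object of the PyPI JSON API (`https://pypi.org/pypi/<project>/json`): the helper name `extract_repo_url` and the list of preferred `project_urls` labels are illustrative assumptions, not the code in this PR.

```python
from typing import Optional

# Labels that often mark a source link in `project_urls`.
# This ordering is an assumption chosen for illustration.
PREFERRED_KEYS = ("source", "source code", "repository", "code", "github", "homepage", "home")


def extract_repo_url(info: dict) -> Optional[str]:
    """Pick the most likely repository (or homepage) URL from a PyPI `info` mapping."""
    project_urls = info.get("project_urls") or {}  # the field may be null
    # Normalize labels so "Source Code" and "source code" compare equal.
    by_label = {label.strip().lower(): url for label, url in project_urls.items() if url}
    for key in PREFERRED_KEYS:
        if key in by_label:
            return by_label[key]
    # Fall back to the legacy `home_page` field; empty strings become None.
    return info.get("home_page") or None
```

Packages that yield `None` here would correspond to the "No Repo" column in the tables below.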

You can see an example of this working in an issue on my fork.

I ran the code on all 462 current open issues with the account-recovery tag:

  • 70 did not maintain any packages on PyPI. These seem very low priority to me, as these users can just open a new account
  • 88 came from GitHub users who directly owned the repositories for all of the packages the PyPI account maintained
  • 26 listed PyPI usernames that did not actually exist
  • 1 came from an older form of the support template that couldn't be parsed

A few example tables:

Issue 4386
pypi_user: cgote
gh_user: gotec

| Package | Repository | Owner | Admin | Member | Unknown | No Repo |
| --- | --- | --- | --- | --- | --- | --- |
| git2net | https://gotec.github.io/git2net/ | X | | | | |
| gambit-disambig | https://github.com/gotec/gambit | X | | | | |

This would get the fasttrack label since all of cgote's packages point to repos owned by gotec, who issued the support request.
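
Note that git2net points at a GitHub Pages URL rather than a repository URL. A minimal sketch of how an account name could be read from either form is shown here; `github_owner_from_url` is a hypothetical helper, and treating `<owner>.github.io` as belonging to `<owner>` is an assumption about how such URLs should be handled.

```python
from typing import Optional
from urllib.parse import urlparse


def github_owner_from_url(url: str) -> Optional[str]:
    """Return the GitHub account a URL points at, or None for non-GitHub URLs."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in ("github.com", "www.github.com"):
        # Repository URLs look like github.com/<owner>/<repo>
        parts = [p for p in parsed.path.split("/") if p]
        return parts[0] if parts else None
    if host.endswith(".github.io"):
        # GitHub Pages sites look like <owner>.github.io/<site>/
        return host[: -len(".github.io")] or None
    return None


def same_account(owner: Optional[str], gh_user: str) -> bool:
    """GitHub logins are case-insensitive, so compare them lowercased."""
    return owner is not None and owner.lower() == gh_user.lower()
```

The returned name may be a user or an organization; telling an owner apart from an org admin or member would still need a GitHub API lookup, which this sketch leaves out.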

Issue 3117
pypi_user: KohnoseLami
gh_user: KohnoseLami

| Package | Repository | Owner | Admin | Member | Unknown | No Repo |
| --- | --- | --- | --- | --- | --- | --- |
| PayPayPy | https://github.com/SpecialAgency-Chat/paypaypy | | | X | | |
| Twitter-Frontend-API | https://github.com/KohnoseLami/Twitter_Frontend_API | X | | | | |

This shows that the GitHub user owns the repo for Twitter-Frontend-API, and is a member of the organization where the source code for PayPayPy is hosted. This wouldn't count among the 88 showing direct ownership, since they are only a member of that org, not an admin of it.

Issue 4359
pypi_user: lcampagn
gh_user: campagnola

Package Repository Owner Admin Member Unknown No Repo
pyqtgraph https://github.com/pyqtgraph/pyqtgraph X
teleprox http://github.com/campagnola/teleprox X
pycca http://github.com/lcampagn/pycca X
acq4 http://www.acq4.org X

The last two package URLs cannot be associated with the GitHub user campagnola.

Issue 4321
pypi_user: evindunn
gh_user: evindunn

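| Package | Repository | Owner | Admin | Member | Unknown | No Repo |
| --- | --- | --- | --- | --- | --- | --- |
| jinplate | | | | | | X |
| circuitpython-tzdb | https://github.com/evindunn/CircuitPython_tzdb.git | X | | | | |
| pytcp-message | https://github.com/evindunn/pytcp_message | X | | | | |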

This user actually does have a GitHub repo at https://github.com/evindunn/jinplate, but because it is not listed in the PyPI metadata for jinplate, it doesn't count as being owned.

def get_packages_by_user(username: str) -> list:
    """Parse html to get a list of packages for a given PyPI user.

    The pypi api does not provide a way to get a list of packages for a user, hence crawling the html.

Unless I'm wrong there is no api I can use to directly query a pypi user to determine the packages they maintain. Manually parsing the html to find it is not the most stable solution, but is working for now.
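
For illustration, here is one way such HTML parsing could look using only the standard library. It assumes that each package on the user's profile page is linked with a relative href of the form `/project/<name>/`; that markup, and the names `ProjectLinkParser` and `parse_packages_from_html`, are assumptions and may not match the PR's implementation.

```python
from html.parser import HTMLParser


class ProjectLinkParser(HTMLParser):
    """Collect package names from <a href="/project/<name>/"> links."""

    def __init__(self):
        super().__init__()
        self.packages = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parts = [p for p in href.split("/") if p]
        # Assumed link shape: /project/<name>/ (skip duplicates, keep page order)
        if len(parts) == 2 and parts[0] == "project" and parts[1] not in self.packages:
            self.packages.append(parts[1])


def parse_packages_from_html(html: str) -> list:
    """Return the package names linked from a profile page's HTML."""
    parser = ProjectLinkParser()
    parser.feed(html)
    return parser.packages
```

Keying on the link target rather than on CSS class names is a deliberate choice in this sketch, since class names seem more likely to change than URL paths, but either can break if the page layout changes.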

Comment on lines +162 to +168
# Count how many packages are not owned or administered by the user
num_unverified = len([row for row in package_ownership if row[2] > ORG_ADMIN])

if num_unverified == 0:
    label = "fasttrack"
else:
    label = ""

I don't know what your policy would call for here. This code considers repos to be owned by the user if they are directly owned, or if they belong to an organization that the github user is an admin for.
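
To make the policy question concrete, here is a small sketch of how the `row[2] > ORG_ADMIN` comparison can work when ownership levels are ordered. The `Ownership` enum, its member names other than `ORG_ADMIN`, the numeric values, and the `(package, repository, level)` row shape are all assumptions made for illustration.

```python
from enum import IntEnum


class Ownership(IntEnum):
    """Hypothetical ordered levels; lower values mean stronger evidence."""
    OWNER = 0
    ORG_ADMIN = 1
    ORG_MEMBER = 2
    UNKNOWN = 3
    NO_REPO = 4


def choose_label(package_ownership: list) -> str:
    """Return "fasttrack" only if every package is owned or org-administered."""
    if not package_ownership:
        # Assumption: accounts with no packages are not fast-tracked.
        return ""
    num_unverified = sum(1 for row in package_ownership if row[2] > Ownership.ORG_ADMIN)
    return "fasttrack" if num_unverified == 0 else ""
```

With this ordering, moving the threshold to `Ownership.OWNER` or `Ownership.ORG_MEMBER` would tighten or loosen the policy without touching the rest of the logic. Whether an empty package list should ever receive the label is worth deciding explicitly, since a bare "zero unverified packages" check would also be true for it.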

Comment on lines +44 to +48
BOT_NOTICE = (
    "### NOTE\n\n"
    "_This action was performed automatically by a bot and **does not guarantee account recovery**. Account recovery"
    " requires manual approval processing by the PyPI team._"
)

I think a notice like this is important so that people understand that the github action isn't actually able to recover anybody's account for them.
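
As a sketch of how the summary table and the notice might be combined into one comment body: `render_comment`, the `COLUMNS` tuple, and the `(package, repository, status)` row shape are hypothetical, and the notice text is passed in rather than assumed.

```python
# Status columns as they appear in the example tables.
COLUMNS = ("Owner", "Admin", "Member", "Unknown", "No Repo")


def render_comment(rows: list, notice: str) -> str:
    """Build a Markdown table with one X per row, followed by the notice."""
    header = "| Package | Repository | " + " | ".join(COLUMNS) + " |"
    divider = "|" + " --- |" * (2 + len(COLUMNS))
    lines = [header, divider]
    for package, repo, status in rows:
        # Mark only the column that matches this row's status.
        marks = ["X" if status == col else "" for col in COLUMNS]
        lines.append("| " + " | ".join([package, repo or ""] + marks) + " |")
    return "\n".join(lines) + "\n\n" + notice
```

Placing the notice after the table keeps the disclaimer visible in every bot comment, regardless of whether a label was applied.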

@max-sixty

(excellent idea @djwooten!)
