PHEP 4: PyHC Package Tiering #31

Open
wants to merge 9 commits into
base: main
from
112 changes: 112 additions & 0 deletions pheps/phep-9999.md
@@ -0,0 +1,112 @@
```
PHEP: 9999
Title: PyHC Package Tiering
Author: Julie Barnum <julie.barnum@lasp.colorado.edu> <https://orcid.org/0000-0001-8755-0694>
Discussions-To: https://github.com/heliophysicsPy/standards/pull/25
Revision: 1
Status: Draft
Type: Process
Content-Type: text/markdown; charset=UTF-8; variant=CommonMark
Created:
Post-History: 09-July-2024
```

# Abstract
<a name="abstract"></a>
This PHEP establishes a new tiering structure for PyHC projects, which will automatically affect PyHC packages once it goes into effect. Included herein is information on the requirements for each of the four new tiers of PyHC projects (Gold, Silver, Bronze, and Honorable Mention), as well as the benefits accrued at each tier.
Contributor

I believe "Honorable Mention" should be renamed to "Copper" here.

Author

Good catch, thanks!


# Motivation
<a name="motivation"></a>
Currently, PyHC is at a crossroads for how to push forward as a community. There are two main schools of thought (originating from bi-annual meeting discussions, telecon chats, and further sidebar conversation) with regard to what PyHC is and should be: 1) a basic interpretation where PyHC is a collection, and listing, of open-source Python packages with a relevance to Heliophysics and space physics, and 2) a standards-based interpretation where PyHC strives for compliance with our set standards, package interoperability, and standardization around one or more tools. There is utility and validity to both approaches. A new PyHC package tiering system is intended to find a "best of both worlds" between the two ideas. Older, out-of-date, unmaintained, or specific use-case code (e.g., associated with a publication) could still have a place for listing and findability, while the tiers also allow nuance between other packages that are more robust, trustworthy, maintained, and work toward the standards-based interpretation of being a PyHC package. Further, this tiering system gives users a clearer picture of what each PyHC package has to offer, and of the state of the package's condition and development. A PyHC package tiering system also provides justification for a myriad of benefits, for example, consideration for funding from a community travel fund, or extra help with improving a standards grouping grade.

For ease of commenting, I would recommend breaking these lines at 79 characters. There are also minor grammar errors that would be easier to scrub with suggestions if the line wasn't so long.

Substantive comments:

  • I think "specific use-case" is a poor word choice, since there are heavily used packages that just do one thing. Maybe "publication-specific" instead of "use-case specific" would be better, unless there is a wider scope here that you are looking to exclude.
  • I think the wording on the separation reasons needs some fine tuning, but I don't have a suggestion because I am not clear on the intent.

Author

Fair, I'll modify.

Contributor

I would also suggest swapping the rows and columns for the tables to help reduce line size.


I much prefer one line per sentence formatting for prose in git. git by default diffs per-line and a sentence should generally be a logical unit so if you change two parts of it a git conflict makes sense.


# Rationale
<a name="rationale"></a>
Decisions for tiering levels, requirements for each tier, and benefits accrued at each tier are based on conversations with the community (bi-annual meetings, telecons, etc.), and are listed here as a starting point for more discussion, likely to be refined in the future. Initially, ideas were presented to the community in a pyramid format. To make the differences between tiers more visible and understandable, they have been transformed into a spreadsheet format.

# PyHC Package Tiering Specifications
<a name="specification"></a>
There are four tiers proposed in this PHEP: Honorable Mention, Bronze, Silver, and Gold. See the table below for requirements associated with each tier:

| Tier | Self Evaluation Status | PyHC Standard Grades | pyOpenSci Review Status | PyHC Env Installation Conflicts | DOI | Interoperability Status | pip Installable? | conda Installable? |
Contributor

🤔 I'm a bit hesitant to include Interoperability Status in here, both because it's very hard to define and because our community is still working out what we mean when we say packages are interoperable.

✅ I like having PyHC Env Installation Conflicts in here since it partially addresses the interoperability issue while being well-defined and impactful.

Author

Definitely open to changing up the categories. Kind of threw in a "kitchen sink" thing here.

Contributor

I like the idea of linking this to the PyHC env. That would be easily testable. I would prefer not to have a definition of interoperability that requires a data object from one package be converted to a data object for another, since testing between each combination of core packages will grow rapidly.

Author

I'm going to remove the interoperability column, based on feedback.

Author

But keeping in the PyHC env conflicts

| :--: | :--------------------: | :------------------: | :---------------------: | :-----------------------------: | :-: | :---------------------: | :--------------: | :----------------: |
| **Gold** | Completed | Mostly green, some yellow allowed | Completed | No conflicts allowed | Required | Interoperable with all other PyHC core packages | Yes | Yes |
| **Silver** | Completed | Several yellow, no red | In Progress | A couple conflicts exist | Required | Interoperable with most PyHC core packages | Yes | No |
| **Bronze** | Completed | Red grades allowed | Not Completed | Major conflicts exist | Required | Interoperable with 1-2 PyHC core packages | Yes | No |
| **Copper** | Not Done | N/A | N/A | Major conflicts exist | Not required | Does not interoperate with core packages | No | No |

Descriptions for each heading are as follows:

Recommend including links to pages that explain how to do each of these things, to make it easier for new people to fulfill each requirement.

Author

Good idea, I'll add that/include in the "How to Teach This" section.

- Self Evaluation Status: indicates whether a package has completed a self evaluation against PyHC's standards
- PyHC Standard Grades: indicates status of each standards grouping within a package's self evaluation
Contributor

Should this be standards, or PHEPs, or "standards and their replacements?"

As I noted in #33, #34, #35 it would be nice to replace our standards columns on the package page with PHEP compliance. More below.

Then instead of standards grades it would be more "compliance with standards-track PHEPs" and something like:

  • Gold: Complies with all "must" and many "should" requirements from applicable standards-track PHEPs (potentially exclude the "should"...)
  • Silver: Complies with most "must" requirements from applicable standards-track PHEPs
  • Bronze: Complies with some "must" requirements from applicable standards-track PHEPs
  • Copper: N/A


I prefer the table approach so the requirements for each level are more specific and clearly laid out.

Author

If I am understanding correctly, in the end this would still be a kind of stoplight system like what we have now, but pointing to compliance to... PHEPs? I didn't see your issues before submitting my code here.

Contributor

Yep, stoplight system but the columns are PHEPs instead of categories of standards.

Author

I'm happy to go along with that. But, does that also mean this PHEP has to sit in limbo until those other complementary PHEPs get passed? @jtniehof

Author

Also, I like leaving the "many should requirements" in the Gold-level packages. We really should have only the cream of the crop and those putting in the effort to fully comply requiring our highest tier.

Contributor

Sorry, put this in a related but different discussion thread, so now subjecting you to copy/paste:
I wouldn't think we need to hold this up until all standards PHEPs are done. If we have something like "when new standards-track PHEPs are approved packages have 6 (12?) months to self-evaluate and update their tiers", then we could in theory approve this with no new standards lined up. There's going to be a transition period regardless.

- pyOpenSci Review Status: indicates status of a pyOpenSci review
Contributor

Is this anything unique to PyHC, or just being in a pyOpenSci review? If the latter, can this link the appropriate pyOpenSci page?

Author

It would be specifically for the PyHC-pyOpenSci pairing review process (checking against both pyOpenSci reqs + PyHC-specific reqs). No link for that, yet. That's part of why I'm trying to get the community to chat about what we'd need to define "yes, fits in with the PyHC" during the pyOpenSci process.

Contributor

Maybe explicitly say "to be defined in the future" or something? It feels a bit weird to be approving a standard that requires something that's not yet in place but I understand we can't do everything at once. Or this could be made more vague of "future collaborations" or something and the PHEP defining the pyOpenSci process would say "modifies PHEP 4 by adding...."


That kind of modification plan could work nicely. In that pathway, this PHEP would become the skeleton (most of) the other PHEPs would map to using a stoplight or yes/no system as appropriate.


For some items, we can flesh it out here and not wait for another PHEP. Others will need this approach or something similar.

Author

Okay, I'm going to modify the wording a bit here based on this feedback.

- PyHC Env Installation Conflicts: indicates state of installation conflicts within the PyHC software environment

I don't think this is very clear. What does a conflict mean here? I assume the only real way you get a conflict is that you require an old version of a package? Perhaps a better thing to say here would be a reference to #29 if that gets merged?

- DOI: indicates whether or not a package has a DOI (e.g., from Zenodo or a publication)
- Interoperability Status: indicates the level of interoperability a package has with PyHC core packages

I think this is going to need some defining ;)

Author

I simply will remove it. ;)

- pip Installable: indicates whether a package is pip installable
- conda Installable: indicates whether a package is conda installable

The following table shows the benefits that are associated with each tier:

| Tier | Summer School Inclusion | PyHC Software Env Inclusion | PyHC-Chat Bot Inclusion | pyOpenSci Verified badge | Standards Compliance Assistance | Listing on Main PyHC Project Page | Listing on Secondary PyHC Project Page | Software Search Interface Inclusion | Consideration for Conference Travel Funding |
| :--: | :---------------------: | :-------------------------: | :---------------------: | :----------------------: | :-----------------------------: | :-------------------------------: | :------------------------------------: | :---------------------------------: | :-----------------------------------------: |
| **Gold** | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
@nabobalis, Jul 9, 2024

Considering that I don't think any packages would be at a gold level (based on the current criteria in this PR), that would suggest no packages at a future summer school.

Maybe we should loosen this down to silver or bronze? Or maybe tweak the gold level requirements?

Author

True. Happy to modify that one.

| **Silver** | No | No | No | No | Yes | Yes | No | Yes | Yes |
| **Bronze** | No | No | No | No | No | Yes | No | Yes | No |
| **Honorable Mention** | No | No | No | No | No | No | Yes | Yes | No |

Descriptions for each heading are as follows:
- Summer School Inclusion: indicates whether a package will be included in summer school teaching materials
- PyHC Software Env Inclusion: indicates whether a package will be included within the PyHC software environment
- PyHC-Chat Bot Inclusion: indicates whether a package will have up-to-date information included within the ChatGPT4-powered PyHC-Chat bot

Recommend having these be something people can opt in or out of.

Author

Didn't even think of that, that's a good idea. I'll modify to indicate that that would be an optional perk.


I would recommend dropping this from the table completely. I am sure there will be lots of smaller things like this that projects will or won't get pulled into. This feels very frivolous to include in a very formal specification document.

Author

@Cadair I disagree that it's frivolous. As PyHC-Chat improves, inclusion within it could be a carrot—not the only, or biggest, but still a carrot—for people to work to move tiers.

- pyOpenSci Verified Badge: a badge that shows whether a package has completed the pyOpenSci review process
- Standards Compliance Assistance: indicates whether a package will receive extra help and/or advice from PyHC leadership in conforming to the PyHC standards

So, this seems counterintuitive to me. Shouldn't the packages that are less up-to-snuff receive more help? Perhaps a better way of splitting time would be having a pool of volunteers that could give time for things like code reviews and then receive code reviews from other people. That would be a nice way of creating more of a community environment in PyHC, but is not directly related to the tiers.

Author

Fair point. There's also an idea that packages who've done the work to be at a higher level get more help if there's extra funding for someone to poke around their code base (e.g. work through bug issues or solve long-standing open PRs).

Contributor

That's a good point! We should have community support mechanisms for packages to level up, as you suggest.

📌 I'm wondering if something like an annual unstructured Helio Hack Week (like Astro Hack Week) would provide good opportunities for us to support each other in this way.

Contributor

As a potential use case, having a path forward to help packages be pip/conda-installable would be a good support. A Hack Week could be useful here.

- Listing on Main PyHC Project Page: indicates whether a package's information will be displayed on a new, main PyHC Project page
- Listing on Secondary PyHC Project Page: indicates whether a package's information will be displayed on a new, secondary PyHC Project page

Would also be nice to just have a front page with a search bar that could let people find any project by name or by keyword.

Author

We technically have this with @sapols 's updates to the Project page, but it could be made more clear, or placed in a more obvious location?

- Software Search Interface Inclusion: indicates whether a package will be included within a Heliophysics software search interface
- Consideration for Conference Travel Funding: indicates whether developers from a package will be considered for travel funding assistance to relevant science conferences (e.g. SHINE, CEDAR, or GEM)
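
For tooling such as the website or a search interface, the benefits table could be encoded as data and queried per tier. The following is a minimal sketch that assumes the draft table exactly as written in this revision; `BENEFIT_COLUMNS`, `BENEFITS_BY_TIER`, and `benefits_for` are hypothetical names for illustration, not an existing PyHC API.

```python
# Column headings copied from the draft benefits table, in table order.
BENEFIT_COLUMNS = (
    "Summer School Inclusion",
    "PyHC Software Env Inclusion",
    "PyHC-Chat Bot Inclusion",
    "pyOpenSci Verified badge",
    "Standards Compliance Assistance",
    "Listing on Main PyHC Project Page",
    "Listing on Secondary PyHC Project Page",
    "Software Search Interface Inclusion",
    "Consideration for Conference Travel Funding",
)

# One row per tier; True corresponds to "Yes" in the draft table.
BENEFITS_BY_TIER = {
    "Gold": (True, True, True, True, True, True, False, True, True),
    "Silver": (False, False, False, False, True, True, False, True, True),
    "Bronze": (False, False, False, False, False, True, False, True, False),
    "Honorable Mention": (False, False, False, False, False, False, True, True, False),
}


def benefits_for(tier):
    """Return the benefit headings marked "Yes" for the given tier name."""
    row = BENEFITS_BY_TIER[tier]
    return [name for name, granted in zip(BENEFIT_COLUMNS, row) if granted]


for benefit in benefits_for("Silver"):
    print(benefit)
```

Keeping the table as a single mapping means a later revision of the benefits only has to change one row, and an unknown tier name fails loudly with a `KeyError` rather than silently returning nothing.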

Packages are evaluated against level of compliance with each requirement, as shown in the PyHC tiering chart. To be accepted for a tier, a package must meet **all** the requirements for said tier. Therefore, should a package fall into different tiers depending on the specific requirement, the package will be accepted at the lowest tier of requirements it meets. For example, if a package meets some requirements for the Silver tier, but other requirements only meet the Bronze tier, the package will be considered a Bronze tier package.
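
The lowest-tier rule described above can be sketched in a few lines of Python. This is an illustration only and not part of the PHEP: the `Tier` enum, the `overall_tier` function, and the example inputs are hypothetical, and the lowest tier is called Copper here as in the requirements table. Each requirement is mapped to the highest tier whose threshold the package meets for that requirement, and the package's tier is the minimum across all requirements.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Tiers ordered from lowest to highest (hypothetical encoding)."""

    COPPER = 0
    BRONZE = 1
    SILVER = 2
    GOLD = 3


def overall_tier(per_requirement):
    """Return the lowest tier among the per-requirement results.

    `per_requirement` maps a requirement heading to the highest tier
    whose threshold the package meets for that requirement.
    """
    if not per_requirement:
        raise ValueError("at least one requirement result is needed")
    return min(per_requirement.values())


# Hypothetical package: mostly Silver or better, but major conflicts
# in the PyHC environment only satisfy the Bronze threshold.
example = {
    "Self Evaluation Status": Tier.GOLD,
    "PyHC Standard Grades": Tier.SILVER,
    "pyOpenSci Review Status": Tier.SILVER,
    "PyHC Env Installation Conflicts": Tier.BRONZE,
    "DOI": Tier.GOLD,
    "pip Installable?": Tier.GOLD,
    "conda Installable?": Tier.SILVER,
}

print(overall_tier(example).name)
```

Using an ordered enum makes "lowest tier" a plain `min`, so adding or renaming a requirement column does not change the rule itself.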

Once the tier specifications are set, the PyHC website will be updated to reflect the new tiers. Next, packages will self-evaluate their PyHC package tier level. Similar to self-evaluation of standards grading, packages will then submit a PR to [the PyHC website GitHub](https://github.com/heliophysicsPy/heliophysicsPy.github.io), modifying their package entry to fall under a certain tier. From there, the PyHC Leadership team (currently the PI, Julie Barnum, and the PyHC Tech Lead, Shawn Polson) will give a final vote of approval and either merge the PR or begin a discussion on the PR with reasons for a tier regrade. Note that a package is allowed to move between tiers. If a package improves to meet the Silver tier requirements instead of Bronze, for example, its maintainers can submit a new PR to have the tier updated. The flip side is also true: should a package become defunct or drop in status, it may be downgraded to a lower tier. Packages will receive ample notification before this takes place (no less than three months' notice), with opportunity given to rectify any issues with their current tier level.
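
The three-month minimum notice can be turned into a concrete date with a little calendar arithmetic. The sketch below is illustrative only: `add_months` and `earliest_downgrade` are hypothetical helpers, and counting "three months" as three calendar months (clamped to the last day of a shorter month) is an assumption, since the PHEP does not say how the period is measured.

```python
import calendar
from datetime import date

NOTICE_MONTHS = 3  # minimum notice stated in the PHEP text


def add_months(start, months):
    """Add whole calendar months, clamping to the end of shorter months."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def earliest_downgrade(notice_sent):
    """Earliest date a downgrade could take effect after notice is sent."""
    return add_months(notice_sent, NOTICE_MONTHS)


print(earliest_downgrade(date(2024, 7, 9)))
```

If the community prefers a stricter reading, the clamping step could instead roll forward to the first day of the following month so that the notice period is never shortened.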


# Backwards Compatibility
<a name="backwards-compatibility"></a>
This PHEP does not propose a direct change to PyHC package code, only the inclusion or not of packages within the various tiers; it therefore introduces no compatibility concerns.

# Security Implications
<a name="security-implications"></a>
This PHEP raises no security implications as it does not interact with any executing code.

# How to Teach This
<a name="how-to-teach-this"></a>
This PHEP's contents and changes will be presented, discussed, and hacked on at various PyHC bi-weekly telecons and PyHC bi-annual meetings. Additionally, explanations of tiering, the process of obtaining a PyHC tier, etc. will be posted on the new main Projects page and communicated in a blog post on the PyHC Blog page.

# Rejected Ideas
<a name="rejected-ideas"></a>
None yet to note.

# Open Issues
<a name="open-issues"></a>
None yet to note.

# Footnotes
<a name="footnotes"></a>
None yet to note.

# Revisions
<a name="revisions"></a>
Revision 1 (pending): Initial draft.

# Copyright
<a name="copyright"></a>
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. It should be cited as:
```
@techreport{phep2,
  author = {Julie I. Barnum},
  title = {PyHC Package Tiering},
  year = {2024},
  type = {PHEP},
  number = {9999},
  doi = {10.5281/zenodo.xxxxxxx}
}
```