PHEP 4: PyHC Package Tiering #31

Open. jibarnum wants to merge 9 commits into base: main

jibarnum commented Jul 9, 2024

This PR proposes a new process PHEP to the PyHC. PHEP 4 establishes a new tiering structure for PyHC projects, which will automatically affect PyHC packages once it goes into effect. Included herein are the requirements for each of the four new tiers of PyHC projects (Gold, Silver, Bronze, and Honorable Mention), as well as the benefits accrued at each tier.

jibarnum self-assigned this Jul 9, 2024
jibarnum requested a review from Cadair July 9, 2024 20:40

jibarnum commented Jul 9, 2024

@jameswilburlewis @aburrell @rweigel @sandyfreelance @darrendezeeuw can't add you all as reviewers (I think I need to invite you to the PyHC org on GitHub first), but tagging you here for your awareness and comments.

jibarnum requested a review from jklenzing July 9, 2024 20:45

aburrell left a comment

First impressions.


# Motivation
<a name="motivation"></a>
Currently, PyHC is at a crossroads for how to push forward as a community. There are two main schools of thought, originating from bi-annual meeting discussions, telecon chats, and further sidebar conversation, with regards to what PyHC is and should be: 1) a basic interpretation where PyHC is a collection, and listing, of open-source Python packages with a relevance to Heliophysics and space physics, and 2) a standards-based interpretation where PyHC strives for compliance with our set standards, package interoperability, and standardization around one or more tools. There is utility and validity to both approaches. A new PyHC package tiering system is intended to find a "best of both worlds" with the two ideas. Older, out-of-date, unmaintained, or specific use-case code (e.g., associated with a publication) could still have a place for listing and findability, while also allowing nuance between other packages that are more robust, trustworthy, maintained, and work toward the standards-based interpretation of being a PyHC package. Further, this tiering system also allows users to get a clearer picture on what each PyHC package has to offer, and the state of the package's condition and development. Creation of a PyHC package tiering system also allows for justification for a myriad of benefits, for example, consideration for funding from a community travel fund, or extra help with improving a standards grouping grade.

For ease of commenting, I would recommend breaking these lines at 79 characters. There are also minor grammar errors that would be easier to scrub with suggestions if the line wasn't so long.

Substantive comments:

  • I think "specific use-case" is a poor word choice, since there are heavily used packages that just do one thing. Maybe "publication-specific" instead of "use-case specific" would be better, unless there is a wider scope here that you are looking to exclude.
  • I think the wording on the separation reasons needs some fine tuning, but I don't have a suggestion because I am not clear on the intent.

Author

Fair, I'll modify.

Contributor

I would also suggest swapping the rows and columns for the tables to help reduce line size.


I much prefer one-sentence-per-line formatting for prose in git. Git diffs per line by default, and a sentence is generally a logical unit, so if two changes touch different parts of the same sentence, a git conflict makes sense.

Comment on lines 32 to 34
| **Gold** | Completed | Mostly green, some yellow allowed | Completed | No conflicts allowed | Required | Interoperable with all other PyHC core packages | Yes | Yes |
| **Bronze** | Completed | Several yellow, no red | In Progress | A couple conflicts exist | Required | Interoperable with most PyHC core packages | Yes | No |
| **Silver** | Completed | Red grades allowed | Not Completed | Major conflicts exist | Required | Interoperable with 1-2 PyHC core packages | Yes | No |

Usually silver is higher than bronze

Author

Lack of sleep coding mistake. 🙃 Fixed.

pheps/phep-9999.md (outdated)
Descriptions for each heading are as follows:
- Summer School Inclusion: indicates whether a package will be included in summer school teaching materials
- PyHC Software Env Inclusion: indicates whether a package will be included within the PyHC software environment
- PyHC-Chat Bot Inclusion: indicates whether a package will have up-to-date information included within the ChatGPT4-powered PyHC-Chat bot

Recommend having these be something people can opt in or out of.

Author

Didn't even think of that, that's a good idea. I'll modify to indicate that that would be an optional perk.


I would recommend dropping this from the table completely. I am sure there will be lots of smaller things like this that projects will or won't get pulled into. This feels very frivolous to include in a very formal specification document.

Author

@Cadair I disagree that it's frivolous. As PyHC-Chat improves, inclusion within it could be a carrot—not the only, or biggest, but still a carrot—for people to work to move tiers.

- PyHC Software Env Inclusion: indicates whether a package will be included within the PyHC software environment
- PyHC-Chat Bot Inclusion: indicates whether a package will have up-to-date information included within the ChatGPT4-powered PyHC-Chat bot
- pyOpenSci Verified Badge: a badge that shows whether a package has completed the pyOpenSci review process
- Standards Compliance Assistance: indicates whether a package will receive extra help and/or advice from PyHC leadership in conforming to the PyHC standards

So, this seems counterintuitive to me. Shouldn't the packages that are less up-to-snuff receive more help? Perhaps a better way of splitting time would be having a pool of volunteers that could give time for things like code reviews and then receive code reviews from other people. That would be a nice way of creating more of a community environment in PyHC, but is not directly related to the tiers.

Author

Fair point. There's also an idea that packages who've done the work to be at a higher level get more help if there's extra funding for someone to poke around their code base (e.g. work through bug issues or solve long-standing open PRs).

Contributor

That's a good point! We should have community support mechanisms for packages to level up, as you suggest.

📌 I'm wondering if something like an annual unstructured Helio Hack Week (like Astro Hack Week) would provide good opportunities for us to support each other in this way.

Contributor

As a potential use case, having a path forward to help packages be pip/conda-installable would be a good support. A Hack Week could be useful here.

Comment on lines 62 to 63
- Listing on Main PyHC Project Page: indicates whether a package's information will be displayed on a new, main PyHC Project page
- Listing on Secondary PyHC Project Page: indicates whether a package's information will be displayed on a new, secondary PyHC Project page

Would also be nice to just have a front page with a search bar that could let people find any project by name or by keyword.

Author

We technically have this with @sapols's updates to the Project page, but perhaps it could be made clearer or placed in a more obvious location?

| **Silver** | Completed | Red grades allowed | Not Completed | Major conflicts exist | Required | Interoperable with 1-2 PyHC core packages | Yes | No |
| **Honorable Mention** | Not Done | N/A | N/A | Major conflicts exist | Not required | Does not interoperate with core packages | No | No |

Descriptions for each heading are as follows:

Recommend including links to pages that explain how to do each of these things, to make it easier for new people to fulfill each requirement.

Author

Good idea, I'll add that/include in the "How to Teach This" section.

<a name="specification"></a>
There are four tiers proposed in this PHEP: Honorable Mention, Bronze, Silver, and Gold. See the table below for requirements associated with each tier:

| Tier | Self Evaluation Status | PyHC Standard Grades | pyOpenSci Review Status | PyHC Env Installation Conflicts | DOI | Interoperability Status | pip Installable? | conda Installable? |
Contributor

🤔 I'm a bit hesitant to include Interoperability Status in here, both because it's very hard to define and because our community is still working out what we mean when we say packages are interoperable.

✅ I like having PyHC Env Installation Conflicts in here since it partially addresses the interoperability issue while being well-defined and impactful.

Author

Definitely open to changing up the categories. Kind of threw in a "kitchen sink" thing here.

Contributor

I like the idea of linking this to the PyHC env. That would be easily testable. I would prefer not to have a definition of interoperability that requires a data object from one package be converted to a data object for another, since testing between each combination of core packages will grow rapidly.
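
To put rough numbers on the growth mentioned above: if every pair of core packages needed its own conversion test, the count would be n(n-1)/2 for n packages (twice that if each direction is tested separately). A minimal Python sketch, where the package names are hypothetical placeholders and not actual PyHC core packages:

```python
from itertools import combinations
from math import comb


def pairwise_test_count(n_packages: int) -> int:
    """Return the number of unordered package pairs, i.e. n choose 2."""
    return comb(n_packages, 2)


# Hypothetical package names, for illustration only.
core_packages = ["pkg_a", "pkg_b", "pkg_c", "pkg_d"]

# Enumerate every pair that would need a data-object conversion test.
pairs = list(combinations(core_packages, 2))

for n in (4, 8, 16):
    print(f"{n} packages -> {pairwise_test_count(n)} pairwise tests")
```

By contrast, checking each package against a single shared environment needs only one check per package, which is why tying this criterion to the PyHC env scales better.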

Author

I'm going to remove the interoperability column, based on feedback.

Author

But keeping the PyHC env conflicts column.


| Tier | Summer School Inclusion | PyHC Software Env Inclusion | PyHC-Chat Bot Inclusion | pyOpenSci Verified badge | Standards Compliance Assistance | Listing on Main PyHC Project Page | Listing on Secondary PyHC Project Page | Software Search Interface Inclusion | Consideration for Conference Travel Funding |
| :--: | :---------------------: | :-------------------------: | :---------------------: | :----------------------: | :-----------------------------: | :-------------------------------: | :------------------------------------: | :---------------------------------: | :-----------------------------------------: |
| **Gold** | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
nabobalis commented Jul 9, 2024

Considering that I don't think any packages would be at a gold level (based on the current criteria in this PR), that would suggest no packages at a future summer school.

Maybe we should loosen this down to silver or bronze? Or maybe tweak the gold level requirements?

Author

True. Happy to modify that one.

jibarnum and others added 2 commits July 9, 2024 15:28
Changing Honorable Mention to Copper

Co-authored-by: Angeline Burrell <aburrell@users.noreply.github.com>
Contributor

jklenzing left a comment

Initial thoughts. I think this is a good step forward.


# Motivation
<a name="motivation"></a>
Currently, PyHC is at a crossroads for how to push forward as a community. There are two main schools of thought, originating from bi-annual meeting discussions, telecon chats, and further sidebar conversation, with regards to what PyHC is and should be: 1) a basic interpretation where PyHC is a collection, and listing, of open-source Python packages with a relevance to Heliophysics and space physics, and 2) a standards-based interpretation where PyHC strives for compliance with our set standards, package interoperability, and standardization around one or more tools. There is utility and validity to both approaches. A new PyHC package tiering system is intended to find a "best of both worlds" with the two ideas. Older, out-of-date, unmaintained, or specific use-case code (e.g., associated with a publication) could still have a place for listing and findability, while also allowing nuance between other packages that are more robust, trustworthy, maintained, and work toward the standards-based interpretation of being a PyHC package. Further, this tiering system also allows users to get a clearer picture on what each PyHC package has to offer, and the state of the package's condition and development. Creation of a PyHC package tiering system also allows for justification for a myriad of benefits, for example, consideration for funding from a community travel fund, or extra help with improving a standards grouping grade.
Contributor

I would also suggest swapping the rows and columns for the tables to help reduce line size.

<a name="specification"></a>
There are four tiers proposed in this PHEP: Honorable Mention, Bronze, Silver, and Gold. See the table below for requirements associated with each tier:

| Tier | Self Evaluation Status | PyHC Standard Grades | pyOpenSci Review Status | PyHC Env Installation Conflicts | DOI | Interoperability Status | pip Installable? | conda Installable? |
Contributor

I like the idea of linking this to the PyHC env. That would be easily testable. I would prefer not to have a definition of interoperability that requires a data object from one package be converted to a data object for another, since testing between each combination of core packages will grow rapidly.

- PyHC Software Env Inclusion: indicates whether a package will be included within the PyHC software environment
- PyHC-Chat Bot Inclusion: indicates whether a package will have up-to-date information included within the ChatGPT4-powered PyHC-Chat bot
- pyOpenSci Verified Badge: a badge that shows whether a package has completed the pyOpenSci review process
- Standards Compliance Assistance: indicates whether a package will receive extra help and/or advice from PyHC leadership in conforming to the PyHC standards
Contributor

As a potential use case, having a path forward to help packages be pip/conda-installable would be a good support. A Hack Week could be useful here.

- pyOpenSci Review Status: indicates status of a pyOpenSci review
- PyHC Env Installation Conflicts: indicates state of installation conflicts within the PyHC software environment
- DOI: indicates whether or not a package has a DOI (e.g., from Zenodo or a publication)
- Interoperability Status: indicates the level of interoperability a package has with PyHC core packages

I think this is going to need some defining ;)

Author

I simply will remove it. ;)

- Self Evaluation Status: indicates whether a package has completed a self evaluation against PyHC's standards
- PyHC Standard Grades: indicates status of each standards grouping within a package's self evaluation
- pyOpenSci Review Status: indicates status of a pyOpenSci review
- PyHC Env Installation Conflicts: indicates state of installation conflicts within the PyHC software environment

I don't think this is very clear. What does a conflict mean here? I assume the only real way you get a conflict is that you require an old version of a package? Perhaps a better thing to say here would be a reference to #29 if that gets merged?

Contributor

sapols commented Jul 11, 2024

Initial thoughts/issues:

  • Excited to see this draft! This is a great move for PyHC. Don't mean to focus on issues but will anyway to keep my comment short
  • Is it a typo that the first table ends with "Copper" instead of "Honorable Mention"?
  • It'd be helpful to add a hyperlink to the PyHC env to clarify which env we mean. Probably even a specific Docker image for extra clarity?
    • Although (and maybe this is a bigger question) do we need a new "PyHC env" to facilitate this? The purpose of the current one is to hold all PyHC packages, whereas this PHEP specifies only Gold-tier packages get inclusion in the env. (Which also begs the question how will packages know if they're compatible with the env if they're not included in it?)
  • Question: how will this affect "core" package status? Will "core" packages still exist, or does Gold-tier become the new "core"?

aburrell commented

@sapols "copper" was my suggestion, to keep with the medal terminology. It's the next medal after "bronze".

rebeccaringuette commented Aug 5, 2024

I would like to echo Shawn's comments on this PHEP being a great step forward for PyHC. Some comments:

  • Agree that benefit items like Python env inclusion and chat bot inclusion should only be available to the packages that have put in the effort to make their inclusion simple. Could allow silver packages for those items given justification (e.g. number of users, effort made for those items but gold level not achieved).
  • We need two versions of the PyHC environment to avoid creating an environment so large that no one wants to wait for it to install/load. Suggest allowing some bronze + all silver + all gold in an 'all-PyHC' env, assuming no installation conflicts and necessary effort completed. For trimmed down version, could restrict to some silver + all gold and include breadth of use in the eligibility requirements (e.g. used by a large number of people -> include in the trimmed down PyHC env). This trimmed down version would be what we start with for the PyHC summer school. Using this approach provides a more flexible method to include packages in the summer school and to include packages in the PyHC environments.
  • agreed that standards compliance assistance should be available to all upon request. How much assistance is made available will have to depend on the level of PyHC funding available, maybe some simple justification requirement, and other factors
  • also hesitant about including interoperability status. Would rather see metadata compliance status (thinking of interoperability with the HSSI effort) as a column in the table and a software license status (should exclude the NOSA license on the gold tier since it makes collaboration extremely difficult).
  • also agree on including the PyHC env installation conflicts in the table, with a modification to include two versions as above
  • package DOI should be for the software repository, not an associated publication. Could be a preference, not a requirement for now, and there should obviously be some exceptions (e.g. JOSS).
  • PyHC standard grades should not be determined by self-evaluation. If a self evaluation is an allowed method, there must be a complete review by PyHC leadership (e.g. Shawn) to confirm those compliance ratings.
  • need to specify the current PyHC env, or a version created in the last year, not just any past version
  • like the idea of the term 'core packages' being removed. much prefer gold instead.
  • need to make some funding available for packages to go through the pyOpenSci process (e.g. through ROSES B.20, maybe small amounts available through a PyHC NSF or NASA grant for that purpose).
  • Need to state a time frame for packages to submit the tier they best align with (and how they fulfill the requirements for that tier) to PyHC leadership after this goes into effect (e.g. 6 months to 1 year?).

Author

jibarnum commented Aug 26, 2024

@sapols thanks for your thoughts.

> Is it a typo that the first table ends with "Copper" instead of "Honorable Mention"?

No, as @aburrell pointed out, that was changed to keep with the "medal" terminology we used for the other categories.

> It'd be helpful to add a hyperlink to the PyHC env to clarify which env we mean. Probably even a specific Docker image for extra clarity? Although (and maybe this is a bigger question) do we need a new "PyHC env" to facilitate this? The purpose of the current one is to hold all PyHC packages, whereas this PHEP specifies only Gold-tier packages get inclusion in the env. (Which also begs the question how will packages know if they're compatible with the env if they're not included in it?)

I think we want to establish some specific environments for this. @rebeccaringuette had the interesting suggestion in her comment (below yours) re creating two environments. I'm thinking of some kind of split: Gold + Silver and then Gold + Silver + Bronze, for PyHC-top-tier and PyHC-all environments, respectively (happy for some help in workshopping that terminology).

> Question: how will this affect "core" package status? Will "core" packages still exist, or does Gold-tier become the new "core"?

I think this would make core go away, yes, leaving us with "Gold" as the highest level. It'd get confusing in my mind to delineate the differences between Gold and core. Further, we've always struggled to say what exactly it meant to be a core package, or how to become a core package (apart from a nod of approval from current leadership and core package maintainers).

Author

@rebeccaringuette thanks for your thoughts above!

> Agree that benefit items like Python env inclusion and chat bot inclusion should only be available to...

Indeed, I'm trying to make that a bit more clear in the soon-to-come commit.

> We need two versions of the PyHC environment to avoid creating an environment so large that no one wants to wait for it to install/load...

I like this thought. I'll include it. However, I do wonder how we intend to include the bronze categories, which allow some major conflicts to exist with installation into the software environment... thoughts?

> agreed that standards compliance assistance should be available to all upon request

I mostly agree. I think if you're already at Gold, you probably will only get assistance if you're in danger of dropping down a level.

> also hesitant about including interoperability status...

Yeah, I nixed that one. The metadata suggestion is good, though can you elaborate on how we would evaluate that?

> also agree on including the PyHC env installation

Yep.

> package DOI should be for the software repository,

Sure, that makes sense.

> PyHC standard grades should not be determined by self-evaluation...

Indeed, and thus the point of doing a pyOpenSci review process. But the self-evaluation is just step one to getting there. Shawn does also do a general review to make sure the grades are commensurate with the state of a repository.

> need to specify the current PyHC env...

Sure.

> like the idea of the term 'core packages'...

Same, I'm nixing that once (if) this PHEP goes into place.

> need to make some funding available for packages

For sure. First we need to get a good definition on what we want for PyHC-specific requirements for a pyOpenSci process to show we have the process in place and ready to go for packages.

> Need to state a time frame for packages to submit the tier they best align with

For sure, I need to include some wording on this. I don't want to wait too long, so perhaps 6 months is best. I'll find out soon if that's a terrible idea by how many tomatoes are thrown my way with the next commit. :)

Author

jibarnum commented Aug 27, 2024

Alright, all. Tried to catch and incorporate as many comments as I could. Please review and let me know what concerns/suggestions I didn't capture or have come up with the changes. Thanks!

| PyHC Standard Grades | Mostly green, some yellow | Several yellow, no red | A couple red | N/A |
| pyOpenSci Review Status | Completed | In progress | Not Started | Not Started |
| PyHC Env Installation Conflicts | No conflicts | A couple conflicts exist | Major conflicts exist | Major conflicts exist |
| HSSI Metadata Compliant | Fully Compliant | A couple issues exist | Major issues exist | Major issues exist |
Author

This needs a bit more defining. Hoping to get some input on that from you @rebeccaringuette if you have thoughts from the metadata perspective? Are we looking for packages to have a specific set of metadata included at time of tier submission?

Sure.

  • Copper: Name of package, link to repository.
  • Bronze: + DOI, license, description, (publisher, publication year, authors -> needed to create a DOI), mandatory fields for HSSI*
  • Silver: + most recommended fields for HSSI*
  • Gold: + all recommended and some optional fields for HSSI*
    *to be determined

Ideally, the PyHC package submission form should incorporate HSSI metadata fields.
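
As a rough illustration of how the cumulative field lists above might be checked at submission time, here is a minimal Python sketch. The field keys (`repository_url`, `publication_year`, and so on) are hypothetical names chosen for this example, and the HSSI mandatory, recommended, and optional fields are left out entirely because they are still to be determined, so the sketch can only distinguish Copper from the Bronze-level baseline:

```python
# Hypothetical field keys; the thread lists these fields only in prose.
COPPER_FIELDS = {"name", "repository_url"}

# Bronze adds to Copper. The HSSI mandatory fields are omitted here
# because they have not been defined yet.
BRONZE_FIELDS = COPPER_FIELDS | {
    "doi",
    "license",
    "description",
    "publisher",
    "publication_year",
    "authors",
}


def metadata_level(metadata: dict) -> str:
    """Classify a metadata record by which required fields are non-empty."""
    present = {key for key, value in metadata.items() if value}
    if BRONZE_FIELDS <= present:
        return "Bronze baseline met"
    if COPPER_FIELDS <= present:
        return "Copper"
    return "Below Copper"


example = {"name": "example_pkg", "repository_url": "https://example.org/repo"}
print(metadata_level(example))
```

Silver and Gold would presumably extend the same pattern with further field sets once the HSSI lists are settled.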

Author

That is perfect, thank you!

Descriptions for each heading are as follows:
- Summer School Inclusion: indicates whether a package will be included in summer school teaching materials
- PyHC-top-tier Env Inclusion: indicates whether a package will be included within the current PyHC software environment used at the summer school (also included within env in Science Platforms Coordination group???)
- PyHC-all Env Inclusion: indicates whether a package will be included within the current PyHC software environment containing all packages Bronze and higher. (Does this make sense based on the ability or not of a package to be installed in a common software env???)
Author

That question in the parentheses is for you @sapols :D

Contributor

sapols commented Aug 27, 2024

Yes it makes sense. I'd say we can remove this question 🙂 Or replace it with a parenthetical like (provided the package is not the cause of a dependency conflict). It's currently possible to put ALL PyHC packages in the same env so no package is causing a conflict yet, but I recognize that could change in the future for lower-tier packages.

Author

Well, one thing with the Silver and Bronze levels is that they are allowed to have installation conflicts. Maybe that should change then? Though, are there other kinds of issues that can occur with integrating a package into the environment other than an installation conflict? @sapols

Author

Never mind on that one, I made a change to the previous chart that makes it make more sense. ha


Descriptions for each heading are as follows:
- Summer School Inclusion: indicates whether a package will be included in summer school teaching materials
- PyHC-top-tier Env Inclusion: indicates whether a package will be included within the current PyHC software environment used at the summer school (also included within env in Science Platforms Coordination group???)
Author

That question in the parentheses is once again for you @sapols :D

Contributor

I'd actually say the env being made for the Science Platforms Coordination group is out of scope here. It's not an explicitly PyHC thing; we're not even including all core PyHC packages in that env.

Author

Makes sense. Will remove.

| pyOpenSci Review Status | Completed | In progress | Not Started | Not Started |
| PHEP 3 Compliant? | Yes | Yes | Yes | No |
| HSSI Metadata Compliant? | Yes | Mostly | Partially | Bare Minimum Met |
| Software License | Fully compliant and excludes NOSA license | Fully compliant, allows NOSA license | Has non-recommended license (e.g., GPL) | Has no license |

I am not sure how I feel about relegating GPL packages to the third tier when they could be excellent in all other regards. I am aware of why we don't recommend copyleft licenses, but sometimes it can't be helped. GPL (etc.) are very good licenses which people can have legitimate reasons for using. If we really had a package for which that was its only "bad" grade, would we really want to relegate it to effectively the lowest tier when it's a real possibility the authors would be unable to fix it?

Author

I'm open to changing this. In your mind, what would the levels be @Cadair ?

Contributor

I might lump GPL and NOSA in the same category...agreed that making GPL "worse" than NOSA seems a bit much.

I would be fine with Jon's idea on this. So, NOSA + copyleft licenses would be bronze tier? Silver and gold would both require a recommended license (not to include those)?

Author

@jibarnum jibarnum Sep 11, 2024

Yeah, these suggestions work for me, I'll update to allowing them at Silver, but not Gold, tier.

| pyOpenSci Review Status | Completed | In progress | Not Started | Not Started |
| PHEP 3 Compliant? | Yes | Yes | Yes | No |
| HSSI Metadata Compliant? | Yes | Mostly | Partially | Bare Minimum Met |
| Software License | Fully compliant and excludes NOSA license | Fully compliant, allows NOSA license | Has non-recommended license (e.g., GPL) | Has no license |

I am not convinced that we should be listing packages without a license at all.

The use case that comes to mind for this would be code associated with a publication, particularly in cases where the publisher demands a separate DOI for the code. Then again, should those be included in PyHC?

Author

Good points... I feel like this is a topic to talk with the community as a whole about.

Contributor

"No license" would mean nobody actually has the right to distribute or use the code. Suggest "non open source license" as the lowest tier.

@rebeccaringuette rebeccaringuette Sep 6, 2024

I'm torn here, and this is related to another discussion above on the licenses. I agree that this would greatly benefit community discussion. Do we already have packages that have no license?

Author

I think what I'll do for now is put "non open source license" for copper tier. But I have a list of questions for the community to bring up at the next telecon (we're discussing PHEPs then).

The current/(new?) license row doesn't look right. The gold level should not include GPL or NOSA licenses.

Author

Yep, I accidentally did an incorrect paste when modifying things. Fixing now.

Thanks.

jibarnum and others added 2 commits September 4, 2024 12:22
quick update on pull #

Co-authored-by: Jon Niehof <jtniehof@gmail.com>
@jtniehof
Contributor

Should there be an explicit closes #30 on this?

@rstoneback

NASA funding is already requiring that software proposals satisfy PyHC standards. Did NASA check with us before adding that to funding announcements? Does applying PyHC standards in funding announcements comport with APA standards and U.S. agency rule making? What standards level is going to apply to NASA funding? Gold, silver, bronze, or copper?

Incidentally, my interest level in providing free labor to NASA, in the form of standards or otherwise, is quite low.

@rebeccaringuette

That is a discussion to have with HDRL and NASA HQ once this gets settled. My initial thoughts are to require bronze as a minimum for software packages starting out. This sets the bar low, but still requires basic FAIR (e.g. DOI, license, pip for reusability, PyHC env for interoperability, and similar). Proposals from a bronze package (or copper) could alternatively ask for funds to improve the level to silver or gold in a detailed manner, e.g. the pyOpenSci review process.

@rebeccaringuette

Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields*, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields*.
*See HSSI metadata schema for details.
In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).

@rstoneback

rstoneback commented Sep 13, 2024

> That is a discussion to have with HDRL and NASA HQ once this gets settled.

I disagree. If NASA wants to set the standards then they should set the standard. It should also be applied not just to heliophysics, but to Earth, Planetary, and Astrophysics divisions. If NASA wants to use the PyHC standards then PyHC sets the standard, not NASA. I will repeat however that I think it is inappropriate for NASA to use the results of unfunded labor.

@rebeccaringuette

> > That is a discussion to have with HDRL and NASA HQ once this gets settled.
>
> I disagree. If NASA wants to set the standards then they should set the standard. It should also be applied not just to heliophysics, but to Earth, Planetary, and Astrophysics divisions. If NASA wants to use the PyHC standards then PyHC sets the standard, not NASA. I will repeat however that I think it is inappropriate for NASA to use the results of unfunded labor.

The PyHC standards apply only to software relevant to Heliophysics and written in or run from Python, nothing more, and cannot be applied across NASA's divisions or even other software in Heliophysics.
Concerning the funding comment, the PyHC standards are mentioned as conditions on NASA funding opportunities, particularly the HTM call, so the requirement is not unfunded. I don't recall at the moment if it is mentioned on other calls. Since PyHC is now moving to tiered standards, the conversation between PyHC leadership, HDRL leadership and NASA HQ will likely be about which tier to set as a minimum standard for an updated version of those funding calls, assuming that HQ decides to change the wording of that AO and others at all. The choice of which tier a given proposal adheres to (and how it intends to do so) may instead be left to the proposal submitter, subject to the scrutiny of the proposal reviewers.

@jibarnum
Author

> Should there be an explicit closes #30 on this?

Yes, I'd say so!

@rebeccaringuette

> Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields*, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields*. *See HSSI metadata schema for details. In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).

@jibarnum

@jibarnum
Author

> Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields*, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields*. *See HSSI metadata schema for details. In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).

Sure. I just went with what you'd said earlier for each level. I can update. I feel the HSSI metadata schema will require a url. Do we have one at the moment?

@jibarnum jibarnum linked an issue Sep 13, 2024 that may be closed by this pull request
@jibarnum
Author

jibarnum commented Sep 13, 2024

@rstoneback since HTM calls often closely align with the PyHC, and to the end of not siloing efforts, NASA made the choice to include our standards in their calls (to my knowledge, this is just for HTM). I was asked about wording for this, and provided what is shown therein. NASA could, in theory, go off and write their own things, but I suppose why reinvent the wheel if not necessary?

Like @rebeccaringuette said, it will require some discussion with NASA on whether they want to update AO calls to match the new process we have, and if so, to what level. I'm not convinced it's appropriate to define here which level NASA funding calls will adhere to. That's outside the scope of this PHEP, and it would be wrong for us to levy that requirement on NASA since we're... not NASA.

I empathize with the funding concerns. The HTM call, albeit small at the moment, does have room for package maintenance funding requests. I strongly believe updating to better align with new PyHC tiering/PHEPs for standards would be a legitimate funding request. If enough packages are submitting those kinds of requests, that may even encourage NASA to start putting more money behind that (crosses fingers).

@rebeccaringuette

> > Also in the table, the HSSI row needs some work. Recommended copper = all mandatory fields, bronze = all mandatory and some recommended fields*, silver = all mandatory and recommended fields, gold = all mandatory and recommended fields plus some optional fields*. *See HSSI metadata schema for details. In addition to metadata fields, gold and silver level packages should have priority consideration in contributing to the controlled vocabularies used by HSSI. These packages would also be eligible, subject to review, to manage those controlled vocabularies in a rotating fashion under HSSI metadata leadership. Contributing to the controlled vocabularies should be a requirement for gold level packages (e.g. are we missing anything).
>
> Sure. I just went with what you'd said earlier for each level. I can update. I feel the HSSI metadata schema will require a url. Do we have one at the moment?

No, and likely not for a few months. We will need some tech support before that is available.

@rebeccaringuette

What is this group's opinion on shifting the conda installation requirement to the silver level? It would simplify installation in the PyHC environment, especially on Heliocloud, but would such a requirement be too formidable a hurdle at the silver level, so that it should only apply at the gold level, or is it a simple enough task to include at the silver level? Note that pip installation is required at the bronze level.

@nabobalis

> What is this group's opinion on shifting the conda installation requirement to the silver level? It would simplify installation in the PyHC environment, especially on Heliocloud, but would such a requirement be too formidable a hurdle at the silver level, so that it should only apply at the gold level, or is it a simple enough task to include at the silver level? Note that pip installation is required at the bronze level.

For me, this should be at the bronze level.

@sapols
Contributor

sapols commented Sep 18, 2024

I'll note that I intend to submit a proposal to hire a student developer whose sole job (at first) is to help PyHC packages join conda. No promises on how soon that could happen though, of course. I could buy conda installation being a silver-level thing if enough devs agree, but bronze is too low (as much as I'd love to do that, bronze just isn't realistic).

@nabobalis

nabobalis commented Sep 18, 2024

Unless you have compiled code, creating a conda-forge recipe is no more difficult than setting up the Python packaging required to get on PyPI.

So for me, it should be at the same level as pip.

@jibarnum
Author

> Unless you have compiled code, creating a conda-forge recipe is no more difficult than setting up the Python packaging required to get on PyPI.
>
> So for me, it should be at the same level as pip.

There are a few PyHC core packages not yet on conda (e.g. SpacePy IIRC @jtniehof ). It'd be good to hear from them on what the blockers are before deciding to relax the requirement down to silver or bronze.

@rebeccaringuette

rebeccaringuette commented Sep 19, 2024 via email

@nabobalis

Maybe if a package is pure Python, it should be bronze, but for more complex packages we bump that to silver?

But that might be too in the weeds for a rule or requirement.

@rebeccaringuette

Since the standards landscape in PyHC is in flux, I suggest removing the PyHC standards grading row and instead asking all other PHEPs to determine what compliance looks like for each package level. That way, we don't have to renegotiate this PHEP for every change. The summation of those descriptions can be added to a summary document each time a new PHEP is approved. In my opinion, the "some", "most" and "all" terms currently on this row are too squishy to really be a standard. On the other hand, it could also be desirable for packages to choose which items of a list of standards to completely comply with based on their own package needs. Or, such considerations would ideally be incorporated into the descriptions of compliance for each PHEP and package level. Maybe some combination of the two ideas would be good, but consider this a push for more concreteness for this row.

All items except the pyOpenSci review process seem easy enough for quick checks to be implemented once passed. That seems to be a different PHEP needed.

One important missing component here is the level of contribution allowed and activity supported by a given package, and how that characteristic is imagined to be different for different package levels. This will likely require a custom review per package to confirm.

It also seems that the technical steering committee should be described in more detail in another PHEP, such as how to become a member of that (election vs service requirement?), what the requirements are to be on that committee (e.g. silver level?), any desired restrictions (one member per package at a time can run for election / be required to serve), rotations (2 years? half gets re-elected one year, the other half to be reelected the next year), and so on.

It would be nice to add that the self-assessment / PR activity described in the implementation section would be supported by a hackathon at a spring/fall PyHC meeting, although that may be too much in the weeds.

One thing we should recognize here that others have pointed out is the likely future multiplicity of PyHC software environments. As PyHC matures and our packages grow further in complexity, it may not be possible much longer for all packages to be installable in a single environment. I find it likely that there will be a PyHC software environment purposed for the summer school that drives continued improvements towards interoperability for that purpose, while there are other 'flavors' of PyHC environments directed towards a given analysis goal (e.g. mission pipeline development vs data analysis) or even categorized by sciences (e.g. solar vs ITM). This is yet to be determined, but for now the PyHC env row could be changed to refer to the PyHC software environment designed for the summer school since that is an effort that I expect to be more persistent than the other ideas. These ideas also seem to call for a change in the benefits table, which may be as simple as changing 'PyHC-all' to "a PyHC" software environment, and allowing the package to choose which one (other than the top-tier one).

Other missing factors here are test coverage and working documentation examples, but those seem better suited to a pyOpenSci review process. However, how would we require that the documentation examples keep working over time? If someone sees that a silver package's documentation does not work, then their opinion of all silver level packages will decrease, so there is some level of reputation and upkeep to factor in somewhere. Is that part of the pyOpenSci process? If so, maybe that component can be used as a way to judge maintenance?

@jtniehof
Contributor

jtniehof commented Oct 5, 2024

> Maybe if a package is pure Python, it should be bronze, but for more complex packages we bump that to silver?
>
> But that might be too in the weeds for a rule or requirement.

I do think it's reasonable that being on conda-forge is a requirement at a higher level than being on PyPI; the appropriate breakpoint is up for discussion. PyPI is pretty much essential and conda-forge may not be more difficult but it's an additional thing.

To Rebecca's suggestion of deferring a lot of the specifics to additional PHEPs, #35 has some discussion on packaging standards. At this point we have no standards PHEPs, so in theory we could rewrite this to be "how future PHEPs declare the way things fit into this system" and not have to backfill on anything.

Successfully merging this pull request may close these issues.

Adopt PHEP(s) on core projects or project levels
10 participants