Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MDANCE to MDAnalysis #155

Merged
merged 11 commits into from
Aug 2, 2024
Merged

Add MDANCE to MDAnalysis #155

merged 11 commits into from
Aug 2, 2024

Conversation

lexin-chen
Copy link
Contributor

Hi MDAnalysis team!

Hope this message finds you well. MDANCE is a flexible n-ary clustering algorithm for Molecular Dynamics trajectories. We currently have a fully deterministic k-means clustering (NANI) included a pipeline for determining the number of clusters in the system, ways to identify better cluster representatives, etc. We will be adding more clustering algorithms to this package after publishing them. In conversations with @orbeckst , he suggested that we should try to incorporate this project as an MDAkit, so we are looking forward to collaborating with you.

Best,
Lexin

@orbeckst orbeckst self-assigned this Jul 11, 2024
@orbeckst
Copy link
Member

Hi @lexin-chen , great to see your MDANCE package here! I'll start reviewing and may ask others from the team for help.

@orbeckst
Copy link
Member

orbeckst commented Jul 11, 2024

I am following the MDAKits reviewer guide

Checklist

PR

  • Are the automatic checks all passing?
  • Did the docs build correctly?
  • Is the metadata file named correctly and in the right location?

MDAKit home

  • link ok?
  • summary?
  • docs?
  • installation instructions?
  • license?

Metadata file

Required entries

  • project_name
  • authors
  • maintainers
  • description
  • keywords
  • licence (really, we are using British spelling?!)
  • project_home
  • documentation_home
  • documentation_type

Optional entries but required in practice

  • install, src_install, import_name, python_requires: all automatically checked, ok when CI is green
  • run_tests: automatic but sanity check
  • mdanalysis_requires: manually double check

Optional entries

  • development_status
  • publications
  • changelog

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very neat package and the two papers are already showing that some serious work went into it. You fulfill almost all of our criteria.

The main thing that we need is a basic set of tests that run some of the functionality and it should go beyond importing the package (although that can be one of the tests).

For most of our projects we include a directory tests in the project (see, for instance, gridData/tests for something that's simpler than the MDAnalysis tests) and in there a tests/datafiles directory (again, see gridData/tests/datafiles) with a way to access the data files (example: gridData/tests/datafiles/__init__.py). We then add a test_<NAME>.py file for each module or each functionality that we want to test. We use pytest as our recommended testing framework but that's ultimately your choice. In our MDAKits Tutorial we also talk about tests; for gridData and example is test_dx.py.

I would recommend that you include a small test trajectory in your package and then write individual tests based on your examples:

Once you have tests, you should be able to run

pytest -v

at the top of your repo and automatically run all tests. If this works, you can then add to the metadata file

run_tests:
  - pytest --pyargs mdance

If you add tests as a PR to your repository https://github.com/mqcomplab/MDANCE we may also be able to give advice.

(We recently held a joint MDAnalysis/MolSSI workshop with an emphasis on packaging. The https://github.com/MDAnalysis/mdgeomkit is a simple example kit that you can look at to learn more about testing.)

@orbeckst
Copy link
Member

Just to add to my review: We are not looking for complete tests of everything. This is hard to do. Instead we'd like to see a few tests that run some of your typical functions. This is typically something that you are already doing in the examples. You then add an appprpriate assert statement at the end of the test function to assert that the test output is the same as a correct answer that you stored in the tests.

@orbeckst
Copy link
Member

orbeckst commented Jul 11, 2024

Also, even though I included gridDataFormats as a real-life example for testing, you might be better off just looking at https://github.com/MDAnalysis/mdgeomkit/tree/main/mdgeom/tests, which uses no outdated constructs... when I started my review I only thought about mdgeomkit half-way through writing...

@lexin-chen
Copy link
Contributor Author

Thanks @orbeckst for the comprehensive comments! I'll definitely look into the tests and add support for this soon!

@orbeckst
Copy link
Member

Hi @lexin-chen , do you need help with anything? It would be great to have the package registered before the UGM!

@lexin-chen
Copy link
Contributor Author

lexin-chen commented Jul 31, 2024

Thanks for checking up with me. I have written some tests on the tools and inputs folder. Once I work on the tests for nani I will submit it prob by the beginning of next week!

EDIT: okay I have created tests for the mdance/tools, mdance/inputs, and mdance/cluster/nani and the db analysis as per your suggestions! Thanks again!

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great that you added tests. I haven't had time to install mdance and try the tests locally. I think there are different changes to be made in the file so that tests can run. You should be able to just accept the changes so that can we see if it works.

mdakits/mdance/metadata.yaml Outdated Show resolved Hide resolved
mdakits/mdance/metadata.yaml Outdated Show resolved Hide resolved
lexin-chen and others added 2 commits July 31, 2024 20:36
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent that you added tests. I also tried them locally and they are clearly working. ✅

The one current failure in the workflow is not a real show-stopper but it would be easy to fix: Add a tag for your releases in GitHub; the easiest way to do this is to create an initial release in https://github.com/mqcomplab/MDANCE/releases . This is needed so that we can identify your latest released version of the code. You should run the command

git describe --tags `git rev-list --tags --max-count=1`

and it should return something like v0.2.5 or 0.2.5.

At the moment it does not find anything and reports the error fatal: No names found, cannot describe anything.

If you don't want to add tags at the moment then let me know and I will approve. Just know that this will probably lead to some red "failing" badges.

Btw, I see that you're defining the version in your pyproject.toml. We found it extremely convenient to define the version through the tag itself and then use versioningit to include the version from the tag into the package. In this way, there's exactly one place where the version is defined.

@orbeckst orbeckst added the new create a new MDAKit label Aug 1, 2024
Because tests are included in the package, we can 
try to run them with pytest and without cloning the source.
This reverts commit 90ffdd2.

pytest --pyargs mdance does not work because ./tests is not included
in the package. We need to check out the source.
@lexin-chen
Copy link
Contributor Author

Hi @orbeckst , thanks for the great suggestions! as per your advice, I did create tags and release and now I have dynamic versioning as part of the package. Is there a way to recheck for the GitHub actions since I have no changes to the metadata file?

@orbeckst
Copy link
Member

orbeckst commented Aug 2, 2024

Excellent!

I just restarted the failed job — I can do this through the GH actions interface.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lexin-chen this is all looking excellent. Thank you so much for working with me and really taking my suggestions on board.

I will merge this PR and then MDANCE is officially registered! Congratulations 🎉 !

@orbeckst orbeckst merged commit fc17d9e into MDAnalysis:main Aug 2, 2024
5 checks passed
@orbeckst
Copy link
Member

orbeckst commented Aug 2, 2024

MDANCE is officially registered!

@lexin-chen
Copy link
Contributor Author

Thanks so much for your help and tips @orbeckst! I learned a lot from this experience too. Very glad it worked out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new create a new MDAKit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants