Skip to content

Latest commit

 

History

History
76 lines (52 loc) · 4.17 KB

CONTRIBUTING.md

File metadata and controls

76 lines (52 loc) · 4.17 KB

Contributing guide

Probatus aims to provide a set of tools that can speed up common workflows around validating regressors & classifiers and the data used to train them. We're very much open to contributions but there are some things to keep in mind:

  • Discuss the feature and implementation you want to add on Github before you write a PR for it. On disagreements, maintainer(s) will have the final word.
  • Features need a somewhat general use case. If the use case is very niche it will be hard for us to consider maintaining it.
  • If you’re going to add a feature, consider if you could help out in the maintenance of it.
  • When issues or pull requests are not going to be resolved or merged, they should be closed as soon as possible. This is kinder than deciding this after a long period. Our issue tracker should reflect work to be done.

That said, there are many ways to contribute to Probatus, including:

  • Contribution to code
  • Improving the documentation
  • Reviewing merge requests
  • Investigating bugs
  • Reporting issues

Starting out with open source? See the guide How to Contribute to Open Source and have a look at our issues labelled good first issue.

Setup

Development install:

pip install -e '.[all]'

Unit testing:

pytest

We use pre-commit hooks to ensure code styling. Install with:

pre-commit install

Now if you install it (which you are encouraged to do), you are encouraged to do the following command before committing your work:

pre-commit run --all-files

This will allow you to quickly see if the work you made contains some adaptions that you still might need to make before a pull request is accepted.

Standards

  • Python 3.9+
  • Follow PEP8 as closely as possible (except line length)
  • google docstring format
  • Git: Include a short description of what and why was done, how can be seen in the code. Use present tense, imperative mood
  • Git: limit the length of the first line to 72 chars. You can use multiple messages to specify a second (longer) line: git commit -m "Patch load function" -m "This is a much longer explanation of what was done"

Code structure

  • Model validation modules assume that trained models passed for validation are developed in a scikit-learn framework (i.e. have predict_proba and other standard functions), or follow a scikit-learn API e.g. XGBoost.
  • Every python file used for model validation needs to be in /probatus/
  • Class structure for a given module should have a base class and specific functionality classes that inherit from base. If a given module implements only a single way of computing the output, the base class is not required.
  • Functions should not be as short as possible in terms of lines of code. If a lot of code is needed, try to put together snippets of code into other functions. This make the code more readable, and easier to test.
  • Classes follow the probatus API structure:
    • Each class implements fit(), compute() and fit_compute() methods. fit() is used to fit an object with provided data (unless no fit is required), and compute() calculates the output e.g. DataFrame with a report for the user. Lastly, fit_compute() applies one after the other.
    • If applicable, the plot() method presents the user with the appropriate graphs.
    • For compute() and plot(), check if the object is fitted first.

Documentation

Documentation is a very crucial part of the project because it ensures usability of the package. We develop the docs in the following way:

  • We use mkdocs with mkdocs-material theme. The docs/ folder contains all the relevant documentation.
  • We use mkdocs serve to view the documentation locally. Use it to test the documentation everytime you make any changes.
  • Maintainers can deploy the docs using mkdocs gh-deploy. The documentation is deployed to https://ing-bank.github.io/probatus/.