Releases · KaveIO/PhiK

Version 0.12.4, Jan 2024

Add support for Python 3.12.

ENH: added plotting kwargs to correlation_report function.
#58

FIX: fix of bin edge values they are rounded with 1e-14
#60

FIX: numpy random multinomial requires integer number of samples (for nixOS)
#73

FIX: pandas deprecation warning
#74

Drop support for Python 3.7, has reached end of life.

What's Changed

Bump actions/download-artifact from 2 to 3 by @dependabot in #53

Bump actions/upload-artifact from 2 to 3 by @dependabot in #52

Add Valgrind to CICD by @RUrlus in #54

Bump pypa/gh-action-pypi-publish from 1.5.0 to 1.5.1 by @dependabot in #57

Python 3.11 support by @RUrlus in #62

Version 0.12.0, July 2021

C++ Extension


Phi_K contains an optional C++ extension to compute the significance matrix using the `hypergeometric` method
(also called the`Patefield` method).

Note that the PyPi distributed wheels contain a pre-build extension for Linux, MacOS and Windows.

A manual (pip) setup will attempt to build and install the extension, if it fails it will install without the extension.
If so, using the `hypergeometric` method without the extension will trigger a
NotImplementedError.

Compiler requirements through Pybind11:

- Clang/LLVM 3.3 or newer (for Apple Xcode's clang, this is 5.0.0 or newer)
- GCC 4.8 or newer
- Microsoft Visual Studio 2015 Update 3 or newer
- Intel classic C++ compiler 18 or newer (ICC 20.2 tested in CI)
- Cygwin/GCC (previously tested on 2.5.1)
- NVCC (CUDA 11.0 tested in CI)
- NVIDIA PGI (20.9 tested in CI)


Other
~~~~~

* You can now manually set the number of parallel jobs in the evaluation of Phi_K or its statistical significance
  (when using MC simulations). For example, to use 4 parallel jobs do:

  .. code-block:: python

    df.phik_matrix(njobs = 4)
    df.significance_matrix(njobs = 4)

  The default value is -1, in which case all available cores are used. When using ``njobs=1`` no parallel processing
  is applied.

* Phi_K can now be calculated with an independent expectation histogram:

  .. code-block:: python

    from phik.phik import phik_from_hist2d

    cols = ["mileage", "car_size"]
    interval_cols = ["mileage"]

    observed = df1[["feature1", "feature2"]].hist2d()
    expected = df2[["feature1", "feature2"]].hist2d()

    phik_value = phik_from_hist2d(observed=observed, expected=expected)

  The expected histogram is taken to be (relatively) large in number of counts
  compared with the observed histogram.

  Or can compare two (pre-binned) datasets against each other directly. Again the expected dataset
  is assumed to be relatively large:

  .. code-block:: python

    from phik.phik import phik_observed_vs_expected_from_rebinned_df

    phik_matrix = phik_observed_vs_expected_from_rebinned_df(df1_binned, df2_binned)

* Added links in the readme to the basic and advanced Phi_K tutorials on google colab.
* Migrated the spark example Phi_K notebook from popmon to directly using histogrammar for histogram creation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Version 0.12.4, Jan 2024

What's Changed

Contributors

Version 0.12.2, Mar 2022

Version 0.12.0, July 2021

Releases: KaveIO/PhiK

v0.12.4

Version 0.12.4, Jan 2024

v0.12.3

What's Changed

Contributors

v0.12.2

Version 0.12.2, Mar 2022

v0.12.1

v0.12.0

Version 0.12.0, July 2021