merge dev into main (#267)

* [pre-commit.ci] pre-commit autoupdate
  updates:
  - [github.com/hadialqattan/pycln: v2.3.0 → v2.4.0](hadialqattan/pycln@v2.3.0...v2.4.0)
  - [github.com/psf/black: 23.10.1 → 23.11.0](psf/black@23.10.1...23.11.0)
* [pre-commit.ci] pre-commit autoupdate
  updates:
  - [github.com/PyCQA/isort: 5.12.0 → 5.13.2](PyCQA/isort@5.12.0...5.13.2)
  - [github.com/psf/black: 23.11.0 → 23.12.1](psf/black@23.11.0...23.12.1)
* [pre-commit.ci] pre-commit autoupdate
  updates:
  - [github.com/asottile/pyupgrade: v3.15.0 → v3.15.1](asottile/pyupgrade@v3.15.0...v3.15.1)
  - [github.com/psf/black: 23.12.1 → 24.2.0](psf/black@23.12.1...24.2.0)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Removing references to final score
* Fixing linting
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Improving code coverage (hopefully)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Making pretty_print a non-private method for code coverage
* Fixing merge conflicts
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Undoing accidental change
* Update lint.yml to use actions/checkoutv4
  Signed-off-by: Jim-smith <jim-smith@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Update sphinx-docs.yml to use actions/checkoutv4
  Signed-off-by: Jim-smith <jim-smith@users.noreply.github.com>
* Update test.yml to use actions/checkoutv4
  Signed-off-by: Jim-smith <jim-smith@users.noreply.github.com>
* Update tests.ymlto use actions/checkoutv4
  Signed-off-by: Jim-smith <jim-smith@users.noreply.github.com>
* Update likelihood_attack.py
  Signed-off-by: Jim-smith <jim-smith@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* update dependencies (#262)
* update syntax for numpy>1.2
* update syntax for newer scikit-learn
* update syntax for newer xgboost
* update syntax for newer scikit-learn
* update install dependencies
* remove support for python3.8
* make DP test as approx equal
* test python 3.12
* revert python 3.12 changes
* update tests CI
* add github action autoupdater
* safemodel package installation optional
* fix pylint warning
* refactor tests
* make common test module relative
* fix pylint warnings
* [pre-commit.ci] pre-commit autoupdate (#261)
  updates:
  - [github.com/pre-commit/pre-commit-hooks: v4.5.0 → v4.6.0](pre-commit/pre-commit-hooks@v4.5.0...v4.6.0)
  - [github.com/asottile/pyupgrade: v3.15.1 → v3.15.2](asottile/pyupgrade@v3.15.1...v3.15.2)
  - [github.com/psf/black: 24.2.0 → 24.4.0](psf/black@24.2.0...24.4.0)
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Richard Preen <rpreen@gmail.com>
* fix codecov reporting (#265)
* fix codecov reporting
* fix codecov reporting
* fix codecov reporting
* add new datasets (#257)
* Clean up uncessary code, added back RDMP
* Revert "Clean up uncessary code, added back RDMP"
  This reverts commit 0d179dc.
* Removed texas, added back rdmp
* Removing texas dataset from tests
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Fixing merge conflicts
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Updating tests to cover new tests
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Removing RDMP test until the dataset can be loaded in
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Signed-off-by: yolaj-nhs <127405111+yolaj-nhs@users.noreply.github.com>
Co-authored-by: yolaj-nhs <yola.jones@nhs.scot>
Co-authored-by: yolaj-nhs <127405111+yolaj-nhs@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Richard Preen <rpreen@gmail.com>

---------

Signed-off-by: Jim-smith <jim-smith@users.noreply.github.com>
Signed-off-by: yolaj-nhs <127405111+yolaj-nhs@users.noreply.github.com>
Co-authored-by: Jim-smith <jim-smith@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: yolaj-nhs <yola.jones@nhs.scot>
Co-authored-by: albacrespi <acrespi001@dundee.ac.uk>
Co-authored-by: yolaj-nhs <127405111+yolaj-nhs@users.noreply.github.com>
AI-SDC · Apr 24, 2024 · 7c0bcdd
1 parent a83746e
Showing 57 changed files with 641 additions and 655 deletions.
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
@@ -0,0 +1,11 @@
+---
+# Dependabot configuration.
+
+version: 2
+
+updates:
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "monthly"
+...
diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml
@@ -14,13 +14,12 @@ jobs:
 
     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
 
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install pylint pytest pytest-cov
-          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+          pip install .[test] pylint
 
       - name: pylint
         run: |

diff --git a/.github/workflows/sphinx-docs.yml b/.github/workflows/sphinx-docs.yml
@@ -17,12 +17,12 @@ jobs:
 
     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
 
       - name: Run sphinx
         run: |
-          pip install -r ./docs/requirements.txt
-          sphinx-build -D todo_include_todos=0 ./docs/source ./docs/_build/html/
+          pip install .[doc]
+          sphinx-build ./docs/source ./docs/_build/html/
 
       - name: Commit changes
         run: |

diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
@@ -14,18 +14,21 @@ jobs:
 
     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
 
       - name: Install
         run: |
-          python -m pip install --upgrade pip
-          pip install pytest pytest-cov
-          pip install .
+          pip install --upgrade pip
+          pip install .[test]
 
-      - name: pytest and report coverage
+      - name: Generate coverage report
         run: |
           pytest --cov=./ --cov-report=xml
-          curl -Os https://uploader.codecov.io/latest/linux/codecov
-          chmod +x codecov
-          ./codecov
+
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        with:
+          fail_ci_if_error: true
+          token: ${{ secrets.CODECOV_TOKEN }}
+          verbose: true
 ...
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -11,20 +11,20 @@ jobs:
       fail-fast: false
       matrix:
         os: [ubuntu-latest, macos-latest, windows-latest]
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.9", "3.10", "3.11"]
 
     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
 
       - name: Setup Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
 
-      - name: Install and pytest
-        run: pip install . pytest
+      - name: Install
+        run: pip install .[test]
 
-      - name: Run pytest
+      - name: Run Tests
         run: pytest .
 ...
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -8,7 +8,7 @@ repos:
 
   # Standard hooks
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.5.0
+    rev: v4.6.0
     hooks:
       - id: check-added-large-files
       - id: check-merge-conflict
@@ -33,27 +33,27 @@ repos:
 
   # Autoremoves unused imports
   - repo: https://github.com/hadialqattan/pycln
-    rev: "v2.3.0"
+    rev: "v2.4.0"
     hooks:
       - id: pycln
         stages: [manual]
 
   # Sort includes
   - repo: https://github.com/asottile/pyupgrade
-    rev: v3.15.0
+    rev: v3.15.2
     hooks:
       - id: pyupgrade
 
   # Upgrade old Python syntax
   - repo: https://github.com/PyCQA/isort
-    rev: 5.12.0
+    rev: 5.13.2
     hooks:
       - id: isort
         args: ["--profile", "black"]
 
   # Black format Python and notebooks
   - repo: https://github.com/psf/black
-    rev: 23.10.1
+    rev: 24.4.0
     hooks:
       - id: black-jupyter
 

diff --git a/README.md b/README.md
@@ -33,60 +33,52 @@ A collection of user guides can be found in the 'user_stories' folder of this re
 
 Documentation is hosted here: https://ai-sdc.github.io/AI-SDC/
 
-## Quick Start
+## Installation / End-user
 
-### Development
-
-Clone the repository and install the dependencies (safest in a virtual env):
+[![PyPI package](https://img.shields.io/pypi/v/aisdc.svg)](https://pypi.org/project/aisdc)
 
-```
-$ git clone https://github.com/AI-SDC/AI-SDC.git
-$ cd AI-SDC
-$ pip install -r requirements.txt
-```
+Install `aisdc` (safest in a virtual env) and manually copy the [`examples`](examples/) and [`example_notebooks`](example_notebooks/).
 
-Then run the tests:
+To install only the base package, which includes the attacks used for assessing privacy:
 
 ```
-$ pip install pytest
-$ pytest .
+$ pip install aisdc
 ```
 
-Or run an example:
+To install the base package and the safemodel package, which includes defensive wrappers for popular ML frameworks including [scikit-learn](https://scikit-learn.org) and [Keras](https://keras.io):
 
 ```
-$ python -m examples.lira_attack_example
+$ pip install aisdc[safemodel]
 ```
 
-### Installation / End-user
+## Running
 
-[![PyPI package](https://img.shields.io/pypi/v/aisdc.svg)](https://pypi.org/project/aisdc)
+To run an example, simply execute the desired script or start up `jupyter notebook` and run one of the notebooks.
 
-Install `aisdc` (safest in a virtual env) and manually copy the `examples` and `example_notebooks`.
+For example, to run the `lira_attack_example.py`:
 
 ```
-$ pip install aisdc
+$ python -m lira_attack_example
 ```
 
-Then to run an example:
+## Development
+
+Clone the repository and install the local package including all dependencies (safest in a virtual env):
 
 ```
-$ python attribute_inference_example.py
+$ git clone https://github.com/AI-SDC/AI-SDC.git
+$ cd AI-SDC
+$ pip install .[test]
 ```
 
-Or start up `jupyter notebook` and run an example.
-
-Alternatively, you can clone the repo and install:
+Then run the tests:
 
 ```
-$ git clone https://github.com/AI-SDC/AI-SDC.git
-$ cd AI-SDC
-$ pip install .
+$ pytest .
 ```
 
 ---
 
-
 This work was funded by UK Research and Innovation under Grant Numbers MC_PC_21033  and MC_PC_23006 as part of Phase 1 of the DARE UK (Data and Analytics Research Environments UK) programme (https://dareuk.org.uk/), delivered in partnership with Health Data Research UK (HDR UK) and Administrative Data Research UK (ADR UK). The specific projects were Semi-Automatic checking of Research Outputs (SACRO -MC_PC_23006) and   Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMATTER - MC_PC_21033). This project has also been supported by MRC and EPSRC [grant number MR/S010351/1]: PICTURES.
 
 <img src="docs/source/images/UK_Research_and_Innovation_logo.svg" width="20%" height="20%" padding=20/> <img src="docs/source/images/health-data-research-uk-hdr-uk-logo-vector.png" width="10%" height="10%" padding=20/> <img src="docs/source/images/logo_print.png" width="15%" height="15%" padding=20/>
diff --git a/aisdc/attacks/attack_report_formatter.py b/aisdc/attacks/attack_report_formatter.py
@@ -261,8 +261,9 @@ def process_dict(self):
 
         output = {}
 
-        msg = "Final score (scale of 0-5, where 0 is least disclosive and 5 is recommend rejection)"
-        output[msg] = summarised_score
+        # msg = "Final score (scale of 0-5, where 0 is least disclosive and 5 is recommend
+        # rejection)"
+        # output[msg] = summarised_score
 
         return output
 
@@ -455,17 +456,6 @@ def __init__(self):
         self.support_rejection = []
         self.support_release = []
 
-    def _pretty_print(self, report: dict, title) -> str:
-        """Function that formats JSON code to make it more readable for TREs."""
-
-        returned_string = str(title) + "\n"
-
-        for key in report.keys():
-            returned_string = returned_string + key + "\n"
-            returned_string = returned_string + pprint.pformat(report[key]) + "\n\n"
-
-        return returned_string
-
     def _process_target_json(self):
         """Function that creates a summary of a target model JSON file."""
 
@@ -513,6 +503,17 @@ def _process_target_json(self):
 
         self.text_out.append(output_string)
 
+    def pretty_print(self, report: dict, title) -> str:
+        """Function that formats JSON code to make it more readable for TREs."""
+
+        returned_string = str(title) + "\n"
+
+        for key in report.keys():
+            returned_string = returned_string + key + "\n"
+            returned_string = returned_string + pprint.pformat(report[key]) + "\n\n"
+
+        return returned_string
+
     def process_attack_target_json(
         self, attack_filename: str, target_filename: str = None
     ):
@@ -541,7 +542,7 @@ def process_attack_target_json(
             self.support_rejection += returned[1]
             self.support_release += returned[2]
 
-        output_string = self._pretty_print(output, "ATTACK JSON RESULTS")
+        output_string = self.pretty_print(output, "ATTACK JSON RESULTS")
 
         self.text_out.append(output_string)
 

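For readers who want to try the formatting logic outside the class, the renamed `pretty_print` method can be approximated as a free function. This is a minimal sketch, assuming only what the hunk above shows (a title line, then each key followed by its `pprint.pformat` rendering and a blank line); the standalone form and the use of `str.join` are illustrative choices, not the project's code.

```python
import pprint


def pretty_print(report: dict, title: str) -> str:
    """Return the title, then each key with its pretty-printed value."""
    # Title on its own line, matching str(title) + "\n" in the diff.
    parts = [f"{title}\n"]
    for key, value in report.items():
        # Key on one line, value rendered by pprint, then a blank line.
        parts.append(f"{key}\n{pprint.pformat(value)}\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(pretty_print({"auc": 0.5, "notes": ["a", "b"]}, "ATTACK JSON RESULTS"))
```

Unlike the method in the diff, the f-string here also accepts non-string keys; with string keys the output should be identical.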
diff --git a/aisdc/attacks/likelihood_attack.py b/aisdc/attacks/likelihood_attack.py
@@ -1,4 +1,5 @@
 """Likelihood testing scenario from https://arxiv.org/pdf/2112.03570.pdf."""
+
 # pylint: disable = invalid-name
 # pylint: disable = too-many-branches
 
@@ -63,7 +64,7 @@ def _logit(p: float) -> float:
     If p is close to 0 or 1, evaluating the log will result in numerical instabilities.
     This code thresholds p at EPS and 1 - EPS where EPS defaults at 1e-16.
     """
-    if p > 1 - EPS:
+    if p > 1 - EPS:  # pylint:disable=consider-using-min-builtin
         p = 1 - EPS
     p = max(p, EPS)
     li = np.log(p / (1 - p))

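The pylint suppression above silences the suggestion to replace the `if` with `min`. As a sketch of what that suggestion amounts to, the clamping described in the docstring can be written with both builtins. The name `logit_clamped` is hypothetical, `EPS = 1e-16` is taken from the docstring's stated default, and `math.log` stands in for `np.log` only to keep the example dependency-free.

```python
import math

EPS = 1e-16  # assumed default, as stated in the docstring in the hunk above


def logit_clamped(p: float) -> float:
    """Logit of p, with p first clamped to [EPS, 1 - EPS]."""
    p = min(p, 1 - EPS)  # upper clamp: the form pylint suggests
    p = max(p, EPS)      # lower clamp: already written this way in the diff
    return math.log(p / (1 - p))


if __name__ == "__main__":
    for value in (0.0, 0.5, 1.0):
        print(value, logit_clamped(value))
```

The clamp keeps both `p` and `1 - p` strictly positive, so the logarithm stays finite at the endpoints.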
diff --git a/aisdc/attacks/multiple_attacks.py b/aisdc/attacks/multiple_attacks.py
@@ -3,6 +3,7 @@
 and attribute inference attack using a single configuration file
 with multiple attack configuration.
 """
+
 from __future__ import annotations
 
 import argparse

diff --git a/aisdc/attacks/report.py b/aisdc/attacks/report.py
@@ -1,4 +1,5 @@
 """Code for automatic report generation."""
+
 import abc
 import json
 import os

diff --git a/aisdc/attacks/structural_attack.py b/aisdc/attacks/structural_attack.py
@@ -119,6 +119,8 @@ def get_unnecessary_risk(model: BaseEstimator) -> bool:
 
     elif isinstance(model, XGBClassifier):
         n_estimators = model.n_estimators
+        if n_estimators is None:
+            n_estimators = 1000
         if (
             (
                 max_depth > 3.5

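The added guard substitutes 1000 when `model.n_estimators` is `None`, presumably because an `XGBClassifier` can be constructed without that parameter being set. A minimal sketch of the same fallback in isolation follows; the helper name `resolve_n_estimators` and the constant name are hypothetical, and only the value 1000 comes from the hunk above.

```python
from typing import Optional

# Fallback value taken from the diff; the constant name is illustrative.
FALLBACK_N_ESTIMATORS = 1000


def resolve_n_estimators(n_estimators: Optional[int]) -> int:
    """Return n_estimators, or the fallback when it was left unset."""
    # Compare against None explicitly so a legitimate 0 is not replaced.
    if n_estimators is None:
        return FALLBACK_N_ESTIMATORS
    return n_estimators


if __name__ == "__main__":
    print(resolve_n_estimators(None), resolve_n_estimators(50))
```

Resolving the value before the comparisons avoids a `TypeError` from comparing `None` with a number in the risk checks that follow.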
diff --git a/aisdc/attacks/worst_case_attack.py b/aisdc/attacks/worst_case_attack.py
@@ -437,9 +437,9 @@ def _get_global_metrics(self, attack_metrics: list) -> dict:
                 0.5, m["n_pos_test_examples"], m["n_neg_test_examples"]
             )
 
-            global_metrics[
-                "null_auc_3sd_range"
-            ] = f"{0.5 - 3*auc_std:.4f} -> {0.5 + 3*auc_std:.4f}"
+            global_metrics["null_auc_3sd_range"] = (
+                f"{0.5 - 3*auc_std:.4f} -> {0.5 + 3*auc_std:.4f}"
+            )
             global_metrics["n_sig_auc_p_vals"] = self._get_n_significant(
                 auc_p_vals, self.p_thresh
             )
@@ -603,9 +603,9 @@ def _get_attack_metrics_instances(self) -> dict:
             attack_metrics_instances["instance_" + str(rep)] = self.attack_metrics[rep]
 
         attack_metrics_experiment["attack_instance_logger"] = attack_metrics_instances
-        attack_metrics_experiment[
-            "attack_metric_failfast_summary"
-        ] = self.attack_metric_failfast_summary.get_attack_summary()
+        attack_metrics_experiment["attack_metric_failfast_summary"] = (
+            self.attack_metric_failfast_summary.get_attack_summary()
+        )
 
         return attack_metrics_experiment
 
@@ -617,14 +617,14 @@ def _get_dummy_attack_metrics_experiments_instances(self) -> dict:
             temp_dummy_attack_metrics = self.dummy_attack_metrics[exp_rep]
             dummy_attack_metric_instances = {}
             for rep, _ in enumerate(temp_dummy_attack_metrics):
-                dummy_attack_metric_instances[
-                    "instance_" + str(rep)
-                ] = temp_dummy_attack_metrics[rep]
+                dummy_attack_metric_instances["instance_" + str(rep)] = (
+                    temp_dummy_attack_metrics[rep]
+                )
             temp = {}
             temp["attack_instance_logger"] = dummy_attack_metric_instances
-            temp[
-                "attack_metric_failfast_summary"
-            ] = self.dummy_attack_metric_failfast_summary[exp_rep].get_attack_summary()
+            temp["attack_metric_failfast_summary"] = (
+                self.dummy_attack_metric_failfast_summary[exp_rep].get_attack_summary()
+            )
             dummy_attack_metrics_experiments[
                 "dummy_attack_metrics_experiment_" + str(exp_rep)
             ] = temp
@@ -643,9 +643,9 @@ def make_report(self) -> dict:
         output["metadata"] = self.metadata
 
         output["attack_experiment_logger"] = self._get_attack_metrics_instances()
-        output[
-            "dummy_attack_experiments_logger"
-        ] = self._get_dummy_attack_metrics_experiments_instances()
+        output["dummy_attack_experiments_logger"] = (
+            self._get_dummy_attack_metrics_experiments_instances()
+        )
 
         report_dest = os.path.join(self.output_dir, self.report_name)
         json_attack_formatter = GenerateJSONModule(report_dest + ".json")