Add image_diff comparison method for test output verification using images #17556

Merged on Mar 4, 2024 (40 commits)

Commits
8318d5c
Add "image_diff" comparison for test files
kostrykin Feb 26, 2024
94352dc
Add `files_image_diff`
kostrykin Feb 27, 2024
bcf78ec
Add tests for `files_image_diff`
kostrykin Feb 27, 2024
908116c
Add `pillow` to pinned-requirements.txt
kostrykin Feb 27, 2024
b325e27
Fix linting issues
kostrykin Feb 27, 2024
56506e4
Fix linting issues
kostrykin Feb 27, 2024
3af5443
Fix linting issues
kostrykin Feb 27, 2024
a1b51ea
Fix linting issues
kostrykin Feb 27, 2024
b730912
Fix linting issues
kostrykin Feb 27, 2024
5bf2f3f
Fix linting issues
kostrykin Feb 27, 2024
7c447f0
Fix pillow imports
kostrykin Feb 27, 2024
a147d72
Add pillow dependency
kostrykin Feb 27, 2024
ea1c980
Fix pillow usage
kostrykin Feb 27, 2024
0d6077d
Fix bugs
kostrykin Feb 27, 2024
7996703
Fix bugs
kostrykin Feb 27, 2024
f22d051
Fix bugs
kostrykin Feb 27, 2024
e5eb5d4
Fix linting issues
kostrykin Feb 27, 2024
4d800a8
Fix bug
kostrykin Feb 27, 2024
8255e9d
Set default `eps` to 0.01
kostrykin Feb 27, 2024
1937e2b
Fix bug
kostrykin Feb 27, 2024
172188f
Update XSD
kostrykin Feb 27, 2024
d2a40cb
Fix linting issues
kostrykin Feb 27, 2024
2d858f1
Fix bug
kostrykin Feb 27, 2024
b3d91d8
Fix bug
kostrykin Feb 27, 2024
88b707a
Fix bug
kostrykin Feb 27, 2024
cc7791b
Fix bug
kostrykin Feb 27, 2024
3bb63ee
Add functional test for `image_diff`
kostrykin Feb 27, 2024
a72db4c
Fix functional test for `image_diff` comparison
kostrykin Feb 27, 2024
cea6a6c
Add error handler
kostrykin Feb 27, 2024
14e1797
Update functional test
kostrykin Feb 27, 2024
49fe197
Fix bugs
kostrykin Feb 27, 2024
e034928
Fix bugs
kostrykin Feb 27, 2024
8567ace
Fix linting issues
kostrykin Feb 27, 2024
356d3bd
Fix linting issues
kostrykin Feb 27, 2024
aa79578
Fix spelling
kostrykin Feb 28, 2024
9076175
Add linting for `image_diff`
kostrykin Feb 28, 2024
711fd6a
Remove unrequired test file
kostrykin Feb 28, 2024
3d3ad72
Add type annotations
nsoranzo Feb 29, 2024
1a3edae
Add missing package requirements
nsoranzo Feb 29, 2024
62c610c
Move numpy and pillow requirements to new extended-assertions extra
nsoranzo Mar 2, 2024
1 change: 1 addition & 0 deletions lib/galaxy/dependencies/pinned-requirements.txt
@@ -126,6 +126,7 @@ parsley==1.3 ; python_version >= "3.8" and python_version < "3.13"
paste==3.7.1 ; python_version >= "3.8" and python_version < "3.13"
pastedeploy==3.1.0 ; python_version >= "3.8" and python_version < "3.13"
pebble==5.0.6 ; python_version >= "3.8" and python_version < "3.13"
pillow==10.2.0 ; python_version >= "3.8" and python_version < "3.13"
pkgutil-resolve-name==1.3.10 ; python_version >= "3.8" and python_version < "3.9"
promise==2.3 ; python_version >= "3.8" and python_version < "3.13"
prompt-toolkit==3.0.43 ; python_version >= "3.8" and python_version < "3.13"
2 changes: 2 additions & 0 deletions lib/galaxy/tool_util/linters/tests.py
@@ -278,6 +278,8 @@ def lint(cls, tool_source: "ToolSource", lint_ctx: "LintContext"):
"decompress": ["diff"],
"delta": ["sim_size"],
"delta_frac": ["sim_size"],
"metric": ["image_diff"],
"eps": ["image_diff"],
}
for test_idx, test in enumerate(tests, start=1):
for output in test.xpath(".//*[self::output or self::element or self::discovered_dataset]"):
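The mapping in this hunk ties each test-output attribute to the compare modes that honor it, so `metric` and `eps` only take effect with `compare="image_diff"`. A minimal sketch of such a check follows, assuming a plain dict of attributes. The helper `unused_attributes` and the excerpted table `COMPARE_ATTRIBS` are hypothetical illustrations, not the linter's actual code.

```python
from typing import Dict, List

# Excerpt of the attribute -> compare-mode mapping shown in the diff.
COMPARE_ATTRIBS: Dict[str, List[str]] = {
    "decompress": ["diff"],
    "delta": ["sim_size"],
    "delta_frac": ["sim_size"],
    "metric": ["image_diff"],
    "eps": ["image_diff"],
}


def unused_attributes(output_attrib: Dict[str, str]) -> List[str]:
    """Return attributes that have no effect for the chosen compare mode."""
    # "diff" is the documented default for `compare`.
    compare = output_attrib.get("compare", "diff")
    return [
        name
        for name, modes in COMPARE_ATTRIBS.items()
        if name in output_attrib and compare not in modes
    ]


print(unused_attributes({"compare": "diff", "eps": "0.1"}))
print(unused_attributes({"compare": "image_diff", "metric": "mse", "eps": "0"}))
```

The first call reports `eps` as unused, the second reports nothing.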
3 changes: 3 additions & 0 deletions lib/galaxy/tool_util/parser/util.py
@@ -3,6 +3,9 @@
DEFAULT_DELTA = 10000
DEFAULT_DELTA_FRAC = None

DEFAULT_METRIC = "mae"
DEFAULT_EPS = 0.01


def is_dict(item):
return isinstance(item, dict) or isinstance(item, OrderedDict)
5 changes: 5 additions & 0 deletions lib/galaxy/tool_util/parser/xml.py
@@ -19,6 +19,8 @@
from galaxy.tool_util.parser.util import (
DEFAULT_DELTA,
DEFAULT_DELTA_FRAC,
DEFAULT_EPS,
DEFAULT_METRIC,
)
from galaxy.util import (
Element,
@@ -788,6 +790,9 @@ def __parse_test_attributes(output_elem, attrib, parse_elements=False, parse_dis
attributes["decompress"] = string_as_bool(attrib.pop("decompress", False))
# `location` may contain an URL to a remote file that will be used to download `file` (if not already present on disk).
location = attrib.get("location")
# Parameters for "image_diff" comparison
attributes["metric"] = attrib.pop("metric", DEFAULT_METRIC)
attributes["eps"] = float(attrib.pop("eps", DEFAULT_EPS))
if location and file is None:
file = os.path.basename(location) # If no file specified, try to get filename from URL last component
attributes["location"] = location
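The two added lines read `metric` and `eps` from the raw XML attributes and fall back to the defaults from `parser/util.py`. A small sketch of that behaviour in isolation, assuming the attributes arrive as a dict of strings; the function name `parse_image_diff_attributes` is hypothetical.

```python
from typing import Any, Dict

DEFAULT_METRIC = "mae"
DEFAULT_EPS = 0.01


def parse_image_diff_attributes(attrib: Dict[str, str]) -> Dict[str, Any]:
    """Pop `metric` and `eps` from a copy of the attributes, applying defaults."""
    attrib = dict(attrib)  # work on a copy so the caller's dict is untouched
    return {
        "metric": attrib.pop("metric", DEFAULT_METRIC),
        # XML attribute values are strings, so `eps` is converted explicitly.
        "eps": float(attrib.pop("eps", DEFAULT_EPS)),
    }


print(parse_image_diff_attributes({}))
print(parse_image_diff_attributes({"metric": "fro", "eps": "256"}))
```

A non-numeric `eps` would surface as a `ValueError` from `float()` in this sketch.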
85 changes: 84 additions & 1 deletion lib/galaxy/tool_util/verify/__init__.py
@@ -5,6 +5,7 @@
import hashlib
import json
import logging
import math
import os
import os.path
import re
Expand All @@ -14,23 +15,38 @@
Any,
Callable,
Dict,
List,
Optional,
TYPE_CHECKING,
)

+try:
+    import numpy
+except ImportError:
+    pass
 try:
     import pysam
 except ImportError:
-    pysam = None  # type: ignore[assignment]
+    pass
+try:
+    from PIL import Image
+except ImportError:
+    pass

from galaxy.tool_util.parser.util import (
DEFAULT_DELTA,
DEFAULT_DELTA_FRAC,
DEFAULT_EPS,
DEFAULT_METRIC,
)
from galaxy.util import unicodify
from galaxy.util.compression_utils import get_fileobj
from .asserts import verify_assertions
from .test_data import TestDataResolver

if TYPE_CHECKING:
import numpy.typing

log = logging.getLogger(__name__)

DEFAULT_TEST_DATA_RESOLVER = TestDataResolver()
@@ -171,6 +187,8 @@ def get_filename(filename: str) -> str:
files_delta(local_name, temp_name, attributes=attributes)
elif compare == "contains":
files_contains(local_name, temp_name, attributes=attributes)
elif compare == "image_diff":
files_image_diff(local_name, temp_name, attributes=attributes)
else:
raise Exception(f"Unimplemented Compare type: {compare}")
except AssertionError as err:
@@ -432,3 +450,68 @@ def files_contains(file1, file2, attributes=None):
line_diff_count += 1
if line_diff_count > lines_diff:
raise AssertionError(f"Failed to find '{contains}' in history data. (lines_diff={lines_diff}).")


def _multiobject_intersection_over_union(
    mask1: "numpy.typing.NDArray", mask2: "numpy.typing.NDArray", repeat_reverse: bool = True
) -> List["numpy.floating"]:
    iou_list = []
    for label1 in numpy.unique(mask1):
        cc1 = mask1 == label1
        cc1_iou_list = []
        for label2 in numpy.unique(mask2[cc1]):
            cc2 = mask2 == label2
            cc1_iou_list.append(intersection_over_union(cc1, cc2))
        iou_list.append(max(cc1_iou_list))
    if repeat_reverse:
        iou_list.extend(_multiobject_intersection_over_union(mask2, mask1, repeat_reverse=False))
    return iou_list


def intersection_over_union(mask1: "numpy.typing.NDArray", mask2: "numpy.typing.NDArray") -> "numpy.floating":
    assert mask1.dtype == mask2.dtype
    assert mask1.ndim == mask2.ndim == 2
    assert mask1.shape == mask2.shape
    if mask1.dtype == bool:
        return numpy.logical_and(mask1, mask2).sum() / numpy.logical_or(mask1, mask2).sum()
    else:
        return min(_multiobject_intersection_over_union(mask1, mask2))


def get_image_metric(
    attributes: Dict[str, Any]
) -> Callable[["numpy.typing.NDArray", "numpy.typing.NDArray"], "numpy.floating"]:
    metric_name = attributes.get("metric", DEFAULT_METRIC)
    metrics = {
        "mae": lambda arr1, arr2: numpy.abs(arr1 - arr2).mean(),
        # Convert to float before squaring to prevent overflows
        "mse": lambda arr1, arr2: numpy.square((arr1 - arr2).astype(float)).mean(),
        "rms": lambda arr1, arr2: math.sqrt(numpy.square((arr1 - arr2).astype(float)).mean()),
        "fro": lambda arr1, arr2: numpy.linalg.norm((arr1 - arr2).reshape(1, -1), "fro"),
        "iou": lambda arr1, arr2: 1 - intersection_over_union(arr1, arr2),
    }
    try:
        return metrics[metric_name]
    except KeyError:
        raise ValueError(f'No such metric: "{metric_name}"')


def files_image_diff(file1: str, file2: str, attributes: Optional[Dict[str, Any]] = None) -> None:
    """Check the pixel data of 2 image files for differences."""
    attributes = attributes or {}

    with Image.open(file1) as im1:
        arr1 = numpy.array(im1)
    with Image.open(file2) as im2:
        arr2 = numpy.array(im2)

    if arr1.dtype != arr2.dtype:
        raise AssertionError(f"Image data types did not match ({arr1.dtype}, {arr2.dtype}).")

    if arr1.shape != arr2.shape:
        raise AssertionError(f"Image dimensions did not match ({arr1.shape}, {arr2.shape}).")

    distance = get_image_metric(attributes)(arr1, arr2)
    distance_eps = attributes.get("eps", DEFAULT_EPS)
    if distance > distance_eps:
        raise AssertionError(f"Image difference {distance} exceeds eps={distance_eps}.")
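To make the four intensity metrics and the `eps` check concrete, here is a dependency-free sketch that works on flattened lists of pixel values instead of NumPy arrays. The names `Pixels`, `METRICS` and `check_distance` are hypothetical. Because Python integers do not overflow, the sketch sidesteps the fixed-width integer concerns that the comment in the diff refers to.

```python
import math
from typing import Callable, Dict, Sequence

Pixels = Sequence[float]  # flattened pixel values (stand-in for an image array)


def mae(a: Pixels, b: Pixels) -> float:
    """Mean absolute error."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mse(a: Pixels, b: Pixels) -> float:
    """Mean squared error."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rms(a: Pixels, b: Pixels) -> float:
    """Root of the mean squared error."""
    return math.sqrt(mse(a, b))


def fro(a: Pixels, b: Pixels) -> float:
    """Frobenius norm of the difference (no division by the pixel count)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


METRICS: Dict[str, Callable[[Pixels, Pixels], float]] = {
    "mae": mae,
    "mse": mse,
    "rms": rms,
    "fro": fro,
}


def check_distance(a: Pixels, b: Pixels, metric: str = "mae", eps: float = 0.01) -> float:
    """Raise AssertionError if the distance exceeds eps; otherwise return it."""
    if len(a) != len(b):
        raise AssertionError(f"Image dimensions did not match ({len(a)}, {len(b)}).")
    try:
        metric_fn = METRICS[metric]
    except KeyError:
        raise ValueError(f'No such metric: "{metric}"')
    distance = metric_fn(a, b)
    if distance > eps:  # strict comparison: a distance equal to eps still passes
        raise AssertionError(f"Image difference {distance} exceeds eps={eps}.")
    return distance


a = [0, 0, 0, 0]
b = [0, 0, 0, 4]
for name, metric_fn in METRICS.items():
    print(name, metric_fn(a, b))
```

For these four pixels the script prints `mae 1.0`, `mse 4.0`, `rms 2.0` and `fro 4.0`. The Frobenius norm equals `rms` times the square root of the pixel count, so a sensible `eps` for `fro` grows with image size, unlike the averaged metrics.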
31 changes: 28 additions & 3 deletions lib/galaxy/tool_util/xsd/galaxy.xsd
@@ -1659,7 +1659,7 @@ Different methods can be chosen for the comparison with the local file specified
by ``file`` using the ``compare`` attribute:

- ``diff``: uses diff to compare the history data set and the file provided by
-``file``. Compressed files are decompressed before the compariopm if
+``file``. Compressed files are decompressed before the comparison if
``decompress`` is set to ``true``. BAM files are converted to SAM before the
comparision and for pdf some special rules are implemented. The number of
allowed differences can be set with ``lines_diff``. If ``sort="true"`` history
@@ -1677,6 +1677,10 @@ by ``file`` using the ``compare`` attribute:
- ``sim_size``: compares the size of the history dataset and the ``file`` subject to
the values of the ``delta`` and ``delta_frac`` attributes. Note that a ``has_size``
content assertion should be preferred, because this avoids storing the test file.
- ``image_diff``: compares the pixel data of the history data set and the file
provided by ``file``. The difference of the images is quantified according to their
pixel-wise distance with respect to a specific ``metric``. The check passes if the
distance is not larger than the value set for ``eps``. Only 2-D images can be used.

]]></xs:documentation>
</xs:annotation>
@@ -1813,6 +1817,13 @@ will be infered from the last component of the location URL. For example, `locat
If you specify a `checksum`, it will be also used to check the integrity of the download.</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="metric" type="TestOutputMetricType" default="mae">
</xs:attribute>
<xs:attribute name="eps" type="xs:float" default="0.01">
<xs:annotation>
<xs:documentation xml:lang="en">If ``compare`` is set to ``image_diff``, this is the maximum allowed distance between the data set that is generated in the test and the file in ``test-data/`` that is referenced by the ``file`` attribute, with distances computed with respect to the specified ``metric``. Default value is 0.01.</xs:documentation>
</xs:annotation>
</xs:attribute>
</xs:complexType>
<xs:group name="TestOutputElement">
<xs:choice>
@@ -7450,15 +7461,29 @@ and ``bibtex`` are the only supported options.</xs:documentation>
<xs:annotation>
<xs:documentation xml:lang="en">Type of comparison to use when comparing
test generated output files to expected output files. Currently valid value are
-``diff`` (the default), ``re_match``, ``re_match_multiline``,
-and ``contains``. In addition there is ``sim_size`` which is discouraged in favour of a ``has_size`` assertion.</xs:documentation>
+``diff`` (the default), ``re_match``, ``re_match_multiline``, ``contains``,
+and ``image_diff``. In addition there is ``sim_size`` which is discouraged in
+favour of a ``has_size`` assertion.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="diff"/>
<xs:enumeration value="re_match"/>
<xs:enumeration value="sim_size"/>
<xs:enumeration value="re_match_multiline"/>
<xs:enumeration value="contains"/>
<xs:enumeration value="image_diff"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="TestOutputMetricType">
<xs:annotation>
<xs:documentation xml:lang="en">If ``compare`` is set to ``image_diff``, this is the metric used to compute the distance between images for quantification of their difference. For intensity images, possible metrics are *mean absolute error* (``mae``, the default), *mean squared error* (``mse``), *root mean squared* error (``rms``), and the *Frobenius norm* (``fro``). In addition, for binary images and label maps (with multiple objects), ``iou`` can be used to compute *one minus* the *intersection over the union* (IoU). Object correspondances are established by taking the pair of objects, for which the IoU is highest, and the distance of the images is the worst value determined for any pair of corresponding objects.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="mae"/>
<xs:enumeration value="mse"/>
<xs:enumeration value="rms"/>
<xs:enumeration value="fro"/>
<xs:enumeration value="iou"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="PermissiveBoolean">
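The `iou` description in `TestOutputMetricType` can be illustrated without NumPy. The sketch below mirrors the loop structure of `_multiobject_intersection_over_union` on nested lists: every label (including the background) is matched to its best-overlapping counterpart in both directions, and the distance is one minus the worst of those best matches. All function names here are hypothetical.

```python
from typing import List, Set, Tuple

Coord = Tuple[int, int]
LabelMap = List[List[int]]


def pixels_with_label(label_map: LabelMap, label: int) -> Set[Coord]:
    """Coordinates of all pixels carrying the given label."""
    return {(r, c) for r, row in enumerate(label_map) for c, v in enumerate(row) if v == label}


def binary_iou(a: Set[Coord], b: Set[Coord]) -> float:
    """Intersection over union of two non-empty pixel sets."""
    return len(a & b) / len(a | b)


def one_way_ious(map1: LabelMap, map2: LabelMap) -> List[float]:
    """For each object in map1, the best IoU with any map2 object it overlaps."""
    result = []
    for label1 in sorted({v for row in map1 for v in row}):
        obj1 = pixels_with_label(map1, label1)
        # Only labels of map2 that occur under obj1 can have a non-zero IoU.
        candidates = {map2[r][c] for r, c in obj1}
        result.append(max(binary_iou(obj1, pixels_with_label(map2, label2)) for label2 in candidates))
    return result


def label_map_distance(map1: LabelMap, map2: LabelMap) -> float:
    """One minus the worst best-match IoU, checked in both directions."""
    return 1 - min(one_way_ious(map1, map2) + one_way_ious(map2, map1))


expected = [[0, 1, 1], [0, 2, 2]]
actual = [[0, 1, 1], [0, 0, 2]]
print(label_map_distance(expected, actual))
```

In this example object 2 loses one of its two pixels, so its best IoU is 0.5 and the printed distance is 0.5. A test would then need `eps` of at least 0.5 to pass.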
6 changes: 2 additions & 4 deletions lib/galaxy/util/checkers.py
@@ -187,11 +187,9 @@ def iter_zip(file_path: str):
yield (z.open(f), f)


-def check_image(file_path: str):
+def check_image(file_path: str) -> bool:
     """Simple wrapper around image_type to yield a True/False verdict"""
-    if image_type(file_path):
-        return True
-    return False
+    return bool(image_type(file_path))


COMPRESSION_CHECK_FUNCTIONS: Dict[str, CompressionChecker] = {
24 changes: 12 additions & 12 deletions lib/galaxy/util/image_util.py
@@ -2,25 +2,25 @@

import imghdr
import logging
from typing import (
List,
Optional,
)

 try:
-    import Image as PIL
+    from PIL import Image
 except ImportError:
-    try:
-        from PIL import Image as PIL
-    except ImportError:
-        PIL = None
+    PIL = None

 log = logging.getLogger(__name__)


-def image_type(filename):
+def image_type(filename: str) -> Optional[str]:
     fmt = None
-    if PIL is not None:
+    if Image is not None:
         try:
-            im = PIL.open(filename)
-            fmt = im.format
-            im.close()
+            with Image.open(filename) as im:
+                fmt = im.format
         except Exception:
             # We continue to try with imghdr, so this is a rare case of an
             # exception we expect to happen frequently, so we're not logging
@@ -30,10 +30,10 @@ def image_type(filename):
     if fmt:
         return fmt.upper()
     else:
-        return False
+        return None


-def check_image_type(filename, types):
+def check_image_type(filename: str, types: List[str]) -> bool:
     fmt = image_type(filename)
     if fmt in types:
         return True
1 change: 1 addition & 0 deletions packages/data/setup.cfg
@@ -34,6 +34,7 @@ include_package_data = True
install_requires =
    galaxy-files
    galaxy-objectstore
    galaxy-tool-util
    galaxy-util[template]
    alembic
    alembic-utils
2 changes: 1 addition & 1 deletion packages/test.sh
@@ -51,7 +51,7 @@ while read -r package_dir || [ -n "$package_dir" ]; do # https://stackoverflow.
if [ "$package_dir" = "util" ]; then
pip install -e '.[template,jstree]'
elif [ "$package_dir" = "tool_util" ]; then
-pip install -e '.[cwl,mulled,edam]'
+pip install -e '.[cwl,mulled,edam,extended-assertions]'
else
pip install -e '.'
fi
Expand Down
4 changes: 4 additions & 0 deletions packages/tool_util/setup.cfg
@@ -66,6 +66,10 @@ mulled =
    Whoosh
edam =
    edam-ontology
extended-assertions =
    numpy
    pysam
    pillow

[options.packages.find]
exclude =
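Since `numpy`, `pysam` and `pillow` now sit in the optional `extended-assertions` extra, the `verify` module imports them inside `try`/`except` blocks. A caller could probe for the modules up front and report a clearer message. The following is only a sketch using the standard library; the helper `missing_modules` is hypothetical and not part of Galaxy.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# `numpy` and `PIL` are the import names used by the image_diff code path.
missing = missing_modules(["numpy", "PIL"])
if missing:
    print(f"image_diff needs: {', '.join(missing)} (see the extended-assertions extra)")
else:
    print("image_diff dependencies available")
```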
1 change: 1 addition & 0 deletions pyproject.toml
@@ -85,6 +85,7 @@ paramiko = "!=2.9.0, !=2.9.1" # https://github.com/paramiko/paramiko/issues/196
Parsley = "*"
Paste = "*"
pebble = "*"
pillow = "*"
psutil = "*"
pulsar-galaxy-lib = ">=0.15.0.dev0"
pycryptodome = "*"
Binary file added test-data/im1_uint8.png
Binary file added test-data/im1_uint8.tif
Binary file not shown.
Binary file added test-data/im2_a.png
Binary file added test-data/im2_b.png
Binary file added test-data/im3_a.png
Binary file added test-data/im3_b.tif
Binary file not shown.
36 changes: 36 additions & 0 deletions test/functional/tools/image_diff.xml
@@ -0,0 +1,36 @@
<tool id="image_diff" name="image_diff" version="0.1.0">
<command><![CDATA[
cp '$in' '$out'
]]></command>
<inputs>
<param name="in" type="data" format="data"/>
</inputs>
<outputs>
<data name="out" format="data"/>
</outputs>
<tests>
<!-- test pair of equal images (but different formats) -->
<test>
<param name="in" value="im1_uint8.png" />
<output name="out" value="im1_uint8.png" compare="image_diff" metric="mae" />
</test>
<test>
<param name="in" value="im1_uint8.tif" />
<output name="out" value="im1_uint8.png" compare="image_diff" metric="mse" eps="0" />
</test>
<test>
<param name="in" value="im1_uint8.png" />
<output name="out" value="im1_uint8.png" compare="image_diff" metric="rms" eps="0" />
</test>
<!-- test pair of different images -->
<test>
<param name="in" value="im2_a.png" />
<output name="out" value="im2_b.png" compare="image_diff" metric="mae" eps="0.25" />
</test>
<!-- test RGB data -->
<test>
<param name="in" value="im3_a.png" />
<output name="out" value="im3_b.tif" compare="image_diff" metric="fro" eps="256" />
</test>
</tests>
</tool>
1 change: 1 addition & 0 deletions test/functional/tools/sample_tool_conf.xml
@@ -8,6 +8,7 @@
<tool file="param_text_option.xml" />
<tool file="column_param.xml" />
</section>
<tool file="image_diff.xml"/>
<tool file="output_format_input.xml"/>
<tool file="ucsc_tablebrowser.xml"/>
<tool file="test_data_source.xml"/>