Releases · tensorflow/model-analysis
TensorFlow Model Analysis 0.24.0
Major Features and Improvements
- Use TFXIO and batched extractors by default in TFMA.
Bug fixes and other changes
- Updated the type hint of FilterOutSlices.
- Fix issue with precision@k and recall@k giving incorrect values when negative thresholds are used (i.e. keras defaults).
- Fix issue with MultiClassConfusionMatrixPlot being overridden by MultiClassConfusionMatrix metrics.
- Made the Fairness Indicators UI thresholds drop-down list sorted.
- Fixed a bug where the Sort menu was not hidden when there is no model comparison.
- Depends on `absl-py>=0.9,<0.11`.
- Depends on `ipython>=7,<8`.
- Depends on `pandas>=1.0,<2`.
- Depends on `protobuf>=3.9.2,<4`.
- Depends on `tensorflow-metadata>=0.24.0,<0.25.0`.
- Depends on `tfx-bsl>=0.24.0,<0.25.0`.
Breaking changes
- Query based metrics evaluations that make use of `MetricsSpecs.query_key` are now passed `tfma.Extracts` with leaf values that are of type `np.ndarray` containing an additional dimension representing the values matched by the query (e.g. if the labels and predictions were previously 1D arrays, they will now be 2D arrays where the first dimension's size is equal to the number of examples matching the query key). Previously a list of `tfma.Extracts` was passed instead. This allows users to now add custom metrics based on `tf.keras.metrics.Metric` as well as `tf.metrics.Metric` (any previous customizations based on `tf.metrics.Metric` will need to be updated); see the sketch after this list. As part of this change `tfma.metrics.NDCG`, `tfma.metrics.MinValuePosition`, and `tfma.metrics.QueryStatistics` have been updated.
- Renamed `ConfusionMatrixMetric.compute` to `ConfusionMatrixMetric.result` for consistency with other APIs.
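A minimal sketch of what this enables. The metric class, the `query_id` feature name, and the use of `specs_from_metrics` with a `query_key` are illustrative assumptions, not code from this release:

```python
import tensorflow as tf
import tensorflow_model_analysis as tfma

# Hypothetical custom metric: with MetricsSpecs.query_key set, y_true and
# y_pred arrive as np.ndarrays with an extra leading dimension holding all
# examples matched by the query, so standard Keras reductions apply.
class MaxPredictionError(tf.keras.metrics.Metric):
  def __init__(self, name='max_prediction_error', **kwargs):
    super().__init__(name=name, **kwargs)
    self.max_error = self.add_weight(name='max_error', initializer='zeros')

  def update_state(self, y_true, y_pred, sample_weight=None):
    error = tf.reduce_max(tf.abs(tf.cast(y_true, tf.float32) - y_pred))
    self.max_error.assign(tf.maximum(self.max_error, error))

  def result(self):
    return self.max_error

# Build query-based metrics specs from the custom metric ('query_id' is a
# placeholder feature name).
metrics_specs = tfma.metrics.specs_from_metrics(
    [MaxPredictionError()], query_key='query_id')
```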
Deprecations
- Deprecating Py3.5 support.
Version 0.23.0
Major Features and Improvements
- Changed the default confidence interval method from POISSON_BOOTSTRAP to JACKKNIFE. This should significantly improve confidence interval evaluation performance, by as much as 10x in runtime and CPU resource usage (see the config sketch after this list).
- Added support for additional confusion matrix metrics (FDR, FOR, PT, TS, BA, F1 score, MCC, FM, Informedness, Markedness, etc). See https://en.wikipedia.org/wiki/Confusion_matrix for the full list of metrics now supported.
- Changed the number of partitions used by the JACKKNIFE confidence interval methodology from 100 to 20. This will reduce the quality of the confidence intervals but supports computing confidence intervals on slices with fewer examples.
- Added `tfma.metrics.MultiClassConfusionMatrixAtThresholds`.
- Refactored code to compute `tfma.metrics.MultiClassConfusionMatrixPlot` using derived computations.
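For reference, a minimal sketch of enabling confidence intervals in an `EvalConfig`. The field path follows `Options.confidence_intervals.methodology` introduced in 0.22.2; the model spec and metric are placeholders:

```python
import tensorflow_model_analysis as tfma
from google.protobuf import text_format

# JACKKNIFE is now the default methodology, so naming it explicitly is
# only needed for clarity or to opt back into POISSON_BOOTSTRAP.
eval_config = text_format.Parse("""
  model_specs { label_key: "label" }
  metrics_specs { metrics { class_name: "BinaryAccuracy" } }
  options {
    compute_confidence_intervals { value: true }
    confidence_intervals { methodology: JACKKNIFE }
  }
""", tfma.EvalConfig())
```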
Bug fixes and other changes
- Added support for labels passed as SparseTensorValues.
- Stopped requiring `avro-python3`.
- Fix NoneType error when passing BinarizeOptions to `tfma.metrics.default_multi_class_classification_specs`.
- Fix issue with custom metrics contained in modules ending in `tf.keras.metric`.
- Changed the BoundedValue.value to be the unsampled metric value rather than the sample average.
- Added `EvalResult.get_metric_names()`.
- Added errors for missing slices during metrics validation.
- Added support for customizing confusion matrix based metrics in keras.
- Made BatchedInputExtractor externally visible.
- Updated the tfma.load_eval_results API to return empty results instead of throwing an error when evaluation results are missing for a model_name.
- Fixed an issue in the Fairness Indicators UI where the omitted slices error message was displayed even if no slice was omitted.
- Fix issue with slice_spec.is_slice_applicable not working for float, int, etc. types that are encoded as strings.
- Wrap long strings in table cells in the Fairness Indicators UI.
- Depends on `apache-beam[gcp]>=2.23,<3`.
- Depends on `pyarrow>=0.17,<0.18`.
- Depends on `scipy>=1.4.1,<2`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,<3`.
- Depends on `tensorflow-metadata>=0.23,<0.24`.
- Depends on `tfx-bsl>=0.23,<0.24`.
Breaking changes
- Rename EvalResult.get_slices() to EvalResult.get_slice_names().
Deprecations
- Note: We plan to remove Python 3.5 support after this release.
TFMA 0.22.2 Release
Major Features and Improvements
- Added analyze_raw_data(), an API for evaluating TFMA metrics on Pandas
DataFrames.
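A minimal sketch of the new API; the column names, metric, and data below are illustrative:

```python
import pandas as pd
import tensorflow_model_analysis as tfma
from google.protobuf import text_format

# Toy data; column names must match the label_key/prediction_key below.
df = pd.DataFrame({
    'label': [0, 0, 1, 1],
    'prediction': [0.2, 0.6, 0.4, 0.9],
})

eval_config = text_format.Parse("""
  model_specs { label_key: "label" prediction_key: "prediction" }
  metrics_specs { metrics { class_name: "BinaryAccuracy" } }
  slicing_specs {}
""", tfma.EvalConfig())

result = tfma.analyze_raw_data(df, eval_config)
```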
Bug fixes and other changes
- Previously metrics would only be computed for combinations of keys that produced different metric values (e.g. `ExampleCount` will be the same for all models, outputs, classes, etc, so only one metric key was used). Now a metric key will be returned for each combination associated with the `MetricSpec` definition even if the values will be the same. Support for model independent metrics has also been removed. This means by default multiple ExampleCount metrics will be created when multiple models are used (one per model).
- Fixed issue with label_key and prediction_key settings not working with TF based metrics.
- Fairness Indicators UI:
  - Thresholds are now sorted in ascending order.
  - Barchart can now be sorted by either slice or eval.
- Added support for slicing on any value extracted from the inputs (e.g. raw labels).
- Added support for filtering extracts based on sub-keys.
- Added beam counters to track the feature slices being used for evaluation.
- Added a KeyError when analyze_raw_data is run without a valid label_key or prediction_key within the provided Pandas DataFrame.
- Added documentation for `tfma.analyze_raw_data`, `tfma.view.SlicedMetrics`, and `tfma.view.SlicedPlots`.
- Unchecked metric thresholds now block model validation.
- Added support for per slice threshold settings.
- Added support for sharding metrics and plots outputs.
- Updated load_eval_result to support filtering plots by model name. Added support for loading multiple models at the same output path using load_eval_results.
- Fix typo in jupyter widgets breaking TimeSeriesView and PlotViewer.
- Added `tfma.slicer.stringify_slice_key()`.
- Deprecated external use of tfma.slicer.SingleSliceSpec (tfma.SlicingSpec should be used instead; see the sketch after this list).
- Updated tfma.default_eval_shared_model and tfma.default_extractors to better support custom model types.
- Depends on `tensorflow-metadata>=0.22.2,<0.23`.
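A rough sketch of the replacement; the feature name is illustrative:

```python
import tensorflow_model_analysis as tfma

# Deprecated: tfma.slicer.SingleSliceSpec(columns=['country'])
# Preferred: the tfma.SlicingSpec proto.
slice_by_country = tfma.SlicingSpec(feature_keys=['country'])

# An empty SlicingSpec selects the overall (unsliced) dataset.
overall = tfma.SlicingSpec()
```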
Breaking changes
- Changed to treat CLASSIFY_OUTPUT_SCORES involving 2 values as a multi-class classification prediction instead of converting to binary classification.
- Refactored the confidence interval methodology field. The old path under `Options.confidence_interval_methodology` is now at `Options.confidence_intervals.methodology`.
- Removed model_load_time_callback from ModelLoader construct_fn (timing is now handled by load). Removed access to shared_handle from ModelLoader.
Deprecations
Version 0.22.1
Major Features and Improvements
Bug fixes and other changes
- Depends on `pyarrow>=0.16,<0.17`.
Breaking changes
Deprecations
Version 0.22.0
Major Features and Improvements
- Added support for jackknife-based confidence intervals.
- Added EvalResult.get_metrics(), which extracts slice metrics in dictionary format from EvalResults (see the sketch after this list).
- Added the TFMD `Schema` as an available argument to computations callbacks.
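A minimal sketch; the output path is a placeholder, and the exact nesting of the returned dictionary (keyed by slice here) is an assumption:

```python
import tensorflow_model_analysis as tfma

# Load a previously written evaluation result (path is hypothetical).
result = tfma.load_eval_result('/tmp/eval_output')

# Slice metrics in dictionary format.
metrics = result.get_metrics()
for slice_key, slice_metrics in metrics.items():
    print(slice_key, slice_metrics)
```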
Bug fixes and other changes
- Version is now available under `tfma.version.VERSION` or `tfma.__version__`.
- Added auto slicing utilities for significance testing.
- Fixed error when a metric and loss with the same classname are used.
- Added two new ratios (false discovery rate and false omission rate) in Fairness Indicators.
- `MetricValue`s can now contain both a debug message and a value (rather than one or the other).
- Fix issue with displaying ConfusionMatrixPlot in colab.
- `CalibrationPlot` now infers `left` and `right` values from the schema, when available. This makes the calibration plot useful to regression users.
- Fix issue with metrics not being computed properly when mixed with specs containing micro-aggregation computations.
- Removed batched keys; the same keys are now used for batched and unbatched extracts.
- Added support to visualize Fairness Indicators in the Fairness Indicators TensorBoard Plugin by providing a remote evaluation path in the query parameter: `<tensorboard_url>#fairness_indicators&p.fairness_indicators.evaluation_output_path=<evaluation_path>`.
- Fixed invalid metrics calculations for serving models using the classification API with binary outputs.
- Moved config writing code to extend from tfma.writer.Writer and made it a member of default_writers.
- Updated tfma.ExtractEvaluateAndWriteResults to accept Extracts as input in addition to serialized bytes and Arrow RecordBatches.
- Depends on `apache-beam[gcp]>=2.20,<3`.
- Depends on `pyarrow>=0.16,<1`.
- Depends on `tensorflow>=1.15,!=2.0.*,<3`.
- Depends on `tensorflow-metadata>=0.22,<0.23`.
- Depends on `tfx-bsl>=0.22,<0.23`.
Breaking changes
- Removed desired_batch_size as an option. Large batch failures can be handled by serially processing the failed batch, which also acts as a deterrent against scaling up batch sizes further. Batch size can be handled via Beam batch size tuning.
Deprecations
- Deprecating Py2 support.
Release 0.21.6
Major Features and Improvements
Bug fixes and other changes
- Populate the confidence_interval field in addition to bounded_value when confidence intervals are enabled.
- Only requires `avro-python3>=1.8.1,!=1.9.2.*,<2.0.0` on Python 3.5 + macOS.
Breaking changes
Deprecations
Release 0.21.5
Major Features and Improvements
- Now publishes an NPM package under `tensorflow_model_analysis` for UI components.
Bug fixes and other changes
- Depends on `tfx-bsl>=0.21.3,<0.22`.
- Depends on `tensorflow>=1.15,<3`.
- Depends on `apache-beam[gcp]>=2.17,<3`.
Breaking changes
- Rolled back populating the TDistributionValue metric when confidence intervals are enabled in V2.
Deprecations
Release 0.21.4
Major Features and Improvements
- Added support for creating metrics specs from tf.keras.losses.
- Added evaluation comparison feature to the Fairness Indicators UI in Colab.
- Added better defaults handling for eval config so that a single model spec
can be used for both candidate and baseline.
Bug fixes and other changes
- Fixed issue with keras metrics saved with the model not being calculated unless a keras metric was added to the config.
- Depends on `pandas>=0.24,<2`.
- Depends on `pyarrow>=0.15,<1`.
- Depends on `tfx-bsl>=0.21.3,<0.23`.
- Depends on `tensorflow>=1.15,!=2.0.*,<3`.
- Depends on `apache-beam[gcp]>=2.17,<2.18`.
Breaking changes
Deprecations
Release 0.21.3
Major Features and Improvements
- Added support for model validation using either a value threshold or a diff threshold (see the sketch after this list).
- Added a writer to output the model validation result (ValidationResult).
- Added support for multi-model evaluation using EvalSavedModels.
- Added support for inserting model_names by default into metrics_specs.
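A rough sketch of configuring both kinds of thresholds; the metric name and bounds are illustrative:

```python
import tensorflow_model_analysis as tfma

# Value threshold: candidate AUC must be at least 0.7.
# Diff (change) threshold: AUC must not drop relative to the baseline.
threshold = tfma.MetricThreshold(
    value_threshold=tfma.GenericValueThreshold(
        lower_bound={'value': 0.7}),
    change_threshold=tfma.GenericChangeThreshold(
        direction=tfma.MetricDirection.HIGHER_IS_BETTER,
        absolute={'value': 0.0}))

metrics_specs = [
    tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(class_name='AUC', threshold=threshold),
    ])
]
```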
Bug fixes and other changes
- Fixed issue with model_name not being set in keras metrics.
Breaking changes
- Populate the TDistributionValue metric when confidence intervals are enabled in V2.
- Renamed the writer MetricsAndPlotsWriter to MetricsPlotsAndValidationsWriter.
Deprecations
Release 0.21.2
Major Features and Improvements
Bug fixes and other changes
- Added SciPy dependency for both Python 2 and Python 3.
- Increased table and tooltip font in Fairness Indicators.
Breaking changes
- `tfma.BinarizeOptions.class_ids`, `tfma.BinarizeOptions.k_list`, `tfma.BinarizeOptions.top_k_list`, and `tfma.Options.disabled_outputs` are now wrapped in an additional proto message.