Releases: tensorflow/model-analysis

TensorFlow Model Analysis 0.24.0

10 Sep 01:55
f0b99a9

Major Features and Improvements

  • Use TFXIO and batched extractors by default in TFMA.

Bug fixes and other changes

  • Updated the type hint of FilterOutSlices.
  • Fix issue with precision@k and recall@k giving incorrect values when
    negative thresholds are used (i.e. keras defaults).
  • Fix issue with MultiClassConfusionMatrixPlot being overridden by
    MultiClassConfusionMatrix metrics.
  • Made the Fairness Indicators UI thresholds drop down list sorted.
  • Fixed a bug where the Sort menu was not hidden when there is no model
    comparison.
  • Depends on absl-py>=0.9,<0.11.
  • Depends on ipython>=7,<8.
  • Depends on pandas>=1.0,<2.
  • Depends on protobuf>=3.9.2,<4.
  • Depends on tensorflow-metadata>=0.24.0,<0.25.0.
  • Depends on tfx-bsl>=0.24.0,<0.25.0.

Breaking changes

  • Query based metrics evaluations that make use of MetricsSpecs.query_key
    are now passed tfma.Extracts with leaf values that are of type
    np.ndarray containing an additional dimension representing the values
    matched by the query (e.g. if the labels and predictions were previously 1D
    arrays, they will now be 2D arrays where the first dimension's size is equal
    to the number of examples matching the query key). Previously a list of
    tfma.Extracts was passed instead. This allows users to now add custom
    metrics based on tf.keras.metrics.Metric as well as tf.metrics.Metric
    (any previous customizations based on tf.metrics.Metric will need to be
    updated). As part of this change the tfma.metrics.NDCG,
    tfma.metrics.MinValuePosition, and tfma.metrics.QueryStatistics have
    been updated.
  • Renamed ConfusionMatrixMetric.compute to ConfusionMatrixMetric.result
    for consistency with other APIs.
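
The shape change for query-based metrics can be pictured with a small sketch. This is not TFMA's implementation: the function name, the "features", "labels", and "predictions" keys, and the "query_id" feature are illustrative assumptions, and nested Python lists stand in for np.ndarray values. It only shows how per-example values gain a leading dimension whose size is the number of examples matching a query.

```python
from collections import defaultdict


def group_extracts_by_query(extracts, query_key):
    """Groups per-example extracts into one batched extract per query.

    Hypothetical helper: nested lists stand in for np.ndarray. Each output
    value gains a leading dimension sized by the examples in that query.
    """
    grouped = defaultdict(lambda: {"labels": [], "predictions": []})
    for extract in extracts:
        query = extract["features"][query_key]
        grouped[query]["labels"].append(extract["labels"])
        grouped[query]["predictions"].append(extract["predictions"])
    return dict(grouped)


# Three examples with 1D labels and predictions; two share the query "q1".
examples = [
    {"features": {"query_id": "q1"}, "labels": [1.0], "predictions": [0.9]},
    {"features": {"query_id": "q1"}, "labels": [0.0], "predictions": [0.2]},
    {"features": {"query_id": "q2"}, "labels": [1.0], "predictions": [0.7]},
]
batched = group_extracts_by_query(examples, "query_id")
# batched["q1"]["labels"] is now 2D: one row per example matching "q1".
```

A custom metric written against the old list-of-extracts input would need to be updated to read these batched values instead of iterating over a list.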

Deprecations

  • Deprecating Py3.5 support.

Version 0.23.0

24 Aug 16:00
baab61b

Major Features and Improvements

  • Changed default confidence interval method from POISSON_BOOTSTRAP to
    JACKKNIFE. This should significantly improve confidence interval evaluation
    performance by as much as 10x in runtime and CPU resource usage.
  • Added support for additional confusion matrix metrics (FDR, FOR, PT, TS, BA,
    F1 score, MCC, FM, Informedness, Markedness, etc). See
    https://en.wikipedia.org/wiki/Confusion_matrix for full list of metrics now
    supported.
  • Change the number of partitions used by the JACKKNIFE confidence interval
    methodology from 100 to 20. This will reduce the quality of the confidence
    intervals but support computing confidence intervals on slices with fewer
    examples.
  • Added tfma.metrics.MultiClassConfusionMatrixAtThresholds.
  • Refactoring code to compute tfma.metrics.MultiClassConfusionMatrixPlot
    using derived computations.
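
For readers unfamiliar with the abbreviations, the additional confusion matrix metrics can be sketched from the four cell counts using the standard definitions on the Wikipedia page linked above. This is a plain-Python illustration, not TFMA code: the function name and dictionary keys are assumptions, and there are no guards against zero denominators.

```python
import math


def confusion_matrix_metrics(tp, fp, tn, fn):
    """Computes derived metrics from confusion matrix counts.

    Illustrative only: assumes every denominator is non-zero.
    """
    tpr = tp / (tp + fn)  # true positive rate (recall)
    tnr = tn / (tn + fp)  # true negative rate (specificity)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    npv = tn / (tn + fn)  # negative predictive value
    fpr = fp / (fp + tn)  # false positive rate
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "fdr": fp / (fp + tp),  # false discovery rate
        "for": fn / (fn + tn),  # false omission rate
        "pt": math.sqrt(fpr) / (math.sqrt(tpr) + math.sqrt(fpr)),
        "ts": tp / (tp + fn + fp),  # threat score
        "ba": (tpr + tnr) / 2,  # balanced accuracy
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "fm": math.sqrt(ppv * tpr),  # Fowlkes-Mallows index
        "informedness": tpr + tnr - 1,
        "markedness": ppv + npv - 1,
    }


metrics = confusion_matrix_metrics(tp=40, fp=10, tn=45, fn=5)
```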

Bug fixes and other changes

  • Added support for labels passed as SparseTensorValues.
  • Stopped requiring avro-python3.
  • Fix NoneType error when passing BinarizeOptions to
    tfma.metrics.default_multi_class_classification_specs.
  • Fix issue with custom metrics contained in modules ending in
    tf.keras.metric.
  • Changed the BoundedValue.value to be the unsampled metric value rather than
    the sample average.
  • Add EvalResult.get_metric_names().
  • Added errors for missing slices during metrics validation.
  • Added support for customizing confusion matrix based metrics in keras.
  • Made BatchedInputExtractor externally visible.
  • Updated tfma.load_eval_results API to return empty results instead of
    throwing an error when evaluation results are missing for a model_name.
  • Fixed an issue in Fairness Indicators UI where omitted slices error message
    was being displayed even if no slice was omitted.
  • Fix issue with slice_spec.is_slice_applicable not working for float, int,
    etc types that are encoded as strings.
  • Wrap long strings in table cells in Fairness Indicators UI.
  • Depends on apache-beam[gcp]>=2.23,<3.
  • Depends on pyarrow>=0.17,<0.18.
  • Depends on scipy>=1.4.1,<2.
  • Depends on tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,<3.
  • Depends on tensorflow-metadata>=0.23,<0.24.
  • Depends on tfx-bsl>=0.23,<0.24.

Breaking changes

  • Rename EvalResult.get_slices() to EvalResult.get_slice_names().
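
Code that has to run against versions on both sides of this rename could use a small compatibility wrapper. The sketch below is an assumption-laden illustration: the helper name is hypothetical, and it relies only on the two method names given above, not on any other part of the EvalResult API.

```python
def get_slice_names_compat(eval_result):
    """Returns slice names using whichever accessor the object provides.

    Hypothetical helper: prefers the new get_slice_names() and falls back
    to the pre-rename get_slices().
    """
    if hasattr(eval_result, "get_slice_names"):
        return eval_result.get_slice_names()
    return eval_result.get_slices()


# Stand-in objects for illustration only; the return values are placeholders
# and do not reflect the real return type.
class _OldStyleResult:
    def get_slices(self):
        return ["old-style"]


class _NewStyleResult:
    def get_slice_names(self):
        return ["new-style"]


old_names = get_slice_names_compat(_OldStyleResult())
new_names = get_slice_names_compat(_NewStyleResult())
```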

Deprecations

  • Note: We plan to remove Python 3.5 support after this release.

TFMA 0.22.2 Release

23 Jun 01:21
2f7ffd1

Major Features and Improvements

  • Added analyze_raw_data(), an API for evaluating TFMA metrics on Pandas
    DataFrames.

Bug fixes and other changes

  • Previously metrics would only be computed for combinations of keys that
    produced different metric values (e.g. ExampleCount will be the same for
    all models, outputs, classes, etc, so only one metric key was used). Now a
    metric key will be returned for each combination associated with the
    MetricSpec definition even if the values will be the same. Support for
    model independent metrics has also been removed. This means by default
    multiple ExampleCount metrics will be created when multiple models are used
    (one per model).
  • Fixed issue with label_key and prediction_key settings not working with TF
    based metrics.
  • Fairness Indicators UI
    • Thresholds are now sorted in ascending order.
    • Barchart can now be sorted by either slice or eval.
  • Added support for slicing on any value extracted from the inputs (e.g. raw
    labels).
  • Added support for filtering extracts based on sub-keys.
  • Added beam counters to track the feature slices being used for evaluation.
  • Added a KeyError that is raised when analyze_raw_data is run without a
    valid label_key or prediction_key within the provided Pandas DataFrame.
  • Added documentation for tfma.analyze_raw_data, tfma.view.SlicedMetrics,
    and tfma.view.SlicedPlots.
  • Unchecked Metric thresholds now block the model validation.
  • Added support for per slice threshold settings.
  • Added support for sharding metrics and plots outputs.
  • Updated load_eval_result to support filtering plots by model name. Added
    support for loading multiple models at same output path using
    load_eval_results.
  • Fix typo in jupyter widgets breaking TimeSeriesView and PlotViewer.
  • Add tfma.slicer.stringify_slice_key().
  • Deprecated external use of tfma.slicer.SingleSliceSpec (tfma.SlicingSpec
    should be used instead).
  • Updated tfma.default_eval_shared_model and tfma.default_extractors to better
    support custom model types.
  • Depends on tensorflow-metadata>=0.22.2,<0.23.

Breaking changes

  • Changed to treat CLASSIFY_OUTPUT_SCORES involving 2 values as a multi-class
    classification prediction instead of converting to binary classification.
  • Refactored confidence interval methodology field. The old path under
    Options.confidence_interval_methodology is now at
    Options.confidence_intervals.methodology.
  • Removed model_load_time_callback from ModelLoader construct_fn (timing is
    now handled by load). Removed access to shared_handle from ModelLoader.

Deprecations

Version 0.22.1

14 May 23:42

Major Features and Improvements

Bug fixes and other changes

  • Depends on pyarrow>=0.16,<0.17.

Breaking changes

Deprecations

Version 0.22.0

13 May 22:01

Major Features and Improvements

  • Added support for jackknife-based confidence intervals.
  • Add EvalResult.get_metrics(), which extracts slice metrics in dictionary
    format from EvalResults.
  • Adds TFMD Schema as an available argument to computations callbacks.
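
As background on the jackknife idea, here is a minimal sketch of the textbook delete-one-partition jackknife standard error. It is not TFMA's implementation: the function name, the round-robin partitioning, and the num_partitions parameter are assumptions made for illustration.

```python
import math


def jackknife_standard_error(values, metric_fn, num_partitions):
    """Estimates the standard error of metric_fn via a grouped jackknife.

    Textbook sketch: the data is split into partitions, the metric is
    recomputed with each partition left out in turn, and the spread of
    those leave-one-out estimates gives the standard error. At least two
    non-empty partitions are required.
    """
    # Round-robin assignment of examples to partitions; drop empty ones.
    partitions = [values[i::num_partitions] for i in range(num_partitions)]
    partitions = [p for p in partitions if p]
    n = len(partitions)

    leave_one_out = []
    for i in range(n):
        rest = [v for j, p in enumerate(partitions) if j != i for v in p]
        leave_one_out.append(metric_fn(rest))

    mean_estimate = sum(leave_one_out) / n
    variance = (n - 1) / n * sum(
        (estimate - mean_estimate) ** 2 for estimate in leave_one_out)
    return math.sqrt(variance)


def mean(xs):
    return sum(xs) / len(xs)


# With one example per partition and the mean as the metric, the result
# matches the usual standard error of the mean.
standard_error = jackknife_standard_error([1.0, 2.0, 3.0, 4.0], mean, 4)
```

Each leave-one-out estimate reuses most of the data, which is one reason this approach can be cheaper than resampling-based bootstrap methods.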

Bug fixes and other changes

  • Version is now available under tfma.version.VERSION or tfma.__version__.
  • Add auto slicing utilities for significance testing.
  • Fixed error when a metric and loss with the same classname are used.
  • Adding two new ratios (false discovery rate and false omission rate) in
    Fairness Indicators.
  • MetricValues can now contain both a debug message and a value (rather than
    one or the other).
  • Fix issue with displaying ConfusionMatrixPlot in colab.
  • CalibrationPlot now infers left and right values from schema, when
    available. This makes the calibration plot useful to regression users.
  • Fix issue with metrics not being computed properly when mixed with specs
    containing micro-aggregation computations.
  • Remove batched keys. Instead use the same keys for batched and unbatched
    extract.
  • Added support for visualizing Fairness Indicators in the Fairness Indicators
    TensorBoard Plugin by providing a remote evaluation path in a query parameter:
    <tensorboard_url>#fairness_indicators&p.fairness_indicators.evaluation_output_path=<evaluation_path>.
  • Fixed invalid metrics calculations for serving models using the
    classification API with binary outputs.
  • Moved config writing code to extend from tfma.writer.Writer and made it a
    member of default_writers.
  • Updated tfma.ExtractEvaluateAndWriteResults to accept Extracts as input in
    addition to serialized bytes and Arrow RecordBatches.
  • Depends on apache-beam[gcp]>=2.20,<3.
  • Depends on pyarrow>=0.16,<1.
  • Depends on tensorflow>=1.15,!=2.0.*,<3.
  • Depends on tensorflow-metadata>=0.22,<0.23.
  • Depends on tfx-bsl>=0.22,<0.23.
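
The Fairness Indicators TensorBoard URL described above follows a fixed pattern, so it can be assembled with a small helper. The function name, host, and output path below are hypothetical, and the notes do not say whether the path needs URL-encoding, so this sketch passes it through unchanged.

```python
def fairness_indicators_url(tensorboard_url, evaluation_path):
    """Builds the plugin URL from the pattern in the release note.

    Hypothetical helper: evaluation_path is inserted as-is, since the
    notes do not specify any escaping.
    """
    return (
        f"{tensorboard_url}#fairness_indicators"
        f"&p.fairness_indicators.evaluation_output_path={evaluation_path}"
    )


# Placeholder host and path for illustration.
url = fairness_indicators_url("http://localhost:6006", "/tmp/eval_output")
```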

Breaking changes

  • Remove desired_batch_size as an option. Large batch failures can be handled
    via serially processing the failed batch which also acts as a deterrent from
    scaling up batch sizes further. Batch size can be handled via BEAM batch
    size tuning.
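
The fallback described here, retrying a failed batch one element at a time, can be sketched as follows. This is an illustration of the general pattern rather than TFMA's code: the function names are hypothetical, and catching a broad Exception is a simplification.

```python
def process_with_serial_fallback(batch, process_batch):
    """Processes a batch, falling back to one element at a time on failure.

    Hypothetical helper: process_batch takes a list and returns a list of
    results in the same order.
    """
    try:
        return process_batch(batch)
    except Exception:  # Simplified; real code would catch narrower errors.
        results = []
        for element in batch:
            results.extend(process_batch([element]))
        return results


# A stand-in processor that fails on batches larger than two elements.
call_sizes = []


def double_small_batches(batch):
    call_sizes.append(len(batch))
    if len(batch) > 2:
        raise MemoryError("batch too large")
    return [2 * x for x in batch]


results = process_with_serial_fallback([1, 2, 3], double_small_batches)
```

The serial retry is much slower than a successful batched call, which is the deterrent against oversized batches that the note refers to.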

Deprecations

  • Deprecating Py2 support.

Release 0.21.6

09 Mar 22:07

Major Features and Improvements

Bug fixes and other changes

  • Populate confidence_interval field in addition to bounded_value when
    confidence intervals are enabled.
  • Only requires avro-python3>=1.8.1,!=1.9.2.*,<2.0.0 on Python 3.5 + MacOS.

Breaking changes

Deprecations

Release 0.21.5

06 Mar 04:20

Major Features and Improvements

  • Now publish NPM under tensorflow_model_analysis for UI components.

Bug fixes and other changes

  • Depends on tfx-bsl>=0.21.3,<0.22.
  • Depends on tensorflow>=1.15,<3.
  • Depends on apache-beam[gcp]>=2.17,<3.

Breaking changes

  • Rollback populating TDistributionValue metric when confidence intervals are
    enabled in V2.

Deprecations

Release 0.21.4

02 Mar 03:58

Major Features and Improvements

  • Added support for creating metrics specs from tf.keras.losses.
  • Added evaluation comparison feature to the Fairness Indicators UI in Colab.
  • Added better defaults handling for eval config so that a single model spec
    can be used for both candidate and baseline.

Bug fixes and other changes

  • Fixed issue with keras metrics saved with the model not being calculated
    unless a keras metric was added to the config.
  • Depends on pandas>=0.24,<2.
  • Depends on pyarrow>=0.15,<1.
  • Depends on tfx-bsl>=0.21.3,<0.23.
  • Depends on tensorflow>=1.15,!=2.0.*,<3.
  • Depends on apache-beam[gcp]>=2.17,<2.18.

Breaking changes

Deprecations

Release 0.21.3

14 Feb 19:09

Major Features and Improvements

  • Added support for model validation using either value threshold or diff
    threshold.
  • Added a writer to output model validation result (ValidationResult).
  • Added support for multi-model evaluation using EvalSavedModels.
  • Added support for inserting model_names by default to metrics_specs.
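
The difference between a value threshold and a diff threshold can be sketched in plain Python. This is not TFMA's validation logic: the function names, parameters, and the choice of inclusive bounds are assumptions made for illustration. A value threshold checks the candidate's metric on its own, while a diff threshold checks how it changed relative to a baseline model.

```python
def passes_value_threshold(candidate, lower_bound=None, upper_bound=None):
    """Checks a metric value against optional absolute bounds.

    Hypothetical helper; bounds are treated as inclusive here.
    """
    if lower_bound is not None and candidate < lower_bound:
        return False
    if upper_bound is not None and candidate > upper_bound:
        return False
    return True


def passes_diff_threshold(candidate, baseline, min_improvement,
                          higher_is_better=True):
    """Checks the change between candidate and baseline metric values.

    Hypothetical helper: min_improvement may be negative to tolerate a
    small regression.
    """
    diff = candidate - baseline
    if not higher_is_better:
        diff = -diff
    return diff >= min_improvement


# A candidate with AUC 0.80 against a baseline with AUC 0.82.
value_ok = passes_value_threshold(0.80, lower_bound=0.70)
diff_ok = passes_diff_threshold(0.80, 0.82, min_improvement=-0.05)
```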

Bug fixes and other changes

  • Fixed issue with model_name not being set in keras metrics.

Breaking changes

  • Populate TDistributionValue metric when confidence intervals are enabled in
    V2.
  • Rename the writer MetricsAndPlotsWriter to MetricsPlotsAndValidationsWriter.

Deprecations

Release 0.21.2

31 Jan 20:40

Major Features and Improvements

Bug fixes and other changes

  • Added a SciPy dependency for both Python 2 and Python 3.
  • Increased table and tooltip font in Fairness Indicators.

Breaking changes

  • tfma.BinarizeOptions.class_ids, tfma.BinarizeOptions.k_list,
    tfma.BinarizeOptions.top_k_list, and tfma.Options.disabled_outputs are
    now wrapped in an additional proto message.

Deprecations