Skip to content

Latest commit

 

History

History
47 lines (24 loc) · 2.64 KB

index.md

File metadata and controls

47 lines (24 loc) · 2.64 KB

DivExplorer

Machine learning models may perform differently on different data subgroups. We propose the notion of divergence over itemsets (i.e., conjunctions of simple predicates) as a measure of different classification behavior on data subgroups, and the use of frequent pattern mining techniques for their identification. We quantify the contribution of different attribute values to divergence with the notion of Shapley values to identify both critical and peculiar behaviors of attributes. See our paper for details and citations.

DivExplorer is available as a python package and as a interactive cloud-based web app.

DivExplorer web app

DivExplorer is a web app for analyzing datasets and finding subgroups of data where a classifier behaves differently than on the overall data.

Demonstration screenshot

Watch the demonstration video :

<iframe width="300" src="https://www.youtube.com/embed/C5IUNvgWHhU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Demonstration video

You can try DivExplorer here. You can use the preprocessed discretized COMPAS dataset as demo dataset.

Source code and documentation are available at the following link.

DivExplorer is designed by Eliana Pastor, Andrew Gavgavian, Elena Baralis, Luca de Alfaro

DivExplorer package

DivExplorer is available as a python package at the following link.

Citations

Looking for Trouble: Analyzing Classifier Behavior via Pattern Divergence. Eliana Pastor, Luca de Alfaro, Elena Baralis. Proceedings of the 2021 International Conference on Management of Data (SIGMOD '21). June 20--25, 2021, Virtual Event, China. (to appear)