models/bart-mq: finetuned version of facebook/bart-large-mnli on data/training.csv
models/deberta-mq: finetuned version of microsoft/deberta-large-mnli on data/training.csv
models/bart-adj: finetuned version of models/bart-mq on data/training-adj.csv
models/deberta-adj: finetuned version of models/deberta-mq on data/training-adj.csv
Creating the models
source .venv/bin/activate
# facebook/bart-large-mnli and microsoft/deberta-large-mnli will automatically
# be downloaded from huggingface.co when used

# models/bart-mq
python -m sarn.train --output-dir "models/bart-mq" --log-dir "logs/bart-mq" "facebook/bart-large-mnli" "data/training.csv"

# models/deberta-mq
python -m sarn.train --output-dir "models/deberta-mq" --log-dir "logs/deberta-mq" "microsoft/deberta-large-mnli" "data/training.csv"

# models/bart-adj
python -m sarn.train --output-dir "models/bart-adj" --log-dir "logs/bart-adj" "facebook/bart-large-mnli" "data/training-adj.csv"

# models/deberta-adj
python -m sarn.train --output-dir "models/deberta-adj" --log-dir "logs/deberta-adj" "microsoft/deberta-large-mnli" "data/training-adj.csv"
Model statistics
Accuracy
| Model | data/evaluation.csv | data/evaluation-adj.csv |
| --- | --- | --- |
| facebook/bart-large-mnli | 65.25% | 40.97% |
| microsoft/deberta-large-mnli | 71.19% | 47.22% |
| models/bart-mq | 57.63% | 34.72% |
| models/deberta-mq | 61.86% | 34.72% |
| models/bart-adj | 45.76% | 58.33% |
| models/deberta-adj | 42.37% | 57.64% |
ROC curves
An ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds.
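As a rough illustration of "performance at all classification thresholds", the sketch below computes one (false positive rate, true positive rate) point per threshold. The function name `roc_points` and the example labels and scores are made up for illustration; they are not part of this repository or its evaluation data.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) pairs, one per threshold, from strictest to loosest.

    labels: 1 for positive examples, 0 for negative examples.
    scores: model confidence that each example is positive.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for fpr, tpr in roc_points(labels, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting these points with the false positive rate on the x-axis and the true positive rate on the y-axis gives the ROC curve for the example data.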
[...]
AUC stands for "Area under the ROC Curve." That is, AUC measures the entire two-dimensional area underneath the entire ROC curve (think integral calculus) from (0,0) to (1,1).
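The "area underneath the curve" can be approximated with the trapezoidal rule over the ROC points. This is a minimal sketch; `trapezoid_auc` and the point list are hypothetical names and values chosen for illustration, not output from the models above.

```python
def trapezoid_auc(points):
    """Area under a curve given as (fpr, tpr) points sorted by increasing fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Each segment contributes width * average height.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points running from (0, 0) to (1, 1).
example_points = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(trapezoid_auc(example_points))
```

A curve that follows the diagonal from (0, 0) to (1, 1) has an area of 0.5, which corresponds to random guessing.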
[...]
AUC provides an aggregate measure of performance across all possible classification thresholds. One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example.
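The ranking interpretation can be checked directly by comparing every positive example with every negative example. The sketch below assumes ties count as half a win, which is a common convention; `auc_by_ranking` and the example data are hypothetical and not taken from this repository.

```python
def auc_by_ranking(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win (an assumed convention).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC = {auc_by_ranking(labels, scores):.3f}")
```

Here 8 of the 9 positive/negative pairs are ranked correctly, so the result is 8/9. The nested loop is quadratic in the number of examples, which is fine for small evaluation sets but would usually be replaced by a sort-based computation for larger ones.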