This dataset contains 4.2k human factuality judgements for the following model-generated summaries:
CNN/DM:
- 600 summaries generated by BART-Large (Lewis et al., 2020)
- 600 summaries generated by BertSum (Liu and Lapata, 2019)
- 600 summaries generated by PGConv (See et al., 2017)
- 600 summaries generated by BottomUp (Gehrmann et al., 2018)
- 600 summaries generated by AbsRL (Chen and Bansal, 2018)
XSum:
- 600 summaries generated by BART-Large
- 600 summaries generated by BertSum
For each of these 4.2k summaries, one randomly selected sentence (displayed in context) was annotated for factuality by three annotators, and an aggregated judgement (produced by MACE; Hovy et al., 2013) is included. Note that the annotated BART-Large summaries are taken from the constraint-fact dataset.
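Both the three raw votes and the MACE aggregate are included in each record (see the field list below), so other aggregation schemes can be applied on top of the raw votes. As a rough point of comparison, a simple majority vote can be computed as in the sketch below; note this is not how the released aggregate was produced, since MACE estimates each annotator's reliability and weights their votes accordingly.

```python
def majority_vote(votes):
    # Simple majority over binary factuality votes (1 = factually consistent).
    # With three annotators, a tie cannot occur.
    return int(sum(votes) * 2 > len(votes))

print(majority_vote([1, 1, 0]))  # -> 1
print(majority_vote([0, 1, 0]))  # -> 0
```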
The dataset has the following fields:

- `id`: ID between 0 and 4199
- `summary`: Complete summary
- `summary_raw`: Same as `summary`
- `summary_sentence`: The randomly selected summary sentence that annotators judged for factuality
- `summary_sentence_contextleft`: Left context of the `summary_sentence`
- `summary_sentence_contextright`: Right context of the `summary_sentence`
- `model_name`: Name of the model that generated the summary (`abs_rl`, `bart`, `bert_sum`, `bottom_up`, or `pointer_gen_cov`)
- `abstractiveness_constraint`: Abstractiveness constraint used to generate this summary (`none`, `lambda2`, `lambda4`, `1/lambda2`, or `1/lambda1`; see our paper)
- `annotator_comments`: Comments from the annotators
- `annotator_ids`: Anonymized annotator IDs (the annotator ID space is shared with that of the constraint-fact dataset)
- `annotator_votes`: Factuality votes from the annotators (0 = not factually consistent with the displayed document(s); 1 = factually consistent)
- `annotator_votes_combined`: Aggregated factuality judgement from MACE
- `dataset_name`: Name of the dataset (`cnn_dailymail` or `xsum`)
- `document_full`: Complete input document(s)
- `document_short`: Shortened document(s) displayed to the annotators, containing the sentences most similar to the `summary_sentence`
- `document_original`: Original input document(s) from the test set. This is the same as `document_full`, except for XSum, where `document_full` contains the first sentence reinserted but `document_original` does not (see Footnote 6 of the paper).
- `document_id`: Document ID in `dataset_name`
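To make the schema concrete, here is a hypothetical record written as a Python dict. Every value is an invented placeholder, and the exact types of the annotator fields (lists vs. strings) are our assumption, not taken from the released data:

```python
# Hypothetical record illustrating the schema; all values are invented placeholders.
record = {
    "id": 1234,                              # 0..4199
    "dataset_name": "cnn_dailymail",         # or "xsum"
    "model_name": "bart",                    # abs_rl, bart, bert_sum, bottom_up, or pointer_gen_cov
    "abstractiveness_constraint": "none",    # none, lambda2, lambda4, 1/lambda2, or 1/lambda1
    "summary": "Generated summary text.",
    "summary_raw": "Generated summary text.",
    "summary_sentence": "The randomly selected sentence that was judged.",
    "summary_sentence_contextleft": "Summary text before the judged sentence.",
    "summary_sentence_contextright": "Summary text after the judged sentence.",
    "annotator_ids": ["a01", "a07", "a23"],  # anonymized IDs (type assumed)
    "annotator_votes": [1, 1, 0],            # one 0/1 vote per annotator (type assumed)
    "annotator_votes_combined": 1,           # MACE-aggregated judgement
    "annotator_comments": ["", "", ""],
    "document_id": "doc-00042",
    "document_full": "Complete input document(s).",
    "document_short": "Shortened document(s) shown to annotators.",
    "document_original": "Original test-set document(s).",
}
```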
This dataset can be downloaded here: `models_fact_v1.0.tar.gz`
The dataset does not contain the input articles from CNN/DM and XSum, but we provide a script that inserts them from the corresponding Hugging Face datasets. Run the script like this:
python abstractive-factual-tradeoff/misc/unpack.py /path/to/models_fact_v1.0.tar.gz
That will create a directory `/path/to/models_fact_v1.0` next to the tarball. The directory will contain a `data.jsonl` file with the dataset. It will also contain directories with the full `test.source` and `test.target` files for `cnn_dailymail` and `xsum`.
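Once unpacked, the data is straightforward to analyze. The sketch below loads `data.jsonl` (assuming it is standard JSON Lines, as the extension suggests) and computes the share of sentences judged factual per model from the MACE-aggregated votes; the path is a placeholder:

```python
import json
from collections import Counter

# Tally the MACE-aggregated factuality judgements per model.
totals, factual = Counter(), Counter()
with open("/path/to/models_fact_v1.0/data.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        model = record["model_name"]
        totals[model] += 1
        factual[model] += int(record["annotator_votes_combined"])

for model in sorted(totals):
    print(f"{model}: {factual[model] / totals[model]:.1%} judged factual "
          f"({totals[model]} sentences)")
```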