This repo contains the source code of acl-srw 2020 work on the AMR (Abstract Meaning Representation) based approach for abstractive summarization i.e. source code of Unsupervised Semantic Abstractive Summarization paper. You can find all the details in the following -
- Any summarization dataset
- To get the gold-standard AMR parses you need to get membership on the Linguistic Data Consortium
- AMR Parser - To convert text to its AMR.
- AMR Generator - To convert AMRs to text.
- AMR library - To obtain AMR for sentences.
- ROUGE - To evaluate the quality.
- Convert individual sentence to one lined AMR using the AMR Parser
- Combine the sentences to get the files in the format similar to the proxy report section of the AMR bank
- Run to convert to the required format
$ python generate_amr_file.py --story_file stories.txt --summary_file target_summaries.txt
- Run
python2.7 pipeline.py --input_file path_to_file --dataset name_of_the_dataset
to generate the summary amrs pipeline.py
uses appended AMR Library to perform all the functions
- Convert individual generated summary AMRs back to sentences using the AMR Generator
- Put the generated sentences file
predicted_summaries.txt
in/auxiliary
- Combine sentences to produce final summary with
python merge_summary_sentences.py
- Run ROUGE to evaluate
@inproceedings{dohare-etal-2018-unsupervised,
title = "Unsupervised Semantic Abstractive Summarization",
author = "Dohare, Shibhansh and
Gupta, Vivek and
Karnick, Harish",
booktitle = "Proceedings of {ACL} 2018, Student Research Workshop",
month = jul,
year = "2018",
address = "Melbourne, Australia",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/P18-3011",
doi = "10.18653/v1/P18-3011",
pages = "74--83",
abstract = "Automatic abstractive summary generation remains a significant open problem for natural language processing. In this work, we develop a novel pipeline for Semantic Abstractive Summarization (SAS). SAS, as introduced by Liu et. al. (2015) first generates an AMR graph of an input story, through which it extracts a summary graph and finally, creates summary sentences from this summary graph. Compared to earlier approaches, we develop a more comprehensive method to generate the story AMR graph using state-of-the-art co-reference resolution and Meta Nodes. Which we then use in a novel unsupervised algorithm based on how humans summarize a piece of text to extract the summary sub-graph. Our algorithm outperforms the state of the art SAS method by 1.7{\%} F1 score in node prediction.",
}