Entity Commonsense Representation for Neural Abstractive Summarization
This TensorFlow code was used in the experiments of the research paper
Reinald Kim Amplayo*, Seonjae Lim* and Seung-won Hwang. Entity Commonsense Representation for Neural Abstractive Summarization. NAACL, 2018.
* Authors have equal contributions
You will need the following data saved in a separate data
folder:
word_vecs.txt
: word vectors (we used GloVe vectors which can be downloaded here: http://nlp.stanford.edu/data/glove.840B.300d.zip)entity_vecs.txt
: entity vectors (we used wiki2vec vectors which can be downloaded here: https://github.com/idio/wiki2vec/raw/master/torrents/enwiki-gensim-word2vec-1000-nostem-10cbow.torrent)train.article.txt
andvalid.article.txt
: contains the text to be summarizedtrain.title.txt
andvalid.title.txt
: contains the summarized texttrain.entity.txt
andvalid.entity.txt
: contains the entities tagged using the format specified by wiki2vec here: https://github.com/idio/wiki2vec)
To run the code, several parameters are needed to be set in the src/summarization.py
. Refer to our paper to determine the recommended values.
To train the model, execute the following code:
python script/train.py
Similarly, to test the model, execute the following code (although test will automatically come after training is done):
python script/test.py
To cite the paper/code, please use this BibTex:
@inproceedings{amplayo2018entity,
Author = {Reinald Kim Amplayo and Seonjae Lim and Seung-won Hwang},
Booktitle = {NAACL},
Location = {New Orleans, LA},
Year = {2018},
Title = {Entity Commonsense Representation for Neural Abstractive Summarization},
}
If you have questions, send me an email: rktamplayo at yonsei dot ac dot kr