Code accompanying the paper "Retrieving Compositional Documents Using Position-Sensitive Word Mover’s Distance".
The twin-films dataset described in the publication is located in:
dataset
The extracted IMDb mappings are in:
dataset/imdb-mappings.csv
The ground-truth mappings of twin films, as listed on Wikipedia, are in:
dataset/twin-films.csv
The extracted plot keywords are stored in binary format (Python pickle) in:
dataset/plotKeywords.pkl
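The dataset files can be read with the Python standard library alone. The following is a minimal sketch, not part of the repository: the helper names load_csv_rows and load_pickle are hypothetical, and because the column layout of the CSV files and the structure of the pickled object are not documented here, the helpers return the raw rows and the raw object without interpreting them.

```python
import csv
import pickle


def load_csv_rows(path, encoding="utf-8"):
    """Return every row of a CSV file as a list of strings.

    Hypothetical helper: the column layout is not assumed, so a header
    row (if the file has one) is returned like any other row.
    """
    # newline="" is the setting the csv module recommends for file objects.
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.reader(handle))


def load_pickle(path):
    """Return the object stored in a pickle file.

    Hypothetical helper. Only unpickle files you trust, because loading
    a pickle can execute arbitrary code.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

For example, load_csv_rows("dataset/twin-films.csv") and load_pickle("dataset/plotKeywords.pkl") would read the files listed above when run from the repository root. Inspecting the first few rows and the type of the unpickled object is a reasonable first step before relying on any particular structure.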
An example implementation of the Position-Sensitive Word Mover’s Distance can be found under:
code/pwmd.py
To run the code, make sure the following are installed:
Python 3
spacy
pyemd
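Before running the code, it can help to confirm that the required packages are importable. The following is a small sketch using only the standard library. The function names missing_packages and describe are hypothetical, and the check assumes the import names match the package names listed above (spacy and pyemd).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


def describe(names=("spacy", "pyemd")):
    """Return a one-line summary of which required packages are missing."""
    missing = missing_packages(names)
    if missing:
        return "Missing packages: " + ", ".join(missing)
    return "All required packages are available."
```

Calling print(describe()) reports any package that still needs to be installed. The sketch only checks that the modules can be located; it does not verify versions or any additional resources the packages may need at run time.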
If you use this code, please cite the paper as follows:
@inproceedings{Trapp2017b,
title={Retrieving Compositional Documents Using Position-Sensitive Word Mover’s Distance},
author={Trapp, Martin and Skowron, Marcin and Schabus, Dietmar},
booktitle={Proceedings of the ACM International Conference on the Theory of Information Retrieval},
year={2017},
doi={10.1145/3121050.3121084}
}
The code is licensed under the MIT License.