
Pipeline vs. End-to-End Architecture for Neural Data-to-Text

This is the code used to obtain the results reported in the manuscript "Neural data-to-text generation: A comparison between pipeline and end-to-end architectures". The article is available on arXiv.

The file main.sh is the main script of this project. You may run it to extract the intermediate representations from the data, to train the models, to evaluate each step of the pipeline approach (reported in Section 6 of the paper), and to generate text from the non-linguistic data with each model (reported in Section 7 of the paper).

To run the script, first install the Python dependencies by running the following command:

pip install -r requirements.txt
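
Optionally, you may want to install the dependencies into a virtual environment first; a minimal sketch (this is not required by the scripts):

python -m venv .venv
source .venv/bin/activate   # on Windows: .venv\Scripts\activate
pip install -r requirements.txt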

Main differences from the original repository

  • Solved some issues with the requirements; the requirements.txt file has been updated accordingly;
  • All paths are now relative rather than absolute, so there is no need to change the paths in the scripts/vars file. Instead of calling this file, the scripts call scripts/vars.sh, which echoes/prints the paths to the needed directories every time it is called;
  • You don't need to clone the dependencies (Moses, Nematus and Subword NMT) yourself. If they are not present in the repository root (DeepNLG/), they will be cloned into this directory when the main script is called (see the sketch after this list);
  • The scripts should now also work on Windows;
  • Fixed encoding issues that broke some scripts;
  • Fixed an error that kept occurring in the Nematus dependency. This is why my forked repository is cloned instead of the original one.
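
A minimal sketch of the behaviour described above, combining the path echoing of scripts/vars.sh with the clone-if-missing handling of the dependencies (variable names and the Nematus fork URL are illustrative placeholders, not the exact contents of the scripts):

#!/bin/bash
# scripts/vars.sh (sketch): resolve paths relative to the repository root and print them.
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
DATA_DIR="$ROOT/data"
echo "Repository root: $ROOT"
echo "Data directory:  $DATA_DIR"

# main.sh (sketch): clone each dependency only if it is not already present.
[ -d "$ROOT/mosesdecoder" ] || git clone https://github.com/moses-smt/mosesdecoder.git "$ROOT/mosesdecoder"
[ -d "$ROOT/subword-nmt" ]  || git clone https://github.com/rsennrich/subword-nmt.git "$ROOT/subword-nmt"
# The forked Nematus is cloned instead of the original (fork URL is a placeholder):
[ -d "$ROOT/nematus" ]      || git clone https://github.com/yetanotherfrancisdeveloper/nematus.git "$ROOT/nematus"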

How to run the repository

To execute everything without issues, I suggest running the main script as follows:

bash ./main.sh

More data

The augmented version of the WebNLG corpus is available here. For information about the evaluation, see the evaluation folder.

Reproducibility

For reproducibility, the data used in the experiments can be found here, and the results are available here.
