Iris edited this page Jun 17, 2022 · 2 revisions

Our analysis uses a single parser configuration:

  1. parser/parser_config.json - Biaffine dependency parser (Dozat and Manning, 2017)

The parser's accuracy in our analysis is reported in the Performance section of the README file.

=================================================================================

How to run the parser:

Step 1:

  • Adjust the parameters, including file paths, in the .json config file as needed. By default, the paths point to datasets in the data directory; see the README files there for details about the datasets.
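
As a rough sketch of Step 1, the data paths can also be changed programmatically. This assumes the config is plain JSON and uses AllenNLP's standard train_data_path and validation_data_path keys; the helper name override_paths and all file paths shown are hypothetical.

```python
import json


def override_paths(config, train_path, validation_path):
    """Return a copy of an AllenNLP-style config dict with new data paths."""
    updated = dict(config)  # shallow copy; only top-level keys are replaced
    updated["train_data_path"] = train_path
    updated["validation_data_path"] = validation_path
    return updated


# Stand-in for the contents of parser/parser_config.json.
# The real file has more keys; these paths are hypothetical.
example_config = {
    "train_data_path": "data/train.conllu",
    "validation_data_path": "data/dev.conllu",
}

new_config = override_paths(
    example_config, "my_data/train.conllu", "my_data/dev.conllu"
)
print(json.dumps(new_config, indent=2))
```

To apply this to the real file, load it with json.load, pass the resulting dict to the helper, and write the result back with json.dump.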

Step 2:

  • Preprocess the input data. The parser expects data to be in CoNLL-U format with the relevant columns being the second (FORM), the fifth (LABEL), the seventh (HEAD) and the eighth (DEPREL).
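
To check that a file has the expected layout, the four relevant columns can be pulled out with a few lines of Python. This is a minimal sketch, not the dataset reader the parser itself uses; the function name read_conllu_columns and the sample sentence are made up for illustration.

```python
def read_conllu_columns(lines):
    """Yield sentences as lists of (form, label, head, deprel) tuples."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        # Columns 2, 5, 7 and 8 (1-based) are FORM, LABEL, HEAD and DEPREL.
        sentence.append((cols[1], cols[4], int(cols[6]), cols[7]))
    if sentence:
        yield sentence


# Made-up example sentence in CoNLL-U layout.
sample = [
    "# text = The dog barks",
    "1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\tNN\t_\t3\tnsubj\t_\t_",
    "3\tbarks\tbark\tVERB\tVBZ\t_\t0\troot\t_\t_",
    "",
]

sentences = list(read_conllu_columns(sample))
for form, label, head, deprel in sentences[0]:
    print(form, label, head, deprel)
```

A ValueError from this function points to a row that is not tab-separated into ten columns, or to a HEAD value that is not an integer.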

Step 3:

Train the model:

  • Run allennlp train [params] -s [serialization dir] to train a model, where

[params] is the path to the .json config file.

[serialization dir] is the directory in which to save the trained model, logs and other results.
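
If training is launched from a script, the command above can be assembled as an argument list. The sketch below only builds and prints the command; the helper name train_command and the output directory are hypothetical.

```python
import shlex


def train_command(params, serialization_dir):
    """Build the argument list for `allennlp train [params] -s [serialization dir]`."""
    return ["allennlp", "train", params, "-s", serialization_dir]


# "output/biaffine" is a hypothetical serialization directory.
cmd = train_command("parser/parser_config.json", "output/biaffine")
print(shlex.join(cmd))
# prints: allennlp train parser/parser_config.json -s output/biaffine

# To actually start training, pass the list to subprocess.run(cmd, check=True).
```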

Step 4:

Evaluate the model:

  • Run allennlp evaluate [archive file] [input file] --output-file [output file] to evaluate the model on some evaluation data, where

[archive file] is the path to an archived trained model.

[input file] is the path to the file containing the evaluation data.

[output file] is an optional path to save the metrics as JSON; if not provided, the output will be displayed on the console.
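
The evaluation command can be assembled the same way, adding the output flag only when a path is given. This is a sketch under assumptions: the helper names and all file paths are hypothetical, and the metric names found in the JSON depend on the model.

```python
import json


def evaluate_command(archive_file, input_file, output_file=None):
    """Build the `allennlp evaluate` argument list; the output file is optional."""
    cmd = ["allennlp", "evaluate", archive_file, input_file]
    if output_file is not None:
        cmd += ["--output-file", output_file]
    return cmd


def load_metrics(path):
    """Load the metrics that the evaluate command saved as JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# All paths below are hypothetical.
with_output = evaluate_command(
    "output/biaffine/model.tar.gz", "data/test.conllu", "output/metrics.json"
)
console_only = evaluate_command("output/biaffine/model.tar.gz", "data/test.conllu")
print(" ".join(with_output))
print(" ".join(console_only))
```

After the command has run with an output file, load_metrics("output/metrics.json") returns the saved metrics as a dictionary.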

Step 5:

Use the model to make predictions:

  • Run allennlp predict [archive file] [input file] --use-dataset-reader --output-file [output file] to parse a file with a pretrained model, where

[archive file] is the path to an archived trained model.

[input file] is the path to the file you want to parse; this file should be in the same format as the training data, i.e. CoNLL-U (see Step 2).

--use-dataset-reader tells the parser to use the same dataset reader as it used during training.

[output file] is an optional path to save parsing results as JSON; if not provided, the output will be displayed on the console.
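
The saved predictions can be loaded for further processing. The sketch below assumes the output file holds one JSON object per line, one per parsed sentence; the function name read_predictions is hypothetical, and the field names in the sample line are assumptions, so inspect your own output to see which keys it contains.

```python
import json


def read_predictions(lines):
    """Parse prediction output, assuming one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


# Illustrative line only; the field names are assumptions, not confirmed output.
sample_output = [
    '{"words": ["The", "dog", "barks"], "predicted_heads": [2, 3, 0]}\n',
]

for prediction in read_predictions(sample_output):
    # List the available fields before relying on any of them.
    print(sorted(prediction))
```

With a real output file, pass the open file handle to read_predictions in place of the sample list.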
