Geometry-Aware Supertagging with Heterogeneous Dynamic Convolutions

This is our code for the paper Geometry-Aware Supertagging with Heterogeneous Dynamic Convolutions.

Citing

Until the paper is published, you can cite the arXiv preprint:

@misc{https://doi.org/10.48550/arxiv.2203.12235,
  doi = {10.48550/ARXIV.2203.12235},  
  url = {https://arxiv.org/abs/2203.12235},
  author = {Kogkalidis, Konstantinos and Moortgat, Michael},
  keywords = {Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Geometry-Aware Supertagging with Heterogeneous Dynamic Convolutions},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}

About

The model presents a new approach to constructive supertagging, based on an explicit graph representation that accounts for both the intra-tree interactions within a single supertag and the inter-tree interactions between (partially decoded) supertag sequences. To account for the disparity between the various modalities in the graph (i.e., sentential word order, subword contextualized vectors, tree-sequence order and intra-tree edges), we adopt a heterogeneous formulation. Decoding is performed in parallel over trees, with each temporal step corresponding to an increase in tree depth. Statefulness is achieved by representing each partially decoded tree with a single state-tracking vector, which is updated twice at each step: once with feedback from its own tree's last decoded fringe, and once with feedback from surrounding trees. The result is a highly parallel yet partially auto-regressive architecture whose memory complexity scales with the input and whose decoding time is near-constant. It achieves new state-of-the-art scores on four datasets while retaining the ability to predict rare supertags reliably.
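The decoding schedule described above can be illustrated with a toy loop. This is only a sketch under loud assumptions, not the code in dyngraphst.neural: the names `decode`, `predict_fringe` and `embed_symbol` are hypothetical, and the heterogeneous graph convolutions of the real model are replaced here by plain averages. What it does show is the control flow: all trees advance one depth level per step, and each state vector is updated twice per step.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def mean(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def decode(
    states: List[Vector],
    predict_fringe: Callable[[int, Vector, int], List[str]],
    embed_symbol: Callable[[str], Vector],
    max_depth: int,
) -> Tuple[List[List[List[str]]], List[Vector]]:
    """Toy depth-parallel decoding: one step per tree depth, not per token."""
    fringes: List[List[List[str]]] = [[] for _ in states]
    for depth in range(max_depth):
        # Every tree predicts its next fringe from its own state (conceptually in parallel).
        new = [predict_fringe(i, s, depth) for i, s in enumerate(states)]
        if not any(new):  # all trees are complete
            break
        for tree, fringe in zip(fringes, new):
            if fringe:
                tree.append(fringe)
        # First update: feedback from the tree's own last decoded fringe.
        states = [
            add(s, mean([embed_symbol(x) for x in fringe])) if fringe else s
            for s, fringe in zip(states, new)
        ]
        # Second update: feedback from surrounding trees.
        # Assumption: a mean over the immediate neighbours stands in for the
        # model's actual inter-tree message passing.
        states = [
            mean([states[j] for j in (i - 1, i, i + 1) if 0 <= j < len(states)])
            for i in range(len(states))
        ]
    return fringes, states


# Usage with an oracle that ignores the state and replays fixed fringes per depth.
levels = [[["NP"]], [["/"], ["S", "NP"]]]  # tree 0 has depth 1, tree 1 has depth 2


def oracle(index: int, state: Vector, depth: int) -> List[str]:
    return levels[index][depth] if depth < len(levels[index]) else []


fringes, final_states = decode(
    [[0.0, 0.0], [0.0, 0.0]], oracle, lambda s: [float(len(s)), 1.0], max_depth=5
)
```

The number of loop iterations is bounded by the deepest tree rather than by the sentence length, which is the intuition behind the near-constant decoding time mentioned above.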

Results

Averages of 6 repetitions compared to recent results on the respective datasets (crawled 23/03/2022).

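| Model | Accuracy | Frequent (100+) | Uncommon (10-99) | Rare (1-9) | OOV |
|---|---|---|---|---|---|
| **CCGbank** | | | | | |
| Attentive Convolutions | 96.25 | 96.64 | 71.04 | n/a | n/a |
| Ours | 96.29 | 96.61 | 72.06 | 34.45 | 4.55 |
| **CCGrebank** | | | | | |
| Recursive Tree Addressing | 94.70 | 95.11 | 68.86 | 36.76 | 4.94 |
| Ours | 95.07 | 95.45 | 71.06 | 34.45 | 4.55 |
| **French TLGbank** | | | | | |
| ELMO & LSTM | 93.20 | 95.10 | 75.19 | 25.85 | n/a |
| Ours | 95.92 | 96.40 | 81.48 | 55.37 | 7.25 |
| **Æthel** | | | | | |
| Symbol-Sequential Transformer | 83.67 | 84.55 | 64.70 | 50.58 | 24.55 |
| Ours | 93.67 | 94.83 | 73.45 | 53.83 | 15.79 |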
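For readers unfamiliar with the frequency breakdown, the following sketch shows one way such per-bin accuracies could be computed. It is not the repository's evaluation code: the function names are hypothetical, and it assumes that each token is assigned to a bin according to how often its gold supertag occurs in the training data, with OOV meaning zero occurrences.

```python
from collections import Counter
from typing import Dict, Sequence


def frequency_bin(count: int) -> str:
    """Map a training-set frequency to one of the bins used in the table."""
    if count == 0:
        return "OOV"
    if count < 10:
        return "Rare (1-9)"
    if count < 100:
        return "Uncommon (10-99)"
    return "Frequent (100+)"


def binned_accuracy(
    train_tags: Sequence[str], gold: Sequence[str], predicted: Sequence[str]
) -> Dict[str, float]:
    """Overall and per-bin accuracy, in percent."""
    counts = Counter(train_tags)
    correct: Counter = Counter()
    total: Counter = Counter()
    for g, p in zip(gold, predicted):
        # Each token contributes to the overall score and to exactly one bin.
        for key in ("Accuracy", frequency_bin(counts[g])):
            total[key] += 1
            correct[key] += int(g == p)
    return {key: 100 * correct[key] / total[key] for key in total}


# Toy usage: "NP" is frequent, "S" uncommon, "PP" rare, "X" unseen in training.
train = ["NP"] * 100 + ["S"] * 10 + ["PP"]
scores = binned_accuracy(
    train,
    gold=["NP", "NP", "S", "PP", "X"],
    predicted=["NP", "S", "S", "PP", "NP"],
)
```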

Project Structure

dyngraphst.neural contains the model architecture, and dyngraphst.data contains the data preprocessing code; see the READMEs of the respective directories for more details. Until the next stable release of PyTorch Geometric, the code requires two distinct Python environments: python3.9 with pytorch 1.10.2 and torch_geometric 2.0.3 for training/inference, and python3.10 for data processing and evaluation. The two will be coalesced at a later stage.
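Because the two environments are easy to mix up, a small guard at the top of a script can help. The helper below is a hypothetical convenience, not part of the repository; it only encodes the Python versions named above and does not check the pytorch or torch_geometric versions.

```python
import sys
from typing import Optional, Sequence

# Version-to-role mapping taken from the paragraph above; the helper itself is an assumption.
ROLES = {
    (3, 9): "training/inference",
    (3, 10): "data processing/evaluation",
}


def environment_role(version: Optional[Sequence[int]] = None) -> Optional[str]:
    """Return the role the given (or the running) interpreter is meant for, or None."""
    major, minor = tuple(version or sys.version_info)[:2]
    return ROLES.get((major, minor))


if __name__ == "__main__":
    role = environment_role()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {role or 'unsupported'}")
```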

How-to

Detailed instructions coming soon.

Contact & Support

If you have any questions or comments, or would like a pretrained model for a specific grammar or language, feel free to get in touch.