This repository holds the code of the Evidence Graph model, a model for parsing the argumentation structure of a text.
It is essentially a re-implementation of the model first presented in (1). Most of the work was done in 2016-2017. The code was used in the experiments of (2), (3) and (4).
This code runs on Python 3.8. It is recommended to install it in a separate virtual environment. Here are installation instructions for an Ubuntu 18.04 Linux system:
```shell
# basics
sudo apt install python3.8-dev
# for lxml
sudo apt install libxml2-dev libxslt1-dev
# for matplotlib
sudo apt install libpng-dev libfreetype6-dev
# for graph plotting
sudo apt install graphviz
```
Install all required Python libraries in the environment and download the language models required by the spaCy library:

```shell
make install-requirements
make download-spacy-data-de
make download-spacy-data-en
```
Furthermore, several microtext corpora required for the experiments can be downloaded with:

```shell
make download-corpora
```
Make sure all the tests pass:

```shell
make test
```
Run a (shortened and simplified) minimal experiment to see that everything is working:

```shell
env/bin/python src/experiments/run_minimal.py --corpus m112en
```
You should get (in the last lines of the output) average macro F1 scores for the base classifiers similar to:
(cc ~= 0.82, ro ~= 0.75, fu ~= 0.74, at ~= 0.72).
Evaluate the results, which have been written to `data/`:

```shell
env/bin/python src/experiments/eval_minimal.py --corpus m112en
```
You should get (in the first lines of the output) average macro F1 scores for the decoded results similar to:
(cc ~= 0.86, ro ~= 0.74, fu ~= 0.76, at ~= 0.71).
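For reference, the macro F1 reported here is the unweighted mean of the per-class F1 scores. The following is a pure-Python illustration of that metric in general, not the project's own evaluation code:

```python
# Illustration only: macro F1 is the unweighted average of per-class F1
# scores. This is a generic sketch, not the evaluation code of this repo.
def macro_f1(gold, pred):
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == p == label for g, p in zip(gold, pred))
        fp = sum(p == label and g != label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

gold = ["cc", "ro", "cc", "fu", "at", "cc"]
pred = ["cc", "ro", "fu", "fu", "at", "cc"]
print(round(macro_f1(gold, pred), 3))  # unweighted mean over the four labels
```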
Adjust `run_minimal.py`:

- Remove the line `folds = folds[:5]` in order to run all 50 train/test splits.
- In the experimental conditions, set `optimize` to `True`, so that the local model's hyperparameters are optimized.
- In the experimental conditions, set `optimize_weights` to `True`, so that the global model's hyperparameters are optimized.
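As a purely illustrative sketch of these adjustments (the actual structure of `run_minimal.py` may differ, so check the file itself):

```python
# Hypothetical sketch of the adjustments described above; names and
# structure are assumptions, not the literal contents of run_minimal.py.
folds = list(range(50))
# folds = folds[:5]  # line removed: run all 50 train/test splits
conditions = {
    "minimal": {
        "optimize": True,          # optimize the local model's hyperparameters
        "optimize_weights": True,  # optimize the global model's hyperparameters
    }
}
print(len(folds))
```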
For more details, see the actual experiment definitions in `src/experiments`.
Note that the results published in the papers were obtained using the Python 2 version of this code base. With the migration to Python 3 and various updated dependencies, the scores differ slightly. To reproduce the exact published scores, you will need to run version v0.4.0 of this code base.
Load a spaCy nlp object for the desired language and pass it together with a connective lexicon to the `TextFeatures`:

```python
import spacy

from evidencegraph.features_text import TextFeatures
from evidencegraph.classifiers import EvidenceGraphClassifier

my_features = TextFeatures(
    nlp=spacy.load("en"),  # load the spaCy model for the desired language
    connectives={},  # add a connective lexicon here
    feature_set=TextFeatures.F_SET_ALL_BUT_VECTORS,
)
clf = EvidenceGraphClassifier(
    my_features.feature_function_segments,
    my_features.feature_function_segmentpairs,
)
```
Derive a custom base classifier class (stick to the interface) and pass it to the `EvidenceGraphClassifier`:

```python
from evidencegraph.classifiers import BaseClassifier

class MyBaseClassifier(BaseClassifier):
    # do something different here
    pass

clf = EvidenceGraphClassifier(
    my_features.feature_function_segments,
    my_features.feature_function_segmentpairs,
    base_classifier_class=MyBaseClassifier,
)
```
Simply load a folder containing argument graph XML files into a `GraphCorpus`:

```python
from evidencegraph.corpus import GraphCorpus

corpus = GraphCorpus()
corpus.load("path/to/my/folder")
texts, trees = corpus.segments_trees()
```
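If you then want to split the parallel lists of texts and trees into train/test folds yourself, a plain k-fold sketch (not part of the evidencegraph API, just generic Python) could look like this:

```python
# Generic k-fold splitting over n parallel items; an illustrative sketch,
# not the fold definitions used by the experiments in this repo.
def kfold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        test_set = set(test)
        train = [i for i in range(n) if i not in test_set]
        yield train, test
        start += size

splits = list(kfold_indices(10, 5))
print(len(splits))  # 5 folds
```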
1. Joint prediction in MST-style discourse parsing for argumentation mining.
   Andreas Peldszus, Manfred Stede.
   In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), Lisbon, Portugal, September 2015.
2. Automatic recognition of argumentation structure in short monological texts.
   Andreas Peldszus.
   Ph.D. thesis, Universität Potsdam, 2018.
3. Comparing decoding mechanisms for parsing argumentative structures.
   Stergos Afantenos, Andreas Peldszus, Manfred Stede.
   In: Argument & Computation, Volume 9, Issue 3, 2018, Pages 177-192.
4. More or less controlled elicitation of argumentative text: Enlarging a microtext corpus via crowdsourcing.
   Maria Skeppstedt, Andreas Peldszus, Manfred Stede.
   In: Proceedings of the 5th Workshop on Argument Mining, EMNLP 2018, Brussels, Belgium, November 2018.