Pronominal adverbs #33

Open
cinkova opened this issue Oct 8, 2022 · 1 comment

Comments

@cinkova
cinkova commented Oct 8, 2022

I am wondering about the policy for tagging pronominal adverbs (Pronominaladverbien), e.g. dabei, damit, wozu, hierauf. Depending on the preposition lemma, they tend to be tagged as either PRON or ADV.
ADV is rather strange, since they usually substitute for a noun, just like personal pronouns. I am asking because a similar phenomenon is tagged as ADP in Irish, which is super weird. I have even found some ADP nodes that are not leaves (still in Irish).
Note that the German words I mention are not the "genuine" pronominal adverbs such as woher!

@dan-zeman
Member

The general policy for pronominal adverbs in UD is that their UPOS is ADV and they have a non-empty PronType (e.g. Int, Dem, Tot, Neg). However, the examples assumed there are typically the "genuine" pronominal adverbs ("where", "when", "there", "never" etc.).

I think that the traditional approach in German tagging (even before UD) is that words like damit, wozu are adverbs or pronominal adverbs (PROAV in the Stuttgart-Tübingen tagset). I understand that they substitute for a nominal, but one with a preposition. It is actually not unusual cross-linguistically that a prepositional phrase has an adverbial meaning and, if fused into one word, becomes an adverb. Moreover, in the specific German case, if we view dabei as a contraction of "bei da", it is actually a preposition (bei) with an adverb (da), so the head is an adverb, and it is even more natural for the whole thing to be an adverb than it would be if the head were a pronoun. It would actually be possible to treat them as multiword tokens in UD and split them into two syntactic words, but I'm not sure it would help much.

To summarize, I believe that the policy should be that these words are ADV (definitely not ADP! but also not PRON). Unfortunately, I'm afraid that this wasn't the policy when the pre-UD annotation of the GSD treebank was created, or the policy was not consistently followed. ADV seems to be the prevailing solution (http://hdl.handle.net/11346/PMLTQ-9XIU) but there are other tags like PRON, ADP, SCONJ or CCONJ. This should be fixed. The other German treebanks also prefer ADV and seem to be more consistent than GSD but they, too, have tagging errors.
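
For anyone who wants to look for such inconsistencies locally, here is a rough sketch in Python that scans CoNLL-U text for da(r)-/wo(r)-/hier- + preposition forms whose UPOS is not ADV. The function name `find_non_adv`, the `PREPOSITIONS` list and the regular expression are assumptions made for illustration; the list is not exhaustive, and the hits are only candidates for manual review (damit, for instance, can also be a genuine subordinating conjunction).

```python
import re

# Assumption: a non-exhaustive list of prepositions that combine with
# da(r)-, wo(r)- and hier- to form pronominal adverbs.
PREPOSITIONS = [
    "an", "auf", "aus", "bei", "durch", "für", "gegen", "hinter", "in",
    "mit", "nach", "neben", "über", "um", "unter", "von", "vor", "zu",
    "zwischen",
]

# Matches e.g. dabei, damit, darauf, wozu, worin, hierauf; it does not
# match "genuine" pronominal adverbs such as woher.
PRONOMINAL_ADVERB = re.compile(
    r"(?:dar?|wor?|hier)(?:" + "|".join(PREPOSITIONS) + r")",
    re.IGNORECASE,
)


def find_non_adv(conllu_text):
    """Return (form, upos) pairs for candidate forms whose UPOS is not ADV."""
    hits = []
    for line in conllu_text.splitlines():
        # Skip sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        token_id, form, upos = cols[0], cols[1], cols[3]
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in token_id or "." in token_id:
            continue
        if PRONOMINAL_ADVERB.fullmatch(form) and upos != "ADV":
            hits.append((form, upos))
    return hits
```

Calling `find_non_adv(open(path, encoding="utf-8").read())` on a treebank file (where `path` is whichever CoNLL-U file you want to check) returns the list of suspicious tokens; checking the FEATS column for a non-empty PronType could be added in the same loop.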

@dan-zeman dan-zeman added the bug label Oct 11, 2022