An appalling lemma discrepancy #480
Comments
Thoughts on this? I'd like to switch it to the double letter versions |
Yeah definitely not "appal". I guess it should be a VERB, with lemma "appall". |
That's not the only time there's
although not consistently so:
It looks like usually a state of mind is tagged as |
Deciding VERB (past participle) vs. ADJ is very nuanced. I defer to @amir-zeldes when it comes up. (One test for ADJ is un- negation, which works for some of these. Another is very modification. I'm not sure there's a good reason that shocked should be ADJ but amazed should be VERB.) |
It is indeed very tricky, since the original PTB guidelines are internally inconsistent and offer contradictory tests without a ranking. I do rely on PTB/ON precedents in doubtful cases, but my basic methodology, and hopefully the one you find in practice in the GU corpora, is articulated here, and I'll repeat it for convenience. The priorities we have noticed in prior corpora, and which we enforce, are, in order:
In OntoNotes, it seems "impressed" is about 50-50 when not used in a perfect construction or as a finite verb: it is tagged VBN whenever "by" appears (criterion 2), otherwise JJ. Criterion 4 would justify this behavior IMO, but we have criterion 3 ranked higher, which is why it is always VBN in GUM in this function. Personally I would opt for VBN here, it seems pretty transparent to me. |
Originally posted by @amir-zeldes in UniversalDependencies/UD_English-GUM#78 (comment)
If test 3 permits adding a copula, then "united states" => "states that are united" clearly passes, but so would canonical stative adjectives ("tall people" => "people who are tall"). So I'm not sure how that would favor the verb analysis. If it doesn't permit adding a copula, then "states that united" is not quite the same meaning, though it is related by the inchoative alternation. |
Yes, this is true, but it's more consistent with VBG (uniting factors -> factors which unite), which can always be construed that way, and I don't see why active participles get to stay verbs in this use when passives don't. For me the question is mainly one of the lexeme (lemma), and note that not all participle-like forms pass this test, for example "missing documents" are not "documents which miss". |
Well there's a semantic difference—"uniting" is dynamic, "united" is stative. Paraphrasing as a finite verb would usually favor the dynamic reading except with verbs that are inherently stative like "exist". |
Hi, @nschneid ! |
Right, but even for "existing", if I search for DT followed by it, I get 36:3 VBG... So it seems that, at least for active participles, PTB recognizes the deverbal nature regardless of lexical stativeness. |
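For anyone who wants to repeat this kind of count on a CoNLL-U treebank, here is a minimal sketch. It is not the query used above: the function name `tag_counts_after_dt` and the `SAMPLE` sentences are invented for illustration, and it only assumes the standard ten tab-separated CoNLL-U columns with XPOS in the fifth.

```python
from collections import Counter


def tag_counts_after_dt(conllu_text, form):
    """Count the XPOS tags of `form` when the preceding token is tagged DT."""
    counts = Counter()
    prev_xpos = None
    for line in conllu_text.splitlines():
        if not line.strip():
            prev_xpos = None  # blank line marks a sentence boundary
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        if cols[1].lower() == form and prev_xpos == "DT":
            counts[cols[4]] += 1
        prev_xpos = cols[4]
    return counts


# Toy data, invented for this example (not taken from PTB, EWT, or GUM).
SAMPLE = "\n".join([
    "# text = the existing rules",
    "1\tthe\tthe\tDET\tDT\t_\t3\tdet\t_\t_",
    "2\texisting\texist\tVERB\tVBG\t_\t3\tamod\t_\t_",
    "3\trules\trule\tNOUN\tNNS\t_\t0\troot\t_\t_",
    "",
    "# text = an existing plan",
    "1\tan\ta\tDET\tDT\t_\t3\tdet\t_\t_",
    "2\texisting\texisting\tADJ\tJJ\t_\t3\tamod\t_\t_",
    "3\tplan\tplan\tNOUN\tNN\t_\t0\troot\t_\t_",
])

print(tag_counts_after_dt(SAMPLE, "existing"))
```

Running the same function over a real treebank file (read into a string) would give the VBG:JJ ratio for whichever participle is in question.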
I'm willing to change EWT's treatment of "united" to VERB/VBG per GUM policies. But there are likely other lexical items in EWT that are not consistent with GUM. |
OK thanks - I wouldn't be surprised if there are also internal inconsistencies within GUM and EWT. It's probably a longer-term to-do, but worth looking at at some point. |
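As a possible starting point for that kind of audit, here is a rough sketch that lists word forms carrying more than one (UPOS, lemma) analysis in a CoNLL-U file. The function name `forms_with_multiple_analyses`, the `suffix` filter, and the `TOY` rows are assumptions made for illustration only. Plenty of forms are legitimately ambiguous, so the output is a list of candidates for manual review, not a list of errors.

```python
from collections import defaultdict


def forms_with_multiple_analyses(conllu_text, suffix="ed"):
    """Map each form ending in `suffix` to its (UPOS, lemma) pairs, if more than one."""
    analyses = defaultdict(set)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip comments, blank lines, multiword tokens, empty nodes
        form = cols[1].lower()
        if form.endswith(suffix):
            analyses[form].add((cols[3], cols[2]))  # (UPOS, LEMMA)
    return {f: sorted(a) for f, a in analyses.items() if len(a) > 1}


# Toy rows, invented for this example (not copied from EWT or GUM).
TOY = "\n".join([
    "1\tappalled\tappalled\tADJ\tJJ\t_\t0\troot\t_\t_",
    "",
    "1\tappalled\tappal\tVERB\tVBN\t_\t0\troot\t_\t_",
    "",
    "1\tshocked\tshocked\tADJ\tJJ\t_\t0\troot\t_\t_",
    "",
    "1\tshocked\tshocked\tADJ\tJJ\t_\t0\troot\t_\t_",
])

print(forms_with_multiple_analyses(TOY))
```

Keying on the form alone means that both lemma spelling differences ("appal" vs. "appall") and VERB vs. ADJ splits show up in the same report.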
vs
Should this be "appal" instead of "appall", anyway? Similar question for "enrol", I suppose. It doesn't look right to my American sensibilities.