These should have the NumType=Ord feature (as they specify an ordered list of items).
The "1." etc. variants should have the NumForm=Digit feature.
The "a)" etc. variants should have a NumForm feature, but no suitable value currently exists for these; maybe NumForm=Alpha (alphabetic -- "Examples: a, b, c, α, β, γ").
EWT tokenizes the "(" and ")" as separate tokens.
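To make the proposal above concrete, here is a rough sketch of how a marker form could be mapped to the suggested UPOS and features. The function name and the two regular expressions are made up for illustration, and NumForm=Alpha is only the value floated in this issue, not something UD currently defines. The check is purely shape-based, so a bare number in the middle of a sentence would also match.

```python
import re

# Hypothetical shape patterns for list-item markers (assumptions, not a UD rule).
# Digit markers: 1  1.  1)  (1)  2.1  2.1.  24.2
DIGIT_MARKER = re.compile(r"^\(?\d+(\.\d+)*[.)]?$")
# Alphabetic markers: a)  (a)  a.  (a bare letter is deliberately not matched)
ALPHA_MARKER = re.compile(r"^\(?[a-zα-ω][.)]$")


def proposed_feats(form):
    """Return (upos, feats) for a list-item marker, or None if it is not one."""
    if DIGIT_MARKER.match(form):
        return "NUM", "NumForm=Digit|NumType=Ord"
    if ALPHA_MARKER.match(form):
        # NumForm=Alpha is the value suggested in this issue; it does not exist yet.
        return "NUM", "NumForm=Alpha|NumType=Ord"
    return None


for form in ["1.1", "2.", "(a)", "a)", "the"]:
    print(form, proposed_feats(form))
```

If the parentheses end up tokenized separately, as in EWT, the alphabetic pattern would need to accept a bare letter and rely on context instead.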
Validation issues:
ERROR: Sentence GUM_textbook_governments-1 token 1 -- invalid X form '1.1'
ERROR: Sentence GUM_academic_eegimaa-1 token 1 -- invalid X form '2.'
ERROR: Sentence GUM_textbook_chemistry-1 token 1 -- invalid X form '2.1'
ERROR: Sentence GUM_textbook_chemistry-13 token 1 -- invalid X form '1.'
ERROR: Sentence GUM_textbook_chemistry-15 token 1 -- invalid X form '2.'
ERROR: Sentence GUM_textbook_chemistry-20 token 1 -- invalid X form '3.'
ERROR: Sentence GUM_textbook_chemistry-21 token 1 -- invalid X form '4.'
ERROR: Sentence GUM_textbook_chemistry-26 token 1 -- invalid X form '5.'
ERROR: Sentence GUM_academic_census-1 token 1 -- invalid X form '1'
ERROR: Sentence GUM_academic_economics-1 token 1 -- invalid X form '2.'
ERROR: Sentence GUM_academic_economics-2 token 1 -- invalid X form '2.1.'
ERROR: Sentence GUM_academic_economics-35 token 1 -- invalid X form '2.2.'
ERROR: Sentence GUM_academic_epistemic-23 token 30 -- invalid X form '8'
ERROR: Sentence GUM_academic_implicature-1 token 1 -- invalid X form '4.'
ERROR: Sentence GUM_academic_implicature-7 token 1 -- invalid X form '4.1.'
ERROR: Sentence GUM_academic_implicature-30 token 1 -- invalid X form '5.'
ERROR: Sentence GUM_academic_lighting-13 token 1 -- invalid X form '1.'
ERROR: Sentence GUM_academic_mutation-8 token 1 -- invalid X form '1.'
ERROR: Sentence GUM_academic_mutation-17 token 1 -- invalid X form '2.'
ERROR: Sentence GUM_academic_mutation-45 token 1 -- invalid X form '3.'
ERROR: Sentence GUM_academic_replication-12 token 14 -- invalid X form '(a)'
ERROR: Sentence GUM_academic_replication-12 token 25 -- invalid X form '(b)'
ERROR: Sentence GUM_academic_replication-12 token 36 -- invalid X form '(c)'
ERROR: Sentence GUM_academic_replication-20 token 11 -- invalid X form '(a)'
ERROR: Sentence GUM_academic_replication-20 token 18 -- invalid X form '(b)'
ERROR: Sentence GUM_academic_replication-20 token 25 -- invalid X form '(c)'
ERROR: Sentence GUM_academic_replication-20 token 35 -- invalid X form '(d)'
ERROR: Sentence GUM_academic_salinity-1 token 1 -- invalid X form '1.'
ERROR: Sentence GUM_bio_nida-33 token 1 -- invalid X form '1.'
ERROR: Sentence GUM_bio_nida-34 token 1 -- invalid X form '2.'
ERROR: Sentence GUM_bio_nida-35 token 1 -- invalid X form '3.'
ERROR: Sentence GUM_interview_herrick-48 token 3 -- invalid X form '1)'
ERROR: Sentence GUM_interview_herrick-48 token 20 -- invalid X form '2)'
ERROR: Sentence GUM_news_defector-35 token 19 -- invalid X form 'a)'
ERROR: Sentence GUM_news_defector-35 token 35 -- invalid X form 'b)'
ERROR: Sentence GUM_textbook_artwork-9 token 1 -- invalid X form '38.'
ERROR: Sentence GUM_textbook_artwork-27 token 1 -- invalid X form '39.'
ERROR: Sentence GUM_textbook_artwork-29 token 1 -- invalid X form '40.'
ERROR: Sentence GUM_textbook_artwork-31 token 1 -- invalid X form '41.'
ERROR: Sentence GUM_textbook_grit-1 token 1 -- invalid X form '2.2'
ERROR: Sentence GUM_textbook_history-1 token 1 -- invalid X form '1'
ERROR: Sentence GUM_textbook_history-2 token 1 -- invalid X form '1.1'
ERROR: Sentence GUM_textbook_history-73 token 1 -- invalid X form '1.'
ERROR: Sentence GUM_textbook_history-78 token 1 -- invalid X form '2.'
ERROR: Sentence GUM_textbook_spacetime-1 token 1 -- invalid X form '24.2'
ERROR: Sentence GUM_textbook_stats-1 token 1 -- invalid X form '2.3'
ERROR: Sentence GUM_voyage_isfahan-25 token 1 -- invalid X form '1'
ERROR: Sentence GUM_voyage_isfahan-31 token 1 -- invalid X form '2'
ERROR: Sentence GUM_voyage_isfahan-39 token 1 -- invalid X form '3'
ERROR: Sentence GUM_voyage_isfahan-43 token 1 -- invalid X form '4'
ERROR: Sentence GUM_voyage_isfahan-48 token 1 -- invalid X form '5'
ERROR: Sentence GUM_voyage_isfahan-50 token 1 -- invalid X form '6'
ERROR: Sentence GUM_voyage_isfahan-58 token 1 -- invalid X form '7'
ERROR: Sentence GUM_voyage_isfahan-64 token 1 -- invalid X form '8'
ERROR: Sentence GUM_voyage_isfahan-67 token 1 -- invalid X form '9'
ERROR: Sentence GUM_whow_basil-7 token 1 -- invalid X form '1'
ERROR: Sentence GUM_whow_basil-16 token 1 -- invalid X form '2'
ERROR: Sentence GUM_whow_basil-20 token 1 -- invalid X form '3'
ERROR: Sentence GUM_whow_basil-24 token 1 -- invalid X form '4'
ERROR: Sentence GUM_whow_basil-30 token 1 -- invalid X form '5'
ERROR: Sentence GUM_whow_basil-35 token 1 -- invalid X form '1'
ERROR: Sentence GUM_whow_basil-44 token 1 -- invalid X form '2'
ERROR: Sentence GUM_whow_basil-47 token 1 -- invalid X form '3'
ERROR: Sentence GUM_whow_basil-52 token 1 -- invalid X form '4'
ERROR: Sentence GUM_whow_basil-58 token 1 -- invalid X form '1'
ERROR: Sentence GUM_whow_basil-66 token 1 -- invalid X form '2'
ERROR: Sentence GUM_whow_basil-68 token 1 -- invalid X form '3'
ERROR: Sentence GUM_whow_basil-72 token 1 -- invalid X form '4'
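To see which marker shapes dominate a log like the one above, the lines can be tallied with a short script. This is only a sketch: the regular expression is inferred from the lines quoted in this issue, and the helper names are invented for the example.

```python
import re
from collections import Counter

# Pattern inferred from the validator lines quoted in this issue (an assumption).
ERROR_LINE = re.compile(
    r"ERROR: Sentence (?P<sent>\S+) token (?P<tok>\d+) -- invalid X form '(?P<form>[^']*)'"
)


def marker_shape(form):
    """Coarse shape label: each run of digits becomes 0, each letter becomes a."""
    shape = re.sub(r"\d+", "0", form)
    return re.sub(r"[^\W\d_]", "a", shape)


def tally_shapes(log_text):
    """Count marker shapes over all matching error lines in the log text."""
    return Counter(marker_shape(m.group("form")) for m in ERROR_LINE.finditer(log_text))


sample = """\
ERROR: Sentence GUM_textbook_governments-1 token 1 -- invalid X form '1.1'
ERROR: Sentence GUM_academic_eegimaa-1 token 1 -- invalid X form '2.'
ERROR: Sentence GUM_academic_replication-12 token 14 -- invalid X form '(a)'
ERROR: Sentence GUM_news_defector-35 token 19 -- invalid X form 'a)'
ERROR: Sentence GUM_textbook_artwork-9 token 1 -- invalid X form '38.'
"""

for shape, count in tally_shapes(sample).most_common():
    print(shape, count)
```

Grouping by shape rather than by exact form keeps "2." and "38." together, which is the level at which a tagging standard would be decided.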
I could see using Ord for the numerical ones, but until we sort out what we're doing about LS I will leave this open. I anticipate this will stay as-is for v2.13.
Ping regarding this ( and @nschneid) ... one of the more frequent errors from the CoreNLP constituency -> dependency converter arises because it wants to make the dependency "num" but the UPOS "X". If we come up with a standard and apply it to the EWT & GUM treebanks, I can implement that in the converter pretty easily.
X UPOS. -- EWT favours NUM for these and https://universaldependencies.org/u/pos/X.html states that it should be used restrictively.