
Space separated numbers #10

Open
AngledLuffa opened this issue Apr 30, 2021 · 1 comment

Comments

@AngledLuffa

I found a few examples where what is conceptually a single number is split into two tokens:

В доменна пещ бяха изгорени и 157 615 пиратски компактдиска, хванати от митниците.
7       157     157     NUM     Mc-pi   Definite=Ind|Number=Plur|NumType=Card   10      nummod  10:nummod       _
8       615     615     NUM     Mc-pi   Definite=Ind|Number=Plur|NumType=Card   7       flat    7:flat  _
Даваме на Югославия помощи за $ 50 000 помощ
7       50      петдесет        NUM     Mc-pi   Definite=Ind|Number=Plur|NumType=Card   9       nummod  9:nummod        _
8       000     000     NUM     Mc-pi   Definite=Ind|Number=Plur|NumType=Card   7       flat    7:flat  _

This is different from the treatment in FR_GSD, for example:

# text = À partir du XXIe siècle, les recensements réels des communes de moins de 10 000 habitants ont lieu tous les cinq ans.
17      10 000  10 000  NUM     _       Number=Plur     18      nummod  _       _

Conceptually, each of these looks like it should be a single token, although I don't know whether that is standardized in UD or exactly what implications that would have for downstream tools.
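To get a sense of how many such cases there are, something like the following rough sketch could list them. It is not part of any UD tooling: the function name `find_split_numbers` is made up for illustration, and it assumes the input is standard tab-separated CoNLL-U with ten columns. It reports every `NUM` token that is attached as `flat` to the `NUM` token immediately before it, which is the pattern in the two examples above.

```python
def find_split_numbers(conllu_text):
    """Return (head_form, dependent_form) pairs where a NUM token is attached
    to the immediately preceding NUM token with the relation `flat`.

    Hypothetical helper; assumes tab-separated, 10-column CoNLL-U input.
    """
    hits = []
    sentence = {}  # token id -> (form, upos, head, deprel)

    def flush():
        # Inspect the tokens collected for one sentence, then reset.
        for tok_id in sorted(sentence):
            form, upos, head, deprel = sentence[tok_id]
            prev = sentence.get(tok_id - 1)
            if (upos == "NUM" and deprel == "flat" and head == tok_id - 1
                    and prev is not None and prev[1] == "NUM"):
                hits.append((prev[0], form))
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends a sentence
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        # Skip multiword ranges (1-2), empty nodes (1.1) and malformed rows.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        head = int(cols[6]) if cols[6].isdigit() else None
        sentence[int(cols[0])] = (cols[1], cols[3], head, cols[7])
    flush()  # the last sentence may not be followed by a blank line
    return hits


if __name__ == "__main__":
    # The two rows quoted above for "157 615", rebuilt with tab separators.
    rows = [
        ["7", "157", "157", "NUM", "Mc-pi",
         "Definite=Ind|Number=Plur|NumType=Card", "10", "nummod", "10:nummod", "_"],
        ["8", "615", "615", "NUM", "Mc-pi",
         "Definite=Ind|Number=Plur|NumType=Card", "7", "flat", "7:flat", "_"],
    ]
    sample = "\n".join("\t".join(r) for r in rows)
    print(find_split_numbers(sample))
```

Splitting on tabs rather than on arbitrary whitespace matters here, because a single-token form such as the FR_GSD `10 000` contains a space inside one column.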

@osenova
Contributor

osenova commented Apr 30, 2021

Hi, thanks, I will look into that.
