Pablo Gamallo edited this page Jul 23, 2020 · 29 revisions

Detected bugs

Splitter

  • [es] Úsalo [fixed in commit b38f047]
    Some verb forms with attached clitics are not split correctly when they start with an uppercase letter.

    Úsalo con precaución.

    expected: 'Usa', 'lo', 'con', 'precaución', '.'
    got: 'Úsalo', 'con', 'precaución', '.'

    The same failure occurs with two clitics [fixed in commit 2e28d5d]

    Ábreselo antes.

    expected: 'Abre', 'se', 'lo', 'antes', '.'
    got: 'Ábreselo', 'antes', '.'

  • [gl] Este [fixed in commit e09dd4f]
    The determiner este is incorrectly split at the beginning of a sentence.

    Este xeito.

    expected: 'Este', 'xeito', '.'
    got: 'Es', 'te', 'xeito', '.'

    Value of variable $excep

  • [es] Ábraselo
    Some imperative forms of abrir are incorrectly lemmatized as abrasar when combined with some clitics.

    Ábraselo inmediatamente.

    expected: 'Abra', 'se', 'lo'
    got: 'Abrase', 'lo'

  • [gl] Split entities [fixed in commit 706a7bd]
    Some entities are split even in unambiguous positions (in the middle of a sentence).

    O concerto foi na Casa das Crechas

    expected: 'Casa', 'de', 'as', 'Crechas'
    got: 'Casa', 'de', 'as', 'Cre', 'che', 'as'

    Other examples: Follas Vellas, Ponte Caldelas, Alfama, Área Central, Rías Baixas, Torrente Ballester, Apóstolo
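
The fully specified reports above could be kept as a small regression table. The following is only a sketch in Python: `find_regressions` and `naive_split` are hypothetical names, not part of the tool, and `split` stands for whatever wrapper calls the real splitter locally. The sentences and expected tokens are copied from the reports.

```python
# Expected token sequences copied from the reports above.
CASES = [
    ("es", "Úsalo con precaución.", ["Usa", "lo", "con", "precaución", "."]),
    ("es", "Ábreselo antes.", ["Abre", "se", "lo", "antes", "."]),
    ("gl", "Este xeito.", ["Este", "xeito", "."]),
]


def find_regressions(split, cases=CASES):
    """Return (lang, sentence, expected, got) for every case that fails.

    `split` is a hypothetical callable (lang, sentence) -> list of tokens;
    wrap the real splitter however it is invoked locally.
    """
    failures = []
    for lang, sentence, expected in cases:
        got = split(lang, sentence)
        if got != expected:
            failures.append((lang, sentence, expected, got))
    return failures


def naive_split(lang, sentence):
    """Stand-in splitter: separates only the final period and ignores clitics."""
    return sentence[:-1].split() + ["."]


if __name__ == "__main__":
    for lang, sentence, expected, got in find_regressions(naive_split):
        print(f"[{lang}] {sentence}")
        print(f"  expected: {expected}")
        print(f"  got: {got}")
```

The stand-in splitter fails the two Spanish cases, since it never separates clitics, and passes the Galician one. Swapping in a wrapper around the real splitter would show whether the fixes listed above still hold.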

Tagger

  • [gl] a ría [fixed in commit ac8d345]

    A ría de Vigo.

    expected: "ría ría NCFS000"
    got: "ría rir VMII3S0"

  • [pt] Mos
    This entity is incorrectly split and tagged as PP+PP, even in positions without ambiguity (e.g. starting with an uppercase letter and in the middle of the sentence).

    (pt) Estive em Mos no verão.

    expected: "Mos mos NP00000"
    got: "Mos me+os PP+PP"
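
The "position without ambiguity" condition mentioned in the Mos report (and in the split entities report) can be sketched as a simple guard. This is an assumption about one possible heuristic, not the logic used in the actual commits; `is_sentence_initial` and `may_split` are hypothetical helper names.

```python
SENTENCE_END = {".", "!", "?"}


def is_sentence_initial(tokens, i):
    """True when tokens[i] opens a sentence (first token or right after . ! ?)."""
    return i == 0 or tokens[i - 1] in SENTENCE_END


def may_split(tokens, i):
    """Decide whether tokens[i] is still a candidate for clitic/contraction splitting.

    A capitalized token in the middle of a sentence is treated as an entity
    and left intact. A capitalized token at the start of a sentence remains
    ambiguous, so it stays a candidate and needs other evidence (e.g. a lexicon).
    """
    token = tokens[i]
    if token[:1].isupper() and not is_sentence_initial(tokens, i):
        return False
    return True


if __name__ == "__main__":
    tokens = ["Estive", "em", "Mos", "no", "verão", "."]
    for i, token in enumerate(tokens):
        print(token, may_split(tokens, i))
```

With this guard, Mos and Crechas would be left untouched in the examples above, while the contraction no is still split. Sentence-initial forms such as Este or Úsalo are not resolved by it and still depend on the splitter's own rules.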
