conllu

A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic processing with the Stanza pipeline, machine translation, and word alignment with the Eflomal tool.

machine-translation word-alignment conllu dataset-preparation parlamint

Updated May 6, 2024
Jupyter Notebook

eaklykova / syntaxcomp

A Python3 package for extracting syntactic complexity measures from CoNLL-U annotations.

syntax complexity sentence-segmentation udpipe conllu text-complexity clause-segmentation syntactic-complexity

Updated Jun 18, 2024
Python
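
To give a rough idea of what a syntactic complexity measure over CoNLL-U annotations can look like, here is a minimal sketch. It is not syntaxcomp's API: the function names `parse_heads`, `tree_depth`, and `mean_dependency_distance` and the sample sentence are hypothetical, and the code assumes a well-formed dependency tree with tab-separated columns where column 1 is the token ID and column 7 is the HEAD.

```python
# Hypothetical sample sentence in CoNLL-U format (10 tab-separated columns).
SENTENCE = (
    "# text = The dog barks.\n"
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
)


def parse_heads(sentence_block):
    """Map each token ID to its HEAD ID for one sentence block."""
    heads = {}
    for line in sentence_block.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if cols[0].isdigit():
            heads[int(cols[0])] = int(cols[6])
    return heads


def tree_depth(heads):
    """Longest path, in arcs, from any token up to the root (HEAD = 0)."""
    def depth(token):
        steps = 0
        while heads[token] != 0:  # assumes the tree has no cycles
            token = heads[token]
            steps += 1
        return steps
    return max(depth(token) for token in heads)


def mean_dependency_distance(heads):
    """Average linear distance between each non-root token and its head."""
    distances = [abs(tok - head) for tok, head in heads.items() if head != 0]
    return sum(distances) / len(distances)


heads = parse_heads(SENTENCE)
print(tree_depth(heads), mean_dependency_distance(heads))
```

Tree depth and mean dependency distance are only two of many possible measures; the package itself may define its measures differently.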

arthurdjn / udpos

Preprocessing and automatic download of Universal Dependencies datasets.

converter pytorch dataset txt conllu udpos

Updated Mar 15, 2020
Python

fergusq / bils

Small bilar packages

nlp json utilities telegram-bot irc-bot http-server bilar conllu

Updated Jun 17, 2018

GiulioTaralli / Hidden-Markov-Model-NER-tagging

NER tagging with HMM and Viterbi algorithm - University Project

python viterbi-algorithm pandas hidden-markov-model conllu ner-tagging

Updated Jul 27, 2024
Jupyter Notebook

MinionAttack / corpus-translator

Tool for translating a corpus file from one language to another.

nlp translation conllu huggingface

Updated Dec 8, 2022
Python

TajaKuzman / Text-Representations-in-FastText

Analysing different text representations for genre identification. I parse CoNLL-U files and extract various representations of a text (running text, lemmas, part-of-speech tags), then train a fastText model on each to see which representation is most beneficial for the genre identification task.

text-classification fasttext language-processing conllu genre-identification feature-analysis

Updated Aug 18, 2022
Jupyter Notebook

stefanrer / CountBigramFreqInConlluCorpus

Count bigram frequencies in a CoNLL-U format corpus.

python frequency json dictionary python3 bigrams conllu unigrams tscore bigram-frequency unigram-frequency

Updated Dec 23, 2023
Python
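
Counting bigrams in a CoNLL-U corpus can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the repository's code: the names `read_sentences` and `count_bigrams` and the sample corpus are hypothetical, word forms are taken from column 2 (FORM) and lowercased, and bigrams do not cross sentence boundaries.

```python
from collections import Counter

# Hypothetical two-sentence corpus in CoNLL-U format (tab-separated columns).
SAMPLE = (
    "# text = The dog barks.\n"
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
    "# text = The dog sleeps.\n"
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
)


def read_sentences(text):
    """Yield each sentence as a list of lowercased word forms."""
    forms = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if forms:
                yield forms
                forms = []
        elif line.startswith("#"):
            continue  # sentence-level comment, e.g. "# text = ..."
        else:
            cols = line.split("\t")
            # Keep plain integer IDs only; skip ranges ("1-2") and empty nodes ("1.1").
            if cols[0].isdigit():
                forms.append(cols[1].lower())
    if forms:
        yield forms  # last sentence if the file lacks a trailing blank line


def count_bigrams(text):
    """Count adjacent word pairs within each sentence."""
    counts = Counter()
    for forms in read_sentences(text):
        counts.update(zip(forms, forms[1:]))
    return counts


bigrams = count_bigrams(SAMPLE)
for pair, freq in bigrams.most_common(3):
    print(pair, freq)
```

Keeping bigrams inside sentence boundaries is a design choice; a corpus-wide sliding window would also pair the last token of one sentence with the first token of the next.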

Here are 23 public repositories matching this topic...

pyconll / pyconll

avramandrei / BERT-Sequence-Labeling

proycon / foliatools

rgalhama / spaCy2CoNLLU

udon2 / udon2

acoli-repo / conll

MinionAttack / conllu-conll-tool

danieldk / conllu-utils

fostroll / corpuscula

rhdunn / opennlp-extensions

MuhammadYaseenKhan / CoNLL-U_Parser

SapienzaNLP / exploring-srl

TajaKuzman / Parlamint-translation

eaklykova / syntaxcomp

arthurdjn / udpos

fergusq / bils

GiulioTaralli / Hidden-Markov-Model-NER-tagging

MinionAttack / corpus-translator

TajaKuzman / Text-Representations-in-FastText

stefanrer / CountBigramFreqInConlluCorpus
