✒️ Essay Grammar Checker

An essay grammar checker trained on the Russian Error-Annotated Learner English Corpus (REALEC) using spaCy.

Training

The checker consists of six pipelines, each trained on a specific group of error types. The error categories are mapped to pipelines as follows:

    "spelling": {"Spelling", "Capitalisation"},
    "punctuation": {"Punctuation"},
    "articles": {"Articles"},
    "vocabulary": {"lex_item_choice", "lex_part_choice",
                   "Category_confusion", "Formational_affixes"},
    "grammar_major": {"Tense_choice", "Prepositions", "Agreement_errors", "Redundant_comp"},
    "grammar_minor": {"Word_order", "Noun_number", "Numerals", "Verb_pattern", "Determiners"}
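To find out which pipeline is responsible for a given error tag, the mapping above can be inverted. The following is a minimal sketch: the names `PIPELINE_LABELS`, `LABEL_TO_PIPELINE`, and `pipeline_for` are illustrative and are not taken from the project code.

```python
# Error categories per pipeline, copied from the mapping above.
PIPELINE_LABELS = {
    "spelling": {"Spelling", "Capitalisation"},
    "punctuation": {"Punctuation"},
    "articles": {"Articles"},
    "vocabulary": {"lex_item_choice", "lex_part_choice",
                   "Category_confusion", "Formational_affixes"},
    "grammar_major": {"Tense_choice", "Prepositions",
                      "Agreement_errors", "Redundant_comp"},
    "grammar_minor": {"Word_order", "Noun_number", "Numerals",
                      "Verb_pattern", "Determiners"},
}

# Inverted view: error label -> name of the pipeline that handles it.
# (Hypothetical helper, not part of the repository.)
LABEL_TO_PIPELINE = {
    label: pipeline
    for pipeline, labels in PIPELINE_LABELS.items()
    for label in labels
}


def pipeline_for(label):
    """Return the pipeline name for an error label, or None if unknown."""
    return LABEL_TO_PIPELINE.get(label)


print(pipeline_for("Tense_choice"))
print(pipeline_for("Determiners"))
```

Because every label belongs to exactly one pipeline, the inverted dictionary has one entry per error category.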

The `project.yml` defines the data assets required by the project, as well as the available commands and workflows. For details, see the spaCy projects documentation. The `nlp.rehearse` method can also be used to update trained models.

Commands

The following commands are defined by the project. They can be executed using `spacy project run [name]`. Commands are only re-run if their inputs have changed.

Command	Description
`preprocess`	Convert the data to the spaCy format required for training
`generate_configs`	Generate the training configs with updated class weights
`train_pipelines`	Train the pipelines
`evaluate_pipelines`	Evaluate the trained pipelines
`assemble_pipelines`	Assemble the pipelines into a single model
`package`	Package the resulting model

Workflows

The following workflows are defined by the project. They can be executed using `spacy project run [name]` and will run the specified commands in order. Commands are only re-run if their inputs have changed.

Workflow	Steps
`all`	`preprocess` → `generate_configs` → `train_pipelines` → `evaluate_pipelines` → `assemble_pipelines` → `package`
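The same steps can also be driven one at a time from Python, for example to stop after evaluation. This is a sketch only: it assumes spaCy is installed in the current interpreter and that the script is started from the project directory. `build_command` and `run_steps` are hypothetical helpers, and `dry_run` defaults to printing the commands instead of executing them.

```python
import subprocess
import sys

# Step order copied from the `all` workflow in the table above.
ALL_WORKFLOW = [
    "preprocess",
    "generate_configs",
    "train_pipelines",
    "evaluate_pipelines",
    "assemble_pipelines",
    "package",
]


def build_command(name):
    """Return the argument list equivalent to `spacy project run <name>`."""
    return [sys.executable, "-m", "spacy", "project", "run", name]


def run_steps(steps, dry_run=True):
    """Run (or, with dry_run=True, only print) each step in order."""
    commands = [build_command(step) for step in steps]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first failing step.
            subprocess.run(command, check=True)
    return commands


# Preview everything up to and including evaluation.
run_steps(ALL_WORKFLOW[:4])
```

Passing `dry_run=False` executes the commands; spaCy still skips any step whose inputs have not changed.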

Assets

The following assets are defined by the project. They can be fetched by running `spacy project assets` in the project directory. The data used for training can be extracted from the corpus using the following code.

File	Source
`assets/realec/data_realec.tar.bz2`	REALEC

Performance

F1 scores per error type:

Error type	F1 score
`punctuation`	0.779
`spelling`	0.939
`capitalisation`	0.902
`articles`	0.852
`lex_part_choice`	0.235
`lex_item_choice`	0.685
`Category_confusion`	0.705
`Formational_affixes`	0.742
`Verb_pattern`	0.629
`Noun_number`	0.920
`Word_order`	0.527
`Numerals`	0.736
`Determiners`	0.044
`Agreement_errors`	0.835
`Prepositions`	0.710
`Redundant_comp`	0.495
`Tense_choice`	0.825
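The scores vary widely between error types, so it can help to rank them before deciding how much to trust a given label. The sketch below copies the reported F1 scores and lists the labels that fall under a cutoff; the 0.5 threshold and the `weakest_labels` helper are arbitrary choices made for illustration, not part of the project.

```python
# F1 scores as reported above.
F1_SCORES = {
    "punctuation": 0.779, "spelling": 0.939, "capitalisation": 0.902,
    "articles": 0.852, "lex_part_choice": 0.235, "lex_item_choice": 0.685,
    "Category_confusion": 0.705, "Formational_affixes": 0.742,
    "Verb_pattern": 0.629, "Noun_number": 0.920, "Word_order": 0.527,
    "Numerals": 0.736, "Determiners": 0.044, "Agreement_errors": 0.835,
    "Prepositions": 0.710, "Redundant_comp": 0.495, "Tense_choice": 0.825,
}


def weakest_labels(scores, threshold=0.5):
    """Return labels scoring below `threshold`, lowest score first."""
    low = [label for label, f1 in scores.items() if f1 < threshold]
    return sorted(low, key=scores.get)


for label in weakest_labels(F1_SCORES):
    print(f"{label}: {F1_SCORES[label]:.3f}")
```

Predictions for the labels printed here are the ones most worth reviewing by hand.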

Usage

Install

!pip install https://huggingface.co/iproskurina/en_grammar_checker/resolve/main/en_grammar_checker-any-py3-none-any.whl

# Using spacy.load().
import spacy
nlp = spacy.load("en_grammar_checker")

# Importing as module.
import en_grammar_checker
nlp = en_grammar_checker.load()
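Once the model is loaded, a text can be checked by passing it through `nlp`. The sketch below assumes that the packaged pipelines store their predictions as labelled spans in `doc.spans`, as spaCy's SpanCategorizer does; the span group names are not documented here, so the code iterates over every group it finds. `Error`, `load_checker`, and `collect_errors` are hypothetical helpers.

```python
from collections import namedtuple

# One detected error: span group, error label, text, character offsets.
Error = namedtuple("Error", "group label text start end")


def load_checker():
    """Load the installed package (requires the wheel from the Install step)."""
    import spacy  # imported lazily so the helpers below work without spaCy
    return spacy.load("en_grammar_checker")


def collect_errors(doc):
    """Gather all labelled spans from every span group of a processed Doc.

    Assumption: predictions live in doc.spans, keyed by group name.
    """
    errors = []
    for group, spans in doc.spans.items():
        for span in spans:
            errors.append(
                Error(group, span.label_, span.text,
                      span.start_char, span.end_char)
            )
    # Order by position in the text so the output reads top to bottom.
    return sorted(errors, key=lambda e: (e.start, e.end))
```

With the model installed, `collect_errors(load_checker()("Your essay text"))` returns the detected spans in reading order; the labels should correspond to the error categories listed under Training.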

Streamlit

streamlit run streamlit_app.py
