This is a simple Python package for getting Japanese readings (yomigana) and pitch accents using MeCab. Note that it does not account for accent changes in compound words; if you need that, consider pyopenjtalk (no accents) or pyopenjtalk_g2p_prosody from ESPnet (with accents) instead.
Install this via pip or pipx (or your favourite package manager):
pipx install "mecab-text-cleaner[unidecode,unidic]"
pip install "mecab-text-cleaner[unidecode,unidic]"
> mtc いい天気ですね。
イ]ー テ]ンキ デス ネ。
> mtc いい天気ですね。 --ascii
i] te]nki desu ne.
> mtc いい天気ですね --no-add-atype --no-add-blank-between-words
イーテンキデスネ
> mtc いい天気ですね --no-add-atype --no-add-blank-between-words -r kana
イイテンキデスネ
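In the output above, `]` marks the pitch-accent downstep: it is placed after the nth mora for accent type n, and accent type 0 (heiban) gets no marker. As a rough illustration of that notation (a standalone sketch, not the package's actual implementation — the real accent types come from the MeCab/UniDic dictionary), inserting the marker given a kana string and an accent type might look like:

```python
# Small kana that attach to the preceding character to form one mora.
SMALL_KANA = set("ャュョァィゥェォ")

def mark_downstep(kana: str, atype: int) -> str:
    """Insert ']' after the atype-th mora; atype 0 (heiban) means no downstep."""
    if atype <= 0:
        return kana
    mora = 0
    for i, ch in enumerate(kana):
        if ch not in SMALL_KANA:
            mora += 1
        # Only insert once the full mora (including any trailing small kana) ends.
        if mora == atype and (i + 1 >= len(kana) or kana[i + 1] not in SMALL_KANA):
            return kana[: i + 1] + "]" + kana[i + 1 :]
    return kana

print(mark_downstep("テンキ", 1))  # テ]ンキ
print(mark_downstep("デス", 0))   # デス
```

Note that small kana such as ョ count as part of the preceding mora, so `mark_downstep("キョー", 1)` yields `キョ]ー`, not `キ]ョー`.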
from mecab_text_cleaner import to_reading, to_ascii_clean
assert to_reading(" 空、雲。\n雨!(") == "ソ]ラ、 ク]モ。\nア]メ!("
assert to_ascii_clean(" 한空、雲。\n雨!(") == "han so]ra, ku]mo. \na]me!("
Thanks goes to these wonderful people (emoji key):
This project follows the all-contributors specification. Contributions of any kind welcome!