Added normalizer #8

Open
wants to merge 2 commits into
base: master
8 changes: 8 additions & 0 deletions AUTHORS
@@ -0,0 +1,8 @@
Balasankar C <balasankarc@autistici.org>
Diadara <nithin111@gmail.com>
Jerin Philip <monu1618@gmail.com>
Jishnu Mohan <jishnu7@gmail.com>
Nithin Saji <nithin111@gmail.com>
Vasudev Kamath <copyninja@users.noreply.github.com>
Vasudev Kamath <kamathvasudev@gmail.com>
diadara <nithin111@gmail.com>
47 changes: 47 additions & 0 deletions ChangeLog
@@ -0,0 +1,47 @@
CHANGES
=======

* Make test running and coverage analysis consider namespace package
* Make libindic a namespace package
* Update readme for matching new directory structure
* Add CircleCI configuration file
* Add tox integration
* Add a test for Malayalam syllablengram
* Add travis
* Fix package name
* Add testrepository support
* Tweak tests to use testtools
* Move to libindic.module structure
* Ignore intermediate files during build and test
* Add requirements for running and tests
* Add makefile for building
* Move tests to package
* Use pbr for packaging
* PEP8 compliance
* Run Autopep8 on docs
* Fix relative imports for Python3 support
* Fixing broken link
* add jquery.ime
* new ui
* javascript code improvements
* Version bumped 0.4. Docs added and template renamed
* changing template to reflect module rename
* adding an intro to docs
* adding sphinx based docs
* adding more docstrings
* pep8 cleaning in tests
* added test cases for english
* template name needs to match module name
* Bumped version to 0.3
* rename syllabalizer module to indicsyllabifier
* Version is bumped to 0.2
* Package depend on indicsyllabifier not syllabalizer
* Module named as indicngram
* Version is 0.1 and wrap long description
* Wrap lines to 80 characters
* Ignore .ropeproject created by elpy
* added templates
* fixed imports
* pep8 cleaning
* Update README.md
* Initial commit
18 changes: 15 additions & 3 deletions libindic/ngram/__init__.py
@@ -19,12 +19,17 @@
 # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 #
 import indicsyllabifier

+import normalizer

 class Ngram:
     """
     Ngram class.You need to create an object to use the function
     """


+    def __init__(self):
+        self.s=indicsyllabifier.getInstance()
+        self.n=normalizer.getInstance()

     def syllableNgram(self, text, window_size=2):
         """
@@ -37,10 +42,17 @@ def syllableNgram(self, text, window_size=2):
         window_size = int(window_size)
         words = text.split(" ")
         ngrams = []
+
+        # s = indicsyllabifier.getInstance()
+        # n = normalizer.getInstance() ##
+
         for word in words:
-            s = indicsyllabifier.getInstance()
+
             # TODO-Normalize before taking ngram!!!
-            syllables = s.syllabify(word)
+
+            word=self.n.normalize(word) ##
+
+            syllables = self.s.syllabify(word)
             syllable_count = len(syllables)
             window_start = 0
             window_end = 0
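
The change above normalizes each word before syllabifying it and reuses one syllabifier and one normalizer per `Ngram` object instead of fetching an instance inside the loop. A minimal, self-contained sketch of that flow follows. `StubNormalizer` and `StubSyllabifier` are hypothetical stand-ins, not the real `normalizer` or `indicsyllabifier` modules, and the sliding-window loop is an assumption, since the rest of `syllableNgram` is not visible in this diff.

```python
import unicodedata


class StubNormalizer:
    """Hypothetical stand-in for the normalizer module: Unicode NFC only."""

    def normalize(self, word):
        return unicodedata.normalize("NFC", word)


class StubSyllabifier:
    """Hypothetical stand-in for indicsyllabifier: one character per syllable."""

    def syllabify(self, word):
        return list(word)


class Ngram:
    def __init__(self, syllabifier=None, normalizer=None):
        # Create the helpers once per object, as the new __init__ in the diff does.
        self.s = syllabifier or StubSyllabifier()
        self.n = normalizer or StubNormalizer()

    def syllableNgram(self, text, window_size=2):
        window_size = int(window_size)
        ngrams = []
        for word in text.split():
            # Normalize first, then syllabify the normalized word.
            word = self.n.normalize(word)
            syllables = self.s.syllabify(word)
            # Assumed windowing: every run of window_size consecutive syllables.
            # Words shorter than the window contribute nothing in this sketch.
            for start in range(len(syllables) - window_size + 1):
                ngrams.append(syllables[start:start + window_size])
        return ngrams
```

Passing the helpers into the constructor keeps the sketch testable without the libindic packages installed; the real code obtains them through `getInstance()`.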