Thai Word Segmentation using TCC + Bidirectional RNNs (PyTorch)
Code credit: A Beginner's Guide to Deep NLP with PyTorch by Dr. Prachya Boonkwan
Colab notebook: https://colab.research.google.com/drive/1WS08VsjlZGAmCGsoI7AlRm-Do3zo-b-g
pip install nokcut
Trained on the BEST I Corpus training set (90% training, 10% test).
Epoch: 6
Loss: 0.017879242024514966
F1: 98.47012481095481
Results on the BEST I Corpus test set:
F-measure: 96.94929
Recall: 122271/125850 = 97.15614
Precision: 122271/126387 = 96.74333
Number of incorrect words: 3579
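The reported scores can be re-derived from the raw counts. The sketch below is only an illustration and is not taken from the project's code: the function name `precision_recall_f1` is hypothetical, and reading 122271 as correctly segmented words, 125850 as words in the reference, and 126387 as words in the model output is an assumption based on the standard definitions of recall and precision.

```python
def precision_recall_f1(correct, reference_total, predicted_total):
    """Return (precision, recall, F1) as percentages from word counts.

    Hypothetical helper for illustration; the argument meanings are
    assumed from the standard definitions, not from the project's code.
    """
    precision = correct / predicted_total
    recall = correct / reference_total
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision * 100, recall * 100, f1 * 100


# Counts copied from the test-set results above.
correct = 122271
reference_total = 125850
predicted_total = 126387

precision, recall, f1 = precision_recall_f1(correct, reference_total, predicted_total)
missed = reference_total - correct  # reference words not matched by the output

print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F-measure: {f1:.4f}")
print(f"Missed reference words: {missed}")
```

Under these assumptions the computed values agree with the reported ones to about four decimal places, and the difference 125850 - 122271 matches the reported count of 3579 incorrect words.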
Mr. Wannaphong Phatthiyaphaibun <wannaphong@kkumail.com>