/** Copyright by hty
Feel free to contact the author if you find any problems in this package: hongty106@gmail.com * */
-
This is Tianyu Hong's first version of a program that uses LDA to make predictions about Chinese messages. The training data set is in the file named "training data.xml", which contains all the crawled data. The data was crawled from several online healthcare communities and is therefore in Chinese. For now, I have only posted data about heart disease.
-
To use the program, you need the "WindowBuilder" plugin installed in Eclipse.
-
I use the "Fudan NLP" package to segment each message into words and remove all the stopwords.
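
To illustrate the stopword-removal step, here is a minimal sketch in plain Java. It is not the Fudan NLP API: it assumes the message has already been segmented into tokens, and the class name StopwordFilterSketch, the method removeStopwords, and the tiny stopword set are all hypothetical names made up for this example. The Chinese tokens are written as Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: filters stopwords out of an already-segmented message. */
class StopwordFilterSketch {

    /** Returns the tokens that are non-blank and not in the stopword set, in order. */
    static List<String> removeStopwords(List<String> tokens, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!token.trim().isEmpty() && !stopwords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical stopword list: "de" (possessive), "le" (aspect particle), "wo" (I).
        Set<String> stopwords = new HashSet<>(Arrays.asList("\u7684", "\u4e86", "\u6211"));
        // Assumed segmenter output for a message meaning "my heart does not feel well".
        List<String> tokens = Arrays.asList(
                "\u6211", "\u7684", "\u5fc3\u810f", "\u4e0d", "\u8212\u670d", "\u4e86");
        // Keeps the tokens for "heart", "not" and "comfortable".
        System.out.println(removeStopwords(tokens, stopwords));
    }
}
```

A HashSet is used for the stopword list so that each lookup takes constant time on average, which matters when the crawled corpus is large.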
-
I use Gibbs Sampling. The relevant methods are in the package "cn.edu.zju.lda".
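
For readers new to the technique, here is a minimal sketch of a collapsed Gibbs sampler for LDA. It is not the code in "cn.edu.zju.lda"; it only shows the general idea under some assumptions: documents are given as arrays of integer word ids, the priors alpha and beta are symmetric, and the class name LdaGibbsSketch and all of its members are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative collapsed Gibbs sampler for LDA (a sketch, not the packaged implementation). */
class LdaGibbsSketch {
    final int numTopics, vocabSize;
    final double alpha, beta;   // symmetric Dirichlet priors (assumed)
    final int[][] docs;         // docs[d][i] = word id of the i-th token of document d
    final int[][] z;            // z[d][i] = topic currently assigned to that token
    final int[][] docTopic;     // docTopic[d][k] = tokens of document d assigned to topic k
    final int[][] topicWord;    // topicWord[k][w] = times word w is assigned to topic k
    final int[] topicTotal;     // topicTotal[k] = all tokens assigned to topic k
    final Random rng;

    LdaGibbsSketch(int[][] docs, int vocabSize, int numTopics,
                   double alpha, double beta, long seed) {
        this.docs = docs;
        this.vocabSize = vocabSize;
        this.numTopics = numTopics;
        this.alpha = alpha;
        this.beta = beta;
        this.rng = new Random(seed);
        z = new int[docs.length][];
        docTopic = new int[docs.length][numTopics];
        topicWord = new int[numTopics][vocabSize];
        topicTotal = new int[numTopics];
        // Start from a uniformly random topic for every token.
        for (int d = 0; d < docs.length; d++) {
            z[d] = new int[docs[d].length];
            for (int i = 0; i < docs[d].length; i++) {
                z[d][i] = rng.nextInt(numTopics);
                update(d, docs[d][i], z[d][i], +1);
            }
        }
    }

    /** Adds delta (+1 or -1) to every count that involves this token/topic pair. */
    private void update(int d, int w, int k, int delta) {
        docTopic[d][k] += delta;
        topicWord[k][w] += delta;
        topicTotal[k] += delta;
    }

    /** One full sweep: resamples the topic of every token given all the others. */
    void sweep() {
        double[] p = new double[numTopics];
        for (int d = 0; d < docs.length; d++) {
            for (int i = 0; i < docs[d].length; i++) {
                int w = docs[d][i];
                update(d, w, z[d][i], -1);   // take the token out of the counts
                double sum = 0.0;
                for (int k = 0; k < numTopics; k++) {
                    // Unnormalized conditional probability of topic k for this token.
                    p[k] = (docTopic[d][k] + alpha) * (topicWord[k][w] + beta)
                            / (topicTotal[k] + vocabSize * beta);
                    sum += p[k];
                }
                // Draw a topic in proportion to p by walking the cumulative sum.
                double u = rng.nextDouble() * sum;
                int k = 0;
                double acc = p[0];
                while (u >= acc && k < numTopics - 1) {
                    k++;
                    acc += p[k];
                }
                z[d][i] = k;
                update(d, w, k, +1);         // put the token back under its new topic
            }
        }
    }

    /** Estimated topic distribution of document d from the current counts. */
    double[] theta(int d) {
        double[] t = new double[numTopics];
        double denom = docs[d].length + numTopics * alpha;
        for (int k = 0; k < numTopics; k++) {
            t[k] = (docTopic[d][k] + alpha) / denom;
        }
        return t;
    }

    public static void main(String[] args) {
        // Made-up toy corpus with a vocabulary of 4 word ids and 2 topics.
        int[][] docs = { {0, 1, 0, 1}, {2, 3, 2, 3}, {0, 1, 2, 3} };
        LdaGibbsSketch lda = new LdaGibbsSketch(docs, 4, 2, 0.5, 0.1, 42L);
        for (int iter = 0; iter < 200; iter++) {
            lda.sweep();
        }
        for (int d = 0; d < docs.length; d++) {
            System.out.println("doc " + d + ": " + Arrays.toString(lda.theta(d)));
        }
    }
}
```

In this sketch a segmented message would first be mapped to word ids through a vocabulary table, and the number of sweeps (200 here) is an arbitrary choice for the toy corpus rather than a recommended setting.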
Note: The UI may not look polished. Sorry about that. I'm working on the second version, and I hope that one will be more to your liking.