Releases: oroszgy/hunlp-resources
Webcorpuswiki word2vec model
Word2vec model trained on the Hungarian Webcorpus and on the Hungarian Wikipedia dump (as of 2017-04-21).
Parameters:
- 300 dimensions
- CBOW model
- minimum word frequency is set to 10
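To illustrate what the minimum word frequency parameter means in practice, here is a minimal sketch of a frequency cutoff applied while building a vocabulary. It is not the training code used for this release: `build_vocabulary`, `MIN_COUNT`, and the toy corpus are hypothetical names and data introduced only for this example.

```python
from collections import Counter

MIN_COUNT = 10  # minimum word frequency stated in the parameters above


def build_vocabulary(sentences, min_count=MIN_COUNT):
    """Return the set of words occurring at least `min_count` times."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, count in counts.items() if count >= min_count}


# Hypothetical toy corpus: "a" occurs 12 times, "kutya" only 3 times.
toy_sentences = [["a", "kutya"]] * 3 + [["a"]] * 9
vocabulary = build_vocabulary(toy_sentences)
```

With a cutoff of 10, only "a" survives in the toy corpus. Words below the cutoff get no vector, so a lookup in the released model can be expected to fail for rare words.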
Webcorpuswiki frequencies
Term and document frequency list of words, generated from the union of the Hungarian Webcorpus and the Hungarian Wikipedia dump (as of 2017-04-21).
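The difference between the two counts can be sketched as follows, assuming the usual definitions: term frequency is the total number of occurrences of a word, and document frequency is the number of documents that contain it. The tokenization and file layout of the released list are not stated here, so the function name and the toy documents below are hypothetical.

```python
from collections import Counter


def term_and_document_frequencies(documents):
    """Count total occurrences (TF) and number of containing documents (DF)."""
    term_freq = Counter()
    doc_freq = Counter()
    for tokens in documents:
        term_freq.update(tokens)      # every occurrence counts
        doc_freq.update(set(tokens))  # each document counts once per word
    return term_freq, doc_freq


# Hypothetical, already tokenized documents.
toy_documents = [["a", "kutya", "a"], ["a", "macska"]]
term_freq, doc_freq = term_and_document_frequencies(toy_documents)
```

Here "a" has a term frequency of 3 but a document frequency of 2, because it appears twice in the first document.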
Webcorpuswiki Brown clusters
Brown clusters (2^6 clusters) derived from the Hungarian Webcorpus and the Hungarian Wikipedia dump (as of 2017-04-21).
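Brown clustering is hierarchical, and a common way to store its output is one bit-string path per word, where a shorter prefix of the path identifies a coarser cluster. The sketch below assumes a tab-separated `path`, `word`, `count` layout, as written by some widely used Brown clustering tools. The format of the file in this release is not documented here, so treat the layout, the helper names, and the sample line as assumptions.

```python
def parse_paths_line(line):
    """Split one 'path<TAB>word<TAB>count' line into its three fields."""
    path, word, count = line.rstrip("\n").split("\t")
    return path, word, int(count)


def coarse_cluster(path, prefix_length):
    """Truncate a bit-string path to get a coarser cluster id."""
    return path[:prefix_length]


# Hypothetical line; the real file layout may differ.
path, word, count = parse_paths_line("010110\tkutya\t42\n")
```

Calling `coarse_cluster(path, 3)` on the sample path yields the prefix `"010"`, which groups the word with every other word whose path starts with the same three bits.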