Source code for the paper "How to Evaluate Word Representations of Informal Domain?"
Scraping data from Urban Dictionary 🎍
- Scraping data from webpage:
+ scrapy crawl UD
- Scraping data via the API:
+ scrapy crawl UD_API
UD_Extractor/
SeqLabeling/
Train Word2Vec, FastText, and GloVe on tweet data: `trainEmbedding/`
Use Twitter hashtag prediction as the extrinsic evaluation, a downstream task that takes the pretrained informal word vectors above as input.
HashtagPrediction/
Use Mean Average Precision (MAP) as the intrinsic evaluation metric on the word analogy task. Compare the correlations between the intrinsic and extrinsic tasks.
calcSim
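For readers unfamiliar with the MAP metric mentioned above, here is a minimal sketch of how it is usually computed. It is not the code in calcSim; the function names, the input layout (one ranked candidate list per query, plus a gold set of correct answers per query), and the toy data are assumptions made for illustration.

```python
from typing import Dict, Sequence, Set


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of gold answers.

    Assumes `ranked` contains no duplicate candidates.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            # Precision at this rank, counted only where a gold answer appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, Sequence[str]],
    gold: Dict[str, Set[str]],
) -> float:
    """Mean of the per-query average precision over all queries in `rankings`."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranked, gold.get(query, set()))
        for query, ranked in rankings.items()
    )
    return total / len(rankings)


# Hypothetical toy data: two queries with ranked candidates and gold answers.
toy_rankings = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
toy_gold = {"q1": {"a", "c"}, "q2": {"y"}}
toy_map = mean_average_precision(toy_rankings, toy_gold)
```

In this sketch a query with no gold answers contributes 0 to the mean; whether such queries should be skipped instead is a design choice the source does not specify.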
Informal word pair search tool, written in Flask: demo/