Install Tweepy
Get your Twitter app keys from https://apps.twitter.com/ and add them to the crawl_tweets.py
script.
python crawl_tweets.py -i tweet_ids_train.txt -a train-annot.json -o tweets_train.conll  # NAACL 2018 dataset
python crawl_tweets.py -i tweet_ids_dev.txt -a dev-annot.json -o tweets_dev.conll  # EACL 2017 dataset
python crawl_tweets.py -i tweet_ids_test.txt -a test-annot.json -o tweets_test.conll  # EACL 2017 dataset
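The three crawl commands differ only in the split name (train, dev, test), so they can be driven from one small helper. The following is a minimal sketch, not part of the repository: `build_command` and `crawl_all` are hypothetical names, and it assumes crawl_tweets.py and the ID and annotation files sit in the current working directory.

```python
import subprocess
import sys

# Split names taken from the crawl commands; file names follow the same pattern.
SPLITS = ("train", "dev", "test")


def build_command(split, python=sys.executable):
    """Return the argument list for crawling one split (hypothetical helper)."""
    return [
        python, "crawl_tweets.py",
        "-i", f"tweet_ids_{split}.txt",   # tweet IDs to fetch
        "-a", f"{split}-annot.json",      # annotations for those tweets
        "-o", f"tweets_{split}.conll",    # output file in CoNLL format
    ]


def crawl_all(splits=SPLITS):
    """Run the crawl for each split in turn, stopping at the first failure."""
    for split in splits:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_command(split), check=True)
```

Calling `crawl_all()` from the repository root, after the keys have been added to crawl_tweets.py, should produce the same three output files as running the commands by hand.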
Any publication reporting the work done using this data should cite the following papers:
@inproceedings{bhat2017joining,
  title={Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data},
  author={Bhat, Irshad and Bhat, Riyaz A and Shrivastava, Manish and Sharma, Dipti},
  booktitle={Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers},
  volume={2},
  pages={324--330},
  year={2017}
}

@inproceedings{bhat2018universal,
  title={Universal Dependency Parsing for Hindi-English Code-Switching},
  author={Bhat, Irshad and Bhat, Riyaz A and Shrivastava, Manish and Sharma, Dipti},
  booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},
  volume={1},
  pages={987--998},
  year={2018}
}
Irshad Ahmad Bhat
MS-CSE, IIITH, Hyderabad
bhatirshad127@gmail.com
irshad.bhat@research.iiit.ac.in