PaperCrawler crawls papers from the ACL Anthology.
This version collects the titles, authors, links, and years of papers published at four top NLP conferences (ACL, EMNLP, CoNLL, COLING).
If you need other attributes or conferences, please modify the code and open a pull request. Thank you!
From the Chrome Web Driver link, install the chromedriver that matches your operating system and your version of the Chrome web browser.
Save the chromedriver.exe file (on Windows) under the /chromedriver path.
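To confirm the driver landed where you expect before running the crawler, a minimal sketch like the one below may help. The helper name `expected_driver_path` and the relative `chromedriver` directory passed to it are illustrative assumptions, not part of this repository; only the `.exe` suffix on Windows comes from the step above.

```python
import platform
from pathlib import Path
from typing import Optional


def expected_driver_path(driver_dir: str, system: Optional[str] = None) -> Path:
    """Return where the chromedriver binary is expected (hypothetical helper)."""
    # platform.system() reports "Windows", "Linux", or "Darwin".
    system = system or platform.system()
    # Assumption: the binary keeps its default name, with .exe only on Windows.
    name = "chromedriver.exe" if system == "Windows" else "chromedriver"
    return Path(driver_dir) / name


if __name__ == "__main__":
    # "chromedriver" is an assumed directory; adjust it to where you saved the file.
    path = expected_driver_path("chromedriver")
    status = "found" if path.is_file() else "missing"
    print(f"{path}: {status}")
```

The resulting path is the kind of value you would later pass as CHROME_DRIVER_PATH.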
pip install selenium
pip install tqdm
pip install pandas
pip install openpyxl
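If you want to verify that the four packages above are visible to your Python interpreter, a small check along these lines should work. The function name `missing_packages` is a hypothetical helper for illustration; it only looks up top-level module names and does not check versions.

```python
from importlib.util import find_spec

# The packages installed in the step above.
REQUIRED = ["selenium", "tqdm", "pandas", "openpyxl"]


def missing_packages(names):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```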
python run_crawler.py --year CONFERENCE_YEAR \
--output_dir OUTPUT_DIR_PATH \
--chrome_driver_path CHROME_DRIVER_PATH
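To crawl several years in a row, the command above can be wrapped in a short Python loop. This is a sketch under stated assumptions: `build_command` and `crawl_years` are hypothetical helpers, and the years and paths in the example are placeholders. Only the three flags come from the command above.

```python
import subprocess
import sys


def build_command(year, output_dir, chrome_driver_path):
    """Build the argument list for one run_crawler.py invocation."""
    return [
        sys.executable, "run_crawler.py",
        "--year", str(year),
        "--output_dir", output_dir,
        "--chrome_driver_path", chrome_driver_path,
    ]


def crawl_years(years, output_dir, chrome_driver_path, dry_run=True):
    """Run the crawler once per year; with dry_run=True only print the commands."""
    for year in years:
        cmd = build_command(year, output_dir, chrome_driver_path)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the crawler exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values; replace them with your own years and paths.
    crawl_years([2019, 2020], "./output", "./chromedriver/chromedriver")
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces. Set `dry_run=False` once the printed commands look right.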