This project helps you scrape hotel information from Tripadvisor. The process is divided into two steps:
Step 1: url_parser.py scrapes each hotel's basic information (e.g. hotel name, URL, number of comments, hotel rank in country) and saves it as url_parser.csv in the data folder.
Step 2: content_parser.py scrapes each hotel's detailed information based on the Step 1 output (e.g. hotel rank, phone number, number of comments in each rank).
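The hand-off between the two steps is a CSV file, which could be sketched roughly as follows. This is only an illustration, not the project's actual code: the column names in FIELDS, the helper names save_hotels and load_hotel_urls, and the sample row are all hypothetical, and the snippet is written for Python 3 (it would need small adjustments under Python 2).

```python
import csv
import os
import tempfile

# Hypothetical column names; the real url_parser.csv may use different ones.
FIELDS = ["hotel_name", "url", "num_comments", "rank_in_country"]


def save_hotels(rows, path):
    """Step 1 side: write one row per hotel to the CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def load_hotel_urls(path):
    """Step 2 side: read back the hotel URLs to visit for detailed scraping."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row["url"] for row in csv.DictReader(f)]


if __name__ == "__main__":
    # Made-up sample row, used only to show the round trip.
    sample = [{
        "hotel_name": "Example Hotel",
        "url": "https://example.com/hotel-1",
        "num_comments": 120,
        "rank_in_country": 3,
    }]
    path = os.path.join(tempfile.mkdtemp(), "url_parser.csv")
    save_hotels(sample, path)
    print(load_hotel_urls(path))
```

Keeping the two steps decoupled through a file means Step 2 can be re-run on its own without repeating the Step 1 scrape.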
- Python 2
- Use Chromedriver or PhantomJS
- Set your target_url in url_parser.py
- Turn off debug mode in content_parser.py
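As a rough sketch of the driver choice above, assuming the scripts drive the browser through Selenium (this README does not say so explicitly), a small factory could look like this. The names make_driver and SUPPORTED_DRIVERS are hypothetical and not taken from url_parser.py or content_parser.py.

```python
SUPPORTED_DRIVERS = ("chrome", "phantomjs")


def make_driver(name="chrome"):
    """Return a Selenium WebDriver for the given browser name."""
    name = name.lower()
    if name not in SUPPORTED_DRIVERS:
        raise ValueError("unsupported driver: %s" % name)

    # Imported lazily so the name check above works without Selenium installed.
    from selenium import webdriver

    if name == "chrome":
        # Assumes the chromedriver executable can be found on your PATH.
        return webdriver.Chrome()
    # webdriver.PhantomJS exists only in older Selenium releases;
    # it was removed in Selenium 4.
    return webdriver.PhantomJS()
```

Calling make_driver("chrome") starts a real browser session, so remember to call quit() on the returned driver when scraping is finished.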
If you have any feature requests, don't hesitate to contact me :)
- Country selection
- Date selection
- Room selection
- Cookie usage
- Chromedriver: http://chromedriver.chromium.org