Tripadvisor_crawler

This project helps you scrape hotel information from Tripadvisor. The process is divided into two steps:

Step1

In url_parser.py, scrape each hotel's basic information (e.g. hotel name, URL, number of comments, hotel rank in country) and save it as url_parser.csv in the data folder.
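
As a rough illustration of the Step1 output, the sketch below writes a list of hotel records to a CSV file in the data folder. The column names and the `save_hotels` helper are assumptions made for this example and may not match what url_parser.py actually does. It uses Python 3 syntax, while the project itself targets Python2.

```python
import csv
import os

# Hypothetical column names; the real url_parser.py may use different ones.
FIELDS = ["hotel_name", "url", "num_comments", "rank_in_country"]


def save_hotels(rows, path=os.path.join("data", "url_parser.csv")):
    """Write a list of hotel dicts to a CSV file, creating the folder if needed."""
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Calling `save_hotels(rows)` with one dict per hotel would produce data/url_parser.csv with a header row followed by one line per hotel.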

Step2

In content_parser.py, scrape each hotel's detailed information based on the Step1 output, e.g. hotel rank, phone number, number of comments in each rank.
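
To show how Step2 could pick up the Step1 output, here is a minimal sketch that reads the hotel URLs back from the CSV. The `load_hotel_urls` helper and the `url` column name are hypothetical and only serve the example; content_parser.py may read the file differently. Python 3 syntax is used.

```python
import csv


def load_hotel_urls(path, url_column="url"):
    """Return the non-empty values of the URL column from the Step1 CSV."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row[url_column] for row in csv.DictReader(f) if row.get(url_column)]
```

Each returned URL would then be opened in the browser driver to scrape that hotel's detail page.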

Preparation

Python2
Use Chromedriver or PhantomJS
Set your target_url in url_parser.py
Turn off debug mode in content_parser.py
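
The browser setup from the list above might look roughly like the sketch below. It assumes the scripts drive the browser through Selenium (the README does not say so explicitly), that the `selenium` package is installed, and that Chromedriver can be found on your PATH. The `make_driver` helper is hypothetical, and only Chrome is covered because newer Selenium releases no longer ship PhantomJS support.

```python
def make_driver(browser="chrome", headless=True):
    """Create a Selenium WebDriver; only Chrome is covered in this sketch."""
    if browser != "chrome":
        raise ValueError("unsupported browser: %s" % browser)
    # Imported here so the module can be loaded without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    if headless:
        # Run Chrome without opening a visible window.
        options.add_argument("--headless")
    return webdriver.Chrome(options=options)
```

Remember to call `driver.quit()` on the returned driver when scraping is finished so the browser process is closed.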

To Be Continued

The features below are planned. If you have any other feature requests, don't hesitate to contact me :)

Country selection
Date selection
Room selection
Cookie usage

Reference

Chromedriver: http://chromedriver.chromium.org
