README.md

To install, first clone the repository, then create a virtual environment in the cloned directory:

    virtualenv -p python3 .

This requires at least Python 3.6.1.

Then activate the environment and install the dependencies:

    . bin/activate
    pip install -r requirements.txt

To create the database:

    sqlite3 albums.db

To create the tables:

    python3 models.py

To scrape the first ten pages of high-scoring albums:

    python3 scraper.py

To scrape some other page, pass its URL:

    python3 scraper.py $URL

To scrape deeper or less deep into the high-scoring albums pages, pass a number, for example:

    python3 scraper.py 100
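One way a script could tell the three invocations above apart (no argument, a URL, or a number) is to inspect its first argument. The sketch below is an assumption about that behavior, not code taken from scraper.py: `parse_args` and `DEFAULT_PAGES` are hypothetical names, and reading the number as a page count is a guess based on the "first ten pages" default.

```python
import sys

# Assumed default: the README says a plain run covers the first ten pages.
DEFAULT_PAGES = 10


def parse_args(argv):
    """Classify the command-line arguments of a hypothetical scraper CLI.

    Returns ("pages", n) when no argument or a number is given,
    and ("url", url) for anything else.
    """
    if not argv:
        return ("pages", DEFAULT_PAGES)
    arg = argv[0]
    if arg.isdigit():
        return ("pages", int(arg))
    return ("url", arg)


if __name__ == "__main__":
    mode, value = parse_args(sys.argv[1:])
    print(mode, value)
```

With this approach, a purely numeric argument is never treated as a URL, so the two forms cannot be confused.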

If you want to crawl deeper from a single starting point, change the global variable RECURSION_DEPTH at the start of scraper.py

If you want to use more or fewer Python threads, change MAX_WORKERS at the start of scraper.py
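To illustrate how the two settings might interact, here is a minimal sketch of a depth-limited crawl that fetches each level of links with a thread pool. It is not the code in scraper.py: it walks the links level by level rather than recursively, `fetch_links`, `crawl`, and `FAKE_LINKS` are hypothetical, and an in-memory link graph stands in for real pages so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor

RECURSION_DEPTH = 2  # how many link hops to follow from the starting page
MAX_WORKERS = 4      # number of threads fetching pages at the same time

# Hypothetical link graph standing in for real web pages.
FAKE_LINKS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "a"],
    "d": ["e"],
    "e": [],
}


def fetch_links(page):
    """Return the links found on a page (here: looked up in FAKE_LINKS)."""
    return FAKE_LINKS.get(page, [])


def crawl(start, depth=RECURSION_DEPTH, workers=MAX_WORKERS):
    """Visit pages reachable from `start` within `depth` hops."""
    seen = {start}
    frontier = [start]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(depth):
            next_frontier = []
            # Fetch every page of the current level concurrently.
            for links in pool.map(fetch_links, frontier):
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
            if not frontier:
                break
    return seen
```

In this sketch, raising the depth widens the set of pages reached, while raising the worker count only changes how many pages of one level are fetched at once.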