This is an extension of the original NICE scraper, which is currently hosted at ScraperWiki. The other tools in this collection present the data in a web browser, using the UI from the original scraper, which was implemented by Zarino.
The scraper accesses the NICE website daily to check that the PDF files we currently hold are still the latest versions, and to pull new PDFs as they are added. Once we have determined whether there is any new content, we optionally regenerate the source JSON and HTML used to render the data to the client.
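The daily check described above can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the names `sync_pdfs`, `run`, `fetch` and `regenerate` are hypothetical, and comparing SHA-256 hashes of the file contents is an assumed way of deciding whether a stored PDF is still the latest.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, List


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def sync_pdfs(urls: Dict[str, str], store: Path,
              fetch: Callable[[str], bytes]) -> List[str]:
    """Fetch each PDF and write it to `store` only if it is new or changed.

    `urls` maps a local file name to its remote URL (hypothetical layout).
    `fetch` is any callable that returns the body of a URL as bytes.
    Returns the names of the files that were added or updated.
    """
    store.mkdir(parents=True, exist_ok=True)
    changed = []
    for name, url in sorted(urls.items()):
        remote = fetch(url)
        local = store / name
        # Skip files whose stored copy is byte-for-byte identical.
        if local.exists() and sha256_of(local.read_bytes()) == sha256_of(remote):
            continue
        local.write_bytes(remote)
        changed.append(name)
    return changed


def run(urls: Dict[str, str], store: Path,
        fetch: Callable[[str], bytes],
        regenerate: Callable[[List[str]], None]) -> List[str]:
    """Sync the PDFs, then regenerate output only when something changed."""
    changed = sync_pdfs(urls, store, fetch)
    if changed:
        # Placeholder for rebuilding the JSON and HTML (assumed hook).
        regenerate(changed)
    return changed
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; in practice it could wrap whatever client the project already depends on.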
- git clone git://github.com/openhealthcare/guidance_scrapers.git
- cd guidance_scrapers
- virtualenv . --no-site-packages
- . bin/activate
- pip install -r requirements.txt
- cd setup; . ./install_cron.sh
- ....