Scrapes Google ratings information (stars and review count) for Australian hospitals without using Google Maps API
- Python 3
- Google Chrome
- Chromedriver
Copy the source code
git clone https://github.com/minho42/hospital-ranking.git
cd hospital-ranking/
Optional: Use virtual environment
python -m venv venv
source venv/bin/activate
Install required packages
pip install -r requirements.txt
Download Chromedriver
Change variable CHROME_DRIVER_PATH
in app.py
Download raw data ('Current Listing of Commonwealth declared hospitals')
Change filename to all_hospitals.xlsx
Run the script (this takes a long time, like > 30 minutes)
python app.py
Eventually, ranking.json
is generated in frontend/
i.e. frontend/ranking.json
which can be used in the frontend app
all_hospitals.xlsx
all_hospitals.json
[
{
"sector": "PUBLIC",
"state": "NSW",
"name": "CONCORD REPATRIATION HOSPITAL"
}
]
rating.json
[
{
"sector": "PUBLIC",
"state": "NSW",
"name": "CONCORD REPATRIATION HOSPITAL",
"stars": "2.9",
"reviews": "288"
}
]
frontend/ranking.json
[
{
"sector": "PUBLIC",
"state": "NSW",
"name": "CONCORD REPATRIATION HOSPITAL",
"stars": "2.9",
"reviews": "288",
"ranking": "2.9022899952150216"
}
]
weighted rating (WR) = (v / (v + m)) _ R + (m / (v + m)) _ C
R = average for the hospital (mean) = (Rating)
v = number of reviews for the hospital = (reviews)
m = minimum reviews required to be listed (currently 1)
C = the mean review across the whole reviews
Referenced from https://www.quora.com/How-does-IMDbs-rating-system-work