Fix kuringgai_nsw_gov_au source.
They changed the formatting of the addresses in the geocode API. I changed it to use a regexp that is a bit more agnostic to the specific format and just looks for the components to appear in roughly the right order.

I also had the code raise an exception that gives a better clue to the specific problem.
werdnum authored and 5ila5 committed Jun 26, 2024
1 parent caddefc commit b2211e7
Showing 1 changed file with 16 additions and 2 deletions.
@@ -1,5 +1,6 @@
 import datetime
 import json
+import re
 import requests
 
 from bs4 import BeautifulSoup
@@ -92,11 +93,24 @@ def fetch(self):
         r1 = s.get(q, headers = HEADERS)
         data = json.loads(r1.text)["Items"]
 
+        expected_address_regexp = re.compile(
+            r"{}\s+{}\s+{}[,\s]+NSW[,\s]+{}".format(
+                re.escape(self.street_number),
+                re.escape(self.street_name),
+                re.escape(self.suburb),
+                re.escape(self.post_code),
+            ),
+            re.IGNORECASE,
+        )
+
         # Find the geolocation for the address
         for item in data:
-            if address in item['AddressSingleLine']:
+            if expected_address_regexp.match(item["AddressSingleLine"]):
                 locationId = item["Id"]
 
+        if locationId == 0:
+            raise ValueError("Address not found")
+
         # Retrieve the upcoming collections for location
         q = requote_uri(str(API_URLS["schedule"]).format(locationId))
         r2 = s.get(q, headers = HEADERS)
@@ -105,7 +119,7 @@ def fetch(self):

soup = BeautifulSoup(responseContent, "html.parser")
services = soup.find_all("article")

entries = []

for item in services:
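
For illustration, here is a minimal standalone sketch of the matching approach described in the commit message. It is not the source file itself: the helper names `build_address_regexp` and `find_location_id` and the sample addresses are hypothetical, and the real API response is assumed to be a list of items with `Id` and `AddressSingleLine` keys, as the diff suggests. One deliberate difference: this sketch returns the first matching item, whereas the loop in the commit keeps the last match.

```python
import re


def build_address_regexp(street_number, street_name, suburb, post_code):
    """Compile a case-insensitive pattern that expects the address
    components in order, with optional commas around "NSW"."""
    return re.compile(
        r"{}\s+{}\s+{}[,\s]+NSW[,\s]+{}".format(
            re.escape(street_number),
            re.escape(street_name),
            re.escape(suburb),
            re.escape(post_code),
        ),
        re.IGNORECASE,
    )


def find_location_id(items, pattern):
    """Return the Id of the first item whose address matches the pattern.

    Raises ValueError when nothing matches, so the caller sees a clear
    error instead of a later failure on a placeholder id.
    """
    for item in items:
        if pattern.match(item["AddressSingleLine"]):
            return item["Id"]
    raise ValueError("Address not found")


# Made-up sample data (assumption): the real geocode response is not shown.
sample_items = [
    {"Id": "a1", "AddressSingleLine": "10 Example Street Sampleville NSW 2999"},
    {"Id": "b2", "AddressSingleLine": "12 EXAMPLE STREET SAMPLEVILLE, NSW, 2999"},
]

pattern = build_address_regexp("12", "Example Street", "Sampleville", "2999")
location_id = find_location_id(sample_items, pattern)
```

Because `match` anchors at the start of the string, an address such as "112 Example Street ..." does not match a search for street number "12". The pattern tolerates commas only around "NSW"; a comma between the street name and the suburb would still fail to match.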