All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- #2 - The schedule for 2020-09-01 introduces a new column in every time table for times in GMT; to address this, the code now counts how many columns contain "Time (.*)" (regex) to accurately offset the cell checking.
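
The column-counting idea in the entry above can be sketched as follows. This is a minimal illustration, not the project's code: the function name `count_time_columns` and the sample header row are assumptions; only the `Time (.*)` pattern comes from the entry.

```python
import re

# Pattern quoted in the changelog entry. As a regex, the parentheses form a
# group, so this matches "Time " followed by any text, e.g. "Time (GMT)".
TIME_HEADER = re.compile(r"Time (.*)")


def count_time_columns(header_cells):
    """Return how many header cells look like a time column (hypothetical helper)."""
    return sum(1 for cell in header_cells if TIME_HEADER.search(cell))


# Hypothetical header row after a GMT column was added to the table.
headers = ["Time (PDT)", "Time (GMT)", "9/1", "9/2", "9/3"]
offset = count_time_columns(headers)
day_cells = headers[offset:]  # skip the time columns before checking event cells
```

Counting the matching headers, rather than assuming a single time column, keeps the cell offset correct if further time-zone columns appear.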
- In uq.py, `black` as a background color no longer causes a `KeyError` while comparing Euclidean color distances.
- In uq.py:
  - `get_uq_from_cell()` no longer calls `get_closest_color()` when it doesn't find a match; instead, it will raise the new `MismatchedColor` exception.
  - `get_closest_color()` now has an optional parameter `is_uq` (default: `True`) to filter events by UQ or other (concerts, etc.). Related to #1.
  - `get_closest_color()` is now called by `Schedule.parse()`; `Schedule.parse()` will check whether a prior event is listed in a schedule before determining whether an event is a UQ or something else.
- #1 - In uq.py, `get_closest_color()` now properly identifies the appropriate "closest" color. For example, if an event was intended to be a concert but its closest color was an urgent quest, the concert should now be picked instead. For more context, look up issue #1 on GitHub.
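
The entries above describe picking the nearest key color by Euclidean distance, optionally restricted to UQs or to other events. A minimal sketch of that technique follows, under stated assumptions: the key layout (event name mapped to a color and a UQ flag), the `NAMED_COLORS` table, and the helper names are all hypothetical and are not taken from uq.py.

```python
import math

# Hypothetical named colors; a real color key may use other names or only hex.
NAMED_COLORS = {"black": (0, 0, 0), "red": (255, 0, 0)}


def to_rgb(color):
    """Convert a named color or a '#rrggbb' string to an (r, g, b) tuple."""
    name = color.strip().lower()
    if name in NAMED_COLORS:
        return NAMED_COLORS[name]
    hexpart = name.lstrip("#")
    return tuple(int(hexpart[i:i + 2], 16) for i in (0, 2, 4))


def closest_color(color, color_key, is_uq=True):
    """Return the name of the key entry nearest to `color`.

    `color_key` maps an event name to (color, is_uq_flag); this layout is an
    assumption. Only entries whose flag equals `is_uq` are considered.
    """
    target = to_rgb(color)
    candidates = {n: c for n, (c, flag) in color_key.items() if flag == is_uq}
    if not candidates:
        raise ValueError("no key entries of the requested kind")
    # Euclidean distance in RGB space between the cell color and each candidate.
    return min(candidates, key=lambda n: math.dist(target, to_rgb(candidates[n])))


sample_key = {
    "Urgent Quest A": ("#ff0000", True),
    "Concert": ("#f01010", False),
}
concert = closest_color("#f21212", sample_key, is_uq=False)
```

Filtering the candidates before measuring distance is what lets a concert win even when an urgent quest color is numerically closer.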
- Added a blacklist. Prior to this change, the scraper would keep opening https://pso2.com/news/urgent-quests/about because that page produced no UQs in the database. The scraper only knows to skip a page if the database contains at least one entry with its URL. The blacklist is not meant to be configured separately; please commit changes and/or file a pull request.
- Changed the repo name to PSO2NA UQ Parser.
- Readme now actually contains information!
- In uq.py:
  - Renamed `hexcolor` in `get_uq_from_cell` to `color`, as the color can be a hard-coded string (e.g. `red`) or a hex representation.
  - Renamed `get_colors_from_table()` to `get_colors_from_key()`, as `key` is a more accurate descriptor of the HTML table.
  - Instead of relying on hard-coded colors, a new function `get_closest_color()` was added: this should get the correct UQ if colors don't match exactly between the schedule and the color key.
- In uq.py:
  - `parse_date()` now considers cases where a schedule contains an event in the past year (e.g. December) posted in the new year (e.g. January).
  - Combined `MainPage.delete_old()` with `MainPage.parse()` since `delete_old()` required `parse()` anyway.
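
The year handling described for `parse_date()` can be illustrated with a small sketch. It assumes schedules list only a month and day, and it covers only the December/January boundary named in the entries; the function name `infer_year` is hypothetical and the project's actual logic may differ.

```python
from datetime import date


def infer_year(event_month, today):
    """Guess the year of an event whose schedule lists only month and day."""
    if today.month == 12 and event_month == 1:
        return today.year + 1  # schedule read in December for a January event
    if today.month == 1 and event_month == 12:
        return today.year - 1  # schedule read in January listing a December event
    return today.year


next_year = infer_year(1, date(2020, 12, 28))
previous_year = infer_year(12, date(2021, 1, 2))
```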
- In webhook.py:
  - Instead of only checking the direct next event, limit the results based on `LAST` known index.
- Added a specific check in `get_uq_from_cell()` for a hard-coded mismatch in colors (key and schedule colors differ).
- `webhook.execute_webhook()` should now update the `LAST` selected event.
- Instead of stopping parsing in `MainPage.parse()`, continue to parse, only ignoring existing entries. (They will be skipped.)
- Fixed incorrect reference to `UQMainPage` (see 1.0.0) in main.py.
- Fixed `self.schedules` not being `self.schedules.items()` in uq.py.
- Added a new function `MainPage.delete_old()` in uq.py that deletes schedules (and their associated UQs) if the schedule is no longer found on the main page. This is run in main.py. Note that if `MainPage.parse()` is not called beforehand, `delete_old()` will raise an error.
- Uncommented a `+ 1` in `parse_date()` in uq.py for years. The `+ 1` increments the year for future events in the new year, e.g. when the script is run in December and finds a new schedule for January. The year will be bumped up based on months (`12` and `1` respectively). If results have not been cached or updated, comment out the `+ 1`.
- Webhooks are now generic, no longer tied to Discord, as long as the method is `POST`. Consequently, you should use the full URL in the configuration, not just the ID.
- Added Discord webhook support. Set this up by copying webhook.yaml.example to `webhook.yaml` and filling out the ID of the webhook. This project is not affiliated with or endorsed by Discord.
- Added a check that ensures only new schedules are scraped, by getting the most recent entry and its schedule title.
- In uq.py:
  - `UQMainPage` and `UQSchedule` were renamed `MainPage` and `Schedule`, respectively. The `UQ` is already implied by the module name.
- In config.py:
  - Removed GUID per RSS entry, as the feed won't validate. Just kidding, changed the value because RSS readers can't handle this.
- Added missing dependency `feedgen`.
- Initial version