feat: Update regex to catch spaces between date characters
essteer committed Jul 12, 2024
1 parent 1b3f411 commit babdf6a
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions src/utils/extract.py
@@ -109,8 +109,8 @@ def get_regex_matches(text: str) -> list[re.Match]:
\s?
星期[⼀一二三四五六日] # Chinese day of week (including char variants)
\s?[\w\s]+ # Longform date
-(?P<month>(\d){1,2})月 # Month
-(?P<date>(\d){1,2})日 # Date
+(?P<month>(\d){1,2})\s?月 # Month
+(?P<date>(\d){1,2})\s?日 # Date
\s?
(?P<desc>.*?)?\s? # Event description
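
The effect of the added `\s?` in this hunk can be checked in isolation. The sketch below is an assumption-laden reduction, not the pattern from `src/utils/extract.py`: `OLD_DATE` and `NEW_DATE` are hypothetical names, and only the month/date fragment is reproduced, without the surrounding day-of-week and description parts.

```python
import re

# Hypothetical, reduced versions of the month/date fragment before and after
# the commit. The real pattern in src/utils/extract.py contains more parts.
OLD_DATE = re.compile(r"(?P<month>\d{1,2})月(?P<date>\d{1,2})日")
NEW_DATE = re.compile(r"(?P<month>\d{1,2})\s?月(?P<date>\d{1,2})\s?日")

# Sample strings made up for illustration.
for sample in ["7月12日", "7 月12 日", "7 月 12 日"]:
    old_hit = OLD_DATE.search(sample) is not None
    new_hit = NEW_DATE.search(sample) is not None
    print(f"{sample!r}: old={old_hit} new={new_hit}")
```

In this reduced form the new fragment tolerates one whitespace character between the digits and the following 月 or 日, but not between 月 and the date digits, so the third sample is still rejected by both versions.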
@@ -122,11 +122,11 @@ def get_regex_matches(text: str) -> list[re.Match]:
\d{1,2} # Hour
(:\d{2})? # Minutes (optional)
\s*
-([AaPp]{1}\s?[Mm]{1})?) # am/pm (optional, with possible space before "m")
+([AaPp]{0,1}\s?[Mm]{0,1})?) # am/pm (optional, with possible space before "m")
\s?-?\s? # hyphen with or without spaces
(?P<close> # End time
-(\d{1,2}(:\d{2})?\s*([AaPp]{1}\s?[Mm]{1})? # Logic as per start time
+(\d{1,2}(:\d{2})?\s*([AaPp]{0,1}\s?[Mm]{0,1})? # Logic as per start time
|[Ll][Aa][Tt][Ee]))? # End time may be "late"
\s?Bands\s*樂[隊團]:\s? # "Bands 樂隊:" or "Bands 樂團" (character variants)
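
The commit message does not say why the am/pm quantifiers changed from `{1}` to `{0,1}`, so the following is only a sketch of what the change accepts. `OLD_TIME` and `NEW_TIME` are hypothetical names for a reduced version of the time fragment, and the sample tokens are made up for illustration.

```python
import re

# Hypothetical, reduced versions of the time fragment before and after the
# commit: hour, optional minutes, optional whitespace, optional am/pm marker.
OLD_TIME = re.compile(r"\d{1,2}(:\d{2})?\s*([AaPp]{1}\s?[Mm]{1})?")
NEW_TIME = re.compile(r"\d{1,2}(:\d{2})?\s*([AaPp]{0,1}\s?[Mm]{0,1})?")

# fullmatch() is used so that a partial match on the digits alone does not
# hide the difference between the two versions.
for token in ["9pm", "9:30 p m", "9p", "9m"]:
    old_hit = OLD_TIME.fullmatch(token) is not None
    new_hit = NEW_TIME.fullmatch(token) is not None
    print(f"{token!r}: old={old_hit} new={new_hit}")
```

With each letter individually optional, the new fragment also consumes a lone "a"/"p" or a lone "m" after the time, which the old fragment left unmatched.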