Automatically segment rich text
There was a new rule published today that had an abstract longer than 2,000 characters! Rich text blocks have to be broken up into segments of 2,000 characters or fewer, so now the rich text helper does that automatically.
Mr0grog committed Nov 15, 2024
1 parent 4a59715 commit 0d02a27
Showing 1 changed file with 12 additions and 2 deletions.
rule_scout.py (14 changes: 12 additions & 2 deletions)
@@ -388,10 +388,20 @@ def main() -> None:
     print('Done!')


-def notion_rich_text(text: str | None):
+def notion_rich_text(text: str | None) -> dict:
+    segments = []
+    if text:
+        segment_length = 2000
+        max_segments = 100
+        segments = []
+        remainder = text
+        while remainder and len(segments) < max_segments:
+            segments.append(notion_text(remainder[:segment_length]))
+            remainder = remainder[segment_length:]
+
     return {
         'type': 'rich_text',
-        'rich_text': None if text is None else [notion_text(text)]
+        'rich_text': segments
     }
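
For readers who want to try the new behavior in isolation, here is a minimal standalone sketch of the segmenting logic from this diff. The `notion_text` stub is an assumption: the real helper lives elsewhere in rule_scout.py and is not shown in this commit, so the dictionary shape below is only a plausible stand-in. The 2,000-character limit comes from the commit message and the 100-segment cap from the diff itself.

```python
from typing import Optional

SEGMENT_LENGTH = 2000  # per-segment character limit named in the commit message
MAX_SEGMENTS = 100     # cap on segment count, matching max_segments in the diff


def notion_text(content: str) -> dict:
    # Hypothetical stand-in for the notion_text helper in rule_scout.py,
    # which this commit does not show.
    return {'type': 'text', 'text': {'content': content}}


def notion_rich_text(text: Optional[str]) -> dict:
    segments = []
    remainder = text or ''
    # Slice off up to SEGMENT_LENGTH characters at a time until the text is
    # used up or the segment cap is reached.
    while remainder and len(segments) < MAX_SEGMENTS:
        segments.append(notion_text(remainder[:SEGMENT_LENGTH]))
        remainder = remainder[SEGMENT_LENGTH:]

    return {
        'type': 'rich_text',
        'rich_text': segments,
    }


if __name__ == '__main__':
    block = notion_rich_text('a' * 4500)
    print([len(s['text']['content']) for s in block['rich_text']])
```

Two side effects of the change are visible in this sketch: `None` and empty strings now yield an empty list rather than `None`, and any text beyond 100 segments (200,000 characters) is dropped without an error.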

