Light cleanup on html_render_diff.py #145

Draft
wants to merge 7 commits into
base: main

Commits on May 2, 2023

  1. Clean up docstring style

    Trim spaces around docstring text and use a consistent style for multiline docstrings.
    Mr0grog committed May 2, 2023
    4d7082c
  2. Fix bug in _limit_spacers()

    In diffs that exceeded the maximum number of spacers, the `_limit_spacers()` function stripped out important tag information. This fixes the issue but introduces some performance overhead. To address that, a follow-on change should consider:
    1. Moving the spacer-limiting logic into `_customize_tokens()` so we don't create too many spacers in the first place.
    2. Revisiting the whole spacer approach. There may be better approaches now.
    Mr0grog committed May 2, 2023
    5b2db97
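
The idea behind the `_limit_spacers()` fix in commit 2 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Token` class, its `pre_tags`/`post_tags`/`is_spacer` fields, and the `limit_spacers` helper are all hypothetical names chosen for the example. The assumption is that surplus spacers may carry tag information, so dropping one means handing its tags to a neighboring token instead of discarding them.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """Hypothetical stand-in for a diff token; not the project's real class."""
    text: str
    pre_tags: list = field(default_factory=list)
    post_tags: list = field(default_factory=list)
    is_spacer: bool = False


def limit_spacers(tokens, max_spacers):
    """Drop spacers beyond `max_spacers`, keeping the tags they carried."""
    kept = []
    carried = []  # tags from dropped spacers, waiting for the next kept token
    spacer_count = 0
    for token in tokens:
        if token.is_spacer:
            spacer_count += 1
            if spacer_count > max_spacers:
                # Surplus spacer: remember its tags instead of losing them.
                carried.extend(token.pre_tags)
                carried.extend(token.post_tags)
                continue
        if carried:
            # Attach the carried tags in front of this token's own tags.
            token = Token(token.text, carried + token.pre_tags,
                          list(token.post_tags), token.is_spacer)
            carried = []
        kept.append(token)
    if carried:
        if kept:
            # No later token exists, so append the tags to the last kept one.
            last = kept[-1]
            kept[-1] = Token(last.text, list(last.pre_tags),
                             last.post_tags + carried, last.is_spacer)
        else:
            # Everything was dropped: keep an empty token to hold the tags.
            kept.append(Token("", pre_tags=carried))
    return kept
```

Building new `Token` objects for every kept token that inherits tags is one plausible source of the overhead the commit message mentions; creating fewer spacers up front would avoid that work.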
  3. Make contrast script a raw string

    This resolves some unhelpful warnings about invalid escape sequences. Nothing should be escaped in here in the first place; it's a pure JavaScript string with no substitutions or dynamic values.
    Mr0grog committed May 2, 2023
    38245ef
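
To illustrate why a raw string helps in commit 3, here is a small sketch. The script text is invented for the example (the real contrast script is not shown here), and `compiles_without_warning` is a hypothetical helper. It relies on the fact that Python warns at compile time about unrecognized escapes such as `\d` in ordinary string literals, while raw strings pass backslashes through untouched.

```python
import warnings

# Hypothetical embedded JavaScript; the backslashes belong to the JS regex,
# so the Python literal is raw and performs no escape processing.
CONTRAST_SCRIPT = r"""
(function () {
  const match = /rgba?\((\d+),\s*(\d+),\s*(\d+)/.exec(color);
})();
"""


def compiles_without_warning(source):
    """Return True if compiling `source` emits no warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return len(caught) == 0


# A plain literal containing \d triggers an invalid-escape warning;
# the raw-string version of the same literal does not.
plain_is_clean = compiles_without_warning('x = "\\d+"')
raw_is_clean = compiles_without_warning('x = r"\\d+"')
```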
  4. Remove vestigial token balancing code

    There's a big TODO about removing this once we had fully forked lxml's HTML differ. That happened a long time ago, and the changes we made at the time turned this into effectively wasted iteration (dead code). I ran a few tests over a variety of big and small diffs to confirm that the code being removed here no longer does anything, and that seems to be the case. Reading the logic also suggests it is entirely vestigial and never actually changes the tokens.
    Mr0grog committed May 2, 2023
    2e4483d
  5. Get rid of _customize_token()

    The only thing this function did was replace `href_token` instances with `MinimalHrefToken`. We did this at a time when we were using parts of the tokenization internals from lxml instead of fully forking it. We have long since fully forked it, however, so we should just create `MinimalHrefToken` instances where we want them in the first place instead of looping through and replacing other tokens with them.
    Mr0grog committed May 2, 2023
    cb540b6
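
The change described in commit 5 follows a general pattern that can be sketched as below. Both classes are hypothetical stand-ins (the real `href_token` and `MinimalHrefToken` definitions are not shown here), and the two tokenizer functions are invented names for the example. The point is only that constructing the desired type up front removes a second pass over the token list.

```python
class HrefToken(str):
    """Hypothetical stand-in for the generic link token (`href_token`)."""


class MinimalHrefToken(HrefToken):
    """Hypothetical stand-in; the real class's behavior is not modeled."""


def tokenize_then_replace(hrefs):
    """Old pattern: build generic tokens, then loop again to swap them."""
    tokens = [HrefToken(href) for href in hrefs]
    return [MinimalHrefToken(token) for token in tokens]  # extra pass


def tokenize_directly(hrefs):
    """New pattern: create the wanted token type in the first place."""
    return [MinimalHrefToken(href) for href in hrefs]
```

Under these assumptions both functions return equivalent tokens, so the replacement loop can be deleted without changing the output.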
  6. Rename _customize_tokens to _insert_spacers

    The new name more accurately describes what the function now does.
    Mr0grog committed May 2, 2023
    6b425b6
  7. b020d0c