Interline wildcards #42

Merged 7 commits on May 17, 2024

Commits on May 16, 2024

  1. Split apart the two different uses of the wildcard operator.

    Wildcards within, and between, lines behave very differently, so split
    them into two different constants. Right now this doesn't make any
    difference (the constants are the same!), but it makes it clearer in the
    code which is which at different points.
    ltratt committed May 16, 2024
    c6fa53e
  2. Shorten description.

    ltratt committed May 16, 2024
    a7ad059
  3. Change the default interline wildcard syntax to "..?".

    This is so that, when we shortly add new syntax, we can warn users of the
    change.
    ltratt committed May 16, 2024
    edf5480
  4. Use README.md as the crate doc string.

    This means only having to edit one file instead of keeping two
    nearly-identical things in sync. It also means that the doc strings in
    the README are tested.
    ltratt committed May 16, 2024
    7944723

Commits on May 17, 2024

  1. Introduce the ..* interline wildcard.

    This allows "group matching", which also implies backtracking.
    
    Consider this pattern:
    
    ```text
    A
    ..?
    B
    C
    ..?
    ```
    
    This will match successfully against the literal:
    
    ```text
    A
    D
    B
    C
    E
    ```
    
    but fail to match against the literal:
    
    ```text
    A
    D
    B
    B
    C
    E
    ```
    
    because the `..?` matched against the first "B", anchored the search, then
    immediately failed when the next pattern line, "C", did not match the
    second "B".
    
    In contrast the pattern:
    
    ```text
    A
    ..*
    B
    C
    ..?
    ```
    
    will, through backtracking, successfully match the literal.
    
    ```text
    A
    ..?
    B
    C
    ..*
    D
    E
    ```
    
    There are two reasons why you should default to using `..?` rather than `..*`.
    Most obviously `..?` does not backtrack and has linear performance. Less
    obviously `..?` prevents literals from matching when they contain multiple
    similar sequences. Informally, `..?` makes for more rigorous testing: `..?` can
    be thought of as "the next thing that matches must look like X" whereas `..*`
    says "skip things that are almost like X until you find something that is
    definitely X".
    
    ltratt committed May 17, 2024
    4e83550
  2. b71e3df
  3. Clarify the documentation.

    ltratt committed May 17, 2024
    0dc88e3
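
The difference between `..?` and `..*` described in the "Introduce the ..* interline wildcard" commit can be illustrated with a small standalone matcher. This is only a sketch, not the crate's implementation: the names `SKIP`, `GROUP`, `is_wildcard`, `matches` and `match_lines` are hypothetical, lines are compared after trimming whitespace, and the `..*` branch is assumed to keep trying later start positions until the whole group and the rest of the pattern match.

```rust
// Hypothetical sketch: none of these names are taken from the crate's API.
const SKIP: &str = "..?"; // anchors on the first candidate line, never retries
const GROUP: &str = "..*"; // retries later positions until the group matches

fn is_wildcard(line: &str) -> bool {
    line == SKIP || line == GROUP
}

/// Returns true if `pattern` accounts for every line of `text`.
pub fn matches(pattern: &str, text: &str) -> bool {
    let ptn: Vec<&str> = pattern.lines().map(str::trim).collect();
    let txt: Vec<&str> = text.lines().map(str::trim).collect();
    match_lines(&ptn, &txt)
}

fn match_lines(ptn: &[&str], txt: &[&str]) -> bool {
    let (head, rest) = match ptn.split_first() {
        Some((head, rest)) => (*head, rest),
        // An exhausted pattern only matches exhausted text.
        None => return txt.is_empty(),
    };
    if !is_wildcard(head) {
        // A literal line must equal the next text line exactly.
        return txt.first() == Some(&head) && match_lines(rest, &txt[1..]);
    }
    if rest.is_empty() {
        return true; // a trailing wildcard swallows whatever text is left
    }
    // The "group" is the run of literal lines up to the next wildcard.
    let group_len = rest.iter().take_while(|l| !is_wildcard(**l)).count();
    if group_len == 0 {
        return match_lines(rest, txt); // adjacent wildcards: defer to the next one
    }
    if head == SKIP {
        // Commit to the first text line equal to the group's first line.
        return match txt.iter().position(|l| *l == rest[0]) {
            Some(i) => match_lines(rest, &txt[i..]),
            None => false,
        };
    }
    // GROUP: try each start position at which the whole group matches.
    let (group, tail) = rest.split_at(group_len);
    (0..txt.len()).any(|i| {
        txt[i..].starts_with(group) && match_lines(tail, &txt[i + group_len..])
    })
}
```

Under these assumptions, `matches("A\n..?\nB\nC\n..?", "A\nD\nB\nB\nC\nE")` is false, because the skip wildcard commits to the first "B", whereas the same call with `..*` in place of the first `..?` is true. The skip branch recurses from a single position, while the group branch may revisit text lines, which is the cost the commit message warns about.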