reparsec

A small parsec-like parser combinator library with semi-automatic error recovery.

Installation

pip install reparsec

Usage

Example

With reparsec, a simple arithmetic expression parser and evaluator can be written like this:

from typing import Callable

from reparsec import Delay
from reparsec.scannerless import literal, parse, regexp
from reparsec.sequence import eof


def op_action(op: str) -> Callable[[int, int], int]:
    return {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }[op]


spaces = regexp(r"\s*")
number = regexp(r"\d+").fmap(int) << spaces
mul_op = regexp(r"[*]").fmap(op_action) << spaces
add_op = regexp(r"[-+]").fmap(op_action) << spaces
l_paren = literal("(") << spaces
r_paren = literal(")") << spaces

expr = Delay[str, int]()
expr.define(
    (
        number |
        expr.between(l_paren, r_paren)
    )
    .chainl1(mul_op)
    .chainl1(add_op)
)

parser = expr << eof()
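
The two `chainl1` calls in the definition above are what give `*` a higher precedence than `+` and `-`: the inner chain groups multiplications first, and the outer chain then combines those groups. The sketch below illustrates that folding in plain Python, without reparsec. It assumes `chainl1` follows the usual parsec convention (one or more operands separated by operators, combined left to right); the helper `fold_left` is a hypothetical name used only for this illustration and is not part of the library.

```python
from functools import reduce
from typing import Callable, List, Tuple

BinOp = Callable[[int, int], int]


def fold_left(first: int, rest: List[Tuple[BinOp, int]]) -> int:
    """Combine `first op1 x1 op2 x2 ...` strictly from left to right.

    Hypothetical helper: it mimics what a chainl1-style combinator is
    assumed to do with the operands and operator functions it has parsed.
    """
    return reduce(lambda acc, pair: pair[0](acc, pair[1]), rest, first)


add: BinOp = lambda a, b: a + b
sub: BinOp = lambda a, b: a - b
mul: BinOp = lambda a, b: a * b

# "1 + 2 * (3 + 4)" evaluated the way the nested chains would group it.
inner = fold_left(3, [(add, 4)])        # parenthesised sub-expression: 3 + 4
product = fold_left(2, [(mul, inner)])  # inner chain (mul_op): 2 * 7
total = fold_left(1, [(add, product)])  # outer chain (add_op): 1 + 14
print(total)  # 15

# Left associativity: "10 - 3 - 2" is (10 - 3) - 2, not 10 - (3 - 2).
print(fold_left(10, [(sub, 3), (sub, 2)]))  # 5
```

Swapping the order of the two chains in the grammar would invert the precedence, which is why `mul_op` is chained before `add_op`.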

The parser defined above can:

  • evaluate an expression:
>>> parser.parse("1 + 2 * (3 + 4)").unwrap()
15
  • report the first syntax error:
>>> parser.parse("1 + 2 * * (3 + 4 5)").unwrap()
Traceback (most recent call last):
...
reparsec.types.ParseError: at 8: expected '('
  • attempt to recover and report multiple syntax errors:
>>> parser.parse("1 + 2 * * (3 + 4 5)", recover=True).unwrap()
Traceback (most recent call last):
...
reparsec.types.ParseError: at 8: expected '(' (skipped 2 tokens), at 17: expected ')' (skipped 1 token)
  • automatically repair the input and return a result:
>>> parser.parse("1 + 2 * * (3 + 4 5)", recover=True).unwrap(recover=True)
15
  • track line and column numbers:
>>> parse(parser, """1 +
... 2 * * (
... 3 + 4 5)""", recover=True).unwrap()
Traceback (most recent call last):
...
reparsec.types.ParseError: at 2:5: expected '(' (skipped 2 tokens), at 3:7: expected ')' (skipped 1 token)
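
The `2:5` and `3:7` positions in the last example correspond to flat character offsets in the multi-line input. The sketch below shows one way such a mapping can be computed in plain Python. It is an illustration only, not reparsec's implementation; `offset_to_line_col` is a hypothetical helper, and 1-based lines and columns are assumed because that matches the output shown above.

```python
from typing import Tuple


def offset_to_line_col(text: str, offset: int) -> Tuple[int, int]:
    """Map a 0-based character offset to a 1-based (line, column) pair.

    Hypothetical helper for illustration; not part of reparsec.
    """
    # Every newline before the offset starts a new line.
    line = text.count("\n", 0, offset) + 1
    # rfind returns -1 when there is no earlier newline, which makes the
    # column on the first line come out as offset + 1.
    last_newline = text.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


source = "1 +\n2 * * (\n3 + 4 5)"
print(offset_to_line_col(source, 8))   # (2, 5): the second "*"
print(offset_to_line_col(source, 18))  # (3, 7): the unexpected "5"
```

The second offset is 18 rather than the 17 reported for the single-line input because the multi-line string contains one extra character, the newline after `(`.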

More examples: