Find the regex that fits all of the query strings.
- Dozens of pre-written regexes are indexed and organized as a partial order, available in regexorder/templates.json.
- The regex returned is the least upper bound, in that partial order, of all the query strings.
- templates.svg plots the partial order.
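To illustrate what a least upper bound means here, the following is a minimal sketch over a tiny, made-up partial order. The template names, patterns, and helper functions are hypothetical; they are not taken from regexorder/templates.json or from the library's code. Picking the deepest common ancestor works only because this toy order is a tree, so treat it as an illustration rather than the library's algorithm.

```python
import re

# Hypothetical toy templates: name -> (regex, names of direct parents).
# A parent is more general than its child. This is not the library's data.
TEMPLATES = {
    "digits": (r"\d+", ["alnum"]),
    "lower": (r"[a-z]+", ["alnum"]),
    "alnum": (r"[a-z0-9]+", ["any"]),
    "any": (r".*", []),
}


def ancestors(name):
    """Return the template itself plus every template more general than it."""
    seen = {name}
    stack = [name]
    while stack:
        for parent in TEMPLATES[stack.pop()][1]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific(s):
    """Return the matching template that sits deepest in the toy order."""
    matching = [n for n, (rx, _) in TEMPLATES.items() if re.fullmatch(rx, s)]
    return max(matching, key=lambda n: len(ancestors(n)))


def least_upper_bound(strings):
    """Return the deepest template that is at or above every string's template."""
    common = set.intersection(*(ancestors(most_specific(s)) for s in strings))
    return max(common, key=lambda n: len(ancestors(n)))


print(least_upper_bound(["123", "456"]))  # digits
print(least_upper_bound(["123", "abc"]))  # alnum
```

In this sketch, "123" and "abc" fall under different specific templates, so the result climbs to their closest shared ancestor, "alnum".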
The core of the library is the set of pre-written regexes and the structure that relates them. Currently, they cover only the most common cases.
- Ideas and contributions are highly welcome.
For interesting applications of this library, please refer to extratools.
This library is part of the implementation for our research paper, which is under review. If you plan to use it for research purposes, please cite either this repo or our paper accordingly.
- Details of our paper will be released soon.
@misc{regexorder,
author = {Chuancong Gao},
title = {{RegexOrder}},
howpublished = "\url{https://github.com/chuanconggao/RegexOrder}",
year = {2018}
}
This package is available on PyPI. To install it, run pip3 install -U RegexOrder.
Our regexes use some advanced Unicode features that are not yet available in Python's standard re library. Thus, the more advanced third-party regex library must be used to match our regexes.
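The difference can be checked with a short sketch. It takes one of the patterns shown in this README and tries it with both libraries. The two helper functions are hypothetical names introduced for illustration, and the second one assumes the regex package may or may not be installed, so it returns None when the import fails.

```python
import re

# One of the patterns shown in this README; \p{Ll} means "lowercase letter".
PATTERN = r"\p{Ll}+(\s+\p{Ll}+)*"


def compiles_with_stdlib(pattern):
    """Return True if the standard re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def fullmatch_with_regex(pattern, text):
    """Match with the third-party regex module; return None if it is missing."""
    try:
        import regex  # installed separately, e.g. pip3 install regex
    except ImportError:
        return None
    return regex.fullmatch(pattern, text) is not None


print(compiles_with_stdlib(PATTERN))
print(fullmatch_with_regex(PATTERN, "cheese cake"))
```

On current CPython releases the first call reports that re rejects the \p{...} escape, while the second succeeds whenever the regex package is available.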
from regexorder import RegexOrder
r = RegexOrder()
t = r.match("123")
t.name
# 'pos_int'
t.regex
# '\\+?\\d+'
t = r.matchall(["apple", "banana", "cheese cake"])
t.name
# 'lower_words'
t.regex
# '\\p{Ll}+(\\s+\\p{Ll}+)*'