PCRE2 bindings for lua.
NOTE: this module is under heavy development.
- lauxhlib: https://github.com/mah0x211/lauxhlib
local pcre2 = require('pcre2')
ANCHORED
: Force pattern anchoring.ALT_BSUX
: Alternative handling of\u
,\U
, and\x
.ALT_CIRCUMFLEX
: Alternative handling of^
in multiline mode.AUTO_CALLOUT
: Compile automatic callouts.CASELESS
: Do caseless matching.DOLLAR_ENDONLY
:$
not to match newline at end.DOTALL
:.
matches anything including NL.DUPNAMES
: Allow duplicate names for subpatterns.EXTENDED
: Ignore white space and#
comments.FIRSTLINE
: Force matching to be before newline.MATCH_UNSET_BACKREF
: Match unset back references.MULTILINE
:^
and$
match newlines within data.NEVER_BACKSLASH_C
: Lock out the use of\C
in patterns.NEVER_UCP
: Lock outUCP
option, e.g. via (*UCP)NEVER_UTF
: Lock outUTF
option, e.g. via (*UTF)NO_AUTO_CAPTURE
: Disable numbered capturing par theses. (named ones available)NO_AUTO_POSSESS
: Disable auto-possessification.NO_DOTSTAR_ANCHOR
: Disable automatic anchoring for.*
.NO_START_OPTIMIZE
: Disable match-time start optimizations.NO_UTF_CHECK
: Do not check the pattern forUTF
valid. (only relevant ifUTF
option is set)UCP
: Use Unicode properties for\d
,\w
, etc.UNGREEDY
: Invert greediness of quantifiers.UTF
: Treat pattern and subjects as UTF strings
JIT_COMPLETE
: compile code for full matching.JIT_PARTIAL_SOFT
: compile code for soft partial matching.JIT_PARTIAL_HARD
: compile code for hard partial matching.
ANCHORED
: Match only at the first position.NOTBOL
: Subject string is not the beginning of a line.NOTEOL
: Subject string is not the end of a line.NOTEMPTY
: An empty string is not a valid match.NOTEMPTY_ATSTART
: An empty string at the start of the subject is not a valid match.NO_UTF_CHECK
: Do not check the subject for UTF validity (only relevant ifUTF
option was set at compile time)PARTIAL_SOFT
: ReturnPCRE2_ERROR_PARTIAL
for a partial match if no full matches are found.PARTIAL_HARD
: ReturnPCRE2_ERROR_PARTIAL
for a partial match if that is found before a full match.
For details of partial matching, see the pcre2partial
page.
creates a new PCRE2 object.
Params
pattern:string
: string containing expression to be compiled.opt, ...:number
: Compile options.
Returns
re:pcre2
: PCRE2 object.err:string
: error message.
This function requests JIT compilation, which, if the just-in-time compiler is available, further processes a compiled pattern into machine code that executes much faster than the pcre2_match() interpretive matching function. Full details are given in the pcre2jit
documentation.
Params
opt, ...:number
: JIT Compile options.
Returns
ok:boolean
: true on success.err:string
: error message.
matches a compiled regular expression against a given subject string, using a matching algorithm that is similar to Perl's. It returns offsets to captured substrings.
Params
sbj:string
: the subject string.offset:number
: offset in the subject at which to start matching.opt, ...:number
: Match options.
Returns
head:table
: array of start offsets.tail:table
: array of end offsets.err:string
: error message.
almost same as match
method but it returns only offsets of matched string.
Params
sbj:string
: the subject string.offset:number
: offset in the subject at which to start matching.opt, ...:number
: Match options.
Returns
head:number
: start offsets.tail:number
: end offsets.err:string
: error message.
local pcre2 = require('pcre2')
local re = assert( pcre2.new('(\\d+)(\\w)') )
assert( re:jit_compile( pcre2.JIT_COMPLETE ) )
local sbj = 'abc081abc134klj567'
local head, tail, err = re:match( sbj )
while head do
print( 'match' )
for i = 1, #head do
print( i, sbj:sub( head[i], tail[i] ) )
end
head, tail, err = re:match( sbj, tail[1] )
end
if err then
error( err )
end
print( 'done' )
--[[
this script will be output the following strings;
match
1 081a
2 081
3 a
match
1 134k
2 134
3 k
match
1 567
2 56
3 7
done
]]