-
Thanks for looking into alternatives for a solution.
We don't want to bet on an engine that's going to be abandoned soon or won't have bug fixes. Wrapping the new engine and only using it where necessary helps mitigate this to an extent.
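To make the wrapping idea concrete, here is a minimal sketch, assuming a hypothetical helper named RegexFallback that is not part of this library. It tries java.util.regex first and hands the pattern to a caller-supplied alternative engine only when the default engine rejects it. The fallback is passed in as a plain function, so no particular third-party engine or API is assumed here.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical wrapper (illustration only): prefer java.util.regex, fall back only when needed. */
final class RegexFallback {

    private RegexFallback() {}

    /**
     * Returns a predicate that reports whether the pattern is found anywhere in the input.
     * The default engine is used when it can compile the pattern; otherwise the
     * caller-supplied factory for an alternative engine builds the predicate.
     */
    static Predicate<String> compile(String regex,
                                     Function<String, Predicate<String>> fallbackEngine) {
        try {
            Pattern pattern = Pattern.compile(regex);
            return input -> pattern.matcher(input).find();
        } catch (PatternSyntaxException e) {
            // The default engine rejected the pattern, so delegate to the alternative engine.
            return fallbackEngine.apply(regex);
        }
    }
}
```

One caveat with this design: a pattern that is simply malformed also ends up at the fallback, so the alternative engine has to report that error itself.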
-
@mrussek, com.florianingerl.util.regex has added the missing license to their repo, and it's MIT. Would you want to try wrapping it for the necessary Japanese cases? I'd recommend running the perf test mentioned above before that to make sure there's no major issue.
-
Is your feature request related to a problem? Please describe.
This is not a user-facing problem, but a problem facing developers of the Java version of this library.
The default Java regular expression engine does not support variable-length negative lookbehind assertions. These assertions are commonly used in the specifications for many languages. The methods in RegExpUtility offer some mitigation, but they still do not handle the fact that the Java regex engine also does not support alternation in negative lookbehind assertions, as is required for Japanese number parsing, for example. This makes porting Japanese number parsing to Java effectively impossible.
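A quick way to check what the default engine accepts on a given JDK is to try compiling the pattern and catch PatternSyntaxException. The sketch below assumes a hypothetical helper named LookbehindProbe, and the sample patterns are illustrative rather than taken from the library's specifications. Acceptance of unbounded lookbehinds varies between JDK versions, so no expected output is listed.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical helper (illustration only): checks whether java.util.regex accepts a pattern. */
final class LookbehindProbe {

    private LookbehindProbe() {}

    /** Returns true if java.util.regex compiles the pattern, false if it rejects it. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "(?<!foo)bar",       // fixed-length negative lookbehind
            "(?<!\\d+\\s*)yen",  // negative lookbehind with unbounded quantifiers
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + (compiles(sample) ? "accepted" : "rejected"));
        }
    }
}
```

Note that a pattern that compiles is not guaranteed to match the way the original specification intends, so the existing tests remain the real check.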
Describe the solution you'd like
It would be possible to fix this issue by using a more fully featured regular expression engine, of which there are many available in Java. The branch alternative_regex in my clone of the repository illustrates that we can use com.florianingerl.util.regex, for example. That branch simply substitutes all instances of java.util.regex classes with the equivalent com.florianingerl.util.regex classes, and all tests are passing.
Describe alternatives you've considered
There are other alternative regex engines available for Java. The re2j library doesn't support variable-length negative lookbehinds.
We could also continue to proceed without an alternative regex implementation, but it is unclear how to implement languages that require this functionality in Java.