This project provides XPath bindings to the ICU library for common Unicode processing tasks. It is based on the ICU library for Java (ICU4J) and can be used in the Saxon XSLT/XQuery processor.
The bindings cover only a small subset of the ICU library. Other parts may be added in the future if they are needed. XPath functions are provided for the following tasks:
- normalization
- transliteration
The namespace name of the XPath extension functions is https://unicode-org.github.io/icu/. In this documentation, we use the prefix icu bound to this namespace: xmlns:icu="https://unicode-org.github.io/icu/".
To get started, have a look at the example sections in the transliteration and normalization documentation.
Installation for the oXygen XML editor is very simple. You only have to provide the following URL to the installation dialog from Help -> Install new add-ons...:
https://scdh.github.io/icu-xpath-bindings/descriptor.xml
Note: As we don't have a key for signing the extension, you will have to choose to proceed anyway at some stage of the installation process.
After the installation, you can use the new XPath functions everywhere in oXygen. You don't need to clone this repo.
tl;dr: Run mvn package and use the xslt.sh or saxon.sh shell wrappers with the option -config:saxon-config.xml.
Two things are necessary:
- Tell Saxon about the XPath functions. This can be done via a Saxon configuration file. Such a configuration is in saxon-config.xml. You can use it from the Saxon command line interface via the argument -config:saxon-config.xml.
- Provide a jar file on the classpath, so that the Java classes that define the functions are available to Saxon. On the releases page, you can find jar files for each release. Use icu-xpath-bindings-VERSION-with-dependencies.jar or icu-xpath-bindings-VERSION.jar. The former has everything but Saxon packed into it. If you use the latter, dependency packages like ICU4J also have to be added to the classpath:
  - icu4j
  - icu4j-charset
  - icu4j-localespi
  - slf4j-api
You can get the dependency jar files manually through Maven Central or you can clone this git repository and run the Maven build process, which downloads and builds everything for you automatically:
mvn package
After you have run mvn package, all the required jar files are present within the project:
bindings/target/icu-xpath-bindings-VERSION.jar
bindings/target/lib/icu4j-VERSION.jar
bindings/target/lib/icu4j-charset-VERSION.jar
bindings/target/lib/icu4j-localespi-VERSION.jar
bindings/target/lib/slf4j-api-VERSION.jar
For convenience, after running mvn package there will also be the shell scripts xslt.sh and saxon.sh in the repo's root folder. They are shell wrappers around Saxon that set the classpath correctly.
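To give a rough idea of what setting the classpath by hand involves, here is a minimal sketch. It is not the project's actual wrapper script. The function name icu_classpath, the SAXON_JAR variable and the input file names are hypothetical; only the jar locations are taken from the list above.

```shell
#!/bin/sh
# Hypothetical helper: print a colon-separated classpath made of the jars
# that `mvn package` places under bindings/target.
# $1 is the repository root (defaults to the current directory).
icu_classpath() {
    root="${1:-.}"
    classpath=""
    for jar in "$root"/bindings/target/icu-xpath-bindings-*.jar \
               "$root"/bindings/target/lib/*.jar; do
        # Skip glob patterns that matched nothing.
        [ -e "$jar" ] || continue
        # Skip the bundled jar: its contents duplicate the jars in lib/.
        case "$jar" in
            *-with-dependencies.jar) continue ;;
        esac
        # Append, inserting ':' only between entries.
        classpath="${classpath:+$classpath:}$jar"
    done
    printf '%s\n' "$classpath"
}

# Possible usage (assumption: SAXON_JAR points to a Saxon jar you obtained
# separately; input.xml and stylesheet.xsl are placeholders):
#
#   java -cp "$SAXON_JAR:$(icu_classpath .)" net.sf.saxon.Transform \
#       -config:saxon-config.xml -s:input.xml -xsl:stylesheet.xsl
```

In most cases the generated xslt.sh and saxon.sh wrappers are the simpler choice. A hand-built classpath like this is mainly useful when Saxon is started from another script or build tool.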
When using the bindings from Java, you should also have a look at IcuXPathFunctionRegistry.register(Processor). Moreover, the classes with the function definitions are registered for loading through Java's Service Provider Interface (SPI).
You can build and test the project locally. You can also install the oXygen plugin from a local build. To do so, run
mvn -Drelease.url="" package
Then you can provide the descriptor file oxygen/target/descriptor.xml to the oXygen extension installation dialog.
MIT License
Copyright (c) 2023 SCDH, Westfälische Wilhelms-Universität Münster