fstalign
is a tool for creating alignment between two sequences of tokens (here out referred to as “reference” and “hypothesis”). It has two key functions: computing word error rate (WER) and aligning NLP-formatted references with CTM hypotheses.
Due to its use of OpenFST and lazy algorithms for text-based alignment, fstalign
is efficient for calculating WER while also providing significant flexibility for different measurement features and error analysis.
We use git submodules to manage third-party dependencies. Initialize and update submodules before proceeding to the main build steps.
git submodule update --init --recursive
This will pull the current dependencies:
- catch2 - for unit testing
- spdlog - for logging
- CLI11 - for CLI construction
- csv - for CTM and NLP input parsing
- jsoncpp - for JSON output construction
- strtk - for various string utilities
Additionally, we have dependencies outside of the third-party submodules:
- OpenFST - currently provided to the build system by settings the $OPENFST_ROOT environment variable or during the CMake command via
-DOPENFST_ROOT
.
The current build framework is CMake. Install CMake following the instructions here (https://cmake.org/install/).
To build fstalign, run:
mkdir build && cd build
cmake .. -DOPENFST_ROOT="<path to OpenFST>" -DDYNAMIC_OPENFST=ON
make
Note: -DDYNAMIC_OPENFST=ON
is needed if OpenFST at OPENFST_ROOT
is compiled as shared libraries. Otherwise static libraries are assumed.
Finally, tests can be run using:
make test
The fstalign docker image is hosted on Docker Hub and can be easily pulled and run:
docker pull revdotcom/fstalign
docker run --rm -it revdotcom/fstalign
See https://hub.docker.com/r/revdotcom/fstalign/tags for the available versions/tags to pull. If you desire to run the tool on local files you can mount local directories with the -v
flag of the docker run
command.
From inside the container:
/fstalign/build/fstalign --help
For development you can also build the docker image locally using:
docker build . -t fstalign-dev
For more information on how to use fstalign
see our documentation for more details.