Skip to content

Commit

Permalink
README points to documentation and removes explanations
Browse files Browse the repository at this point in the history
  • Loading branch information
pique0822 authored Oct 10, 2023
1 parent 8584e55 commit 6be81ce
Showing 1 changed file with 2 additions and 52 deletions.
54 changes: 2 additions & 52 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,55 +75,5 @@ For development you can also build the docker image locally using:
docker build . -t fstalign-dev
```

## Quickstart
```
Rev FST Align
Usage: ./fstalign [OPTIONS] [SUBCOMMAND]
Options:
-h,--help Print this help message and exit
--help-all Expand all help
--version Show fstalign version.
Subcommands:
wer Get the WER between a reference and an hypothesis.
align Produce an alignment between an NLP file and a CTM-like input.
```

### WER Subcommand

The wer subcommand is the most frequent usage of this tool. Required are two arguments traditional to WER calculation: a reference (`--ref <file_path>`) and a hypothesis (`--hyp <file_path>`) transcript. Currently the tool is configured to simply look at the file extension to determine the file format of the input transcripts and parse accordingly.

| File Extension | Reference Support | Hypothesis Supprt |
| ----------- | ----------- | ----------- |
| `.ctm` | :white_check_mark: | :white_check_mark: |
| `.nlp` | :white_check_mark: | :white_check_mark: |
| `.fst` | :white_check_mark: | :white_check_mark: |
| All other file extensions, assumed to be plain text | :white_check_mark: | :white_check_mark: |

Basic Example:
```
ref.txt
this is the best sentence
hyp.txt
this is a test sentence
./bin/fstalign wer --ref ref.txt --hyp hyp.txt
```

When run, fstalign will dump a log to STDOUT with summary WER information at the bottom. For the above example:
```
[+++] [20:37:10] [fstalign] done walking the graph
[+++] [20:37:10] [wer] best WER: 2/5 = 0.4000 (Total words in reference: 5)
[+++] [20:37:10] [wer] best WER: INS:0 DEL:0 SUB:2
[+++] [20:37:10] [wer] best WER: Precision:0.600000 Recall:0.600000
```

Note that in addition to general WER, the insertion/deletion/substitution breakdown is also printed. fstalign also has other useful outputs, including a JSON log for downstream machine parsing, and a side-by-side view of the alignment and errors generated. For more details, see the [Outputs](https://github.com/revdotcom/fstalign/blob/develop/docs/Advanced-Usage.md#outputs) section in the [Advanced Usage](https://github.com/revdotcom/fstalign/blob/develop/docs/Advanced-Usage.md) doc.

### Align Subcommand
Usage of the `align` subcommand is almost identical to the `wer` subcommand. The exception is that `align` can only be run if the provided reference is a NLP and the provided hypothesis is a CTM. This is because the core function of the subcommand is to align an NLP without timestamps to a CTM that has timestamps, producing an output of tokens from the reference with timings from the hypothesis.

## Advanced Usage
See [the advanced usage doc](https://github.com/revdotcom/fstalign/blob/develop/docs/Advanced-Usage.md) for more details.
## Documentation
For more information on how to use `fstalign` see our [documentation](https://github.com/revdotcom/fstalign/blob/develop/docs/Usage.md) for more details.

0 comments on commit 6be81ce

Please sign in to comment.