Commit
* json report to stdout, can be disabled with `--no-json`
* rename
* small optimisations
Showing 7 changed files with 168 additions and 223 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
/target | ||
/test-data | ||
/benchmarks/raw | ||
/benchmarks_raw | ||
Cargo.lock |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,67 +1,114 @@ | ||
# Benchmarks: | ||
|
||
Benchmarks were run in triplicate with 1 warmup run - using [hyperfine](https://github.com/sharkdp/hyperfine). | ||
Benchmarks were run 5x each with 1 warmup run - using [hyperfine](https://github.com/sharkdp/hyperfine). | ||
|
||
They were run on my own PC running Ubuntu in WSL2 (Plans to test in native linux): | ||
They were run on my own PC running Ubuntu 23.04: | ||
|
||
- i7 12700k - 12 cores | ||
- 32GB memory with 16GB Assigned to WSL2 | ||
- 32GB memory | ||
- M.2 SSD drive | ||
|
||
Hyperfine command: | ||
```bash | ||
hyperfine --warmup 1 -r 3 --export-json --export-markdown | ||
hyperfine --warmup 1 -r 5 --export-json --export-markdown | ||
``` | ||
## Unpaired | ||
|
||
## Benchmark 1: | ||
ONT long read sequencing, outputting a plain (non-compressed) fastq file | ||
Single tests were based on the [Zymo mock community](https://github.com/LomanLab/mockcommunity) Zymo-GridION-EVEN-BB-SN dataset. | ||
|
||
### Benchmark 1: | ||
ONT long read sequencing, input `fastq` format, outputting a non-compressed `fastq` file. Extracting all reads classified as *Bacillus spizizenii*. | ||
|
||
*Inputs:* | ||
|
||
| File type | Platform | Total reads | Reads to extract | Kraken output size| Output| | ||
|-----------|----------|-------------|------------------|-------------------|-------| | ||
|`.fastq.gz`|ONT |5,454,495 |3,489,386 |1.8GB |`.fastq` | ||
|`.fastq` |ONT |3,491,078 |490,984 |885MB |`.fastq` | ||
|
||
*Commands run:* | ||
|
||
| Tool | Command | | ||
|------|---------| | ||
| krakenXtract | `./target/release/krakenxtract --kraken test-data/output.kraken --fastq test-data/ont_test.fastq.gz --taxid 666 --output krakenxtract_benchmark.fq --no-compress` | | ||
| KrakenTools | `extract_kraken_reads.py -k test-data/output.kraken -s test-data/ont_test.fastq.gz -o krakentools_benchmark.fq --fastq-output -t 666` | | ||
| `KrakenTools` | `extract_kraken_reads.py -s Zymo-GridION-EVEN-BB-SN.fq -k out.kraken -o krakentools.fq -t 96241 --fastq-output` | | ||
| `Kractor` | `kractor -i Zymo-GridION-EVEN-BB-SN.fq -k out.kraken -o kractor.fq -t 96241` | | ||
|
||
|
||
*Results:* | ||
| Tool | Mean [s] | Min [s] | Max [s] | Relative | | ||
|:---|---:|---:|---:|---:| | ||
| `krakenXtract` | 80.585 ± 9.795 | 72.964 | 91.633 | 1.00 | | ||
| `KrakenTools` | 354.722 ± 8.613 | 349.398 | 364.659 | 4.40 ± 0.55 | | ||
| `krakenTools` | 254.881 ± 9.482 | 242.553 | 263.158 | 1.00 | | ||
| `Kractor` | 58.253 ± 5.651 | 48.358 | 62.222 | 4.38 ± 0.45 | | ||
|
||
### Benchmark 2: | ||
ONT long read sequencing, input `fastq.gz` format, outputting a non-compressed `fastq` file. Extracting all reads classified as *Bacillus spizizenii*. | ||
|
||
|
||
*Inputs:* | ||
|
||
| File type | Platform | Total reads | Reads to extract | Kraken output size| Output| | ||
|-----------|----------|-------------|------------------|-------------------|-------| | ||
|`.fastq.gz`|ONT |3,491,078 |490,984 |885MB |`.fastq` | ||
|
||
*Commands run:* | ||
|
||
Welch t-test: t = -36.4, p = 4e-06 | ||
| Tool | Command | | ||
|------|---------| | ||
| KrakenTools | `extract_kraken_reads.py -s Zymo-GridION-EVEN-BB-SN.fq.gz -k out.kraken -o krakentools.fq -t 96241 --fastq-output` | | ||
| Kractor | `kractor -i Zymo-GridION-EVEN-BB-SN.fq.gz -k out.kraken -o kractor.fq -t 96241` | | ||
|
||
There is a difference between the two benchmarks (p < 0.05). | ||
|
||
![Alt text](img/ont-benchmark-whisker.png) | ||
*Results:* | ||
| Command | Mean [s] | Min [s] | Max [s] | Relative | | ||
|:---|---:|---:|---:|---:| | ||
| `krakenTools` | 376.592 ± 3.905 | 373.343 | 383.315 | 1.00 | | ||
| `Kractor` | 100.044 ± 3.599 | 98.044 | 106.449 | 3.76 ± 0.14 | | ||
|
||
## Paired | ||
|
||
Paired end tests were based on SRA accession: SRR19995508 | ||
|
||
## Benchmark 2: | ||
ONT long read sequencing, outputting a compressed fastq file. | ||
### Benchmark 3: | ||
Illumina paired end sequencing, input `fastq` format, outputting a non-compressed `fastq` file. | ||
|
||
*Inputs:* | ||
|
||
| File type | Platform | Total reads | Reads to extract | Kraken output size| Output| | ||
|-----------|----------|-------------|------------------|-------------------|-------| | ||
|`.fastq`|Illumina paired |53,526,611 |1,646,117 |9.3GB |`.fastq` | ||
|
||
*Commands run:* | ||
|
||
| Tool | Command | | ||
|------|---------| | ||
| KrakenTools | `extract_kraken_reads.py -s SRR19995508_R1.fastq -s2 SRR19995508_R2.fastq -o R1_tools.fq -o2 R2_tools.fq -k out.kraken -t 590 --fastq-output` | | ||
| Kractor | `kractor -i SRR19995508_R1.fastq -i SRR19995508_R2.fastq -k out.kraken -t 590 -o R1_kractor.fq -o R2_kractor.fq` | | ||
|
||
|
||
*Results:* | ||
| Command | Mean [s] | Min [s] | Max [s] | Relative | | ||
|:---|---:|---:|---:|---:| | ||
| `krakenTools` | 898.306 ± 14.203 | 884.229 | 920.653 | 1.00 | | ||
| `Kractor` | 94.198 ± 2.317 | 90.852 | 96.474 | 9.54 ± 0.28 | | ||
|
||
NOTE: Kraken Tools has no option to output a compressed fastq. | ||
### Benchmark 4: | ||
Illumina paired end sequencing, input `fastq.gz` format, outputting a non-compressed `fastq` file. | ||
|
||
*Inputs:* | ||
|
||
| File type | Platform | Total reads | Reads to extract | Kraken output size| Output| | ||
|-----------|----------|-------------|------------------|-------------------|-------| | ||
|`.fastq.gz`|ONT |5,454,495 |3,489,386 |1.8GB |`.fastq.gz` | ||
|`.fastq.gz`|Illumina paired |53,526,611 |1,646,117 |9.3GB |`.fastq` | ||
|
||
*Commands run:* | ||
|
||
| Tool | Command | | ||
|------|---------| | ||
| krakenXtract | `./target/release/krakenxtract --kraken test-data/output.kraken --fastq test-data/ont_test.fastq.gz --taxid 666 --output krakenxtract_benchmark.fq.gz` | | ||
| KrakenTools | `extract_kraken_reads.py -s SRR19995508_R1.fastq.gz -s2 SRR19995508_R2.fastq.gz -o R1_tools.fq -o2 R2_tools.fq -k out.kraken -t 2 --fastq-output` | | ||
| Kractor | `kractor -i SRR19995508_R1.fastq.gz -i SRR19995508_R2.fastq.gz -k out.kraken -t 2 -o R1_kractor.fq -o R2_kractor.fq` | | ||
|
||
|
||
*Results:* | ||
| Command | Mean [s] | Min [s] | Max [s] | Relative | | ||
|:---|---:|---:|---:|---:| | ||
| krakenXtract | 78.042 ± 1.622 | 76.361 | 79.597 | 1.00 | | ||
| `krakenTools` | 1033.379 ± 25.238 | 1005.720 | 1068.522 | 1.00 | | ||
| `Kractor` | 49.071 ± 0.179 | 48.857 | 49.334 | 21.06 ± 0.52 | |
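
Reviewer note: in the new results tables, the `Relative` column appears to be the KrakenTools mean divided by the Kractor mean, so it reads as a speed-up factor over KrakenTools. The sketch below is a minimal check of that reading, assuming the uncertainty is propagated from the two standard deviations in the usual first-order way for a ratio of independent quantities. The function name `relative_speed` is hypothetical and is not part of hyperfine or Kractor; the input numbers are copied from the Benchmark 3 and Benchmark 4 rows above.

```python
import math


def relative_speed(ref_mean, ref_sd, mean, sd):
    """Return (ratio, uncertainty) for ref_mean / mean.

    Assumes the two timings are independent, so the relative errors
    add in quadrature (first-order error propagation for a ratio).
    """
    ratio = ref_mean / mean
    err = ratio * math.sqrt((ref_sd / ref_mean) ** 2 + (sd / mean) ** 2)
    return ratio, err


# Benchmark 4 rows: KrakenTools 1033.379 ± 25.238 s, Kractor 49.071 ± 0.179 s
ratio4, err4 = relative_speed(1033.379, 25.238, 49.071, 0.179)
print(f"{ratio4:.2f} ± {err4:.2f}")  # 21.06 ± 0.52

# Benchmark 3 rows: KrakenTools 898.306 ± 14.203 s, Kractor 94.198 ± 2.317 s
ratio3, err3 = relative_speed(898.306, 14.203, 94.198, 2.317)
print(f"{ratio3:.2f} ± {err3:.2f}")  # 9.54 ± 0.28
```

Both values agree with the `Relative` entries in the tables, which supports reading that column as "times faster than KrakenTools" rather than as a slowdown relative to the fastest command.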