- Log messages are written to `stderr` rather than `stdout`.
- Remove `--verbose` flag.
  Logging is always enabled. This flag was previously deprecated in 0.8.0.
- commands/lint: Add `--record-definition-separator` option (#34).
  This allows a custom separator to be used to strip the description from a record name. When unset, the default separators remain '/' and ' '.
- commands/lint: Return a nonzero exit code if an error is logged.
  When the lint mode is set to `log`, the `lint` command will now exit with a nonzero status if there are any validation errors.
- commands/filter: Add filter by sequence pattern (#27).
  Records can be filtered by their sequence using a regular expression: `fq filter --sequence-pattern <regex> --dsts <dst> <src>`. It cannot be combined with name filtering.
- commands/filter: Support multiple segments (#30).
  The `filter` command now supports multiple segments. Each source is paired with a destination (i.e., the output is no longer written to stdout by default), which is filtered by whether the record in the first segment is matched.
- commands/subsample: Disallow 0% and 100% as probabilities.
  At these extremes, use `touch` and `cp`, respectively, instead.
- commands/subsample: Count the lines from the decompressed data if the input is gzipped.
  Used in the exact sampler, this previously counted "lines" from the compressed input.
- commands/subsample: Clamp the destination record count to the range of the source record count.
  Otherwise, an out-of-range count would cause the filter to never finish building.
- commands/subsample: Add exact sampler.
  This writes an exact number of samples to the output. Set the `-n/--record-count` option to use the exact sampler.
- Update argument parser to clap 3.
- Rename project to fq.
- commands/generate: Add `-s` short option for `--seed`.
- commands: Add `subsample` command.
  `subsample` outputs a proportional subset of records from single or paired FASTQ files.
- Deprecate `--verbose` flag.
  Logging is now always enabled.
- main: Show global version in subcommands (#20).
  This allows subcommands to show the global version, e.g., `fq lint --version`.
- `generate`: Added `--read-length` option to set the number of bases to generate in each record's sequence.
- The FASTQ reader handles files with CRLF (Windows) newlines and no final newline.
- [BREAKING] `generate`: Renamed `--n-records` to `--record-count`.
- `generate`: `--record-count` is parsed as a `u64` rather than an `i32`. The argument parser never allowed negative numbers, so this change still includes the entire previous input set.
- The `generate` command adds a `--seed <u64>` option to seed the random number generator. This is useful to regenerate the same outputs.
- The FASTQ generator now uses the Sanger/Illumina 1.8+ range of quality scores ([0, 41]). It samples scores on a normal distribution (μ = 20.5, σ = 2.61).
- Updated dependency `bloom` --> `bbloom` to reflect a name change in the library.
- New `filter` command. This accepts an allowlist of record read names to keep in the output FASTQ.
- Add `Dockerfile` to build a self-contained image for `fq`. Build with `docker build --tag fqlib .`.
- Show git commit ID and date in display version, e.g., when using `--version`. This makes it easier to know the exact build of fqlib being used.
- [BREAKING] `generate`: Renamed `--num-blocks` to `--n-records`.
- For paired-end reads, `fq lint` exits with an unexpected EOF error if both streams do not finish together.
- Multistream gzip files can be used as inputs. Written files still use a single stream.
- `fq lint` can take one FASTQ file as input for only single read validation.
- A single binary `fq` with subcommands replaces `fqgen` and `fqlint`. Update usages to `fq generate` and `fq lint`, respectively.
- Metadata from CASAVA 1.8 read names is truncated. This is handled the same as interleaves.
- Fix line offset in error messages, which was previously off by 4.
- Initial release
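
The read-name handling mentioned in the entries above (CASAVA 1.8 metadata and interleave suffixes being truncated, with '/' and ' ' as the default separators) can be illustrated with a minimal sketch. This is not fq's actual implementation: `strip_description` is a hypothetical helper, the example read names are made up, and cutting at the *first* separator is an assumption made here for illustration.

```rust
/// Hypothetical helper (not part of fq): returns the part of a read name
/// that precedes the first separator character, or the whole name if no
/// separator is present. Cutting at the first separator is an assumption.
fn strip_description<'a>(name: &'a str, separators: &[char]) -> &'a str {
    match name.find(|c: char| separators.contains(&c)) {
        Some(i) => &name[..i],
        None => name,
    }
}

fn main() {
    // Default separators as described in the changelog: '/' and ' '.
    let defaults = ['/', ' '];

    // Interleave suffixes: both mates reduce to the same name.
    assert_eq!(strip_description("@fqlib:1:1/1", &defaults), "@fqlib:1:1");
    assert_eq!(strip_description("@fqlib:1:1/2", &defaults), "@fqlib:1:1");

    // CASAVA 1.8 style metadata after a space is dropped the same way.
    assert_eq!(
        strip_description("@fqlib:1:AB:1:1:0:0 1:N:0:1", &defaults),
        "@fqlib:1:AB:1:1:0:0"
    );

    // A custom separator, as `--record-definition-separator` would allow.
    assert_eq!(strip_description("@r1_1", &['_']), "@r1");

    println!("all names stripped as expected");
}
```

With names normalized this way, the two mates of a pair compare equal regardless of whether the mate number was encoded as an interleave suffix or as CASAVA metadata.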