Releases: rhpvorderman/sequali

version 0.11.1

12 Aug 08:40
124690c
  • Fix a memory leak that occurred in Python 3.12 due to a refcounting API
    change.

version 0.11.0

09 Aug 11:53
b4d0f6f
  • Make figure IDs reproducible across HTML reports.
  • Fix a bug where the average phred score per read would be rounded, not
    floored. This caused reads with an average phred score such as 9.7 to be
    counted towards the Q>=10 results.
  • Replace some of the hand-vectorized code with more generic code that can be
    automatically optimized by the compiler. This should make things faster on
    Windows and ARM64 platforms. This also means results should be consistent
    across platforms and no longer depend on the presence of vector instructions.
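
The rounding fix above can be illustrated with a minimal Python sketch. The
function names are hypothetical and this is not Sequali's actual code; it only
shows why rounding places a read with an average phred score of 9.7 in the
Q>=10 category while flooring does not.

```python
import math


def quality_bin_rounded(average_phred: float) -> int:
    """Bin by rounding to the nearest integer (the behaviour described as a bug)."""
    return round(average_phred)


def quality_bin_floored(average_phred: float) -> int:
    """Bin by flooring, so a read only counts for Q>=10 once it reaches 10.0."""
    return math.floor(average_phred)


average = 9.7
print(quality_bin_rounded(average) >= 10)  # True: wrongly counted as Q>=10
print(quality_bin_floored(average) >= 10)  # False: correctly left out
```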

version 0.10.0

03 Jun 06:48
802a0f9
  • Make overrepresented sequences table scrollable and smaller so it is easier
    to skip over when lots of entries are found.
  • Overrepresented sequence analysis now only counts unique fragments per read.
    Fragments that are duplicated inside the read are counted only once. This
    prevents long stretches of genomic repeats from getting inflated reported
    frequencies.
  • Speed up sequence identity calculations on AVX2-enabled CPUs by using a
    reverse-diagonal parallelized version of Smith-Waterman.
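
As a rough illustration of the per-read unique fragment counting described
above, the sketch below compares both counting modes. The fragmenting scheme
(consecutive, non-overlapping fragments of a fixed length) and all names are
assumptions made for the example, not Sequali's implementation.

```python
from collections import Counter


def fragments(sequence: str, fragment_length: int) -> list[str]:
    """Split a read into consecutive, non-overlapping fragments (assumed scheme)."""
    last_start = len(sequence) - fragment_length
    return [sequence[i:i + fragment_length]
            for i in range(0, last_start + 1, fragment_length)]


def count_fragments(reads, fragment_length=4, unique_per_read=True) -> Counter:
    """Count fragments over all reads, optionally at most once per read."""
    counts = Counter()
    for read in reads:
        read_fragments = fragments(read, fragment_length)
        # A set collapses fragments that are repeated inside the same read.
        counts.update(set(read_fragments) if unique_per_read else read_fragments)
    return counts


reads = ["ACACACACACAC", "ACACGGTT"]
print(count_fragments(reads, unique_per_read=False)["ACAC"])  # 4
print(count_fragments(reads, unique_per_read=True)["ACAC"])   # 2
```

With per-read uniqueness, the repeat-rich first read contributes a single count
for ACAC instead of three.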

version 0.9.1

22 May 10:27
df22ab5
  • Fix an issue where the insert size metrics module would crash when no
    adapters were present.

version 0.9.0

21 May 11:27
c953a71
  • MultiQC support since MultiQC version 1.22
  • Sort modules for paired end reports in the same order as single end reports.
    For example, the sequence length distributions for read 1 and read 2 are now
    right after each other.
  • Add common human genome repeats and Illumina poly-G dark cycles to the
    overrepresented sequences database.
  • Illumina adapter trimming sequences were added to the contaminants database
    as these were missing from the UniVec database.
  • Sequence identity, rather than k-mers matched, is shown as a metric for
    similarity in the overrepresented sequences table.
  • Overrepresented sequence classification now uses stable sorting to ensure
    the classification results are the same on each rerun.
  • Overrepresented sequences are now classified using Smith-Waterman alignment
    and sequence identity.
  • Fix an off-by-one error in the insert size metrics that was triggered for
    insert sizes larger than 300 bp.
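
To make the Smith-Waterman based classification above more concrete, here is a
plain Python sketch of a local alignment that reports a sequence identity. The
scoring values and the identity definition (matching bases in the best local
alignment divided by the query length) are assumptions chosen for the example;
the actual implementation may use different parameters and is vectorized.

```python
def smith_waterman_identity(query: str, target: str,
                            match: int = 1, mismatch: int = -1,
                            gap: int = -1) -> float:
    """Return matches in the best local alignment divided by len(query)."""
    if not query:
        return 0.0
    columns = len(target) + 1
    prev_score = [0] * columns
    prev_matches = [0] * columns
    best_score = 0
    best_matches = 0
    for i in range(1, len(query) + 1):
        cur_score = [0] * columns
        cur_matches = [0] * columns
        for j in range(1, columns):
            is_match = query[i - 1] == target[j - 1]
            diagonal = prev_score[j - 1] + (match if is_match else mismatch)
            up = prev_score[j] + gap
            left = cur_score[j - 1] + gap
            score = max(0, diagonal, up, left)
            # Carry the match count along the path that produced the score.
            if score == 0:
                matches = 0
            elif score == diagonal:
                matches = prev_matches[j - 1] + (1 if is_match else 0)
            elif score == up:
                matches = prev_matches[j]
            else:
                matches = cur_matches[j - 1]
            cur_score[j] = score
            cur_matches[j] = matches
            if score > best_score:
                best_score, best_matches = score, matches
        prev_score, prev_matches = cur_score, cur_matches
    return best_matches / len(query)


print(smith_waterman_identity("ACGT", "GGACGTGG"))  # 1.0
print(smith_waterman_identity("ACGT", "TTTT"))      # 0.25
```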

version 0.8.0

08 May 14:44
a39ceee
  • A citation file was added to the repository.
  • Calculate insert sizes and used adapters based on overlap between the
    read pairs.
  • Both reads from paired-end reads are taken into consideration when
    evaluating the duplication rate.
  • Support for paired-end reads added.
  • Minor performance improvement by providing a non-temporal cache hint in the
    QCMetrics module.
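
The overlap-based insert size calculation mentioned above can be sketched as
follows. This is a simplified illustration that assumes exact matches only and
takes the first overlap found; the function names, the minimum overlap and the
adapter strings in the example are hypothetical, and the real module may
tolerate mismatches and work differently.

```python
def reverse_complement(sequence: str) -> str:
    """Reverse complement of an A/C/G/T sequence."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def insert_size_from_overlap(read1: str, read2: str, min_overlap: int = 8):
    """Return (insert_size, adapter1, adapter2), or None if no overlap is found."""
    max_size = len(read1) + len(read2) - min_overlap
    for insert_size in range(min_overlap, max_size + 1):
        # Region of the insert that is covered by both reads.
        start = max(0, insert_size - len(read2))
        end = min(insert_size, len(read1))
        overlap = end - start
        if overlap < min_overlap:
            continue
        read2_in_insert = min(insert_size, len(read2))
        read2_forward = reverse_complement(read2[:read2_in_insert])
        if read1[start:end] == read2_forward[:overlap]:
            # Anything sequenced past the insert is adapter sequence.
            return insert_size, read1[end:], read2[read2_in_insert:]
    return None


insert = "ACGTTGCAAGGC"
# "AGAT" and "CTGT" are made-up adapter starts, used only for this example.
read1 = insert + "AGAT"
read2 = reverse_complement(insert) + "CTGT"
print(insert_size_from_overlap(read1, read2))  # (12, 'AGAT', 'CTGT')
```

When the insert is shorter than the read length, both reads run into the
adapter, which is why the overlap yields the used adapters as well as the
insert size.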

version 0.7.1

17 Apr 07:42
a284b91
  • Fix a small visual bug in the report sidebar.
  • PyGAL report HTML files are now fully HTML5 compliant. HTML5 validation has
    been made a part of the integration testing.

version 0.7.0

10 Apr 11:48
af3f835
  • Image files can now be saved as SVG files.
  • The JavaScript file for the tooltip highlighting is now embedded in the
    HTML file so no internet access is needed for the functionality.
  • A sidebar with a table of contents is added to the report for easier
    navigation.
  • Graph fonts are made a little bigger. Graphs now respond to zooming in and
    out on the web page.
  • Enable building on ARM platforms such as M1 Macintosh and AArch64.
  • Speed up the overrepresented sequences module by adding an AVX2 k-mer
    construction algorithm.

version 0.6.0

29 Mar 13:52
72a7e65
  • Add links to the documentation in the report.
  • Moved documentation to readthedocs and added extensive module documentation.
  • Rename the -deduplication-estimate-bits option to the more understandable
    --duplication-max-stored-fingerprints.
  • Add a small table that lists how many reads are >=Q5, >=Q7 etc. in the
    per sequence average quality report.
  • The progress bar can track progress through more file formats.
  • The deduplication fingerprint that is used is now configurable from the
    command line.
  • The deduplication module starts by gathering all sequences rather than half
    of the sequences. This allows all sequences to be considered using a big
    enough hash table.
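
The idea of starting with all sequences and a bounded fingerprint store can be
sketched roughly as below. Everything here is an assumption made for
illustration: the class name, the use of CRC32 as the hash, and the strategy of
halving the sampled hash space whenever the store is full. It is not a
description of Sequali's actual data structure.

```python
import zlib


class FingerprintStore:
    """Count fingerprints while storing at most max_stored distinct hashes."""

    def __init__(self, max_stored: int):
        self.max_stored = max_stored
        self.sampling_bits = 0  # 0 bits: every fingerprint is considered
        self.counts: dict[int, int] = {}

    def _accepted(self, fingerprint_hash: int) -> bool:
        # Only hashes whose lowest sampling_bits bits are zero are sampled.
        return fingerprint_hash & ((1 << self.sampling_bits) - 1) == 0

    def add(self, fingerprint: str) -> None:
        fingerprint_hash = zlib.crc32(fingerprint.encode())
        if not self._accepted(fingerprint_hash):
            return
        if fingerprint_hash in self.counts:
            self.counts[fingerprint_hash] += 1
            return
        while len(self.counts) >= self.max_stored:
            # Store is full: halve the sampled hash space and evict the rest.
            self.sampling_bits += 1
            self.counts = {h: c for h, c in self.counts.items()
                           if self._accepted(h)}
            if not self._accepted(fingerprint_hash):
                return
        self.counts[fingerprint_hash] = 1


store = FingerprintStore(max_stored=1000)
for sequence in ["AAAA", "CCCC", "AAAA"]:
    store.add(sequence)
print(store.sampling_bits)  # 0: the store was big enough to keep everything
```

With a store that is large enough, no subsampling happens and every sequence
is taken into account, which is the situation the changelog entry describes.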

version 0.5.1

22 Mar 13:58
8014193
  • Fix a bug in the overrepresented sequence sampling where the fragments from
    the back half of the sequence were incorrectly sampled, leading to the last
    fragment being sampled over and over again.