dwarfs-0.4.1

@mhx mhx released this 13 Mar 14:47
· 1851 commits to main since this release

Performance improvements

  • Binaries built with GCC have traditionally been much slower than those built with Clang, and it was unclear why. The reason turns out to be simple: CMake defaults to -O3 optimization, which is known to cause performance regressions in some cases. Optimized GCC builds now always use -O2. The Clang build is unaffected. (fixes github #14)

  • The segmenting code now uses a Bloom filter to discard unsuccessful matches as early and as quickly as possible. This gives only a minor speedup with a single lookback block, but speed is now barely affected as the number of lookback blocks increases, whereas before it would drop significantly. The Bloom filter size (relative to the number of values) can be tuned with --bloom-filter-size, though increasing it beyond the default is unlikely to make a difference.

  • The nilsimsa similarity computation now uses different instruction sets depending on the CPU architecture, which speeds up ordering files by similarity by almost a factor of 2.
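
To illustrate why a Bloom filter helps with many lookback blocks, here is a minimal Python sketch of the general technique. It is not the DwarFS implementation: the `BloomFilter` and `BlockIndex` classes, the `bits_per_value` parameter, and the use of BLAKE2b are all assumptions made for this example. The idea is that a negative answer from the filter is definitive, so most non-matching hash values never reach the more expensive table lookup.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Derive several bit positions by salting a stable hash
        # (values are assumed to fit in 64 bits).
        for salt in range(self.num_hashes):
            digest = hashlib.blake2b(
                value.to_bytes(8, "little"), digest_size=8, salt=bytes([salt])
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


class BlockIndex:
    """Hypothetical index of hash values seen in lookback blocks."""

    def __init__(self, expected_values, bits_per_value=16):
        # bits_per_value is an illustrative stand-in for a size that is
        # relative to the number of values; it is not the real option.
        self.filter = BloomFilter(expected_values * bits_per_value)
        self.table = {}
        self.table_lookups = 0  # expensive lookups actually performed

    def add(self, hash_value, offset):
        self.filter.add(hash_value)
        self.table.setdefault(hash_value, []).append(offset)

    def find(self, hash_value):
        # Cheap rejection first: "not present" from the filter is final.
        if not self.filter.might_contain(hash_value):
            return []
        self.table_lookups += 1
        return self.table.get(hash_value, [])
```

In this sketch, `table_lookups` only grows for true matches and the occasional false positive, which is why adding more indexed values has little effect on the cost of rejecting a non-match.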

Bugfixes

  • Linking against libarchive was fixed so that it also works for shared library builds. (fixes github #36)

  • mkdwarfs didn't catch certain exceptions correctly, which would cause a stack trace instead of a simple error message. This has been fixed.

  • The statically linked executables were unable to handle any exceptions at all due to duplicate stack unwinding code. This has (hopefully) been fixed now.
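
The general pattern behind the mkdwarfs fix above, turning an escaped exception into a short error message and a non-zero exit code, can be sketched as follows. This is only an illustration in Python, not the actual mkdwarfs code: `ToolError` and `run_cli` are hypothetical names introduced for this example.

```python
import sys


class ToolError(Exception):
    """Hypothetical error type for failures the user can act on."""


def run_cli(task, stderr=sys.stderr):
    """Run task() and map failures to an exit code and a one-line message."""
    try:
        task()
    except ToolError as exc:
        # Expected failure: report it plainly.
        print(f"error: {exc}", file=stderr)
        return 1
    except Exception as exc:
        # Catch-all so nothing reaches the user as a raw stack trace.
        print(f"error: unexpected {type(exc).__name__}: {exc}", file=stderr)
        return 2
    return 0
```

A real entry point would typically end with something like `sys.exit(run_cli(main))`, so that every failure path produces a single line on stderr.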