Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block splitter #4136

Merged
merged 44 commits into from
Oct 24, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
44 commits
Select commit Hold shift + click to select a range
a5bce4a
XP: add a pre-splitter
Cyan4973 Sep 3, 2024
9e52789
fixed strict C90 semantic
Cyan4973 Sep 3, 2024
586ca96
do not use `new` as variable name
Cyan4973 Sep 3, 2024
e2d7d08
use ZSTD_memset()
Cyan4973 Sep 3, 2024
6021b66
minor C++-ism
Cyan4973 Sep 3, 2024
fa147cb
more ZSTD_memset() to apply
Cyan4973 Sep 3, 2024
83a3402
fix overlap write scenario in presence of incompressible data
Cyan4973 Oct 17, 2024
f83ed08
fixed RLE detection test
Cyan4973 Oct 17, 2024
8b3887f
fixed kernel build
Cyan4973 Oct 17, 2024
dd38c67
fixed single-library build
Cyan4973 Oct 17, 2024
0d4b520
only split full blocks
Cyan4973 Oct 17, 2024
20c3d17
fix assert
Cyan4973 Oct 17, 2024
6dc5212
fixed c90 comment style
Cyan4973 Oct 17, 2024
80a912d
fixed zstreamtest
Cyan4973 Oct 17, 2024
6939235
fixed meson build
Cyan4973 Oct 17, 2024
cdddcaa
new Makefile target mesonbuild
Cyan4973 Oct 17, 2024
76ad1d6
fixed VS2010 solution
Cyan4973 Oct 17, 2024
31d48e9
fixing minor formatting issue in 32-bit mode with logs enabled
Cyan4973 Oct 17, 2024
7f015c2
replaced uasan32 test by asan32 test
Cyan4973 Oct 18, 2024
73a6653
ZSTD_splitBlock_4k() uses externally provided workspace
Cyan4973 Oct 18, 2024
433f459
fixed minor conversion warnings on Visual
Cyan4973 Oct 18, 2024
4685eaf
fix alignment test
Cyan4973 Oct 18, 2024
cae8d13
splitter workspace is now provided by ZSTD_CCtx*
Cyan4973 Oct 18, 2024
4ce91cb
fixed workspace alignment on non 64-bit systems
Cyan4973 Oct 18, 2024
dac26ea
updated regression test results
Cyan4973 Oct 21, 2024
1c62e71
minor split optimization
Cyan4973 Oct 21, 2024
a167571
added a faster block splitter variant
Cyan4973 Oct 21, 2024
7bad787
made ZSTD_isPower2() an inline function
Cyan4973 Oct 22, 2024
5ae34e4
ensure `lastBlock` is correctly determined
Cyan4973 Oct 22, 2024
ea85dc7
conservatively estimate over-splitting in presence of incompressible …
Cyan4973 Oct 22, 2024
4662f6e
renamed: FingerPrint => Fingerprint
Cyan4973 Oct 22, 2024
1ec5f9f
changed loop exit condition so that there is no need to assert() with…
Cyan4973 Oct 22, 2024
16450d0
rewrite penalty update
Cyan4973 Oct 22, 2024
06b7cfa
rewrote ZSTD_cwksp_initialAllocStart() to be easier to read
Cyan4973 Oct 22, 2024
0be334d
fixes static state allocation check
Cyan4973 Oct 22, 2024
d2eeed5
updated compression results
Cyan4973 Oct 22, 2024
18b1e67
fixed extraneous return
Cyan4973 Oct 22, 2024
57239c4
fixed minor strict pedantic C90 issue
Cyan4973 Oct 23, 2024
b68ddce
rewrite fingerprint storage to no longer need 64-bit members
Cyan4973 Oct 23, 2024
7d3e5e3
split all full 128 KB blocks
Cyan4973 Oct 23, 2024
c80645a
stricter limits to ensure expansion factor with blind-split strategy
Cyan4973 Oct 23, 2024
bbda1ac
update regression results
Cyan4973 Oct 23, 2024
90095f0
apply limit conditions for all splitting strategies
Cyan4973 Oct 24, 2024
70c77d2
update regression results
Cyan4973 Oct 24, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 18 additions & 3 deletions .github/workflows/dev-long-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -156,15 +156,30 @@ jobs:
make libc6install
CFLAGS="-O3 -m32" FUZZER_FLAGS="--long-tests" make uasan-fuzztest

clang-asan-ubsan-fuzz32:
clang-asan-fuzz32:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # tag=v4.1.1
- name: clang + ASan + UBSan + Fuzz Test 32bit
- name: clang + ASan + Fuzz Test 32bit
run: |
sudo apt-get -qqq update
make libc6install
CC=clang CFLAGS="-O3 -m32" FUZZER_FLAGS="--long-tests" make uasan-fuzztest
CC=clang CFLAGS="-O3 -m32" FUZZER_FLAGS="--long-tests" make asan-fuzztest

# The following test seems to have issues on github CI specifically,
# it does not provide the `__mulodi4` instruction emulation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have issues before this PR, or only after? If it is the latter, this might also have issues in the kernel, and we would need to do something similar to what we do with ZSTD_div64()

#define ZSTD_div64(dividend, divisor) ((dividend) / (divisor))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this might be a compiler bug.

Copy link
Contributor Author

@Cyan4973 Cyan4973 Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the issue doesn't reproduce on my local linux workstation.
I suspect a bug in the version of ubsan for clang currently deployed within github ci.

It's likely that this issue will be fixed at some unspecified time in the future.
We just can't wait for this issue to be fixed, the CI needs to continue running.
We'll re-enable this test when it can work properly on Github CI again.

Also, dropping temporarily the ubsan test for clang for 32-bit x86 targets seems like a minor inconvenience, given that:

  • we mostly care about 64-bit x64
  • ubsan 32-bit is still running in CI with gcc
  • we keep the more important asan test for 32-bit on clang

# required for signed 64-bit multiplication.
# Replaced by asan-only test (above)
#
# clang-asan-ubsan-fuzz32:
# runs-on: ubuntu-20.04
# steps:
# - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # tag=v4.1.1
# - name: clang + ASan + UBSan + Fuzz Test 32bit
# run: |
# sudo apt-get -qqq update
# make libc6install
# CC=clang CFLAGS="-O3 -m32" FUZZER_FLAGS="--long-tests" make uasan-fuzztest

asan-ubsan-regression:
runs-on: ubuntu-20.04
Expand Down
8 changes: 4 additions & 4 deletions .github/workflows/dev-short-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -232,10 +232,10 @@ jobs:
-Dbin_tests=true \
-Dbin_contrib=true \
-Ddefault_library=both \
build/meson builddir
ninja -C builddir/
meson test -C builddir/ --print-errorlogs
meson install -C builddir --destdir staging/
build/meson mesonBuild
ninja -C mesonBuild/
meson test -C mesonBuild/ --print-errorlogs
meson install -C mesonBuild --destdir staging/

meson-mingw-cross-compilation:
runs-on: ubuntu-latest
Expand Down
20 changes: 19 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -145,7 +145,7 @@ clean:
$(Q)$(MAKE) -C contrib/largeNbDicts $@ > $(VOID)
$(Q)$(MAKE) -C contrib/externalSequenceProducer $@ > $(VOID)
$(Q)$(RM) zstd$(EXT) zstdmt$(EXT) tmp*
$(Q)$(RM) -r lz4 cmakebuild install
$(Q)$(RM) -r lz4 cmakebuild mesonbuild install
@echo Cleaning completed

#------------------------------------------------------------------------------
Expand Down Expand Up @@ -415,6 +415,24 @@ cmakebuild:
$(CMAKE) --build cmakebuild --target install -- -j V=1
cd cmakebuild; ctest -V -L Medium

MESON ?= meson
NINJA ?= ninja

.PHONY: mesonbuild
mesonbuild:
$(MESON) setup \
--buildtype=debugoptimized \
-Db_lundef=false \
-Dauto_features=enabled \
-Dbin_programs=true \
-Dbin_tests=true \
-Dbin_contrib=true \
-Ddefault_library=both \
build/meson mesonbuild
$(NINJA) -C mesonbuild/
$(MESON) test -C mesonbuild/ --print-errorlogs
$(MESON) install -C mesonbuild --destdir staging/

.PHONY: c89build gnu90build c99build gnu99build c11build bmix64build bmix32build bmi32build staticAnalyze
c89build: clean
$(CC) -v
Expand Down
1 change: 1 addition & 0 deletions build/VS2010/fullbench/fullbench.vcxproj
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,7 @@
<ClCompile Include="..\..\..\lib\compress\zstd_compress_literals.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_sequences.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_superblock.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_preSplit.c" />
<ClCompile Include="..\..\..\lib\compress\zstdmt_compress.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_double_fast.c" />
Expand Down
1 change: 1 addition & 0 deletions build/VS2010/fuzzer/fuzzer.vcxproj
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,7 @@
<ClCompile Include="..\..\..\lib\compress\zstd_compress_literals.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_sequences.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_superblock.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_preSplit.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_double_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_lazy.c" />
Expand Down
1 change: 1 addition & 0 deletions build/VS2010/libzstd-dll/libzstd-dll.vcxproj
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
<ClCompile Include="..\..\..\lib\compress\zstd_compress_literals.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_sequences.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_superblock.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_preSplit.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_double_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_lazy.c" />
Expand Down
1 change: 1 addition & 0 deletions build/VS2010/libzstd/libzstd.vcxproj
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
<ClCompile Include="..\..\..\lib\compress\zstd_compress_literals.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_sequences.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_superblock.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_preSplit.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_double_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_lazy.c" />
Expand Down
1 change: 1 addition & 0 deletions build/VS2010/zstd/zstd.vcxproj
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@
<ClCompile Include="..\..\..\lib\compress\zstd_compress_literals.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_sequences.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_compress_superblock.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_preSplit.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_double_fast.c" />
<ClCompile Include="..\..\..\lib\compress\zstd_lazy.c" />
Expand Down
1 change: 1 addition & 0 deletions build/meson/lib/meson.build
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ libzstd_sources = [join_paths(zstd_rootdir, 'lib/common/entropy_common.c'),
join_paths(zstd_rootdir, 'lib/compress/zstd_compress_literals.c'),
join_paths(zstd_rootdir, 'lib/compress/zstd_compress_sequences.c'),
join_paths(zstd_rootdir, 'lib/compress/zstd_compress_superblock.c'),
join_paths(zstd_rootdir, 'lib/compress/zstd_preSplit.c'),
join_paths(zstd_rootdir, 'lib/compress/zstdmt_compress.c'),
join_paths(zstd_rootdir, 'lib/compress/zstd_fast.c'),
join_paths(zstd_rootdir, 'lib/compress/zstd_double_fast.c'),
Expand Down
1 change: 1 addition & 0 deletions build/single_file_libs/zstd-in.c
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@
#include "compress/zstd_compress_literals.c"
#include "compress/zstd_compress_sequences.c"
#include "compress/zstd_compress_superblock.c"
#include "compress/zstd_preSplit.c"
#include "compress/zstd_compress.c"
#include "compress/zstd_double_fast.c"
#include "compress/zstd_fast.c"
Expand Down
6 changes: 6 additions & 0 deletions lib/common/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -278,6 +278,12 @@
* Alignment check
*****************************************************************/

/* @return 1 if @u is a 2^n value, 0 otherwise
* useful to check a value is valid for alignment restrictions */
MEM_STATIC int ZSTD_isPower2(size_t u) {
return (u & (u-1)) == 0;
}

/* this test was initially positioned in mem.h,
* but this file is removed (or replaced) for linux kernel
* so it's now hosted in compiler.h,
Expand Down
Loading