-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
s2: Don't use stack for index tables #1014
Conversation
Provide a pooled array pointer for tables instead of using stack. Seems like Go is still unstable with large stacks, so use alternative method.
📝 WalkthroughWalkthroughThe pull request introduces significant modifications across several files in the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (9)
s2/s2_test.go (3)
2229-2243
: LGTM: BenchmarkEncodeBlockSingle implementationThe
BenchmarkEncodeBlockSingle
function is well-implemented and follows the same structure asBenchmarkDecodeBlockSingle
. It correctly handles file downloading, applies size limits when necessary, and uses thebenchEncode
function for benchmarking.Consider extracting the common setup code (downloading, reading, and size limiting) into a separate helper function to reduce duplication between
BenchmarkDecodeBlockSingle
andBenchmarkEncodeBlockSingle
.
2245-2251
: LGTM: BenchmarkEncodeBlockParallel implementationThe
BenchmarkEncodeBlockParallel
function is correctly implemented, using thebenchFile
function for parallel encoding benchmarks. This approach is consistent with theBenchmarkDecodeBlockParallel
function and promotes code reuse.Consider refactoring
BenchmarkDecodeBlockParallel
andBenchmarkEncodeBlockParallel
into a single function with a parameter to determine whether to benchmark encoding or decoding. This would reduce code duplication and improve maintainability. For example:func BenchmarkBlockParallel(b *testing.B, decode bool) { for i := range testFiles { b.Run(fmt.Sprint(i, "-", testFiles[i].label), func(b *testing.B) { benchFile(b, i, decode) }) } } // Usage: func BenchmarkDecodeBlockParallel(b *testing.B) { BenchmarkBlockParallel(b, true) } func BenchmarkEncodeBlockParallel(b *testing.B) { BenchmarkBlockParallel(b, false) }
2205-2251
: Overall assessment: Solid implementation with room for minor improvementsThe new benchmark functions (
BenchmarkDecodeBlockSingle
,BenchmarkDecodeBlockParallel
,BenchmarkEncodeBlockSingle
, andBenchmarkEncodeBlockParallel
) are well-implemented and significantly enhance the test coverage for block encoding and decoding in both single-threaded and parallel contexts. They follow existing patterns in the file, promoting consistency and readability.To further improve the code:
- Consider extracting the common setup code in the single-threaded benchmarks into a helper function.
- Explore the possibility of combining the parallel benchmark functions into a single parameterized function to reduce duplication.
These refactorings would enhance maintainability and reduce the likelihood of inconsistencies in future updates.
s2/encode_amd64.go (1)
114-121
: Consider using a switch statement for improved readability.The repeated if-else conditions for checking the length of
src
can be replaced with a switch statement to improve code readability and clarity.Here's an example of how you can refactor the code using a switch statement:
func encodeBlockBetter(dst, src []byte) (d int) { // ... existing code ... switch { case len(src) > 4<<20: const sz, pool = 589824, 0 tmp := getPooledArray(&encBetterPools[pool], sz) defer encBetterPools[pool].Put(tmp) return encodeBetterBlockAsm(dst, src, tmp) case len(src) >= limit12B: const sz, pool = 589824, 0 tmp := getPooledArray(&encBetterPools[pool], sz) defer encBetterPools[pool].Put(tmp) return encodeBetterBlockAsm4MB(dst, src, tmp) case len(src) >= limit10B: const sz, pool = 81920, 0 tmp := getPooledArray(&encBetterPools[pool], sz) defer encBetterPools[pool].Put(tmp) return encodeBetterBlockAsm12B(dst, src, tmp) case len(src) >= limit8B: const sz, pool = 20480, 1 tmp := getPooledArray(&encBetterPools[pool], sz) defer encBetterPools[pool].Put(tmp) return encodeBetterBlockAsm10B(dst, src, tmp) case len(src) >= minNonLiteralBlockSize: const sz, pool = 5120, 2 tmp := getPooledArray(&encBetterPools[pool], sz) defer encBetterPools[pool].Put(tmp) return encodeBetterBlockAsm8B(dst, src, tmp) default: return 0 } }Using a switch statement with case conditions makes the code more readable and easier to understand the different length ranges and their corresponding actions.
Also applies to: 124-132, 135-143, 146-153, 158-166
s2/encodeblock_amd64.go (2)
14-14
: Consider adding unit tests to verify the behavior with the new function signatures.While the changes appear to be consistent, it would be beneficial to add unit tests specifically exercising the functions with the new signatures. This will help catch any potential issues related to the additional temporary byte array parameter and ensure the expected behavior is maintained.
Do you want me to generate the unit testing code or open a GitHub issue to track this task?
Also applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161
14-14
: Document the purpose and usage of the temporary byte arrays.The temporary byte arrays have been introduced in the function signatures, but their purpose and usage are not clearly documented. It would be helpful to add comments explaining:
- Why the temporary byte arrays are needed.
- How they are expected to be used within the functions.
- Any assumptions or constraints related to their size and content.
Consider adding comments to each function to clarify the purpose and usage of the temporary byte arrays. For example:
// encodeBlockAsm encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4294967295 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. // The 'tmp' parameter is a temporary byte array used for intermediate storage during encoding. // It must be a pointer to a [65536]byte array. // //go:noescape func encodeBlockAsm(dst []byte, src []byte, tmp *[65536]byte) intAlso applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161
s2/encode.go (1)
69-86
: Avoid magic numbers and use constants for pool sizes.The pool sizes
2048
and32768
are used as magic numbers in multiple places. Define them as constants at the package level with descriptive names to improve readability and maintainability.For example:
const ( smallBufferSize = 2048 largeBufferSize = 32768 ) func EstimateBlockSize(src []byte) (d int) { ... if len(src) <= 1024 { const sz, pool = smallBufferSize, 0 ... } else { const sz, pool = largeBufferSize, 1 ... } ... }🧰 Tools
🪛 GitHub Check: build (1.23.x, macos-latest)
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPools
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools🪛 GitHub Check: build (1.22.x, macos-latest)
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPools
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools🪛 GitHub Check: fuzz-s2
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPools🪛 GitHub Check: build (1.21.x, macos-latest)
[failure] 70-70:
undefined: encPools
[failure] 74-74:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 84-84:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)
[failure] 85-85:
undefined: encPools
[failure] 70-70:
undefined: encPools
[failure] 74-74:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)🪛 GitHub Check: build-special
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPoolss2/_generate/gen.go (2)
149-156
: Consider using a struct literal for improved readability.Instead of initializing the
regTable
fields one by one, consider using a struct literal for improved readability:table := regTable{r: Load(Param("tmp"), GP64()), disp: 0}
931-933
: Consider using a struct literal for improved readability.Instead of initializing the
regTable
fields one by one, consider using struct literals for improved readability:lTab := regTable{r: table, disp: 0} sTab := regTable{r: table, disp: lTableSize}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
s2/testdata/fuzz/block-corpus-enc.zip
is excluded by!**/*.zip
📒 Files selected for processing (5)
- s2/_generate/gen.go (9 hunks)
- s2/encode.go (3 hunks)
- s2/encode_amd64.go (5 hunks)
- s2/encodeblock_amd64.go (1 hunks)
- s2/s2_test.go (1 hunks)
🧰 Additional context used
🪛 GitHub Check: build (1.23.x, macos-latest)
s2/encode.go
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPools
[failure] 87-87:
too many arguments in call to calcBlockSize
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
🪛 GitHub Check: build (1.22.x, macos-latest)
s2/encode.go
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPools
[failure] 87-87:
too many arguments in call to calcBlockSize
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
🪛 GitHub Check: fuzz-s2
s2/encode.go
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPools
[failure] 87-87:
too many arguments in call to calcBlockSize
🪛 GitHub Check: build (1.21.x, macos-latest)
s2/encode.go
[failure] 70-70:
undefined: encPools
[failure] 74-74:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 84-84:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)
[failure] 85-85:
undefined: encPools
[failure] 87-87:
too many arguments in call to calcBlockSize
[failure] 70-70:
undefined: encPools
[failure] 74-74:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)
🪛 GitHub Check: build-special
s2/encode.go
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 80-80:
undefined: encPools
[failure] 85-85:
undefined: encPools
[failure] 87-87:
too many arguments in call to calcBlockSize
🔇 Additional comments (10)
s2/s2_test.go (2)
2205-2219
: LGTM: BenchmarkDecodeBlockSingle implementationThe implementation of
BenchmarkDecodeBlockSingle
is well-structured and consistent with other benchmark functions in the file. It correctly handles file downloading, applies size limits when necessary, and uses thebenchDecode
function for actual benchmarking.
2221-2227
: LGTM: BenchmarkDecodeBlockParallel implementationThe
BenchmarkDecodeBlockParallel
function is efficiently implemented, leveraging the existingbenchFile
function for parallel decoding benchmarks. This approach promotes code reuse and maintains consistency with other benchmark functions.s2/encode_amd64.go (3)
189-197
: Verify the impact of the changes on the Snappy compatibility.The changes made to the
encodeBlockSnappy
function, such as the usage ofsync.Pool
and the modified length conditions, should be carefully reviewed to ensure that they don't break the compatibility with the Snappy format.To verify the Snappy compatibility, you can run the following script:
#!/bin/bash # Run Snappy compatibility tests go test -run=TestSnappyCompatibility ./... # Verify that the Snappy-encoded data can be decoded correctly go test -run=TestSnappyDecoding ./...This script will run the Snappy compatibility tests and verify that the Snappy-encoded data can be decoded correctly. If any of the tests fail, it indicates that the changes might have introduced compatibility issues with the Snappy format.
Additionally, review the Snappy format specification and compare it with the modified code to ensure that the encoding and decoding processes still adhere to the specification.
Also applies to: 200-207, 210-217, 220-227, 232-239
262-270
: Ensure the changes are benchmarked and profiled.The modifications made to the
encodeBlockBetterSnappy
function, including the usage ofsync.Pool
and the updated length conditions, should be thoroughly benchmarked and profiled to measure their impact on performance.To benchmark and profile the changes, you can use the following commands:
# Run benchmarks for the Snappy encoding go test -bench=BenchmarkEncodeBlockBetterSnappy ./... # Run CPU and memory profiling go test -cpuprofile cpu.prof -memprofile mem.prof -bench=BenchmarkEncodeBlockBetterSnappy ./... # Analyze the CPU profile go tool pprof cpu.prof # Analyze the memory profile go tool pprof mem.profThese commands will run benchmarks specifically for the
encodeBlockBetterSnappy
function and generate CPU and memory profiles. Use thego tool pprof
command to analyze the profiles and identify any performance bottlenecks or excessive memory allocations.Compare the benchmark results and profile analysis with the previous implementation to ensure that the changes have a positive impact on performance and memory usage. If there are any regressions or unexpected behavior, investigate and optimize accordingly.
Also applies to: 274-282, 285-293, 296-303, 308-316
14-14
: Verify the usage ofsync.Pool
in the codebase.The introduction of
sync.Pool
for managing temporary byte arrays is a good approach to optimize memory usage. However, it's important to ensure that the usage ofsync.Pool
is consistent throughout the codebase and that it doesn't introduce any unexpected behavior or performance issues.To verify the usage of
sync.Pool
, you can run the following script:This script will help you locate all the instances where
sync.Pool
is used, as well as the specific usage ofencPools
andencBetterPools
. Review the results to ensure that the usage is consistent and follows the intended pattern.Also applies to: 90-90
s2/encodeblock_amd64.go (2)
14-14
: Ensure the temporary byte arrays are properly initialized and have the expected lifetime.The new function signatures include a pointer to a fixed-size byte array as the temporary storage. It's important to ensure that:
- The byte arrays are properly initialized before being passed to the functions.
- The lifetime of the byte arrays is managed correctly to avoid dangling pointers or memory leaks.
To verify the proper initialization and lifetime management of the temporary byte arrays, run the following script:
Review the results to ensure proper initialization and lifetime management of the temporary byte arrays.
Also applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161
14-14
: Verify the impact of the function signature changes on the codebase.The function signatures have been updated to include an additional parameter for a temporary byte array. This change is consistent across all the affected functions. However, it's crucial to ensure that all the function calls within the codebase have been updated to pass the required temporary byte array.
To verify the impact and ensure all function calls have been updated, run the following script:
If any outdated function calls are found, please update them to pass the temporary byte array argument.
Also applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161
✅ Verification successful
All function calls have been successfully updated with the additional
tmp
parameter. No issues found.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: # Verify that all function calls in Go files have been updated to match the new signatures. # Expected results: # - No occurrences of the old function signatures. # - All occurrences of the new function signatures should have the additional argument. # Search for each function and analyze the results funcs=( "encodeBlockAsm" "encodeBlockAsm4MB" "encodeBlockAsm12B" "encodeBlockAsm10B" "encodeBlockAsm8B" "encodeBetterBlockAsm" "encodeBetterBlockAsm4MB" "encodeBetterBlockAsm12B" "encodeBetterBlockAsm10B" "encodeBetterBlockAsm8B" "encodeSnappyBlockAsm" "encodeSnappyBlockAsm64K" "encodeSnappyBlockAsm12B" "encodeSnappyBlockAsm10B" "encodeSnappyBlockAsm8B" "encodeSnappyBetterBlockAsm" "encodeSnappyBetterBlockAsm64K" "encodeSnappyBetterBlockAsm12B" "encodeSnappyBetterBlockAsm10B" "encodeSnappyBetterBlockAsm8B" "calcBlockSize" "calcBlockSizeSmall" ) for func in "${funcs[@]}"; do echo "Analyzing function: $func" rg --type go --ignore-case -A 5 "\b$func\b" echo "---" doneLength of output: 51334
Script:
#!/bin/bash # Description: # Verify that all function calls in Go files have been updated to match the new signatures with the 'tmp' parameter. # Expected results: # - No occurrences of the old function signatures without the 'tmp' parameter. # - All occurrences of the updated function signatures should include the 'tmp' argument. # Define the list of functions to verify funcs=( "encodeBlockAsm" "encodeBlockAsm4MB" "encodeBlockAsm12B" "encodeBlockAsm10B" "encodeBlockAsm8B" "encodeBetterBlockAsm" "encodeBetterBlockAsm4MB" "encodeBetterBlockAsm12B" "encodeBetterBlockAsm10B" "encodeBetterBlockAsm8B" "encodeSnappyBlockAsm" "encodeSnappyBlockAsm64K" "encodeSnappyBlockAsm12B" "encodeSnappyBlockAsm10B" "encodeSnappyBlockAsm8B" "encodeSnappyBetterBlockAsm" "encodeSnappyBetterBlockAsm64K" "encodeSnappyBetterBlockAsm12B" "encodeSnappyBetterBlockAsm10B" "encodeSnappyBetterBlockAsm8B" "calcBlockSize" "calcBlockSizeSmall" ) # Iterate over each function and check for calls without the 'tmp' parameter for func in "${funcs[@]}"; do echo "Checking for calls to function: $func without 'tmp' parameter" # Search for function calls missing the third argument 'tmp' rg --type go --ignore-case "\\b$func\\s*\\([^,]+,[^,]+\\)" -n echo "---" done # Check for any remaining instances of the old function signatures (without 'tmp') echo "Checking for any old function signatures without 'tmp' parameter" rg --type go --ignore-case "func\s+$\{funcs[*]\}\s*\([^,]+,[^,]+\)" -n echo "---"Length of output: 7465
s2/encode.go (2)
Line range hint
1-11
: LGTM!The rest of the changes in this file look good to me. The core encoding logic remains unchanged, and the new imports and
sync.Pool
usage are properly scoped to theEstimateBlockSize
function.Also applies to: 15-57, 88-452
69-77
:⚠️ Potential issuePass the temporary buffer as an argument to
calcBlockSizeSmall
andcalcBlockSize
.The static analysis tools report that too many arguments are being passed to
calcBlockSizeSmall
andcalcBlockSize
. It seems that the temporary buffertmp
needs to be passed as an argument to these functions.Modify the calls to
calcBlockSizeSmall
andcalcBlockSize
to pass thetmp
buffer:- d = calcBlockSizeSmall(src, tmp) + d = calcBlockSizeSmall(src, tmp[:]) - d = calcBlockSize(src, tmp) + d = calcBlockSize(src, tmp[:])To verify that
calcBlockSizeSmall
andcalcBlockSize
accept the temporary buffer as an argument, run:The expected output should show that both functions accept two arguments, with the second one being a slice.
Also applies to: 79-87
🧰 Tools
🪛 GitHub Check: build (1.23.x, macos-latest)
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall🪛 GitHub Check: build (1.22.x, macos-latest)
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall🪛 GitHub Check: fuzz-s2
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall🪛 GitHub Check: build (1.21.x, macos-latest)
[failure] 70-70:
undefined: encPools
[failure] 74-74:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmall
[failure] 70-70:
undefined: encPools
[failure] 74-74:
cannot infer T (/Users/runner/work/compress/compress/internal/race/norace.go:12:17)🪛 GitHub Check: build-special
[failure] 70-70:
undefined: encPools
[failure] 75-75:
undefined: encPools
[failure] 77-77:
too many arguments in call to calcBlockSizeSmalls2/_generate/gen.go (1)
84-84
: Verify the function signature change in the codebase.The second parameter to
genEncodeBetterBlockAsm
has been changed from14
to13
for the "encodeSnappyBetterBlockAsm64K" function call. Ensure that all corresponding function definitions and calls have been updated to match this new signature.Run the following script to verify the function usage:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (8)
s2/encode_go.go (3)
320-320
: LGTM. Consider adding documentation for the new parameter.The addition of the
*[32768]byte
parameter aligns with the PR objective of using a pooled array pointer for index tables instead of the stack. This change should help address issues related to stack instability in Go when handling large stacks.Consider adding a comment explaining the purpose of this new parameter, even though it's currently unused within the function. This will help future maintainers understand the intention behind this change.
506-506
: LGTM. Consider adding documentation for the new parameter.The addition of the
*[2048]byte
parameter is consistent with the changes made tocalcBlockSize
and aligns with the PR objective. This should help optimize memory usage for smaller blocks while addressing stack instability issues.Consider adding a comment explaining the purpose of this new parameter and why it uses a smaller array size compared to
calcBlockSize
. This will provide clarity for future maintainers and contributors.
320-320
: Overall impact: Positive change, consider additional updates.The changes to
calcBlockSize
andcalcBlockSizeSmall
are well-aligned with the PR objective of using pooled array pointers for index tables. This should help address stack instability issues in Go when handling large stacks.Some considerations:
- Ensure that the calling code is updated to provide the new array pointers.
- Consider adding unit tests to verify that the functions work correctly with the new parameters.
- Update any relevant documentation or comments in other parts of the codebase that might reference these functions.
To fully implement the pooled array pointer strategy:
- Review and update any other functions in the package that might benefit from a similar approach.
- Consider creating a memory pool manager to efficiently allocate and reuse these arrays across multiple compression operations.
- Update the package documentation to explain this new memory management strategy and its benefits.
Also applies to: 506-506
s2/encode_amd64.go (1)
38-45
: Ensure proper error handling when retrieving from the pool.The code currently ignores the boolean return value from the pool's
Get()
method, which indicates whether the retrieved value is newly allocated. It's good practice to handle this explicitly to avoid any potential issues.Consider adding error handling or logging when the pool returns a newly allocated value. For example:
tmp, ok := encPools[pool].Get().(*[sz]byte) if !ok { log.Printf("encPools[%d]: created new buffer of size %d", pool, sz) tmp = &[sz]byte{} }This will help track when the pool is unable to reuse an existing buffer and has to allocate a new one.
Also applies to: 48-55, 58-65, 68-75, 80-87, 114-121, 124-132, 135-143, 146-153, 158-166, 189-197, 200-207, 210-217, 220-227, 232-239, 262-270, 274-282, 285-293, 296-303, 308-316
s2/encodeblock_amd64.go (2)
14-14
: Consider adding tests to cover the new function signatures.To ensure the correctness of the changes and prevent regressions, it's a good practice to add tests that exercise the new function signatures.
Create test cases that:
- Call the modified functions with valid input and the correctly sized
tmp
parameter.- Verify that the functions behave as expected and produce the correct output.
- Test edge cases and boundary conditions related to the
tmp
parameter.Adding these tests will improve the overall test coverage and provide confidence in the changes made to the function signatures.
Do you want me to generate some example test cases for the modified functions?
Also applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161
14-14
: Document the purpose and usage of thetmp
parameter.Update the function documentation comments to clearly explain the purpose and usage of the new
tmp
parameter. Provide guidelines on how to allocate and initialize thetmp
array correctly.Include information such as:
- The role of the
tmp
parameter in the encoding process.- The required size of the
tmp
array for each function.- Any assumptions or constraints related to the
tmp
parameter.Clear documentation will help developers understand how to use these functions correctly and avoid potential issues related to incorrect usage of the
tmp
parameter.Apply this diff to update the documentation comments:
// encodeBlockAsm encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4294967295 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 65536. // //go:noescape func encodeBlockAsm(dst []byte, src []byte, tmp *[65536]byte) int // encodeBlockAsm4MB encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4194304 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 65536. // //go:noescape func encodeBlockAsm4MB(dst []byte, src []byte, tmp *[65536]byte) int // encodeBlockAsm12B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 16383 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 16384. // //go:noescape func encodeBlockAsm12B(dst []byte, src []byte, tmp *[16384]byte) int // encodeBlockAsm10B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4095 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 4096. // //go:noescape func encodeBlockAsm10B(dst []byte, src []byte, tmp *[4096]byte) int // encodeBlockAsm8B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 511 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 1024. // //go:noescape func encodeBlockAsm8B(dst []byte, src []byte, tmp *[1024]byte) int // encodeBetterBlockAsm encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4294967295 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 589824. // //go:noescape func encodeBetterBlockAsm(dst []byte, src []byte, tmp *[589824]byte) int // encodeBetterBlockAsm4MB encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4194304 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 589824. // //go:noescape func encodeBetterBlockAsm4MB(dst []byte, src []byte, tmp *[589824]byte) int // encodeBetterBlockAsm12B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 16383 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 81920. // //go:noescape func encodeBetterBlockAsm12B(dst []byte, src []byte, tmp *[81920]byte) int // encodeBetterBlockAsm10B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4095 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 20480. // //go:noescape func encodeBetterBlockAsm10B(dst []byte, src []byte, tmp *[20480]byte) int // encodeBetterBlockAsm8B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 511 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 5120. // //go:noescape func encodeBetterBlockAsm8B(dst []byte, src []byte, tmp *[5120]byte) int // encodeSnappyBlockAsm encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4294967295 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 65536. // //go:noescape func encodeSnappyBlockAsm(dst []byte, src []byte, tmp *[65536]byte) int // encodeSnappyBlockAsm64K encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 65535 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 65536. // //go:noescape func encodeSnappyBlockAsm64K(dst []byte, src []byte, tmp *[65536]byte) int // encodeSnappyBlockAsm12B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 16383 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 16384. // //go:noescape func encodeSnappyBlockAsm12B(dst []byte, src []byte, tmp *[16384]byte) int // encodeSnappyBlockAsm10B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4095 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 4096. // //go:noescape func encodeSnappyBlockAsm10B(dst []byte, src []byte, tmp *[4096]byte) int // encodeSnappyBlockAsm8B encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 511 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter is a pointer to a temporary byte array used during encoding. +// It must be a pointer to a byte array of size 1024. // //go:noescape func encodeSnappyBlockAsm8B(dst []byte, src []byte, tmp *[1024]byte) int // encodeSnappyBetterBlockAsm encodes a non-empty src to a guaranteed-large-enough dst. // Maximum input 4294967295 bytes. // It assumes that the varint-encoded length of the decompressed bytes has already been written. +// The `tmp` parameter Also applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161 </blockquote></details> <details> <summary>s2/encode.go (1)</summary><blockquote> `69-87`: **Consider adding comments to explain the pool sizes and their usage.** The changes introduce two different pool sizes (`2048` and `32768`) for handling small and larger input sizes respectively. To improve code readability and maintainability, consider adding comments to explain the rationale behind these pool sizes and their corresponding usage. Here's an example of how you can add comments to clarify the pool sizes: ```go // For small inputs (less than or equal to 1024 bytes), use a pool with byte arrays of size 2048. // This pool is optimized for small input sizes to reduce memory allocation overhead. const sz, pool = 2048, 0 tmp, ok := estblockPool[pool].Get().(*[sz]byte) // ... // For larger inputs (greater than 1024 bytes), use a pool with byte arrays of size 32768. // This pool is optimized for larger input sizes to provide better performance. const sz, pool = 32768, 1 tmp, ok := estblockPool[pool].Get().(*[sz]byte) // ...Adding clear comments will help future maintainers understand the purpose and usage of these pool sizes.
s2/_generate/gen.go (1)
874-883
: Ensure the temporary table size is sufficient.The
genEncodeBetterBlockAsm
function now takes an additionaltmp *[tableSize]byte
parameter to use a temporary table allocated on the heap instead of the stack.However, the table size calculation and checks seem to be duplicated from the
genEncodeBlockAsm
function. Consider refactoring this to avoid duplication and ensure the size is consistent and sufficient for all cases.Apply this diff to remove the duplicated table size calculation and checks:
- var lTableSize = 4 * (1 << lTableBits) - var sTableSize = 4 * (1 << sTableBits) - tableSize := lTableSize + sTableSize - - // Memzero needs at least 128 bytes. - if tableSize < 128 { - panic("tableSize must be at least 128 bytes") - } - arrPtr := fmt.Sprintf(", tmp *[%d]byte", tableSize) -Then update the function signature to reuse the
tableSize
variable fromgenEncodeBlockAsm
:-func (o options) genEncodeBetterBlockAsm(name string, lTableBits, sTableBits, skipLog, lHashBytes, maxLen int) { +func (o options) genEncodeBetterBlockAsm(name string, lTableBits, sTableBits, skipLog, lHashBytes, maxLen int, tableSize int) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
s2/testdata/fuzz/block-corpus-enc.zip
is excluded by!**/*.zip
📒 Files selected for processing (6)
- s2/_generate/gen.go (9 hunks)
- s2/encode.go (3 hunks)
- s2/encode_amd64.go (5 hunks)
- s2/encode_go.go (2 hunks)
- s2/encodeblock_amd64.go (1 hunks)
- s2/s2_test.go (1 hunks)
🔇 Additional comments (12)
s2/encode_go.go (1)
320-320
: Summary: Approved changes with minor suggestions for improvement.The modifications to
calcBlockSize
andcalcBlockSizeSmall
are well-implemented and align with the PR objective of using pooled array pointers for index tables. These changes should help address stack instability issues in Go when handling large stacks, without altering the core compression logic.Suggestions for improvement:
- Add comments explaining the purpose of the new parameters in both functions.
- Update relevant documentation and calling code to reflect these changes.
- Consider implementing a memory pool manager for efficient array allocation and reuse.
Overall, this is a positive change that should improve the package's performance and stability.
Also applies to: 506-506
s2/encode_amd64.go (1)
38-45
: Verify that the assembly functions are thread-safe.Since the buffers retrieved from the pools are shared across goroutines, it's crucial to ensure that the assembly functions (
encodeBlockAsm
,encodeBlockBetterAsm
, etc.) are thread-safe and don't introduce any data races when accessing the shared buffers.To verify the thread-safety of the assembly functions, you can use the Go race detector by running tests with the
-race
flag:If the race detector reports any issues, carefully review the assembly code and make necessary synchronization changes.
Additionally, consider adding a comment to each assembly function documenting its thread-safety guarantees. This will help maintainers understand the concurrency semantics of the code.
For example:
// encodeBlockAsm is thread-safe and can be called concurrently with the same dst and src buffers. // The tmp buffer must not be shared between concurrent calls. func encodeBlockAsm(dst, src, tmp []byte) int { // ... }Also applies to: 48-55, 58-65, 68-75, 80-87, 114-121, 124-132, 135-143, 146-153, 158-166, 189-197, 200-207, 210-217, 220-227, 232-239, 262-270, 274-282, 285-293, 296-303, 308-316
s2/encodeblock_amd64.go (2)
14-14
: Ensure thetmp
parameter is properly initialized and has the correct size.When calling these functions, make sure to allocate and pass a byte array of the appropriate size as the
tmp
parameter. Using an incorrectly sized array may lead to buffer overflows or unexpected behavior.Double-check that the size of the
tmp
array matches the size specified in the function signature for each function.Run the following script to verify the initialization of the
tmp
parameter:#!/bin/bash # Description: # Verify that the `tmp` parameter is properly initialized with the correct size when calling the modified functions. # Test: Search for the function usages and check the `tmp` parameter initialization. # Expect: The `tmp` parameter should be initialized with a byte array of the size specified in the function signature. functions=( "encodeBlockAsm:65536" "encodeBlockAsm4MB:65536" "encodeBlockAsm12B:16384" "encodeBlockAsm10B:4096" "encodeBlockAsm8B:1024" "encodeBetterBlockAsm:589824" "encodeBetterBlockAsm4MB:589824" "encodeBetterBlockAsm12B:81920" "encodeBetterBlockAsm10B:20480" "encodeBetterBlockAsm8B:5120" "encodeSnappyBlockAsm:65536" "encodeSnappyBlockAsm64K:65536" "encodeSnappyBlockAsm12B:16384" "encodeSnappyBlockAsm10B:4096" "encodeSnappyBlockAsm8B:1024" "encodeSnappyBetterBlockAsm:589824" "encodeSnappyBetterBlockAsm64K:294912" "encodeSnappyBetterBlockAsm12B:81920" "encodeSnappyBetterBlockAsm10B:20480" "encodeSnappyBetterBlockAsm8B:5120" "calcBlockSize:32768" "calcBlockSizeSmall:2048" ) for func_size in "${functions[@]}"; do func=${func_size%%:*} size=${func_size#*:} echo "Checking initialization of 'tmp' for $func..." rg --type go -A 5 "\b$func\b\(" | awk -v size="$size" '$0 ~ "&\\[[0-9]+\\]byte" { match($0, "&\\[([0-9]+)\\]byte", arr) if (arr[1] != size) { print "Error: Incorrect size for 'tmp' parameter. Expected: " size ", Found: " arr[1] } }' doneIf any mismatches are found, update the
tmp
parameter initialization to use the correct size.Also applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161
14-14
: Verify the impact of the function signature changes in the codebase.The function signatures have been updated to include an additional
tmp
parameter, which is a pointer to a byte array of varying sizes. This change is consistent across all the affected functions.However, it's crucial to ensure that all the callers of these functions have been updated to pass the required
tmp
parameter. Failing to do so may result in compilation errors or unexpected behavior.Run the following script to verify the usage of these functions:
If any inconsistencies are found, the callers must be updated to match the new function signatures.
Also applies to: 21-21, 28-28, 35-35, 42-42, 49-49, 56-56, 63-63, 70-70, 77-77, 84-84, 91-91, 98-98, 105-105, 112-112, 119-119, 126-126, 133-133, 140-140, 147-147, 154-154, 161-161
✅ Verification successful
All function callers correctly pass the
tmp
parameter.No inconsistencies were found in the function calls.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: # Verify that all callers of the modified functions have been updated to pass the `tmp` parameter. # Test: Search for the function usages. # Expect: All occurrences should pass the `tmp` parameter. functions=( "encodeBlockAsm" "encodeBlockAsm4MB" "encodeBlockAsm12B" "encodeBlockAsm10B" "encodeBlockAsm8B" "encodeBetterBlockAsm" "encodeBetterBlockAsm4MB" "encodeBetterBlockAsm12B" "encodeBetterBlockAsm10B" "encodeBetterBlockAsm8B" "encodeSnappyBlockAsm" "encodeSnappyBlockAsm64K" "encodeSnappyBlockAsm12B" "encodeSnappyBlockAsm10B" "encodeSnappyBlockAsm8B" "encodeSnappyBetterBlockAsm" "encodeSnappyBetterBlockAsm64K" "encodeSnappyBetterBlockAsm12B" "encodeSnappyBetterBlockAsm10B" "encodeSnappyBetterBlockAsm8B" "calcBlockSize" "calcBlockSizeSmall" ) for func in "${functions[@]}"; do echo "Checking usages of $func..." rg --type go -A 5 "\b$func\b\(" doneLength of output: 11421
s2/encode.go (3)
69-87
: Evaluate the impact of the changes on the overall performance.The modifications in the
EstimateBlockSize
function introduce the usage ofestblockPool
and change the way temporary byte arrays are handled. While these changes aim to improve memory efficiency, it's important to evaluate their impact on the overall performance of the encoding process.To assess the performance impact, consider running benchmarks and profiling the code before and after the changes. Here's a suggested approach:
- Create benchmark tests for the
EstimateBlockSize
function with different input sizes (small and large).- Run the benchmarks on the original code and record the results.
- Apply the changes to the
EstimateBlockSize
function.- Run the benchmarks again on the modified code and compare the results with the original benchmarks.
- Analyze the benchmark results to determine if there are any significant performance improvements or regressions.
- If necessary, use profiling tools to identify any performance bottlenecks or excessive memory allocations.
By conducting a thorough performance evaluation, you can ensure that the changes indeed improve the efficiency of the encoding process without introducing any unexpected performance issues.
58-59
: Ensure proper synchronization and avoid contention.The introduction of
estblockPool
as a slice ofsync.Pool
is a good approach to manage temporary byte arrays. However, it's crucial to ensure that the access to the pool is properly synchronized to avoid contention and race conditions.To verify the proper usage of
estblockPool
, run the following script:#!/bin/bash # Check if estblockPool is accessed within a mutex lock if ! rg --type go 'estblockPool\[.*\]\.Get\(\)' -A 10 | rg -q 'mu(tex)?\.Lock\(\)'; then echo "estblockPool is accessed without proper synchronization." else echo "estblockPool access is properly synchronized." fiIf the pool access is not properly synchronized, consider adding necessary mutex locks or using atomic operations to ensure thread safety.
69-77
: Ensure proper usage of temporary byte arrays.The changes introduce the usage of temporary byte arrays fetched from
estblockPool
to optimize memory usage during block size estimation. It's important to ensure that these temporary arrays are properly initialized, used, and returned to the pool to avoid memory leaks or unexpected behavior.To verify the proper usage of temporary byte arrays, run the following script:
#!/bin/bash # Check if the temporary byte arrays are properly returned to the pool if ! rg --type go 'tmp, ok := estblockPool\[pool\]\.Get\(\)' -A 10 | rg -q 'defer estblockPool\[pool\]\.Put\(tmp\)'; then echo "Temporary byte arrays are not properly returned to the pool." else echo "Temporary byte arrays are properly returned to the pool." fi # Check if the temporary byte arrays are properly initialized if ! rg --type go 'tmp, ok := estblockPool\[pool\]\.Get\(\)' -A 10 | rg -q 'tmp = &\[.*\]byte{}'; then echo "Temporary byte arrays are not properly initialized when not found in the pool." else echo "Temporary byte arrays are properly initialized when not found in the pool." fiIf the temporary arrays are not properly managed, make sure to return them to the pool using
defer
statements and initialize them correctly when they are not found in the pool.Also applies to: 79-87
s2/s2_test.go (3)
2221-2227
: LGTM!The
BenchmarkDecodeBlockParallel
looks good. It uses the existingbenchFile
function to run parallel decoding benchmarks for each test file.
2245-2251
: LGTM!The
BenchmarkEncodeBlockParallel
looks good. It uses the existingbenchFile
function to run parallel encoding benchmarks for each test file.
2213-2215
: Verify the impact of limiting the data size on benchmarks.Limiting the data size for benchmarks with
sizeLimit
may affect the benchmark results and not provide a comprehensive performance assessment. Ensure that the benchmarks cover a representative range of data sizes to get meaningful performance measurements.To verify the impact, run the benchmarks with different
sizeLimit
values or without any limit and compare the results:s2/_generate/gen.go (2)
2942-2942
: Verify the output margin is sufficient for all cases.The output margin is being set to 8 bytes. However, it's important to ensure this margin is sufficient to handle all possible output scenarios without risking buffer overflows.
Perform a thorough analysis of the maximum possible output sizes across all compression algorithms and configurations:
84-84
: Verify the maximum skip value is correctly adjusted for Snappy compatibility.The maximum skip value is being set to 0 for Snappy compatibility. Please ensure this is the intended behavior and that it does not negatively impact performance for Snappy use cases.
To verify, search the codebase for references to
maxSkip
and confirm the intended usage:✅ Verification successful
Verified: The
maxSkip
value is correctly set to 0 for Snappy compatibility and does not negatively impact performance for Snappy use cases.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check if setting maxSkip to 0 for Snappy is intentional and valid. # 1. Search for where maxSkip is set and used. # Expect: Should only be set to 0 for Snappy and used correctly. rg -A 5 -B 5 'maxSkip' # 2. Search for Snappy-related logic that may be impacted by maxSkip. # Expect: Logic should handle maxSkip=0 case properly. rg -A 5 -B 5 'snappy'Length of output: 142770
12 hours fuzz testing passed, execs: 938238817 |
Provide a pooled array pointer for tables instead of using stack.
Seems like Go is still unstable with large stacks, so use alternative method.
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Documentation
Tests