Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost. #445

Merged
merged 9 commits into from
Jun 17, 2024

Conversation

LinZhihao-723
Copy link
Member

@LinZhihao-723 LinZhihao-723 commented Jun 16, 2024

Description

Motivation:

As we plan to publish an npm package for the JavaScript clp-ffi library, we need to remove standard Decompressor's boost dependency to build the necessary components properly. To achieve this milestone, we should replace boost's memory-mapped file with our own implementation. This change is first introduced in PR #387

Implementation

This PR introduces two classes necessary to wrap mmap system call:

  • FileDescriptor: This is a wrapper class to a raw file descriptor. With RAII, it is easier to handle failures/exceptions where a raw file descriptor is required. To handle to potential failure in the destructor, we provide a customizable callback if users don't want to swallow the error.
  • MemoryMappedFileView: This is a wrapper class to mmap system call. It maintains the mapped memory region using RAII, and provides methods to access an immutable view of the file. The memory region is unmapped in the destructor.

With the above two classes, we can replace boost's mmap support with our implementation in standard Decompressor.

Validation performed

  1. Ensure the code can be successfully built for all the relevant build targets.
  2. Ensure all existing unit tests are passed. The current streaming compression unit tests will call the above implementation to decompress zstd format files.
  3. Add new unit tests to test memory-mapped file functionalities separately. It also tests the special case where the mapped file is empty.

junhaoliao
junhaoliao previously approved these changes Jun 16, 2024
Copy link
Member

@junhaoliao junhaoliao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks!

I believe the code changes are sufficient to get rid of the "boost" header dependencies for Emscripten (WASM) compilation for the Zstd decompressor. Looking at cleaning up the CLP repo, can we now also remove references to boost::iostreams in the CMakeList.txt files? (May we need to replace all memeory-mapped file implementations with ours. )

@LinZhihao-723
Copy link
Member Author

LinZhihao-723 commented Jun 16, 2024

lgtm thanks!

I believe the code changes are sufficient to get rid of the "boost" header dependencies for Emscripten (WASM) compilation for the Zstd decompressor. Looking at cleaning up the CLP repo, can we now also remove references to boost::iostreams in the CMakeList.txt files? (May we need to replace all memeory-mapped file implementations with ours. )

I agree that we can do more cleaning. However, this is already a non-trivial PR. I think we should delay the rest of the cleanup to the future PRs. Arguably speaking, this PR should only contain two new classes. We should delay the boost deprecation in zstd decompressor to the next PR. What do u think? @kirkrodrigues

components/core/tests/test-MemoryMappedFile.cpp Outdated Show resolved Hide resolved
components/core/src/clp/FileDescriptor.cpp Outdated Show resolved Hide resolved
components/core/src/clp/FileDescriptor.cpp Outdated Show resolved Hide resolved
components/core/src/clp/FileDescriptor.hpp Outdated Show resolved Hide resolved
components/core/src/clp/FileDescriptor.hpp Outdated Show resolved Hide resolved
components/core/tests/test-MemoryMappedFile.cpp Outdated Show resolved Hide resolved
components/core/tests/test-MemoryMappedFile.cpp Outdated Show resolved Hide resolved
components/core/tests/test-MemoryMappedFile.cpp Outdated Show resolved Hide resolved
components/core/src/clp/MemoryMappedFileView.hpp Outdated Show resolved Hide resolved
Copy link
Member

@kirkrodrigues kirkrodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things to resolve.

@kirkrodrigues
Copy link
Member

However, this is already a non-trivial PR. I think we should delay the rest of the cleanup to the future PRs. Arguably speaking, this PR should only contain two new classes. We should delay the boost deprecation in zstd decompressor to the next PR. What do u think? @kirkrodrigues

Leaving further clean-up to another PR sounds good.

LinZhihao-723 and others added 2 commits June 17, 2024 10:37
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
components/core/src/clp/FileDescriptor.cpp Outdated Show resolved Hide resolved
components/core/src/clp/FileDescriptor.hpp Outdated Show resolved Hide resolved
components/core/src/clp/FileDescriptor.hpp Outdated Show resolved Hide resolved
components/core/tests/test-MemoryMappedFile.cpp Outdated Show resolved Hide resolved
LinZhihao-723 and others added 3 commits June 17, 2024 16:03
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Copy link
Member

@kirkrodrigues kirkrodrigues left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost.

@LinZhihao-723
Copy link
Member Author

For the PR title, how about:

core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost.

lgtm

@LinZhihao-723 LinZhihao-723 changed the title Add support for read-only memory mapped file; Remove zstandard Decompressor's dependency on boost. core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost. Jun 17, 2024
@LinZhihao-723 LinZhihao-723 merged commit b335c11 into y-scope:main Jun 17, 2024
11 checks passed
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024
… descriptors; Remove Zstandard's dependency on Boost. (y-scope#445)

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants