Skip to content

Commit

Permalink
(Backport) Fix OptionVarULE in zerovec to also use C, packed (#5144)
Browse files Browse the repository at this point in the history
Backport of #5143
  • Loading branch information
Manishearth authored Jun 27, 2024
1 parent 6336fc8 commit 3c47d82
Show file tree
Hide file tree
Showing 9 changed files with 20 additions and 19 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
- `zerovec`
- (0.10.3) Fix size regression by making `twox-hash` dep `no_std` (https://github.com/unicode-org/icu4x/pull/5007)
- (0.10.3) Enforce C,packed, not just packed, on ULE types, fixing for incoming changes to `repr(Rust)` (https://github.com/unicode-org/icu4x/pull/5049)
- (0.10.4) Enforce C,packed on OptionVarULE (https://github.com/unicode-org/icu4x/pull/5143)
- `zerovec_derive`
- (0.10.3) Enforce C,packed, not just packed, on ULE types, fixing for incoming changes to `repr(Rust)` (https://github.com/unicode-org/icu4x/pull/5049)
## icu4x 1.5 (May 28, 2024)
Expand Down
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion utils/zerovec/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
[package]
name = "zerovec"
description = "Zero-copy vector backed by a byte array"
version = "0.10.3"
version = "0.10.4"
categories = ["rust-patterns", "memory-management", "caching", "no-std", "data-structures"]
keywords = ["zerocopy", "serialization", "zero-copy", "serde"]

Expand Down
4 changes: 2 additions & 2 deletions utils/zerovec/derive/src/ule.rs
Original file line number Diff line number Diff line change
Expand Up @@ -48,9 +48,9 @@ pub fn derive_impl(input: &DeriveInput) -> TokenStream2 {

// Safety (based on the safety checklist on the ULE trait):
// 1. #name does not include any uninitialized or padding bytes.
// (achieved by enforcing #[repr(transparent)] or #[repr(packed)] on a struct of only ULE types)
// (achieved by enforcing #[repr(transparent)] or #[repr(C, packed)] on a struct of only ULE types)
// 2. #name is aligned to 1 byte.
// (achieved by enforcing #[repr(transparent)] or #[repr(packed)] on a struct of only ULE types)
// (achieved by enforcing #[repr(transparent)] or #[repr(C, packed)] on a struct of only ULE types)
// 3. The impl of validate_byte_slice() returns an error if any byte is not valid.
// 4. The impl of validate_byte_slice() returns an error if there are extra bytes.
// 5. The other ULE methods use the default impl.
Expand Down
4 changes: 2 additions & 2 deletions utils/zerovec/derive/src/varule.rs
Original file line number Diff line number Diff line change
Expand Up @@ -88,9 +88,9 @@ pub fn derive_impl(

// Safety (based on the safety checklist on the ULE trait):
// 1. #name does not include any uninitialized or padding bytes
// (achieved by enforcing #[repr(transparent)] or #[repr(packed)] on a struct of only ULE types)
// (achieved by enforcing #[repr(transparent)] or #[repr(C, packed)] on a struct of only ULE types)
// 2. #name is aligned to 1 byte.
// (achieved by enforcing #[repr(transparent)] or #[repr(packed)] on a struct of only ULE types)
// (achieved by enforcing #[repr(transparent)] or #[repr(C, packed)] on a struct of only ULE types)
// 3. The impl of `validate_byte_slice()` returns an error if any byte is not valid.
// 4. The impl of `validate_byte_slice()` returns an error if the slice cannot be used in its entirety
// 5. The impl of `from_byte_slice_unchecked()` returns a reference to the same data.
Expand Down
10 changes: 5 additions & 5 deletions utils/zerovec/design_doc.md
Original file line number Diff line number Diff line change
Expand Up @@ -182,7 +182,7 @@ where

The trait is `unsafe` to implement since `ZeroVec` will rely on invariants promised by the trait. The main feature here is that this trait lets `ZeroVec` take a bytestream it is decoding and certify that it contains valid `Self` types. It allows `ZeroVec<T>` to turn `&[u8]` into `&[T::ULE]` during parsing or deserialization.

As `ULE` requires types to not have any alignment restrictions, most `ULE` types will be `#[repr(transparent)]` or `#[repr(packed)]` wrappers around other ULE types (or in general, types known to have no alignment requirements). Note that `#[repr(Rust)]` isn't defined or stable, so ULE types _must_ have _some_ `#[repr(..)]` tag for them to be able to stably uphold the invariants.
As `ULE` requires types to not have any alignment restrictions, most `ULE` types will be `#[repr(transparent)]` or `#[repr(C, packed)]` wrappers around other ULE types (or in general, types known to have no alignment requirements). Note that `#[repr(Rust)]` isn't defined or stable, so ULE types _must_ have _some_ `#[repr(..)]` tag for them to be able to stably uphold the invariants.

If you wish to make a custom ULE type, it will likely wrap [`RawBytesULE`] with added invariants (and `#[repr(transparent)]`, or do something like the following:

Expand All @@ -195,7 +195,7 @@ struct Foo {
}

// Implements ULE
#[repr(packed)]
#[repr(C, packed)]
struct FooULE {
field1: u32::ULE,
field2: char::ULE,
Expand Down Expand Up @@ -223,7 +223,7 @@ pub unsafe trait VarULE: 'static {

Similarly to [`ULE`], `VarULE` is an `unsafe` trait which mainly requires the user to specify whether a `&[u8]` slice contains a valid bit pattern for a _single_ `Self` instance. Since pointer metadata can vary between unsized types, `from_byte_slice_unchecked()` must also be specified by the implementor so that `VarZeroVec` can materialize `&Self` instances out of known-valid bit patterns after validation.

`VarULE` types must also accept any alignment, so most custom `VarULE` types will be `#[repr(packed)]` wrappers around structs containing `ULE` and `VarULE` types (like `str`, `[u8]`, [`VarZeroSlice`], [`ZeroSlice`]).
`VarULE` types must also accept any alignment, so most custom `VarULE` types will be `#[repr(C, packed)]` wrappers around structs containing `ULE` and `VarULE` types (like `str`, `[u8]`, [`VarZeroSlice`], [`ZeroSlice`]).


### `EncodeAsVarULE`
Expand Down Expand Up @@ -315,7 +315,7 @@ These are basic derives that can be applied to types to _just_ generate ULE and

These can only be applied to structs where all fields are ULE types (for `#[derive(VarULE)]`, the last field must be an unsized `VarULE` type). These derives will do the following things:

- Apply `#[repr(packed)]` to the type (or perhaps `#[repr(C)]` if we can determine that that will always work)
- Apply `#[repr(C, packed)]` to the type (or perhaps `#[repr(C)]` if we can determine that that will always work)
- Generate the appropriate `ZeroMapKV` impl (an opt-out can be provided)
- Generate a `ULE` or `VarULE` implementation that applies offsetted `validate_byte_slice()` for each field to implement the final `validate_byte_slice()`
- Generate `Copy`/`Clone` impls as necessary (`#[derive()]` does not work with packed types)
Expand Down Expand Up @@ -382,7 +382,7 @@ For an initial pass, we may only support single-heap-field structs for `#[make_v

Ideally, enums can have ULE impls autogenerated for them, but handling the discriminants gets tricky.

One way to handle this is to generate a private internal struct for each variant and do the usual ULE or VarULE generation for each. Then, manaully construct a `#[repr(packed)]` tagged union with a `u8` tag and a `union` of all of the internal structs. This is somewhat complicated but actually relatively simple to implement since it can use `#[make_ule]` and `#[make_varule]`.
One way to handle this is to generate a private internal struct for each variant and do the usual ULE or VarULE generation for each. Then, manaully construct a `#[repr(C, packed)]` tagged union with a `u8` tag and a `union` of all of the internal structs. This is somewhat complicated but actually relatively simple to implement since it can use `#[make_ule]` and `#[make_varule]`.


We may also do this completely manually, which gives us the opportunity for bitpacking the discriminant further, if combined with the bitpacking scheme discussed below.
Expand Down
8 changes: 4 additions & 4 deletions utils/zerovec/src/ule/custom.rs
Original file line number Diff line number Diff line change
Expand Up @@ -47,19 +47,19 @@
//! # field3: ZeroVec<'a, u32>
//! # }
//!
//! // Must be repr(packed) for safety of VarULE!
//! // Must be repr(C, packed) for safety of VarULE!
//! // Must also only contain ULE types
//! #[repr(packed)]
//! #[repr(C, packed)]
//! struct FooULE {
//! field1: <char as AsULE>::ULE,
//! field2: <u32 as AsULE>::ULE,
//! field3: ZeroSlice<u32>,
//! }
//!
//! // Safety (based on the safety checklist on the VarULE trait):
//! // 1. FooULE does not include any uninitialized or padding bytes. (achieved by `#[repr(packed)]` on
//! // 1. FooULE does not include any uninitialized or padding bytes. (achieved by `#[repr(C, packed)]` on
//! // a struct with only ULE fields)
//! // 2. FooULE is aligned to 1 byte. (achieved by `#[repr(packed)]` on
//! // 2. FooULE is aligned to 1 byte. (achieved by `#[repr(C, packed)]` on
//! // a struct with only ULE fields)
//! // 3. The impl of `validate_byte_slice()` returns an error if any byte is not valid.
//! // 4. The impl of `validate_byte_slice()` returns an error if the slice cannot be used in its entirety
Expand Down
2 changes: 1 addition & 1 deletion utils/zerovec/src/ule/niche.rs
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@ impl<U: NicheBytes<N> + ULE + Eq, const N: usize> Eq for NichedOptionULE<U, N> {
/// containing only ULE fields.
/// NichedOptionULE either contains NICHE_BIT_PATTERN or valid U byte sequences.
/// In both cases the data is initialized.
/// 2. NichedOptionULE is aligned to 1 byte due to `#[repr(packed)]` on a struct containing only
/// 2. NichedOptionULE is aligned to 1 byte due to `#[repr(C, packed)]` on a struct containing only
/// ULE fields.
/// 3. validate_byte_slice impl returns an error if invalid bytes are encountered.
/// 4. validate_byte_slice impl returns an error there are extra bytes.
Expand Down
6 changes: 3 additions & 3 deletions utils/zerovec/src/ule/option.rs
Original file line number Diff line number Diff line change
Expand Up @@ -141,7 +141,7 @@ impl<U: Copy + Eq> Eq for OptionULE<U> {}
/// ```
// The slice field is empty when None (bool = false),
// and is a valid T when Some (bool = true)
#[repr(packed)]
#[repr(C, packed)]
pub struct OptionVarULE<U: VarULE + ?Sized>(PhantomData<U>, bool, [u8]);

impl<U: VarULE + ?Sized> OptionVarULE<U> {
Expand All @@ -166,8 +166,8 @@ impl<U: VarULE + ?Sized + core::fmt::Debug> core::fmt::Debug for OptionVarULE<U>

// Safety (based on the safety checklist on the VarULE trait):
// 1. OptionVarULE<T> does not include any uninitialized or padding bytes
// (achieved by being repr(packed) on ULE types)
// 2. OptionVarULE<T> is aligned to 1 byte (achieved by being repr(packed) on ULE types)
// (achieved by being repr(C, packed) on ULE types)
// 2. OptionVarULE<T> is aligned to 1 byte (achieved by being repr(C, packed) on ULE types)
// 3. The impl of `validate_byte_slice()` returns an error if any byte is not valid.
// 4. The impl of `validate_byte_slice()` returns an error if the slice cannot be used in its entirety
// 5. The impl of `from_byte_slice_unchecked()` returns a reference to the same data.
Expand Down

0 comments on commit 3c47d82

Please sign in to comment.