Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(all): use a builder pattern for safe serialization API #1465

Merged
merged 2 commits into from
Sep 30, 2024

Conversation

nsarlin-zama
Copy link
Contributor

@nsarlin-zama nsarlin-zama commented Aug 9, 2024

closes: https://github.com/zama-ai/tfhe-rs-internal/issues/646

PR content/description

The goal of this PR is to simplify the safe_serialization API. With the introduction of data versioning, there are many different serialization possibilities:

  • safe or not,
  • conformant or not (only for deserialization),
  • versioned or not

This can be difficult for us to maintain and confusing to the user since the function they should use by default have the most complicated names (safe_deserialize_conformant_versioned). The idea is to introduce a simple builder pattern to configure the serialization, with the safest defaults and with a possibility to disable functionalities for advanced users.

This PR introduces 2 objects, SerializationConfig and DeserializationConfig. The idea for serialization is to create this object, eventually modify it to disable some functionalities, and serialize an object with the serialize_into method:

  • default config (safe + conformance + versioning):
// Serialization is "safe" by default, so we have to give the max size to the config
SerializationConfig::new(1 << 20).serialize_into(&ct, &mut buffer).unwrap();
    

// Deserialization is also "conformant" by default, so we also have to give some conformance parameters
DeserializationConfig::new(1 << 20)
    .deserialize_from::<FheUint8>(buffer.as_slice(), &conformance_params)
    .unwrap();
  • no versioning:
// We use the `new` constructor and disable the functionality
SerializationConfig::new(1 << 20).disable_versioning().serialize_into(&ct, &mut buffer).unwrap();
    
// No need to configure that the data is not versioned, this is detected during deserialization
DeserializationConfig::new(1 << 20)
    .deserialize_from::<FheUint8>(buffer.as_slice())
    .unwrap();
  • no conformance:
SerializationConfig::new(1 << 20).serialize_into(&ct, &mut buffer).unwrap();
    
// Here we use the `new_without_conformance` so we don't have to give conformance params that won't be used
DeserializationConfig::new(1 << 20)
    .disable_conformance()
    .deserialize_from::<FheUint8>(buffer.as_slice())
    .unwrap();

The safe_serialize/safe_serialize have been kept as defaults for users that don't need customization. They include versioning.

Other changes:

  • The versioning state (enabled or not) is encoded in the serialized data and is detected automatically during deserialization. This mean that DeserializationConfig::new(...).deserialize_from(...) handles both versioned and unversioned data.
  • If the data is unversioned, the tfhe-rs version is stored in the binary file, and we raise an error if the version is different than the current one (this check can be disabled if the user is sure that the types are compatible).
  • In the C API, the only safe_serialization method is versioned. There are also serialize/deserialize (without safe) that are only wrappers around serde.
  • In the JS API the only safe_serialization method is versioned.

Check-list:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features) -> update serialization doc
  • Relevant issues are marked as resolved/closed, related issues are linked in the description
  • Check for breaking changes (including serialization changes) and add them to commit message following the conventional commit specification

@cla-bot cla-bot bot added the cla-signed label Aug 9, 2024
@nsarlin-zama nsarlin-zama force-pushed the ns/safe_serialize_api branch 5 times, most recently from b8bdd99 to 7159326 Compare August 12, 2024 15:07
@nsarlin-zama nsarlin-zama changed the base branch from main to ns/versionize_zk September 25, 2024 16:09
@nsarlin-zama nsarlin-zama force-pushed the ns/safe_serialize_api branch 2 times, most recently from 74d8771 to 456a7aa Compare September 26, 2024 08:13
@nsarlin-zama nsarlin-zama force-pushed the ns/versionize_zk branch 4 times, most recently from a7822c5 to 15c9215 Compare September 27, 2024 12:11
@nsarlin-zama nsarlin-zama marked this pull request as ready for review September 27, 2024 12:50
tfhe/src/high_level_api/integers/signed/tests.rs Outdated Show resolved Hide resolved
tfhe/src/zk.rs Outdated Show resolved Hide resolved
tfhe/src/safe_serialization.rs Outdated Show resolved Hide resolved
tfhe/src/safe_serialization.rs Show resolved Hide resolved
tfhe/src/safe_serialization.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@mayeul-zama mayeul-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

tfhe/src/safe_serialization.rs Show resolved Hide resolved
tfhe/src/safe_serialization.rs Outdated Show resolved Hide resolved
tfhe/src/safe_serialization.rs Show resolved Hide resolved
.serialize_into(&ct, &mut buffer)
.unwrap();

assert!(DeserializationConfig::new(1 << 20)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we think having a default value for the size_limit could be good, and maybe have an impl Default for SerializationConfig that uses that default size_limit ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be a good default value ? We don't know the kind of device where this will run nor the concrete type that is deserialized

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all our example we use 1 << 20, which is roughly 1GB, and I bet most users are just going to copy paste that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you ok if I add this as an issue to not rush this before release ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1<<20 = 1 MB IIRC

1 << 30 = 1GB

@nsarlin-zama nsarlin-zama force-pushed the ns/versionize_zk branch 3 times, most recently from ed3d310 to 274c7ec Compare September 27, 2024 15:39
@zama-bot zama-bot removed the approved label Sep 27, 2024
@zama-bot zama-bot removed the approved label Sep 30, 2024
Base automatically changed from ns/versionize_zk to main September 30, 2024 11:18
BREAKING CHANGES:
- `safe_serialize` and `safe_deserialize` are replaced by
  `SerializationConfig::serialize_into` and
  `DeserializationConfig::deserialize_from`.
- C API: the `XXX_safe_serialize_versioned` is deprecated, `XXX_safe_serialize`
  is now versioned by default
- JS API: the `safe_serialize` method now versionize the data before
  serialization.

This is *NOT* a serialization breaking change for data serialized in previous
versions with `safe_serialize_versioned`.
@zama-bot zama-bot removed the approved label Sep 30, 2024
@nsarlin-zama nsarlin-zama merged commit e9d3e21 into main Sep 30, 2024
77 of 82 checks passed
@nsarlin-zama nsarlin-zama deleted the ns/safe_serialize_api branch September 30, 2024 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants