🔄 synced local 'docs/guide/' with remote 'docs/guide/'
chaokunyang committed Aug 27, 2024
1 parent 5ce5384 commit b1595dc
Showing 4 changed files with 65 additions and 35 deletions.
2 changes: 1 addition & 1 deletion docs/guide/DEVELOPMENT.md
@@ -6,7 +6,7 @@ id: development

## How to build Fury

Please pull latest code from [Github Repository](https://github.com/apache/fury).
Please checkout the source tree from https://github.com/apache/fury.

### Build Fury Java

5 changes: 3 additions & 2 deletions docs/guide/graalvm_guide.md
@@ -10,7 +10,8 @@ GraalVM `native image` can compile java code into native code ahead to build fas
The native image doesn't have a JIT compiler to compile bytecode into machine code, and doesn't support
reflection unless configure reflection file.

Fury runs on GraalVM native image pretty well. Fury generates all serializer code for `Fury JIT framework` and `MethodHandle/LambdaMetafactory` at graalvm build time. Then use those generated code for serialization at runtime without any extra cost, the performance is great.
Fury runs on GraalVM native image pretty well. Fury generates all serializer code for `Fury JIT framework` and `MethodHandle/LambdaMetafactory` at graalvm build time. Then use those generated code for serialization at runtime without
any extra cost, the performance is great.

In order to use Fury on graalvm native image, you must create Fury as an **static** field of a class, and **register** all classes at
the enclosing class initialize time. Then configure `native-image.properties` under
@@ -143,7 +144,7 @@ When Fury compression is enabled:
- Struct: Fury is `24x speed, 31% size` compared to JDK.
- Pojo: Fury is `12x speed, 48% size` compared to JDK.

See [Benchmark.java](https://github.com/apache/fury/blob/main/integration_tests/graalvm_tests/src/main/java/org/apache/fury/graalvm/Benchmark.java) for benchmark code.
See [[Benchmark.java](https://github.com/apache/fury/blob/main/integration_tests/graalvm_tests/src/main/java/org/apache/fury/graalvm/Benchmark.java)] for benchmark code.

### Struct Benchmark

86 changes: 58 additions & 28 deletions docs/guide/java_serialization_guide.md
@@ -11,8 +11,9 @@ graph serialization.

## Quick Start

Note that fury creation is not cheap, the **fury instances should be reused between serializations** instead of creating it everytime.
You should create a global static variable for Fury, or a limited number of Fury instance objects; Fury itself takes up some memory, so don't create tens of thousands of Fury objects!
Note that fury creation is not cheap, the **fury instances should be reused between serializations** instead of creating
it everytime.
You should keep fury to a static global variable, or instance variable of some singleton object or limited objects.

Fury for single-thread usage:

@@ -100,9 +101,9 @@ public class Example {
| `compressInt` | Enables or disables int compression for smaller size. | `true` |
| `compressLong` | Enables or disables long compression for smaller size. | `true` |
| `compressString` | Enables or disables string compression for smaller size. | `true` |
| `classLoader` | The class loader associated with the current Fury. Each Fury is associated with an immutable class loader that caches class metadata. If you need to switch class loaders, use `LoaderBinding` or `ThreadSafeFury` to update them. | `Thread.currentThread().getContextClassLoader()` |
| `classLoader` | The classloader should not be updated; Fury caches class metadata. Use `LoaderBinding` or `ThreadSafeFury` for classloader updates. | `Thread.currentThread().getContextClassLoader()` |
| `compatibleMode` | Type forward/backward compatibility config. Also Related to `checkClassVersion` config. `SCHEMA_CONSISTENT`: Class schema must be consistent between serialization peer and deserialization peer. `COMPATIBLE`: Class schema can be different between serialization peer and deserialization peer. They can add/delete fields independently. | `CompatibleMode.SCHEMA_CONSISTENT` |
| `checkClassVersion` | Determines whether to check for class schema consistency. If enabled, Fury will write `classVersionHash` and check for type consistency based on it. It will be automatically disabled when `CompatibleMode#COMPATIBLE` is enabled. Disabling is not recommended unless you can ensure the class won't evolve. | `false` |
| `checkClassVersion` | Determines whether to check the consistency of the class schema. If enabled, Fury checks, writes, and checks consistency using the `classVersionHash`. It will be automatically disabled when `CompatibleMode#COMPATIBLE` is enabled. Disabling is not recommended unless you can ensure the class won't evolve. | `false` |
| `checkJdkClassSerializable` | Enables or disables checking of `Serializable` interface for classes under `java.*`. If a class under `java.*` is not `Serializable`, Fury will throw an `UnsupportedOperationException`. | `true` |
| `registerGuavaTypes` | Whether to pre-register Guava types such as `RegularImmutableMap`/`RegularImmutableList`. These types are not public API, but seem pretty stable. | `true` |
| `requireClassRegistration` | Disabling may allow unknown classes to be deserialized, potentially causing security risks. | `true` |
@@ -162,7 +163,7 @@ ThreadSafeFury fury=Fury.builder()
System.out.println(fury.deserialize(bytes));
```

### Configure Fury to generate smaller serialization volumes
### Smaller size

`FuryBuilder#withIntCompressed`/`FuryBuilder#withLongCompressed` can be used to compress int/long for smaller size.
Normally compress int is enough.
@@ -187,7 +188,10 @@ For long compression, fury support two encoding:
numbers.

If a number are `long` type, it can't be represented by smaller bytes mostly, the compression won't get good enough
result, not worthy compared to performance cost. Maybe you should try to disable long compression if you find it didn't bring much space savings.
result,
not worthy compared to performance cost. Maybe you should try to disable long compression if you find it didn't bring
much
space savings.

### Object deep copy

@@ -214,9 +218,11 @@ Fury fury=Fury.builder()

### Implement a customized serializer

In some cases, you may want to implement a serializer for your type, especially for classes customized serialization by JDK serialization. JDK serialization is very inefficient in performance and space. You can write your own Serializer and configure Fury to used it for serialization instead.

For example, if you don't want belowing `Foo#writeObject` to be called, you can implement and register `FooSerializer`:
In some cases, you may want to implement a serializer for your type, especially some class customize serialization by
JDK
writeObject/writeReplace/readObject/readResolve, which is very inefficient. For example, you don't want
following `Foo#writeObject`
got invoked, you can take following `FooSerializer` as an example:

```java
class Foo {
@@ -256,17 +262,20 @@ Fury fury=getFury();

### Security & Class Registration

`FuryBuilder#requireClassRegistration` can be used to disable class registration, this will allow to deserialize objects unknown types, more flexible but ****if the class contains malicious code, a security vulnerability can occur**.**.l

**Do not disable class registration checking unless you can ensure the security of your runtime environment and external interactions**.
`FuryBuilder#requireClassRegistration` can be used to disable class registration, this will allow to deserialize objects
unknown types,
more flexible but **may be insecure if the classes contains malicious code**.

Malicious code in `init/equals/hashCode` can be executed when deserializing unknown/untrusted types when this option is disabled.
**Do not disable class registration unless you can ensure your environment is secure**.
Malicious code in `init/equals/hashCode` can be executed when deserializing unknown/untrusted types when this option
disabled.

Class registration can not only reduce security risks, but also avoid classname serialization cost.

You can register class with API `Fury#register`.

> Note that class registration order is important, serialization and deserialization peer should have same registration order.
Note that class registration order is important, serialization and deserialization peer
should have same registration order.
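
Why the order matters can be sketched with a toy registry. This is not Fury's implementation: `ToyClassRegistry` and its methods are hypothetical names, and the sketch assumes only that ids are handed out sequentially as classes are registered, so the same class ends up with different ids when two peers register in a different order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only, NOT Fury's implementation: ids are assigned in
// registration order, which is the property that makes the order matter.
final class ToyClassRegistry {
    private final Map<Class<?>, Integer> idsByClass = new LinkedHashMap<>();

    // Each newly registered class gets the next sequential id.
    void register(Class<?> cls) {
        idsByClass.putIfAbsent(cls, idsByClass.size());
    }

    // Returns the id that would be written instead of the class name.
    int idOf(Class<?> cls) {
        Integer id = idsByClass.get(cls);
        if (id == null) {
            throw new IllegalStateException("Class not registered: " + cls.getName());
        }
        return id;
    }

    public static void main(String[] args) {
        ToyClassRegistry writer = new ToyClassRegistry();
        writer.register(String.class);
        writer.register(Integer.class);

        ToyClassRegistry reader = new ToyClassRegistry();
        reader.register(Integer.class); // registered in a different order
        reader.register(String.class);

        // The two peers disagree on the id of String, so this prints "false".
        System.out.println(writer.idOf(String.class) == reader.idOf(String.class));
    }
}
```

In this model a reader that registered the classes in a different order would resolve the writer's id for `String` to `Integer`, which is the kind of mismatch the note above warns about.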

```java
Fury fury=xxx;
@@ -388,11 +397,15 @@ losing any information.
If metadata sharing is not enabled, the new class data will be skipped and an `NonexistentSkipClass` stub object will be
returned.

## Serialization Library Migration
## Migration

### JDK migration

If you use JDK serialization before, and you can't upgrade your client and server at the same time, which is common for online application. Fury provided an util method `org.apache.fury.serializer.JavaSerializer.serializedByJDK` to check whether the binary are generated by jdk serialization, you use following pattern to make existing serialization protocol-aware, then upgrade serialization to fury in a rolling-up way:
If you use jdk serialization before, and you can't upgrade your client and server at the same time, which is common for
online application. Fury provided an util method `org.apache.fury.serializer.JavaSerializer.serializedByJDK` to check
whether
the binary are generated by jdk serialization, you use following pattern to make exiting serialization protocol-aware,
then upgrade serialization to fury in an async rolling-up way:

```java
if(JavaSerializer.serializedByJDK(bytes)){
@@ -405,11 +418,16 @@ if(JavaSerializer.serializedByJDK(bytes)){
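
As background for the protocol-aware check, the sketch below shows one way to recognise JDK-serialized bytes with the JDK alone. JDK serialization streams begin with `ObjectStreamConstants.STREAM_MAGIC` (`0xACED`) followed by `STREAM_VERSION`. `JdkStreamSniffer` is a hypothetical helper written for illustration; it is not claimed to be how `JavaSerializer.serializedByJDK` is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of Fury.
final class JdkStreamSniffer {

    // True if the bytes start with the JDK serialization stream header:
    // a 2-byte magic (0xACED) followed by a 2-byte version (0x0005).
    static boolean looksLikeJdkStream(byte[] bytes) {
        if (bytes == null || bytes.length < 4) {
            return false;
        }
        int magic = ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
        int version = ((bytes[2] & 0xFF) << 8) | (bytes[3] & 0xFF);
        return magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)
            && version == ObjectStreamConstants.STREAM_VERSION;
    }

    // Serializes an object with plain JDK serialization.
    static byte[] jdkSerialize(Object value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] legacy = jdkSerialize("hello");
        byte[] other = {1, 2, 3, 4};
        System.out.println(looksLikeJdkStream(legacy)); // prints "true"
        System.out.println(looksLikeJdkStream(other));  // prints "false"
    }
}
```

A receiver can branch on such a check to keep reading old payloads with `ObjectInputStream` while new payloads go through Fury, which is what allows the rolling upgrade described above.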

### Upgrade fury

Currently binary compatibility is ensured for minor versions only. For example, if you are using fury`v0.2.0`, binary compatibility will be provided if you upgrade to fury `v0.2.1`. But if you upgrade fury to `v0.4.1`, no binary compatibility are ensured.
Currently binary compatibility is ensured for minor versions only. For example, if you are using fury`v0.2.0`, binary
compatibility will
be provided if you upgrade to fury `v0.2.1`. But if upgrade to fury `v0.4.1`, no binary compatibility are ensured.
Most of the time there is no need to upgrade fury to newer major version, the current version is fast and compact
enough, and we provide some minor fix for recent older versions.
enough,
and we provide some minor fix for recent older versions.

But if you do want to upgrade fury for better performance and smaller size, you need to write fury version as header to serialized data using code like following to keep binary compatibility:
But if you do want to upgrade fury for better performance and smaller size, you need to write fury version as header to
serialized data
using code like following to keep binary compatibility:

```java
MemoryBuffer buffer=xxx;
@@ -426,25 +444,37 @@ MemoryBuffer buffer=xxx;
fury.deserialize(buffer);
```
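
The idea of writing a version header in front of the serialized data can be sketched with the JDK alone. The layout below (one byte for the major version, one byte for the minor version, then the payload) and the `VersionedFrame` class are assumptions made for this illustration; the guide does not prescribe a header format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical framing helper: [major][minor][payload...].
// The layout is an assumption for illustration, not a Fury format.
final class VersionedFrame {

    // Prepends a two-byte version header to the payload.
    static byte[] wrap(int major, int minor, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(2 + payload.length);
        buf.put((byte) major).put((byte) minor).put(payload);
        return buf.array();
    }

    static int major(byte[] frame) {
        return frame[0] & 0xFF;
    }

    static int minor(byte[] frame) {
        return frame[1] & 0xFF;
    }

    // Returns the bytes after the header, ready for the matching deserializer.
    static byte[] payload(byte[] frame) {
        return Arrays.copyOfRange(frame, 2, frame.length);
    }

    public static void main(String[] args) {
        byte[] data = "serialized-bytes".getBytes(StandardCharsets.UTF_8);
        byte[] frame = wrap(0, 4, data);

        // The reader inspects the header first, then picks a decoder by version.
        System.out.println(major(frame) + "." + minor(frame)); // prints "0.4"
        System.out.println(Arrays.equals(data, payload(frame))); // prints "true"
    }
}
```

On the reading side, the header is consumed first and the remaining bytes are handed to the serializer instance that matches the recorded version.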

`getFury` is the way to load the corresponding version of Fury. You can use the maven shade plugin to shade different versions of Fury and relocate them under different packages, so that you can load different versions of Fury under different paths.
`getFury` is a method to load corresponding fury, you can shade and relocate different version of fury to different
package, and load fury by version.

If you upgrade fury by minor version, or you won't have data serialized by older fury, you can upgrade fury directly, no need to `versioning` the data.
If you upgrade fury by minor version, or you won't have data serialized by older fury, you can upgrade fury directly,
no need to `versioning` the data.

## Troubleshooting Common Issues
## Trouble shooting

### Class inconsistency and class version check

If you create fury without setting `CompatibleMode` to `org.apache.fury.config.CompatibleMode.COMPATIBLE`, and you got a strange serialization error, it may be caused by class inconsistency between serialization peer and deserialization peer.
If you create fury without setting `CompatibleMode` to `org.apache.fury.config.CompatibleMode.COMPATIBLE`, and you got a
strange
serialization error, it may be caused by class inconsistency between serialization peer and deserialization peer.

In such cases, you can invoke `FuryBuilder#withClassVersionCheck` to create fury to validate it, if deserialization throws `org.apache.fury.exception.ClassNotCompatibleException`, it shows class are inconsistent, and you should create fury with `FuryBuilder#withCompaibleMode(CompatibleMode.COMPATIBLE)`.
In such cases, you can invoke `FuryBuilder#withClassVersionCheck` to create fury to validate it, if deserialization
throws `org.apache.fury.exception.ClassNotCompatibleException`, it shows class are inconsistent, and you should create
fury with
`FuryBuilder#withCompaibleMode(CompatibleMode.COMPATIBLE)`.

`CompatibleMode.COMPATIBLE` has more performance and space cost, do not set it by default if your classes are always consistent between serialization and deserialization.
`CompatibleMode.COMPATIBLE` has more performance and space cost, do not set it by default if your classes are always
consistent between serialization and deserialization.

### Use wrong API for deserialization

If you serialize an object by invoking `Fury#serialize`, you should invoke `Fury#deserialize` for deserialization
instead of `Fury#deserializeJavaObject`.
instead of
`Fury#deserializeJavaObject`.

If you serialize an object by invoking `Fury#serializeJavaObject`, you should invoke `Fury#deserializeJavaObject` for deserialization instead of `Fury#deserializeJavaObjectAndClass`/`Fury#deserialize`.
If you serialize an object by invoking `Fury#serializeJavaObject`, you should invoke `Fury#deserializeJavaObject` for
deserialization instead of `Fury#deserializeJavaObjectAndClass`/`Fury#deserialize`.

If you serialize an object by invoking `Fury#serializeJavaObjectAndClass`, you should invoke `Fury#deserializeJavaObjectAndClass` for deserialization instead of `Fury#deserializeJavaObject`/`Fury#deserialize`.
If you serialize an object by invoking `Fury#serializeJavaObjectAndClass`, you should
invoke `Fury#deserializeJavaObjectAndClass` for deserialization instead
of `Fury#deserializeJavaObject`/`Fury#deserialize`.
7 changes: 3 additions & 4 deletions docs/guide/scala_guide.md
@@ -40,12 +40,11 @@ fury.register(Class.forName("scala.Enumeration.Val"))
```

If you want to avoid such registration, you can disable class registration by `FuryBuilder#requireClassRegistration(false)`.
Note that this option allow to deserialize objects unknown types, more flexible but may be insecure if the classes contains malicious code.

> Note that this option allow to deserialize objects unknown types, more flexible but may be insecure if the classes contains malicious code.
And circular references are common in scala, `Reference tracking` should be enabled by `FuryBuilder#withRefTracking(true)`. If you don't enable reference tracking, [StackOverflowError](https://github.com/apache/fury/issues/1032) may happen for some scala versions when serializing scala Enumeration.

And circular references are common in scala, `Reference tracking` should be enabled by `FuryBuilder#withRefTracking(true)`. If you don't enable `Reference tracking`, [StackOverflowError](https://github.com/apache/fury/issues/1032) may happen for some scala versions when serializing scala Enumeration.

> Note that fury instance should be shared between multiple serialization, the creation of fury instance is not cheap.
Note that fury instance should be shared between multiple serialization, the creation of fury instance is not cheap.

If you use shared fury instance across multiple threads, you should create `ThreadSafeFury` instead by `FuryBuilder#buildThreadSafeFury()` instead.
