diff --git a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/io/writer/impl/VarByteChunkForwardIndexWriterV5.java b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/io/writer/impl/VarByteChunkForwardIndexWriterV5.java
index 968891089b8..a9f6fad975b 100644
--- a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/io/writer/impl/VarByteChunkForwardIndexWriterV5.java
+++ b/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/io/writer/impl/VarByteChunkForwardIndexWriterV5.java
@@ -26,14 +26,54 @@
 /**
- * Forward index writer that extends {@link VarByteChunkForwardIndexWriterV4} with the only difference being the
- * version tag is now bumped from 4 to 5.
+ * Forward index writer that extends {@link VarByteChunkForwardIndexWriterV4} and overrides the data layout for
+ * multi-value fixed byte operations to improve space efficiency.
  *
- *

The {@code VERSION} tag is a {@code static final} class variable set to {@code 5}. Since static variables - * are shadowed in the child class thus associated with the class that defines them, care must be taken to ensure - * that the parent class can correctly observe the child class's {@code VERSION} value at runtime.

+ *

Consider the following multi-value document as an example: {@code [int(1), int(2), int(3)]}. + * The current binary data layout in {@code VarByteChunkForwardIndexWriterV4} is as follows:

+ *
+ *     0x00000010 0x00000003 0x00000001 0x00000002 0x00000003
+ * 
* - *

To achieve this, the {@code getVersion()} method is overridden to return the concrete subclass's + *

    + *
  1. The first 4 bytes ({@code 0x00000010}) represent the total payload length of the byte array + * containing the multi-value document content, which in this case is 16 bytes. + * + *
  2. The next 4 bytes ({@code 0x00000003}) represent the number of elements in the multi-value document (i.e., 3) + * . + * + *
  3. The remaining 12 bytes ({@code 0x00000001 0x00000002 0x00000003}) represent the 3 integer values of the + * multi-value document: 1, 2, and 3. + *
+ * + *

In Pinot, the fixed byte raw forward index can only store one specific fixed-length data type: + * {@code int}, {@code long}, {@code float}, or {@code double}. Instead of explicitly storing the number of elements + * for each multi-value document, this value can be inferred by:

+ *
+ *     number of elements = buffer payload length / size of data type
+ * 
+ * + *

If the forward index uses the passthrough chunk compression type (i.e., no compression), we can save + * 4 bytes per document by omitting the explicit element count. This leads to the following space savings:

+ * + * + * + *

For forward indexes that use compression to reduce data size, the savings can be even more significant + * in certain cases. This is demonstrated in the unit test {@link VarByteChunkV5Test#validateCompressionRatioIncrease}, + * where ZStandard was used as the chunk compressor. In the test, 1 million short multi-value (MV) documents + * were inserted, following a Gaussian distribution for document lengths. Additionally, the values of each integer + * in the MV documents were somewhat repetitive. Under these conditions, we observed a 50%+ reduction in on-disk + * file size compared to the V4 forward index writer version.

+ * + *

Note that the {@code VERSION} tag is a {@code static final} class variable set to {@code 5}. Since static + * variables are shadowed in the child class and are thus associated with the class that defines them, care must be taken to + * ensure that the parent class can correctly observe the child class's {@code VERSION} value at runtime. To handle + * this cleanly and correctly, the {@code getVersion()} method is overridden to return the concrete subclass's * {@code VERSION} value, ensuring that the correct version number is returned even when using a reference * to the parent class.

*