Commit
Add estimateSerializedSize to BatchVectorSerializer (facebookincubator#8712)

Summary:
Pull Request resolved: facebookincubator#8712

This change adds an estimateSerializedSize function to the BatchVectorSerializer interface.

In DefaultBatchVectorSerializer this simply defers to the existing estimateSerializedSize in VectorSerde. However, because PrestoBatchVectorSerializer preserves encodings, its implementation needs to take them into account.

PrestoBatchVectorSerializer shares a lot of code with PrestoVectorSerde's estimateSerializedSize, but adds the following:
* estimateConstantSerializedSize: this sets the size to the size of the single constant value, regardless of the width of the range.
* estimateDictionarySerializedSize: this sets the size to the aggregate size of the indices plus the size of the selected entries in the dictionary.
* estimateSerializedSizeImpl: like estimateSerializedSizeInt in PrestoVectorSerde, this drives the estimation, but now calls the new functions above.

I found a few bugs in the existing estimation logic, which I fixed.

Note that we inherit a number of inaccuracies from PrestoVectorSerde's estimateSerializedSize. These are mostly constant offsets, except that they are multiplied when they occur in the elements of a complex type (it really is just an estimate). The estimate also doesn't account for the fact that the serializer may disable dictionary encoding when it doesn't provide value. I plan to add this in a follow-up to keep this change from becoming more complicated.

Reviewed By: bikramSingh91

Differential Revision: D53593847

fbshipit-source-id: 639d01437797e07c378bc837e0a0283c7d08343d
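
The encoding-aware estimates described above can be illustrated with a minimal, self-contained sketch. It does not use Velox types or reproduce the actual implementation: Range, Encoding, Column, and the free functions below are hypothetical stand-ins, values are assumed to be fixed-width int64_t, and counting each selected dictionary entry once is an assumption about what "the selected entries" means.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Velox types.
struct Range {
  int32_t begin;
  int32_t size;
};

enum class Encoding { kFlat, kConstant, kDictionary };

struct Column {
  Encoding encoding;
  std::vector<int32_t> indices; // Used only when encoding == kDictionary.
};

// Flat: every row in the range contributes one fixed-width value.
size_t estimateFlat(const Range& range) {
  return static_cast<size_t>(range.size) * sizeof(int64_t);
}

// Constant: one value, regardless of how wide the range is.
size_t estimateConstant(const Range& /*range*/) {
  return sizeof(int64_t);
}

// Dictionary: one index per row, plus each selected dictionary entry.
// Assumption: an entry referenced by several rows is counted once.
size_t estimateDictionary(
    const std::vector<int32_t>& indices,
    const Range& range) {
  assert(range.begin >= 0 && range.size >= 0);
  assert(static_cast<size_t>(range.begin) + range.size <= indices.size());
  std::unordered_set<int32_t> selected;
  for (int32_t row = range.begin; row < range.begin + range.size; ++row) {
    selected.insert(indices[row]);
  }
  return static_cast<size_t>(range.size) * sizeof(int32_t) +
      selected.size() * sizeof(int64_t);
}

// Driver that dispatches on the encoding instead of flattening first.
size_t estimate(const Column& column, const Range& range) {
  switch (column.encoding) {
    case Encoding::kConstant:
      return estimateConstant(range);
    case Encoding::kDictionary:
      return estimateDictionary(column.indices, range);
    case Encoding::kFlat:
    default:
      return estimateFlat(range);
  }
}
```

Under these assumptions, a constant column costs the same for a 1-row range as for a 100-row range, while a dictionary column costs less than its flat equivalent only when the rows in the range reference few distinct entries.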