Handle `Array(Bytes)` encoding for query params containing non-Unicode text #290
Ran into #267 tonight trying to bulk-insert MessagePack-encoded data (via `unnest`) and can confirm that it's `Array(Bytes)` that it's choking on. Since all arrays are text-encoded, we need to ensure text encoding for `Bytes` can handle binary data.

I tried special-casing `Array(Bytes)` all the way down, but not only does that require a lot more understanding of the binary encoding format than I currently have, it would've been an incomplete solution that only handles linear `bytea` arrays. Anyone using nested arrays would run into the same problem. So I decided against that.

I don't like that this solution increases the payload size over the wire by encoding every byte as 2 hexadecimal characters. Arrays of strings are implemented in terms of this method, so this impacts `text[]` encoding, but it doesn't impact encoding of strings outside of arrays, so performance impact should be minimal for most use cases. And every other PQ implementation I've found so far, other than `libpq` itself, also uses the hex encoding, so it seems that if we're slow we're at least in good company. I think my ideal solution would be to convert all param types to binary encoding, but this was a far easier step and I couldn't find the specification for the binary encoding and it's late and I'm sleeeeeeepy.

Please check my work on this. I did confirm that the test case deserializes into the expected string:
… and my MessagePack-encoded data unpacks correctly after bulk-inserting it now, but I'm not confident there isn't something I've overlooked.
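
For reviewers, here is a rough Python sketch of the text encoding described above. It is not the code in this PR: the function names `encode_bytea_hex` and `encode_array` are hypothetical, and it assumes PostgreSQL's `bytea` hex format (`\x` followed by two hex digits per byte) placed inside a quoted array-literal element, where the backslash has to be escaped again.

```python
def encode_bytea_hex(value: bytes) -> str:
    # bytea hex format: a literal backslash, an "x", then two hex digits per byte.
    return "\\x" + value.hex()


def encode_array(items) -> str:
    # Hypothetical helper: builds a PostgreSQL array literal in text format.
    # Nested lists become nested {...} groups, so multi-dimensional arrays
    # go through the same element encoding as one-dimensional ones.
    parts = []
    for item in items:
        if item is None:
            parts.append("NULL")
        elif isinstance(item, (list, tuple)):
            parts.append(encode_array(item))
        else:
            text = encode_bytea_hex(item)
            # Inside a quoted array element, backslashes and double quotes
            # must themselves be backslash-escaped.
            escaped = text.replace("\\", "\\\\").replace('"', '\\"')
            parts.append('"' + escaped + '"')
    return "{" + ",".join(parts) + "}"


if __name__ == "__main__":
    payload = b"\x00\xff\xde\xad"  # bytes that are not valid UTF-8
    print(encode_bytea_hex(payload))
    print(encode_array([payload, b"hi"]))
    print(encode_array([[b"\x01"], [b"\x02"]]))
```

The size cost mentioned above is visible here: each encoded element is two characters per input byte, plus the `\x` prefix, quoting, and escaping.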
Fixes #267