-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: decode in the case of field missing in both writer binary and reader struct. #452
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…missing_struct_field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Hello, and again, thank you so much for this library Sorry if there is something that I'm totally missing, but is there a way to set the func TestLeaveMeExampleOfConvertToAvroWithOldSchemaThenDecodeWithNew(t *testing.T) {
schema, err := avro.Parse(`{
"type": "record",
"name": "simple",
"namespace": "org.hamba.avro",
"fields" : [
{"name": "a", "type": "int"},
{"name": "b", "type": "string"},
{"name": "c", "type": "string"},
{"name": "d", "type": "string"}
]
}`)
require.NoError(t, err)
// encode
// Need to add and remove magic bytes before and after the wire
buf := bytes.NewBuffer([]byte{})
enc := avro.NewEncoderForSchema(schema, buf)
obj2 := map[string]any{
"a": 3,
"b": "John",
"c": "Jill",
"d": "Jack",
}
err = enc.Encode(obj2)
require.NoError(t, err)
data := buf.Bytes()
fmt.Println(data)
// Decode
newSchema, err := avro.Parse(`{
"type": "record",
"name": "simple",
"namespace": "org.hamba.avro",
"fields" : [
{"name": "a", "type": "int"},
{"name": "b", "type": "string"},
{"name": "c", "type": "string"},
{"name": "d", "type": "string"},
{"name": "e","type": ["null","string"],"default": null}
]
}`)
require.NoError(t, err)
decoder := avro.NewDecoderForSchema(newSchema, buf)
stringToInterfaceMap := map[string]any{}
err = decoder.Decode(&stringToInterfaceMap)
require.NoError(t, err)
fmt.Println(stringToInterfaceMap)
expected := map[string]any{
"a": 3,
"b": "John",
"c": "Jill",
"d": "Jack",
"e": nil,
}
assert.EqualValues(t, expected, stringToInterfaceMap)
} If I create compositeSchema using SchemaCompatibility I do get a schema with the field "e" having an Action of set_to_default which is then able to correctly decode. schemaCompatibility := avro.NewSchemaCompatibility()
compositeSchema, err := schemaCompatibility.Resolve(newSchema, schema)
assert.NoError(t, err) However in practice having to use NewSchemaCompatibility isn't ideal as I'd like to just always pull the latest schema and decode to the default for new fields that were not in previous versions. The only work-around I can think of is I'll always need to pull the first and last version of a schema to create the composite schema. Again sorry if I'm missing something super basic. Thank You!! |
What you want does not work. To have differing schemas between read and write, you must use a composite. Avro and the spec assume that the byte data is complete, as there is nothing in the data to tell it otherwise. |
Thank you for your response. I was curious how I hadn't come across this yet so I reverted back to v2.18.0 and the above test does work. Just pointing out that this is a breaking change. Thanks Again. |
I think I overlooked an edge case the last time I worked on schema evolution:
If a field is missing in both the binary and the reader struct, it should be ignored rather than calling skipDecoder (which technically performs a read and moves the cursor).
The current implementation might perform a dirty read, causing errors like: