[compat][da-vinci][test] Global RT DIV improvement (part 1): RT and VT DIV se… #1179
…paration

Problem

Data Integrity Validator (DIV) is used for validating and discovering data issues residing in Kafka topics as they are consumed. However, there is no shared global view of the RealTime (RT) topics among all leaders: DIV states for RT topics are scattered among all past leaders, and there is no way to pass RT DIV state from a leader to its successors. As a result, DIV results can be inaccurate, especially during leadership transitions.

At a high level, we propose to replicate the leader DIV states into the local VT periodically or on demand. Followers consume these leader DIV states and sync themselves up to the leader state along the way.

As the first part of the implementation of the global DIV improvement, this PR contains the following:
1. Introduce a new flag for the global RT DIV feature.
2. Introduce a new realtimeTopicProducerStates field in PartitionState in the local RocksDB checkpoint.
3. Divide today's complete DIV into two groups, VT DIV and RT DIV, so that each group can be updated separately based on which topic the Kafka records come from (without the feature enabled, only VT DIV is updated).
4. Add a test to verify the read/write of an OffsetRecord with the newly added field.
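To illustrate the VT/RT split described above, here is a minimal sketch. It is not the code in this PR: the class `SplitDivSketch`, the `TopicType` enum, the `globalRtDivEnabled` constructor argument, and the simplified "last sequence number per producer" state are all hypothetical names and simplifications chosen for illustration. It also assumes one reading of "without feature enabled, only VT DIV is updated", namely that every record updates the VT group when the flag is off.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of keeping VT and RT DIV state in separate groups. */
final class SplitDivSketch {
  /** Which kind of topic a consumed record came from (illustrative enum). */
  enum TopicType { VERSION_TOPIC, REALTIME_TOPIC }

  // Hypothetical stand-in for the new feature flag.
  private final boolean globalRtDivEnabled;

  // Simplified producer state: producer id -> last seen sequence number.
  private final Map<String, Integer> vtProducerStates = new HashMap<>();
  private final Map<String, Integer> rtProducerStates = new HashMap<>();

  SplitDivSketch(boolean globalRtDivEnabled) {
    this.globalRtDivEnabled = globalRtDivEnabled;
  }

  /** Picks the state group to update; assumes everything goes to VT when the flag is off. */
  private Map<String, Integer> selectStates(TopicType type) {
    return (globalRtDivEnabled && type == TopicType.REALTIME_TOPIC)
        ? rtProducerStates
        : vtProducerStates;
  }

  /**
   * Records the sequence number in the selected group and returns true if it
   * is the first record from this producer or directly follows the previous one.
   */
  boolean validate(TopicType type, String producerId, int sequenceNumber) {
    Map<String, Integer> states = selectStates(type);
    Integer previous = states.get(producerId);
    boolean inOrder = previous == null || sequenceNumber == previous + 1;
    states.put(producerId, sequenceNumber);
    return inOrder;
  }

  /** Returns the last sequence number stored in the given group, or null if none. */
  Integer lastSequence(TopicType group, String producerId) {
    return group == TopicType.REALTIME_TOPIC
        ? rtProducerStates.get(producerId)
        : vtProducerStates.get(producerId);
  }

  public static void main(String[] args) {
    SplitDivSketch div = new SplitDivSketch(true);
    div.validate(TopicType.REALTIME_TOPIC, "producer-a", 0);
    System.out.println("RT state: " + div.lastSequence(TopicType.REALTIME_TOPIC, "producer-a"));
    System.out.println("VT state: " + div.lastSequence(TopicType.VERSION_TOPIC, "producer-a"));
  }
}
```

Keeping the two groups in separate maps means an RT record never overwrites VT state for the same producer, which is the property the separation is meant to provide; the real validator tracks richer per-producer state than a single sequence number.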
...ts/da-vinci-client/src/main/java/com/linkedin/davinci/kafka/consumer/StoreIngestionTask.java
If the SIT is not ingesting the leader replica, will the SIT load the RT DIV or not?
Also, when the replica transitions from leader to follower, will the SIT offload the RT DIV from memory and reload it when necessary?
Maybe this is outside the scope of this PR, but some clarification would be helpful.
Will update the description to address this comment.
Thanks!
…paration
Problem:
Data Integrity Validator (DIV) is used for validating and discovering data issues residing in Kafka topics as they are consumed. However, there is no shared global view of the RealTime (RT) topics among all leaders: DIV states for RT topics are scattered among all past leaders, and there is no way to pass RT DIV state from a leader to its successors. As a result, DIV results can be inaccurate, especially during leadership transitions.
At a high level, we propose to replicate the leader DIV states into the local VT periodically or on demand. Followers consume these leader DIV states and sync themselves up to the leader state along the way.
This PR:
As the first part of the implementation for global DIV improvement, this PR contains the following:
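1. Introduce a new flag for the global RT DIV feature.
2. Introduce a new realtimeTopicProducerStates field in PartitionState in the local RocksDB checkpoint.
3. Divide today's complete DIV into two groups, VT DIV and RT DIV, so that each group can be updated separately based on which topic the Kafka records come from (without the feature enabled, only VT DIV is updated).
4. Add a test to verify the read/write of an OffsetRecord with the newly added field.
Future changes: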
After this PR, as the next step, we will start to implement the following:
How was this PR tested?
CI (with or without feature enabled)
Does this PR introduce any user-facing changes?