Commit
[server][test] Race condition fix for re-subscription of real-time topic partitions. (#1263)

This improvement addresses the previously reverted #1192. The unit test in the prior fix started separate threads that resubscribed to the same real-time topic partition, resulting in frequent resubscriptions. These frequent resubscriptions caused excessive metric emissions, which led to GC issues and caused other unit tests to fail in CI.

To prevent metric emissions, the unit test now passes a mocked AggKafkaConsumerServiceStats instead of null when instantiating KafkaConsumerService, because KafkaConsumerService creates a real AggKafkaConsumerServiceStats when null is passed.

Previous #1192 commit message:

Recently we introduced a re-subscription feature for different store versions. Leader and follower partitions from different version topics will concurrently experience re-subscription triggered by store version role changes.

Problem: Two StoreIngestionTask threads for different store versions (store version 1 and store version 2) try to re-subscribe (unsub and sub) to the same real-time topic partition concurrently. During a re-subscription triggered by one store version, consumer.unSubscribe, which removes the assignment of the topic partition, and consumerToConsumptionTask.get(consumer).removeDataReceiver(pubSubTopicPartition) are executed one after the other rather than atomically. It is possible that the StoreIngestionTask thread for store version 2 gets the same ConsumptionTask while the DataReceiver inside the ConsumptionTask is still the one from store version 1. As each ConsumptionTask allows only one DataReceiver for a particular real-time topic partition, the DataReceiver from store version 2 cannot be added to this ConsumptionTask.

Solution: Use a SharedKafkaConsumer-level lock to protect KafkaConsumerService for unSubscribe, SetDataReceiver, and unSubscribeAll, guaranteeing that only one real-time partition from a specific store version changes its DataReceiver-related assignment at a time.
Co-authored-by: Hao Xu <xhao@xhao-mn3.linkedin.biz>
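
The locking idea in the message above can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: SharedConsumerSketch, ResubscriptionSketch, the method signatures, and the partition name "store_rt-0" are all hypothetical stand-ins, and the real SharedKafkaConsumer and KafkaConsumerService are more involved. The sketch only shows why the assignment change and the DataReceiver change need to share one lock.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a shared consumer plus its consumption task (not the Venice classes).
final class SharedConsumerSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Set<String> assignment = new HashSet<>();            // partitions being polled
  private final Map<String, String> dataReceivers = new HashMap<>(); // partition -> owning store version
  private int staleReceiverSightings = 0;                            // guarded by lock

  // Assigns the partition and registers the receiver as one atomic step.
  // Returns false when the partition is already held.
  boolean subscribe(String partition, String storeVersion) {
    lock.lock();
    try {
      if (assignment.contains(partition)) {
        return false;
      }
      if (dataReceivers.containsKey(partition)) {
        // Unassigned partition with a leftover receiver: the state described in the Problem paragraph.
        staleReceiverSightings++;
        return false;
      }
      assignment.add(partition);
      dataReceivers.put(partition, storeVersion);
      return true;
    } finally {
      lock.unlock();
    }
  }

  // Drops the assignment and the receiver under the same lock, so no other thread can
  // see the partition as unassigned while the old receiver is still registered.
  void unSubscribe(String partition, String storeVersion) {
    lock.lock();
    try {
      if (storeVersion.equals(dataReceivers.get(partition))) {
        assignment.remove(partition);
        dataReceivers.remove(partition);
      }
    } finally {
      lock.unlock();
    }
  }

  int staleReceiverSightings() {
    lock.lock();
    try {
      return staleReceiverSightings;
    } finally {
      lock.unlock();
    }
  }
}

final class ResubscriptionSketch {
  // Runs two threads, one per store version, that repeatedly re-subscribe to the same
  // partition, and returns how often a stale receiver was observed.
  static int run(int iterations) {
    SharedConsumerSketch consumer = new SharedConsumerSketch();
    String partition = "store_rt-0"; // hypothetical real-time topic partition name
    Thread v1 = new Thread(() -> resubscribeLoop(consumer, partition, "v1", iterations));
    Thread v2 = new Thread(() -> resubscribeLoop(consumer, partition, "v2", iterations));
    v1.start();
    v2.start();
    try {
      v1.join();
      v2.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for worker threads", e);
    }
    return consumer.staleReceiverSightings();
  }

  private static void resubscribeLoop(
      SharedConsumerSketch consumer, String partition, String storeVersion, int iterations) {
    for (int i = 0; i < iterations; i++) {
      if (consumer.subscribe(partition, storeVersion)) {
        consumer.unSubscribe(partition, storeVersion);
      }
    }
  }
}
```

Calling ResubscriptionSketch.run(10000) exercises the concurrent path. Because both state changes happen inside a single critical section, the stale-receiver branch should never be reached. If the assignment removal and the receiver removal were instead done in two separately locked steps, a second thread could slip in between them and hit that branch.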
Showing 3 changed files with 236 additions and 33 deletions.