Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server][common] Allow store config repo to fetch configs only when needed #1204

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

adamxchen
Copy link
Collaborator

@adamxchen adamxchen commented Sep 27, 2024

Summary

#1132 introduced additional ZK watch. The implementation of HelixReadOnlyStoreConfigRepository reads all store configs, regardless of which cluster, this creates an issue since it would increase the additional ZK watch by store number x host name. In this PR, refresh call is removed since we don't need all configs and add some logics to lazy fetch the store config and cache.

How was this PR tested?

new unit test

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@adamxchen adamxchen changed the title [common] Allow store config repo to fetch configs only when needed [server][common] Allow store config repo to fetch configs only when needed Sep 27, 2024
Copy link
Contributor

@gaojieliu gaojieliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking the following refactoring strategy, which I think will be safer:

  1. Keep the existing refresh calls as today.
  2. HelixReadOnlyStoreConfigRepository#refresh will place a child-change listener as today, but it will only get a list of store names, not store configs.
  3. In the child-change listener, whenever there is a child change, it will only mutate the maintained store list and of coz, for deleted stores, we will remove the corresponding entry from the local store config map and corresponding zk watch.
  4. We will only place a zk watch on store config node in lazy fetch function.
  5. In HelixReadOnlyStoreConfigRepository#getStoreConfig function, if the requested store doesn't in the store list, return Optional.empty() directly, and if it is present, fetch the store config on the fly and place the zk watch to monitor data change.
  6. HelixReadOnlyStoreConfigRepository would maintain a storeConfigMap, but this map will be lazily populated by HelixReadOnlyStoreConfigRepository#getStoreConfig function.

WDYT?

@@ -147,7 +147,6 @@ private void initServerStoreAndSchemaRepository() {
adapter,
clusterConfig.getRefreshAttemptsForZkReconnect(),
clusterConfig.getRefreshIntervalForZkReconnectInMs());
storeConfigRepo.refresh();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should still call refresh to get a list of stores, and please see my following comments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great suggestions! The core idea is only place watch when there's a need to get store config, and for other stores, we just cache them to know whether they are present or not. Will update it

accessor.subscribeStoreConfigDataChangedListener(veniceStoreName, storeConfigChangedListener);
}
}

if (config != null) {
return Optional.of(config.cloneStoreConfig());
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a branch for config being null. Shall we put the new logic here or remove it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the new logic, the config can still be null or not-null so this branch still applies. We may use ternary operator to make it look nice tho

config = accessor.getStoreConfig(veniceStoreName);
if (config != null) {
storeConfigMap.get().put(config.getStoreName(), config);
accessor.subscribeStoreConfigDataChangedListener(veniceStoreName, storeConfigChangedListener);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can unsubscribe when the store is deleted from this cluster; is it feasible with the current implementation? Does it know if the store is deleted?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, we will need the StoreConfigAddedOrDeletedChangedListener for this and that's only used in refresh.. Let me see if I can incorporate Gaojie's suggestion to refactor this into next revision

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants