Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[venice-server] Dropping unassigned partitions #1196

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

kristyelee
Copy link

Summary, imperative, start upper case, don't end with a period

The goal is to remove unassigned partitions that are on disk state

This involves storage service and engine, which upon starting (constructed), will go through folders on disk and verify status of folders
Helix's ideal state is used for consulting partitions and their assignment
The change proposed includes defining a function in VeniceServer that goes through partitions in idealState and compares to partitions in storageEngine
StorageEngine object initialization is also being updated to run the new function that performs this verification.

Resolves #650

How was this PR tested?

Will write corresponding test.

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@kristyelee kristyelee marked this pull request as draft September 24, 2024 02:37
@kvargha
Copy link
Contributor

kvargha commented Sep 24, 2024

It looks like spotBugs is failing.

If you run this locally, you'll be able to read the report to see what's wrong.
./gradlew :services:venice-server:spotbugsMain

Make sure to build before running this.

@kristyelee kristyelee force-pushed the kristy_lee/650 branch 2 times, most recently from 3e496b1 to 56e7711 Compare September 25, 2024 18:58
@@ -714,7 +714,13 @@ private Function<String, Boolean> functionToCheckWhetherStorageEngineShouldBeKep
return storageEngineName -> {
String storeName = Version.parseStoreFromKafkaTopicName(storageEngineName);

AbstractStorageEngine storageEngine = storageService.getStorageEngine(storageEngineName);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting the correct storageEngine from storageEngineName seems to depend on consulting the directories in storageService, though when this function is called, storageService is not fully instantiated, and so the implementation in this line needs to be revised to not reference storageService.

A method that could be used: obtain the store config from veniceConfigLoader and consult factory [e.g., RocksDBStorageEngineFactory]'s getStorageEngine method. Though, the respective factory also seems to be instantiated and referenced in StorageService [in initInternalStorageEngineFactories].

Copy link
Contributor

@kvargha kvargha Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkWhetherStorageEngineShouldBeKeptOrNot determines whether or not the entire storageEngine should be dropped, and where it's called it assumes that the storageEngine hasn't been opened yet.

What we should do here is modify the StorageService constructor to take another function that determines whether partitions should be kept or not. Ex: checkWhetherStoragePartitionsShouldBeKeptOrNot. Return type can be void to differentiate between the two checks.

This new function can be invoked here after the storageEngine has been opened.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This idea makes sense. I will integrate this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operations involving Helix [HelixParticipationService, SafeHelixDataAccessor] may require that services are started for StorageService. [requires that Venice server has started]. SafeHelixDataAccessor is used to obtain the ideal state as referenced in #650. Will have to check the structure here so that checkWhetherStoragePartitionsShouldBeKeptOrNot executes correctly.

@@ -88,6 +89,7 @@ public class StorageService extends AbstractVeniceService {
boolean restoreDataPartitions,
boolean restoreMetadataPartitions,
Function<String, Boolean> checkWhetherStorageEngineShouldBeKeptOrNot,
Function<AbstractStorageEngine, Void> checkWhetherStoragePartitionsShouldBeKeptOrNot,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create an additional constructor that doesn't have checkWhetherStoragePartitionsShouldBeKeptOrNot, so if users don't want to use this, they wouldn't have to pass in a functional interface that does nothing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that sounds correct; checkWhetherStoragePartitionsShouldBeKeptOrNot is mainly being written in VeniceServer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Reconcile on disk state with assigned partitions for given resource and delete unused partitions
2 participants