Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix schema name in table config during controller startup #11574

Merged
merged 3 commits into from
Oct 11, 2023

Conversation

xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Sep 12, 2023

Fix schema name in table config during controller startup:

  1. If schema name is already null in table config -> GOOD
  2. If rawTable name is already in the schema, then likely there is a misconfiguration, set the schema name to null and update table config.
  3. Copy referred schema to the same name of rawTable, and set schema name to null in table config.

@codecov-commenter
Copy link

codecov-commenter commented Sep 12, 2023

Codecov Report

Merging #11574 (5e965bd) into master (17ffa3e) will increase coverage by 0.01%.
Report is 1 commits behind head on master.
The diff coverage is 59.45%.

@@             Coverage Diff              @@
##             master   #11574      +/-   ##
============================================
+ Coverage     63.09%   63.11%   +0.01%     
  Complexity     1117     1117              
============================================
  Files          2342     2342              
  Lines        125920   125993      +73     
  Branches      19370    19377       +7     
============================================
+ Hits          79454    79517      +63     
- Misses        40812    40818       +6     
- Partials       5654     5658       +4     
Flag Coverage Δ
integration <0.01% <0.00%> (ø)
integration1 <0.01% <0.00%> (ø)
integration2 0.00% <0.00%> (ø)
java-11 63.06% <59.45%> (+0.01%) ⬆️
java-17 62.94% <59.45%> (-0.02%) ⬇️
java-20 62.96% <59.45%> (+<0.01%) ⬆️
temurin 63.11% <59.45%> (+0.01%) ⬆️
unittests 63.10% <59.45%> (+0.01%) ⬆️
unittests1 67.25% <0.00%> (+0.01%) ⬆️
unittests2 14.44% <59.45%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...g/apache/pinot/common/metrics/ControllerGauge.java 0.00% <0.00%> (ø)
...ache/pinot/common/metadata/ZKMetadataProvider.java 11.00% <0.00%> (-0.50%) ⬇️
...apache/pinot/controller/BaseControllerStarter.java 77.64% <74.57%> (-0.52%) ⬇️

... and 14 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Jackie-Jiang
Copy link
Contributor

We don't really need a periodic task after preventing new table with different schema to be created. It can be done as a one time scan when controller starts

@Jackie-Jiang
Copy link
Contributor

Actually a periodic task is also fine. We will remove it after one release anyway

@xiangfu0
Copy link
Contributor Author

Actually a periodic task is also fine. We will remove it after one release anyway

I would say we can keep this job for 1.x release, then remove it in 2.x :)

@eaugene
Copy link
Contributor

eaugene commented Sep 19, 2023

@xiangfu0 , I am not able to understand - Why are we needing a periodic task for this , as new tables creation with different schema name is anyways going to blocked . Any problems we encounter by keeping this -

  1. A one time task ( As Jackie pointed out , we can limit it during controller spawn up ) ?
  2. To have it more transparent to table owners , giving the control to them to trigger this action by a REST API ?

@xiangfu0
Copy link
Contributor Author

xiangfu0 commented Sep 19, 2023

@xiangfu0 , I am not able to understand - Why are we needing a periodic task for this , as new tables creation with different schema name is anyways going to blocked . Any problems we encounter by keeping this -

  1. A one time task ( As Jackie pointed out , we can limit it during controller spawn up ) ?
  2. To have it more transparent to table owners , giving the control to them to trigger this action by a REST API ?

My feeling is #11570 may not get merged before this PR :p

The purpose is to ensure each table has its own schema, no cross references.

@xiangfu0 xiangfu0 changed the title Adding a controller task to update schema name Fix schema name in table config during controller startup Sep 19, 2023
@xiangfu0 xiangfu0 force-pushed the update_schema_with_table_name branch 3 times, most recently from 6d81401 to 2019605 Compare September 19, 2023 23:46
@xiangfu0 xiangfu0 added metrics cleanup incompatible Indicate PR that introduces backward incompatibility labels Sep 28, 2023
@xiangfu0 xiangfu0 force-pushed the update_schema_with_table_name branch 2 times, most recently from 843e8bf to 409974a Compare September 29, 2023 06:05
Copy link
Member

@jackjlli jackjlli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for adding the code changes!

@xiangfu0 xiangfu0 force-pushed the update_schema_with_table_name branch 5 times, most recently from 004786d to fe68a6b Compare October 4, 2023 18:57
@xiangfu0
Copy link
Contributor Author

xiangfu0 commented Oct 4, 2023

Rebased to incorporate the changes from #11591

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make the counter more specific:

  • Tables without schema
  • Misconfigured tables: tables with schema name not matching the raw table name
  • Fixed tables
  • Failed to copy schema
  • Failed to update table config

MAX_RECORD_AVAILABILITY_LAG_MS("maxRecordAvailabilityLagMs", false),

// Number of table schema got fixed
FIXED_SCHEMA_TABLE_COUNT("FixedSchemaTableCount", true),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have these configs covered in the prometheus yaml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those should be covered by the match all rule

// Update table config to remove schema name
tableConfig.getValidationConfig().setSchemaName(null);
try {
_helixResourceManager.updateTableConfig(tableConfig);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do version check and update. Currently it might override another table config update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to do this in another PR for helixResourceManager API?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to change PinotHelixResouceManager. Actually we should not even use this updateTableConfig() method, but should use property store instead which can check the versions.
PinotHelixResourceManager.updateTableConfig() is always overriding the whole table config, instead of changing a field, and that is why version check is not required

@xiangfu0 xiangfu0 force-pushed the update_schema_with_table_name branch from fe68a6b to d0ea80e Compare October 9, 2023 20:58
@xiangfu0 xiangfu0 force-pushed the update_schema_with_table_name branch 3 times, most recently from 0eb66f0 to 5e965bd Compare October 10, 2023 23:25
Comment on lines 95 to 96
return propertyStore.set(constructPropertyStorePathForResourceConfig(tableNameWithType), tableConfigZNRecord,
expectVersion, AccessOption.PERSISTENT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part should be outside of the try-catch

*
* @return true if update is successful.
*/
public static boolean setTableConfig(ZkHelixPropertyStore<ZNRecord> propertyStore, String tableNameWithType,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) tableNameWithType is not needed

TableConfig tableConfig = toTableConfig(
propertyStore.get(constructPropertyStorePathForResourceConfig(tableNameWithType), tableConfigStat,
AccessOption.PERSISTENT));
return ImmutablePair.of(tableConfig, tableConfigStat.getVersion());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When tableConfig is null, we should probably directly return null instead of a pair. Also annotate the return as @Nullable

@xiangfu0 xiangfu0 force-pushed the update_schema_with_table_name branch from 5e965bd to 313aa28 Compare October 11, 2023 00:43
@xiangfu0 xiangfu0 merged commit 169aba9 into apache:master Oct 11, 2023
17 of 19 checks passed
@xiangfu0 xiangfu0 deleted the update_schema_with_table_name branch October 11, 2023 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup incompatible Indicate PR that introduces backward incompatibility metrics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants