From 8c1dc8cda0edae8a80d67f115b0920bf8982e2e3 Mon Sep 17 00:00:00 2001 From: Aolin Date: Thu, 9 Nov 2023 10:45:42 +0800 Subject: [PATCH 01/29] server recommendations: add a note for production environment (#15287) --- hardware-and-software-requirements.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hardware-and-software-requirements.md b/hardware-and-software-requirements.md index 4dc4c56186cab..7555b6e10b3a6 100644 --- a/hardware-and-software-requirements.md +++ b/hardware-and-software-requirements.md @@ -169,7 +169,7 @@ You can deploy and run TiDB on the 64-bit generic hardware server platform in th > **Note:** > > - In the production environment, the TiDB and PD instances can be deployed on the same server. If you have a higher requirement for performance and reliability, try to deploy them separately. -> - It is strongly recommended to use higher configuration in the production environment. +> - It is strongly recommended to configure TiDB, TiKV, and TiFlash with at least 8 CPU cores each in the production environment. To get better performance, a higher configuration is recommended. > - It is recommended to keep the size of TiKV hard disk within 4 TB if you are using PCIe SSDs or within 1.5 TB if you are using regular SSDs. Before you deploy TiFlash, note the following items: From b3ecd3ca8215e4266b2ce297d37410e026ed3818 Mon Sep 17 00:00:00 2001 From: ekexium Date: Thu, 9 Nov 2023 11:32:12 +0800 Subject: [PATCH 02/29] add slow logs in troubleshoot-stale-read.md (#15174) --- grafana-tikv-dashboard.md | 2 +- troubleshoot-stale-read.md | 28 +++++++++++++++++++++++++--- 2 files changed, 26 insertions(+), 4 deletions(-) diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md index 2e313a25d7d33..a50f74c644fff 100644 --- a/grafana-tikv-dashboard.md +++ b/grafana-tikv-dashboard.md @@ -419,7 +419,7 @@ This section provides a detailed description of these key metrics on the **TiKV- - Check Leader Duration: The distribution of time spent on processing leader requests. The duration is from sending requests to receiving responses in leader - Max gap of resolved-ts in Region leaders: The maximum time difference between the resolved-ts of all active Regions in this TiKV and the current time, only for Region leaders - Min Leader Resolved TS Region: The ID of the Region whose resolved-ts is the minimal, only for Region leaders -- Lock heap size: The memory footprint of the heap that tracks locks in the resolved-ts module +- Lock heap size: The size of the heap that tracks locks in the resolved-ts module ### Memory diff --git a/troubleshoot-stale-read.md b/troubleshoot-stale-read.md index f32b7e11e6cde..112e1ac045943 100644 --- a/troubleshoot-stale-read.md +++ b/troubleshoot-stale-read.md @@ -35,7 +35,7 @@ The Region leader uses a resolver to manage resolved-ts. This resolver tracks lo ## Diagnose Stale Read issues -This section introduces how to diagnose Stale Read issues using Grafana and `tikv-ctl`. +This section introduces how to diagnose Stale Read issues using Grafana, `tikv-ctl`, and logs. ### Identify issues @@ -100,6 +100,28 @@ The preceding output helps you determine: - Whether the apply index is too small to update safe-ts. - Whether the leader is sending a sufficiently updated resolved-ts when a follower peer exists. 
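For reference, the per-Region safe-ts and resolved-ts details discussed above come from the `get-region-read-progress` subcommand of `tikv-ctl`. A minimal sketch of the invocation is shown below; the store address and Region ID are illustrative, and flag spellings can vary across TiKV versions (check `tikv-ctl get-region-read-progress --help` on your version):

```shell
tikv-ctl --host 127.0.0.1:20160 get-region-read-progress -r 14 --log --min-start-ts 0
```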
+### Use logs to diagnose + +Every 10 seconds, TiKV checks the following metrics: + +- The Region leader whose resolved-ts is the minimal +- The Region follower whose safe-ts is the minimal +- The Region follower whose resolved-ts is the minimal + +If any of these timestamps is abnormally small, TiKV prints a log. + +These logs are especially useful when you want to diagnose a historical problem that is no longer present. + +The following shows an example of the logs: + +```log +[2023/08/29 16:48:18.118 +08:00] [INFO] [endpoint.rs:505] ["the max gap of leader resolved-ts is large"] [last_resolve_attempt="Some(LastAttempt { success: false, ts: TimeStamp(443888082736381953), reason: \"lock\", lock: Some(7480000000000000625F728000000002512B5C) })"] [duration_to_last_update_safe_ts=10648ms] [min_memory_lock=None] [txn_num=0] [lock_num=0] [min_lock=None] [safe_ts=443888117326544897] [gap=110705ms] [region_id=291] + +[2023/08/29 16:48:18.118 +08:00] [INFO] [endpoint.rs:526] ["the max gap of follower safe-ts is large"] [oldest_candidate=None] [latest_candidate=None] [applied_index=3276] [duration_to_last_consume_leader=11460ms] [resolved_ts=443888117117353985] [safe_ts=443888117117353985] [gap=111503ms] [region_id=273] + +[2023/08/29 16:48:18.118 +08:00] [INFO] [endpoint.rs:547] ["the max gap of follower resolved-ts is large; it's the same region that has the min safe-ts"] +``` + ## Troubleshooting tips ### Handle slow transaction commit @@ -190,7 +212,7 @@ To get the exact transaction and the keys of some of the locks, you can check Ti [2023/07/17 21:16:44.257 +08:00] [INFO] [resolver.rs:213] ["locks with the minimum start_ts in resolver"] [keys="[74800000000000006A5F7280000000000405F6, ... , 74800000000000006A5F72800000000000EFF6, 74800000000000006A5F7280000000000721D9, 74800000000000006A5F72800000000002F691]"] [start_ts=442918429687808001] [region_id=3121] ``` -From the TiKV log, you can get the start_ts of the transaction, that is `442918429687808001`. To get more information about the statement and transaction, you can grep `start_ts` in TiDB logs. The output is as follows: +From the TiKV log, you can get the start_ts of the transaction, that is `442918429687808001`. To get more information about the statement and transaction, you can grep this timestamp in TiDB logs. The output is as follows: ```log [2023/07/17 21:16:18.287 +08:00] [INFO] [2pc.go:685] ["[BIG_TXN]"] [session=2826881778407440457] ["key sample"=74800000000000006a5f728000000000000000] [size=319967171] [keys=10000000] [puts=10000000] [dels=0] [locks=0] [checks=0] [txnStartTS=442918429687808001] @@ -212,4 +234,4 @@ Then, you can basically locate the statement that caused the problem. To further The output shows that someone is executing an unexpected `UPDATE` statement (`update t set b = b + 1`), which results in a large transaction and hinders Stale Read. -To resolve this issue, you can stop the application that is running this `UPDATE` statement. \ No newline at end of file +To resolve this issue, you can stop the application that is running this `UPDATE` statement. 
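If stopping the application right away is not feasible, you can also locate and terminate the offending session from the database side. The following is a minimal sketch; the `LIKE` pattern and the connection ID are illustrative, and on some TiDB versions or deployments you might need `KILL TIDB` instead of `KILL`:

```sql
-- Find the connection that is running the unexpected statement.
SELECT ID, USER, HOST, INFO
FROM INFORMATION_SCHEMA.CLUSTER_PROCESSLIST
WHERE INFO LIKE 'update t set b = b + 1%';

-- Terminate it using the ID returned by the query above (123 is a placeholder).
KILL 123;
```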
From 2cd14da5e7a75d49be4194952b21630be09f18bc Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Thu, 9 Nov 2023 14:56:13 +0800 Subject: [PATCH 03/29] mark the support of foreign key as experimental (#15290) --- basic-features.md | 2 +- constraints.md | 2 +- .../dev-guide-sample-application-nodejs-prisma.md | 12 +++++++----- .../dev-guide-sample-application-nodejs-typeorm.md | 2 +- foreign-key.md | 5 +++-- releases/release-6.6.0.md | 6 +++--- 6 files changed, 16 insertions(+), 13 deletions(-) diff --git a/basic-features.md b/basic-features.md index aab7b85b2c27a..6127851320cd3 100644 --- a/basic-features.md +++ b/basic-features.md @@ -62,7 +62,7 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u | [Clustered index on integer `PRIMARY KEY`](/clustered-indexes.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | [Clustered index on composite or non-integer key](/clustered-indexes.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | | [Multi-valued indexes](/sql-statements/sql-statement-create-index.md#multi-valued-indexes) | Y | Y | Y | Y | N | N | N | N | N | N | N | N | -| [Foreign key](/constraints.md#foreign-key) | Y | Y | Y | Y | N | N | N | N | N | N | N | N | +| [Foreign key](/constraints.md#foreign-key) | E | E | E | E | N | N | N | N | N | N | N | N | | [TiFlash late materialization](/tiflash/tiflash-late-materialization.md) | Y | Y | Y | Y | N | N | N | N | N | N | N | N | ## SQL statements diff --git a/constraints.md b/constraints.md index dbfda1043c83d..097a0d7e277ee 100644 --- a/constraints.md +++ b/constraints.md @@ -412,7 +412,7 @@ For more details about the primary key of the `CLUSTERED` type, refer to [cluste > **Note:** > -> Starting from v6.6.0, TiDB supports the [FOREIGN KEY constraints](/foreign-key.md) feature. Before v6.6.0, TiDB supports creating and deleting foreign key constraints, but the constraints are not actually effective. After upgrading TiDB to v6.6.0, you can delete the invalid foreign key and create a new one to make the foreign key constraints effective. +> Starting from v6.6.0, TiDB supports the [FOREIGN KEY constraints](/foreign-key.md) as an experimental feature. Before v6.6.0, TiDB supports creating and deleting foreign key constraints, but the constraints are not actually effective. After upgrading TiDB to v6.6.0, you can delete the invalid foreign key and create a new one to make the foreign key constraints effective. TiDB supports creating `FOREIGN KEY` constraints in DDL commands. diff --git a/develop/dev-guide-sample-application-nodejs-prisma.md b/develop/dev-guide-sample-application-nodejs-prisma.md index 1c482f8167989..ac1484546223c 100644 --- a/develop/dev-guide-sample-application-nodejs-prisma.md +++ b/develop/dev-guide-sample-application-nodejs-prisma.md @@ -351,13 +351,15 @@ For more information, refer to [Delete data](/develop/dev-guide-delete-data.md). ### Foreign key constraints vs Prisma relation mode -For TiDB v6.6.0 or later, it's recommended to use [Foreign key constraints](https://docs.pingcap.com/tidb/stable/foreign-key) instead of [Prisma relation mode](https://www.prisma.io/docs/concepts/components/prisma-schema/relations/relation-mode) for [referential integrity](https://en.wikipedia.org/wiki/Referential_integrity?useskin=vector) checking. +To check [referential integrity](https://en.wikipedia.org/wiki/Referential_integrity?useskin=vector), you can use foreign key constraints or Prisma relation mode: -Relation mode is the emulation of referential integrity in Prisma Client side. 
However, it should be noted that there are performance implications, as it requires additional database queries to maintain referential integrity. +- [Foreign key](https://docs.pingcap.com/tidb/stable/foreign-key) is an experimental feature supported starting from TiDB v6.6.0, which allows cross-table referencing of related data, and foreign key constraints to maintain data consistency. -> **Note** -> -> **Foreign keys are suitable for small and medium-volumes data scenarios.** Using foreign keys in large data volumes might lead to serious performance issues and could have unpredictable effects on the system. If you plan to use foreign keys, conduct thorough validation first and use them with caution. + > **Warning:** + > + > **Foreign keys are suitable for small and medium-volumes data scenarios.** Using foreign keys in large data volumes might lead to serious performance issues and could have unpredictable effects on the system. If you plan to use foreign keys, conduct thorough validation first and use them with caution. + +- [Prisma relation mode](https://www.prisma.io/docs/concepts/components/prisma-schema/relations/relation-mode) is the emulation of referential integrity in Prisma Client side. However, it should be noted that there are performance implications, as it requires additional database queries to maintain referential integrity. ## Next steps diff --git a/develop/dev-guide-sample-application-nodejs-typeorm.md b/develop/dev-guide-sample-application-nodejs-typeorm.md index 93e2949949a51..5f3190e9ca552 100644 --- a/develop/dev-guide-sample-application-nodejs-typeorm.md +++ b/develop/dev-guide-sample-application-nodejs-typeorm.md @@ -341,7 +341,7 @@ For more information, refer to [TypeORM: DataSource API](https://typeorm.io/data ### Foreign key constraints -Using foreign key constraints ensures the [referential integrity](https://en.wikipedia.org/wiki/Referential_integrity) of data by adding checks on the database side. However, this might lead to serious performance issues in scenarios with large data volumes. +Using [foreign key constraints](https://docs.pingcap.com/tidb/stable/foreign-key) (experimental) ensures the [referential integrity](https://en.wikipedia.org/wiki/Referential_integrity) of data by adding checks on the database side. However, this might lead to serious performance issues in scenarios with large data volumes. You can control whether foreign key constraints are created when constructing relationships between entities by using the `createForeignKeyConstraints` option (default value is `true`). diff --git a/foreign-key.md b/foreign-key.md index b31038a07fb52..157d7e6d035b8 100644 --- a/foreign-key.md +++ b/foreign-key.md @@ -7,9 +7,10 @@ summary: An overview of the usage of FOREIGN KEY constraints for the TiDB databa Starting from v6.6.0, TiDB supports the foreign key feature, which allows cross-table referencing of related data, and foreign key constraints to maintain data consistency. -> **Note:** +> **Warning:** > -> The foreign key feature is usually used for providing integrity and consistency constraint checks for data in small or medium volumes. However, for large data volumes in a distributed database system, the use of foreign keys might lead to serious performance issues and could have unpredictable effects on the system. If you plan to use foreign keys, conduct thorough validation first and use them with caution. +> - Currently, the foreign key feature is experimental. It is not recommended that you use it in production environments. 
This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. +> - The foreign key feature is usually used for providing integrity and consistency constraint checks for data in small or medium volumes. However, for large data volumes in a distributed database system, the use of foreign keys might lead to serious performance issues and could have unpredictable effects on the system. If you plan to use foreign keys, conduct thorough validation first and use them with caution. The foreign key is defined in the child table. The syntax is as follows: diff --git a/releases/release-6.6.0.md b/releases/release-6.6.0.md index 34327e529665c..7679c6974bd64 100644 --- a/releases/release-6.6.0.md +++ b/releases/release-6.6.0.md @@ -50,7 +50,7 @@ In v6.6.0-DMR, the key new features and improvements are as follows: SQL functionalities
- Foreign key + Foreign key (experimental) Support MySQL-compatible foreign key constraints to maintain data consistency and improve data quality. @@ -176,7 +176,7 @@ In v6.6.0-DMR, the key new features and improvements are as follows: ### SQL -* Support MySQL-compatible foreign key constraints [#18209](https://github.com/pingcap/tidb/issues/18209) @[crazycs520](https://github.com/crazycs520) +* Support MySQL-compatible foreign key constraints (experimental) [#18209](https://github.com/pingcap/tidb/issues/18209) @[crazycs520](https://github.com/crazycs520) TiDB v6.6.0 introduces the foreign key constraints feature, which is compatible with MySQL. This feature supports referencing within a table or between tables, constraints validation, and cascade operations. This feature helps to migrate applications to TiDB, maintain data consistency, improve data quality, and facilitate data modeling. @@ -337,7 +337,7 @@ In v6.6.0-DMR, the key new features and improvements are as follows: ### MySQL compatibility -* Support MySQL-compatible foreign key constraints [#18209](https://github.com/pingcap/tidb/issues/18209) @[crazycs520](https://github.com/crazycs520) +* Support MySQL-compatible foreign key constraints (experimental) [#18209](https://github.com/pingcap/tidb/issues/18209) @[crazycs520](https://github.com/crazycs520) For more information, see the [SQL](#sql) section in this document and [documentation](/foreign-key.md). From 42123259daed716014319bc600fa0aa503b033f6 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Fri, 10 Nov 2023 16:23:12 +0800 Subject: [PATCH 04/29] information-schema.md: add `CHECK_CONSTRAINTS` (#15318) --- information-schema/information-schema.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/information-schema/information-schema.md b/information-schema/information-schema.md index e5c2fb31db2a8..4994d48ba4319 100644 --- a/information-schema/information-schema.md +++ b/information-schema/information-schema.md @@ -17,6 +17,7 @@ Many `INFORMATION_SCHEMA` tables have a corresponding `SHOW` command. The benefi | Table Name | Description | |-----------------------------------------------------------------------------------------|-----------------------------| | [`CHARACTER_SETS`](/information-schema/information-schema-character-sets.md) | Provides a list of character sets the server supports. | +| [`CHECK_CONSTRAINTS`](/information-schema/information-schema-check-constraints.md) | Provides information about [`CHECK` constraints](/constraints.md#check) on tables. | | [`COLLATIONS`](/information-schema/information-schema-collations.md) | Provides a list of collations that the server supports. | | [`COLLATION_CHARACTER_SET_APPLICABILITY`](/information-schema/information-schema-collation-character-set-applicability.md) | Explains which collations apply to which character sets. | | [`COLUMNS`](/information-schema/information-schema-columns.md) | Provides a list of columns for all tables. | @@ -58,6 +59,7 @@ Many `INFORMATION_SCHEMA` tables have a corresponding `SHOW` command. The benefi | Table Name | Description | |-----------------------------------------------------------------------------------------|-----------------------------| | [`CHARACTER_SETS`](/information-schema/information-schema-character-sets.md) | Provides a list of character sets the server supports. | +| [`CHECK_CONSTRAINTS`](/information-schema/information-schema-check-constraints.md) | Provides information about [`CHECK` constraints](/constraints.md#check) on tables. 
| | [`COLLATIONS`](/information-schema/information-schema-collations.md) | Provides a list of collations that the server supports. | | [`COLLATION_CHARACTER_SET_APPLICABILITY`](/information-schema/information-schema-collation-character-set-applicability.md) | Explains which collations apply to which character sets. | | [`COLUMNS`](/information-schema/information-schema-columns.md) | Provides a list of columns for all tables. | From 3adcfd4cf7e77faa735d372677a0aadc2cd661be Mon Sep 17 00:00:00 2001 From: Ran Date: Mon, 13 Nov 2023 12:59:13 +0800 Subject: [PATCH 05/29] dm: add update validation description (#15251) --- dm/dm-continuous-data-validation.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/dm/dm-continuous-data-validation.md b/dm/dm-continuous-data-validation.md index c82ca029ab7d1..1507761b1e76f 100644 --- a/dm/dm-continuous-data-validation.md +++ b/dm/dm-continuous-data-validation.md @@ -103,6 +103,8 @@ Method 1: run the `dmctl query-status ` command. If continuous data v "stage": "Running", // Current stage. "Running" or "Stopped". "validatorBinlog": "(mysql-bin.000001, 5989)", // The binlog position of the validation "validatorBinlogGtid": "1642618e-cf65-11ec-9e3d-0242ac110002:1-30", // The GTID position of the validation + "cutoverBinlogPos": "", // The specified binlog position for cutover + "cutoverBinlogGTID": "1642618e-cf65-11ec-9e3d-0242ac110002:1-30", // The specified GTID position for cutover "result": null, // When the validation is abnormal, show the error message "processedRowsStatus": "insert/update/delete: 0/0/0", // Statistics of the processed binlog rows. "pendingRowsStatus": "insert/update/delete: 0/0/0", // Statistics of the binlog rows that are not validated yet or that fail to be validated but are not marked as "error rows" @@ -135,6 +137,8 @@ In the preceding command, you can use `--table-stage` to filter the tables that "stage": "Running", "validatorBinlog": "(mysql-bin.000001, 6571)", "validatorBinlogGtid": "", + "cutoverBinlogPos": "(mysql-bin.000001, 6571)", + "cutoverBinlogGTID": "", "result": null, "processedRowsStatus": "insert/update/delete: 2/0/0", "pendingRowsStatus": "insert/update/delete: 0/0/0", @@ -245,6 +249,26 @@ Flags: For detailed usage, refer to [`dmctl validation start`](#method-2-enable-using-dmctl). +## Set the cutover point for continuous data validation + +Before switching the application to another database, you might need to perform continuous data validation immediately after the data is replicated to a specific position to ensure data integrity. To achieve this, you can set this specific position as the cutover point for continuous validation. + +To set the cutover point for continuous data validation, use the `validation update` command: + +``` +Usage: + dmctl validation update [flags] + +Flags: + --cutover-binlog-gtid string specify the cutover binlog gtid for validation, only valid when source config's gtid is enabled, e.g. '1642618e-cf65-11ec-9e3d-0242ac110002:1-30' + --cutover-binlog-pos string specify the cutover binlog name for validation, should include binlog name and pos in brackets, e.g. '(mysql-bin.000001, 5989)' + -h, --help help for update +``` + +* `--cutover-binlog-gtid`: specifies the cutover position for validation, in the format of `1642618e-cf65-11ec-9e3d-0242ac110002:1-30`. Only valid when GTID is enabled in the upstream cluster. +* `--cutover-binlog-pos`: specifies the cutover position for validation, in the format of `(mysql-bin.000001, 5989)`. 
+* `task-name`: the name of the task for continuous data validation. This parameter is **required**. + ## Implementation The architecture of continuous data validation (validator) in DM is as follows: From 96477ed1db98b9d13e80cd4869d5a09a7187b4f2 Mon Sep 17 00:00:00 2001 From: Ran Date: Mon, 13 Nov 2023 14:00:44 +0800 Subject: [PATCH 06/29] add ticdc sql mode config (#15246) --- ticdc/ticdc-changefeed-config.md | 4 ++++ ticdc/ticdc-ddl.md | 23 +++++++++++++++++++++++ 2 files changed, 27 insertions(+) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index 19bf4ff648edb..119cb574c2b7e 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -64,6 +64,10 @@ case-sensitive = true # Note: This configuration item only takes effect if the downstream is TiDB. # sync-point-retention = "1h" +# Specifies the SQL mode used when parsing DDL statements. Multiple modes are separated by commas. +# The default value is the same as the default SQL mode of TiDB. +# sql-mode = "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION" + [mounter] # The number of threads with which the mounter decodes KV data. The default value is 16. # worker-num = 16 diff --git a/ticdc/ticdc-ddl.md b/ticdc/ticdc-ddl.md index 91be173ead44d..1d05faf4d2c58 100644 --- a/ticdc/ticdc-ddl.md +++ b/ticdc/ticdc-ddl.md @@ -101,3 +101,26 @@ TiCDC processes this type of DDL as follows: | `RENAME TABLE test.t1 TO test.ignore1, test.t3 TO test.ignore2` | Replicate | The old database name, the old table names, and the new database name match the filter rule. | | `RENAME TABLE test.t1 TO ignore.t1, test.t2 TO test.t22;` | Report an error | The new database name `ignore` does not match the filter rule. | | `RENAME TABLE test.t1 TO test.t4, test.t3 TO test.t1, test.t4 TO test.t3;` | Report an error | The `RENAME TABLE` DDL swaps the names of `test.t1` and `test.t3` in one DDL statement, which TiCDC cannot handle correctly. In this case, refer to the error message for handling. | + +### SQL mode + +By default, TiCDC uses the default SQL mode of TiDB to parse DDL statements. If your upstream TiDB cluster uses a non-default SQL mode, you must specify the SQL mode in the TiCDC configuration file. Otherwise, TiCDC might fail to parse DDL statements correctly. For more information about TiDB SQL mode, see [SQL Mode](/sql-mode.md). + +For example, if the upstream TiDB cluster uses the `ANSI_QUOTES` mode, you must specify the SQL mode in the changefeed configuration file as follows: + +```toml +# In the value, "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION" is the default SQL mode of TiDB. +# "ANSI_QUOTES" is the SQL mode added to your upstream TiDB cluster. + +sql-mode = "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,ANSI_QUOTES" +``` + +If the SQL mode is not configured, TiCDC might fail to parse some DDL statements correctly. For example: + +```sql +CREATE TABLE "t1" ("a" int PRIMARY KEY); +``` + +Because in the default SQL mode of TiDB, double quotation marks are treated as strings rather than identifiers, TiCDC fails to parse the DDL statement correctly. + +Therefore, when creating a replication task, it is recommended that you specify the SQL mode used by the upstream TiDB cluster in the configuration file. 
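To determine which value to put in `sql-mode`, you can read the current mode directly from the upstream TiDB cluster, for example:

```sql
SELECT @@global.sql_mode;
```

The returned list can then be copied into the changefeed configuration file as shown above.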
From 8973cc9b3b2b8aea92a6bc971d524178eeb08aa8 Mon Sep 17 00:00:00 2001 From: Weizhen Wang Date: Mon, 13 Nov 2023 14:03:44 +0800 Subject: [PATCH 07/29] add tidb_gogc_tuner_max_value and tidb_gogc_tuner_min_value (#15270) --- system-variables.md | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/system-variables.md b/system-variables.md index 7d1fb30720aa5..96d8f027226d6 100644 --- a/system-variables.md +++ b/system-variables.md @@ -2972,6 +2972,26 @@ For a system upgraded to v5.0 from an earlier version, if you have not modified - When this variable is set to `ON`, you can view visual execution plans in TiDB Dashboard. Note that TiDB Dashboard only provides visual display for execution plans generated after this variable is enabled. - You can execute the `SELECT tidb_decode_binary_plan('xxx...')` statement to parse the specific plan from a binary plan. +### tidb_gogc_tuner_max_value New in v7.5.0 + +- Scope: GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Type: Integer +- Default value: `500` +- Range: `[10, 2147483647]` +- The variable is used to control the maximum value of GOGC that the GOGC Tuner can adjust. + +### tidb_gogc_tuner_min_value New in v7.5.0 + +- Scope: GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Type: Integer +- Default value: `100` +- Range: `[10, 2147483647]` +- The variable is used to control the minimum value of GOGC that the GOGC Tuner can adjust. + ### tidb_gogc_tuner_threshold New in v6.4.0 > **Note:** From a4298a1a62d87c93d60c056c7bb5f636675e003e Mon Sep 17 00:00:00 2001 From: Aolin Date: Mon, 13 Nov 2023 14:11:15 +0800 Subject: [PATCH 08/29] v7.5 supports Rocky Linux 9.1 (#15235) --- hardware-and-software-requirements.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/hardware-and-software-requirements.md b/hardware-and-software-requirements.md index 7555b6e10b3a6..03934b25a4428 100644 --- a/hardware-and-software-requirements.md +++ b/hardware-and-software-requirements.md @@ -78,6 +78,10 @@ As an open-source distributed SQL database with high performance, TiDB can be de openSUSE Leap later than v15.3 (not including Tumbleweed) x86_64 + + Rocky Linux 9.1 or later +
+   • x86_64
+   • ARM 64
+ SUSE Linux Enterprise Server 15 x86_64 From c5883defee963d72e18b6c1fbd745e8ebf2a6461 Mon Sep 17 00:00:00 2001 From: Frank945946 <108602632+Frank945946@users.noreply.github.com> Date: Mon, 13 Nov 2023 14:33:44 +0800 Subject: [PATCH 09/29] Deprecate mydumper tikv importer (#15289) --- binary-package.md | 2 - develop/dev-guide-timeouts-in-tidb.md | 2 +- dumpling-overview.md | 6 +- ecosystem-tool-user-guide.md | 2 +- migration-overview.md | 4 +- tidb-cloud/changefeed-sink-to-mysql.md | 2 +- tidb-lightning/monitor-tidb-lightning.md | 93 +------------------ .../tidb-lightning-command-line-full.md | 1 - .../tidb-lightning-configuration.md | 6 +- tidb-lightning/tidb-lightning-faq.md | 41 +------- .../tidb-lightning-physical-import-mode.md | 1 - tidb-lightning/troubleshoot-tidb-lightning.md | 22 +---- tidb-troubleshooting-map.md | 13 +-- 13 files changed, 17 insertions(+), 178 deletions(-) diff --git a/binary-package.md b/binary-package.md index 1cb7f815bb391..1489b525943f6 100644 --- a/binary-package.md +++ b/binary-package.md @@ -40,7 +40,6 @@ The `TiDB-community-toolkit` package contains the following contents. | Content | Change history | |---|---| -| tikv-importer-{version}-linux-{arch}.tar.gz | | | pd-recover-{version}-linux-{arch}.tar.gz | | | etcdctl | New in v6.0.0 | | tiup-linux-{arch}.tar.gz | | @@ -67,7 +66,6 @@ The `TiDB-community-toolkit` package contains the following contents. | sync_diff_inspector | | | reparo | | | arbiter | | -| mydumper | New in v6.0.0 | | server-{version}-linux-{arch}.tar.gz | New in v6.2.0 | | grafana-{version}-linux-{arch}.tar.gz | New in v6.2.0 | | alertmanager-{version}-linux-{arch}.tar.gz | New in v6.2.0 | diff --git a/develop/dev-guide-timeouts-in-tidb.md b/develop/dev-guide-timeouts-in-tidb.md index 4b326cb766857..6cf44e68c518c 100644 --- a/develop/dev-guide-timeouts-in-tidb.md +++ b/develop/dev-guide-timeouts-in-tidb.md @@ -13,7 +13,7 @@ TiDB's transaction implementation uses the MVCC (Multiple Version Concurrency Co By default, each MVCC version (consistency snapshots) is kept for 10 minutes. Transactions that take longer than 10 minutes to read will receive an error `GC life time is shorter than transaction duration`. -If you need longer read time, for example, when you are using **Mydumper** for full backups (**Mydumper** backs up consistent snapshots), you can adjust the value of `tikv_gc_life_time` in the `mysql.tidb` table in TiDB to increase the MVCC version retention time. Note that `tikv_gc_life_time` takes effect globally and immediately. Increasing the value will increase the life time of all existing snapshots, and decreasing it will immediately shorten the life time of all snapshots. Too many MVCC versions will impact TiKV's processing efficiency. So you need to change `tikv_gc_life_time` back to the previous setting in time after doing a full backup with **Mydumper**. +If you need longer read time, for example, when you are using **Dumpling** for full backups (**Dumpling** backs up consistent snapshots), you can adjust the value of `tikv_gc_life_time` in the `mysql.tidb` table in TiDB to increase the MVCC version retention time. Note that `tikv_gc_life_time` takes effect globally and immediately. Increasing the value will increase the life time of all existing snapshots, and decreasing it will immediately shorten the life time of all snapshots. Too many MVCC versions will impact TiKV's processing efficiency. So you need to change `tikv_gc_life_time` back to the previous setting in time after doing a full backup with **Dumpling**. 
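The following is a minimal sketch of this adjustment through the `mysql.tidb` table; the `720h` window is illustrative, and the value to restore afterwards should be whatever the first query returned (`10m` is the usual default):

```sql
-- Check the current MVCC retention period.
SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time';

-- Extend the retention period before starting a long consistent export.
UPDATE mysql.tidb SET VARIABLE_VALUE = '720h' WHERE VARIABLE_NAME = 'tikv_gc_life_time';

-- Restore the previous setting promptly after the backup completes.
UPDATE mysql.tidb SET VARIABLE_VALUE = '10m' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
```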
For more information about GC, see [GC Overview](/garbage-collection-overview.md). diff --git a/dumpling-overview.md b/dumpling-overview.md index f03c5b931e7fa..bc40f54d704c2 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -46,11 +46,9 @@ TiDB also provides other tools that you can choose to use as needed. > **Note:** > -> PingCAP previously maintained a fork of the [mydumper project](https://github.com/maxbube/mydumper) with enhancements specific to TiDB. This fork has since been replaced by [Dumpling](/dumpling-overview.md), which has been rewritten in Go, and supports more optimizations that are specific to TiDB. It is strongly recommended that you use Dumpling instead of mydumper. -> -> For more information on Mydumper, refer to [v4.0 Mydumper documentation](https://docs.pingcap.com/tidb/v4.0/backup-and-restore-using-mydumper-lightning). +> PingCAP previously maintained a fork of the [mydumper project](https://github.com/maxbube/mydumper) with enhancements specific to TiDB. Starting from v7.5.0, [Mydumper](https://docs.pingcap.com/tidb/v4.0/mydumper-overview) is deprecated and most of its features have been replaced by [Dumpling](/dumpling-overview.md). It is strongly recommended that you use Dumpling instead of mydumper. -Compared to Mydumper, Dumpling has the following improvements: +Dumpling has the following advantages: - Support exporting data in multiple formats, including SQL and CSV. - Support the [table-filter](https://github.com/pingcap/tidb-tools/blob/master/pkg/table-filter/README.md) feature, which makes it easier to filter data. diff --git a/ecosystem-tool-user-guide.md b/ecosystem-tool-user-guide.md index 279959f546d82..60b1f17d81512 100644 --- a/ecosystem-tool-user-guide.md +++ b/ecosystem-tool-user-guide.md @@ -75,7 +75,7 @@ The following are the basics of Dumpling: > **Note:** > -> PingCAP previously maintained a fork of the [mydumper project](https://github.com/maxbube/mydumper) with enhancements specific to TiDB. This fork has since been replaced by [Dumpling](/dumpling-overview.md), which has been rewritten in Golang, and provides more optimizations specific to TiDB. It is strongly recommended that you use Dumpling instead of mydumper. +> PingCAP previously maintained a fork of the [mydumper project](https://github.com/maxbube/mydumper) with enhancements specific to TiDB. Starting from v7.5.0, [Mydumper](https://docs.pingcap.com/tidb/v4.0/mydumper-overview) is deprecated and most of its features have been replaced by [Dumpling](/dumpling-overview.md). It is strongly recommended that you use Dumpling instead of mydumper. ### Full data import - TiDB Lightning diff --git a/migration-overview.md b/migration-overview.md index ffc3a7f3d49c6..6cf6e549ce19f 100644 --- a/migration-overview.md +++ b/migration-overview.md @@ -8,8 +8,8 @@ summary: Learn the overview of data migration scenarios and the solutions. This document gives an overview of the data migration solutions that you can use with TiDB. The data migration solutions are as follows: - Full data migration. - - To import Amazon Aurora snapshots, CSV files, or Mydumper SQL files into TiDB, you can use TiDB Lightning to perform the full migration. - - To export all TiDB data as CSV files or Mydumper SQL files, you can use Dumpling to perform the full migration, which makes data migration from MySQL or MariaDB easier. + - To import Amazon Aurora snapshots, CSV files, or SQL dump files into TiDB, you can use TiDB Lightning to perform the full migration. 
+ - To export all TiDB data as CSV files or SQL dump files, you can use Dumpling to perform the full migration, which makes data migration from MySQL or MariaDB easier. - To migrate all data from a database with a small data size volume (for example, less than 1 TiB), you can also use TiDB Data Migration (DM). - Quick initialization of TiDB. TiDB Lightning supports quickly importing data and can quickly initialize a specific table in TiDB. Before you use this feature, pay attention that the quick initialization has a great impact on TiDB and the cluster does not provide services during the initialization period. diff --git a/tidb-cloud/changefeed-sink-to-mysql.md b/tidb-cloud/changefeed-sink-to-mysql.md index ef0d757c8a18f..bc3c285115a56 100644 --- a/tidb-cloud/changefeed-sink-to-mysql.md +++ b/tidb-cloud/changefeed-sink-to-mysql.md @@ -69,7 +69,7 @@ To load the existing data: SET GLOBAL tidb_gc_life_time = '720h'; ``` -2. Use [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) to export data from your TiDB cluster, then use community tools such as [mydumper/myloader](https://centminmod.com/mydumper.html) to load data to the MySQL service. +2. Use [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) to export data from your TiDB cluster, then use community tools such as myloader to load data to the MySQL service. 3. From the [exported files of Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview#format-of-exported-files), get the start position of MySQL sink from the metadata file: diff --git a/tidb-lightning/monitor-tidb-lightning.md b/tidb-lightning/monitor-tidb-lightning.md index 536f1a233e461..6b922b4445c37 100644 --- a/tidb-lightning/monitor-tidb-lightning.md +++ b/tidb-lightning/monitor-tidb-lightning.md @@ -22,13 +22,6 @@ pprof-port = 8289 ... ``` -and in `tikv-importer.toml`: - -```toml -# Listening address of the status server. -status-server-address = '0.0.0.0:8286' -``` - You need to configure Prometheus to make it discover the servers. For instance, you can directly add the server address to the `scrape_configs` section: ```yaml @@ -37,9 +30,6 @@ scrape_configs: - job_name: 'tidb-lightning' static_configs: - targets: ['192.168.20.10:8289'] - - job_name: 'tikv-importer' - static_configs: - - targets: ['192.168.20.9:8286'] ``` ## Grafana dashboard @@ -134,88 +124,7 @@ If any of the duration is too high, it indicates that the disk used by TiDB Ligh ## Monitoring metrics -This section explains the monitoring metrics of `tikv-importer` and `tidb-lightning`, if you need to monitor other metrics not covered by the default Grafana dashboard. - -### `tikv-importer` - -Metrics provided by `tikv-importer` are listed under the namespace `tikv_import_*`. - -- **`tikv_import_rpc_duration`** (Histogram) - - Bucketed histogram for the duration of an RPC action. 
Labels: - - - **request**: what kind of RPC is executed - * `switch_mode`: switched a TiKV node to import/normal mode - * `open_engine`: opened an engine file - * `write_engine`: received data and written into an engine - * `close_engine`: closed an engine file - * `import_engine`: imported an engine file into the TiKV cluster - * `cleanup_engine`: deleted an engine file - * `compact_cluster`: explicitly compacted the TiKV cluster - * `upload`: uploaded an SST file - * `ingest`: ingested an SST file - * `compact`: explicitly compacted a TiKV node - - **result**: the execution result of the RPC - * `ok` - * `error` - -- **`tikv_import_write_chunk_bytes`** (Histogram) - - Bucketed histogram for the uncompressed size of a block of KV pairs received from TiDB Lightning. - -- **`tikv_import_write_chunk_duration`** (Histogram) - - Bucketed histogram for the time needed to receive a block of KV pairs from TiDB Lightning. - -- **`tikv_import_upload_chunk_bytes`** (Histogram) - - Bucketed histogram for the compressed size of a chunk of SST file uploaded to TiKV. - -- **`tikv_import_upload_chunk_duration`** (Histogram) - - Bucketed histogram for the time needed to upload a chunk of SST file to TiKV. - -- **`tikv_import_range_delivery_duration`** (Histogram) - - Bucketed histogram for the time needed to deliver a range of KV pairs into a `dispatch-job`. - -- **`tikv_import_split_sst_duration`** (Histogram) - - Bucketed histogram for the time needed to split off a range from the engine file into a single SST file. - -- **`tikv_import_sst_delivery_duration`** (Histogram) - - Bucketed histogram for the time needed to deliver an SST file from a `dispatch-job` to an `ImportSSTJob`. - -- **`tikv_import_sst_recv_duration`** (Histogram) - - Bucketed histogram for the time needed to receive an SST file from a `dispatch-job` in an `ImportSSTJob`. - -- **`tikv_import_sst_upload_duration`** (Histogram) - - Bucketed histogram for the time needed to upload an SST file from an `ImportSSTJob` to a TiKV node. - -- **`tikv_import_sst_chunk_bytes`** (Histogram) - - Bucketed histogram for the compressed size of the SST file uploaded to a TiKV node. - -- **`tikv_import_sst_ingest_duration`** (Histogram) - - Bucketed histogram for the time needed to ingest an SST file into TiKV. - -- **`tikv_import_each_phase`** (Gauge) - - Indicates the running phase. Possible values are 1, meaning running inside the phase, and 0, meaning outside the phase. Labels: - - - **phase**: `prepare`/`import` - -- **`tikv_import_wait_store_available_count`** (Counter) - - Counts the number of times a TiKV node is found to have insufficient space when uploading SST files. Labels: - - - **store_id**: The TiKV store ID. - -### `tidb-lightning` +This section explains the monitoring metrics of `tidb-lightning`. Metrics provided by `tidb-lightning` are listed under the namespace `lightning_*`. diff --git a/tidb-lightning/tidb-lightning-command-line-full.md b/tidb-lightning/tidb-lightning-command-line-full.md index cb6995ffd024d..b3895427e3511 100644 --- a/tidb-lightning/tidb-lightning-command-line-full.md +++ b/tidb-lightning/tidb-lightning-command-line-full.md @@ -23,7 +23,6 @@ You can configure the following parameters using `tidb-lightning`: | `--backend ` | Select an import mode. `local` refers to [physical import mode](/tidb-lightning/tidb-lightning-physical-import-mode.md); `tidb` refers to [logical import mode](/tidb-lightning/tidb-lightning-logical-import-mode.md). | `tikv-importer.backend` | | `--log-file ` | Log file path. 
By default, it is `/tmp/lightning.log.{timestamp}`. If set to '-', it means that the log files will be output to stdout. | `lightning.log-file` | | `--status-addr ` | Listening address of the TiDB Lightning server | `lightning.status-port` | -| `--importer ` | Address of TiKV Importer | `tikv-importer.addr` | | `--pd-urls ` | PD endpoint address | `tidb.pd-addr` | | `--tidb-host ` | TiDB server host | `tidb.host` | | `--tidb-port ` | TiDB server port (default = 4000) | `tidb.port` | diff --git a/tidb-lightning/tidb-lightning-configuration.md b/tidb-lightning/tidb-lightning-configuration.md index c4f46f797b84e..858396c01c42a 100644 --- a/tidb-lightning/tidb-lightning-configuration.md +++ b/tidb-lightning/tidb-lightning-configuration.md @@ -53,10 +53,7 @@ enable-diagnose-logs = false # The maximum number of engines to be opened concurrently. # Each table is split into one "index engine" to store indices, and multiple # "data engines" to store row data. These settings control the maximum -# concurrent number for each type of engines. -# These values affect the memory and disk usage of tikv-importer. -# The sum of these two values must not exceed the max-open-engines setting -# for tikv-importer. +# concurrent number for each type of engines. Generally, you can use the following two default values. index-concurrency = 2 table-concurrency = 6 @@ -454,7 +451,6 @@ log-progress = "5m" | --backend *[backend](/tidb-lightning/tidb-lightning-overview.md)* | Select an import mode. `local` refers to the physical import mode; `tidb` refers to the logical import mode. | `local` | | --log-file *file* | Log file path. By default, it is `/tmp/lightning.log.{timestamp}`. If set to '-', it means that the log files will be output to stdout. | `lightning.log-file` | | --status-addr *ip:port* | Listening address of the TiDB Lightning server | `lightning.status-port` | -| --importer *host:port* | Address of TiKV Importer | `tikv-importer.addr` | | --pd-urls *host:port* | PD endpoint address | `tidb.pd-addr` | | --tidb-host *host* | TiDB server host | `tidb.host` | | --tidb-port *port* | TiDB server port (default = 4000) | `tidb.port` | diff --git a/tidb-lightning/tidb-lightning-faq.md b/tidb-lightning/tidb-lightning-faq.md index fb4414f8c27fb..5aaf8b97c702c 100644 --- a/tidb-lightning/tidb-lightning-faq.md +++ b/tidb-lightning/tidb-lightning-faq.md @@ -26,27 +26,8 @@ If only one table has an error encountered, the rest will still be processed nor ## How to properly restart TiDB Lightning? -If you are using Importer-backend, depending on the status of `tikv-importer`, the basic sequence of restarting TiDB Lightning is like this: - -If `tikv-importer` is still running: - -1. [Stop `tidb-lightning`](#how-to-stop-the-tidb-lightning-process). -2. Perform the intended modifications, such as fixing the source data, changing settings, replacing hardware etc. -3. If the modification previously has changed any table, [remove the corresponding checkpoint](/tidb-lightning/tidb-lightning-checkpoints.md#--checkpoint-remove) too. -4. Start `tidb-lightning`. - -If `tikv-importer` needs to be restarted: - -1. [Stop `tidb-lightning`](#how-to-stop-the-tidb-lightning-process). -2. [Stop `tikv-importer`](#how-to-stop-the-tikv-importer-process). -3. Perform the intended modifications, such as fixing the source data, changing settings, replacing hardware etc. -4. Start `tikv-importer`. -5. Start `tidb-lightning` *and wait until the program fails with CHECKSUM error, if any*. 
- * Restarting `tikv-importer` would destroy all engine files still being written, but `tidb-lightning` did not know about it. As of v3.0 the simplest way is to let `tidb-lightning` go on and retry. -6. [Destroy the failed tables and checkpoints](/tidb-lightning/troubleshoot-tidb-lightning.md#checkpoint-for--has-invalid-status-error-code) -7. Start `tidb-lightning` again. - -If you are using Local-backend or TiDB-backend, the operations are the same as those of using Importer-backend when the `tikv-importer` is still running. +1. [Stop the `tidb-lightning` process](#how-to-stop-the-tidb-lightning-process). +2. Start a new `tidb-lightning` task: execute the previous start command, such as `nohup tiup tidb-lightning -config tidb-lightning.toml`. ## How to ensure the integrity of the imported data? @@ -93,16 +74,6 @@ sql-mode = "STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION" ... ``` -## Can one `tikv-importer` serve multiple `tidb-lightning` instances? - -Yes, as long as every `tidb-lightning` instance operates on different tables. - -## How to stop the `tikv-importer` process? - -To stop the `tikv-importer` process, you can choose the corresponding operation according to your deployment method. - -- For manual deployment: if `tikv-importer` is running in foreground, press Ctrl+C to exit. Otherwise, obtain the process ID using the `ps aux | grep tikv-importer` command and then terminate the process using the `kill ${PID}` command. - ## How to stop the `tidb-lightning` process? To stop the `tidb-lightning` process, you can choose the corresponding operation according to your deployment method. @@ -122,12 +93,6 @@ With the default settings of 3 replicas, the space requirement of the target TiK - The space occupied by indices - Space amplification in RocksDB -## Can TiKV Importer be restarted while TiDB Lightning is running? - -No. TiKV Importer stores some information of engines in memory. If `tikv-importer` is restarted, `tidb-lightning` will be stopped due to lost connection. At this point, you need to [destroy the failed checkpoints](/tidb-lightning/tidb-lightning-checkpoints.md#--checkpoint-error-destroy) as those TiKV Importer-specific information is lost. You can restart TiDB Lightning afterwards. - -See also [How to properly restart TiDB Lightning?](#how-to-properly-restart-tidb-lightning) for the correct sequence. - ## How to completely destroy all intermediate data associated with TiDB Lightning? 1. Delete the checkpoint file. @@ -140,7 +105,7 @@ See also [How to properly restart TiDB Lightning?](#how-to-properly-restart-tidb If, for some reason, you cannot run this command, try manually deleting the file `/tmp/tidb_lightning_checkpoint.pb`. -2. If you are using Local-backend, delete the `sorted-kv-dir` directory in the configuration. If you are using Importer-backend, delete the entire `import` directory on the machine hosting `tikv-importer`. +2. If you are using Local-backend, delete the `sorted-kv-dir` directory in the configuration. 3. Delete all tables and databases created on the TiDB cluster, if needed. diff --git a/tidb-lightning/tidb-lightning-physical-import-mode.md b/tidb-lightning/tidb-lightning-physical-import-mode.md index 039a055407338..f7ff899fe05c3 100644 --- a/tidb-lightning/tidb-lightning-physical-import-mode.md +++ b/tidb-lightning/tidb-lightning-physical-import-mode.md @@ -69,7 +69,6 @@ It is recommended that you allocate CPU more than 32 cores and memory greater th - TiDB Lightning >= v4.0.3. - TiDB >= v4.0.0. 
-- If the target TiDB cluster is v3.x or earlier, you need to use Importer-backend to complete the data import. In this mode, `tidb-lightning` needs to send the parsed key-value pairs to `tikv-importer` via gRPC, and `tikv-importer` will complete the data import. ### Limitations diff --git a/tidb-lightning/troubleshoot-tidb-lightning.md b/tidb-lightning/troubleshoot-tidb-lightning.md index 28d67340bab72..cc7adf0855c6e 100644 --- a/tidb-lightning/troubleshoot-tidb-lightning.md +++ b/tidb-lightning/troubleshoot-tidb-lightning.md @@ -119,24 +119,6 @@ tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy= See the [Checkpoints control](/tidb-lightning/tidb-lightning-checkpoints.md#checkpoints-control) section for other options. -### `ResourceTemporarilyUnavailable("Too many open engines …: …")` - -**Cause**: The number of concurrent engine files exceeds the limit specified by `tikv-importer`. This could be caused by misconfiguration. Additionally, if `tidb-lightning` exited abnormally, an engine file might be left at a dangling open state, which could cause this error as well. - -**Solutions**: - -1. Increase the value of `max-open-engines` setting in `tikv-importer.toml`. This value is typically dictated by the available memory. This could be calculated by using: - - Max Memory Usage ≈ `max-open-engines` × `write-buffer-size` × `max-write-buffer-number` - -2. Decrease the value of `table-concurrency` + `index-concurrency` so it is less than `max-open-engines`. - -3. Restart `tikv-importer` to forcefully remove all engine files (default to `./data.import/`). This also removes all partially imported tables, which requires TiDB Lightning to clear the outdated checkpoints. - - ```sh - tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all - ``` - ### `cannot guess encoding for input file, please convert to UTF-8 manually` **Cause**: TiDB Lightning only recognizes the UTF-8 and GB-18030 encodings for the table schemas. This error is emitted if the file isn't in any of these encodings. It is also possible that the file has mixed encoding, such as containing a string in UTF-8 and another string in GB-18030, due to historical `ALTER TABLE` executions. @@ -164,9 +146,7 @@ See the [Checkpoints control](/tidb-lightning/tidb-lightning-checkpoints.md#chec TZ='Asia/Shanghai' bin/tidb-lightning -config tidb-lightning.toml ``` -2. When exporting data using Mydumper, make sure to include the `--skip-tz-utc` flag. - -3. Ensure the entire cluster is using the same and latest version of `tzdata` (version 2018i or above). +2. Ensure the entire cluster is using the same and latest version of `tzdata` (version 2018i or above). On CentOS, run `yum info tzdata` to check the installed version and whether there is an update. Run `yum upgrade tzdata` to upgrade the package. diff --git a/tidb-troubleshooting-map.md b/tidb-troubleshooting-map.md index abb9230c20eb5..a5a3a5dadb6d5 100644 --- a/tidb-troubleshooting-map.md +++ b/tidb-troubleshooting-map.md @@ -528,7 +528,7 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV** - If TiDB Lightning shares a server with other services (for example, Importer), you must manually set `region-concurrency` to 75% of the total number of CPU cores on that server. - If there is a quota on CPU (for example, limited by Kubernetes settings), TiDB Lightning might not be able to read this out. In this case, `region-concurrency` must also be manually reduced. 
- - Every additional index introduces a new KV pair for each row. If there are N indices, the actual size to be imported would be approximately (N+1) times the size of the [Mydumper](https://docs.pingcap.com/tidb/v4.0/mydumper-overview) output. If the indices are negligible, you may first remove them from the schema, and add them back via `CREATE INDEX` after the import is complete. + - Every additional index introduces a new KV pair for each row. If there are N indices, the actual size to be imported would be approximately (N+1) times the size of the [Dumpling](/dumpling-overview.md) output. If the indices are negligible, you may first remove them from the schema, and add them back via `CREATE INDEX` after the import is complete. - The version of TiDB Lightning is old. Try the latest version, which might improve the import speed. - 6.3.3 `checksum failed: checksum mismatched remote vs local`. @@ -537,7 +537,7 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV** - Cause 2: If the checksum of the target database is 0, which means nothing is imported, it is possible that the cluster is too hot and fails to take in any data. - - Cause 3: If the data source is generated by the machine and not backed up by [Mydumper](https://docs.pingcap.com/tidb/v4.0/mydumper-overview), ensure it respects the constrains of the table. For example: + - Cause 3: If the data source is generated by the machine and not backed up by [Dumpling](/dumpling-overview.md), ensure it respects the constrains of the table. For example: - `AUTO_INCREMENT` columns need to be positive, and do not contain the value "0". - UNIQUE and PRIMARY KEYs must not have duplicate entries. @@ -550,18 +550,13 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV** - Solution: See [Troubleshooting Solution](/tidb-lightning/troubleshoot-tidb-lightning.md#checkpoint-for--has-invalid-status-error-code). -- 6.3.5 `ResourceTemporarilyUnavailable("Too many open engines …: 8")` - - - Cause: The number of concurrent engine files exceeds the limit specified by tikv-importer. This could be caused by misconfiguration. In addition, even when the configuration is correct, if tidb-lightning has exited abnormally before, an engine file might be left at a dangling open state, which could cause this error as well. - - Solution: See [Troubleshooting Solution](/tidb-lightning/troubleshoot-tidb-lightning.md#resourcetemporarilyunavailabletoo-many-open-engines--). - -- 6.3.6 `cannot guess encoding for input file, please convert to UTF-8 manually` +- 6.3.5 `cannot guess encoding for input file, please convert to UTF-8 manually` - Cause: TiDB Lightning only supports the UTF-8 and GB-18030 encodings. This error means the file is not in any of these encodings. It is also possible that the file has mixed encoding, such as containing a string in UTF-8 and another string in GB-18030, due to historical ALTER TABLE executions. - Solution: See [Troubleshooting Solution](/tidb-lightning/troubleshoot-tidb-lightning.md#cannot-guess-encoding-for-input-file-please-convert-to-utf-8-manually). -- 6.3.7 `[sql2kv] sql encode error = [types:1292]invalid time format: '{1970 1 1 0 45 0 0}'` +- 6.3.6 `[sql2kv] sql encode error = [types:1292]invalid time format: '{1970 1 1 0 45 0 0}'` - Cause: A timestamp type entry has a time value that does not exist. This is either because of DST changes or because the time value has exceeded the supported range (from Jan 1, 1970 to Jan 19, 2038). 
From f18e5c80dcdacdb72cbafe2bc216b3aa26f13231 Mon Sep 17 00:00:00 2001 From: Ran Date: Mon, 13 Nov 2023 14:59:14 +0800 Subject: [PATCH 10/29] sql: pause and resume ddl jobs GA (#15294) --- ddl-introduction.md | 4 ++-- sql-statements/sql-statement-admin-pause-ddl.md | 6 ------ sql-statements/sql-statement-admin-resume-ddl.md | 4 ---- 3 files changed, 2 insertions(+), 12 deletions(-) diff --git a/ddl-introduction.md b/ddl-introduction.md index ec994786815b5..079089e448dd3 100644 --- a/ddl-introduction.md +++ b/ddl-introduction.md @@ -178,11 +178,11 @@ When TiDB is adding an index, the phase of backfilling data will cause read and If a completed DDL task is canceled, you can see the `DDL Job:90 not found` error in the `RESULT` column, which means that the task has been removed from the DDL waiting queue. -- `ADMIN PAUSE DDL JOBS job_id [, job_id]`: Used to pause the DDL jobs that are being executed. After the command is executed, the SQL statement that executes the DDL job is displayed as being executed, while the background job has been paused. For details, refer to [`ADMIN PAUSE DDL JOBS`](/sql-statements/sql-statement-admin-pause-ddl.md). (Experimental feature) +- `ADMIN PAUSE DDL JOBS job_id [, job_id]`: Used to pause the DDL jobs that are being executed. After the command is executed, the SQL statement that executes the DDL job is displayed as being executed, while the background job has been paused. For details, refer to [`ADMIN PAUSE DDL JOBS`](/sql-statements/sql-statement-admin-pause-ddl.md). You can only pause DDL tasks that are in progress or still in the queue. Otherwise, the `Job 3 can't be paused now` error is shown in the `RESULT` column. -- `ADMIN RESUME DDL JOBS job_id [, job_id]`: Used to resume the DDL tasks that have been paused. After the command is executed, the SQL statement that executes the DDL task is displayed as being executed, and the background task is resumed. For details, refer to [`ADMIN RESUME DDL JOBS`](/sql-statements/sql-statement-admin-resume-ddl.md). (Experimental feature) +- `ADMIN RESUME DDL JOBS job_id [, job_id]`: Used to resume the DDL tasks that have been paused. After the command is executed, the SQL statement that executes the DDL task is displayed as being executed, and the background task is resumed. For details, refer to [`ADMIN RESUME DDL JOBS`](/sql-statements/sql-statement-admin-resume-ddl.md). You can only resume a paused DDL task. Otherwise, the `Job 3 can't be resumed` error is shown in the `RESULT` column. diff --git a/sql-statements/sql-statement-admin-pause-ddl.md b/sql-statements/sql-statement-admin-pause-ddl.md index 75fd004a424a6..ac21d2d3bd6ee 100644 --- a/sql-statements/sql-statement-admin-pause-ddl.md +++ b/sql-statements/sql-statement-admin-pause-ddl.md @@ -9,10 +9,6 @@ summary: An overview of the usage of ADMIN PAUSE DDL JOBS for the TiDB database. You can use this statement to pause a DDL job that is issued but not yet completed executing. After the pause, the SQL statement that executes the DDL job does not return immediately, but looks like it is still running. If you try to pause a DDL job that has already been completed, you will see the `DDL Job:90 not found` error in the `RESULT` column, which indicates that the job has been removed from the DDL waiting queue. -> **Warning:** -> -> This feature is an experimental feature. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. 
If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. - ## Synopsis ```ebnf+diagram @@ -40,7 +36,6 @@ If the pause fails, the specific reason for the failure is displayed. > + This statement can pause a DDL job, but other operations and environment changes (such as machine restarts and cluster restarts) do not pause DDL jobs except for cluster upgrades. > + During the cluster upgrade, the ongoing DDL jobs are paused, and the DDL jobs initiated during the upgrade are also paused. After the upgrade, all paused DDL jobs will resume. The pause and resume operations during the upgrade are taken automatically. For details, see [TiDB Smooth Upgrade](/smooth-upgrade-tidb.md). > + This statement can pause multiple DDL jobs. You can use the [`ADMIN SHOW DDL JOBS`](/sql-statements/sql-statement-admin-show-ddl.md) statement to obtain the `job_id` of a DDL job. -> + If the job to be paused has already been completed or is about to be completed, the pause operation will fail. @@ -50,7 +45,6 @@ If the pause fails, the specific reason for the failure is displayed. > + This statement can pause a DDL job, but other operations and environment changes (such as machine restarts and cluster restarts) do not pause DDL jobs except for cluster upgrades. > + During the cluster upgrade, the ongoing DDL jobs are paused, and the DDL jobs initiated during the upgrade are also paused. After the upgrade, all paused DDL jobs will resume. The pause and resume operations during the upgrade are taken automatically. For details, see [TiDB Smooth Upgrade](https://docs.pingcap.com/tidb/stable/smooth-upgrade-tidb). > + This statement can pause multiple DDL jobs. You can use the [`ADMIN SHOW DDL JOBS`](/sql-statements/sql-statement-admin-show-ddl.md) statement to obtain the `job_id` of a DDL job. -> + If the job to be paused has already been completed or is about to be completed, the pause operation will fail. diff --git a/sql-statements/sql-statement-admin-resume-ddl.md b/sql-statements/sql-statement-admin-resume-ddl.md index a68a5ee135a7e..b5a7f65a5e50a 100644 --- a/sql-statements/sql-statement-admin-resume-ddl.md +++ b/sql-statements/sql-statement-admin-resume-ddl.md @@ -9,10 +9,6 @@ summary: An overview of the usage of ADMIN RESUME DDL for the TiDB database. You can use this statement to resume a paused DDL job. After the resume is completed, the SQL statement that executes the DDL job continues to show as being executed. If you try to resume a DDL job that has already been completed, you will see the `DDL Job:90 not found` error in the `RESULT` column, which indicates that the job has been removed from the DDL waiting queue. -> **Warning:** -> -> This feature is an experimental feature. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. 
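In practice, `ADMIN RESUME DDL JOBS` is used together with `ADMIN PAUSE DDL JOBS` and `ADMIN SHOW DDL JOBS`. A quick sketch, in which the job ID is hypothetical:

```sql
ADMIN SHOW DDL JOBS;       -- locate the job_id of the running DDL job, for example 90
ADMIN PAUSE DDL JOBS 90;   -- the background job stops; the DDL statement still appears to be running
ADMIN RESUME DDL JOBS 90;  -- the background job continues from where it was paused
```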
- ## Synopsis ```ebnf+diagram From a8b8f769a7c473a1c3b02d91a8366c21d9af61b4 Mon Sep 17 00:00:00 2001 From: Aolin Date: Mon, 13 Nov 2023 15:26:15 +0800 Subject: [PATCH 11/29] tidb: remove fast and incremental analyze (#15139) --- grafana-tidb-dashboard.md | 1 - releases/release-7.3.0.md | 2 +- sql-statements/sql-statement-analyze-table.md | 8 ++- .../sql-statement-show-variables.md | 1 - statistics.md | 52 ++++--------------- system-variables.md | 2 +- 6 files changed, 14 insertions(+), 52 deletions(-) diff --git a/grafana-tidb-dashboard.md b/grafana-tidb-dashboard.md index fc0d1b2a005df..3e6427576186f 100644 --- a/grafana-tidb-dashboard.md +++ b/grafana-tidb-dashboard.md @@ -168,7 +168,6 @@ The following metrics relate to requests sent to TiKV. Retry requests are counte - Store Query Feedback QPS: the number of operations per second to store the feedback information of the union query, which is performed in TiDB memory - Significant Feedback: the number of significant feedback pieces that update the statistics information - Update Stats OPS: the number of operations of updating statistics with feedback -- Fast Analyze Status 100: the status for quickly collecting statistical information ### Owner diff --git a/releases/release-7.3.0.md b/releases/release-7.3.0.md index 31aae31995387..dc3ab1152cc16 100644 --- a/releases/release-7.3.0.md +++ b/releases/release-7.3.0.md @@ -219,7 +219,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v7.3/quick-start-with- * TiDB - The [`Fast Analyze`](/system-variables.md#tidb_enable_fast_analyze) feature (experimental) for statistics will be deprecated in v7.5.0. - - The [incremental collection](/statistics.md#incremental-collection) feature for statistics will be deprecated in v7.5.0. + - The [incremental collection](https://docs.pingcap.com/tidb/v7.3/statistics#incremental-collection) feature for statistics will be deprecated in v7.5.0. ## Improvements diff --git a/sql-statements/sql-statement-analyze-table.md b/sql-statements/sql-statement-analyze-table.md index d921dd825fccc..a26f6d05ed40a 100644 --- a/sql-statements/sql-statement-analyze-table.md +++ b/sql-statements/sql-statement-analyze-table.md @@ -10,7 +10,7 @@ This statement updates the statistics that TiDB builds on tables and indexes. It TiDB will also automatically update its statistics over time as it discovers that they are inconsistent with its own estimates. -Currently, TiDB collects statistical information in two ways: full collection (implemented using the `ANALYZE TABLE` statement) and incremental collection (implemented using the `ANALYZE INCREMENTAL TABLE` statement). For detailed usage of these two statements, refer to [introduction to statistics](/statistics.md) +Currently, TiDB collects statistical information as a full collection by using the `ANALYZE TABLE` statement. For more information, see [Introduction to statistics](/statistics.md). ## Synopsis @@ -97,10 +97,8 @@ The statistics is now correctly updated and loaded. TiDB differs from MySQL in **both** the statistics it collects and how it makes use of statistics during query execution. While this statement is syntactically similar to MySQL, the following differences apply: -1. TiDB might not include very recently committed changes when running `ANALYZE TABLE`. After a batch-update of rows, you might need to `sleep(1)` before executing `ANALYZE TABLE` in order for the statistics update to reflect these changes. [#16570](https://github.com/pingcap/tidb/issues/16570). -2. 
`ANALYZE TABLE` takes significantly longer to execute in TiDB than MySQL. This performance difference can be partially mitigated by enabling fast analyze with `SET GLOBAL tidb_enable_fast_analyze=1`. Fast analyze makes use of sampling, leading to less accurate statistics. Its usage is still considered experimental. - -MySQL does not support the `ANALYZE INCREMENTAL TABLE` statement. TiDB supports incremental collection of statistics. For detailed usage, refer to [incremental collection](/statistics.md#incremental-collection). ++ TiDB might not include very recently committed changes when running `ANALYZE TABLE`. After a batch update of rows, you might need to `sleep(1)` before executing `ANALYZE TABLE` in order for the statistics update to reflect these changes. See [#16570](https://github.com/pingcap/tidb/issues/16570). ++ `ANALYZE TABLE` takes significantly longer to execute in TiDB than MySQL. ## See also diff --git a/sql-statements/sql-statement-show-variables.md b/sql-statements/sql-statement-show-variables.md index 1ba6b491ad561..04530f2edc9e3 100644 --- a/sql-statements/sql-statement-show-variables.md +++ b/sql-statements/sql-statement-show-variables.md @@ -57,7 +57,6 @@ mysql> SHOW GLOBAL VARIABLES LIKE 'tidb%'; | tidb_enable_cascades_planner | 0 | | tidb_enable_chunk_rpc | 1 | | tidb_enable_collect_execution_info | 1 | -| tidb_enable_fast_analyze | 0 | | tidb_enable_index_merge | 0 | | tidb_enable_noop_functions | 0 | | tidb_enable_radix_join | 0 | diff --git a/statistics.md b/statistics.md index de80c9c2a98f5..040e3594a5a77 100644 --- a/statistics.md +++ b/statistics.md @@ -73,8 +73,8 @@ Count-Min Sketch is a hash structure. When an equivalence query contains `a = 1` A hash collision might occur since Count-Min Sketch is a hash structure. In the `EXPLAIN` statement, if the estimate of the equivalent query deviates greatly from the actual value, it can be considered that a larger value and a smaller value have been hashed together. In this case, you can take one of the following ways to avoid the hash collision: -- Modify the `WITH NUM TOPN` parameter. TiDB stores the high-frequency (top x) data separately, with the other data stored in Count-Min Sketch. Therefore, to prevent a larger value and a smaller value from being hashed together, you can increase the value of `WITH NUM TOPN`. In TiDB, its default value is 20. The maximum value is 1024. For more information about this parameter, see [Full Collection](#full-collection). -- Modify two parameters `WITH NUM CMSKETCH DEPTH` and `WITH NUM CMSKETCH WIDTH`. Both affect the number of hash buckets and the collision probability. You can increase the values of the two parameters appropriately according to the actual scenario to reduce the probability of hash collision, but at the cost of higher memory usage of statistics. In TiDB, the default value of `WITH NUM CMSKETCH DEPTH` is 5, and the default value of `WITH NUM CMSKETCH WIDTH` is 2048. For more information about the two parameters, see [Full Collection](#full-collection). +- Modify the `WITH NUM TOPN` parameter. TiDB stores the high-frequency (top x) data separately, with the other data stored in Count-Min Sketch. Therefore, to prevent a larger value and a smaller value from being hashed together, you can increase the value of `WITH NUM TOPN`. In TiDB, its default value is 20. The maximum value is 1024. For more information about this parameter, see [Manual collection](#manual-collection). +- Modify two parameters `WITH NUM CMSKETCH DEPTH` and `WITH NUM CMSKETCH WIDTH`. 
Both affect the number of hash buckets and the collision probability. You can increase the values of the two parameters appropriately according to the actual scenario to reduce the probability of hash collision, but at the cost of higher memory usage of statistics. In TiDB, the default value of `WITH NUM CMSKETCH DEPTH` is 5, and the default value of `WITH NUM CMSKETCH WIDTH` is 2048. For more information about the two parameters, see [Manual collection](#manual-collection). ## Top-N values @@ -84,19 +84,12 @@ Top-N values are values with the top N occurrences in a column or index. TiDB re ### Manual collection -You can run the `ANALYZE` statement to collect statistics. +Currently, TiDB collects statistical information as a full collection. You can execute the `ANALYZE TABLE` statement to collect statistics. > **Note:** > -> The execution time of `ANALYZE TABLE` in TiDB is longer than that in MySQL or InnoDB. In InnoDB, only a small number of pages are sampled, while in TiDB a comprehensive set of statistics is completely rebuilt. Scripts that were written for MySQL may naively expect `ANALYZE TABLE` will be a short-lived operation. -> -> For quicker analysis, you can set `tidb_enable_fast_analyze` to `1` to enable the Quick Analysis feature. The default value for this parameter is `0`. -> -> After Quick Analysis is enabled, TiDB randomly samples approximately 10,000 rows of data to build statistics. Therefore, in the case of uneven data distribution or a relatively small amount of data, the accuracy of statistical information is relatively poor. It might lead to poor execution plans, such as choosing the wrong index. If the execution time of the normal `ANALYZE` statement is acceptable, it is recommended to disable the Quick Analysis feature. -> -> `tidb_enable_fast_analyze` is an experimental feature, which currently **does not match exactly** with the statistical information of `tidb_analyze_version=2`. Therefore, you need to set the value of `tidb_analyze_version` to `1` when `tidb_enable_fast_analyze` is enabled. - -#### Full collection +> - The execution time of `ANALYZE TABLE` in TiDB is longer than that in MySQL or InnoDB. In InnoDB, only a small number of pages are sampled, while in TiDB a comprehensive set of statistics is completely rebuilt. Scripts that were written for MySQL might mistakenly expect that `ANALYZE TABLE` will be a short-lived operation. +> - Starting from v7.5.0, the [Fast Analyze feature (`tidb_enable_fast_analyze`)](/system-variables.md#tidb_enable_fast_analyze) and the [incremental collection feature](https://docs.pingcap.com/tidb/v7.4/statistics#incremental-collection) for statistics are deprecated. You can perform full collection using the following syntax. @@ -140,7 +133,7 @@ The current sampling rate is calculated based on an adaptive algorithm. When you -##### Collect statistics on some columns +#### Collect statistics on some columns In most cases, when executing SQL statements, the optimizer only uses statistics on some columns (such as columns in the `WHERE`, `JOIN`, `ORDER BY`, and `GROUP BY` statements). These columns are called `PREDICATE COLUMNS`. 
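As a minimal sketch of narrowing collection to those columns, assuming a hypothetical table `t`:

```sql
-- Collect statistics only on columns that have appeared in query predicates,
-- plus indexed columns, instead of on all columns of the table.
ANALYZE TABLE t PREDICATE COLUMNS;
```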
@@ -272,7 +265,7 @@ SHOW COLUMN_STATS_USAGE WHERE db_name = 'test' AND table_name = 't' AND last_ana 3 rows in set (0.00 sec) ``` -##### Collect statistics on indexes +#### Collect statistics on indexes To collect statistics on all indexes in `IndexNameList` in `TableName`, use the following syntax: @@ -288,7 +281,7 @@ When `IndexNameList` is empty, this syntax collects statistics on all indexes in > > To ensure that the statistical information before and after the collection is consistent, when `tidb_analyze_version` is `2`, this syntax collects statistics on the entire table (including all columns and indexes), instead of only on indexes. -##### Collect statistics on partitions +#### Collect statistics on partitions - To collect statistics on all partitions in `PartitionNameList` in `TableName`, use the following syntax: @@ -318,7 +311,7 @@ When `IndexNameList` is empty, this syntax collects statistics on all indexes in ANALYZE TABLE TableName PARTITION PartitionNameList [COLUMNS ColumnNameList|PREDICATE COLUMNS|ALL COLUMNS] [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` -##### Collect statistics of partitioned tables in dynamic pruning mode +#### Collect statistics of partitioned tables in dynamic pruning mode When accessing partitioned tables in [dynamic pruning mode](/partitioned-table.md#dynamic-pruning-mode), TiDB collects table-level statistics, which is called GlobalStats. Currently, GlobalStats is aggregated from statistics of all partitions. In dynamic pruning mode, a statistics update of any partitioned table can trigger the GlobalStats to be updated. @@ -335,33 +328,6 @@ When accessing partitioned tables in [dynamic pruning mode](/partitioned-table.m > > - In dynamic pruning mode, the Analyze configurations of partitions and tables should be the same. Therefore, if you specify the `COLUMNS` configuration following the `ANALYZE TABLE TableName PARTITION PartitionNameList` statement or the `OPTIONS` configuration following `WITH`, TiDB will ignore them and return a warning. -#### Incremental collection - -To improve the speed of analysis after full collection, incremental collection could be used to analyze the newly added sections in monotonically non-decreasing columns such as time columns. - -> **Note:** -> -> + Currently, the incremental collection is only provided for index. -> + When using the incremental collection, you must ensure that only `INSERT` operations exist on the table, and that the newly inserted value on the index column is monotonically non-decreasing. Otherwise, the statistical information might be inaccurate, affecting the TiDB optimizer to select an appropriate execution plan. - -You can perform incremental collection using the following syntax. 
- -+ To incrementally collect statistics on index columns in all `IndexNameLists` in `TableName`: - - {{< copyable "sql" >}} - - ```sql - ANALYZE INCREMENTAL TABLE TableName INDEX [IndexNameList] [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; - ``` - -+ To incrementally collect statistics on index columns for partitions in all `PartitionNameLists` in `TableName`: - - {{< copyable "sql" >}} - - ```sql - ANALYZE INCREMENTAL TABLE TableName PARTITION PartitionNameList INDEX [IndexNameList] [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; - ``` - ### Automatic update diff --git a/system-variables.md b/system-variables.md index 96d8f027226d6..2d910173f75c8 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1982,7 +1982,7 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; > **Warning:** > -> Currently, `Fast Analyze` is an experimental feature. It is not recommended that you use it in production environments. +> Starting from v7.5.0, this variable is deprecated. - Scope: SESSION | GLOBAL - Persists to cluster: Yes From 2fb9f230cf278cebc9cd68bd367e4c1523b899fd Mon Sep 17 00:00:00 2001 From: Aolin Date: Mon, 13 Nov 2023 15:27:46 +0800 Subject: [PATCH 12/29] =?UTF-8?q?cdc:=20update=20description=20about=20par?= =?UTF-8?q?tition=20dispatcher=20and=20add=20column=20sel=E2=80=A6=20(#152?= =?UTF-8?q?78)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ticdc/ticdc-changefeed-config.md | 18 +++++++-- ticdc/ticdc-sink-to-kafka.md | 64 +++++++++++++++++++++++++++----- 2 files changed, 69 insertions(+), 13 deletions(-) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index 119cb574c2b7e..9c9e26d607989 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -121,10 +121,20 @@ enable-table-across-nodes = false # Note: When the downstream MQ is Pulsar, if the routing rule for `partition` is not specified as any of `ts`, `index-value`, `table`, or `default`, each Pulsar message will be routed using the string you set as the key. # For example, if you specify the routing rule for a matcher as the string `code`, then all Pulsar messages that match that matcher will be routed with `code` as the key. # dispatchers = [ -# {matcher = ['test1.*', 'test2.*'], topic = "Topic expression 1", partition = "ts" }, -# {matcher = ['test3.*', 'test4.*'], topic = "Topic expression 2", partition = "index-value" }, -# {matcher = ['test1.*', 'test5.*'], topic = "Topic expression 3", partition = "table"}, -# {matcher = ['test6.*'], partition = "ts"} +# {matcher = ['test1.*', 'test2.*'], topic = "Topic expression 1", partition = "index-value"}, +# {matcher = ['test3.*', 'test4.*'], topic = "Topic expression 2", partition = "index-value", index-name="index1"}, +# {matcher = ['test1.*', 'test5.*'], topic = "Topic expression 3", partition = "table"}, +# {matcher = ['test6.*'], partition = "columns", columns = "['a', 'b']"} +# {matcher = ['test7.*'], partition = "ts"} +# ] + +# column-selectors is introduced in v7.5.0 and only takes effect when the downstream is Kafka. +# column-selectors is used to select specific columns for replication. 
+# column-selectors = [ +# {matcher = ['test.t1'], columns = ['a', 'b']}, +# {matcher = ['test.*'], columns = ["*", "!b"]}, +# {matcher = ['test1.t1'], columns = ['column*', '!column1']}, +# {matcher = ['test3.t'], columns = ["column?", "!column1"]}, # ] # The protocol configuration item specifies the protocol format used for encoding messages. diff --git a/ticdc/ticdc-sink-to-kafka.md b/ticdc/ticdc-sink-to-kafka.md index d22f418e72fcc..ba5f5d30602dc 100644 --- a/ticdc/ticdc-sink-to-kafka.md +++ b/ticdc/ticdc-sink-to-kafka.md @@ -222,32 +222,78 @@ For example, for a dispatcher like `matcher = ['test.*'], topic = {schema}_{tabl ### Partition dispatchers -You can use `partition = "xxx"` to specify a partition dispatcher. It supports four dispatchers: `default`, `ts`, `index-value`, and `table`. The dispatcher rules are as follows: +You can use `partition = "xxx"` to specify a partition dispatcher. It supports five dispatchers: `default`, `index-value`, `columns`, `table`, and `ts`. The dispatcher rules are as follows: -- `default`: dispatches events in the `table` mode. -- `ts`: uses the commitTs of the row change to hash and dispatch events. -- `index-value`: uses the value of the primary key or the unique index of the table to hash and dispatch events. -- `table`: uses the schema name of the table and the table name to hash and dispatch events. +- `default`: uses the `table` dispatcher rule by default. It calculates the partition number using the schema name and table name, ensuring data from a table is sent to the same partition. As a result, the data from a single table only exists in one partition and is guaranteed to be ordered. However, this dispatcher rule limits the send throughput, and the consumption speed cannot be improved by adding consumers. +- `index-value`: calculates the partition number using either the primary key, a unique index, or an explicitly specified index, distributing table data across multiple partitions. The data from a single table is sent to multiple partitions, and the data in each partition is ordered. You can improve the consumption speed by adding consumers. +- `columns`: calculates the partition number using the values of explicitly specified columns, distributing table data across multiple partitions. The data from a single table is sent to multiple partitions, and the data in each partition is ordered. You can improve the consumption speed by adding consumers. +- `table`: calculates the partition number using the schema name and table name. +- `ts`: calculates the partition number using the commitTs of the row change, distributing table data across multiple partitions. The data from a single table is sent to multiple partitions, and the data in each partition is ordered. You can improve the consumption speed by adding consumers. However, multiple changes of a data item might be sent to different partitions and the consumer progress of different consumers might be different, which might cause data inconsistency. Therefore, the consumer needs to sort the data from multiple partitions by commitTs before consuming. 
+ +Take the following configuration of `dispatchers` as an example: + +```toml +[sink] +dispatchers = [ + {matcher = ['test.*'], partition = "index-value"}, + {matcher = ['test1.*'], partition = "index-value", index-name = "index1"}, + {matcher = ['test2.*'], partition = "columns", columns = ["id", "a"]}, + {matcher = ['test3.*'], partition = "table"}, +] +``` + +- Tables in the `test` database use the `index-value` dispatcher, which calculates the partition number using the value of the primary key or unique index. If a primary key exists, the primary key is used; otherwise, the shortest unique index is used. +- Tables in the `test1` table use the `index-value` dispatcher and calculate the partition number using values of all columns in the index named `index1`. If the specified index does not exist, an error is reported. Note that the index specified by `index-name` must be a unique index. +- Tables in the `test2` database use the `columns` dispatcher and calculate the partition number using the values of columns `id` and `a`. If any of the columns does not exist, an error is reported. +- Tables in the `test3` database use the `table` dispatcher. +- Tables in the `test4` database use the `default` dispatcher, that is the `table` dispatcher, as they do not match any of the preceding rules. + +If a table matches multiple dispatcher rules, the first matching rule takes precedence. > **Note:** > -> > Since v6.1.0, to clarify the meaning of the configuration, the configuration used to specify the partition dispatcher has been changed from `dispatcher` to `partition`, with `partition` being an alias for `dispatcher`. For example, the following two rules are exactly equivalent. > > ``` > [sink] > dispatchers = [ -> {matcher = ['*.*'], dispatcher = "ts"}, -> {matcher = ['*.*'], partition = "ts"}, +> {matcher = ['*.*'], dispatcher = "index-value"}, +> {matcher = ['*.*'], partition = "index-value"}, > ] > ``` > > However, `dispatcher` and `partition` cannot appear in the same rule. For example, the following rule is invalid. > > ``` -> {matcher = ['*.*'], dispatcher = "ts", partition = "table"}, +> {matcher = ['*.*'], dispatcher = "index-value", partition = "table"}, > ``` +## Column selectors + +The column selector feature supports selecting columns from events and sending only the data changes related to those columns to the downstream. + +Take the following configuration of `column-selectors` as an example: + +```toml +[sink] +column-selectors = [ + {matcher = ['test.t1'], columns = ['a', 'b']}, + {matcher = ['test.*'], columns = ["*", "!b"]}, + {matcher = ['test1.t1'], columns = ['column*', '!column1']}, + {matcher = ['test3.t'], columns = ["column?", "!column1"]}, +] +``` + +- For table `test.t1`, only columns `a` and `b` are sent. +- For tables in the `test` database (excluding the `t1` table), all columns except `b` are sent. +- For table `test1.t1`, any column starting with `column` is sent, except for `column1`. +- For table `test3.t`, any 7-character column starting with `column` is sent, except for `column1`. +- For tables that do not match any rule, all columns are sent. + +> **Note:** +> +> After being filtered by the `column-selectors` rules, the data in the table must have a primary key or unique key to be replicated. Otherwise, the changefeed reports an error when it is created or running. 
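These sink rules take effect once the configuration file is passed to a changefeed. A sketch, in which the server address, Kafka broker, topic name, and file name are placeholders:

```shell
cdc cli changefeed create \
    --server=http://127.0.0.1:8300 \
    --sink-uri="kafka://127.0.0.1:9092/topic-name?protocol=canal-json" \
    --config=changefeed.toml
```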
+ ## Scale out the load of a single large table to multiple TiCDC nodes This feature splits the data replication range of a single large table into multiple ranges, according to the data volume and the number of modified rows per minute, and it makes the data volume and the number of modified rows replicated in each range approximately the same. This feature distributes these ranges to multiple TiCDC nodes for replication, so that multiple TiCDC nodes can replicate a large single table at the same time. This feature can solve the following two problems: From db5883230ce65148d76532dad58a864277fe7dd5 Mon Sep 17 00:00:00 2001 From: Suhaha Date: Mon, 13 Nov 2023 15:47:16 +0800 Subject: [PATCH 13/29] ci: ja translation (#15336) --- .github/workflows/translation.yaml | 108 +++++++++++++++++++++++++++++ 1 file changed, 108 insertions(+) create mode 100644 .github/workflows/translation.yaml diff --git a/.github/workflows/translation.yaml b/.github/workflows/translation.yaml new file mode 100644 index 0000000000000..fd8ec27d49a84 --- /dev/null +++ b/.github/workflows/translation.yaml @@ -0,0 +1,108 @@ +name: translation + +on: + workflow_dispatch: + +jobs: + ja: + runs-on: ubuntu-latest + + steps: + - uses: actions/checkout@v3 + name: Download translator repo + with: + repository: "shczhen/markdown-translator" + ref: "openai" + path: "markdown-translator" + - uses: actions/checkout@v3 + name: Download docs repo and specified branch + with: + ref: "i18n-ja-release-7.1" + path: "docs" + - uses: actions/setup-node@v3 + name: Setup node 18 + with: + node-version: 18 + + - run: | + sudo apt install tree -y + + - name: Download files by comparing commits + run: | + export GH_TOKEN=${{github.token}} + cd docs + npm i + node scripts/filterUpdateFiles.js + tree tmp + cd .. + - name: Copy new files to translator folder + run: | + cp -r docs/tmp markdown-translator/markdowns + - name: Config and translate + run: | + cd markdown-translator + export LANGLINK_ACCESS_KEY=${{ secrets.LANGLINK_ACCESS_KEY }} + export LANGLINK_ACCESS_SECRET=${{ secrets.LANGLINK_ACCESS_SECRET }} + export LANGLINK_USER=${{ secrets.LANGLINK_USER }} + yarn + node src/index.js + cd .. + - name: Copy translated files to docs repo + run: | + cp -r markdown-translator/output/markdowns/* docs/ + + - name: Git commit and push + run: | + cd docs + git status + git config user.name github-actions + git config user.email github-actions@github.com + git add . + if git status | grep -q "Changes to be committed" + then + git commit -m "Update translated files" + echo "::set-output name=committed::1" + else + echo "No changes detected, skipped" + fi + - name: Set build ID + id: build_id + run: echo "::set-output name=id::$(date +%s)" + - name: Create PR + uses: peter-evans/create-pull-request@v5 + if: steps.git_commit.outputs.committed == 1 + with: + token: ${{ github.token }} + branch: jp-translation/${{ steps.build_id.outputs.id }} + title: "ci: JP translation ${{ steps.build_id.outputs.id }}" + body: | + ### What is changed, added or deleted? (Required) + + Translate docs to Japanese. + + ### Which TiDB version(s) do your changes apply to? (Required) + + + + **Tips for choosing the affected version(s):** + + By default, **CHOOSE MASTER ONLY** so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, **CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER**. 
+ + For details, see [tips for choosing the affected versions](https://github.com/pingcap/docs/blob/master/CONTRIBUTING.md#guideline-for-choosing-the-affected-versions). + + - [x] i18n-ja-release-7.1 (TiDB 7.1 versions) + + ### What is the related PR or file link(s)? + + + + - This PR is translated from: en + - Other reference link(s): + + ### Do your changes match any of the following descriptions? + + - [ ] Delete files + - [ ] Change aliases + - [ ] Need modification after applied to another branch + - [ ] Might cause conflicts after applied to another branch + delete-branch: true From ca54bcaa04f037d55e9b91d357d94a6d69502ce5 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Tue, 14 Nov 2023 11:38:14 +0800 Subject: [PATCH 14/29] tidb: `tidb_service_scope` GA (#15226) --- system-variables.md | 4 ---- tidb-distributed-execution-framework.md | 6 +++--- 2 files changed, 3 insertions(+), 7 deletions(-) diff --git a/system-variables.md b/system-variables.md index 2d910173f75c8..9bc82ff868e9a 100644 --- a/system-variables.md +++ b/system-variables.md @@ -4821,10 +4821,6 @@ SHOW WARNINGS; ### tidb_service_scope New in v7.4.0 -> **Warning:** -> -> This feature is an experimental feature. It is not recommended to use it in production environments. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. - - Scope: GLOBAL diff --git a/tidb-distributed-execution-framework.md b/tidb-distributed-execution-framework.md index 9838b72e36cef..469cd7d6ab555 100644 --- a/tidb-distributed-execution-framework.md +++ b/tidb-distributed-execution-framework.md @@ -118,10 +118,10 @@ Adjust the following system variables related to Fast Online DDL: * [`tidb_ddl_reorg_batch_size`](/system-variables.md#tidb_ddl_reorg_batch_size): use the default value. The recommended maximum value is `1024`. 3. Starting from v7.4.0, you can adjust the number of TiDB nodes that perform background tasks according to actual needs. After deploying a TiDB cluster, you can set the instance-level system variable [`tidb_service_scope`](/system-variables.md#tidb_service_scope-new-in-v740) for each TiDB node in the cluster. When `tidb_service_scope` of a TiDB node is set to `background`, the TiDB node can execute background tasks. When `tidb_service_scope` of a TiDB node is set to the default value "", the TiDB node cannot execute background tasks. If `tidb_service_scope` is not set for any TiDB node in a cluster, the TiDB distributed execution framework schedules all TiDB nodes to execute background tasks by default. - - > **Warning:** + > **Note:** > - > `tidb_service_scope` is an experimental feature. It is not recommended to use it in production environments. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. + > - During the execution of a distributed task, if some TiDB nodes are offline, the distributed task randomly assigns subtasks to available TiDB nodes instead of dynamically assigning subtasks according to `tidb_service_scope`. + > - During the execution of a distributed task, changes to the `tidb_service_scope` configuration will not take effect for the current task, but will take effect from the next task. 
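   For example, to dedicate a TiDB node to background tasks, connect to that node and run the following statement (a sketch; the only valid values are the empty string and `background`):

   ```sql
   SET GLOBAL tidb_service_scope = 'background';
   ```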
> **Tip:** > From b297f8bb6ca2377fb31790bedafbcb708bc0c325 Mon Sep 17 00:00:00 2001 From: Weizhen Wang Date: Tue, 14 Nov 2023 12:03:44 +0800 Subject: [PATCH 15/29] system-variable: add tidb_enable_async_merge_global_stats (#15225) --- system-variables.md | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/system-variables.md b/system-variables.md index 9bc82ff868e9a..80536ce0f6d7b 100644 --- a/system-variables.md +++ b/system-variables.md @@ -3637,16 +3637,21 @@ For a system upgraded to v5.0 from an earlier version, if you have not modified ### tidb_merge_partition_stats_concurrency -> **Warning:** -> -> The feature controlled by this variable is not fully functional in the current TiDB version. Do not change the default value. - - Scope: SESSION | GLOBAL - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Default value: `1` - This variable specifies the concurrency of merging statistics for a partitioned table when TiDB analyzes the partitioned table. +### tidb_enable_async_merge_global_stats New in v7.5.0 + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Type: Boolean +- Default value: `ON`. When you upgrade TiDB from a version earlier than v7.5.0 to v7.5.0 or a later version, the default value is `OFF`. +- This variable is used for TiDB to merge global statistics asynchronously to avoid OOM issues. + ### tidb_metric_query_range_duration New in v4.0 > **Note:** From 1e26662fbd7c1235761468c16825d27f4a86ca7a Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Tue, 14 Nov 2023 12:39:44 +0800 Subject: [PATCH 16/29] br: public parameter --ignore-stats to backup statistic data (#15222) --- br/br-snapshot-guide.md | 2 +- br/br-snapshot-manual.md | 26 +++++++++++++++++++++++++- faq/backup-and-restore-faq.md | 2 +- 3 files changed, 27 insertions(+), 3 deletions(-) diff --git a/br/br-snapshot-guide.md b/br/br-snapshot-guide.md index 13073355b8ba5..3827c356dc36e 100644 --- a/br/br-snapshot-guide.md +++ b/br/br-snapshot-guide.md @@ -141,7 +141,7 @@ Starting from BR v5.1.0, when you back up snapshots, BR backs up the **system ta **BR does not restore the following system tables:** -- Statistics tables (`mysql.stat_*`) +- Statistics tables (`mysql.stat_*`). But statistics can be restored. See [Back up statistics](/br/br-snapshot-manual.md#back-up-statistics). - System variable tables (`mysql.tidb` and `mysql.global_variables`) - [Other system tables](https://github.com/pingcap/tidb/blob/master/br/pkg/restore/systable_restore.go#L31) diff --git a/br/br-snapshot-manual.md b/br/br-snapshot-manual.md index 78e79151aad83..d7585da5142e7 100644 --- a/br/br-snapshot-manual.md +++ b/br/br-snapshot-manual.md @@ -12,6 +12,7 @@ This document describes the commands of TiDB snapshot backup and restore accordi - [Back up a database](#back-up-a-database) - [Back up a table](#back-up-a-table) - [Back up multiple tables with table filter](#back-up-multiple-tables-with-table-filter) +- [Back up statistics](#back-up-statistics) - [Encrypt the backup data](#encrypt-the-backup-data) - [Restore cluster snapshots](#restore-cluster-snapshots) - [Restore a database or a table](#restore-a-database-or-a-table) @@ -109,6 +110,29 @@ br backup full \ --log-file backupfull.log ``` +## Back up statistics + +Starting from TiDB v7.5.0, the `br` command-line tool introduces the `--ignore-stats` parameter. 
When you set this parameter to `false`, the `br` command-line tool supports backing up and restoring statistics of columns, indexes, and tables. In this case, you do not need to manually run the statistics collection task for the TiDB database restored from the backup, or wait for the completion of the automatic collection task. This feature simplifies the database maintenance work and improves the query performance. + +If you do not set this parameter to `false`, the `br` command-line tool uses the default setting `--ignore-stats=true`, which means statistics are not backed up during data backup. + +The following is an example of backing up cluster snapshot data and backing up table statistics with `--ignore-stats=false`: + +```shell +br backup full \ +--storage local:///br_data/ --pd "${PD_IP}:2379" --log-file restore.log \ +--ignore-stats=false +``` + +After backing up data with the preceding configuration, when you restore data, the `br` command-line tool automatically restores table statistics if table statistics are included in the backup: + +```shell +br restore full \ +--storage local:///br_data/ --pd "${PD_IP}:2379" --log-file restore.log +``` + +When the backup and restore feature backs up data, it stores statistics in JSON format within the `backupmeta` file. When restoring data, it loads statistics in JSON format into the cluster. For more information, see [LOAD STATS](/sql-statements/sql-statement-load-stats.md). + ## Encrypt the backup data > **Warning:** @@ -153,7 +177,7 @@ br restore full \ In the preceding command: -- `--with-sys-table`: BR restores **data in some system tables**, including account permission data and SQL bindings. However, it does not restore statistics tables (`mysql.stat_*`) and system variable tables (`mysql.tidb` and `mysql.global_variables`). For more information, see [Restore tables in the `mysql` schema](/br/br-snapshot-guide.md#restore-tables-in-the-mysql-schema). +- `--with-sys-table`: BR restores **data in some system tables**, including account permission data and SQL bindings, and statistics (see [Back up statistics](/br/br-snapshot-manual.md#back-up-statistics)). However, it does not restore statistics tables (`mysql.stat_*`) and system variable tables (`mysql.tidb` and `mysql.global_variables`). For more information, see [Restore tables in the `mysql` schema](/br/br-snapshot-guide.md#restore-tables-in-the-mysql-schema). - `--ratelimit`: The maximum speed **per TiKV** performing backup tasks. The unit is in MiB/s. - `--log-file`: The target file where the `br` log is written. diff --git a/faq/backup-and-restore-faq.md b/faq/backup-and-restore-faq.md index ab413fbaeaa12..fb3b489955947 100644 --- a/faq/backup-and-restore-faq.md +++ b/faq/backup-and-restore-faq.md @@ -292,7 +292,7 @@ br restore full -f 'mysql.usertable' -s $external_storage_url --with-sys-table Note that even if you configures [table filter](/table-filter.md#syntax), **BR does not restore the following system tables**: -- Statistics tables (`mysql.stat_*`) +- Statistics tables (`mysql.stat_*`). But statistics can be restored. See [Back up statistics](/br/br-snapshot-manual.md#back-up-statistics). 
- System variable tables (`mysql.tidb`, `mysql.global_variables`) - [Other system tables](https://github.com/pingcap/tidb/blob/master/br/pkg/restore/systable_restore.go#L31) From 0472399d0ed2511b9408c66fc57e238c7d45251e Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Tue, 14 Nov 2023 12:48:45 +0800 Subject: [PATCH 17/29] dm: add error binlog filter description (#15306) --- dm/dm-binlog-event-filter.md | 56 ++++++++++++++++++++++++++++++------ 1 file changed, 48 insertions(+), 8 deletions(-) diff --git a/dm/dm-binlog-event-filter.md b/dm/dm-binlog-event-filter.md index 5bcc5d35bc240..49e4b6ff893ab 100644 --- a/dm/dm-binlog-event-filter.md +++ b/dm/dm-binlog-event-filter.md @@ -5,7 +5,7 @@ summary: Learn how to use the binlog event filter feature of DM. # TiDB Data Migration Binlog Event Filter -TiDB Data Migration (DM) provides the binlog event filter feature to filter out or only receive specified types of binlog events for some schemas or tables. For example, you can filter out all `TRUNCATE TABLE` or `INSERT` events. The binlog event filter feature is more fine-grained than the [block and allow lists](/dm/dm-block-allow-table-lists.md) feature. +TiDB Data Migration (DM) provides the binlog event filter feature to filter out, block and report errors, or only receive specified types of binlog events for some schemas or tables. For example, you can filter out all `TRUNCATE TABLE` or `INSERT` events. The binlog event filter feature is more fine-grained than the [block and allow lists](/dm/dm-block-allow-table-lists.md) feature. ## Configure the binlog event filter @@ -40,6 +40,7 @@ When you use the wildcard for matching schemas and tables, note the following: | `all` | | Includes all the events below | | `all dml` | | Includes all DML events below | | `all ddl` | | Includes all DDL events below | + | `incompatible ddl changes` | | Includes all incompatible DDL events, where "incompatible DDL" means DDL operations that might cause data loss | | `none` | | Includes none of the events below | | `none ddl` | | Includes none of the DDL events below | | `none dml` | | Includes none of the DML events below | @@ -47,18 +48,39 @@ When you use the wildcard for matching schemas and tables, note the following: | `update` | DML | The `UPDATE` DML event | | `delete` | DML | The `DELETE` DML event | | `create database` | DDL | The `CREATE DATABASE` DDL event | - | `drop database` | DDL | The `DROP DATABASE` DDL event | + | `drop database` | incompatible DDL | The `DROP DATABASE` DDL event | | `create table` | DDL | The `CREATE TABLE` DDL event | | `create index` | DDL | The `CREATE INDEX` DDL event | - | `drop table` | DDL | The `DROP TABLE` DDL event | - | `truncate table` | DDL | The `TRUNCATE TABLE` DDL event | - | `rename table` | DDL | The `RENAME TABLE` DDL event | - | `drop index` | DDL | The `DROP INDEX` DDL event | + | `drop table` | incompatible DDL | The `DROP TABLE` DDL event | + | `truncate table` | incompatible DDL | The `TRUNCATE TABLE` DDL event | + | `rename table` | incompatible DDL | The `RENAME TABLE` DDL event | + | `drop index` | incompatible DDL | The `DROP INDEX` DDL event | | `alter table` | DDL | The `ALTER TABLE` DDL event | + | `value range decrease` | incompatible DDL | A DDL statement that decreases the value range of a column field, such as the `ALTER TABLE MODIFY COLUMN` statement that changes `VARCHAR(20)` to `VARCHAR(10)` | + | `precision decrease` | incompatible DDL | A DDL statement that decreases the precision of a column field, such as the `ALTER TABLE MODIFY COLUMN` 
statement that changes `Decimal(10, 2)` to `Decimal(10, 1)` | + | `modify column` | incompatible DDL | A DDL statement that changes the type of a column field, such as the `ALTER TABLE MODIFY COLUMN` statement that changes `INT` to `VARCHAR` | + | `rename column` | incompatible DDL | A DDL statement that changes the name of a column, such as the `ALTER TABLE RENAME COLUMN` statement | + | `rename index` | incompatible DDL | A DDL statement that changes the index name, such as the `ALTER TABLE RENAME INDEX` statement | + | `drop column` | incompatible DDL | A DDL statement that drops a column from a table, such as the `ALTER TABLE DROP COLUMN` statement | + | `drop index` | incompatible DDL | A DDL statement that drops an index in a table, such as the `ALTER TABLE DROP INDEX` statement | + | `truncate table partition` | incompatible DDL | A DDL statement that removes all data from a specified partition, such as the `ALTER TABLE TRUNCATE PARTITION` statement | + | `drop primary key` | incompatible DDL | A DDL statement that drops the primary key, such as the `ALTER TABLE DROP PRIMARY KEY` statement | + | `drop unique key` | incompatible DDL | A DDL statement that drops a unique key, such as the `ALTER TABLE DROP UNIQUE KEY` statement | + | `modify default value` | incompatible DDL | A DDL statement that modifies a column's default value, such as the `ALTER TABLE CHANGE DEFAULT` statement | + | `modify constraint` | incompatible DDL | A DDL statement that modifies the constraint, such as the `ALTER TABLE ADD CONSTRAINT` statement | + | `modify columns order` | incompatible DDL | A DDL statement that modifies the order of the columns, such as the `ALTER TABLE CHANGE AFTER` statement | + | `modify charset` | incompatible DDL | A DDL statement that modifies the charset of a column, such as the `ALTER TABLE MODIFY CHARSET` statement | + | `modify collation` | incompatible DDL | A DDL statement that modifies a column collation, such as the `ALTER TABLE MODIFY COLLATE` statement | + | `remove auto increment` | incompatible DDL | A DDL statement that removes an auto-incremental key | + | `modify storage engine` | incompatible DDL | A DDL statement that modifies the table storage engine, such as the `ALTER TABLE ENGINE = MyISAM` statement | + | `reorganize table partition` | incompatible DDL | A DDL statement that reorganizes partitions in a table, such as the `ALTER TABLE REORGANIZE PARTITION` statement | + | `rebuild table partition` | incompatible DDL | A DDL statement that rebuilds the table partition, such as the `ALTER TABLE REBUILD PARTITION` statement | + | `exchange table partition` | incompatible DDL | A DDL statement that exchanges a partition between two tables, such as the `ALTER TABLE EXCHANGE PARTITION` statement | + | `coalesce table partition` | incompatible DDL | A DDL statement that decreases the number of partitions in a table, such as the `ALTER COALESCE PARTITION` statement | - `sql-pattern`: it is used to filter specified DDL SQL statements. The matching rule supports using a regular expression. For example, `"^DROP\\s+PROCEDURE"`. -- `action`: the string (`Do`/`Ignore`). Based on the following rules, it judges whether to filter. If either of the two rules is satisfied, the binlog is filtered; otherwise, the binlog is not filtered. +- `action`: the string (`Do`/`Ignore`/`Error`). Based on the rules, it judges as follows: - `Do`: the allow list. The binlog is filtered in either of the following two conditions: - The type of the event is not in the `event` list of the rule. 
@@ -66,7 +88,12 @@ When you use the wildcard for matching schemas and tables, note the following: - `Ignore`: the block list. The binlog is filtered in either of the following two conditions: - The type of the event is in the `event` list of the rule. - The SQL statement of the event can be matched by `sql-pattern` of the rule. - - When multiple rules match the same table, the rules are applied sequentially. The block list has a higher priority than the allow list. For example, if both the `Ignore` and `Do` rules are applied to the same table, the `Ignore` rule takes effect. + - `Error`: the error list. The binlog reports an error in either of the following two conditions: + - The type of the event is in the `event` list of the rule. + - The SQL statement of the event can be matched by `sql-pattern` of the rule. + - When multiple rules match the same table, the rules are applied sequentially. The block list has a higher priority than the error list, and the error list has a higher priority than the allow list. For example: + - If both the `Ignore` and `Error` rules are applied to the same table, the `Ignore` rule takes effect. + - If both the `Error` and `Do` rules are applied to the same table, the `Error` rule takes effect. ## Usage examples @@ -148,3 +175,16 @@ filters: sql-pattern: ["ALTER\\s+TABLE[\\s\\S]*ADD\\s+PARTITION", "ALTER\\s+TABLE[\\s\\S]*DROP\\s+PARTITION"] action: Ignore ``` + +### Report errors on some DDL statements + +If you need to block and report errors on DDL statements generated by some upstream operations before DM replicates them to TiDB, you can use the following settings: + +```yaml +filters: + filter-procedure-rule: + schema-pattern: "test_*" + table-pattern: "t_*" + events: ["truncate table", "truncate table partition"] + action: Error +``` \ No newline at end of file From de48d2fb6524902cab50638201b71f361995a36d Mon Sep 17 00:00:00 2001 From: Weizhen Wang Date: Tue, 14 Nov 2023 12:50:14 +0800 Subject: [PATCH 18/29] system-variable: update variable for analyze (#15224) --- system-variables.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/system-variables.md b/system-variables.md index 80536ce0f6d7b..9b1717a6cd5af 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1076,7 +1076,7 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a - Scope: SESSION | GLOBAL - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No -- Default value: `1` +- Default value: `2`. The default value is `1` for v7.4.0 and earlier versions. - This variable specifies the concurrency of reading and writing statistics for a partitioned table when TiDB analyzes the partitioned table. ### tidb_analyze_version New in v5.1.0 @@ -1301,12 +1301,24 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Integer -- Default value: `4` +- Default value: `2`. The default value is `4` for v7.4.0 and earlier versions. - Range: `[1, 256]` - Unit: Threads - This variable is used to set the concurrency of executing the `ANALYZE` statement. - When the variable is set to a larger value, the execution performance of other queries is affected. 
+### tidb_build_sampling_stats_concurrency New in v7.5.0 + +- Scope: GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Type: Integer +- Unit: Threads +- Default value:`2` +- Range: `[1, 256]` +- This variable is used to set the sampling concurrency in the `ANALYZE` process. +- When the variable is set to a larger value, the execution performance of other queries is affected. + ### tidb_capture_plan_baselines New in v4.0 - Scope: GLOBAL From ac720c641704e9a6bc7165aa726124312a676d83 Mon Sep 17 00:00:00 2001 From: Weizhen Wang Date: Tue, 14 Nov 2023 15:15:45 +0800 Subject: [PATCH 19/29] update analyze concurrency for statistics (#15286) --- media/analyze_concurrency.png | Bin 0 -> 38342 bytes statistics.md | 22 ++++++++++++++++++---- 2 files changed, 18 insertions(+), 4 deletions(-) create mode 100644 media/analyze_concurrency.png diff --git a/media/analyze_concurrency.png b/media/analyze_concurrency.png new file mode 100644 index 0000000000000000000000000000000000000000..76fd603d223e3ab76618aa29fa5911dcb2936d3a GIT binary patch literal 38342 zcmeFZS6GwX5;kl{L{y6Oq5?sr2?|IHC@LM4jx;5JQX?Rp1d*b2=^aFRHz0(XC_<3l z2|WVRl0c*+fzZAn@BYuey^sFuJ9!V_;Nf9CYt77>b2CKbIG!osb-RuK#0Zy_dv9aPoIjtHjuL$j3i(lgRercucYWh~czFpAW?#AjU=@ zW5~jSxPq;)LRVVpVAq3AoH~E)?SJ`Ib5^IM86bS)zr8~pK{+OvGxNmBGgp-V@4o`i zo#LP3_yxQ5KR(GnwS4-&PwM>LE7xX6PsA89{Fl#BCv$D~G4FqWL-|wS30n9$@3Qd! zB7>vJ(8^H$#~Y{UW&eFGW(S0S9FzBEAd-^4_n_p58lE#)V9O8Y3qNyafajk$JdT;+e`k3YCgWsz5HJ->io5S<+s6RrxlDI{oifn zXfkhuIU)c3jT8Tx)L%>b*QEY6slQJ3UoXXR{*BPAbmMD^@@e6J9L%TUZ@LkuE=h?7 z6EjamZoawrkAr3lJ+VG{+dwx#eNS4!OyluiU-L=(RAyLVl}=qxraa*=L)M*A2tC-$ zKD?jq%j$9e=#}N#Z+@8FnSgu#+so+7Tw{mXjiLi(#K_H_JrmA=e~TkVSIWQsl{{2v-JaX~_s`Q;Qsl{ss(#J!;p^e8W{HZu z_pBa8*#2|CUCTeQZlr~rVO*LC1FzzL8ur}NkST2}<{rIF%Ry;We_m-jl2q!%}Ua{sWS(Lu1z|>;5>}E(1 zjg@R4TBuK+us7qgWYwP!)HM4$iDx(d%+d9cE&bQef}O)9@NL!?<&gp-&bCr)OJ?^z(MJpW81^j&x*aqVu)h{PGunuh%y}d`Gal~%m*_2w%j5b-bw#(7_lykj ztsAXi;*<&9@prh+HZbbi#=aJwIeS2eAYi77}I1M2BiUm1Au41a&! zP9Hes&#;+90i=R$=s+?E0B}4t?=Puw;lc5FLp@en!Ry}CA0v^lV{s>+af>OGRIh7Z z`Wbi{nc*T=z&{DyjxY4Bw(DEdv&t|3t3`dSUn45-eR(PKn(YG9zkk5mQlvrQQnp1S zqFQWA?{8_%9^1mTE5gcqEP{{rSe9D#jvu{wdTq(up;K_m+)I1Ehm7cpceYT_v%f=? z$?daefA?l7)nobQ9eVX$kPFWUw!w;0=_Mu zHTNPi*L1#!)Cugmm;E4C@*l8st(ZDbA;ERC!I{S|j;Hxs=p#r$YD94JH=i?&hnM`D ztsVtG-txa%`1$NPX-j{V)FT9Sy@`ALp0oWPue3GofA((diS=}*tZnCq`>{R8FzNQy zvu+WE1->oLWy=p6wN&K@O`Cp)^M||R26*FP_M-)Vnm={85#VPP>RE;0JjlU z-;JBil-7F_5yR0+Q!d-|~&mxWn>emaLIu-e_n}3S&zTU+nHB+DFv0j7n%_bp? 
[base85 image data for media/analyze_concurrency.png omitted]
z{R#@YuWS6^)Yb?bvj%AkyTn5D^?hDLbEi6E%BO)+JF>6beg=RDHY*~?IID*I=&vu@ zV#v(YpGG=15Y+~7-T=eR;89ey4>F9sS$yfA%MaI$n#T+4zuq~)OmG|LCSyz*SZ>MPm5 zv`3qo3mZ0B7w-s;;{KHGi{<2az;wGRIzyl8oPR5TQ@p=T=qbZfvcOlozzq$kRy0{({tuNR4?I*c8>O<+*Re_>x*g z-B$l=ZAJzG>q*@zb&G76AVl!mn6j(#%UUZzW+P75)XTDtHcMPD5P#phO7U;nYrvr_ zKH(tE2b!E2y7eixZ@BEN>O!!l9$FvNZzZm7HdLyR5`~tyecVHYdp~+~M!REVHFe*q zSv{j(?9(pb#&XKU)w%s`myLs^?N#GJB5!&{BSt#sK0fj1mXp>>f3T0F@qhbKn`#JY|ytto9(|EeOPKC_nrU0!)N<}!aW*c6z*=6QC z>|lT9G;nmW?yF1PHIQj9EMlc1 z>?ik>9b;ixS-5uOv#GCl47u@?uJib@66C;zGgPQx+Zzg{uqh6S1 z*L3|EsdSM~RU1ehG)<)s@ZFkwv zu=TO_^wYu%I;3B)3m*um*Qvj-UGmxVx)b`|@^o&W$6)1Foh=21%W&xB_U#^QhjH;4 zi&$phIeBcC#4Tv1vr0T}nxRK#ZY{Bx2DFoyF6$&^rmt?+X+2EY)H3Bxi&m^cIPQh9 zt2)UjO6arDpfUuBYzw$9D)3Y^eGb247byuu+!HbAOR2?r)>+@*EHSQMaaf5fjYwG{ z0PvnZxq<3D2KyiUw0Ns_U*aX~va9_U8^p)E5{Lp25$wGI86&i_YUI;b>|v%`hzMC! zo|gQjuoZX?=*O_saDYd;`TCP(b1)$QVmajThI?M3J-XP+N#PZNuaOD`yM|XZO2^ zT3F}Ck4Tw3o*a-gvrEjf#3foW23I?-;~B{yzo?N)T7PAQeoy(VD3%TtKT8EhuCBQ~FBtl$<%{MFkSx2Fy zN26fBu?^>p-OJEdfFx63+j7KLU2@tY;>`w#JB!HbYEfq{BSo82_~OE>GDCm#szVM; z&-Ub|9EVRciW(wI4;MyhEl_(^Zsj%_KW}+l|9VYpxHR1Q%SE%r+c1zNC#(BFk(CO7 zR%UMDHLt7aeSh=5s2YH-GLo^IoC&{GL%n=(U_ctKJRGgcP{qwoQ%hO*WlHnZ^3bJ$ zexhOGq^p~+J)l4%f6b95*!A~yWzqM=aIaTmyRy3B+?0Zlr&InZM{F3P7*N}|BqZ=@ z%}Nfr!$=fdB93X5w3~ZbvYRksg_J=$$Tzwek3=k#()|>^0EEob?-x!2s*wvU&kd4a zwg7BU^>_H054}Ww7|CkkQ>0t95FKz|&((?XL()uc3{ZL-pl#x?$7m6g92QebVoS=_ zI_c9h81*PpurgNLql^*BfXrX}k)c`|(Mc{_YjOt#v@wTmVWs|5K3zz6^%SC{dr(y( zWKsCc)Z9<2Msc*9qT&B$wFbSF?GjP%!V#+TmoDSQ7Gvk%PQMhrQ*JMI@zsrfe&OKP zz$%cc(l`L)(nvs5YIhDfze}XDVicW@yt?p5qvK%;bCB}YKrVpi)^8Mc1Wrt%_Pzxw zXYo@z3{cQ2ZRpl-qpRXaO=_MOenuHE&Uf$+MsrW@9twkRds8ceNKpMc4NYn{&@<|N zvm8U$C}Z7HR=MKN8n!o6o(8mSGb3VJI!JHSd0ou0-=E5e6xe!7lBO1nnSq*bQ7_!K zmc9Wt>2*Eb{}P1=tbJkr%ldMgcm^+jSzx>tx}WSz$10Zb6?H@$@($DjIO558Vt6e% zEYq=KXUMv0e;m5`QY(9)(R;S7&9V8gC?Ke9rrD2DnH|Sl|IPd`pT3z|XP-Q6>decR#s~C_uA{p zbt0484}V-qu|BmpvRg2*krdOD4%fnRr2x8)Iy0o(E}bv=%Tn^uQW*g9BtQ#f0=~bS zX`$8`m?8Y<^si81FQWB`zr<2&jQ@yZrR2g-8C6o#mLB+f{gTcOcw57a$N$$MJO9kr zkg4aeH!JNC0h)l-u*H(ycMH^xL&V!A2bZPZL*R#XJzgBO2MwaKXN2skvI~C8KP7z? 
z4c_4R+9hT-jEuFTR--dG>P$a!3-2mev;4Gp{x{z>@F|~-4#7|sUcPFJur*lwFGETo)2R&#^G>P+2=#OAmyH#b3*4RS{*p=hK&qqEuB2AgRN#UOSYnRH`1dn-4WQjTx z?~aIFZrlFlT3J2+Z?p3Mqh{rQsc!z|C;FG4=wDJPDi^`0e@UhO-uM5@1okg|+c7oA zzw~YYe{$|oDGht&58urkZU-D1!+rUWdGvlh_+@oKIQ&eevVG!eMZ^x~*F`vtkqq;U zE|HtW>!wWqp&X%dQiL$Mw1QW5x&NWb`xsB<4IDPmt?Ln!L*JkNeu*1$r|{sh*z9-_k0}RVRa5yK-3eEaHT6S!GRN9~2B2~tN{ukN z3@S$BfEzNHUFZ>4Xr1kJDE63U!mRJ^H_0C962UGl5C6pn@E@WBhhXz#-m^r~OPl9=Dy!4$`yM6;)Bzev?~+kkuvsINI}218aK z^_>@*iVR)pEt;A2(>&`s)QBSzq+~Y@m zDKfs>uwJ54N68ys&+_Mp{}cGqtol{&vyPoDuc$3to6R3^?qt{;{6}&ff>d+cXe^ zM^@Wv{~Ax^dC2y7`l>Uv_{{3H+$0;TdjXh)tlC^El4Z{xUVToWs&X5f+_~)juH!5+ zRB*w4t4NyL)gsEFKB8Y-o!3lh!4d73wd-6D2T{)z*Hcm!Jlxg!q>w}Ta1Q9 z#&WnZDr?<598IK7B26_%o$L=l*y4vEzff_zb_3wGkJ-T$3uk%khBS}a4FyR<+in!K zD{C6UFZ@8Cy|0A_uzXU9-c!|cPhS3jSH1t<`1t&7^#l{(Dl&chv;2tNFDi8EIKA`W0P@G;GVFhl zfB{@#J@J?KJ=(hYvVynqUbjxFiMjs}P;x38PZJf!3Y_`=0vYe@hGOyJ(`aQEepQJr zz(Q`dOppd@%(fSXKi5fe1e^7Efy{HdV-`}$w+>`!SmOUDanEBwW5zvjJm zTC3Up{K~y;1N|ewGDQ~hT0J3Z^r=p)~|5F-(-xL5vzsvr5 z-@ibp|9sxCF8~>Tw3_ALL*#$D0s!fYsl_?|)oed`iMn?G(Wvv-@ZUFH&=LdMGsq9W zIKT7Xm%i<;{+q!&=K->7|K~>jj0X5jo>O=-u_Gpj@9)Kt4Uk8yaxo?(;lg%{0*|nO46im(r=HA^q)qbd<-H!T z>7lZGa@=e5^vK~bm;OA&1{oX6ymPebg@$*hB~&1Ii;T?@YNZT|{j}WdBgxjACqt&A z4=fG29QEr6O3U_7kD>HSz94nY1Z)7NW|?;`j_ljLeEnsa2Vae zNHC0sx7nZ)gQ=Ce6B}d#8129QZK@;vNGfcu!LMR2oxHnnM23o9U&Y3p9LB`_N?QYV z2nY$Ntam1w*)Ut*j*Y5EP&)KM_y(sBrSS$T4wN(L`cZxjHD&YPUvKdkI7TV%hoXRP z1o4tU*5chbe1a|GnDDRy-MUwUcT3~d9w5-Zw`YIE<@y>8HU$mGSl$__bm2*F{ML-` zhFEL&-yJyyZq;P2xa1?uQ6G8?KfcVnw;kTYF0jw@8$|Hi!0V{F_;W5sTi%6{{u)e* zC9-K_tfE4^!rjs13QA>IZqObT?+ys>UrCsO+zyZgxw)4q-mV|Lg-x*UO>xR-3~F0a zfVxc27@JWpMz3_71^CkPXM6qV_B=6rmt03?hoEkc|5MPatyl_ti|(emfd|P%(Teq9 z_RYw^W_PL9aV&|RXW^9La}#B8ydRS+7t)%thM}eEJy>SVP82PRi(DVp9Qu{Me&m|h z3{r%Cm=@0n?JRyM!wc~ zT{f4o7BY&!uP^8twVCTvI0&ciYuT`g943uqRXW}Pa-7tU{-VR&+*TK`#b%d)YXJ%T zs5eMl`$WH9LI~xy6}0uOovWFD)i6>Z4Pf0ZJv=K{B7!#*p*{I>_~(s4tvktV4tB4! 
z2lWacr3DXcyyNzm^NY^@9u6EJ$Ni4so8!DPywBGXNaCugnkBJEKgZT{s40gR_r0_% zmh8y9rO(yo`>b^7A@702ZTNg|o?lbPYKQ7mu0->RKh5w|0DpX%>B--wQ4j~bQY3!X z=V68CKF_(qvzF~+SlQ_Tz;|hjThHmh9k`I)z(Y#`xSJKn*@1peO$yied=-(fPFj z=x?_rQP~gVwqD4XclxbVK@GpQZVSnX*y7j}*Q3NMaQdgBy-GqV-t&YySCYpP&^Duy z5Un&3$ei5@K_;PlG<;Tg95L!X2@M$h#`dyar8(h;dS1A9Co4J))K7rizpsu!8vT74 zlc)Dqzl1?93GyK714#LK;H;`kNXA?mI*2*+!i=c7-Wet{GOFscG8~X$dJq572;BG$ zIA>U|#k4s%7E73ofI_8E*m!rcfd)BI!Lj@H-NAT=oXou(Xi}?Ej6g`;qK7wF5O=jdb7 zaydwg|9<~WuT3;xi!ZOTmzF1h6K@gihTL>)q7UG=?e;_ zsMyC40Z{q4#)LiX^R7Hf=$09p{^7M51=&wlTu!H~;P~P%bmDR{TFYaFB#bfpTfM5FZ~7^Xpj(W{e)jw_%V^etsZ`~V9O)_VqtJ|#s&A0y;gFM3P-!A>dXA^ zNiy;Z*0xlSpGmYdX?d;-)>xPs&TI15E3y-%J+b8MN9dr8jtix3xwW63@|+*o33(aV zqk(B%iAr&TPe$3|k^?VE#T@gJU%irqiTJU%p#1w0HJLkeidNXSZGBN0NCCC-%LqhI zCijaCS4_&{Ltxe zi}@qln=y!8M?RSWE4_)~Z)hpDb8?{{ztqQl*njs-a|ltlck4B+aqC9?!RM+wM1*9d z6NN+uY*DM7Oc-`ln_q!b)(sOfnDF|fbexCSs#Br@dqg>T$xKhDZwlly2P^4rHt8*Q zpTCj=RT(vt$tMnF`j1p{Fr-8RhatOC{ij-ZQ|$|ohtFLe?Sq4iSK-9`14SIb;fl`Z z_cuUmm}T)Xcp1L-KyFMO$79uo!>@YE{j(~%&bYhGtkJ(}{ zU1U4X5E4Ykd{as}_PQam$nPK@j*YZ$3IOGzbGBBu%BbR=RP8+i{I#}45Wz7-BjH-+ z%}dhj0SXhJdd|`Fj@BRE3t_gF1YhbB6vFppums|by_SWzdFHFxHz;;Xl46$~76Qd9 zi@#olV)begmU%4jpI)1xhLlDgP3bT^mOjUO&hblV$Os%rF7J%t1jV@V;5N?55gTgJ z3;~!=%IQRUFed~li%#;JW32=sAf~Zzy}bEjLlz%x*YG#>?|h)Krf5cB7?xH_FqhDV z#Ojp0_8!oz|6H|OVQ|DQJd-;JtDSG-IoIGrS7N4SnL#F$+Q(XE8Z69VJA1J5SR)hf zon7e_bK|p-k45%`&c1JoHLDHsOP)rzJaxjrd#xSTEV!;OGiEbeRhh*nz~Og?Hy^be z_z~U9c@5iN71~J^W-Y1JXALo!Nel?Rrm6PK%(pb3BLB*FKqs9_Yrs}Ush^J)WPx6Y z>@ZwCidE$c#RWZZYF$$C!3x01pB=YaqB7@SghyHUVCWBaooVgYO~Yxx8xUq?O?FrJ z9+Q8^oFijVx;IblTuo1LDwSLjv^Yzs5!X z9=0u&CfJLVTg9&}a6Ebo!fgph6$m^XWsJy<67a&6;y^TWF707X@q|o%M(~KJ4@Q-7 zj+Wuprz<`>LqrOM`1#K6k}y$Be2n^NJ3RQ0@r+s`N^mET7NlA8l&4ofXmGblv93UO!; z)wcy*BV03LbtHM3_|?du@mp2)-u6)NZM`rfK9lW(6mbV0*^2mJG0*Wq%)T#(UM{P2URrz2F41Uh!79NupgM;>*^e(- z_+T4pgK52KX@*v#%k3#p=3F)O^v%%(n5qw^0S{KCy|gf7BH@1a^-Kl9>K3_2Rw;C2Q;@I;3zFsdN`BHRj{ZUMeNsb`2aiBgJGZvI4dRS5EO?a1t zc*gnq8cW#Y320qESd9!E7c)6CDdw>j(UOv~ufk5E|N3%x!EQX7s1hEpKdOJSgwZnj zkRc}KcMr%9746w}wa<3TRkp6Rez!0eO@ z^i!fHgeF;^$L?u<$s1MK#|6?pU@Vl4K+nN5!1Bx7J^Ev5B2KvJ)40 z+O9$1^g?7Pu2!Q`@67At^!jr+#S1smzV^sYPuO~%noxORzB>Ntix7lu_3je4dtGd6 zXaH8x@v-{)1^c+AWg!0=N!YA_o~eNJO3gmXv{D#7Yl+$H;i;#$&GKd?4_S)a!N57g z3^VW*8vQ5`Q+tyu0XJVKb)=3HYXx|~u(8m^h$1~h4BRrVW>JL`dom2gVcq|f>Cq`) z_M8}ED{qjwU4q!thN@eWSJ8`BT1y);elK8?jIpQZ+7*?KaPQ^IDpz^)BjHb6YvpYU z{QYXwKvs4th*-9Up# zvDtv|Y`k@jlXlR4>$S5!SU^j!;&Oly@EE`EJLANo@?kkLWyg#lAt|5Vd$950a-u+G zZy@3|_e66i6zDf}}OQ@)Ww(mi!K1V8O7mOMXiW5ZK zVSfFnCU{YjM96+Q6cRxD2AX5-4il4$OEF&5FeY##g>#u zX|TQ3a0mNgR8^VGq-kr)&qjbG&d*-)Fl$$)*!0_m)$2BE_z1ACVd+v8<~4!wBf<_# zsva-5JCF544EMAL)9QS7luaWFu0Nh@eB0usalJA6Ei%P`)6S*n?5fTL`7k=gqIh4ApgZbIa2p)8~A1T2G?YV@p@eK^Mb zQr6O$`od2wmrC=O*9GP6T$c_l`Qh1Y+gUe>UofQ?Ur$abcl2?vbTN+U%qfWE|uH8lA{mM0mNAC_B;Q1fFo;nrcu>p{ahJKQY! 
z`6WxddS@5Z)*^}Lk~|#bR?M{)wltAW=|v-1l#dI#Nkl!9=~FHcoNsrUUd`+lAHGNo zf^>&MOHD(*ZPgOic8QE(Y_ggLj?XH^mp1mKo;6W@ z_hRp!+-Nr))z(8HcbRSNK@ZNx$@fCk5uVIV+T*i<8?v23N|T8k8YlBJOTx?*e={k; z(Yyp_S`5==GKJKYm6ZcMXBRi2(~W>Yj>Almmw5%@h`{qlunt2!FPl)C#wd~bGrEr{ z(xQBR#Y6$~KC|qD8{{elO`KP0$Bn)o>8AR)OyOtVb~nB9_g@)xT*LdNWv0+%(TViT zH_u3G(HTZ+eZN7f+89E*QUaF&`Dt^0$|9W*z1AtNye=JdG|^oF#h5BC#kPkxlJY+o zW#~A$;jt}e6te3}bDckqZ8jTr#~8GmkZaMjnN@N9jtmVXz4mR*Y_$)zR0hvoZDJ{t81fdptQPKg zQcpp~u6;+Gc|H+aX{LK5f)d2S^SlZ+1A-M#pdW?B#nK z3{QOUxN;>xac7`3=+IZCjhpiE)T-I|{RBQRUxJj+OY!Ve9jGei8An<4TLez{mbTWA zSz8rdduvoKPj8mye0?5;52+LuT3oz5L;_la(=Jj_!PNv$9Mm#J@Ea>x>Tn~f|tJs z(?;jrHEY|pVP-`>TCz5i(a>lBsI@{^)+1_tglhoDoW{baSxOn6{NfZ32S{FH5A#?5MdqU_oGScD#JUgLD zMdsgh{Gcn5yXm$Q#*W&}CB7Hnqj(iDQg@HFW#@EwperIEWEwjaUh!SJ;1HlqUTa}# z(d0N4m%`=Hxa$!Kq?y~nbxh?ht-KM+Rvs|~)Y)P8y3wh3g{s}4JcPKZZ)?4*wo)~e zTgU3fSX-XsJ8-6vg&#Gz5+~huAalhXIcImkdRaywsUw;$rhL7@bGEa&F|^#vvcYqV z*X`M((E_>oGq4qNzy&|ucER)c+?HL_OOrQM->r`55jm4e4%TemT1Pe3&>zGCo)*G4 z?Qh%Bwhl`a-Er59PxPV@grU3mxzfxswP&y@-M;CiJ?({q$O{< znx=3yuk3HX9kbTHa?!ZzJUVb|I#4>!)iR9(*SojaaJ(kFVRR8JUqaVl;#F4 z!bCeI!d2R8HQwk)Vm(T7S}b9RBn_l5CJqE;Z0iM#KMPth8Q*Uk))ZKhlOgK0dIn~k zRj7&KGRRKvEdvxF?CHI`@5`jw6!UNIVdl%3sfEWE##>AA=s=&EYg;aCOxaJ}RGC66za+Qb@x)XeXTw{2hb zmWv24ojuLh*6-(@)<)q={Q5hQV1CXP|Lt32*48=?A{qsCR}V$HqkK%znkE$)D{WpJ zXeQ}lDd_eaAwz4O#9Jq;a6fc0Y>F(+{X}|X$+QUWX$)5N@!6X6cBwyIgAR3Ag^P?Y z@|z_kPAaqQ+LwMcO#K|MD7<5DG@Xm1s&P=GGpl?t@{p?dh_ON*5RbQH-atmgB>l)1 zOkZ5}X@kUIBlhY7sE4Ic&cd3`vOc5JpRA9&1I7^+ta9527^h-)FJf&V4T+tLI#P4U zLwswAEu&7W^_S8Bk!yX!u}Pq`(bD};Jbl-z^R?nkL1N4nq`pm02cDmQdtG3Z8vHWe z_H{LlUj@y=5_?6dtiFy8O;aGmfU)Jvxn)LJuZ58l15N;k2~?vE)nXNn@c7Nw zi;XU(aEJ}=)vo7?Msn2;!~>e3k<4J)U7MTJgXEgQ2{7Pj;-x&(c_qN=zM{o_eNXhc zjp$RwO;{s=HXwdn`KBCisblxVi;kAglh9vJ_zFOkFNT1siq>nyo}z-A5^oT0Jn6(7=JEOGHW17-P zlaDGM|mIjL=&ntBZbd*dfcWL ziacLfjD7jamETuVPO9+oSIynvvV#M+vTRbBuiPo?U+&B`_sG0i&XyW@zIW9+WS|?*Z~=`_sJb>yD;Ti+E$-=PFIcRlNz-c7fPTJhv-x9Ml$n@WGDd$oG*uSMB^K9 z|CTJXewiKpOej+zl$wMBU2jj>adM|;FXUwCt=9CcL1c-#hjS{gk-UqT<*7xMerW`$ z7wxk!9ojqU$~Kr03^F|VxpLD7eh>RzPkkJB{D5!IO&J)e9JK|lzPY@qL$q|>C zzHO0SPw^VfM7p#-C_Q!zL!v%LaS8*rtQ+QwjNTCAa?wYV`y0>ML{eMeudF*>^sPQS zQ$!pJHSE)5F*S+RWJ}d^D^w0482Q#{Q-gb%vC+wX#p9n9In{*3og=$Q_7?FDQL$F_ zUJS6@HEut`s0>mj#?5CXlE+5>RjH{mofL@{h<=!h2ny{>G>@_ax zodW{OT8ret1wmcsNEwo5c;eNoYZfE0ogng)*)Hx`{IlsPbp^B1iD-x2gb$Ym&0D0# zN=nEHA0HBxra=`0b0m{H{^IbVd=My=%OmI;Cw}5Kq9N4t`{swQ>tFYh(_AzpL6~YI zcX|HyYoO_kt_m|^n~E;=N=Jt6XZN*^?AZbqEmG5B<8C$$yT}sT9wQO^yHeCJne+J- zwX}mqvd4%lY}-fNz+S8&0w12WVTmn2;&MkUCh|ijj|r|4CQPAFnQ?n-Xnvd*zfvgi z1LMmXRe^{hOt&%rnkja(%5m0LH!ifdlnt|!6g7y_mf7LVATA&j6Y#!`x>Qwk}ZEeT`wYi`SYq`Oz zwi{S?`x>r>g_09$vdE(TQ%zj$ekJ=qQ!SODHn=hxNFKfh1zWwT_|m_ervha(2L3{owd_nVB*dnzZuK|cVMy^ z0gC>p>iPi90di-8s})gvfOS>NGTAEL0L3g{$6bQC@NziWA&P>j< zd^^{*AEs7XXzV#o3~$o4lesl`Pto+K*S&StMF5c?$>XzgJFfKy+1|IC{_-(b{Zsffi672v$`(z|yjQvu zT3vBGvppXP=BlL#3BChf0SvX_@x0tg5^_vJc8y`IXEJe;YGyE709H z6O!8KUt+R@ki@}6B{MIqDDurhlIt1_59w^7N_sf&5Ivw{g__}?sULsu?0Ps_Ip!E@ ziiEHhTy=DmG6K>?Kho*l1Rya|KEO8Ja-(n@MlEb2PMmk@DM9JCMpzsOjv36-~X()tMKz^NT<> zvihsZ&h_`01YOx(hZFiJp?-`OSF*_Uh@D1MYIBna`JXYXr#F zy~;OWRzzjF`Be?M3H|%;wYh;r#0BKtN#Ir*Eb2E~(bq?Zwx~0gK_UK{adKpS<`-{R z5`INMp#OU?|J4SlE1fhRXecG%~bP=G*5kR^mdrEkuRpYtAgY2>Kx6bD@m8VJgF03wh$@`cXy8s z7RXdXK!Y7A2Cmr2$;r8iiVOz6SD-YeJl+27^_M{_n_v6ZySp>it7|3}AHMMGy;a(g z*L(3R0Owp~4LJ?jTSb_4cIhbW3Lv@F5fpCr$;1Gg5x{ZtD6Z49Vhb7x?Vwy-qo!0x z#Yt*)U@v_dnnWsR*dY8Mf(jHJV(f-Q<`Z2xtqvGU@Om!wyeMT6do#H)&}+p5j>#KB zE{sI3VSoY%Lm2UeMqf;50NY-Dc=Tyr@?z`7s>P4}Jp*}b=gDp@CiuftGs7iLkFqRx zOBIRZ1?{rdWm$^Vq&F9-p&vH1N1;qq2xK!c#>Ce!)XNu9 
zvciy~)w`3kP)$NkQo;!)LJtdYtm&HIgU8!j;)7fg!e#&{k(aSg6eE}x*qsn%I(ak6sQJ= zmo24Fd!F4*@@rj#=@~fK8K?fqxB#`-CEzf{sXB|(?CDlv=8Vwo+N5A|DK`bzEz?wg zl9d4+3-NYCI{r6Hzoz~+roygdwqpmI=3A|MBaD?E)`pQy_;Nm&xv?>6MumTkP*M3) zF83WdbM(2VW+jeqsNZ s&0m_|KW6if+5DG_`~S_`yl3sGKT6+Ox`!F|8}O%n)!<6$r8^J)A9_6vX#fBK literal 0 HcmV?d00001 diff --git a/statistics.md b/statistics.md index 040e3594a5a77..e22fa955770c7 100644 --- a/statistics.md +++ b/statistics.md @@ -398,19 +398,33 @@ For more information on the `KILL` statement, see [`KILL`](/sql-statements/sql-s ### Control `ANALYZE` concurrency -When you run the `ANALYZE` statement, you can adjust the concurrency using the following parameters, to control its effect on the system. +When you run the `ANALYZE` statement, you can adjust the concurrency using system variables, to control its effect on the system. + +The relationships of the relevant system variables are shown below: + +![analyze_concurrency](/media/analyze_concurrency.png) + +`tidb_build_stats_concurrency`, `tidb_build_sampling_stats_concurrency`, and `tidb_analyze_partition_concurrency` are in an upstream-downstream relationship, as shown in the preceding diagram. The actual total concurrency is: `tidb_build_stats_concurrency` * (`tidb_build_sampling_stats_concurrency` + `tidb_analyze_partition_concurrency`). When modifying these variables, you need to consider their respective values at the same time. It is recommended to adjust them one by one in the order of `tidb_analyze_partition_concurrency`, `tidb_build_sampling_stats_concurrency`, `tidb_build_stats_concurrency`, and observe the impact on the system. The larger the values of these three variables, the greater the resource overhead on the system. #### `tidb_build_stats_concurrency` -Currently, when you run the `ANALYZE` statement, the task is divided into multiple small tasks. Each task only works on one column or index. You can use the `tidb_build_stats_concurrency` parameter to control the number of simultaneous tasks. The default value is `4`. +When you run the `ANALYZE` statement, the task is divided into multiple small tasks. Each task only works on statistics of one column or index. You can use the [`tidb_build_stats_concurrency`](/system-variables.md#tidb_build_stats_concurrency) variable to control the number of simultaneous small tasks. The default value is `2`. The default value is `4` for v7.4.0 and earlier versions. + +#### `tidb_build_sampling_stats_concurrency` + +When analyzing ordinary columns, you can use [`tidb_build_sampling_stats_concurrency`](/system-variables.md#tidb_build_sampling_stats_concurrency-new-in-v750) to control the concurrency of executing sampling tasks. The default value is `2`. + +#### `tidb_analyze_partition_concurrency` + +When running the `ANALYZE` statement, you can use [`tidb_analyze_partition_concurrency`](/system-variables.md#tidb_analyze_partition_concurrency) to control the concurrency of reading and writing statistics for a partitioned table. The default value is `2`. The default value is `1` for v7.4.0 and earlier versions. #### `tidb_distsql_scan_concurrency` -When you analyze regular columns, you can use the `tidb_distsql_scan_concurrency` parameter to control the number of Region to be read at one time. The default value is `15`. +When you analyze regular columns, you can use the [`tidb_distsql_scan_concurrency`](/system-variables.md#tidb_distsql_scan_concurrency) variable to control the number of Regions to be read at one time. The default value is `15`. 
Note that changing the value will affect query performance. Adjust the value carefully. #### `tidb_index_serial_scan_concurrency` -When you analyze index columns, you can use the `tidb_index_serial_scan_concurrency` parameter to control the number of Region to be read at one time. The default value is `1`. +When you analyze index columns, you can use the [`tidb_index_serial_scan_concurrency`](/system-variables.md#tidb_index_serial_scan_concurrency) variable to control the number of Regions to be read at one time. The default value is `1`. Note that changing the value will affect query performance. Adjust the value carefully. ### Persist ANALYZE configurations From aa5ecb271b43328264028665beddba17ff451ec7 Mon Sep 17 00:00:00 2001 From: Xiang Zhang Date: Tue, 14 Nov 2023 16:23:15 +0800 Subject: [PATCH 20/29] remove TiSpark from offline package (#15300) --- binary-package.md | 2 -- download-ecosystem-tools.md | 1 - 2 files changed, 3 deletions(-) diff --git a/binary-package.md b/binary-package.md index 1489b525943f6..9bada3c1e8e48 100644 --- a/binary-package.md +++ b/binary-package.md @@ -53,8 +53,6 @@ The `TiDB-community-toolkit` package contains the following contents. | dm-master-{version}-linux-{arch}.tar.gz | | | dmctl-{version}-linux-{arch}.tar.gz | | | br-{version}-linux-{arch}.tar.gz | | -| spark-{version}-any-any.tar.gz | | -| tispark-{version}-any-any.tar.gz | | | package-{version}-linux-{arch}.tar.gz | | | bench-{version}-linux-{arch}.tar.gz | | | errdoc-{version}-linux-{arch}.tar.gz | | diff --git a/download-ecosystem-tools.md b/download-ecosystem-tools.md index 28f04b7808ff0..fe65a624f7cc1 100644 --- a/download-ecosystem-tools.md +++ b/download-ecosystem-tools.md @@ -48,7 +48,6 @@ Depending on which tools you want to use, you can install the corresponding offl | [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) | `pump-{version}-linux-{arch}.tar.gz`
`drainer-{version}-linux-{arch}.tar.gz`
`binlogctl`
`reparo` | | [Backup & Restore (BR)](/br/backup-and-restore-overview.md) | `br-{version}-linux-{arch}.tar.gz` | | [sync-diff-inspector](/sync-diff-inspector/sync-diff-inspector-overview.md) | `sync_diff_inspector` | -| [TiSpark](/tispark-overview.md) | `tispark-{tispark-version}-any-any.tar.gz`
`spark-{spark-version}-any-any.tar.gz` | | [PD Recover](/pd-recover.md) | `pd-recover-{version}-linux-{arch}.tar` | > **Note:** From feffa9b58f82eb3bc7f6724f327009016899fc54 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Wed, 15 Nov 2023 14:35:15 +0800 Subject: [PATCH 21/29] *: refine placement rule in sql docs (#15231) --- TOC-tidb-cloud.md | 1 + TOC.md | 1 + placement-rules-in-sql.md | 523 ++++++++++++------ releases/release-6.3.0.md | 2 +- releases/release-6.6.0.md | 8 +- .../sql-statement-alter-placement-policy.md | 1 + sql-statements/sql-statement-alter-range.md | 32 ++ .../sql-statement-create-placement-policy.md | 1 + 8 files changed, 383 insertions(+), 186 deletions(-) create mode 100644 sql-statements/sql-statement-alter-range.md diff --git a/TOC-tidb-cloud.md b/TOC-tidb-cloud.md index 5faace65cbac3..f44ecfee672c1 100644 --- a/TOC-tidb-cloud.md +++ b/TOC-tidb-cloud.md @@ -338,6 +338,7 @@ - [`ALTER INDEX`](/sql-statements/sql-statement-alter-index.md) - [`ALTER INSTANCE`](/sql-statements/sql-statement-alter-instance.md) - [`ALTER PLACEMENT POLICY`](/sql-statements/sql-statement-alter-placement-policy.md) + - [`ALTER RANGE`](/sql-statements/sql-statement-alter-range.md) - [`ALTER RESOURCE GROUP`](/sql-statements/sql-statement-alter-resource-group.md) - [`ALTER TABLE`](/sql-statements/sql-statement-alter-table.md) - [`ALTER TABLE COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) diff --git a/TOC.md b/TOC.md index e65985ace8f4b..f83b17bb98b91 100644 --- a/TOC.md +++ b/TOC.md @@ -710,6 +710,7 @@ - [`ALTER INDEX`](/sql-statements/sql-statement-alter-index.md) - [`ALTER INSTANCE`](/sql-statements/sql-statement-alter-instance.md) - [`ALTER PLACEMENT POLICY`](/sql-statements/sql-statement-alter-placement-policy.md) + - [`ALTER RANGE`](/sql-statements/sql-statement-alter-range.md) - [`ALTER RESOURCE GROUP`](/sql-statements/sql-statement-alter-resource-group.md) - [`ALTER TABLE`](/sql-statements/sql-statement-alter-table.md) - [`ALTER TABLE COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) diff --git a/placement-rules-in-sql.md b/placement-rules-in-sql.md index d8a475a9cfc10..2e35f8828ffc2 100644 --- a/placement-rules-in-sql.md +++ b/placement-rules-in-sql.md @@ -5,276 +5,363 @@ summary: Learn how to schedule placement of tables and partitions using SQL stat # Placement Rules in SQL -Placement Rules in SQL is a feature that enables you to specify where data is stored in a TiKV cluster using SQL interfaces. Using this feature, tables and partitions are scheduled to specific regions, data centers, racks, or hosts. This is useful for scenarios including optimizing a high availability strategy with lower cost, ensuring that local replicas of data are available for local stale reads, and adhering to data locality requirements. +Placement Rules in SQL is a feature that enables you to specify where data is stored in a TiKV cluster using SQL statements. With this feature, you can schedule data of clusters, databases, tables, or partitions to specific regions, data centers, racks, or hosts. + +This feature can fulfill the following use cases: + +- Deploy data across multiple data centers and configure rules to optimize high availability strategies. +- Merge multiple databases from different applications and isolate data of different users physically, which meets the isolation requirements of different users within an instance. +- Increase the number of replicas for important data to improve application availability and data reliability. 
> **Note:** > -> - This feature is not available on [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless) clusters. -> - The implementation of *Placement Rules in SQL* relies on the *placement rules feature* of PD. For details, refer to [Configure Placement Rules](https://docs.pingcap.com/zh/tidb/stable/configure-placement-rules). In the context of Placement Rules in SQL, *placement rules* might refer to *placement policies* attached to other objects, or to rules that are sent from TiDB to PD. +> This feature is not available on [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless) clusters. -The detailed user scenarios are as follows: +## Overview -- Merge multiple databases of different applications to reduce the cost on database maintenance -- Increase replica count for important data to improve the application availability and data reliability -- Store new data into NVMe storage and store old data into SSDs to lower the cost on data archiving and storage -- Schedule the leaders of hotspot data to high-performance TiKV instances -- Separate cold data to lower-cost storage mediums to improve cost efficiency -- Support the physical isolation of computing resources between different users, which meets the isolation requirements of different users in a cluster, and the isolation requirements of CPU, I/O, memory, and other resources with different mixed loads +With the Placement Rules in SQL feature, you can [create placement policies](#create-and-attach-placement-policies) and configure desired placement policies for data at different levels, with granularity from coarse to fine as follows: -## Specify placement rules +| Level | Description | +|------------------|--------------------------------------------------------------------------------------| +| Cluster | By default, TiDB configures a policy of 3 replicas for a cluster. You can configure a global placement policy for your cluster. For more information, see [Specify the number of replicas globally for a cluster](#specify-the-number-of-replicas-globally-for-a-cluster). | +| Database | You can configure a placement policy for a specific database. For more information, see [Specify a default placement policy for a database](#specify-a-default-placement-policy-for-a-database). | +| Table | You can configure a placement policy for a specific table. For more information, see [Specify a placement policy for a table](#specify-a-placement-policy-for-a-table). | +| Partition | You can create partitions for different rows in a table and configure placement policies for partitions separately. For more information, see [Specify a placement policy for a partitioned table](#specify-a-placement-policy-for-a-partitioned-table). | -To specify placement rules, first create a placement policy using [`CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-create-placement-policy.md): +> **Tip:** +> +> The implementation of *Placement Rules in SQL* relies on the *placement rules feature* of PD. For details, refer to [Configure Placement Rules](https://docs.pingcap.com/zh/tidb/stable/configure-placement-rules). In the context of Placement Rules in SQL, *placement rules* might refer to *placement policies* attached to other objects, or to rules that are sent from TiDB to PD. 
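Before going into the details, the following minimal sketch shows how a policy created once can be applied at several of these levels (the policy, database, and table names are illustrative):

```sql
-- Create a placement policy.
CREATE PLACEMENT POLICY p_example PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1";

-- Apply it at the database level; tables created later without their own
-- policy inherit it.
ALTER DATABASE test PLACEMENT POLICY=p_example;

-- Apply it explicitly at the table level.
CREATE TABLE test.t1 (a INT) PLACEMENT POLICY=p_example;

-- Check the scheduling progress of the attached policies.
SHOW PLACEMENT;
```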
-```sql -CREATE PLACEMENT POLICY myplacementpolicy PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1"; -``` +## Limitations -Then attach the policy to a table or partition using either `CREATE TABLE` or `ALTER TABLE`. Then, the placement rules are specified on the table or the partition: +- To simplify maintenance, it is recommended to limit the number of placement policies within a cluster to 10 or fewer. +- It is recommended to limit the total number of tables and partitions attached with placement policies to 10,000 or fewer. Attaching policies to too many tables and partitions can increase computation workloads on PD, thereby affecting service performance. +- It is recommended to use the Placement Rules in SQL feature according to examples provided in this document rather than using other complex placement policies. -```sql -CREATE TABLE t1 (a INT) PLACEMENT POLICY=myplacementpolicy; -CREATE TABLE t2 (a INT); -ALTER TABLE t2 PLACEMENT POLICY=myplacementpolicy; -``` +## Prerequisites -A placement policy is not associated with any database schema and has the global scope. Therefore, assigning a placement policy does not require any additional privileges over the `CREATE TABLE` privilege. +Placement policies rely on the configuration of labels on TiKV nodes. For example, the `PRIMARY_REGION` placement option relies on the `region` label in TiKV. -To modify a placement policy, you can use [`ALTER PLACEMENT POLICY`](/sql-statements/sql-statement-alter-placement-policy.md), and the changes will propagate to all objects assigned with the corresponding policy. + -```sql -ALTER PLACEMENT POLICY myplacementpolicy FOLLOWERS=5; +When you create a placement policy, TiDB does not check whether the labels specified in the policy exist. Instead, TiDB performs the check when you attach the policy. Therefore, before attaching a placement policy, make sure that each TiKV node is configured with correct labels. The configuration method for a TiDB Self-Hosted cluster is as follows: + +``` +tikv-server --labels region=,zone=,host= ``` -To drop policies that are not attached to any table or partition, you can use [`DROP PLACEMENT POLICY`](/sql-statements/sql-statement-drop-placement-policy.md): +For detailed configuration methods, see the following examples: -```sql -DROP PLACEMENT POLICY myplacementpolicy; -``` +| Deployment method | Example | +| --- | --- | +| Manual deployment | [Schedule replicas by topology labels](/schedule-replicas-by-topology-labels.md) | +| Deployment with TiUP | [Geo-distributed deployment topology](/geo-distributed-deployment-topology.md) | +| Deployment with TiDB Operator | [Configure a TiDB cluster in Kubernetes](https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-a-tidb-cluster#high-data-high-availability) | -## View current placement rules +> **Note:** +> +> For TiDB Dedicated clusters, you can skip these label configuration steps because the labels on TiKV nodes in TiDB Dedicated clusters are configured automatically. -If a table has placement rules attached, you can view the placement rules in the output of [`SHOW CREATE TABLE`](/sql-statements/sql-statement-show-create-table.md). To view the definition of the policy available, execute [`SHOW CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-show-create-placement-policy.md): + -```sql -tidb> SHOW CREATE TABLE t1\G -*************************** 1. 
row *************************** - Table: t1 -Create Table: CREATE TABLE `t1` ( - `a` int(11) DEFAULT NULL -) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![placement] PLACEMENT POLICY=`myplacementpolicy` */ -1 row in set (0.00 sec) - -tidb> SHOW CREATE PLACEMENT POLICY myplacementpolicy\G -*************************** 1. row *************************** - Policy: myplacementpolicy -Create Policy: CREATE PLACEMENT POLICY myplacementpolicy PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1" -1 row in set (0.00 sec) -``` + -You can also view definitions of placement policies using the [`INFORMATION_SCHEMA.PLACEMENT_POLICIES`](/information-schema/information-schema-placement-policies.md) table. +For TiDB Dedicated clusters, labels on TiKV nodes are configured automatically. -```sql -tidb> select * from information_schema.placement_policies\G -***************************[ 1. row ]*************************** -POLICY_ID | 1 -CATALOG_NAME | def -POLICY_NAME | p1 -PRIMARY_REGION | us-east-1 -REGIONS | us-east-1,us-west-1 -CONSTRAINTS | -LEADER_CONSTRAINTS | -FOLLOWER_CONSTRAINTS | -LEARNER_CONSTRAINTS | -SCHEDULE | -FOLLOWERS | 4 -LEARNERS | 0 -1 row in set -``` + -The `information_schema.tables` and `information_schema.partitions` tables also include a column for `tidb_placement_policy_name`, which shows all objects with placement rules attached: +To view all available labels in the current TiKV cluster, you can use the [`SHOW PLACEMENT LABELS`](/sql-statements/sql-statement-show-placement-labels.md) statement: ```sql -SELECT * FROM information_schema.tables WHERE tidb_placement_policy_name IS NOT NULL; -SELECT * FROM information_schema.partitions WHERE tidb_placement_policy_name IS NOT NULL; +SHOW PLACEMENT LABELS; ++--------+----------------+ +| Key | Values | ++--------+----------------+ +| disk | ["ssd"] | +| region | ["us-east-1"] | +| zone | ["us-east-1a"] | ++--------+----------------+ +3 rows in set (0.00 sec) ``` -Rules that are attached to objects are applied *asynchronously*. To view the current scheduling progress of placement, use [`SHOW PLACEMENT`](/sql-statements/sql-statement-show-placement.md). +## Usage -## Option reference +This section describes how to create, attach, view, modify, and delete placement policies using SQL statements. -> **Note:** -> -> - Placement options depend on labels correctly specified in the configuration of each TiKV node. For example, the `PRIMARY_REGION` option depends on the `region` label in TiKV. To see a summary of all labels available in your TiKV cluster, use the statement [`SHOW PLACEMENT LABELS`](/sql-statements/sql-statement-show-placement-labels.md): -> -> ```sql -> mysql> show placement labels; -> +--------+----------------+ -> | Key | Values | -> +--------+----------------+ -> | disk | ["ssd"] | -> | region | ["us-east-1"] | -> | zone | ["us-east-1a"] | -> +--------+----------------+ -> 3 rows in set (0.00 sec) -> ``` -> -> - When you use `CREATE PLACEMENT POLICY` to create a placement policy, TiDB does not check whether the labels exist. Instead, TiDB performs the check when you attach the policy to a table. +### Create and attach placement policies -| Option Name | Description | -|----------------------------|------------------------------------------------------------------------------------------------| -| `PRIMARY_REGION` | Raft leaders are placed in stores that have the `region` label that matches the value of this option. 
| -| `REGIONS` | Raft followers are placed in stores that have the `region` label that matches the value of this option. | -| `SCHEDULE` | The strategy used to schedule the placement of followers. The value options are `EVEN` (default) or `MAJORITY_IN_PRIMARY`. | -| `FOLLOWERS` | The number of followers. For example, `FOLLOWERS=2` means that there will be 3 replicas of the data (2 followers and 1 leader). | +1. To create a placement policy, use the [`CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-create-placement-policy.md) statement: -In addition to the placement options above, you can also use the advance configurations. For details, see [Advance placement options](#advanced-placement-options). + ```sql + CREATE PLACEMENT POLICY myplacementpolicy PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1"; + ``` -| Option Name | Description | -| --------------| ------------ | -| `CONSTRAINTS` | A list of constraints that apply to all roles. For example, `CONSTRAINTS="[+disk=ssd]"`. | -| `LEADER_CONSTRAINTS` | A list of constraints that only apply to leader. | -| `FOLLOWER_CONSTRAINTS` | A list of constraints that only apply to followers. | -| `LEARNER_CONSTRAINTS` | A list of constraints that only apply to learners. | -| `LEARNERS` | The number of learners. | -| `SURVIVAL_PREFERENCE` | The replica placement priority according to the disaster tolerance level of the labels. For example, `SURVIVAL_PREFERENCE="[region, zone, host]"`. | + In this statement: -## Examples + - The `PRIMARY_REGION="us-east-1"` option means placing Raft Leaders on nodes with the `region` label as `us-east-1`. + - The `REGIONS="us-east-1,us-west-1"` option means placing Raft Followers on nodes with the `region` label as `us-east-1` and nodes with the `region` label as `us-west-1`. -### Increase the number of replicas + For more configurable placement options and their meanings, see the [Placement options](#placement-option-reference). - +2. To attach a placement policy to a table or a partitioned table, use the `CREATE TABLE` or `ALTER TABLE` statement to specify the placement policy for that table or partitioned table: -The default configuration of [`max-replicas`](/pd-configuration-file.md#max-replicas) is `3`. To increase this for a specific set of tables, you can use a placement policy as follows: + ```sql + CREATE TABLE t1 (a INT) PLACEMENT POLICY=myplacementpolicy; + CREATE TABLE t2 (a INT); + ALTER TABLE t2 PLACEMENT POLICY=myplacementpolicy; + ``` - + `PLACEMENT POLICY` is not associated with any database schema and can be attached in a global scope. Therefore, specifying a placement policy using `CREATE TABLE` does not require any additional privileges. - +### View placement policies -The default configuration of [`max-replicas`](https://docs.pingcap.com/tidb/stable/pd-configuration-file#max-replicas) is `3`. To increase this for a specific set of tables, you can use a placement policy as follows: +- To view an existing placement policy, you can use the [`SHOW CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-show-create-placement-policy.md) statement: - + ```sql + SHOW CREATE PLACEMENT POLICY myplacementpolicy\G + *************************** 1. 
row *************************** + Policy: myplacementpolicy + Create Policy: CREATE PLACEMENT POLICY myplacementpolicy PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1" + 1 row in set (0.00 sec) + ``` + +- To view the placement policy attached to a specific table, you can use the [`SHOW CREATE TABLE`](/sql-statements/sql-statement-show-create-table.md) statement: + + ```sql + SHOW CREATE TABLE t1\G + *************************** 1. row *************************** + Table: t1 + Create Table: CREATE TABLE `t1` ( + `a` int(11) DEFAULT NULL + ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![placement] PLACEMENT POLICY=`myplacementpolicy` */ + 1 row in set (0.00 sec) + ``` + +- To view the definitions of placement policies in a cluster, you can query the [`INFORMATION_SCHEMA.PLACEMENT_POLICIES`](/information-schema/information-schema-placement-policies.md) system table: + + ```sql + SELECT * FROM information_schema.placement_policies\G + ***************************[ 1. row ]*************************** + POLICY_ID | 1 + CATALOG_NAME | def + POLICY_NAME | p1 + PRIMARY_REGION | us-east-1 + REGIONS | us-east-1,us-west-1 + CONSTRAINTS | + LEADER_CONSTRAINTS | + FOLLOWER_CONSTRAINTS | + LEARNER_CONSTRAINTS | + SCHEDULE | + FOLLOWERS | 4 + LEARNERS | 0 + 1 row in set + ``` + +- To view all tables that are attached with placement policies in a cluster, you can query the `tidb_placement_policy_name` column of the `information_schema.tables` system table: + + ```sql + SELECT * FROM information_schema.tables WHERE tidb_placement_policy_name IS NOT NULL; + ``` + +- To view all partitions that are attached with placement policies in a cluster, you can query the `tidb_placement_policy_name` column of the `information_schema.partitions` system table: + + ```sql + SELECT * FROM information_schema.partitions WHERE tidb_placement_policy_name IS NOT NULL; + ``` + +- Placement policies attached to all objects are applied *asynchronously*. To check the scheduling progress of placement policies, you can use the [`SHOW PLACEMENT`](/sql-statements/sql-statement-show-placement.md) statement: + + ```sql + SHOW PLACEMENT; + ``` + +### Modify placement policies + +To modify a placement policy, you can use the [`ALTER PLACEMENT POLICY`](/sql-statements/sql-statement-alter-placement-policy.md) statement. The modification will apply to all objects that are attached with the corresponding policy. ```sql -CREATE PLACEMENT POLICY fivereplicas FOLLOWERS=4; -CREATE TABLE t1 (a INT) PLACEMENT POLICY=fivereplicas; +ALTER PLACEMENT POLICY myplacementpolicy FOLLOWERS=4; ``` -Note that the PD configuration includes the leader and follower count, thus 4 followers + 1 leader equals 5 replicas in total. +In this statement, the `FOLLOWERS=4` option means configuring 5 replicas for the data, including 4 Followers and 1 Leader. For more configurable placement options and their meanings, see [Placement option reference](#placement-option-reference). 
+ +### Drop placement policies -To expand on this example, you can also use `PRIMARY_REGION` and `REGIONS` placement options to describe the placement for the followers: +To drop a policy that is not attached to any table or partition, you can use the [`DROP PLACEMENT POLICY`](/sql-statements/sql-statement-drop-placement-policy.md) statement: ```sql -CREATE PLACEMENT POLICY eastandwest PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2,us-west-1" SCHEDULE="MAJORITY_IN_PRIMARY" FOLLOWERS=4; -CREATE TABLE t1 (a INT) PLACEMENT POLICY=eastandwest; +DROP PLACEMENT POLICY myplacementpolicy; ``` -The `SCHEDULE` option instructs TiDB on how to balance the followers. The default schedule of `EVEN` ensures a balance of followers in all regions. +## Placement option reference -To ensure that enough followers are placed in the primary region (`us-east-1`) so that quorum can be achieved, you can use the `MAJORITY_IN_PRIMARY` schedule. This schedule helps provide lower latency transactions at the expense of some availability. If the primary region fails, `MAJORITY_IN_PRIMARY` cannot provide automatic failover. +When creating or modifying placement policies, you can configure placement options as needed. -### Assign placement to a partitioned table +> **Note:** +> +> The `PRIMARY_REGION`, `REGIONS`, and `SCHEDULE` options cannot be specified together with the `CONSTRAINTS` option, or an error will occur. -In addition to assigning placement options to tables, you can also assign the options to table partitions. For example: +### Regular placement options -```sql -CREATE PLACEMENT POLICY p1 FOLLOWERS=5; -CREATE PLACEMENT POLICY europe PRIMARY_REGION="eu-central-1" REGIONS="eu-central-1,eu-west-1"; -CREATE PLACEMENT POLICY northamerica PRIMARY_REGION="us-east-1" REGIONS="us-east-1"; - -SET tidb_enable_list_partition = 1; -CREATE TABLE t1 ( - country VARCHAR(10) NOT NULL, - userdata VARCHAR(100) NOT NULL -) PLACEMENT POLICY=p1 PARTITION BY LIST COLUMNS (country) ( - PARTITION pEurope VALUES IN ('DE', 'FR', 'GB') PLACEMENT POLICY=europe, - PARTITION pNorthAmerica VALUES IN ('US', 'CA', 'MX') PLACEMENT POLICY=northamerica, - PARTITION pAsia VALUES IN ('CN', 'KR', 'JP') -); -``` +Regular placement options can meet the basic requirements of data placement. + +| Option name | Description | +|----------------------------|------------------------------------------------------------------------------------------------| +| `PRIMARY_REGION` | Specifies that placing Raft Leaders on nodes with a `region` label that matches the value of this option. | +| `REGIONS` | Specifies that placing Raft Followers on nodes with a `region` label that matches the value of this option. | +| `SCHEDULE` | Specifies the strategy for scheduling the placement of Followers. The value options are `EVEN` (default) or `MAJORITY_IN_PRIMARY`. | +| `FOLLOWERS` | Specifies the number of Followers. For example, `FOLLOWERS=2` means there will be 3 replicas of the data (2 Followers and 1 Leader). | + +### Advanced placement options + +Advanced configuration options provide more flexibility for data placement to meet the requirements of complex scenarios. However, configuring advanced options is more complex than regular options and requires you to have a deep understanding of the cluster topology and the TiDB data sharding. + +| Option name | Description | +| --------------| ------------ | +| `CONSTRAINTS` | A list of constraints that apply to all roles. For example, `CONSTRAINTS="[+disk=ssd]"`. 
| +| `LEADER_CONSTRAINTS` | A list of constraints that only apply to Leader. | +| `FOLLOWER_CONSTRAINTS` | A list of constraints that only apply to Followers. | +| `LEARNER_CONSTRAINTS` | A list of constraints that only apply to learners. | +| `LEARNERS` | The number of learners. | +| `SURVIVAL_PREFERENCE` | The replica placement priority according to the disaster tolerance level of the labels. For example, `SURVIVAL_PREFERENCE="[region, zone, host]"`. | + +### CONSTRAINTS formats -If a partition has no attached policies, it tries to apply possibly existing policies on the table. For example, the `pEurope` partition will apply the `europe` policy, but the `pAsia` partition will apply the `p1` policy from table `t1`. If `t1` has no assigned policies, `pAsia` will not apply any policy, too. +You can configure `CONSTRAINTS`, `FOLLOWER_CONSTRAINTS`, and `LEARNER_CONSTRAINTS` placement options using either of the following formats: -You can also alter the placement policies assigned to a specific partition. For example: +| CONSTRAINTS format | Description | +|----------------------------|-----------------------------------------------------------------------------------------------------------| +| List format | If a constraint to be specified applies to all replicas, you can use a key-value list format. Each key starts with `+` or `-`. For example:
  • `[+region=us-east-1]` means placing data on nodes that have a `region` label as `us-east-1`.
  • `[+region=us-east-1,-type=fault]` means placing data on nodes that have a `region` label as `us-east-1` but do not have a `type` label as `fault`.

| +| Dictionary format | If you need to specify different numbers of replicas for different constraints, you can use the dictionary format. For example:
  • `FOLLOWER_CONSTRAINTS="{+region=us-east-1: 1,+region=us-east-2: 1,+region=us-west-1: 1}";` means placing one Follower in `us-east-1`, one Follower in `us-east-2`, and one Follower in `us-west-1`.
  • `FOLLOWER_CONSTRAINTS='{"+region=us-east-1,+type=scale-node": 1,"+region=us-west-1": 1}';` means placing one Follower on a node that is located in the `us-east-1` region and has the `type` label as `scale-node`, and one Follower in `us-west-1`.
The dictionary format supports each key starting with `+` or `-` and allows you to configure the special `#reject-leader` attribute. For example, `FOLLOWER_CONSTRAINTS='{"+region=us-east-1":1, "+region=us-east-2": 2, "+region=us-west-1,#reject-leader": 1}'` means that the Leaders elected in `us-west-1` will be evicted as much as possible during disaster recovery.| + +> **Note:** +> +> - The `LEADER_CONSTRAINTS` placement option only supports the list format. +> - Both list and dictionary formats are based on the YAML parser, but YAML syntax might be incorrectly parsed in some cases. For example, `"{+region=east:1,+region=west:2}"` (no space after `:`) can be incorrectly parsed as `'{"+region=east:1": null, "+region=west:2": null}'`, which is unexpected. However, `"{+region=east: 1,+region=west: 2}"` (space after `:`) can be correctly parsed as `'{"+region=east": 1, "+region=west": 2}'`. Therefore, it is recommended to add a space after `:`. + +## Basic examples + +### Specify the number of replicas globally for a cluster + +After a cluster is initialized, the default number of replicas is `3`. If a cluster needs more replicas, you can increase this number by configuring a placement policy, and then apply the policy at the cluster level using [`ALTER RANGE`](/sql-statements/sql-statement-alter-range.md). For example: ```sql -ALTER TABLE t1 PARTITION pEurope PLACEMENT POLICY=p1; +CREATE PLACEMENT POLICY five_replicas FOLLOWERS=4; +ALTER RANGE global PLACEMENT POLICY five_replicas; ``` -### Set the default placement for a schema +Note that because TiDB defaults the number of Leaders to `1`, `five replicas` means `4` Followers and `1` Leader. + +### Specify a default placement policy for a database -You can directly attach the default placement rules to a database schema. This works similar to setting the default character set or collation for a schema. Your specified placement options apply when no other options are specified. For example: +You can specify a default placement policy for a database. This works similarly to setting a default character set or collation for a database. If no other placement policy is specified for a table or partition in the database, the placement policy for the database will apply to the table and partition. For example: ```sql -CREATE PLACEMENT POLICY p1 PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2"; -- Create placement policies +CREATE PLACEMENT POLICY p1 PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2"; -- Creates a placement policy CREATE PLACEMENT POLICY p2 FOLLOWERS=4; CREATE PLACEMENT POLICY p3 FOLLOWERS=2; -CREATE TABLE t1 (a INT); -- Creates a table t1 with no placement options. +CREATE TABLE t1 (a INT); -- Creates a table t1 without specifying any placement policy. -ALTER DATABASE test PLACEMENT POLICY=p2; -- Changes the default placement option, and does not apply to the existing table t1. +ALTER DATABASE test PLACEMENT POLICY=p2; -- Changes the default placement policy of the database to p2, which does not apply to the existing table t1. -CREATE TABLE t2 (a INT); -- Creates a table t2 with the default placement policy p2. +CREATE TABLE t2 (a INT); -- Creates a table t2. The default placement policy p2 applies to t2. -CREATE TABLE t3 (a INT) PLACEMENT POLICY=p1; -- Creates a table t3 without the default policy p2, because this statement has specified another placement rule. +CREATE TABLE t3 (a INT) PLACEMENT POLICY=p1; -- Creates a table t3. 
Because this statement has specified another placement rule, the default placement policy p2 does not apply to table t3. -ALTER DATABASE test PLACEMENT POLICY=p3; -- Changes the default policy, and does not apply to existing tables. +ALTER DATABASE test PLACEMENT POLICY=p3; -- Changes the default policy of the database again, which does not apply to existing tables. -CREATE TABLE t4 (a INT); -- Creates a table t4 with the default policy p3. +CREATE TABLE t4 (a INT); -- Creates a table t4. The default placement policy p3 applies to t4. -ALTER PLACEMENT POLICY p3 FOLLOWERS=3; -- The table with policy p3 (t4) will have FOLLOWERS=3. +ALTER PLACEMENT POLICY p3 FOLLOWERS=3; -- `FOLLOWERS=3` applies to the table attached with policy p3 (that is, table t4). ``` -Note that this is different from the inheritance between partitions and tables, where changing the policy of tables will affect their partitions. Tables inherit the policy of schema only when they are created without policies attached, and modifying the policies of schemas does not affect created tables. +Note that the policy inheritance from a table to its partitions differs from the policy inheritance in the preceding example. When you change the default policy of a table, the new policy also applies to partitions in that table. However, a table inherits the policy from the database only if it is created without any policy specified. Once a table inherits the policy from the database, modifying the default policy of the database does not apply to that table. -### Advanced placement options +### Specify a placement policy for a table -The placement options `PRIMARY_REGION`, `REGIONS`, and `SCHEDULE` meet the basic needs of data placement at the loss of some flexibility. For more complex scenarios with the need for higher flexibility, you can also use the advanced placement options of `CONSTRAINTS` and `FOLLOWER_CONSTRAINTS`. You cannot specify the `PRIMARY_REGION`, `REGIONS`, or `SCHEDULE` option with the `CONSTRAINTS` option at the same time. If you specify both at the same time, an error will be returned. +You can specify a default placement policy for a table. For example: -For example, to set constraints that data must reside on a TiKV store where the label `disk` must match a value: +```sql +CREATE PLACEMENT POLICY five_replicas FOLLOWERS=4; + +CREATE TABLE t (a INT) PLACEMENT POLICY=five_replicas; -- Creates a table t and attaches the 'five_replicas' placement policy to it. + +ALTER TABLE t PLACEMENT POLICY=default; -- Removes the placement policy 'five_replicas' from the table t and resets the placement policy to the default one. +``` + +### Specify a placement policy for a partitioned table + +You can also specify a placement policy for a partitioned table or a partition. 
For example: ```sql -CREATE PLACEMENT POLICY storageonnvme CONSTRAINTS="[+disk=nvme]"; -CREATE PLACEMENT POLICY storageonssd CONSTRAINTS="[+disk=ssd]"; +CREATE PLACEMENT POLICY storageforhisotrydata CONSTRAINTS="[+node=history]"; +CREATE PLACEMENT POLICY storagefornewdata CONSTRAINTS="[+node=new]"; CREATE PLACEMENT POLICY companystandardpolicy CONSTRAINTS=""; CREATE TABLE t1 (id INT, name VARCHAR(50), purchased DATE) PLACEMENT POLICY=companystandardpolicy PARTITION BY RANGE( YEAR(purchased) ) ( - PARTITION p0 VALUES LESS THAN (2000) PLACEMENT POLICY=storageonssd, + PARTITION p0 VALUES LESS THAN (2000) PLACEMENT POLICY=storageforhisotrydata, PARTITION p1 VALUES LESS THAN (2005), PARTITION p2 VALUES LESS THAN (2010), PARTITION p3 VALUES LESS THAN (2015), - PARTITION p4 VALUES LESS THAN MAXVALUE PLACEMENT POLICY=storageonnvme + PARTITION p4 VALUES LESS THAN MAXVALUE PLACEMENT POLICY=storagefornewdata ); ``` -You can either specify constraints in list format (`[+disk=ssd]`) or in dictionary format (`{+disk=ssd: 1,+disk=nvme: 2}`). +If no placement policy is specified for a partition in a table, the partition attempts to inherit the policy (if any) from the table. In the preceding example: -In list format, constraints are specified as a list of key-value pairs. The key starts with either a `+` or a `-`. `+disk=ssd` indicates that the label `disk` must be set to `ssd`, and `-disk=nvme` indicates that the label `disk` must not be `nvme`. +- The `p0` partition will apply the `storageforhisotrydata` policy. +- The `p4` partition will apply the `storagefornewdata` policy. +- The `p1`, `p2`, and `p3` partitions will apply the `companystandardpolicy` placement policy inherited from the table `t1`. +- If no placement policy is specified for the table `t1`, the `p1`, `p2`, and `p3` partitions will inherit the database default policy or the global default policy. -In dictionary format, constraints also indicate a number of instances that apply to that rule. For example, `FOLLOWER_CONSTRAINTS="{+region=us-east-1: 1,+region=us-east-2: 1,+region=us-west-1: 1}";` indicates that 1 follower is in us-east-1, 1 follower is in us-east-2 and 1 follower is in us-west-1. For another example, `FOLLOWER_CONSTRAINTS='{"+region=us-east-1,+disk=nvme":1,"+region=us-west-1":1}';` indicates that 1 follower is in us-east-1 with an nvme disk, and 1 follower is in us-west-1. +After placement policies are attached to these partitions, you can change the placement policy for a specific partition as in the following example: -> **Note:** -> -> Dictionary and list formats are based on the YAML parser, but the YAML syntax might be incorrectly parsed. For example, `"{+disk=ssd:1,+disk=nvme:2}"` is incorrectly parsed as `'{"+disk=ssd:1": null, "+disk=nvme:1": null}'`. But `"{+disk=ssd: 1,+disk=nvme: 1}"` is correctly parsed as `'{"+disk=ssd": 1, "+disk=nvme": 1}'`. +```sql +ALTER TABLE t1 PARTITION p1 PLACEMENT POLICY=storageforhisotrydata; +``` + +## High availability examples -### Survival preferences +Assume that there is a cluster with the following topology, where TiKV nodes are distributed across 3 regions, with each region containing 3 available zones: -When you create or modify a placement policy, you can use the `SURVIVAL_PREFERENCES` option to set the preferred survivability for your data. 
+```sql +SELECT store_id,address,label from INFORMATION_SCHEMA.TIKV_STORE_STATUS; ++----------+-----------------+--------------------------------------------------------------------------------------------------------------------------+ +| store_id | address | label | ++----------+-----------------+--------------------------------------------------------------------------------------------------------------------------+ +| 1 | 127.0.0.1:20163 | [{"key": "region", "value": "us-east-1"}, {"key": "zone", "value": "us-east-1a"}, {"key": "host", "value": "host1"}] | +| 2 | 127.0.0.1:20162 | [{"key": "region", "value": "us-east-1"}, {"key": "zone", "value": "us-east-1b"}, {"key": "host", "value": "host2"}] | +| 3 | 127.0.0.1:20164 | [{"key": "region", "value": "us-east-1"}, {"key": "zone", "value": "us-east-1c"}, {"key": "host", "value": "host3"}] | +| 4 | 127.0.0.1:20160 | [{"key": "region", "value": "us-east-2"}, {"key": "zone", "value": "us-east-2a"}, {"key": "host", "value": "host4"}] | +| 5 | 127.0.0.1:20161 | [{"key": "region", "value": "us-east-2"}, {"key": "zone", "value": "us-east-2b"}, {"key": "host", "value": "host5"}] | +| 6 | 127.0.0.1:20165 | [{"key": "region", "value": "us-east-2"}, {"key": "zone", "value": "us-east-2c"}, {"key": "host", "value": "host6"}] | +| 7 | 127.0.0.1:20166 | [{"key": "region", "value": "us-west-1"}, {"key": "zone", "value": "us-west-1a"}, {"key": "host", "value": "host7"}] | +| 8 | 127.0.0.1:20167 | [{"key": "region", "value": "us-west-1"}, {"key": "zone", "value": "us-west-1b"}, {"key": "host", "value": "host8"}] | +| 9 | 127.0.0.1:20168 | [{"key": "region", "value": "us-west-1"}, {"key": "zone", "value": "us-west-1c"}, {"key": "host", "value": "host9"}] | ++----------+-----------------+--------------------------------------------------------------------------------------------------------------------------+ + +``` -For example, assuming that you have a TiDB cluster across 3 availability zones, with multiple TiKV instances deployed on each host in each zone. And when creating placement policies for this cluster, you have set the `SURVIVAL_PREFERENCES` as follows: +### Specify survival preferences + +If you are not particularly concerned about the exact data distribution but prioritize fulfilling disaster recovery requirements, you can use the `SURVIVAL_PREFERENCES` option to specify data survival preferences. + +As in the preceding example, the TiDB cluster is distributed across 3 regions, with each region containing 3 zones. When creating placement policies for this cluster, assume that you configure the `SURVIVAL_PREFERENCES` as follows: ``` sql -CREATE PLACEMENT POLICY multiaz SURVIVAL_PREFERENCES="[zone, host]"; -CREATE PLACEMENT POLICY singleaz CONSTRAINTS="[+zone=zone1]" SURVIVAL_PREFERENCES="[host]"; +CREATE PLACEMENT POLICY multiaz SURVIVAL_PREFERENCES="[region, zone, host]"; +CREATE PLACEMENT POLICY singleaz CONSTRAINTS="[+region=us-east-1]" SURVIVAL_PREFERENCES="[zone]"; ``` After creating the placement policies, you can attach them to the corresponding tables as needed: -- For tables attached with the `multiaz` placement policy, data will be placed in 3 replicas in different availability zones, prioritizing survival goals of data isolation cross zones, followed by survival goals of data isolation cross hosts. -- For tables attached with the `singleaz` placement policy, data will be placed in 3 replicas in the `zone1` availability zone first, and then meet survival goals of data isolation cross hosts. 
+- For tables attached with the `multiaz` placement policy, data will be placed in 3 replicas in different regions, prioritizing the cross-region survival goal of data isolation, followed by the cross-zone survival goal, and finally the cross-host survival goal.
+- For tables attached with the `singleaz` placement policy, data will be placed in 3 replicas in the `us-east-1` region first, and will then meet the cross-zone survival goal of data isolation.
@@ -292,30 +379,104 @@ After creating the placement policies, you can attach them to the corresponding
+### Specify a cluster with 5 replicas distributed 2:2:1 across multiple data centers
+
+If you need a specific data distribution, such as a 5-replica distribution in the proportion of 2:2:1, you can specify different numbers of replicas for different constraints by configuring `CONSTRAINTS` in the [dictionary format](#constraints-formats):
+
+```sql
+CREATE PLACEMENT POLICY `deploy221` CONSTRAINTS='{"+region=us-east-1":2, "+region=us-east-2": 2, "+region=us-west-1": 1}';
+
+ALTER RANGE global PLACEMENT POLICY = "deploy221";
+
+SHOW PLACEMENT;
++-------------------+---------------------------------------------------------------------------------------------+------------------+
+| Target            | Placement                                                                                   | Scheduling_State |
++-------------------+---------------------------------------------------------------------------------------------+------------------+
+| POLICY deploy221  | CONSTRAINTS="{\"+region=us-east-1\":2, \"+region=us-east-2\": 2, \"+region=us-west-1\": 1}" | NULL             |
+| RANGE TiDB_GLOBAL | CONSTRAINTS="{\"+region=us-east-1\":2, \"+region=us-east-2\": 2, \"+region=us-west-1\": 1}" | SCHEDULED        |
++-------------------+---------------------------------------------------------------------------------------------+------------------+
+```
+
+After the global `deploy221` placement policy is set for the cluster, TiDB distributes data according to this policy: placing two replicas in the `us-east-1` region, two replicas in the `us-east-2` region, and one replica in the `us-west-1` region.
+
+### Specify the distribution of Leaders and Followers
+
+You can specify the distribution of Leaders and Followers using constraints or `PRIMARY_REGION`.
+
+#### Use constraints
+
+If you have specific requirements for the distribution of Raft Leaders among nodes, you can specify the placement policy using the following statement:
+
+```sql
+CREATE PLACEMENT POLICY deploy221_primary_east1 LEADER_CONSTRAINTS="[+region=us-east-1]" FOLLOWER_CONSTRAINTS='{"+region=us-east-1": 1, "+region=us-east-2": 2, "+region=us-west-1": 1}';
+```
+
+After this placement policy is created and attached to the desired data, the Raft Leader replicas of the data will be placed in the `us-east-1` region specified by the `LEADER_CONSTRAINTS` option, while other replicas of the data will be placed in regions specified by the `FOLLOWER_CONSTRAINTS` option. Note that if the cluster fails, such as a node outage in the `us-east-1` region, a new Leader will still be elected from other regions, even if these regions are specified in `FOLLOWER_CONSTRAINTS`. In other words, ensuring service availability takes the highest priority. 
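+
+A policy like `deploy221_primary_east1` takes effect only after it is attached to data. The following is a minimal sketch of attaching it; the table name `t2` is a hypothetical placeholder for an existing table:
+
+```sql
+-- Attach the Leader/Follower policy to a hypothetical existing table.
+ALTER TABLE t2 PLACEMENT POLICY=deploy221_primary_east1;
+```
+
+If the table does not exist yet, the same policy can instead be specified at creation time with `CREATE TABLE ... PLACEMENT POLICY`.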
+
+In the event of a failure in the `us-east-1` region, if you do not want to place new Leaders in `us-west-1`, you can configure a special `reject-leader` attribute to evict the newly elected Leaders in that region:
+
+```sql
+CREATE PLACEMENT POLICY deploy221_primary_east1 LEADER_CONSTRAINTS="[+region=us-east-1]" FOLLOWER_CONSTRAINTS='{"+region=us-east-1": 1, "+region=us-east-2": 2, "+region=us-west-1,#reject-leader": 1}';
+```
+
+#### Use `PRIMARY_REGION`
+
+If the `region` label is configured in your cluster topology, you can also use the `PRIMARY_REGION` and `REGIONS` options to specify a placement policy for Followers:
+
+```sql
+CREATE PLACEMENT POLICY eastandwest PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2,us-west-1" SCHEDULE="MAJORITY_IN_PRIMARY" FOLLOWERS=4;
+CREATE TABLE t1 (a INT) PLACEMENT POLICY=eastandwest;
+```
+
+- `PRIMARY_REGION` specifies the distribution region of the Leaders. You can only specify one region in this option.
+- The `SCHEDULE` option specifies how TiDB balances the distribution of Followers.
+    - The default `EVEN` scheduling rule ensures a balanced distribution of Followers across all regions.
+    - If you want to ensure a sufficient number of Follower replicas are placed in the `PRIMARY_REGION` (that is, `us-east-1`), you can use the `MAJORITY_IN_PRIMARY` scheduling rule. This scheduling rule provides lower latency transactions at the expense of some availability. If the primary region fails, `MAJORITY_IN_PRIMARY` does not provide automatic failover.
+
+## Data isolation examples
+
+As in the following example, when creating placement policies, you can configure a constraint for each policy, which requires data to be placed on TiKV nodes with the specified `app` label.
+
+```sql
+CREATE PLACEMENT POLICY app_order CONSTRAINTS="[+app=order]";
+CREATE PLACEMENT POLICY app_list CONSTRAINTS="[+app=list_collection]";
+CREATE TABLE `order` (id INT, name VARCHAR(50), purchased DATE)
+PLACEMENT POLICY=app_order;
+CREATE TABLE list (id INT, name VARCHAR(50), purchased DATE)
+PLACEMENT POLICY=app_list;
+```
+
+In this example, the constraints are specified using the list format, such as `[+app=order]`. You can also specify them using the dictionary format, such as `{+app=order: 3}`.
+
+After executing the statements in the example, TiDB will place the `app_order` data on TiKV nodes with the `app` label as `order`, and place the `app_list` data on TiKV nodes with the `app` label as `list_collection`, thus achieving physical data isolation in storage.
+
+## Compatibility with other features
+
+- Temporary tables do not support placement policies.
+- Placement policies only ensure that data at rest resides on the correct TiKV nodes but do not guarantee that data in transit (via either user queries or internal operations) only occurs in a specific region.
+- To configure TiFlash replicas for your data, you need to [create TiFlash replicas](/tiflash/create-tiflash-replicas.md) rather than using placement policies.
+- Syntactic sugar rules are permitted for setting `PRIMARY_REGION` and `REGIONS`. In the future, we plan to add varieties for `PRIMARY_RACK`, `PRIMARY_ZONE`, and `PRIMARY_HOST`. See [issue #18030](https://github.com/pingcap/tidb/issues/18030).
+
 ## Compatibility with tools

 | Tool Name | Minimum supported version | Description |
 | --- | --- | --- |
-| Backup & Restore (BR) | 6.0 | Supports importing and exporting placement rules. Refer to [BR Compatibility](/br/backup-and-restore-overview.md#compatibility) for details. 
| +| Backup & Restore (BR) | 6.0 | Before v6.0, BR does not support backing up and restoring placement policies. For more information, see [Why does an error occur when I restore placement rules to a cluster](/faq/backup-and-restore-faq.md#why-does-an-error-occur-when-i-restore-placement-rules-to-a-cluster). | | TiDB Lightning | Not compatible yet | An error is reported when TiDB Lightning imports backup data that contains placement policies | -| TiCDC | 6.0 | Ignores placement rules, and does not replicate the rules to the downstream | -| TiDB Binlog | 6.0 | Ignores placement rules, and does not replicate the rules to the downstream | +| TiCDC | 6.0 | Ignores placement policies, and does not replicate the policies to the downstream | +| TiDB Binlog | 6.0 | Ignores placement policies, and does not replicate the policies to the downstream | +| Tool Name | Minimum supported version | Description | +| --- | --- | --- | | TiDB Lightning | Not compatible yet | An error is reported when TiDB Lightning imports backup data that contains placement policies | -| TiCDC | 6.0 | Ignores placement rules, and does not replicate the rules to the downstream | - - - -## Known limitations - -The following known limitations are as follows: +| TiCDC | 6.0 | Ignores placement policies, and does not replicate the policies to the downstream | -* Temporary tables do not support placement options. -* Syntactic sugar rules are permitted for setting `PRIMARY_REGION` and `REGIONS`. In the future, we plan to add varieties for `PRIMARY_RACK`, `PRIMARY_ZONE`, and `PRIMARY_HOST`. See [issue #18030](https://github.com/pingcap/tidb/issues/18030). -* Placement rules only ensure that data at rest resides on the correct TiKV store. The rules do not guarantee that data in transit (via either user queries or internal operations) only occurs in a specific region. +
\ No newline at end of file diff --git a/releases/release-6.3.0.md b/releases/release-6.3.0.md index 0bfcde6ea6186..517ddc6b08c55 100644 --- a/releases/release-6.3.0.md +++ b/releases/release-6.3.0.md @@ -152,7 +152,7 @@ In v6.3.0-DMR, the key new features and improvements are as follows: * Address the conflict between SQL-based data Placement Rules and TiFlash replicas [#37171](https://github.com/pingcap/tidb/issues/37171) @[lcwangchao](https://github.com/lcwangchao) - TiDB v6.0.0 provides SQL-based data Placement Rules. But this feature conflicts with TiFlash replicas due to implementation issues. TiDB v6.3.0 optimizes the implementation mechanisms, and [resolves the conflict between SQL-based data Placement Rules and TiFlash](/placement-rules-in-sql.md#known-limitations). + TiDB v6.0.0 provides [SQL-based data Placement Rules](/placement-rules-in-sql.md). But this feature conflicts with TiFlash replicas due to implementation issues. TiDB v6.3.0 optimizes the implementation mechanisms, and resolves the conflict between SQL-based data Placement Rules and TiFlash. ### MySQL compatibility diff --git a/releases/release-6.6.0.md b/releases/release-6.6.0.md index 7679c6974bd64..dac4d3688ecc5 100644 --- a/releases/release-6.6.0.md +++ b/releases/release-6.6.0.md @@ -166,7 +166,7 @@ In v6.6.0-DMR, the key new features and improvements are as follows: - For TiDB clusters deployed across cloud regions, when a cloud region fails, the specified databases or tables can survive in another cloud region. - For TiDB clusters deployed in a single cloud region, when an availability zone fails, the specified databases or tables can survive in another availability zone. - For more information, see [documentation](/placement-rules-in-sql.md#survival-preferences). + For more information, see [documentation](/placement-rules-in-sql.md#specify-survival-preferences). * Support rolling back DDL operations via the `FLASHBACK CLUSTER TO TIMESTAMP` statement [#14088](https://github.com/tikv/tikv/pull/14088) @[Defined2014](https://github.com/Defined2014) @[JmPotato](https://github.com/JmPotato) @@ -224,7 +224,7 @@ In v6.6.0-DMR, the key new features and improvements are as follows: For more information, see [documentation](/tidb-lightning/tidb-lightning-configuration.md#tidb-lightning-task). -* TiDB Lightning supports enabling compressed transfers when sending key-value pairs to TiKV [#41163](https://github.com/pingcap/tidb/issues/41163) @[gozssky](https://github.com/gozssky) +* TiDB Lightning supports enabling compressed transfers when sending key-value pairs to TiKV [#41163](https://github.com/pingcap/tidb/issues/41163) @[sleepymole](https://github.com/sleepymole) Starting from v6.6.0, TiDB Lightning supports compressing locally encoded and sorted key-value pairs for network transfer when sending them to TiKV, thus reducing the amount of data transferred over the network and lowering the network bandwidth overhead. In the earlier TiDB versions before this feature is supported, TiDB Lightning requires relatively high network bandwidth and incurs high traffic charges in case of large data volumes. 
@@ -490,7 +490,7 @@ In v6.6.0-DMR, the key new features and improvements are as follows: - Support setting the maximum number of conflicts by `lightning.max-error` [#40743](https://github.com/pingcap/tidb/issues/40743) @[dsdashun](https://github.com/dsdashun) - Support importing CSV data files with BOM headers [#40744](https://github.com/pingcap/tidb/issues/40744) @[dsdashun](https://github.com/dsdashun) - Optimize the processing logic when encountering TiKV flow-limiting errors and try other available regions instead [#40205](https://github.com/pingcap/tidb/issues/40205) @[lance6716](https://github.com/lance6716) - - Disable checking the table foreign keys during import [#40027](https://github.com/pingcap/tidb/issues/40027) @[gozssky](https://github.com/gozssky) + - Disable checking the table foreign keys during import [#40027](https://github.com/pingcap/tidb/issues/40027) @[sleepymole](https://github.com/sleepymole) + Dumpling @@ -603,7 +603,7 @@ In v6.6.0-DMR, the key new features and improvements are as follows: - Fix the issue that TiDB Lightning might incorrectly skip conflict resolution when all but the last TiDB Lightning instance encounters a local duplicate record during a parallel import [#40923](https://github.com/pingcap/tidb/issues/40923) @[lichunzhu](https://github.com/lichunzhu) - Fix the issue that precheck cannot accurately detect the presence of a running TiCDC in the target cluster [#41040](https://github.com/pingcap/tidb/issues/41040) @[lance6716](https://github.com/lance6716) - Fix the issue that TiDB Lightning panics in the split-region phase [#40934](https://github.com/pingcap/tidb/issues/40934) @[lance6716](https://github.com/lance6716) - - Fix the issue that the conflict resolution logic (`duplicate-resolution`) might lead to inconsistent checksums [#40657](https://github.com/pingcap/tidb/issues/40657) @[gozssky](https://github.com/gozssky) + - Fix the issue that the conflict resolution logic (`duplicate-resolution`) might lead to inconsistent checksums [#40657](https://github.com/pingcap/tidb/issues/40657) @[sleepymole](https://github.com/sleepymole) - Fix a possible OOM problem when there is an unclosed delimiter in the data file [#40400](https://github.com/pingcap/tidb/issues/40400) @[buchuitoudegou](https://github.com/buchuitoudegou) - Fix the issue that the file offset in the error report exceeds the file size [#40034](https://github.com/pingcap/tidb/issues/40034) @[buchuitoudegou](https://github.com/buchuitoudegou) - Fix an issue with the new version of PDClient that might cause parallel import to fail [#40493](https://github.com/pingcap/tidb/issues/40493) @[AmoebaProtozoa](https://github.com/AmoebaProtozoa) diff --git a/sql-statements/sql-statement-alter-placement-policy.md b/sql-statements/sql-statement-alter-placement-policy.md index 96a9b55615f62..e56848ff57e30 100644 --- a/sql-statements/sql-statement-alter-placement-policy.md +++ b/sql-statements/sql-statement-alter-placement-policy.md @@ -51,6 +51,7 @@ AdvancedPlacementOption ::= | "LEADER_CONSTRAINTS" EqOpt stringLit | "FOLLOWER_CONSTRAINTS" EqOpt stringLit | "LEARNER_CONSTRAINTS" EqOpt stringLit +| "SURVIVAL_PREFERENCES" EqOpt stringLit ``` ## Examples diff --git a/sql-statements/sql-statement-alter-range.md b/sql-statements/sql-statement-alter-range.md new file mode 100644 index 0000000000000..bcf11b636a14b --- /dev/null +++ b/sql-statements/sql-statement-alter-range.md @@ -0,0 +1,32 @@ +--- +title: ALTER RANGE +summary: An overview of the usage of ALTER RANGE for TiDB. 
+---
+
+# ALTER RANGE
+
+Currently, the `ALTER RANGE` statement can only be used to modify the placement policy that applies to a specific range of data in TiDB.
+
+## Synopsis
+
+```ebnf+diagram
+AlterRangeStmt ::=
+    'ALTER' 'RANGE' Identifier PlacementPolicyOption
+```
+
+`ALTER RANGE` supports the following two parameters:
+
+- `global`: indicates the range of all data in a cluster.
+- `meta`: indicates the range of internal metadata stored in TiDB.
+
+## Examples
+
+```sql
+CREATE PLACEMENT POLICY `deploy111` CONSTRAINTS='{"+region=us-east-1":1, "+region=us-east-2": 1, "+region=us-west-1": 1}';
+CREATE PLACEMENT POLICY `five_replicas` FOLLOWERS=4;
+
+ALTER RANGE global PLACEMENT POLICY = "deploy111";
+ALTER RANGE meta PLACEMENT POLICY = "five_replicas";
+```
+
+The preceding example creates two placement policies (`deploy111` and `five_replicas`), specifies constraints for different regions, and then applies the `deploy111` placement policy to all data in the cluster range and the `five_replicas` placement policy to the metadata range.
\ No newline at end of file
diff --git a/sql-statements/sql-statement-create-placement-policy.md b/sql-statements/sql-statement-create-placement-policy.md
index 028dc307c7a1e..7d0c7c0d77235 100644
--- a/sql-statements/sql-statement-create-placement-policy.md
+++ b/sql-statements/sql-statement-create-placement-policy.md
@@ -44,6 +44,7 @@ AdvancedPlacementOption ::=
| "LEADER_CONSTRAINTS" EqOpt stringLit
| "FOLLOWER_CONSTRAINTS" EqOpt stringLit
| "LEARNER_CONSTRAINTS" EqOpt stringLit
+| "SURVIVAL_PREFERENCES" EqOpt stringLit
```

## Examples

From c1c66aadaed67f8cc518754eea9e0c830577c4f1 Mon Sep 17 00:00:00 2001
From: Liki Du
Date: Wed, 15 Nov 2023 17:42:46 -0800
Subject: [PATCH 22/29] Update TiDB Cloud overview page read more section (#15350)

---
 develop/dev-guide-overview.md | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/develop/dev-guide-overview.md b/develop/dev-guide-overview.md
index aa8e480527b5c..631d238aa2a9e 100644
--- a/develop/dev-guide-overview.md
+++ b/develop/dev-guide-overview.md
@@ -40,13 +40,13 @@ TiDB guarantees atomicity for all statements between the start of `BEGIN` and th

-If you are not sure what an **optimistic transaction** is, do ***NOT*** use it yet. Because **optimistic transactions** require that the application can correctly handle [all errors](/error-codes.md) returned by the `COMMIT` statement. If you are not sure how your application handles them, use a **pessimistic transaction** instead.
+If you are not sure what an **optimistic transaction** is, do **_NOT_** use it yet, because **optimistic transactions** require that the application can correctly handle [all errors](/error-codes.md) returned by the `COMMIT` statement. If you are not sure how your application handles them, use a **pessimistic transaction** instead.

-If you are not sure what an **optimistic transaction** is, do ***NOT*** use it yet. Because **optimistic transactions** require that the application can correctly handle [all errors](https://docs.pingcap.com/tidb/stable/error-codes) returned by the `COMMIT` statement. If you are not sure how your application handles them, use a **pessimistic transaction** instead.
+If you are not sure what an **optimistic transaction** is, do **_NOT_** use it yet, because **optimistic transactions** require that the application can correctly handle [all errors](https://docs.pingcap.com/tidb/stable/error-codes) returned by the `COMMIT` statement. 
If you are not sure how your application handles them, use a **pessimistic transaction** instead.
@@ -74,13 +74,30 @@ Since TiDB is compatible with the MySQL protocol and MySQL syntax, most of the O

+Here you can find additional resources to connect to, manage, and develop with TiDB Cloud.
+
+**To explore your data**
+
- [Quick Start](/develop/dev-guide-build-cluster-in-cloud.md)
+- [Use AI-powered SQL Editor beta](/tidb-cloud/explore-data-with-chat2query.md)
+- Connect with client tools such as [VSCode](/develop/dev-guide-gui-vscode-sqltools.md), [DBeaver](/develop/dev-guide-gui-dbeaver.md), or [DataGrip](/develop/dev-guide-gui-datagrip.md)
+
+**To build your application**
+
- [Choose Driver or ORM](/develop/dev-guide-choose-driver-or-orm.md)
+- [Use TiDB Cloud Data API beta](/tidb-cloud/data-service-overview.md)
+
+**To manage your cluster**
+
+- [TiDB Cloud Command Line Tools](/tidb-cloud/get-started-with-cli.md)
+- [TiDB Cloud Administration API](https://docs.pingcap.com/tidbcloud/api/v1beta1)
+
+**To learn more about TiDB**
+
- [Database Schema Design](/develop/dev-guide-schema-design-overview.md)
- [Write Data](/develop/dev-guide-insert-data.md)
- [Read Data](/develop/dev-guide-get-data-from-single-table.md)
- [Transaction](/develop/dev-guide-transaction-overview.md)
- [Optimize](/develop/dev-guide-optimize-sql-overview.md)
-- [Example Applications](/develop/dev-guide-sample-application-java-spring-boot.md)

From 68e00ec11a5cc56063c52627abbe1640d9d61d89 Mon Sep 17 00:00:00 2001
From: Roger Song
Date: Thu, 16 Nov 2023 13:59:16 +0800
Subject: [PATCH 23/29] include limitation of LIKE when it works with non-binary collation (#15352)

---
 character-set-and-collation.md | 2 ++
 character-set-gbk.md | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/character-set-and-collation.md b/character-set-and-collation.md
index 25357949e6fba..6ac04f4dd2ff5 100644
--- a/character-set-and-collation.md
+++ b/character-set-and-collation.md
@@ -113,6 +113,8 @@ SHOW COLLATION;
> **Warning:**
>
> TiDB incorrectly treats latin1 as a subset of utf8. This can lead to unexpected behaviors when you store characters that differ between latin1 and utf8 encodings. It is strongly recommended to use the utf8mb4 character set. See [TiDB #18955](https://github.com/pingcap/tidb/issues/18955) for more details.
+>
+> If the predicates include `LIKE` for string prefixes, such as `LIKE 'prefix%'`, and the target column is set to a non-binary collation (its name does not end with `_bin`), the optimizer currently cannot convert this predicate into a range scan. Instead, it performs a full scan. As a result, such SQL queries might lead to unexpected resource consumption.

> **Note:**
>
diff --git a/character-set-gbk.md b/character-set-gbk.md
index 68b5d9d088d22..75170e3194ea1 100644
--- a/character-set-gbk.md
+++ b/character-set-gbk.md
@@ -98,6 +98,8 @@ In the above table, the result of `SELECT HEX('a');` in the `utf8mb4` byte set i

- Currently, for binary characters of the `ENUM` and `SET` types, TiDB deals with them as the `utf8mb4` character set.

+- If the predicates include `LIKE` for string prefixes, such as `LIKE 'prefix%'`, and the target column is set to a GBK collation (either `gbk_bin` or `gbk_chinese_ci`), the optimizer currently cannot convert this predicate into a range scan. Instead, it performs a full scan. As a result, such SQL queries might lead to unexpected resource consumption.
+
## Component compatibility

- Currently, TiFlash does not support the GBK character set. 
From f6235cd5c4532aeb7c18ea61a172e2e9aacb1ca9 Mon Sep 17 00:00:00 2001 From: Ran Date: Thu, 16 Nov 2023 16:45:48 +0800 Subject: [PATCH 24/29] GA- tidb-distributed-execution-framework (#15357) --- system-variables.md | 8 ++------ tidb-distributed-execution-framework.md | 4 ---- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/system-variables.md b/system-variables.md index 9b1717a6cd5af..cf05f2082609d 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1566,10 +1566,6 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; ### tidb_enable_dist_task New in v7.1.0 -> **Warning:** -> -> This feature is still in the experimental stage. It is not recommended to enable this feature in production environments. - - Scope: GLOBAL - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No @@ -3661,7 +3657,7 @@ For a system upgraded to v5.0 from an earlier version, if you have not modified - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Boolean -- Default value: `ON`. When you upgrade TiDB from a version earlier than v7.5.0 to v7.5.0 or a later version, the default value is `OFF`. +- Default value: `ON`. When you upgrade TiDB from a version earlier than v7.5.0 to v7.5.0 or a later version, the default value is `OFF`. - This variable is used for TiDB to merge global statistics asynchronously to avoid OOM issues. ### tidb_metric_query_range_duration New in v4.0 @@ -4790,7 +4786,7 @@ SHOW WARNINGS; - Range: `[2, 255]` - This variable limits how many historical schema versions can be cached in a TiDB instance. The default value is `16`, which means that TiDB caches 16 historical schema versions by default. - Generally, you do not need to modify this variable. When the [Stale Read](/stale-read.md) feature is used and DDL operations are executed very frequently, it will cause the schema version to change very frequently. Consequently, when Stale Read tries to obtain schema information from a snapshot, it might take a lot of time to rebuild the information due to schema cache misses. In this case, you can increase the value of `tidb_schema_version_cache_limit` (for example, `32`) to avoid the problem of schema cache misses. -- Modifying this variable causes the memory usage of TiDB to increase slightly. Monitor the memory usage of TiDB to avoid OOM problems. +- Modifying this variable causes the memory usage of TiDB to increase slightly. Monitor the memory usage of TiDB to avoid OOM problems. ### tidb_server_memory_limit New in v6.4.0 diff --git a/tidb-distributed-execution-framework.md b/tidb-distributed-execution-framework.md index 469cd7d6ab555..a2120cc0670be 100644 --- a/tidb-distributed-execution-framework.md +++ b/tidb-distributed-execution-framework.md @@ -5,10 +5,6 @@ summary: Learn the use cases, limitations, usage, and implementation principles # TiDB Backend Task Distributed Execution Framework -> **Warning:** -> -> This feature is an experimental feature. It is not recommended to use it in production environments. 
-
 > **Note:**

From cdbde567992e9f50b187c042284602dfc4269cc3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Fri, 17 Nov 2023 09:15:48 +0100
Subject: [PATCH 25/29] TOC: Simplify statement names (#15365)

---
 TOC-tidb-cloud.md | 28 ++++++++++++++--------------
 TOC.md | 28 ++++++++++++++--------------
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/TOC-tidb-cloud.md b/TOC-tidb-cloud.md
index f44ecfee672c1..af6484f2f5d72 100644
--- a/TOC-tidb-cloud.md
+++ b/TOC-tidb-cloud.md
@@ -349,7 +349,7 @@
- [`BEGIN`](/sql-statements/sql-statement-begin.md)
- [`CHANGE COLUMN`](/sql-statements/sql-statement-change-column.md)
- [`COMMIT`](/sql-statements/sql-statement-commit.md)
- - [`CREATE [GLOBAL|SESSION] BINDING`](/sql-statements/sql-statement-create-binding.md)
+ - [`CREATE BINDING`](/sql-statements/sql-statement-create-binding.md)
- [`CREATE DATABASE`](/sql-statements/sql-statement-create-database.md)
- [`CREATE INDEX`](/sql-statements/sql-statement-create-index.md)
- [`CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-create-placement-policy.md)
@@ -365,7 +365,7 @@
- [`DESC`](/sql-statements/sql-statement-desc.md)
- [`DESCRIBE`](/sql-statements/sql-statement-describe.md)
- [`DO`](/sql-statements/sql-statement-do.md)
- - [`DROP [GLOBAL|SESSION] BINDING`](/sql-statements/sql-statement-drop-binding.md)
+ - [`DROP BINDING`](/sql-statements/sql-statement-drop-binding.md)
- [`DROP COLUMN`](/sql-statements/sql-statement-drop-column.md)
- [`DROP DATABASE`](/sql-statements/sql-statement-drop-database.md)
- [`DROP INDEX`](/sql-statements/sql-statement-drop-index.md)
@@ -389,11 +389,11 @@
- [`GRANT <privileges>`](/sql-statements/sql-statement-grant-privileges.md)
- [`GRANT <role>`](/sql-statements/sql-statement-grant-role.md)
- [`INSERT`](/sql-statements/sql-statement-insert.md)
- - [`KILL [TIDB]`](/sql-statements/sql-statement-kill.md)
+ - [`KILL`](/sql-statements/sql-statement-kill.md)
- [`LOAD DATA`](/sql-statements/sql-statement-load-data.md)
- [`LOAD STATS`](/sql-statements/sql-statement-load-stats.md)
- [`LOCK STATS`](/sql-statements/sql-statement-lock-stats.md)
- - [`LOCK TABLES` and `UNLOCK TABLES`](/sql-statements/sql-statement-lock-tables-and-unlock-tables.md)
+ - [`[LOCK|UNLOCK] TABLES`](/sql-statements/sql-statement-lock-tables-and-unlock-tables.md)
- [`MODIFY COLUMN`](/sql-statements/sql-statement-modify-column.md)
- [`PREPARE`](/sql-statements/sql-statement-prepare.md)
- [`QUERY WATCH`](/sql-statements/sql-statement-query-watch.md)
@@ -414,14 +414,14 @@
- [`SET RESOURCE GROUP`](/sql-statements/sql-statement-set-resource-group.md)
- [`SET ROLE`](/sql-statements/sql-statement-set-role.md)
- [`SET TRANSACTION`](/sql-statements/sql-statement-set-transaction.md)
- - [`SET [GLOBAL|SESSION] <variable>`](/sql-statements/sql-statement-set-variable.md)
+ - [`SET <variable>`](/sql-statements/sql-statement-set-variable.md)
- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md)
- [`SHOW [BACKUPS|RESTORES]`](/sql-statements/sql-statement-show-backups.md)
- - [`SHOW [GLOBAL|SESSION] BINDINGS`](/sql-statements/sql-statement-show-bindings.md)
+ - [`SHOW BINDINGS`](/sql-statements/sql-statement-show-bindings.md)
- [`SHOW BUILTINS`](/sql-statements/sql-statement-show-builtins.md)
- [`SHOW CHARACTER SET`](/sql-statements/sql-statement-show-character-set.md)
- [`SHOW COLLATION`](/sql-statements/sql-statement-show-collation.md)
- - [`SHOW [FULL] COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md)
+ - [`SHOW COLUMNS 
FROM`](/sql-statements/sql-statement-show-columns-from.md)
- [`SHOW CREATE DATABASE`](/sql-statements/sql-statement-show-create-database.md)
- [`SHOW CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-show-create-placement-policy.md)
- [`SHOW CREATE RESOURCE GROUP`](/sql-statements/sql-statement-show-create-resource-group.md)
@@ -431,18 +431,18 @@
- [`SHOW DATABASES`](/sql-statements/sql-statement-show-databases.md)
- [`SHOW ENGINES`](/sql-statements/sql-statement-show-engines.md)
- [`SHOW ERRORS`](/sql-statements/sql-statement-show-errors.md)
- - [`SHOW [FULL] FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md)
+ - [`SHOW FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md)
- [`SHOW GRANTS`](/sql-statements/sql-statement-show-grants.md)
- - [`SHOW INDEX [FROM|IN]`](/sql-statements/sql-statement-show-index.md)
- - [`SHOW INDEXES [FROM|IN]`](/sql-statements/sql-statement-show-indexes.md)
- - [`SHOW KEYS [FROM|IN]`](/sql-statements/sql-statement-show-keys.md)
+ - [`SHOW INDEX`](/sql-statements/sql-statement-show-index.md)
+ - [`SHOW INDEXES`](/sql-statements/sql-statement-show-indexes.md)
+ - [`SHOW KEYS`](/sql-statements/sql-statement-show-keys.md)
- [`SHOW MASTER STATUS`](/sql-statements/sql-statement-show-master-status.md)
- [`SHOW PLACEMENT`](/sql-statements/sql-statement-show-placement.md)
- [`SHOW PLACEMENT FOR`](/sql-statements/sql-statement-show-placement-for.md)
- [`SHOW PLACEMENT LABELS`](/sql-statements/sql-statement-show-placement-labels.md)
- [`SHOW PLUGINS`](/sql-statements/sql-statement-show-plugins.md)
- [`SHOW PRIVILEGES`](/sql-statements/sql-statement-show-privileges.md)
- - [`SHOW [FULL] PROCESSSLIST`](/sql-statements/sql-statement-show-processlist.md)
+ - [`SHOW PROCESSLIST`](/sql-statements/sql-statement-show-processlist.md)
- [`SHOW PROFILES`](/sql-statements/sql-statement-show-profiles.md)
- [`SHOW SCHEMAS`](/sql-statements/sql-statement-show-schemas.md)
- [`SHOW STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md)
@@ -453,8 +453,8 @@
- [`SHOW TABLE NEXT_ROW_ID`](/sql-statements/sql-statement-show-table-next-rowid.md)
- [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md)
- [`SHOW TABLE STATUS`](/sql-statements/sql-statement-show-table-status.md)
- - [`SHOW [FULL] TABLES`](/sql-statements/sql-statement-show-tables.md)
- - [`SHOW [GLOBAL|SESSION] VARIABLES`](/sql-statements/sql-statement-show-variables.md)
+ - [`SHOW TABLES`](/sql-statements/sql-statement-show-tables.md)
+ - [`SHOW VARIABLES`](/sql-statements/sql-statement-show-variables.md)
- [`SHOW WARNINGS`](/sql-statements/sql-statement-show-warnings.md)
- [`SPLIT REGION`](/sql-statements/sql-statement-split-region.md)
- [`START TRANSACTION`](/sql-statements/sql-statement-start-transaction.md)
diff --git a/TOC.md b/TOC.md
index f83b17bb98b91..8d0e921e47493 100644
--- a/TOC.md
+++ b/TOC.md
@@ -725,7 +725,7 @@
- [`COMMIT`](/sql-statements/sql-statement-commit.md)
- [`CHANGE DRAINER`](/sql-statements/sql-statement-change-drainer.md)
- [`CHANGE PUMP`](/sql-statements/sql-statement-change-pump.md)
- - [`CREATE [GLOBAL|SESSION] BINDING`](/sql-statements/sql-statement-create-binding.md)
+ - [`CREATE BINDING`](/sql-statements/sql-statement-create-binding.md)
- [`CREATE DATABASE`](/sql-statements/sql-statement-create-database.md)
- [`CREATE INDEX`](/sql-statements/sql-statement-create-index.md)
- [`CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-create-placement-policy.md)
@@ -741,7 +741,7 @@
- [`DESC`](/sql-statements/sql-statement-desc.md)
- 
[`DESCRIBE`](/sql-statements/sql-statement-describe.md)
- [`DO`](/sql-statements/sql-statement-do.md)
- - [`DROP [GLOBAL|SESSION] BINDING`](/sql-statements/sql-statement-drop-binding.md)
+ - [`DROP BINDING`](/sql-statements/sql-statement-drop-binding.md)
- [`DROP COLUMN`](/sql-statements/sql-statement-drop-column.md)
- [`DROP DATABASE`](/sql-statements/sql-statement-drop-database.md)
- [`DROP INDEX`](/sql-statements/sql-statement-drop-index.md)
@@ -766,7 +766,7 @@
- [`GRANT <role>`](/sql-statements/sql-statement-grant-role.md)
- [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md)
- [`INSERT`](/sql-statements/sql-statement-insert.md)
- - [`KILL [TIDB]`](/sql-statements/sql-statement-kill.md)
+ - [`KILL`](/sql-statements/sql-statement-kill.md)
- [`LOAD DATA`](/sql-statements/sql-statement-load-data.md)
- [`LOAD STATS`](/sql-statements/sql-statement-load-stats.md)
- [`LOCK STATS`](/sql-statements/sql-statement-lock-stats.md)
- - [`LOCK TABLES` and `UNLOCK TABLES`](/sql-statements/sql-statement-lock-tables-and-unlock-tables.md)
+ - [`[LOCK|UNLOCK] TABLES`](/sql-statements/sql-statement-lock-tables-and-unlock-tables.md)
- [`MODIFY COLUMN`](/sql-statements/sql-statement-modify-column.md)
- [`PREPARE`](/sql-statements/sql-statement-prepare.md)
- [`QUERY WATCH`](/sql-statements/sql-statement-query-watch.md)
@@ -791,14 +791,14 @@
- [`SET RESOURCE GROUP`](/sql-statements/sql-statement-set-resource-group.md)
- [`SET ROLE`](/sql-statements/sql-statement-set-role.md)
- [`SET TRANSACTION`](/sql-statements/sql-statement-set-transaction.md)
- - [`SET [GLOBAL|SESSION] <variable>`](/sql-statements/sql-statement-set-variable.md)
+ - [`SET <variable>`](/sql-statements/sql-statement-set-variable.md)
- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md)
- [`SHOW [BACKUPS|RESTORES]`](/sql-statements/sql-statement-show-backups.md)
- - [`SHOW [GLOBAL|SESSION] BINDINGS`](/sql-statements/sql-statement-show-bindings.md)
+ - [`SHOW BINDINGS`](/sql-statements/sql-statement-show-bindings.md)
- [`SHOW BUILTINS`](/sql-statements/sql-statement-show-builtins.md)
- [`SHOW CHARACTER SET`](/sql-statements/sql-statement-show-character-set.md)
- [`SHOW COLLATION`](/sql-statements/sql-statement-show-collation.md)
- - [`SHOW [FULL] COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md)
+ - [`SHOW COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md)
- [`SHOW CONFIG`](/sql-statements/sql-statement-show-config.md)
- [`SHOW CREATE DATABASE`](/sql-statements/sql-statement-show-create-database.md)
- [`SHOW CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-show-create-placement-policy.md)
@@ -810,19 +810,19 @@
- [`SHOW DRAINER STATUS`](/sql-statements/sql-statement-show-drainer-status.md)
- [`SHOW ENGINES`](/sql-statements/sql-statement-show-engines.md)
- [`SHOW ERRORS`](/sql-statements/sql-statement-show-errors.md)
- - [`SHOW [FULL] FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md)
+ - [`SHOW FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md)
- [`SHOW GRANTS`](/sql-statements/sql-statement-show-grants.md)
- [`SHOW IMPORT JOB`](/sql-statements/sql-statement-show-import-job.md)
- - [`SHOW INDEX [FROM|IN]`](/sql-statements/sql-statement-show-index.md)
- - [`SHOW INDEXES [FROM|IN]`](/sql-statements/sql-statement-show-indexes.md)
- - [`SHOW KEYS [FROM|IN]`](/sql-statements/sql-statement-show-keys.md)
+ - [`SHOW INDEX`](/sql-statements/sql-statement-show-index.md)
+ - [`SHOW INDEXES`](/sql-statements/sql-statement-show-indexes.md)
+ - [`SHOW 
KEYS`](/sql-statements/sql-statement-show-keys.md)
- [`SHOW MASTER STATUS`](/sql-statements/sql-statement-show-master-status.md)
- [`SHOW PLACEMENT`](/sql-statements/sql-statement-show-placement.md)
- [`SHOW PLACEMENT FOR`](/sql-statements/sql-statement-show-placement-for.md)
- [`SHOW PLACEMENT LABELS`](/sql-statements/sql-statement-show-placement-labels.md)
- [`SHOW PLUGINS`](/sql-statements/sql-statement-show-plugins.md)
- [`SHOW PRIVILEGES`](/sql-statements/sql-statement-show-privileges.md)
- - [`SHOW [FULL] PROCESSSLIST`](/sql-statements/sql-statement-show-processlist.md)
+ - [`SHOW PROCESSLIST`](/sql-statements/sql-statement-show-processlist.md)
- [`SHOW PROFILES`](/sql-statements/sql-statement-show-profiles.md)
- [`SHOW PUMP STATUS`](/sql-statements/sql-statement-show-pump-status.md)
- [`SHOW SCHEMAS`](/sql-statements/sql-statement-show-schemas.md)
@@ -834,8 +834,8 @@
- [`SHOW TABLE NEXT_ROW_ID`](/sql-statements/sql-statement-show-table-next-rowid.md)
- [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md)
- [`SHOW TABLE STATUS`](/sql-statements/sql-statement-show-table-status.md)
- - [`SHOW [FULL] TABLES`](/sql-statements/sql-statement-show-tables.md)
- - [`SHOW [GLOBAL|SESSION] VARIABLES`](/sql-statements/sql-statement-show-variables.md)
+ - [`SHOW TABLES`](/sql-statements/sql-statement-show-tables.md)
+ - [`SHOW VARIABLES`](/sql-statements/sql-statement-show-variables.md)
- [`SHOW WARNINGS`](/sql-statements/sql-statement-show-warnings.md)
- [`SHUTDOWN`](/sql-statements/sql-statement-shutdown.md)
- [`SPLIT REGION`](/sql-statements/sql-statement-split-region.md)
From ab6d71034faab816731e6bc7b69e9bfef44a336e Mon Sep 17 00:00:00 2001
From: Mini256
Date: Mon, 20 Nov 2023 14:28:11 +0800
Subject: [PATCH 26/29] develop: update the third party support table (#15187)

---
 develop/dev-guide-third-party-support.md | 29 ++++++++++++++--------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/develop/dev-guide-third-party-support.md b/develop/dev-guide-third-party-support.md
index 263b6212a788e..7e0a9637767a0 100644
--- a/develop/dev-guide-third-party-support.md
+++ b/develop/dev-guide-third-party-support.md
@@ -140,30 +140,37 @@ If you encounter problems when connecting to TiDB using the tools listed in this
 v7.0
 Full
 N/A
- N/A
+ Connect to TiDB with Rails Framework and ActiveRecord ORM
 
- JavaScript / TypeScript
- sequelize
+ JavaScript / TypeScript
+ Sequelize
 v6.20.1
 Full
 N/A
- N/A
+ Connect to TiDB with Sequelize
 
- Prisma Client
+ Prisma
 4.16.2
 Full
 N/A
+ Connect to TiDB with Prisma
+ 
+ 
+ TypeORM
+ v0.3.17
+ Full
 N/A
+ Connect to TiDB with TypeORM
 
 Python
 Django
- v4.1
+ v4.2
 Full
 django-tidb
- Connect to TiDB with Django
+ Connect to TiDB with Django
 
 SQLAlchemy
@@ -177,6 +184,8 @@ If you encounter problems when connecting to TiDB using the tools listed in this

## GUI

-| GUI | Latest tested version | Support level | Tutorial |
-| - | - | - | - |
-| [DBeaver](https://dbeaver.io/) | 23.0.3 | Full | [Connect to TiDB with DBeaver](/develop/dev-guide-gui-dbeaver.md) |
+| GUI | Latest tested version | Support level | Tutorial |
+|-----------------------------------------------------------|-----------------------|---------------|-------------------------------------------------------------------------------|
+| [JetBrains DataGrip](https://www.jetbrains.com/datagrip/) | 2023.2.1 | Full | [Connect to TiDB with JetBrains DataGrip](/develop/dev-guide-gui-datagrip.md) |
+| [DBeaver](https://dbeaver.io/) | 23.0.3 | Full | [Connect to TiDB with 
DBeaver](/develop/dev-guide-gui-dbeaver.md) | +| [Visual Studio Code](https://code.visualstudio.com/) | 1.72.0 | Full | [Connect to TiDB with Visual Studio Code](/develop/dev-guide-gui-vscode-sqltools.md) | From 2641eb2af413fa0539d1fd606f7314467710005e Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Mon, 20 Nov 2023 23:02:41 +0800 Subject: [PATCH 27/29] import into 7.5 (#15288) --- releases/release-7.4.0.md | 4 +-- sql-statements/sql-statement-import-into.md | 36 +++++++++++++-------- tidb-global-sort.md | 4 +++ 3 files changed, 28 insertions(+), 16 deletions(-) diff --git a/releases/release-7.4.0.md b/releases/release-7.4.0.md index 19faf3150fb49..c27e18e365f86 100644 --- a/releases/release-7.4.0.md +++ b/releases/release-7.4.0.md @@ -25,7 +25,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v7.4/quick-start-with- Reliability and Availability Improve the performance and stability of IMPORT INTO and ADD INDEX operations via global sort (experimental) - Before v7.4.0, tasks such as ADD INDEX or IMPORT INTO using the distributed execution framework meant localized and partial sorting, which ultimately led to TiKV doing a lot of extra work to make up for the partial sorting. These jobs also required TiDB nodes to allocate local disk space for sorting, before loading to TiKV.
With the introduction of the Global Sorting feature in v7.4.0, data is temporarily stored in external shared storage (S3 in this version) for global sorting before being loaded into TiKV. This eliminates the need for TiKV to consume extra resources and significantly improves the performance and stability of operations like ADD INDEX and IMPORT INTO. + Before v7.4.0, tasks such as ADD INDEX or IMPORT INTO using the distributed execution framework meant localized and partial sorting, which ultimately led to TiKV doing a lot of extra work to make up for the partial sorting. These jobs also required TiDB nodes to allocate local disk space for sorting, before loading to TiKV.
With the introduction of the Global Sort feature in v7.4.0, data is temporarily stored in external shared storage (S3 in this version) for global sorting before being loaded into TiKV. This eliminates the need for TiKV to consume extra resources and significantly improves the performance and stability of operations like ADD INDEX and IMPORT INTO. Resource control for background tasks (experimental) @@ -241,7 +241,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v7.4/quick-start-with- * Enhance the `IMPORT INTO` feature [#46704](https://github.com/pingcap/tidb/issues/46704) @[D3Hunter](https://github.com/D3Hunter) - Starting from v7.4.0, you can add the `CLOUD_STORAGE_URI` option in the `IMPORT INTO` statement to enable the [global sorting](/tidb-global-sort.md) feature (experimental), which helps boost import performance and stability. In the `CLOUD_STORAGE_URI` option, you can specify a cloud storage address for the encoded data. + Starting from v7.4.0, you can add the `CLOUD_STORAGE_URI` option in the `IMPORT INTO` statement to enable the [Global Sort](/tidb-global-sort.md) feature (experimental), which helps boost import performance and stability. In the `CLOUD_STORAGE_URI` option, you can specify a cloud storage address for the encoded data. In addition, in v7.4.0, the `IMPORT INTO` feature introduces the following functionalities: diff --git a/sql-statements/sql-statement-import-into.md b/sql-statements/sql-statement-import-into.md index 225a8a9fc3f10..db0885f04606b 100644 --- a/sql-statements/sql-statement-import-into.md +++ b/sql-statements/sql-statement-import-into.md @@ -7,15 +7,9 @@ summary: An overview of the usage of IMPORT INTO in TiDB. The `IMPORT INTO` statement is used to import data in formats such as `CSV`, `SQL`, and `PARQUET` into an empty table in TiDB via the [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) of TiDB Lightning. - - -> **Warning:** +> **Note:** > -> Currently, this statement is experimental. It is not recommended to use it in production environments. +> This statement is only applicable to TiDB Self-Hosted and not available on [TiDB Cloud](https://docs.pingcap.com/tidbcloud/). `IMPORT INTO` supports importing data from files stored in Amazon S3, GCS, and the TiDB local storage. @@ -28,7 +22,7 @@ This TiDB statement is not applicable to TiDB Cloud. ## Restrictions -- Currently, `IMPORT INTO` supports importing data within 1 TiB. +- Currently, `IMPORT INTO` supports importing data within 10 TiB. - `IMPORT INTO` only supports importing data into existing empty tables in the database. - `IMPORT INTO` does not support transactions or rollback. Executing `IMPORT INTO` within an explicit transaction (`BEGIN`/`END`) will return an error. - The execution of `IMPORT INTO` blocks the current connection until the import is completed. To execute the statement asynchronously, you can add the `DETACHED` option. @@ -39,6 +33,9 @@ This TiDB statement is not applicable to TiDB Cloud. - The TiDB [temporary directory](/tidb-configuration-file.md#temp-dir-new-in-v630) is expected to have at least 90 GiB of available space. It is recommended to allocate storage space that is equal to or greater than the volume of data to be imported. - One import job supports importing data into one target table only. To import data into multiple target tables, after the import for a target table is completed, you need to create a new job for the next target table. - `IMPORT INTO` is not supported during TiDB cluster upgrades. 
+- When the [Global Sort](/tidb-global-sort.md) feature is used for data import, the data size of a single row after encoding must not exceed 32 MiB. +- When the Global Sort feature is used for data import, if the target TiDB cluster is deleted before the import task is completed, temporary data used for global sorting might remain on Amazon S3. In this case, you need to delete the residual data manually to avoid increasing S3 storage costs. +- Ensure that the data to be imported does not contain any records with primary key or non-null unique index conflicts. Otherwise, the conflicts can result in import task failures. ## Prerequisites for import @@ -137,7 +134,7 @@ The supported options are described as follows: | `MAX_WRITE_SPEED=''` | All formats | Controls the write speed to a TiKV node. By default, there is no speed limit. For example, you can specify this option as `1MiB` to limit the write speed to 1 MiB/s. | | `CHECKSUM_TABLE=''` | All formats | Configures whether to perform a checksum check on the target table after the import to validate the import integrity. The supported values include `"required"` (default), `"optional"`, and `"off"`. `"required"` means performing a checksum check after the import. If the checksum check fails, TiDB will return an error and the import will exit. `"optional"` means performing a checksum check after the import. If an error occurs, TiDB will return a warning and ignore the error. `"off"` means not performing a checksum check after the import. | | `DETACHED` | All Formats | Controls whether to execute `IMPORT INTO` asynchronously. When this option is enabled, executing `IMPORT INTO` immediately returns the information of the import job (such as the `Job_ID`), and the job is executed asynchronously in the backend. | -| `CLOUD_STORAGE_URI` | All formats | Specifies the target address where encoded KV data for [global sorting](#global-sorting) is stored. When `CLOUD_STORAGE_URI` is not specified, `IMPORT INTO` determines whether to use global sorting based on the value of the system variable [`tidb_cloud_storage_uri`](/system-variables.md#tidb_cloud_storage_uri-new-in-v740). If this system variable specifies a target storage address, `IMPORT INTO` uses this address for global sorting. When `CLOUD_STORAGE_URI` is specified with a non-empty value, `IMPORT INTO` uses that value as the target storage address. When `CLOUD_STORAGE_URI` is specified with an empty value, local sorting is enforced. Currently, the target storage address only supports S3. For details about the URI configuration, see [Amazon S3 URI format](/external-storage-uri.md#amazon-s3-uri-format). When this feature is used, all TiDB nodes must have read and write access for the target S3 bucket. | +| `CLOUD_STORAGE_URI` | All formats | Specifies the target address where encoded KV data for [Global Sort](/tidb-global-sort.md) is stored. When `CLOUD_STORAGE_URI` is not specified, `IMPORT INTO` determines whether to use Global Sort based on the value of the system variable [`tidb_cloud_storage_uri`](/system-variables.md#tidb_cloud_storage_uri-new-in-v740). If this system variable specifies a target storage address, `IMPORT INTO` uses this address for Global Sort. When `CLOUD_STORAGE_URI` is specified with a non-empty value, `IMPORT INTO` uses that value as the target storage address. When `CLOUD_STORAGE_URI` is specified with an empty value, local sorting is enforced. Currently, the target storage address only supports S3. 
For details about the URI configuration, see [Amazon S3 URI format](/external-storage-uri.md#amazon-s3-uri-format). When this feature is used, all TiDB nodes must have read and write access for the target S3 bucket. | ## Compressed files @@ -153,7 +150,11 @@ The supported options are described as follows: > > The Snappy compressed file must be in the [official Snappy format](https://github.com/google/snappy). Other variants of Snappy compression are not supported. -## Global sorting +## Global Sort + +> **Warning:** +> +> The Global Sort feature is experimental. It is not recommended to use it in production environments. `IMPORT INTO` splits the data import job of a source data file into multiple sub-jobs, each sub-job independently encoding and sorting data before importing. If the encoded KV ranges of these sub-jobs have significant overlap (to learn how TiDB encodes data to KV, see [TiDB computing](/tidb-computing.md)), TiKV needs to keep compaction during import, leading to a decrease in import performance and stability. @@ -163,12 +164,19 @@ In the following scenarios, there can be significant overlap in KV ranges: - `IMPORT INTO` splits sub-jobs based on the traversal order of data files, usually sorted by file name in lexicographic order. - If the target table has many indexes, or the index column values are scattered in the data file, the index KV generated by the encoding of each sub-job will also overlap. -When [Backend task distributed execution framework](/tidb-distributed-execution-framework.md) is enabled, you can enable global sorting by specifying the `CLOUD_STORAGE_URI` option in the `IMPORT INTO` statement or by specifying the target storage address for encoded KV data using the system variable [`tidb_cloud_storage_uri`](/system-variables.md#tidb_cloud_storage_uri-new-in-v740). Note that currently, only S3 is supported as the global sorting storage address. When global sorting is enabled, `IMPORT INTO` writes encoded KV data to the cloud storage, performs global sorting in the cloud storage, and then parallelly imports the globally sorted index and table data into TiKV. This prevents problems caused by KV overlap and enhances import stability. +When [Backend task distributed execution framework](/tidb-distributed-execution-framework.md) is enabled, you can enable [Global Sort](/tidb-global-sort.md) by specifying the `CLOUD_STORAGE_URI` option in the `IMPORT INTO` statement or by specifying the target storage address for encoded KV data using the system variable [`tidb_cloud_storage_uri`](/system-variables.md#tidb_cloud_storage_uri-new-in-v740). Note that currently, only S3 is supported as the Global Sort storage address. When Global Sort is enabled, `IMPORT INTO` writes encoded KV data to the cloud storage, performs Global Sort in the cloud storage, and then parallelly imports the globally sorted index and table data into TiKV. This prevents problems caused by KV overlap and enhances import stability. + +Global Sort consumes a significant amount of memory resources. Before the data import, it is recommended to configure the [`tidb_server_memory_limit_gc_trigger`](/system-variables.md#tidb_server_memory_limit_gc_trigger-new-in-v640) and [`tidb_server_memory_limit`](/system-variables.md#tidb_server_memory_limit-new-in-v640) variables, which avoids golang GC being frequently triggered and thus affecting the import efficiency. 
+ +```sql +SET GLOBAL tidb_server_memory_limit_gc_trigger=0.99; +SET GLOBAL tidb_server_memory_limit='88%'; +``` > **Note:** > -> - If the KV range overlap in a source data file is low, enabling global sorting might decrease import performance. This is because when global sorting is enabled, TiDB needs to wait for the completion of local sorting in all sub-jobs before proceeding with the global sorting operations and subsequent import. -> - After an import job using global sorting completes, the files stored in the cloud storage for global sorting are cleaned up asynchronously in a background thread. +> - If the KV range overlap in a source data file is low, enabling Global Sort might decrease import performance. This is because when Global Sort is enabled, TiDB needs to wait for the completion of local sorting in all sub-jobs before proceeding with the Global Sort operations and subsequent import. +> - After an import job using Global Sort completes, the files stored in the cloud storage for Global Sort are cleaned up asynchronously in a background thread. ## Output diff --git a/tidb-global-sort.md b/tidb-global-sort.md index b5f3e66d7e851..19a00403c7c32 100644 --- a/tidb-global-sort.md +++ b/tidb-global-sort.md @@ -50,6 +50,10 @@ To enable Global Sort, follow these steps: 2. Set [`tidb_cloud_storage_uri`](/system-variables.md#tidb_cloud_storage_uri-new-in-v740) to a correct cloud storage path. See [an example](/br/backup-and-restore-storages.md). +> **Note:** +> +> For [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md), you can also specify the cloud storage path using the [`CLOUD_STORAGE_URI`](/sql-statements/sql-statement-import-into.md#withoptions) option. If both [`tidb_cloud_storage_uri`](/system-variables.md#tidb_cloud_storage_uri-new-in-v740) and `CLOUD_STORAGE_URI` are configured with a valid cloud storage path, the configuration of `CLOUD_STORAGE_URI` takes effect for [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md). +
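+The following minimal sketch shows one way to complete step 2 before running an import; the bucket and prefix in the URI are hypothetical placeholders that you need to replace with your own Amazon S3 path:
+
+```sql
+-- Point Global Sort at a shared S3 path (hypothetical bucket and prefix).
+SET GLOBAL tidb_cloud_storage_uri = 's3://my-bucket/global-sort-prefix';
+```
+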
From f5c4fbdd32f2629ba52dc7cfb55e4ec0761d2fcf Mon Sep 17 00:00:00 2001 From: Aolin Date: Tue, 21 Nov 2023 14:26:40 +0800 Subject: [PATCH 28/29] Update disk space requirements for import into (#15384) --- hardware-and-software-requirements.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/hardware-and-software-requirements.md b/hardware-and-software-requirements.md index 03934b25a4428..396e9fe7bc33d 100644 --- a/hardware-and-software-requirements.md +++ b/hardware-and-software-requirements.md @@ -144,7 +144,7 @@ You can deploy and run TiDB on the 64-bit generic hardware server platform in th | Component | CPU | Memory | Local Storage | Network | Number of Instances (Minimum Requirement) | | :------: | :-----: | :-----: | :----------: | :------: | :----------------: | -| TiDB | 8 core+ | 16 GB+ | No special requirements | Gigabit network card | 1 (can be deployed on the same machine with PD) | +| TiDB | 8 core+ | 16 GB+ | [Disk space requirements](#disk-space-requirements) | Gigabit network card | 1 (can be deployed on the same machine with PD) | | PD | 4 core+ | 8 GB+ | SAS, 200 GB+ | Gigabit network card | 1 (can be deployed on the same machine with TiDB) | | TiKV | 8 core+ | 32 GB+ | SAS, 200 GB+ | Gigabit network card | 3 | | TiFlash | 32 core+ | 64 GB+ | SSD, 200 GB+ | Gigabit network card | 1 | @@ -156,7 +156,6 @@ You can deploy and run TiDB on the 64-bit generic hardware server platform in th > - For performance-related test, do not use low-performance storage and network hardware configuration, in order to guarantee the correctness of the test result. > - For the TiKV server, it is recommended to use NVMe SSDs to ensure faster reads and writes. > - If you only want to test and verify the features, follow [Quick Start Guide for TiDB](/quick-start-with-tidb.md) to deploy TiDB on a single machine. -> - The TiDB server uses the disk to store server logs, so there are no special requirements for the disk type and capacity in the test environment. > - Starting from v6.3.0, to deploy TiFlash under the Linux AMD64 architecture, the CPU must support the AVX2 instruction set. Ensure that `cat /proc/cpuinfo | grep avx2` has output. To deploy TiFlash under the Linux ARM64 architecture, the CPU must support the ARMv8 instruction set architecture. Ensure that `cat /proc/cpuinfo | grep 'crc32' | grep 'asimd'` has output. By using the instruction set extensions, TiFlash's vectorization engine can deliver better performance. ### Production environment @@ -234,7 +233,7 @@ As an open-source distributed SQL database, TiDB requires the following network TiDB -
  • At least 30 GB for the log disk
  • Starting from v6.5.0, Fast Online DDL (controlled by the tidb_ddl_enable_fast_reorg variable) is enabled by default to accelerate DDL operations, such as adding indexes. If DDL operations involving large objects exist in your application, it is highly recommended to prepare additional SSD disk space for TiDB (100 GB or more). For detailed configuration instructions, see Set a temporary space for a TiDB instance
+
  • At least 30 GB for the log disk
  • Starting from v6.5.0, Fast Online DDL (controlled by the tidb_ddl_enable_fast_reorg variable) is enabled by default to accelerate DDL operations, such as adding indexes. If DDL operations involving large objects exist in your application, or you want to use IMPORT INTO to import data, it is highly recommended to prepare additional SSD disk space for TiDB (100 GB or more). For detailed configuration instructions, see Set a temporary space for a TiDB instance
Lower than 90% From 4ca04457035cfb0cce732d316b76cb241e18d34f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Tue, 21 Nov 2023 08:34:10 +0100 Subject: [PATCH 29/29] statements: Add SHOW BINARY LOG STATUS info (#15380) --- sql-statements/sql-statement-show-master-status.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sql-statements/sql-statement-show-master-status.md b/sql-statements/sql-statement-show-master-status.md index deb31e34da5fc..0ba461b6a5669 100644 --- a/sql-statements/sql-statement-show-master-status.md +++ b/sql-statements/sql-statement-show-master-status.md @@ -29,6 +29,8 @@ SHOW MASTER STATUS; The output of `SHOW MASTER STATUS` is designed to match MySQL. However, the execution results are different in that the MySQL result is the binlog location information and the TiDB result is the latest TSO information. +The `SHOW BINARY LOG STATUS` statement was added in TiDB as an alias for `SHOW MASTER STATUS`, which has been deprecated in MySQL 8.2.0 and newer versions. + ## See also @@ -44,4 +46,4 @@ The output of `SHOW MASTER STATUS` is designed to match MySQL. However, the exec * [`SHOW TABLE STATUS`](/sql-statements/sql-statement-show-table-status.md) - \ No newline at end of file +
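+
+A minimal usage sketch, assuming a TiDB version that includes this alias: because `SHOW BINARY LOG STATUS` is simply an alias for `SHOW MASTER STATUS`, both statements return the same TSO-based output.
+
+```sql
+-- Both statements are equivalent in TiDB and return the latest TSO information.
+SHOW MASTER STATUS;
+SHOW BINARY LOG STATUS;
+```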