diff --git a/_posts/blog/2024-08-30-accumulo4-preview.md b/_posts/blog/2024-08-30-accumulo4-preview.md index 28ef7f619..a4b88fa6b 100644 --- a/_posts/blog/2024-08-30-accumulo4-preview.md +++ b/_posts/blog/2024-08-30-accumulo4-preview.md @@ -5,7 +5,7 @@ author: Dave Marion ## Background -In version 2.1, we introduced two new optional and experimental features, [External Compactions](https://accumulo.apache.org/blog/2021/07/08/external-compactions.html) and [ScanServers](https://github.com/apache/accumulo/pull/2665). The ExternalCompactions feature included two new server processes, the CompactionCoordinator and the Compactor. Using these new processes and their related configurations allows the user to perform major compactions on Tablets external to the TabletServer process. The configuration in 2.1 allows the user to define different “queues” for the external compactions and to assign a “queue” to the Compactor process when it’s started. This provides the user with some capability to define the resources for different classes of compactions (see the referenced blog post for examples). ExternalCompactions may provide lower latency for major compactions because major compactions that run in the TabletServer may queue up when all of the major compaction threads are busy. +In version 2.1, we introduced two new optional and experimental features, [External Compactions](https://accumulo.apache.org/blog/2021/07/08/external-compactions.html) and [ScanServers](https://github.com/apache/accumulo/pull/2665). The External Compactions feature included two new server processes, the CompactionCoordinator and the Compactor. Using these new processes and their related configurations allows the user to perform major compactions on Tablets external to the TabletServer process. The configuration in 2.1 allows the user to define different “queues” for the External Compactions and to assign a “queue” to the Compactor process when it’s started. This provides the user with some capability to define the resources for different classes of compactions (see the referenced blog post for examples). External Compactions may provide lower latency for major compactions because major compactions that run in the TabletServer may queue up when all of the major compaction threads are busy. The ScanServers feature included one new server process, the ScanServer, which allows users to execute scans against a Tablet external to the TabletServer. Because the ScanServer does not have access to the in-memory mutations within the TabletServer, we introduced a consistency level setting on the Scanner and BatchScanner where scans with the “immediate” consistency setting (default) would be sent to the TabletServer only and scans with the “eventual” consistency setting would be sent to a ScanServer. ScanServers can provide better allocation of resources against the current workload because many ScanServers can be used to scan the same Tablet, and a single ScanServer can be used to scan different versions of the Tablet. Immediate consistency scans are sent to the hosting TabletServer where they could possibly queue up, where eventual consistency scans can be serviced by many ScanServers at the cost of not seeing the most recent data (this time delta is configurable). ScanServer processes can be started with a group name which can be used in the client configuration such that eventual scans of a particular type can be sent to a specific group of ScanServer processes. @@ -15,7 +15,7 @@ The features in version 4.0 are intended to make running Accumulo in a cloud env ### On-Demand Tablets -On an upgrade to Accumulo 4.0, the upgrade code will assign all Tablets (except for the root and metadata tables) with an availability setting of ONDEMAND. What this means is that the Tablet is not assigned and hosted by a TabletServer by default. If an operation is performed that requires a Tablet to be hosted by a TabletServer, then the operation will wait for the Tablet to be assigned and hosted. This setting can be changed and checked using the Shell commands setavailability and getavailability, respectively. When a configurable amount of time has passed where the Tablet has been unused, then it will be unloaded. Other valid availability values are HOSTED, which means that the Tablet will always be hosted (the default in earlier versions of Accumulo), and UNHOSTED, which means that the Tablet will never be hosted. +On an upgrade to Accumulo 4.0, the upgrade code will assign all Tablets (except for the root and metadata tables) with an availability setting of ONDEMAND. What this means is that the Tablet is not assigned and hosted by a TabletServer by default. If an operation is performed that requires a Tablet to be hosted by a TabletServer, then the operation will wait for the Tablet to be assigned and hosted. This setting can be changed and checked using the Shell commands `setavailability` and `getavailability`, respectively. When a configurable amount of time has passed where the Tablet has been unused, then it will be unloaded. Other valid availability values are HOSTED, which means that the Tablet will always be hosted (the default in earlier versions of Accumulo), and UNHOSTED, which means that the Tablet will never be hosted. User operations that would require a Tablet to be hosted are live ingest and immediate consistency scans. Users can still interact with data in unhosted tablets via bulk import and eventual consistency scans, and users can still perform tablet maintenance operations on unhosted tablets. The root and metadata tables have an availability value of HOSTED, which cannot be changed by the user. If your application only performs eventual scans and bulk imports, then only one TabletServer is required with the sole purpose of hosting the root and metadata tables. @@ -23,19 +23,19 @@ Because Tablets are now optionally hosted in a TabletServer, the implementation ### External Compactions Only -If a Tablet is not hosted, and the user is bulk importing to it, this could trigger the need for a major compaction. Hosting the Tablet just for the purpose of compacting it will cause churn on the cluster as the balancer may move Tablets around. This led to the decision to move all major compactions to the ExternalCompactions feature. In 4.0, the CompactionCoordinator component was merged into the Manager process, so manually running the CompactionCoordinator process is no longer required. Running at least one Compactor is required to perform major compactions on the root and metadata tables. +If a Tablet is not hosted, and the user is bulk importing to it, this could trigger the need for a major compaction. Hosting the Tablet just for the purpose of compacting it will cause churn on the cluster as the balancer may move Tablets around. This led to the decision to move all major compactions to the External Compactions feature. In 4.0, the CompactionCoordinator component was merged into the Manager process, so manually running the CompactionCoordinator process is no longer required. Running at least one Compactor is required to perform major compactions on the root and metadata tables. ### Resource Groups -In version 4.0 a new group property can be supplied to the Compactor, ScanServer, and TabletServer processes (this replaces the “queue” property mentioned previously for Compactors). If not specified, the default group is used. These properties allow the user to create groups of processes with the same name that can be used host Tablets, execute major compactions, and perform eventual scans. For example, application A may have requirements that dictate the need for immediate access to Tablet data and application B may have requirements that do not require immediate access to data. You would not want to host Tablets for these applications’ tables in the same set of TabletServers as the loading and unloading of application B’s Tablets would cause churn when balancing. Instead, you would likely want to create two sets of TabletServers, groups appA and appB, where their respective tables can be hosted. The number of TabletServers in the appA group would likely be static, and the number of TabletServers in the appB group can scale as demand requires. +In version 4.0 a new group property can be supplied to the Compactor, ScanServer, and TabletServer processes (this replaces the “queue” property mentioned previously for Compactors). If not specified, the default group is used. These properties allow the user to create groups of processes with the same name that can be used to host Tablets, execute major compactions, and perform eventual scans. For example, application A may have requirements that dictate the need for immediate access to Tablet data and application B may have requirements that do not require immediate access to data. You would not want to host Tablets for these applications’ tables in the same set of TabletServers as the loading and unloading of application B’s Tablets would cause churn when balancing. Instead, you would likely want to create two sets of TabletServers, groups appA and appB, where their respective tables can be hosted. The number of TabletServers in the appA group would likely be static, and the number of TabletServers in the appB group can scale as demand requires. ### Increased Visibility -Speaking of scaling, in version 4.0 we are emitting more metrics that can be used to determine when and how a resource needs to be scaled. The resource group and application name tags can be used to identify the group and type of resource that needs to be scaled. The metric name and value can be used to determine how the resource needs to be scaled. For example, if the metric accumulo.compactor.queue.jobs.queued is increasing, you likely need more Compactor resources. Likewise, if the metric accumulo.tserver.compactions.minc.queued or accumulo.tserver.hold is increasing, then you might need to start more TabletServers. +Speaking of scaling, in version 4.0 we are emitting more metrics that can be used to determine when and how a resource needs to be scaled. The resource group and application name tags can be used to identify the group and type of resource that needs to be scaled. The metric name and value can be used to determine how the resource needs to be scaled. For example, if the value for metric `accumulo.compactor.queue.jobs.queued` is increasing, you likely need more Compactor resources. Likewise, if the value for metric `accumulo.tserver.compactions.minc.queued` or `accumulo.tserver.hold` is increasing, then you might need to start more TabletServers. ## Possible Deployment Scenarios -With the new features described above many possible deployment scenarios are possible. We highlight a few them below. +With the new features described above, many possible deployment scenarios are possible. We highlight a few of them below. ### Scenario 1