Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add action to create index from a source index #118890

Merged
merged 5 commits into from
Dec 18, 2024

Conversation

parkertimmins
Copy link
Contributor

Add CreateIndexFromSourceAction which creates a new index using the settings and mappings from a source index.
This already existed within ReindexDataStreamIndexAction, but was factored into it's own action, so can be used for migrating non-data stream indices.

The request takes source and destination index names, along with settings and mappings overrides. The destination index is then created with the settings and mappings from the source. Private and non-copyable settings are not added to the source index. If the request contains setting or mapping overrides, these are merged with the source settings and mappings before index creation.

Add new action that create and index from a source
index, copying settings and mappings from the source index.
This was refactored out of ReindexDataStreamIndexAction
@parkertimmins parkertimmins marked this pull request as ready for review December 17, 2024 21:03
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Dec 17, 2024
.andThenApply(ignored -> new ReindexDataStreamIndexAction.Response(destIndexName))
.addListener(listener);
}

private void setBlockWrites(String sourceIndexName, ActionListener<AcknowledgedResponse> listener) {
logger.debug("Setting write block on source index [{}]", sourceIndexName);
final Settings readOnlySettings = Settings.builder().put(IndexMetadata.INDEX_BLOCKS_WRITE_SETTING.getKey(), true).build();
var updateSettingsRequest = new UpdateSettingsRequest(readOnlySettings, sourceIndexName);
client.admin().indices().updateSettings(updateSettingsRequest, new ActionListener<>() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add block-writes and read-only to the source and dest indices was originally done with an UpdateSettingsRequest. While this works, I noticed that ILM used the AddIndexBlock api which seems a bit safer, so I used this as well.

Copy link
Member

@masseyke masseyke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I assume there's a rest action coming that will pretty directly use this new action, and that kibana will use that. Just for my own sake, the parts of reindexing an index for upgrade are:

  • add write block to the source index
  • delete destination index (if it exists)
  • create destination index <== This new action
  • reindex from source to dest
  • undo blocks on dest
  • delete source (not currently being done)

So kibana gets to take advantage of creating the destination index, including overriding mappings and copying over source settings. But they'll continue to be responsible for the rest themselves (for now anyway).

@parkertimmins parkertimmins added auto-backport Automatically create backport pull requests when merged and removed needs:triage Requires assignment of a team area label labels Dec 17, 2024
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Dec 17, 2024
@parkertimmins parkertimmins added >enhancement :Data Management/Data streams Data streams and their lifecycles labels Dec 17, 2024
@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team and removed needs:triage Requires assignment of a team area label labels Dec 17, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @parkertimmins, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@jloleysens
Copy link
Contributor

From Kibana's side this action would replace the following code with one API call:

https://github.com/elastic/kibana/blob/38aa404e4f00ce98b7d004255583c5ba35d36e54/x-pack/plugins/upgrade_assistant/server/lib/reindexing/reindex_service.ts#L188-L221

Brief description of this code: we request source index mappings and settings (getFlatSettings), transform/prepare them (transformFlatSettings, see settings steps 1 and 2, mappings is currently noop transform) and then create the destination index.

fwiw If destination index exists already (for whatever reason) we just continue to call reindex from source -> dest (no deletion attempt).


Aside: we planned on introducing this improvement. The idea is that we'll create the destination index with refresh_interval: -1 and number_of_replicas: 0 initially to improve (re)index performance.

With ES's new create from source action available via rest it would be a nice-to-have if we could provide these "overrides" to the create action directly, or perhaps if the API just did this by default? WDYT?

@parkertimmins
Copy link
Contributor Author

@jloleysens Looking through what Kibana does, it appears I missed how it filters out some of the deprecated settings. Thanks for linking this.

Currently, this action will fail if the index already exists. Since we delete the destination index if it exists in the reindex-data-stream code, this seemed like the correct thing to do. (There has been some debate on this point, and I'm uncertain which way is better).
So that it doesn't fail for Kibana, I could add a boolean parameter like ignoreIfDestinationExists which would just return an Ack if the dest index exists.

I am inclined to not put any defaults in the override as I think this api could be useful more generally and don't want unexpected defaults for other users. Though I could see adding a boolean parameter (something like addCommonReindexSettings) which merges these settings into the source settings before adding the override settings.

What do you think?

@parkertimmins parkertimmins merged commit 1004da2 into elastic:main Dec 18, 2024
17 checks passed
@parkertimmins parkertimmins deleted the create-index-from-source branch December 18, 2024 16:35
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

parkertimmins added a commit to parkertimmins/elasticsearch that referenced this pull request Dec 18, 2024
Add new action that creates an index from a source
index, copying settings and mappings from the source index.
This was refactored out of ReindexDataStreamIndexAction.
@parkertimmins
Copy link
Contributor Author

@jloleysens After some discussion, I think we should actually not provide the addCommonReindexSettings option as I suggested. Mainly, because we want to keep this a very general API. But also, Kibana will have to know about the specific overrides used as it will need to unset them later. For example, number_of_replicas will need to be reset to the value from the source index. I think settings defaults within the API has more potential for errors due to forgetting to cleanup values after reindex.

elasticsearchmachine pushed a commit that referenced this pull request Dec 18, 2024
* Add action to create index from a source index (#118890)

Add new action that creates an index from a source
index, copying settings and mappings from the source index.
This was refactored out of ReindexDataStreamIndexAction.

* block-writes cannot be added after read-only

Fix bug in ReindexDataStreamIndexAction. If the source index has both a block-writes and is read-only, these must be updated on the destination index. If read-only is set first, the block-writes cannot be added because settings cannot be modified.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged :Data Management/Data streams Data streams and their lifecycles >enhancement Team:Data Management Meta label for data/management team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants