-
-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catch index already exists exception #326
base: master
Are you sure you want to change the base?
Conversation
Hey @gsoyka, I'm a little hesitant to merge this because I believe failure to drop the schema or failure to create a schema are conditions that should be errors. For example, if changing the version of the schema that you are using, failure to create the schema means any new changes wouldn't be applied. Catching that exception makes it easier to miss that. I'm on the fence, what do you think? |
I am using the pelias/kubernetes helm charts. Please forgive my ignorance around elasticsearch, but is it safe to drop and recreate an index, or will that result in data loss? My main concern here is breaking the endless loop of kubernetes retrying the schema creation job if the index already exists, but I'm completely open to other ways of accomplishing that. |
Dropping the index will delete all the data, so that's a valid concern. (among other things, However, since the schema create job is used only to start a Pelias build, and it's usually best to start each build with a freshly-dropped index, usually losing that data has not been a concern. Let me know if that matches with your workflows. I don't want to assume everyone does things the same way. |
That makes sense, I figured dropping the index would have a similar effect as dropping a database but wasn't able to confirm that. In terms of workflow, I came upon this issue after restarting from a failed build, as referenced by item #1 in the docs you linked as a situation when you don't need to delete the index. I'm working with the entire planet, so restarting from scratch after one data source fails isn't ideal from a time perspective. Is there a recommendation of what to do when one data source fails and the others succeed? |
Okay. One importer failing and you want to continue just that one is definitely in the list of cases where you want to avoid dropping the index. How about this? Lets make a configuration option in By the way, which importer was failing? We have been working on making that less likely with PRs such as pelias/openaddresses#405, so if we can help fix something, we'd like to know. |
I just ran into the same problem on Kubernetes (I'm not using HELM, but deployed the pelias-build jobs/pods manually). Could the code for |
Hey @arne-cl, I'm not sure if the schema JSON that we generate and send to Elasticsearch in this module is identical to what would come back if we ask Elasticsearch for the existing schema, but if you want to play around with it and see how much work it would be, that's definitely a feature we'd merge, and would solve this issue nicely. |
In our deployment, the schema container fails on the create_index job if the schema already exists in elasticsearch. This patch resolves that issue and breaks the cycle of k8s constantly attempting to restart the job.