Catch index already exists exception #326

Open

gsoyka wants to merge 1 commit into master

Conversation

@gsoyka commented Oct 8, 2018

In our deployment, the schema container fails on the create_index job if the index already exists in Elasticsearch. This patch resolves that issue and breaks the cycle of k8s constantly attempting to restart the job.
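
A minimal sketch of the kind of change involved, assuming the legacy `elasticsearch` Node.js client; the index name, schema path, and error matching are placeholders, not the actual diff:

```js
const elasticsearch = require('elasticsearch');
const schema = require('../schema'); // hypothetical path to the generated schema

const client = new elasticsearch.Client({ host: 'localhost:9200' });

client.indices.create({ index: 'pelias', body: schema }, (err) => {
  if (err && /already_exists_exception/.test(err.message)) {
    // the index is already there: treat it as success so the k8s job
    // completes instead of restarting forever
    console.log('index already exists, nothing to do');
    process.exit(0);
  } else if (err) {
    console.error(err);
    process.exit(1);
  } else {
    console.log('index created');
  }
});
```

The regex is deliberately loose: older Elasticsearch versions report `index_already_exists_exception`, newer ones `resource_already_exists_exception`.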

@orangejulius (Member) commented Oct 9, 2018

Hey @gsoyka,
Are you using the pelias/kubernetes helm charts? Our schema build job has an option to drop the existing schema.

I'm a little hesitant to merge this because I believe failure to drop the schema or failure to create a schema are conditions that should be errors. For example, if you change the version of the schema you're using, a failure to create the new schema means your changes won't be applied; catching that exception makes the problem easier to miss.

I'm on the fence, what do you think?

@gsoyka (Author) commented Oct 10, 2018

I am using the pelias/kubernetes helm charts. Please forgive my ignorance around Elasticsearch, but is it safe to drop and recreate an index, or will that result in data loss? My main concern here is breaking the endless loop of Kubernetes retrying the schema creation job when the index already exists, but I'm completely open to other ways of accomplishing that.

@orangejulius (Member) commented

Dropping the index will delete all the data, so that's a valid concern. (Among other things, "index" is the word Elasticsearch uses the way Postgres or MySQL use the word "database", so it's analogous to dropping a database in an RDBMS.) I admit I've spent a bit of time breaking that endless loop myself.

However, since the schema create job is only used to start a Pelias build, and it's usually best to start each build with a fresh index, losing that data has generally not been a concern.
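
For reference, the drop-and-recreate step looks roughly like this with the same legacy client (`ignore: [404]` tells the client not to treat a missing index as an error):

```js
// Delete the old index if present, then create a fresh one. Everything
// stored in the old index is gone afterwards, much like DROP DATABASE.
client.indices.delete({ index: 'pelias', ignore: [404] }, (err) => {
  if (err) { console.error(err); process.exit(1); }
  client.indices.create({ index: 'pelias', body: schema }, (createErr) => {
    if (createErr) { console.error(createErr); process.exit(1); }
    console.log('fresh index created');
  });
});
```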

Let me know if that matches your workflow. I don't want to assume everyone does things the same way.

@gsoyka (Author) commented Oct 10, 2018

That makes sense. I figured dropping the index would have a similar effect to dropping a database, but I wasn't able to confirm that.

In terms of workflow, I ran into this issue after restarting from a failed build, which is item #1 in the docs you linked listing situations where you don't need to delete the index. I'm working with the entire planet, so restarting from scratch after one data source fails isn't ideal time-wise. Is there a recommendation for what to do when one data source fails and the others succeed?

@orangejulius (Member) commented

Okay. One importer failing when you want to re-run just that one is definitely on the list of cases where you want to avoid dropping the index.

How about this? Let's add a configuration option in pelias.json that controls whether or not an existing index is a fatal error. That increases the complexity here slightly, and it would require a similar PR to pelias/kubernetes.
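
A rough sketch of how `./bin/create_index` could consult such an option, with a hypothetical flag name (nothing like it exists in pelias.json today):

```js
const peliasConfig = require('pelias-config').generate();

// hypothetical option; default to treating an existing index as fatal,
// which preserves the current behavior
const existingIndexIsFatal =
  (peliasConfig.schema || {}).existingIndexIsFatal !== false;

client.indices.create({ index: 'pelias', body: schema }, (err) => {
  if (err && /already_exists_exception/.test(err.message)) {
    console.log('index already exists');
    process.exit(existingIndexIsFatal ? 1 : 0);
  }
});
```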

By the way, which importer was failing? We have been working on making that less likely with PRs such as pelias/openaddresses#405, so if we can help fix something, we'd like to know.

@arne-cl (Member) commented May 27, 2019

I just ran into the same problem on Kubernetes (I'm not using Helm, but deployed the Pelias build jobs/pods manually).

Could the code for ./bin/create_index be changed to only throw an error if the index exists AND it does not match the schema that Pelias needs? That way, the code can always run without "causing harm".
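
A sketch of that check, assuming the legacy client and assuming the mapping Elasticsearch returns is directly comparable to the one we send (the reply below notes that may not hold):

```js
const _ = require('lodash');

client.indices.getMapping({ index: 'pelias' }, (err, response) => {
  if (err) { console.error(err); process.exit(1); }
  // the response is keyed by index name: { pelias: { mappings: { ... } } }
  const existing = response.pelias.mappings;
  if (_.isEqual(existing, schema.mappings)) {
    console.log('existing index already matches the schema');
    process.exit(0);
  }
  console.error('existing index does not match the required schema');
  process.exit(1);
});
```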

@orangejulius (Member) commented

Hey @arne-cl,
That's a really good idea.

I'm not sure the schema JSON that we generate and send to Elasticsearch in this module is identical to what comes back if we ask Elasticsearch for the existing schema. But if you want to play around with it and see how much work it would be, that's definitely a feature we'd merge, and it would solve this issue nicely.
