Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON decode error on Nginx 504 Gateway Time-out from Meilisearch Cloud #990

Open
sauloperez opened this issue Jul 4, 2024 · 10 comments
Open

Comments

@sauloperez
Copy link

sauloperez commented Jul 4, 2024

Description

When receiving a 504 Gateway timeout on a call to index.update_documents, the SDK raises a JSONDecodeError instead of the expected MeilisearchApiError. You can see the stacktrace we found in our Sentry account in the Screenshots or Logs section below.

Note we're using Meilisearch Cloud so the underlying issue is related to Meilisearch's Cloud unresponsiveness.

Expected behavior

One would expect to get a meaningful MeilisearchApiError so we can retry the operation at a later time, or any other fault-tolerance mechanism.

Current behavior

A JSONDecodeError was raised, which initially makes you think there's a problem on the way the documents are serialized.

Screenshots or Logs
Screenshot from 2024-07-04 10-52-01

Environment (please complete the following information):

  • OS: Debian 11.10
  • Meilisearch version: v1.8.1
  • meilisearch-python version: 0.31.3
@mde-pach
Copy link

mde-pach commented Jul 4, 2024

Indeed, there is a premise in the way the package is managing such errors.

Here

class MeilisearchApiError(MeilisearchError):

And here
except requests.exceptions.HTTPError as err:

We can see that every errors are intended to be MeilisearchApiError and then follow the same data validation/formatting process as any encountered errors.


To me, it looks like the error management need to handle differently errors that are returned by meilisearch (which are intended to be, following this implementation, only in a json format) and errors that can be returned by some in the middle third part tool like a reverse proxy (nginx here).

There is multiple way to handle such use case, the best one to me is introduced by the RFC 9457 but it's not commonly adopted by systems we are working with today so it doesn't look like a valid option here (maybe in 10 years 😅).

Another solution would be to being able to differentiate meilisearch responses from others. We can use a custom header set by meilisearch server to only try to handle errors as MeilisearchApiError in this use case and manage an ExternalWhateverError for others.

We can also just try/except on json decoding and manage it another way but you can't distinguish not json formatted errors returned by meilisearch (which are not supposed to happen) from not json formatted returned by other services

Regardless the solution you chose to implement, I would be pleased to do it myself in this repository :)

@sanders41
Copy link
Collaborator

I have not had time to really look at this, but a simple solution could potentially be changing

        if request.text:
            json_data = json.loads(request.text)
            self.message = json_data.get("message")
            self.code = json_data.get("code")
            self.link = json_data.get("link")
            self.type = json_data.get("type")

to

        if request.content:
            json_data = request.json()
            self.message = json_data.get("message")
            self.code = json_data.get("code")
            self.link = json_data.get("link")
            self.type = json_data.get("type")

This is untested so I'm not sure if it would solve the issue, but maybe.

@mde-pach
Copy link

mde-pach commented Jul 4, 2024

If the data is not JSON decodable response.json() will still raise a requests.exceptions.JSONDecodeError

The main point for this issue is to:

  • Identify when the data is not in an expected format
  • Find what to do for such data

@sauloperez
Copy link
Author

and it happened again 🙈

image

@curquiza
Copy link
Member

curquiza commented Jul 9, 2024

Hello here, I'm discussing with Meilisearch team. You problem might not be related to the Python SDK (even if changes can always be applied, but let's try to mitigate the problem first).

We had issues with v1.8.0, v1.8.1 and v1.8.2: soft lock and memory leaks we solved in v1.8.3. As far as we know, you are using v1.8.1 on the Cloud, let me know if I'm wrong.
Sorry for the inconvenience but we recommend you migrate to the latest version of Meilisearch. This can be easily done, cf documentation.

@sanders41
Copy link
Collaborator

sanders41 commented Jul 9, 2024

@mde-pach I put in a fix for this in another library here. Do you want to try to do something similar? The idea is if the response isn't JSON we know the error didn't come from Meilisearch so raise the error as is rather than trying to process it.

But as @curquiza said, this won't solve the underlying issue, only hopefully make it easier to understand what is happening.

@brunoocasali
Copy link
Member

brunoocasali commented Jul 9, 2024

Hi folks!

I was discussing with @curquiza and the team that we should change the content types of these errors in our Meilisearch Cloud infra. You got the parsing error because the message comes from our proxy layer, not the meilisearch instance, which is HTML. That's why they are so different.

So, we could add special handling like @sanders41 in his SDK, but ideally, everything should be JSON in the first place, so I don't think it is worth it until we change on our side first.
Changing to JSON would unlock more possibilities for handling them, like creating special classes to give the consumer the power to decide what they should do when they face a network error like this.

@mde-pach
Copy link

mde-pach commented Jul 9, 2024

Hi folks!

I was discussing with @curquiza and the team that we should change the content types of these errors in our Meilisearch Cloud infra. You got the parsing error because the message comes from our proxy layer, not the meilisearch instance, which is HTML. That's why they are so different.

So, we could add special handling like @sanders41 in his SDK, but ideally, everything should be JSON in the first place, so I don't think it is worth it until we change on our side first. Changing to JSON would unlock more possibilities for handling them, like creating special classes to give the consumer the power to decide what they should do when they face a network error like this.

Make sense to me :)

@sanders41
Copy link
Collaborator

@sauloperez if upgrading Meilisearch doesn't solve the issue, you could add tenacity to your functions that are getting error responses to have them retry. It would be a work around and not a real fix, but might help.

@sauloperez
Copy link
Author

thanks everyone, we'll definitely go with tenacity or backoff, while we also try to upgrade to v1.9, which has failed so far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants