Querystring Search Performance #1252
jMeter Performance Test Results
querystring-search performance tests -> you can see that the results are actually pretty close to each other, at least when we hammer the instance a bit.
Spread of the results for one kind of request -> you can see that even for the exact same request (in this case, a content object with fullobjects=1) the response times are pretty widespread; this looks like a load problem.
See the linked ZCatalog PR above.
@jensens on our way back home from Sorrento, @jackahl and I came up with an idea of how to solve the querystring-search caching issue. We could, as discussed, use a GET request instead of POST and use subtraversal for a unique URL. Though, instead of encoding the complex query in the URL as GET params (which has lots of possible side effects and runs into the character limit), we could just send the query in the body and put a hash of the query in the URL to make it unique. I outlined the idea above: And I also started to play around with plone.rest, to see if that works:
Though, the main question would be whether we get away with bending the HTTP standard: https://stackoverflow.com/questions/978061/http-get-with-request-body?answertab=votes#tab-top and whether Varnish/Cloudflare or any other system just works or we run into issues (the Python requests lib, curl, etc. seem to work).
The other option would be to send the encoded querystring in the URL as subtraversal, with the problem that it is pretty opaque for devs and that it can run into the 2000 characters limit: https://stackoverflow.com/questions/417142/what-is-the-maximum-length-of-a-url-in-different-browsers
@jackahl gave me the idea for the solution we came up with when he proposed to send the "clean" query within the HTTP body, just for reference and dev convenience. I'd love to hear what you think! :)
@lukasgraf @buchi did you ever have the need to cache POST requests in plone.restapi?
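A minimal sketch of the hash-in-URL idea described above, under stated assumptions: the sub-path layout (`@querystring-search/<hash>`), the helper names, the choice of SHA-256, and the sample query values are all hypothetical and not taken from the plone.rest playground. It only shows how a stable hash of the query body could make the URL unique.

```python
import hashlib
import json


def query_hash(query: dict) -> str:
    """Return a stable SHA-256 hex digest for a query dict."""
    # sort_keys and fixed separators give a canonical serialization,
    # so logically equal queries produce the same hash.
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def cacheable_url(base_url: str, query: dict) -> str:
    """Append the query hash as a sub-path (hypothetical URL layout)."""
    return f"{base_url.rstrip('/')}/@querystring-search/{query_hash(query)}"


# Example values only; any querystring-search payload would do.
query = {
    "query": [
        {
            "i": "portal_type",
            "o": "plone.app.querystring.operation.selection.any",
            "v": ["Document"],
        }
    ],
    "sort_on": "modified",
}

url = cacheable_url("http://localhost:8080/Plone", query)
print(url)
```

The client would still send `query` in the request body; the hash only serves as a cache key, so a cache in front of the backend would have to be trusted not to strip or ignore the body.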
My overall reaction: Great idea, using a hash and sub-traversal ... but is this within the spec? Specification-wise (HTTP/1.1), see RFC 7231: https://www.rfc-editor.org/rfc/rfc7231#page-25
When we implement it this way, we have no guarantee that web servers, caches, web application firewalls, and similar components operating between the client and the API server will work correctly. My conclusion: No, better not. We would open Pandora's box here. I would propose to check the length of the encoded payload and, if it exceeds the limit, go with a POST. Inspired by your idea to use the body, we could use the headers to send the payload (combined with a hash). But this feels somehow wrong and hackish as well, even if we would kind of stay within the specs. Overall this is not that satisfying.
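The length-based fallback proposed above could be sketched roughly like this. The function name, the `query` parameter name, and the 2000-character default are assumptions for illustration (the limit is the figure quoted earlier in this thread, not a verified constant).

```python
import json
from urllib.parse import quote

# Assumed limit, based on the 2000-character figure mentioned in this thread.
MAX_URL_LENGTH = 2000


def build_request(endpoint: str, query: dict, max_length: int = MAX_URL_LENGTH):
    """Return (method, url, body): GET if the encoded URL fits, else POST."""
    # Percent-encode the compact JSON so it is safe inside a query string.
    encoded = quote(json.dumps(query, separators=(",", ":")), safe="")
    url = f"{endpoint}?query={encoded}"
    if len(url) <= max_length:
        return "GET", url, None
    # Too long for a URL: fall back to POST with the query as the body.
    return "POST", endpoint, query
```

With this split, short queries stay cacheable as plain GET requests, while long queries still work but bypass the cache.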
@jensens I share your doubts and I agree that this could cause problems. Though, Elasticsearch is doing exactly this (GET with a query in the payload). Therefore it might be worth a shot. If we ran into issues, this would be a deal-breaker of course. Making the frontend decide to use either GET or POST could be a valid way to go. Though, we would still have to fix the nested querystring issue, and this would mean that long queries cannot be cached at all. I thought about using headers as well. Though, as you said, this feels hacky and there is also a size limit on HTTP headers. The body is the only size-safe way.
Just a follow-up: Solr also uses GET for search requests, and it seems that GET is even the default here: https://solr.apache.org/guide/8_10/json-request-api.html
@sneridagh @tiberiuichim @plone/volto-team my main challenge right now is how to properly convert a nested dict with an array to a proper querystring. In Volto we are using the "query-string" lib, which intentionally does not support nesting: https://github.com/sindresorhus/query-string#nesting After digging into this rabbit hole I can fully understand this decision. Though, this would mean that Volto (or any other restapi client) needs to "stringify" an object within the request. In JS this would look like this:
In Python like this:
(link to my playground in plone.rest: https://github.com/plone/plone.rest/pull/125/files) Though, there are other libraries (like "qs") which seem to support nested querystrings:
|
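A minimal Python sketch of the "stringify" approach described above, using only the standard library. The helper names and the `query` parameter name are assumptions for illustration, and this is not the code from the plone.rest playground: the nested part is JSON-encoded into a single flat parameter, so an ordinary querystring encoder can handle the rest.

```python
import json
from urllib.parse import parse_qs, urlencode


def encode_query(nested: list, **flat_params: str) -> str:
    """Client side: JSON-encode the nested part, keep flat params as-is."""
    params = {"query": json.dumps(nested, separators=(",", ":")), **flat_params}
    return urlencode(params)


def decode_query(querystring: str) -> tuple[list, dict]:
    """Server side: parse the querystring and restore the nested part."""
    parsed = parse_qs(querystring)
    nested = json.loads(parsed.pop("query")[0])
    # parse_qs returns a list per key; take the first value of each.
    flat = {key: values[0] for key, values in parsed.items()}
    return nested, flat


# Example values only.
nested = [
    {
        "i": "portal_type",
        "o": "plone.app.querystring.operation.selection.any",
        "v": ["Document", "News Item"],
    }
]
querystring = encode_query(nested, sort_on="modified")
print(querystring)
print(decode_query(querystring))
```

The trade-off is the one raised in the thread: both sides need an explicit conversion step, whereas a library with nesting support would avoid it at the cost of a less standard querystring format.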
At first sight, I guess it's better to use something that supports it, so no conversion whatsoever is needed (on either side). Fact: If we choose to change to |
Back to work after a day off and thinking further.
|
@jensens +1 to all. |
Side info, but maybe good to know: we use 'paqo' in the serialized p.a.querystring query for the URL, in search block: https://github.com/plone/volto/blob/master/src/components/manage/Blocks/Search/hocs/withSearch.jsx#L177 |
The HTTP spec doesn't say you can't send data with a GET request, but whether it is supported depends on the frameworks used. Another issue with this approach is that the requests can't be cached. If you use query string parameters, you are able to cache the requests. There are, however, things to take into account when using the query string for this:
Complex data formats
JSON is not supported by default as a query string. There is also no "global" standard on how to do this. There are multiple methods of encoding a JSON object to a string; let's take the following query:
Which results in the following transforms:
The encode URI method is in my opinion the safest and most straightforward implementation and shouldn't be hard to implement on both the frontend and the backend.
Length
Although the HTTP spec doesn't limit the length of a query string, all browsers do. Currently the following limits are in place:
Looking at the above examples, even the largest (URL encoding) is "only" 453 characters, so even if you want to support Internet Explorer almost all requests should work.
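To get a feeling for how the choice of encoding affects URL length, here is a rough sketch comparing two possible transforms: percent-encoded JSON and URL-safe base64. The sample query and helper names are hypothetical, and these two transforms are not necessarily the ones compared in the comment above.

```python
import base64
import json
from urllib.parse import quote, unquote


def to_percent_encoded(query: dict) -> str:
    """Compact JSON, percent-encoded for use in a query string."""
    return quote(json.dumps(query, separators=(",", ":")), safe="")


def from_percent_encoded(value: str) -> dict:
    return json.loads(unquote(value))


def to_base64(query: dict) -> str:
    """Compact JSON, encoded with the URL-safe base64 alphabet."""
    raw = json.dumps(query, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def from_base64(value: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(value))


# Example values only.
sample = {
    "query": [
        {
            "i": "portal_type",
            "o": "plone.app.querystring.operation.selection.any",
            "v": ["Document"],
        }
    ],
    "sort_on": "modified",
    "b_size": 25,
}

for name, encode in (("percent", to_percent_encoded), ("base64", to_base64)):
    print(name, len(encode(sample)))
```

Percent-encoding keeps the query readable in logs and browser tools but grows with every quote, brace, and bracket; base64 has a fixed overhead of about one third but is opaque to developers.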
LGTM, let's discuss this. Also, FYI, I just remembered that this PR is in place: plone/volto#4159 No clue how they are encoding the state in the URL. I guess we should then sync before merging. |
@robgietema thanks for the write-up. We have to take into account the F5 Big IP component: https://my.f5.com/manage/s/article/K60450445 (default is 2048 bytes) |
We could think about shorting strings like |
@tisto Should we consider this done? |
Possible Solutions
Use GET instead of POST
Problems to solve