Server not responding during large spike in requests #2530
grumpyflask asked this question in Q&A
Hi,
I believe my situation is similar to fastapi/fastapi#9875, but that question never seemed to reach a resolution, and I believe the core logic involved lives in Starlette rather than FastAPI.
I am using FastAPI with an endpoint designed to retrieve thumbnails that are relatively small (~0.25 MB each). The thumbnails are stored in S3, so each request that hits the endpoint retrieves one specific thumbnail. On average a single thumbnail retrieval takes about 1.5 seconds, which is within expectations.
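For context, the endpoint is shaped roughly like this (a simplified sketch with placeholder names such as `BUCKET` and `get_thumbnail`, not the attached app.py.txt):

```python
import boto3
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
s3 = boto3.client("s3")
BUCKET = "my-thumbnail-bucket"  # placeholder bucket name

@app.get("/thumbnails/{image_id}")
def get_thumbnail(image_id: str):
    # Plain `def` endpoint, so FastAPI runs the blocking boto3 call in its threadpool.
    obj = s3.get_object(Bucket=BUCKET, Key=f"thumbnails/{image_id}.jpg")
    # Stream the ~0.25 MB object back to the client in chunks.
    return StreamingResponse(obj["Body"].iter_chunks(), media_type="image/jpeg")
```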
The confusion arises when multiple requests (~250, for different images) come in to the thumbnail endpoint: some of the thumbnails are retrieved within ~3 seconds, while others take up to 25 seconds.
After quite a bit of debugging, it seems as if the Starlette router prioritizes handling incoming requests over finishing the tasks already in flight, which means no response makes it back to the client until there is a lull (or stop) in the incoming requests.
I created a small test app (you'll need to point it at a local file to run it) that demonstrates this behavior using a stripped-down version of my S3 function that performs similarly:
app.py.txt
along with a test client application:
client_sim.js.txt
This is the behavior that I expected to see from FastAPI/Starlette:
[expected-behavior diagram]
But I believe this is the behavior I'm actually seeing:
[observed-behavior diagram]
I've tried mixing async and sync calls but the behavior seems to always be rooted in how the router logic functions.
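Concretely, these are the two flavours I tried (a sketch with a hypothetical `fetch_from_s3` helper standing in for the boto3 call); both showed the same pattern:

```python
from fastapi import FastAPI, Response
from starlette.concurrency import run_in_threadpool

app = FastAPI()

def fetch_from_s3(image_id: str) -> bytes:
    ...  # hypothetical stand-in for the blocking boto3 get_object call

# Variant 1: plain `def` endpoint; FastAPI runs the blocking call in its threadpool.
@app.get("/thumbnails/sync/{image_id}")
def get_thumbnail_sync(image_id: str):
    return Response(fetch_from_s3(image_id), media_type="image/jpeg")

# Variant 2: `async def` endpoint; the blocking call is pushed to a thread explicitly.
@app.get("/thumbnails/async/{image_id}")
async def get_thumbnail_async(image_id: str):
    data = await run_in_threadpool(fetch_from_s3, image_id)
    return Response(data, media_type="image/jpeg")
```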
I first noticed this as an issue because, when there is a spike in requests to the server, it fails to respond to any of them until it essentially gets a break, and only then starts responding.
Has anyone experienced this situation before who could help me change the application to achieve the desired behavior (the first diagram above)?
Edit:
To further demonstrate how I observed this behavior, I added print statements to the constructor of starlette.responses.StreamingResponse and to its stream_response method, right after the await send() call.
I can see that only towards the end of the burst of requests does anything start streaming out. What I would like is that, past a particular number of active response streams, incoming requests are left on the backlog.
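Roughly, the instrumentation looks like this (a monkeypatch sketch that should be equivalent to editing starlette/responses.py directly):

```python
import time
from starlette.responses import StreamingResponse

_original_init = StreamingResponse.__init__
_original_stream_response = StreamingResponse.stream_response

def _patched_init(self, *args, **kwargs):
    print(f"{time.time():.3f} StreamingResponse created")
    _original_init(self, *args, **kwargs)

async def _patched_stream_response(self, send):
    async def logging_send(message):
        await send(message)
        # Mirrors the print added right after each `await send()` in stream_response.
        print(f"{time.time():.3f} sent {message['type']}")
    await _original_stream_response(self, logging_send)

StreamingResponse.__init__ = _patched_init
StreamingResponse.stream_response = _patched_stream_response
```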
I've tried limiting the number of threads (fastapi/fastapi#4221), which only made it more obvious what was happening.
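If it helps reproduce this, shrinking the threadpool used for sync endpoints can be done roughly like this (a sketch using anyio's default capacity limiter; the value 10 is only an example):

```python
from anyio import to_thread
from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
async def shrink_threadpool():
    # Reduce the anyio threadpool used for `def` endpoints and run_in_threadpool.
    to_thread.current_default_thread_limiter().total_tokens = 10  # default is 40
```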
I've also tried uvicorn's --limit-concurrency setting (https://www.uvicorn.org/settings/), which is close to what I want to achieve, except that I don't want to drop the requests that go over the configured limit; I want to leave them on the incoming work queue and handle them eventually.
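To illustrate the behavior I'm after: something like a semaphore in front of the handler, where requests over the limit simply wait instead of being rejected (a sketch only; the limit of 50 and the `fetch_from_s3` helper are placeholders):

```python
import asyncio
from fastapi import FastAPI, Response
from starlette.concurrency import run_in_threadpool

app = FastAPI()

# Allow at most 50 concurrent fetches; requests above the limit wait on the
# semaphore instead of being rejected the way --limit-concurrency does.
_gate = asyncio.Semaphore(50)

def fetch_from_s3(image_id: str) -> bytes:
    ...  # hypothetical stand-in for the blocking boto3 call

@app.get("/thumbnails/{image_id}")
async def get_thumbnail(image_id: str):
    async with _gate:
        data = await run_in_threadpool(fetch_from_s3, image_id)
    return Response(data, media_type="image/jpeg")
```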
Edit 2:
Sounds like it could be related to this: #802 (comment)
Thanks!