Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to gevent workers to better handle IO-bound workloads #2701

Closed
wants to merge 1 commit into from

Conversation

ttys0dev
Copy link
Contributor

Hopefully fixes remaining #2640 issues.

@ttys0dev
Copy link
Contributor Author

ttys0dev commented Apr 30, 2023

I also tested the production image locally like this:

diff --git a/docker/courtlistener/docker-compose.yml b/docker/courtlistener/docker-compose.yml
index 90dcc9921..4e1343aa6 100644
--- a/docker/courtlistener/docker-compose.yml
+++ b/docker/courtlistener/docker-compose.yml
@@ -62,7 +62,7 @@ services:
 
   cl-django:
     container_name: cl-django
-    image: ${DJANGO_DOCKER_IMAGE:-freelawproject/courtlistener:latest-web-dev}
+    image: ${DJANGO_DOCKER_IMAGE:-freelawproject/courtlistener:latest-web-prod}
     depends_on:
       - cl-postgresql
       - cl-redis
diff --git a/docker/django/Dockerfile b/docker/django/Dockerfile
index 86e893aef..b4ac53cba 100644
--- a/docker/django/Dockerfile
+++ b/docker/django/Dockerfile
@@ -57,8 +57,9 @@ FROM build-base as build-prod
 WORKDIR $PYSETUP_PATH
 
 # copy project requirement files here to ensure they will be cached.
-ADD https://raw.githubusercontent.com/freelawproject/courtlistener/main/poetry.lock /opt/poetry.lock
-ADD https://raw.githubusercontent.com/freelawproject/courtlistener/main/pyproject.toml /opt/pyproject.toml
+#ADD https://raw.githubusercontent.com/freelawproject/courtlistener/main/poetry.lock /opt/poetry.lock
+#ADD https://raw.githubusercontent.com/freelawproject/courtlistener/main/pyproject.toml /opt/pyproject.toml
+COPY poetry.lock pyproject.toml ./
 
 # install runtime deps - uses $POETRY_VIRTUALENVS_IN_PROJECT internally
 RUN poetry install --no-root --without dev
@@ -114,7 +115,9 @@ CMD python /opt/courtlistener/manage.py migrate && \
 #freelawproject/courtlistener:latest-web-prod
 FROM python-base as web-prod
 
-CMD gunicorn cl_wsgi:application \
+CMD python /opt/courtlistener/manage.py migrate && \
+    python /opt/courtlistener/manage.py createcachetable && \
+    gunicorn cl_wsgi:application \
     --chdir /opt/courtlistener/docker/django/wsgi-configs/ \
     --config /opt/courtlistener/docker/django/wsgi-configs/gunicorn.conf.py \
     --user www-data \

@ttys0dev ttys0dev force-pushed the gevent branch 8 times, most recently from 682fdb2 to 54bbbe1 Compare May 1, 2023 02:03
@mlissner
Copy link
Member

mlissner commented May 1, 2023

Thanks for the diff and the PR (which I assume is ready?).

Monkey patching is not something to be taken lightly, though I did read a few places that it seems to be needed to do gevent. I'll have to do some more research though before I feel entirely comfortable with this. One part of me is thinking, "Is this really necessary? Wasn't the last switch supposed to fix this?" and another is thinking, "Fine, I mean, I guess we can do this, but—monkey patching???"

@mlissner
Copy link
Member

mlissner commented May 1, 2023

Anyway, I guess confirm for me this is ready, please, by requesting my review, and I'll dig in a bit more to assuage my fears.

@ttys0dev
Copy link
Contributor Author

ttys0dev commented May 1, 2023

I assume is ready?

Yeah, should be.

Monkey patching is not something to be taken lightly, though I did read a few places that it seems to be needed to do gevent.

Yeah, seems to be the least invasive way of getting asynchronous networking operations working with django at the moment.

Wasn't the last switch supposed to fix this?" and another is thinking, "Fine, I mean, I guess we can do this, but—monkey patching???"

Well last switch was even less invasive but also less efficient. I suppose if this doesn't work ASGI might be needed once django is updated to version 4.2, although that's more invasive in regards to code changes needed(but also even better for these sort of workloads as well).

Anyway, I guess confirm for me this is ready, please, by requesting my review, and I'll dig in a bit more to assuage my fears.

Yeah, it's ready for review.

@mlissner
Copy link
Member

mlissner commented May 2, 2023

I have to admit I'm not enthusiastic about this. It feels like monkey patching is messy and like this won't work anyway. According to the dev.to page you shared before gevent should be fine for most loads unless we're serving hundreds/s. The page says:

it can be tricky to have it configured 100% correctly, and if you’re not serving hundreds or more requests/sec, it’s probably easier to just use the gthread worker class

So...given that, my instinct is that we should figure out what's wrong with gevent, rather than try to switch to a new worker model that seems to mes things up sometimes.

@ttys0dev
Copy link
Contributor Author

ttys0dev commented May 2, 2023

I have to admit I'm not enthusiastic about this. It feels like monkey patching is messy and like this won't work anyway.

Yeah, I'm personally also not a big fan of gevent in general due to the required monkey patching but it seemed to be the next simplest option to try.

So...given that, my instinct is that we should figure out what's wrong with gevent, rather than try to switch to a new worker model that seems to mes things up sometimes.

Well we're using gthread right now, which is known to not have the best performance and may use excess RAM.

However part of the reason I worked on updating django in #2703 was to make it easier to then experiment and see if using native asyncio which requires a newer django version would be feasible as this would likely be the best performing option(also updated django to potentially fix any bugs that got fixed upstream as well which might have been causing issues). Once the django update is merged I'll experiment a bit more there with native asyncio support.

@mlissner
Copy link
Member

mlissner commented May 2, 2023

OK. That upgrade is going to be a while though, until we can prioritize it, due to the postgres dependency in prod.

@ttys0dev
Copy link
Contributor Author

ttys0dev commented May 2, 2023

due to the postgres dependency in prod

Are you using a managed database service in prod or is it similar to the docker container?

@mlissner
Copy link
Member

mlissner commented May 2, 2023

AWS RDS.

@ttys0dev
Copy link
Contributor Author

ttys0dev commented May 2, 2023

AWS RDS.

What version does it show is currently running?

@ttys0dev
Copy link
Contributor Author

ttys0dev commented May 2, 2023

AWS RDS.

By the way it looks like upgrading with AWS RDS should be fairly straight forwards. Probably a good idea to do that ahead of time anyways before the django 4 update.

@ttys0dev
Copy link
Contributor Author

ttys0dev commented May 2, 2023

see if using native asyncio which requires a newer django version

Hmm, actually I might be wrong here, looks like native asyncio is supported on version 3.2, I'll give that a shot.

Copy link
Member

@mlissner mlissner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question about the code here.

I guess this looks fine and I'm OK giving it a try, but it needs a bit of cleanup.

@@ -121,7 +121,7 @@ jobs:
run: >
docker exec -e SELENIUM_DEBUG=1 -e SELENIUM_TIMEOUT=30 cl-django
python /opt/courtlistener/manage.py test
cl --verbosity=2 ${{ matrix.tag_flags }} --parallel
cl --verbosity=2 ${{ matrix.tag_flags }} --parallel=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this make tests slower? Why do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gevent doesn't really interoperate with python multiprocessing correctly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests aren't using gunicorn/gevent though, are they?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well tests are using gevent without gunicorn so they do have this issue.

@ttys0dev
Copy link
Contributor Author

Going to close this for now in favor of migrating to ASGI.

@ttys0dev ttys0dev closed this May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants