Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add records for Api app status #4249

Merged
merged 1 commit into from
Sep 6, 2023
Merged

Conversation

mdellweg
Copy link
Member

@mdellweg mdellweg commented Aug 3, 2023

fixes #3175

@mdellweg mdellweg force-pushed the 3175_api_worker_records branch 3 times, most recently from b468fd9 to 30e5d64 Compare August 3, 2023 17:12
@mdellweg
Copy link
Member Author

mdellweg commented Aug 3, 2023

The current version of the change required, that gunicorn is started with "--worker-class pulpcore.app.gunicorn_worker.PulpApiWorker`.

@bmbouter
Copy link
Member

bmbouter commented Aug 8, 2023

I think this is a good way to solve this problem. +1 to continuing this approach.

@mdellweg
Copy link
Member Author

mdellweg commented Aug 9, 2023

I redesigned and pulpcore will now provide a pulpcore-api entrypoint that is basically a preloaded gunicorn.
How much of the options should we move there? E.g. how about the logging-format?
Also any way we do it requires a change to the startup files in the oci images.

Comment on lines 90 to 91
def gunicorn():
PulpcoreApiApplication("%(prog)s [OPTIONS]").run()
Copy link
Member Author

@mdellweg mdellweg Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need confirmation that we want to start running our components this way.
So if two @pulp/core devs could ack this specific part please?

edit: Even if not technically needed today, should we have the same for pulpcore-content?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question is especially important for the decision on this PR: pulp/pulp-oci-images#532

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"confirmation that we want to start running our components this way."

I guess my only real question is, is there some downside to this? I can't think of one, but I'm prob just being dense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that it's another layer of abstraction. But that goes on the plus side as well. When i think about our plan to redesign the content app, it will be much easier to just change the gunicorn worker type used in the entrypoint than telling people to modify their service files.
And exactly that thought brings me to the next question: How much should our entrypoint mimick the gunicorn commandline? Should we just override some parameters internally, or should we explicitely proxy only some parameters?

@dkliban
Copy link
Member

dkliban commented Aug 15, 2023

We need to make sure that users can still specify gunicorn timeout, number of workers, and logging format.

@mdellweg mdellweg force-pushed the 3175_api_worker_records branch 2 times, most recently from c0c1f6f to b94b3d2 Compare August 21, 2023 14:54
@mdellweg mdellweg marked this pull request as ready for review August 21, 2023 14:54
@mdellweg mdellweg force-pushed the 3175_api_worker_records branch 13 times, most recently from 1987899 to ccf4530 Compare August 25, 2023 17:07
@mdellweg mdellweg requested a review from dkliban August 29, 2023 10:03
@mdellweg mdellweg requested review from a team and decko and removed request for a team August 29, 2023 10:03
Comment on lines +36 to +38
if self.api_app_status.versions != self.versions:
self.api_app_status.versions = self.versions
self.api_app_status.save(update_fields=["versions"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really possible for the versions to update from underneath itself? The name should change upon restart and the previous record should be deleted, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory yes. The name is composed of <pid>@<hostname> when you restart that computer, and the boot process is very consistent, you will end up with the same name again.

Comment on lines +79 to +81
# cleanup
if self.api_app_status:
self.api_app_status.delete()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible we could close/kill the worker and not delete the record?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we? Also there was a request from @bmbouter when implementing the hearbeating for pulpcore-workers to clean up on exiting. And this PR tries to apply these rules to all types of workers.

And btw we never cleaned up the content api status before. I guess there are some db's around where that table became humongous by now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I worded my question wrong. I was asking is it possible the worker could be kill and not have its record cleaned up. Looking later down in the code, it seems that the task worker code would clean up any missing worker records.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup happens after a week (or so). But yes, it's always possible that the application is killed in a way that no cleanup procedures will be executed anymore. OOM, kill -9, segfault, you name it...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link the appropriate gunicorn documentation for the classes we import and options we are allowed to pass in?

pulpcore/app/models/status.py Outdated Show resolved Hide resolved
pulpcore/app/models/status.py Outdated Show resolved Hide resolved
Comment on lines +74 to +77
async def _heartbeat_ctx(app):
heartbeat_task = asyncio.create_task(_heartbeat())
yield
heartbeat_task.cancel()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nifty, this should remove the asyncio Error 'Task was destroy but it is pending' from the logs when restarting the content app?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope so...

@@ -56,9 +76,7 @@ def online(self):
Returns:
bool: True if the content app is considered online, otherwise False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/content app/app. Also, you need to change the docstring for the missing property as well.

@mdellweg mdellweg force-pushed the 3175_api_worker_records branch 2 times, most recently from 7114284 to 3598075 Compare September 1, 2023 20:57
pulpcore/app/wsgi.py Outdated Show resolved Hide resolved
@mdellweg mdellweg merged commit 4def68e into pulp:main Sep 6, 2023
14 checks passed
@mdellweg mdellweg deleted the 3175_api_worker_records branch September 6, 2023 07:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pulpcore-api worker records are not in the database
5 participants