
Work around docker hub rate limits #103

Closed
Silex opened this issue Jul 23, 2024 · 14 comments
@Silex
Owner

Silex commented Jul 23, 2024

I'm getting hit by https://www.docker.com/increase-rate-limits/ (https://github.com/Silex/docker-emacs/actions/runs/9769943725)

Basically I "pull too much" when building the images.

1st option: move to another registry (https://stackoverflow.com/questions/65806330/toomanyrequests-you-have-reached-your-pull-rate-limit-you-may-increase-the-lim), which I'm not a fan of because, well, official images tend to be on Docker Hub.

2nd option: I wonder whether I could use some caching like https://github.com/marketplace/actions/docker-cache

3rd option: rate-limit the build process so the ~200 pulls are spread over 6h... but this sounds silly.

If anyone has insights about how to tackle this, I'm all ears 😉

@pataquets
Contributor

pataquets commented Jul 23, 2024

If it's "just for your local", auth'ing your local Docker daemon raises DH's limits for auth'ed users. Not sure if you are or not.
On GitLab, you do this: https://docs.gitlab.com/ee/user/packages/dependency_proxy/#docker-hub-rate-limits-and-the-dependency-proxy. GitHub might have something similar (or may not).
Optimizing the order of Dockerfile steps might improve build layer caching, which might reduce pulling needs.
Also, a search for "registry proxy" or "registry cache" on DH turns up several results, but I've used none of them. Maybe there is something official-ish around, but I'm not sure, though.
That's all for now, off the top of my head. I'll get back if anything else comes to mind.
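The "auth your daemon" suggestion above translates to a login step in the workflow. A minimal sketch, assuming `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` are configured as repository secrets (the secret names are illustrative):

```yaml
# Authenticate the runner's Docker daemon to Docker Hub so that
# subsequent pulls count against the higher authenticated quota.
- name: Log in to Docker Hub
  uses: docker/login-action@v3
  with:
    username: ${{ secrets.DOCKERHUB_USERNAME }}
    password: ${{ secrets.DOCKERHUB_TOKEN }}
```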

@Silex
Owner Author

Silex commented Jul 23, 2024

Thanks. Yes, I'm authed, otherwise the GitHub Actions would not be able to push the images to the registry. Here's how each image is built/pushed: https://github.com/Silex/docker-emacs/blob/master/.github/actions/build/action.yml

I just pushed something that sets the max jobs to 1 at a time. Maybe it'll be enough for now... but I doubt it. To stay under the 200 pulls in a 6h period I'll also need to add some sleep() 😞

But yes, a registry proxy that is updated once in a while would work; not sure how you tell Docker to use that proxy, though.
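For the record, the Docker daemon can be pointed at a pull-through mirror via the `registry-mirrors` key in `daemon.json`. A sketch as a workflow step, assuming a hypothetical mirror at `https://mirror.example.com`:

```yaml
# Point the runner's Docker daemon at a pull-through registry mirror,
# so Docker Hub pulls go through the cache instead. The mirror URL
# below is a placeholder.
- name: Configure Docker registry mirror
  run: |
    echo '{"registry-mirrors": ["https://mirror.example.com"]}' \
      | sudo tee /etc/docker/daemon.json
    sudo systemctl restart docker
```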

@Silex
Owner Author

Silex commented Jul 23, 2024

Ah, just found this https://engineering.deptagency.com/how-to-speed-up-docker-builds-in-github-actions:

          cache-from: type=gha
          cache-to: type=gha,mode=max

Sounds like the way to go, will give it a try.
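In context, those two lines from the linked article go in a `docker/build-push-action` step. A sketch (the image tag is illustrative, not the repo's actual config):

```yaml
# Build with BuildKit layer caching backed by the GitHub Actions cache,
# so unchanged layers are restored instead of re-pulled/re-built.
- name: Build and push
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: silex/emacs:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max
```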

@Silex
Owner Author

Silex commented Jul 23, 2024

Meh.

GitHub Action cache has a current limit of 10 GB. Large Docker images can quickly outgrow this size limitation.

But the page mentions using a registry cache, and there's a GitHub Container Registry.

I guess I could build & cache my images there, and then only push to Docker Hub.

That requires some refactoring and more secret tokens though, not something I have time for at the moment.

Will look into it at the beginning of August.
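The registry-cache variant mentioned above swaps the `gha` cache backend for a cache image on GHCR, which avoids the 10 GB Actions cache limit. A sketch, with illustrative image names:

```yaml
# Store BuildKit layer cache as an image on GHCR while the final
# image is still pushed to Docker Hub. The ghcr.io ref is illustrative.
- name: Build and push with GHCR layer cache
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: silex/emacs:latest
    cache-from: type=registry,ref=ghcr.io/silex/emacs:buildcache
    cache-to: type=registry,ref=ghcr.io/silex/emacs:buildcache,mode=max
```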

@Silex
Owner Author

Silex commented Jul 23, 2024

Actually this won't fix the problem of the FROM alpine in my Dockerfiles.

I really need a registry proxy cache. Will need to google more.

@pataquets
Contributor

From the result excerpts of a cursory Google search (without clicking any links), I've seen that there are some "dummy" proxies built just from standard HTTP services (squid, nginx, etc.), with no special logic involved, apparently. This might simplify your solution.

(Original reply, as intended for a previous comment, before you posted the Alpine update)
Glad to hear that @Silex!
If GH space limits are too restrictive (and also apply to both image registry and caching storage), maybe by combining them with GitLab's Docker registry as a proxy you can create some sort of 2-tiered cache. Sounds cumbersome (and it might be), but perhaps somewhere along the path it's the only solution.
Also, consider whether building some "common base image", maybe refreshed via a cronjob, might help optimize the usage quota; but that's more dependent on the flow (which I'm not familiar with).
Finally, don't forget to check if you qualify for some freebie/grant: https://docs.github.com/en/billing/managing-the-plan-for-your-github-account/discounted-plans-for-github-accounts, just for completeness.
Feel free to share details as you progress, and I'll be happy to help on whatever I can.

@Silex
Owner Author

Silex commented Jul 23, 2024

@pataquets: thanks, you can help me figure out how I should use renovatebot/renovate#9958, which apparently allows using GitLab's dependency proxy.

The goal is not to modify the Dockerfiles, but as a plan B I see we could also do FROM gitlab.example.com/groupname/dependency_proxy/containers/alpine:latest.

@pataquets
Contributor

Hi, @Silex.
Not 100% sure where you're planning to use GitLab's Dependency Proxy from. Luckily, I use it from GitLab's CI. I'm guessing you'll be using it from GitHub Actions.
From inside GitLab, I just add the group- or project-specific env vars pointing to the DP as the image prefix. Thus, when building locally, you just pull from standard Docker Hub. Example:
${CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX}alpine:latest. You'll have to find the full registry URL for the DP registry associated with your account/group/project.
Another story is authentication, since GitLab's DP requires the Docker daemon to be logged in/auth'ed to it. From inside GitLab's CI, this is mostly transparent. From another CI, you'll need to create an access token with DP read permission, which will work as the user/pass for the Docker daemon to auth to the DP from outside GitLab.
Check out this for more info: https://docs.gitlab.com/ee/user/packages/dependency_proxy
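The "from another CI" case described above could look roughly like this in GitHub Actions. A sketch, assuming a hypothetical GitLab host `gitlab.example.com` and secrets `GITLAB_DP_USER` / `GITLAB_DP_TOKEN` (a token with Dependency Proxy read permission):

```yaml
# Log the Docker daemon in to GitLab's Dependency Proxy so FROM lines
# (or pulls) can reference gitlab.example.com/.../dependency_proxy/...
# Host and secret names are placeholders.
- name: Log in to GitLab Dependency Proxy
  uses: docker/login-action@v3
  with:
    registry: gitlab.example.com
    username: ${{ secrets.GITLAB_DP_USER }}
    password: ${{ secrets.GITLAB_DP_TOKEN }}
```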

@Silex
Owner Author

Silex commented Aug 14, 2024

Continuing in #106

@Silex Silex closed this as completed Aug 14, 2024
@Silex
Owner Author

Silex commented Aug 26, 2024

What I did with ghcr.io is better, but I still hit Docker Hub limits sometimes, because of FROM alpine and FROM debian.

Using a proxy cache might help for those... but at this point I'm considering ditching Docker Hub.

Or switching just these images to ghcr.io, but that means I'll need to maintain ghcr.io/silex/ubuntu:latest etc. This could be a preparation step in the CI, though.
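That preparation step could be sketched as a mirroring job: pull the base images from Docker Hub once, retag them, and push them to GHCR so every later build pulls from ghcr.io only. Image names below are illustrative:

```yaml
# One-time (or scheduled) mirror of Docker Hub base images to GHCR,
# assuming the daemon is already logged in to both registries.
- name: Mirror base images to GHCR
  run: |
    for img in ubuntu:latest alpine:latest debian:latest; do
      docker pull "$img"
      docker tag "$img" "ghcr.io/silex/$img"
      docker push "ghcr.io/silex/$img"
    done
```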

@Silex Silex reopened this Aug 26, 2024
@Silex
Owner Author

Silex commented Aug 26, 2024

Actually, when looking at the CI errors, it becomes obvious I'll need to change every FROM to the ghcr.io equivalent.

[screenshot of CI errors]

Because even though it has the cache on ghcr.io, it still has to pull the base image from Docker Hub.

@Silex
Owner Author

Silex commented Aug 26, 2024

Made most of the images use "FROM ghcr.io"; will see how this affects pull limits.

If that's not sufficient, I will also have nix, alpine and debian on ghcr.io.

It's a shame there's no public mirror of Docker Hub on ghcr.io.

@pataquets
Contributor

Wouldn't it also help to use your DH credentials for an increased pull quota when pulling the steps' images?
https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#jobsjob_idcontainercredentials

@Silex
Owner Author

Silex commented Aug 27, 2024

@pataquets I already use them.

Recent fix seems to be enough, closing for now.

@Silex Silex closed this as completed Aug 27, 2024