Comments (15)
from rockcraft.
> @cjdcordeiro would it be possible to add a flag to specify the base image, or allow `base:` to use an FQDN like public.ecr.aws/ubuntu/ubuntu:22.04_stable
In theory it could, but it would actually require some work, as we want to make sure Rockcraft is absolutely sure of what's being used as a base (i.e. that it is an official Ubuntu image). So some extensive validation would need to take place.
> In the short term I think the simplest fix would indeed be to just cache the retrieved Docker Hub image and re-use it until the project is cleaned
That would probably be the easiest for the time being. Although I'm a bit afraid of ending up with outdated ROCKs, because if the base is not refreshed, one can go for weeks without getting the latest security updates...
> I feel like that 100pulls/6h is only part of the story, because I remember that I hit the limit way before I did 100 pulls.
Oh yes 😛 I've had similar problems in the past, and ended up going down the rabbit hole of how `docker pull` works. Nowadays, however, Docker is a bit more explicit about how it works. In short, when you ask for an image, the client pulls its manifest. If we're talking about a multi-architecture image (which is our case), then it pulls 2 manifests: the image index (with the manifest list) and the actual image manifest. So in reality, Rockcraft might only have 50pulls/6h/IP.
> edit: how about 8 hours so it fits the whole workday ;)
I don't think there's a right number 🤷 Why? Well, this limit could be hit mainly for 2 reasons: 1) you're building the same ROCK a lot 😁, or 2) you're building many ROCKs behind the same IP (which is probably the case for most, when building multiple ROCKs as part of a CI/CD pipeline). If we were only addressing "1)", then it would be easy -> 6h / 50 = 1 ROCK build every 7.2 minutes. And that would be your cache. But if you have multiple ROCKs being built... they aren't aware of each other, so 🤷
The rule of thumb here, though, should be to keep a low interval. Low enough to keep rebuilds fresh and with the most recent Ubuntu updates... but NOT so low that an IP still hits the limit when multiple ROCKs are being built. My educated guess -> somewhere between 1h and 3h (=> allowing for an average of 8 to 25 builds per cached interval).
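The arithmetic above can be sketched roughly as follows. This is only an illustration: the function names are made up, and it assumes the figures quoted in this thread (100 anonymous manifest requests per 6h, 2 manifests per multi-architecture image) and that each ROCK re-pulls its base exactly once per cache interval.

```python
# Figures quoted in this thread; they are assumptions, not values read from Docker Hub.
LIMIT_PER_WINDOW = 100      # anonymous manifest requests per window, per IP
WINDOW_HOURS = 6
MANIFESTS_PER_IMAGE = 2     # image index + image manifest


def image_pulls_per_window(limit: int = LIMIT_PER_WINDOW,
                           manifests: int = MANIFESTS_PER_IMAGE) -> int:
    """Number of full image pulls that fit in one rate-limit window."""
    return limit // manifests


def minutes_between_builds(window_hours: int = WINDOW_HOURS) -> float:
    """Minimum spacing between pulls if a single ROCK uses the whole budget."""
    return window_hours * 60 / image_pulls_per_window()


def builds_per_cache_interval(cache_hours: float,
                              window_hours: int = WINDOW_HOURS) -> int:
    """How many ROCKs behind one IP can each refresh once per cache interval."""
    refreshes_per_window = window_hours / cache_hours
    return int(image_pulls_per_window() // refreshes_per_window)


if __name__ == "__main__":
    print(image_pulls_per_window())          # 50
    print(minutes_between_builds())          # 7.2
    print(builds_per_cache_interval(1))      # 8
    print(builds_per_cache_interval(3))      # 25
```

Read this way, the "8 to 25" range is the number of distinct ROCKs sharing one IP that a 1h or 3h cache could accommodate.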
> Doing it off time is a little flawed in my opinion as it could cache the image right before an update is pushed which wouldn't be refreshed until hours later,
Precisely my point above in #184 (comment). However, IMO this is still the best immediate fix, for a few reasons:
- it is quick to implement
- the official Ubuntu image in Docker Hub is only updated once a month... yes, once a month. Even our regular images (in ECR, ACR, etc.) are only updated, at most, twice a day. So this (temporary) 6h-based cache isn't off-putting
- it is future-proof. I.e., while the 6h timer isn't the best solution, the underlying caching mechanism is desired, and once we agree on a CLI option for something like `rockcraft pack --no-cache`, we can get rid of the 6h boilerplate.
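As a rough sketch of how such an option could interact with the cache: the `--no-cache` flag is only a proposal in this thread, and the parser and `should_pull_base` helper below are hypothetical, not part of Rockcraft's actual CLI.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI with a `pack` sub-command and a `--no-cache` option."""
    parser = argparse.ArgumentParser(prog="rockcraft-sketch")
    subcommands = parser.add_subparsers(dest="command", required=True)
    pack = subcommands.add_parser("pack")
    pack.add_argument(
        "--no-cache",
        action="store_true",
        help="ignore any cached base image and pull it again",
    )
    return parser


def should_pull_base(no_cache: bool, cache_is_fresh: bool) -> bool:
    """Pull when the user opts out of the cache or the cache is stale."""
    return no_cache or not cache_is_fresh


if __name__ == "__main__":
    args = build_parser().parse_args(["pack", "--no-cache"])
    print(should_pull_base(args.no_cache, cache_is_fresh=True))  # True
```

With an explicit opt-out like this, the time-based expiry becomes a default rather than the only way to force a refresh.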
> It would be better if it could compare a checksum or something of the cached image versus what is in docker hub to decide to pull or not.
The devil is in the details :) To compare image digests, you need the image manifest, so you need to pull it.
from rockcraft.
Having a similar issue, from the logs:
2023-02-07 12:01:20.925 :: 2023-02-07 11:01:20.570 Failed to copy image: Command '['skopeo', '--insecure-policy', '--override-arch', 'amd64', 'copy', 'docker://ubuntu:20.04', 'oci:/root/images/ubuntu:20.04']' returned non-zero exit status 1. (time="2023-02-07T11:01:20Z" level=fatal msg="initializing source docker://ubuntu:20.04: reading manifest 20.04 in docker.io/library/ubuntu: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit")
from rockcraft.
I've had that issue happen to me too. I think the current behavior of always pulling from Docker Hub was a stopgap/MVP and we'll move to something else. @sergiusens @cjdcordeiro do you know the plans here?
from rockcraft.
No immediate plan I'm afraid. Ideally, we should have our own store to avoid these 3rd party dependencies. An alternative would be to use the ubuntu rootfs tarball instead, but this raises concerns in terms of tracing the underlying image build/digest.
from rockcraft.
@cjdcordeiro would it be possible to add a flag to specify the base image, or allow `base:` to use an FQDN like public.ecr.aws/ubuntu/ubuntu:22.04_stable
from rockcraft.
I suppose we could add a simple "timestamp" file on the fetched bundle, so that if it's older than say 2 weeks we re-fetch?
from rockcraft.
> I suppose we could add a simple "timestamp" file on the fetched bundle, so that if it's older than say 2 weeks we re-fetch?
That's a bit too long.
The problem here is related to Docker Hub's pull rate limit, which is 100 pulls/6h for anonymous users. So that's the timestamp we can work with, i.e. refresh if older than 6h :)
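A minimal sketch of the timestamp idea, assuming a plain-text file named `timestamp` stored next to the fetched bundle. The file name, layout, and function names are illustrative assumptions, not Rockcraft's implementation.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_SECONDS = 6 * 60 * 60   # the 6h window discussed above
STAMP_NAME = "timestamp"        # hypothetical file name


def write_timestamp(bundle_dir: Path, now: Optional[float] = None) -> None:
    """Record when the base image bundle was fetched."""
    fetched_at = time.time() if now is None else now
    (bundle_dir / STAMP_NAME).write_text(str(fetched_at))


def needs_refetch(bundle_dir: Path,
                  now: Optional[float] = None,
                  max_age: float = MAX_AGE_SECONDS) -> bool:
    """Return True if there is no timestamp or the bundle is older than max_age."""
    stamp = bundle_dir / STAMP_NAME
    if not stamp.exists():
        return True
    fetched_at = float(stamp.read_text())
    current = time.time() if now is None else now
    return current - fetched_at > max_age
```

A caller would check `needs_refetch(bundle_dir)` before copying the base image, and call `write_timestamp(bundle_dir)` right after a successful fetch.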
from rockcraft.
Is that a good way to solve this problem, though? It seems like if Docker Hub decides to change from 6h to 3h, we will have to push an update, right?
from rockcraft.
This is not a solution but rather a short-term easy fix. Also because we shouldn't be pulling from docker://ubuntu:... as those images are not built by us nor updated as frequently as the ones on ECR, ACR, etc.
While we come up with a better plan for handling ROCKs' bases, this short-term fix is actually a good thing because:
- it overcomes the Docker Hub pull rate limit. @twovican if Docker Hub changes that to 100 pulls every 3 hours, that's even better :) The more pulls we have per day, the less likely we are to hit the limit. The opposite is worse. If Docker Hub makes it 100 pulls/day, then Rockcraft will still pull at most 4 times a day (once every 6 hours). So we're good.
- it puts in place some mechanisms that we can leverage in the future, like adding `pack` options to say whether or not to make use of cached artifacts, like the base
from rockcraft.
I feel like that 100pulls/6h is only part of the story, because I remember that I hit the limit way before I did 100 pulls.
It was at the Engineering Sprint back in November, so maybe the IP was getting shared?
Regardless, I think 6h is probably a fine interval to pick. It'll get around the rate-limiting issue but also speed up the day-to-day iterative development of rocks.
edit: how about 8 hours so it fits the whole workday ;)
from rockcraft.
Doing it off time is a little flawed in my opinion, as it could cache the image right before an update is pushed, which wouldn't be refreshed until hours later, but for development of a rock I likely don't care.
It would be better if it could compare a checksum or something of the cached image versus what is in Docker Hub to decide whether to pull or not.
But at least doing it based on time would unblock local development even if the method is not ideal. I won't complain as long as I no longer hit my pull rate limit.
from rockcraft.
We don't need to pull from a registry or hub; we could pull from the source too. I would rather remove this dependency.
from rockcraft.
To improve the experience, we would need to focus on our reusability story in craft-providers; this all happens inside the LXD or Multipass environment, so we want some form of global cache.
from rockcraft.