Comments (5)
Closing this ticket, which is no longer actionable. At the moment we're looking at using https://github.com/offspot/docker-export to retrieve registry-stored images. Compression can be discussed separately.
from image-creator.
I have benchmarked a few compression/decompression tools. The most important results are:
- Best compression with xz: 559M, but pretty slow to decompress: 39s
- Good compression with zstd (max): 601M, and very fast to decompress: 5s
In my opinion, using .tar.zstd would be the best compromise between compressed size and decompression time. Here are the raw results:
Benchmark run on an Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
no compression: 2.4GB
=====================
$ time docker save 'ghcr.io/offspot/wikifundi-en' > wikifundi-en.tar
real 1m31.837s
user 0m0.324s
sys 0m2.566s
$ time docker load < wikifundi-en.tar
Loaded image: ghcr.io/offspot/wikifundi-en:latest
real 0m3.334s
user 0m0.340s
sys 0m1.232s
gzip compression: 805MB
=======================
$ time docker save 'ghcr.io/offspot/wikifundi-en' | gzip > wikifundi-en.tar.gz
real 1m57.308s
user 1m32.532s
sys 0m4.921s
$ time cat wikifundi-en.tar.gz | gzip -d | docker load
Loaded image: ghcr.io/offspot/wikifundi-en:latest
real 0m15.721s
user 0m15.102s
sys 0m2.648s
xz compression: 559M
====================
$ time docker save 'ghcr.io/offspot/wikifundi-en' | xz > wikifundi-en.tar.xz
real 14m55.352s
user 14m34.306s
sys 0m14.334s
$ time cat wikifundi-en.tar.xz | xz -d | docker load
Loaded image: ghcr.io/offspot/wikifundi-en:latest
real 0m39.525s
user 0m40.183s
sys 0m6.583s
zstd compression: 755M
======================
$ time docker save 'ghcr.io/offspot/wikifundi-en' | zstd > wikifundi-en.tar.zstd
real 0m38.203s
user 0m16.827s
sys 0m3.285s
$ time cat wikifundi-en.tar.zstd | zstd -d | docker load
Loaded image: ghcr.io/offspot/wikifundi-en:latest
real 0m5.540s
user 0m3.894s
sys 0m2.740s
zstd compression (max): 601M
============================
$ time docker save 'ghcr.io/offspot/wikifundi-en' | zstd -z -19 > wikifundi-en.tar.zstd
real 15m29.294s
user 15m6.930s
sys 0m3.190s
$ time cat wikifundi-en.tar.zstd | zstd -d | docker load
Loaded image: ghcr.io/offspot/wikifundi-en:latest
real 0m5.592s
user 0m4.231s
sys 0m2.413s
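The same kind of measurement can be sketched in Python for anyone who wants to repeat it on other payloads. This is a minimal sketch, not the script used for the numbers above: it only covers the codecs available in the standard library (`gzip` and `lzma`, the latter producing the xz format), it works on in-memory bytes rather than a `docker save` stream, and the names `Result`, `bench` and `run_all` are hypothetical.

```python
import gzip
import lzma
import time
from typing import Callable, List, NamedTuple


class Result(NamedTuple):
    name: str
    compressed_size: int       # bytes after compression
    compress_seconds: float    # wall-clock time to compress
    decompress_seconds: float  # wall-clock time to decompress


def bench(
    name: str,
    compress: Callable[[bytes], bytes],
    decompress: Callable[[bytes], bytes],
    payload: bytes,
) -> Result:
    """Time one compress/decompress round trip and check it is lossless."""
    start = time.perf_counter()
    packed = compress(payload)
    middle = time.perf_counter()
    unpacked = decompress(packed)
    end = time.perf_counter()
    if unpacked != payload:
        raise ValueError(f"{name}: round trip did not restore the payload")
    return Result(name, len(packed), middle - start, end - middle)


def run_all(payload: bytes) -> List[Result]:
    """Run every stdlib codec on the same payload (default settings)."""
    codecs = [
        ("gzip", gzip.compress, gzip.decompress),
        ("xz", lzma.compress, lzma.decompress),
    ]
    return [bench(name, c, d, payload) for name, c, d in codecs]
```

Calling `run_all(open("wikifundi-en.tar", "rb").read())` would load the whole tarball in memory, so for multi-gigabyte images a streaming variant would be more appropriate.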
Do we/Should we have a script which can deal with the online image repository?
That's the plan. There are a few versions of this available online, even in Python. Of course, there's work to be done to test it, adapt it and integrate it.
How should we ensure that the cache works (see #1) and is in sync with upstream online image repositories?
Images are collections of stages identified by hashes. We could reuse the image's hash (latest stage) as the filename or something similar: we won't keep every layer; we're not recreating docker. Anyway, I think it's a bit early to decide those details.
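To make the idea concrete, here is a minimal sketch of a cache keyed by image hash. Everything in it is an assumption made for illustration: the `sha256:<hex>` digest format, the `<algo>_<hex>.tar` naming scheme, and the function names `cache_path` and `is_cached` are hypothetical and not part of image-creator.

```python
import re
from pathlib import Path

# Assumed digest shape: "sha256:" followed by 64 lowercase hex characters.
_DIGEST_RE = re.compile(r"^(sha256):([0-9a-f]{64})$")


def cache_path(cache_dir: Path, digest: str) -> Path:
    """Map an image digest to the tarball path used inside the cache."""
    match = _DIGEST_RE.match(digest)
    if match is None:
        raise ValueError(f"not a sha256 digest: {digest!r}")
    algo, hexdigest = match.groups()
    # ':' is replaced by '_' so the name is safe on every filesystem.
    return cache_dir / f"{algo}_{hexdigest}.tar"


def is_cached(cache_dir: Path, digest: str) -> bool:
    """True when a tarball for exactly this digest is already on disk."""
    return cache_path(cache_dir, digest).is_file()
```

With such a scheme, a tag like `latest` that starts pointing at a new digest upstream simply misses the cache, which keeps the cache in sync without any extra bookkeeping.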
Considering that compressing these container image tarballs would be beneficial (less data to download, less data to write to the SD image/card), it seems we might need to precompute these tarballs and store them somewhere... but how and where precisely?
I'm not convinced at the moment. Why would we want to pre-compute them? Images are almost exclusively downloaded from the registry and, as discussed above, they will be cached. As for compressing, I don't think it's a good idea either:
- takes time and resources in the image-creator
- takes time and resources on the Pi at first boot, which can be very long
- not very useful: images should be as small as possible in any case and will be piped to docker load. I can't see cases where we'd have images with lots of compressible data inside, so I don't expect interesting compression ratios.
I saw your comment about WikiFundi, which is not a good example.
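Whether a given image holds much compressible data can be checked cheaply before deciding. The sketch below is only an illustration under stated assumptions: it uses `zlib` (deflate, as in gzip) as a stand-in codec, reads at most the first 64 MiB of the stream, and the function name `estimate_ratio` and its default parameters are hypothetical.

```python
import zlib
from typing import BinaryIO


def estimate_ratio(
    stream: BinaryIO,
    chunk_size: int = 1 << 20,   # read 1 MiB at a time
    max_bytes: int = 64 << 20,   # stop after 64 MiB of input
) -> float:
    """Return compressed/original size for the start of a binary stream.

    A value close to 1.0 means compression would save almost nothing.
    """
    compressor = zlib.compressobj(6)
    read_total = 0
    packed_total = 0
    while read_total < max_bytes:
        chunk = stream.read(min(chunk_size, max_bytes - read_total))
        if not chunk:
            break
        read_total += len(chunk)
        packed_total += len(compressor.compress(chunk))
    # Flush whatever the compressor still buffers internally.
    packed_total += len(compressor.flush())
    if read_total == 0:
        raise ValueError("empty stream")
    return packed_total / read_total
```

Running it on the output of `docker save` for a few representative images would give real ratios to base the decision on, rather than relying on a single special case.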
@rgaudin Yes, WikiFundi is a "special case". I chose it to make the differences between the compression tools more visible.
We'll have to build some of those images first anyway, so we'll have data. At this point I think it would be counterproductive, but we'll see in time.