
dxf's Introduction

dxf

Python module and command-line tool for storing and retrieving data in a Docker registry.

  • Store arbitrary data (blob-store)
  • Content addressable
  • Set up named aliases to blobs
  • Supports Docker registry schema v2
  • Works on Python 3.8+

Please note that dxf does not generate Docker container configuration, so you won't be able to docker pull data you store using dxf. See this issue for more details.

Command-line example:

dxf push-blob fred/datalogger logger.dat @may15-readings
dxf pull-blob fred/datalogger @may15-readings

which is the same as:

dxf set-alias fred/datalogger may15-readings $(dxf push-blob fred/datalogger logger.dat)
dxf pull-blob fred/datalogger $(dxf get-alias fred/datalogger may15-readings)

Module example:

import sys

from dxf import DXF

def auth(dxf, response):
    # Called by DXF when the registry asks for authentication
    dxf.authenticate('fred', 'somepassword', response=response)

dxf = DXF('registry-1.docker.io', 'fred/datalogger', auth)

dgst = dxf.push_blob('logger.dat')
dxf.set_alias('may15-readings', dgst)

assert dxf.get_alias('may15-readings') == [dgst]

# pull_blob yields bytes, so write to the binary stdout buffer
for chunk in dxf.pull_blob(dgst):
    sys.stdout.buffer.write(chunk)

Usage

The module API is described here.

The dxf command-line tool uses the following environment variables:

  • DXF_HOST - Host where Docker registry is running.
  • DXF_INSECURE - Set this to 1 if you want to connect to the registry using http rather than https (which is the default).
  • DXF_USERNAME - Name of user to authenticate as.
  • DXF_PASSWORD - User's password.
  • DXF_AUTHORIZATION - HTTP Authorization header value.
  • DXF_AUTH_HOST - If set, always perform token authentication to this host, overriding the value returned by the registry.
  • DXF_PROGRESS - If this is set to 1, a progress bar is displayed (on standard error) during push-blob and pull-blob. If this is set to 0, a progress bar is not displayed. If this is set to any other value, a progress bar is only displayed if standard error is a terminal.
  • DXF_BLOB_INFO - Set this to 1 if you want pull-blob to prepend each blob with its digest and size (printed in plain text, separated by a space and followed by a newline).
  • DXF_CHUNK_SIZE - Number of bytes pull-blob should download at a time. Defaults to 8192.
  • DXF_SKIPTLSVERIFY - Set this to 1 to skip TLS certificate verification.
  • DXF_TLSVERIFY - Optional path to custom CA bundle to use for TLS verification.
  • DXF_PLATFORM - Optional platform (e.g. linux/amd64) to use for multi-arch manifests. If a multi-arch manifest is encountered and this is not set then a dict containing entries for each platform will be displayed.

You can use the following options with dxf. Supply the name of the repository you wish to work with in each case as the second argument.

  • dxf push-blob <repo> <file> [@alias]

    Upload a file to the registry and optionally give it a name (alias). The blob's hash is printed to standard output.

    The hash or the alias can be used to fetch the blob later using pull-blob.

  • dxf pull-blob <repo> <hash>|<@alias>...

    Download blobs from the registry to standard output. For each blob you can specify its hash, prefixed by sha256: (remember the registry is content-addressable) or an alias you've given it (using push-blob or set-alias).

  • dxf blob-size <repo> <hash>|<@alias>...

    Print the size of blobs in the registry. If you specify an alias, the sum of all the blobs it points to will be printed.

  • dxf mount-blob <repo> <from-repo> <hash> [@alias]

    Cross mount a blob from another repository and optionally give it an alias. Specify the blob by its hash, prefixed by sha256:.

    This is useful to avoid having to upload a blob to your repository if you know it already exists in the registry.

  • dxf del-blob <repo> <hash>|<@alias>...

    Delete blobs from the registry. If you specify an alias the blobs it points to will be deleted, not the alias itself. Use del-alias for that.

  • dxf set-alias <repo> <alias> <hash>|<file>...

    Give a name (alias) to a set of blobs. For each blob you can either specify its hash (as printed by push-blob or get-alias) or, if you have the blob's contents on disk, its filename (including a path separator to distinguish it from a hash).

  • dxf get-alias <repo> <alias>...

    For each alias you specify, print the hashes of all the blobs it points to.

  • dxf del-alias <repo> <alias>...

    Delete each specified alias. The blobs they point to won't be deleted (use del-blob for that), but their hashes will be printed.

  • dxf get-digest <repo> <alias>...

    For each alias you specify, print the hash of its configuration blob. For an alias created using dxf, this is the hash of the first blob it points to. For a Docker image tag, this is the same as docker inspect alias --format='{{.Id}}'.

  • dxf get-manifest <repo> <alias>...

    For each alias you specify, print its manifest obtained from the registry.

  • dxf list-aliases <repo>

    Print all the aliases defined in the repository.

  • dxf list-repos

    Print the names of all the repositories in the registry. Not all versions of the registry support this.
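Because the registry is content-addressable, the hash that the commands above accept is the SHA-256 of the blob's bytes, prefixed by sha256:. A minimal sketch of computing it for a local file with the standard library follows; the function name sha256_digest is hypothetical and this is not presented as dxf's own helper.

```python
import hashlib


def sha256_digest(path, chunk_size=8192):
    """Return the 'sha256:<hex>' digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()
```

The result should match what push-blob prints for the same file, which makes it possible to check whether a blob is already known before uploading it.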

Certificates

If your registry uses SSL with a self-issued certificate, you'll need to supply dxf with a set of trusted certificate authorities.

Set the REQUESTS_CA_BUNDLE environment variable to the path of a PEM file containing the trusted certificate authority certificates.

Both the module and command-line tool support REQUESTS_CA_BUNDLE.

Alternatively, you can set the DXF_TLSVERIFY environment variable for the command-line tool or pass the tlsverify option to the module.

Authentication tokens

dxf automatically obtains Docker registry authentication tokens using your DXF_USERNAME and DXF_PASSWORD, or DXF_AUTHORIZATION, environment variables as necessary.

However, if you wish to override this then you can use the following command:

  • dxf auth <repo> <action>...

    Authenticate to the registry using DXF_USERNAME and DXF_PASSWORD, or DXF_AUTHORIZATION, and print the resulting token.

    action can be pull, push or *.

If you assign the token to the DXF_TOKEN environment variable, for example:

DXF_TOKEN=$(dxf auth fred/datalogger pull)

then subsequent dxf commands will use the token without needing DXF_USERNAME and DXF_PASSWORD, or DXF_AUTHORIZATION, to be set.

Note however that the token expires after a few minutes, after which dxf will exit with EACCES.

Docker Cloud authentication

You can use the dockercloud library to read authentication information from your Docker configuration file and pass it to dxf:

auth = 'Basic ' + dockercloud.api.auth.load_from_file()
dxf_obj = dxf.DXF('index.docker.io', repo='myorganization/myimage')
dxf_obj.authenticate(authorization=auth, actions=['pull'])
dxf_obj.list_aliases()

Thanks to cyrilleverrier for this tip.

Installation

pip install python-dxf

Licence

MIT

Other projects that use DXF

Docker-charon

https://github.com/gabrieldemarmiesse/docker-charon

This package allows you to transfer Docker images from one registry to another, where the second registry is disconnected from the internet.

Unlike docker save and docker load, it creates the payload directly from the registry (it's faster) and is able to compute diffs to only take the layers needed, hence reducing the size.

Tests

make test

Lint

make lint

Code Coverage

make coverage

coverage.py results are available here.

Coveralls page is here.

dxf's People

Contributors

davedoesdev, domsekotill, farcaller, gabrieldemarmiesse, kastb, mackenzieata, micw, mrueg, msadiq058, norpol, pastelmind, petermarko, rhelmot, romain-dartigues


dxf's Issues

Registry authentication problem while empty 'access_token' value is returned by catalog

Greetings.
I've run into a registry authentication problem when the response to a token request contains the key 'access_token' but its value is empty. Some Harbor registries do this. Example:
{ "token": "_long_but_correct_blahblahblah_token_so_I_cut_it_off_", "access_token": "", "expires_in": 1800, "issued_at": "2019-11-27T08:50:25Z" }

The suggested fix is in the __init__.py file; please see here:
https://github.com/vifrrg/dxf/commit/2d82a0adf6cd0ad73db0dd0700b1f203dddc7a36
Thank You.

Resolve a tag to its digest

I want to resolve a tag of an image to its digest (to then download the cosign signature of the image, that uses the digest of the signed image as the prefix of the tag of the signature-image).

I've tried get_digest, get_alias and get_manifest, but none of these get me the digest that I am looking for.

The digest I am looking for can be retrieved by using docker manifest inspect -v (the -v is important). For example:

$ docker manifest inspect my-registry.local/project/repo:1.0.0 -v
{
	"Ref": "my-registry.local/project/repo:1.0.0",
	"Descriptor": {
		"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
		"digest": "sha256:2613d8f5b647d16835739e8432e5272733aae1325d12d8986e82696e84f91b31",
		"size": 527,
		"platform": {
			"architecture": "amd64",
			"os": "linux"
		}
	},
	"SchemaV2Manifest": {
		"schemaVersion": 2,
		"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
		"config": {
			"mediaType": "application/vnd.docker.container.image.v1+json",
			"size": 708,
			"digest": "sha256:76ba57e0638f58bcbccaa0f6639226486b00651fd9eef03f21443c7b3ee72208"
		},
		"layers": [
			{
				"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
				"size": 2289926,
				"digest": "sha256:6a1aaa55c0246737ca422553617a0dcbe2fef0e626db2a00b455fd925f95dd92"
			}
		]
	}
}

The digest I need is the sha256:2613d8f5b647d16835739e8432e5272733aae1325d12d8986e82696e84f91b31.

The same digest can also be retrieved by using skopeo inspect:

$ skopeo inspect docker://my-registry.local/project/repo:1.0.0
{
    "Name": "my-registry.local/project/repo:1.0.0",
    "Digest": "sha256:2613d8f5b647d16835739e8432e5272733aae1325d12d8986e82696e84f91b31",
    "RepoTags": [
        "1.0.0",
        "sha256-2613d8f5b647d16835739e8432e5272733aae1325d12d8986e82696e84f91b31.sbom",
        "sha256-2613d8f5b647d16835739e8432e5272733aae1325d12d8986e82696e84f91b31.sig",
    ],

Is there a way to retrieve this digest using DXF? I would like to do this in pure python if possible.
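For context, the digest shown under Descriptor is the digest of the manifest document itself: registries normally report it in the Docker-Content-Digest response header, and it is the SHA-256 of the exact manifest bytes as served. The sketch below only illustrates that relationship. The function name manifest_digest is hypothetical, and how the raw bytes and headers are obtained from DXF is left open here.

```python
import hashlib


def manifest_digest(raw_manifest, headers=None):
    """Return the manifest digest for a tag (hypothetical helper).

    raw_manifest must be the exact bytes the registry returned;
    re-serialising the JSON would change the hash.
    """
    for name, value in (headers or {}).items():
        # Header names are case-insensitive
        if name.lower() == "docker-content-digest":
            return value
    # Fall back to hashing the manifest body
    return "sha256:" + hashlib.sha256(raw_manifest).hexdigest()
```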

Org-less repos not supported

@davedoesdev

I recently came across an issue with Docker repos that did not include an organization value in the path. I came across https://github.com/davedoesdev/dxf/issues/31 and think I see what might be happening. At least in my case.

Two component repos work fine, eg. docker.io/curlimages/curl, docker.io/bitnami/python, and docker.io/circleci/openjdk but 1 component repos fail to authenticate every time, eg. docker.io/python, docker.io/openjdk and docker.io/eclipse-temurin.

I was running 7.7.2, because pip refused to install 7.7.3 because of a dependency conflict, but ultimately got both installed and both fail to work for single component repos.

I poked around some and it looks like it may be how you're setting _repo_path. For single component repos, like python, java, and openjdk, you need to prepend library to the API call, e.g. registry-1.docker.io/python needs to be registry-1.docker.io/library/python (where host=registry-1.docker.io and repo=library/python). I modified your code to this and it seemed to work, but it could use some cleaning up.

repo_parts = repo.split("/")
if len(repo_parts) == 1:
    self._repo = "library/" + repo
    self._repo_path = ("library/" + repo + "/") if repo else ""
else:
    self._repo =  repo
    self._repo_path = (repo + "/") if repo else ""

It seems as though the "organization" for simple repositories is library, and using this allows this package to work with these types of repos.
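The same idea can be written as a standalone helper applied before the repository name reaches DXF, so the library itself need not change. This is a sketch of the workaround described in this issue, not dxf behaviour; normalize_docker_hub_repo is a hypothetical name, and the rule is only meant for Docker Hub.

```python
def normalize_docker_hub_repo(repo):
    """Prefix single-component Docker Hub repos with 'library/'."""
    if repo and "/" not in repo:
        return "library/" + repo
    # Multi-component names (and the empty string) pass through unchanged
    return repo
```

With this, python becomes library/python while curlimages/curl is left alone.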

"dxf get-digest" doesn't work with multi-arch images

get-digest on single-arch images works fine:

$ DXF_HOST=registry-1.docker.io dxf get-digest ubuntu 12.04
sha256:5b117edd0b767986092e9f721ba2364951b0a271f53f1f41aff9dd1861c2d4fe

But multi-arch images fail with KeyError: 'config':

$ DXF_HOST=registry-1.docker.io dxf get-digest ubuntu 22.04
Traceback (most recent call last):
  File "/home/behlers/.local/pyvenv/dxf/bin/dxf", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/behlers/.local/pyvenv/dxf/lib/python3.11/site-packages/dxf/main.py", line 219, in main
    sys.exit(doit(sys.argv[1:], os.environ))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/behlers/.local/pyvenv/dxf/lib/python3.11/site-packages/dxf/main.py", line 206, in doit
    _doit()
  File "/home/behlers/.local/pyvenv/dxf/lib/python3.11/site-packages/dxf/main.py", line 189, in _doit
    dgsts = [dxf_obj.get_digest(name) for name in args.args]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/behlers/.local/pyvenv/dxf/lib/python3.11/site-packages/dxf/main.py", line 189, in <listcomp>
    dgsts = [dxf_obj.get_digest(name) for name in args.args]
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/behlers/.local/pyvenv/dxf/lib/python3.11/site-packages/dxf/__init__.py", line 723, in get_digest
    return self._get_alias(alias, manifest, verify, False, True, False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/behlers/.local/pyvenv/dxf/lib/python3.11/site-packages/dxf/__init__.py", line 661, in _get_alias
    dgst = parsed_manifest['config']['digest']
           ~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'config'

The reason is that multi-arch images return a manifest list, which contains just a list of architectures with a specific manifest digest for each one. Fetching that per-architecture manifest reveals the needed information:

$ docker buildx imagetools inspect --raw ubuntu:22.04 | jq
{
  "manifests": [
    {
      "digest": "sha256:7a57c69fe1e9d5b97c5fe649849e79f2cfc3bf11d10bbd5218b4eb61716aebe6",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      },
      "size": 424
    },
    {
      "digest": "sha256:ad18cfdb19dac67bf0072dacea661a817330e5c955d081f4d09914e743ae5d4a",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "platform": {
        "architecture": "arm",
        "os": "linux",
        "variant": "v7"
      },
      "size": 424
    },
    {
      "digest": "sha256:537da24818633b45fcb65e5285a68c3ec1f3db25f5ae5476a7757bc8dfae92a3",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "platform": {
        "architecture": "arm64",
        "os": "linux",
        "variant": "v8"
      },
      "size": 424
    },
    {
      "digest": "sha256:f23b7ade9f88f91c8d5932a48b721712ed509a607d9a05cdeae4cd06de09e5f7",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "platform": {
        "architecture": "ppc64le",
        "os": "linux"
      },
      "size": 424
    },
    {
      "digest": "sha256:b351315d950a4da70f19d62f4da5dd7f9a445eb8c8d6851a5b6cdddbdafb13cf",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "platform": {
        "architecture": "s390x",
        "os": "linux"
      },
      "size": 424
    }
  ],
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "schemaVersion": 2
}
$
$
$ docker buildx imagetools inspect --raw ubuntu@sha256:7a57c69fe1e9d5b97c5fe649849e79f2cfc3bf11d10bbd5218b4eb61716aebe6 | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "size": 2298,
    "digest": "sha256:08d22c0ceb150ddeb2237c5fa3129c0183f3cc6f5eeb2e7aa4016da3ad02140a"
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "size": 29533950,
      "digest": "sha256:2ab09b027e7f3a0c2e8bb1944ac46de38cebab7145f0bd6effebfe5492c818b6"
    }
  ]
}
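Picking the per-platform manifest out of such a manifest list can be sketched as below. The function name select_manifest_digest and the os/architecture[/variant] string format (as in linux/amd64) are assumptions for illustration. The returned digest would then be used to fetch the platform-specific manifest that contains the config entry.

```python
def select_manifest_digest(index, platform):
    """Return the digest of the manifest matching 'os/arch[/variant]'."""
    parts = platform.split("/")
    if len(parts) < 2:
        raise ValueError("platform must look like 'os/architecture'")
    os_name, arch = parts[0], parts[1]
    variant = parts[2] if len(parts) > 2 else None

    for entry in index.get("manifests", []):
        p = entry.get("platform", {})
        if p.get("os") != os_name or p.get("architecture") != arch:
            continue
        # Only compare the variant when the caller asked for one
        if variant is not None and p.get("variant") != variant:
            continue
        return entry["digest"]
    raise LookupError("no manifest for platform " + platform)
```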

Authentication fails with key error

Hi,
I just wrote a simple script for testing:

#!/usr/bin/env python3

from dxf import DXF
dxf = DXF('registry-1.docker.io', 'library/alpine')
dxf.authenticate()

This fails with:
File "/usr/lib/python3.7/site-packages/dxf/__init__.py", line 290, in authenticate
scope = info['scope']
KeyError: 'scope'

I printed out the info object which does not contain scope:

{'realm': 'https://auth.docker.io/token', 'service': 'registry.docker.io'}

The problem might be that I did not pass any actions. The doc says here:

If you know which types of operation you need to make on the registry, specify them here

get_manifest is unreliable in its return type

I'm not exactly sure what the triggering conditions are, but get_manifest returns a str when I talk to Docker Hub (registry-1.docker.io/freshrss/freshrss) and a dict when I talk to ghcr.io (ghcr.io/tailscale/golink), in the same loop but with different DXF instances.
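Until the return type is settled, a caller can normalise whatever comes back. The sketch below is a caller-side workaround, not a change to dxf; manifest_as_dict is a hypothetical name, and it assumes that a str (or bytes) result is the JSON text of the manifest.

```python
import json


def manifest_as_dict(manifest):
    """Return the manifest as a dict whether it arrived as text or parsed."""
    if isinstance(manifest, (bytes, bytearray)):
        manifest = manifest.decode("utf-8")
    if isinstance(manifest, str):
        return json.loads(manifest)
    # Already parsed (e.g. a dict): hand it back untouched
    return manifest
```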

Running "dxf auth" against a registry without any authentication service should not fail

Due to

        if response.status_code != requests.codes.unauthorized:
            raise exceptions.DXFUnexpectedStatusCodeError(response.status_code,
                                                          requests.codes.unauthorized)

dxf auth fails with dxf.exceptions.DXFUnexpectedStatusCodeError: expected status code 401, got 200.
It might be worth returning an empty bearer token instead, or handling that case differently (as the request was successful with HTTP 200).

Authentication fails for Azure Container Registry

Tested this on versions 5.0.0 and 7.4.0. It seems that Azure has recently changed their response in the Oauth2 workflow. They no longer include a token value in the response, only access_token is present. Based on my testing the access_token value is correct to use; I assume that when token was provided it was always the same value as access_token (like what Dockerhub does).

Traceback (most recent call last):
  File "/Users/gavin/docker_puller/docker_puller.py", line 149, in handler
    dxf.authenticate(username=username, password=password, actions=['pull'])
  File "/Users/gavin/.pyenv/versions/2.7.13/envs/py27/lib/python2.7/site-packages/dxf/__init__.py", line 305, in authenticate
    self.token = r.json()['token']
KeyError: 'token'

Seems like token is hard-coded as the expected response value:

dxf/dxf/__init__.py

Lines 303 to 306 in c877b77

r = self._sessions[0].get(auth_url, headers=headers, verify=self._tlsverify)
_raise_for_status(r)
self.token = r.json()['token']
return self._token

Is it possible to support both values if one is missing?
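One way to support both fields is to try each key in turn and skip values that are missing or empty, which would also cover registries that send an empty access_token next to a valid token. This is a sketch of the requested behaviour, not the fix that dxf ships; extract_token is a hypothetical name.

```python
def extract_token(payload):
    """Pick the bearer token out of a token-endpoint JSON response."""
    for key in ("token", "access_token"):
        value = payload.get(key)
        if value:  # skips both missing keys and empty strings
            return value
    raise KeyError("neither 'token' nor 'access_token' holds a usable value")
```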

problem while downloading blobs named using set-alias

LAYERSDIG=""

for LAYERDIGEST in $(dxf get-alias ubuntu latest); do
    LAYERSDIG+="${LAYERDIGEST} "
done

dxf set-alias ubuntu releaseN ${LAYERSDIG}

$ docker pull localhost:5000/ubuntu:releaseN
releaseN: Pulling from ubuntu
87192bdbe00f: Already exists
28e09fddaacb: Already exists
7e15ce58ccb2: Already exists
a3ed95caeb02: Already exists
invalid character '\x1f' looking for beginning of value

registry-1.docker.io

No matter if I try with DXF_PASSWORD or DXF_TOKEN I can do neither dxf list-repos nor dxf list-aliases. I am getting an 'unauthorized' error each time. The credentials seem fine. Overall I have tried these two sets of environment variables:

DXF_HOST=registry-1.docker.io
DXF_USERNAME=myusername
DXF_PASSWORD=mypassword

or

DXF_HOST=registry-1.docker.io
DXF_USERNAME=myusername
DXF_TOKEN=mytoken

to no avail. Am I missing something obvious or is it a problem with registry-1.docker.io or dxf? I am trying to integrate dxf with https://github.com/sadaszewski/focker/ . It would be a great and impactful project with some help to get it off the ground. Thank you very much in advance.

Add a function to do "Cross Repository Blob Mount"

Hello there!
I'm working on docker-charon, which makes heavy use of DXF. While the size of the zip that contains the Docker images is heavily reduced because we only take the diff,
we can only compute this diff within the same repository. We could do even better if we had the option to mount blobs from one repository to another.
Such an option is already available in the registry v2 API, but I don't see a function in DXF to mount blobs from one repository to another.

https://docs.docker.com/registry/spec/api/#pushing-an-image

Would you be interested in such a function? If yes I can make a pull request :)

I wonder how the CLI automatically detects whether blobs are in another repository. It seems unlikely that the CLI queries all the repositories one by one.

If we know how the CLI does it, maybe we can implement the same and do an auto-skip when calling "push_blob"?

Misleading Docs

The documentation/README says Python 3.6+, but setup.py says Python 3.7+.

Support for gcr.io

This is probably a problem on the gcr.io side, but it would be nice if we could put in a workaround. The following sample code fails:

#!/usr/bin/env python3

import sys
import json

from dxf import DXFBase


def auth(dxf, response):
   creds = json.loads(sys.stdin.read())
   dxf.authenticate(creds['Username'], creds['Secret'], response=response)


dxf = DXFBase('gcr.io', auth)
repos = dxf.list_repos()
print(repos)

when run with:

echo gcr.io | gcloud auth docker-helper get | ./dxf-gcr.py

The problem seems to be lack of "scope" on www-authenticate header from gcr.io.

$ http -h https://gcr.io/v2/_catalog | grep -i WWW-Authenticate
WWW-Authenticate: Bearer realm="https://gcr.io/v2/token",service="gcr.io"

while dockerhub reply is as follows:

$ http -h https://registry-1.docker.io/v2/_catalog | grep -i WWW-Authenticate
Www-Authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="registry:catalog:*"

The possible workaround could be something like this:

diff --git dxf/__init__.py dxf/__init__.py
index 8b60245..678a7d4 100644
--- dxf/__init__.py
+++ dxf/__init__.py
@@ -288,6 +288,8 @@ class DXFBase(object):
                 scope = 'repository:' + self._repo + ':' + ','.join(actions)
             elif 'scope' in info:
                 scope = info['scope']
+            elif not self._repo:
+                scope = 'registry:catalog:*'
             else:
                 scope = ''
             url_parts = list(urlparse.urlparse(info['realm']))
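# --- illustration only, not part of the patch above ---
# The scope selection with the proposed fallback can be sketched as standalone
# Python. The names parse_bearer_challenge and choose_scope are hypothetical,
# and the "repo and actions" condition on the first branch is an assumption,
# since the diff does not show that line.

```python
import re


def parse_bearer_challenge(header):
    """Parse a 'Bearer k="v",...' WWW-Authenticate value into a dict."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError("not a Bearer challenge")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def choose_scope(info, repo=None, actions=None):
    """Pick the token scope, falling back to the catalog scope."""
    if repo and actions:  # assumed condition, not visible in the diff
        return "repository:" + repo + ":" + ",".join(actions)
    if "scope" in info:
        return info["scope"]
    if not repo:
        # Registry-level request (e.g. listing repos) with no scope offered
        return "registry:catalog:*"
    return ""
```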

Thanks

Support creating a new `DXF` class from an existing `DXFBase`

I'm trying to iterate over the contents of a registry, so I initially create a DXFBase object (as I have no repo); however, to then drill down into a repo I need to create a brand new object using exactly the same properties as the first. It'd be great if there was a method that created a DXF object that shared some information, such as URLs and sessions.

dxf list-aliases returns TypeError if no tags are available

Minimal example

./ubuntu
./ubuntu/_uploads
./ubuntu/_layers
./ubuntu/_layers/sha256
./ubuntu/_layers/sha256/REDACTED
./ubuntu/_manifests
./ubuntu/_manifests/revisions
./ubuntu/_manifests/revisions/sha256
./ubuntu/_manifests/revisions/sha256/REDACTED
./ubuntu/_manifests/tags
$ dxf list-aliases foo/ubuntu
Traceback (most recent call last):
  File "/usr/lib/python-exec/python3.5/dxf", line 11, in <module>
    load_entry_point('python-dxf==7.1.0', 'console_scripts', 'dxf')()
  File "/usr/lib64/python3.5/site-packages/dxf/main.py", line 206, in main
    exit(doit(sys.argv[1:], os.environ))
  File "/usr/lib64/python3.5/site-packages/dxf/main.py", line 193, in doit
    _doit()
  File "/usr/lib64/python3.5/site-packages/dxf/main.py", line 185, in _doit
    for name in dxf_obj.list_aliases(iterate=True):
  File "/usr/lib64/python3.5/site-packages/dxf/__init__.py", line 134, in __iter__
    for v in response.json()[self._header]:
TypeError: 'NoneType' object is not iterable
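The traceback suggests that the registry answered with a null value for the tags field, so iterating over it fails. A defensive version of that lookup is sketched below; page_items is a hypothetical name and this is not the actual dxf patch.

```python
def page_items(payload, key):
    """Return the list stored under key, treating null or missing as empty."""
    return payload.get(key) or []
```

With this, a response such as {"name": "foo/ubuntu", "tags": null} yields no aliases instead of raising TypeError.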

Adds extra '/v2/...' in url for certain repositories

My use case needs to list all versions in an ACR. It usually works for all images, but for google-containers/kube-apiserver-amd64 inside the registry it throws an exception.

Traceback (most recent call last):
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/src/entities.py", line 149, in get_dest_versions
    return Image._get_tags(self.dest, self.dest_user, self.dest_pass)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/src/entities.py", line 163, in _get_tags
    return dxf.DXF(dest[:bound], dest[bound + 1 :], auth=auth).list_aliases()
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/dxf/__init__.py", line 773, in list_aliases
    return it if iterate else list(it)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/dxf/__init__.py", line 153, in __iter__
    response = self._meth("get", self._path, **self._kwargs)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/dxf/__init__.py", line 407, in _request
    return super(DXF, self)._base_request(method, self._repo_path + path, **kwargs)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/dxf/__init__.py", line 248, in _base_request
    _raise_for_status(r)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/dxf/__init__.py", line 83, in _raise_for_status
    r.raise_for_status()
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://acrlabsred.azurecr.io/v2/google-containers/kube-apiserver-amd64/v2/google-containers/kube-apiserver-amd64/tags/list?last=v1.14.10&n=100&orderby=

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/main.py", line 165, in <module>
    command_main()  # pylint: disable=no-value-for-parameter
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/.v/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "src/main.py", line 161, in command_main
    main(config_file, dest, dry_run)
  File "src/main.py", line 127, in main
    entity.get_dest_versions(), key=StrFavouredLooseVersion
  File "/mnt/c/Users/daman.bawa/Desktop/crumb/src/entities.py", line 152, in get_dest_versions
    raise EntityDoesNotExist
exceptions.EntityDoesNotExist

I traced it to PaginatingResponse.__iter__. On going through the while loop for tags/list, self._path and self._kwargs change to:

self._path = /v2/google-containers/kube-apiserver-amd64/tags/list?last=v1.14.10&n=100&orderby=
self._kwargs = {}

whereas in the cases where it works, the loop stops after the first iteration with:

self._path = tags/list
self._kwargs = {'params': {'n': None}}
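One plausible reading of the failing URL is that the next-page link from the registry is already an absolute path (starting with /v2/...) and then gets the repository prefix added a second time. Resolving the link against the current URL avoids that, because an absolute path replaces the existing path rather than being appended to it. The sketch below shows only that resolution step; next_page_url is a hypothetical name and this is not the fix adopted by dxf.

```python
from urllib.parse import urljoin


def next_page_url(current_url, link_target):
    """Resolve a pagination link target against the URL just fetched."""
    return urljoin(current_url, link_target)
```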

Request for head manifest api

I need to validate whether the given tag exists or not.
The get_manifest_and_response API is available and returns the detailed manifest, but Docker Hub applies a rate limit to it.

There is another HEAD API that can be used to validate the existence. https://github.com/distribution/distribution/blob/5cb406d511b7b9163bff9b6439072e4892e5ae3b/docs/spec/api.md#existing-manifests

Golang clients support this feature https://github.com/regclient/regclient/blob/main/docs/regctl.md#manifest-commands

If required I can contribute to this, please let me know the contribution guidelines.

Cannot auth on registry that return 200 and empty _catalog for anonymous users

DXF bypasses authorization if the response is OK, but some registries (zot) return an empty _catalog and an HTTP 200 OK response for anonymous users, along with all the required www-authenticate headers.

While I understand it would be complex to fix that for all client methods (list_repos, …), DXF should at least provide an escape hatch that lets the client force DXF.authenticate() to parse the response header and authenticate even if the result is 200 OK. It would then be possible to call something like dxf.authenticate("user", "password", force=True) to log in, and then be able to fully use the registry API.

docker pull fails when using set-alias to create or move tag

We're trying to use dxf to move a tag from one position to another (i.e. saying that production is now the same as integration):

DXF_HOST=*** DXF_USERNAME=*** DXF_PASSWORD=*** dxf set-alias myproject/myproject-builder production $(DXF_HOST=*** DXF_USERNAME=*** DXF_PASSWORD=*** dxf get-alias myproject/myproject-builder integration)

It seems to do the job but

docker pull registry/project/image:newtag 
[snip]
invalid character '\x1f' looking for beginning of value

get-alias shows the same signatures for both tags, but get-digest does not. Are we missing something?

It would be awesome if there was a dxf set-alias myproject/image:tag myproject/image:newtag that could do this sort of thing.

del-alias: 405 Client Error: Method Not Allowed

I'm trying to delete a tag from the docker registry.

$ export DXF_HOST=index.docker.io
$ export DXF_USERNAME=...
$ export DXF_PASSWORD=...

$ dxf list-aliases sisu4u/unity3d-test
2018.3.7f1-android
2018.3.7f1-facebook
2018.3.7f1-ios
2018.3.7f1-mac
2018.3.7f1-unity
2018.3.7f1-webgl
2018.3.7f1-windows
2018.3.7f1
2019.1.3f1-android
2019.1.3f1-facebook
2019.1.3f1-ios
2019.1.3f1-mac
2019.1.3f1-unity
2019.1.3f1-webgl
2019.1.3f1-windows
2019.1.3f1
latest

$ dxf del-alias sisu4u/unity3d-test latest
Traceback (most recent call last):
  File "/usr/local/bin/dxf", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/dxf/main.py", line 206, in main
    exit(doit(sys.argv[1:], os.environ))
  File "/usr/local/lib/python3.7/site-packages/dxf/main.py", line 193, in doit
    _doit()
  File "/usr/local/lib/python3.7/site-packages/dxf/main.py", line 171, in _doit
    for dgst in dxf_obj.del_alias(name):
  File "/usr/local/lib/python3.7/site-packages/dxf/__init__.py", line 728, in del_alias
    self._request('delete', 'manifests/{}'.format(dcd))
  File "/usr/local/lib/python3.7/site-packages/dxf/__init__.py", line 376, in _request
    **kwargs)
  File "/usr/local/lib/python3.7/site-packages/dxf/__init__.py", line 225, in _base_request
    _raise_for_status(r)
  File "/usr/local/lib/python3.7/site-packages/dxf/__init__.py", line 77, in _raise_for_status
    r.raise_for_status()
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 405 Client Error: Method Not Allowed for url: https://index.docker.io/v2/sisu4u/unity3d-test/manifests/sha256:c346e1c10d901c660d824b48fb327ac62f936b8dfebdb6bb47d73a1dcd832acf

With my own script I can successfully delete tags:

def _del_tag(namespace, repo, tag, token):
    print(('Commence to delete %s/%s:%s ...' % (namespace, repo, tag)))
    url = 'https://hub.docker.com/v2/repositories/%s/%s/tags/%s/' % (namespace, repo, tag)
    headers = {
        'Authorization': 'JWT %s' % token
    }
    request = Request(url=url, headers=headers)
    request.get_method = lambda: 'DELETE'
    try:
        opener = build_opener(HTTPHandler)
        opener.open(request)
        print(('%s/%s:%s deleted successfully.' % (namespace, repo, tag)))
        # body = response.read().decode('utf-8')
    # If we have an HTTPError, try to follow the response
    except HTTPError as err:
        print(("Failed to delete tag. Error: %s" % err))
        raise

support old style manifests

I'm running using Amazon's ECR and that requires support for v2 API auth and V1 schemas.

It would be great to drop down to V1 schemas where necessary

DXF_BLOB_INFO rework?

Currently, this prefixes each blob with digest and size.
Perhaps we should add another option which generates a tar file?
But what to name each entry in the file? Perhaps that should be passed in (and any unnamed are named index-digest)?
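For reference, the current DXF_BLOB_INFO framing (digest and size in plain text, separated by a space and followed by a newline, then the blob itself) can be split back into blobs with a reader like the one below. This is a sketch based on that description only; read_blobs is a hypothetical name and the stream is assumed to be a binary file-like object.

```python
def read_blobs(stream):
    """Yield (digest, data) pairs from a DXF_BLOB_INFO-style byte stream."""
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of stream
        digest, size_text = header.decode("ascii").rstrip("\n").split(" ")
        size = int(size_text)
        # Read by size, so blobs may safely contain newline bytes
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("stream ended inside blob " + digest)
        yield digest, data
```

A reader like this could feed a tar writer, naming entries by index and digest as suggested above.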

Implement a command to get the digest of an alias

The internal get_manifest() function already provides this information from the registry:

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 32434,
      "digest": "sha256:$MYHASH"
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 1253243212,
         "digest": "sha256:anotherhash"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 12211986,
         "digest": "sha256:anotherhash"
      }
   ]
}

while docker inspect provides:

$ docker inspect host/foo:tag --format='{{.Id}}'
sha256:$MYHASH

It would be nice if an additional command, get-digest, could fetch the ID from an alias, so that it's easy to check whether the local and remote images match without pulling the image.

Feature request: chunked blob upload

To copy a blob between registries, it needs to be downloaded to a file, then uploaded again. It would be great if the push_blob would also be usable with a digest + chunk iterator so that no file must be created.
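A building block for this would be a wrapper that hashes chunks as they pass through, so the data streamed from one registry can be checked against its digest without touching disk. The sketch below shows only that wrapper. The name verified_chunks is hypothetical, and a push_blob that accepts an iterator is the feature being requested here, not something assumed to exist.

```python
import hashlib


def verified_chunks(chunks, expected_digest):
    """Pass chunks through unchanged, then verify their SHA-256 digest."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
        yield chunk
    actual = "sha256:" + h.hexdigest()
    if actual != expected_digest:
        # Raised only after the last chunk has been handed on
        raise ValueError("digest mismatch: got " + actual)
```

Because the check happens at the end of the stream, a consumer would still need to discard or roll back the upload when the error is raised.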
