
s3-simple-resource's Introduction

Simple S3 Resource for Concourse CI

Resource to upload files to S3. Unlike the official S3 Resource, this resource can upload or download multiple files.

Usage

Include the following in your Pipeline YAML file, replacing the values in the angle brackets (< >):

resource_types:
- name: <resource type name>
  type: docker-image
  source:
    repository: 18fgsa/s3-resource-simple
resources:
- name: <resource name>
  type: <resource type name>
  source:
    access_key_id: {{aws-access-key}}
    secret_access_key: {{aws-secret-key}}
    bucket: {{aws-bucket}}
    path: [<optional>, use to sync to a specific path of the bucket instead of the bucket root]
    change_dir_to: [<optional, see note below>]
    options: [<optional, see note below>]
    region: <optional, see below>
jobs:
- name: <job name>
  plan:
  - <some Resource or Task that outputs files>
  - put: <resource name>

AWS Credentials

The access_key_id and secret_access_key are optional; if they are not provided, the EC2 Metadata service will be queried for role-based credentials.
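
For reference, the fallback goes through the AWS CLI's normal credential chain; on an EC2 worker the role credentials come from the instance metadata service. A rough way to see what a container would pick up (a debugging sketch, not part of the resource; IMDSv2-enabled instances additionally require a session token):

# List the IAM role attached to the instance, then fetch its temporary credentials.
ROLE=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/)
curl -s "http://169.254.169.254/latest/meta-data/iam/security-credentials/${ROLE}"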

change_dir_to

The change_dir_to flag lets you upload the contents of a sub-directory without including the directory name as a prefix in your bucket. Given the following directory test:

test
├── 1.json
└── 2.json

and the config:

- name: test
  type: s3-resource-simple
  source:
    change_dir_to: test
    bucket: my-bucket
    [...other settings...]

put will upload 1.json and 2.json to the root of the bucket. By contrast, with change_dir_to set to false (the default), 1.json and 2.json will be uploaded as test/1.json and test/2.json, respectively. This flag has no effect on get or check.
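
In terms of the underlying AWS CLI call, the difference is roughly the following (a sketch based on the behaviour described above and on the build log quoted in a later issue, not the resource's literal script):

# change_dir_to: test  -> sync from inside the directory, so keys land at the bucket root
cd test && aws s3 sync . s3://my-bucket/

# change_dir_to: false (default)  -> sync from the parent, so keys keep the test/ prefix
aws s3 sync . s3://my-bucket/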

Options

The options parameter accepts the same options that the AWS CLI accepts for s3 sync. Please see S3 Sync Options and pay special attention to the Use of Exclude and Include Filters.

Given the following directory test:

test
├── results
│   ├── 1.json
│   └── 2.json
└── scripts
    └── bad.sh

we can upload only the results subdirectory by using the following options in our task configuration:

options:
- "--exclude '*'"
- "--include 'results/*'"

Region

Interacting with some AWS regions (like London) requires AWS Signature Version 4. This option allows you to explicitly specify the region where your bucket is located (if set, the AWS_DEFAULT_REGION environment variable will be set accordingly).

region: eu-west-2
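
In other words, setting region is roughly equivalent to exporting the variable before the CLI runs (a sketch of the behaviour described above):

export AWS_DEFAULT_REGION=eu-west-2   # the CLI then signs requests for the bucket's region
aws s3 sync . s3://my-bucket/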

s3-simple-resource's People

Contributors

afeld, apburnes, ardelio, bengerman13, cagiti, chrismcgowan, ctro, cweibel, dandersonsw, dtamasang2, jameshochadel, linuxbozo, mcblair, osis, pburkholder, soutenniza, stepanstipl


s3-simple-resource's Issues

Support IAM Roles

AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY take precedence over interrogating the metadata service for role-based credentials, regardless of whether they are empty or not.

If you want to rely on a role being applied, these two environment variables must NOT be defined.

Therefore the following breaks the ability to use IAM roles.

export AWS_ACCESS_KEY_ID=$(echo "$payload" | jq -r '.source.access_key_id')
export AWS_SECRET_ACCESS_KEY=$(echo "$payload" | jq -r '.source.secret_access_key')

We should only export these environment variables when source.access_key_id and source.secret_access_key are not empty.

I am submitting a Pull Request that addresses this.
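
A minimal sketch of what such a change might look like (the actual Pull Request may differ): only export the variables when both source values are non-empty, so the CLI can fall back to the metadata service otherwise.

access_key_id=$(echo "$payload" | jq -r '.source.access_key_id // empty')
secret_access_key=$(echo "$payload" | jq -r '.source.secret_access_key // empty')
if [ -n "$access_key_id" ] && [ -n "$secret_access_key" ]; then
  export AWS_ACCESS_KEY_ID="$access_key_id"
  export AWS_SECRET_ACCESS_KEY="$secret_access_key"
fi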

Testing

Regarding testing: I'm open to ideas on how to test the absence of credentials in a Docker image, since the role and the AWS metadata service it would rely on aren't available in local development.

I have tested this running in Concourse and, as stated, if the credential environment variables are not set, the resource gains access through keys retrieved from the metadata service.

`check` must return some form of latest version.

This resource needs to support Concourse versioning via the check method.
Otherwise, an initial get of this resource hangs indefinitely in Concourse.

If there are no versions yielded or produced by a put, nothing can ever run.

Best idea currently is to use the most recently modified file in our S3 bucket as the version.

@DavidEBest came up with this idea:

aws s3api list-objects --bucket BUCKET --query 'Contents[].{LastModified: LastModified}' | jq -r 'max_by(.LastModified)'
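
A hedged sketch of a check script built around that command (not necessarily what the resource ends up implementing; the last_modified version key is made up for illustration):

#!/bin/sh
set -e
payload=$(cat)                                    # Concourse passes the source config on stdin
bucket=$(echo "$payload" | jq -r '.source.bucket')
latest=$(aws s3api list-objects --bucket "$bucket" \
  --query 'Contents[].{LastModified: LastModified}' \
  | jq -r 'max_by(.LastModified) | .LastModified')
# check must print a JSON array of versions on stdout.
# Note: this still fails on an empty bucket; see the "Check fails if bucket is empty" issue below.
jq -n --arg v "$latest" '[{last_modified: $v}]'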

Any plans to update the resource to AWS CLI v2?

Are there any plans to update the resource to use AWS CLI v2?

Notes

  • When running, the following warning is printed:
Note: AWS CLI version 2, the latest major version of the AWS CLI, is now stable and recommended for general use. For more information, see the AWS CLI version 2 installation instructions at: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html

latest docker image on docker hub not updated by 18fgsa

from https://hub.docker.com/r/18fgsa/s3-resource-simple/tags

latest
Last updated 23 days ago by cloudgovoperations

but previously built images were made by 18fgsa. Is the latest Docker image legitimate? Can we use it?

I also noticed that https://hub.docker.com/r/18fgsa/s3-resource-simple/dockerfile doesn't match the current (or previous) Dockerfile on the master branch.

Thanks for sharing this Concourse resource type (I can stop working on the one I started).

Uploading files not including root directory

First of all, thanks for your amazing tool. It has been very helpful for my purposes. I have an issue when uploading files to S3 that you could perhaps help me with. When I set the options parameter like this:

options:
- "--exclude '*'"
- "--include 'results/*'"

I don't want to upload the "results" directory itself to the S3 bucket, just the content inside it to the bucket's root. I haven't found an easy way to do this, since the Concourse resource seems to require an output directory for these operations.

Thanks a lot in advance!

Support for setting a custom S3 endpoint for using an S3-compatible provider.

Hello,

First of all -- thanks for your awesome work!

The official S3 resource supports setting a custom endpoint for S3-compatible providers, but does not support multiple files. While this resource supports multiple files, it does not support setting a custom S3 endpoint.

Do you have any idea for an easy way to tweak it so I'd be able to configure a custom S3 endpoint? I'm using minio, if that matters.
Thanks!
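
One partial workaround, hedged: since options are forwarded to the underlying aws s3 sync, the CLI's global --endpoint-url flag might work for the put step (it would not cover check, which calls aws s3api separately; the minio URL below is an illustrative placeholder). Adding "--endpoint-url 'https://minio.example.internal:9000'" to options would produce a sync call along the lines of:

aws s3 sync . s3://my-bucket/ --endpoint-url 'https://minio.example.internal:9000'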

Check fails if bucket is empty

When a bucket has no initial contents, this resource fails with the error:

jq: error (at <stdin>:1): Cannot iterate over null (null)

This is because in this scenario, aws s3api list-objects --query 'Contents[].{LastModified: LastModified}' returns null, which max_by(.LastModified) cannot iterate over.

Acceptance Criteria

  • The check for an empty bucket should return an empty list of versions, rather than returning an error.
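
A hedged sketch of one way to satisfy this, assuming the check logic looks roughly like the command quoted above: substitute an empty array before max_by so an empty bucket yields an empty version list instead of a jq error.

# $bucket extracted from the payload as in the check sketch earlier on this page
contents=$(aws s3api list-objects --bucket "$bucket" \
  --query 'Contents[].{LastModified: LastModified}')
latest=$(echo "$contents" | jq -r '. // [] | max_by(.LastModified) | .LastModified // empty')
if [ -z "$latest" ]; then
  echo '[]'                                       # empty bucket: no versions yet
else
  jq -n --arg v "$latest" '[{last_modified: $v}]'
fi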

options: Have options on put

options are set in the resource's source configuration.

It would be nice to also be able to set options during the put.

For example, with an additional_options param? Or override_options?

What do you think?

s3-simple can't start new thread when uploading to S3

When running a put job, the resource fails with a "can't start new thread" error.

Log is provided below:

fetching 18fgsa/s3-resource-simple@sha256:971cf6b8f1628dc9b51883a5de3b0c411ec8bd0f030c4d7d4ba785002af0ac16
677076032cca [========================================] 28.2MiB/28.2MiB
8dc06e55e485 [========================================] 27.6MiB/27.6MiB
dd432980e08f [========================================] 10.0MiB/10.0MiB
87d620264231 [======================================] 110.0MiB/110.0MiB
bc06e923974c [======================================] 829.1KiB/829.1KiB
0cf2e98594a4 [==========================================] 1.7MiB/1.7MiB
de02592f91ad [==========================================] 5.4MiB/5.4MiB
1cf5234db64b [========================================] 31.9MiB/31.9MiB
0e9d17ab554f [==========================================] 1.2KiB/1.2KiB
+ exec
+ exec
+ source=/tmp/build/put
+ [ -z /tmp/build/put ]
+ set +x
+ [ -n  ]
+ cd /tmp/build/put/loopedge-logs
+ echo Uploading to S3...
Uploading to S3...
+ eval aws s3 sync . s3://le-yocto-build-logs/
+ aws s3 sync . s3://le-yocto-build-logs/

can't start new thread

Concourse Pipeline resource configuration:

resource_types:
- name: s3-simple
  type: registry-image
  source:
    repository: 18fgsa/s3-resource-simple
    username: ((docker-hub-username))
    password: ((docker-hub-password))

- name: loopedge-logs
  type: s3-simple
  source:
    bucket: le-yocto-build-logs
    access_key_id: ((aws_access_key))
    secret_access_key: ((aws_secret))
    change_dir_to: loopedge-logs

I would appreciate any assistance with debugging this issue.

Acceptance Criteria

  • Resource uploads to S3 successfully

Add ability to filter what should be uploaded or downloaded.

As it functions now, the entire current directory is synced. This behavior is problematic for the get use case, especially if the scripts to be run for the task are in the working directory.

It'd be good to specify a subdirectory or filter pattern as an additional parameter.

Specifying working directory for aws sync command?

Since Concourse put steps run in a directory containing all of a plan's outputs, is it currently possible to sync only one output directory to the root of an S3 bucket?

It looks like we'd have to pass an additional parameter, either to cd into or to use in the aws sync line. Would you be open to that as a pull request?

Looks like someone tried something similar here: pivotal-sydney@3888817

Use of `eval` introduces possibility of shell injection unnecessarily.

Unfortunately, fixing this will require changing the configuration format -- it may need to wait for a new major release.

Current configuration is of the form:

options:
- "--exclude '*'"
- "--include 'results/*'"

...that is to say, correct shell escaping is required to be part of the data and the boundary between individual options is not meaningful (for example, --exclude is one argument and * is another, but they're entered as part of the same list element; one could add - "--exclude '*' --include 'results/*'" as a single element with the exact same semantic meaning).

The use of eval makes this possible, but that also introduces potential for human error (any command substitution, redirection, globbing, or other syntax present in the options array will be evaluated by the shell executing the scripts).

A safe alternative would be to instead use configuration of the form:

options:
- "--exclude"
- "*"
- "--include"
- "results/*"

...and have jq be responsible for transforming it into a shell array, as follows:

eval "set -- $(printf '%s\n' "$payload" | jq -r '.source.options // [] | @sh')"

...which runs the command:

set -- '--exclude' '*' '--include' 'results/*'

...allowing "$@" to be expanded to the desired arguments later in the script.
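
For completeness, a sketch of how the safely built argument list would then be consumed, assuming the script syncs the working directory the way the build logs elsewhere on this page suggest (bucket and path stand in for values read from the payload):

eval "set -- $(printf '%s\n' "$payload" | jq -r '.source.options // [] | @sh')"
aws s3 sync . "s3://${bucket}/${path}" "$@"       # each option arrives as its own argument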

Notes

  • For the reasons for using printf over echo, see https://unix.stackexchange.com/a/65819/3113, or the APPLICATION USAGE and RATIONALE sections of the POSIX standard for echo.
  • Passing eval a single string is strongly preferred over passing it a list of strings. Notably, passing eval multiple strings does not meaningfully pass it distinct arguments: the strings it is passed are concatenated together (with spaces between them) into a single larger string before that single, larger string is passed to the parser. I'm glad to provide some examples of cases where this results in unintended behavior should those be desired.

Unable to sync directory using put

I'm new to Concourse and found this resource, which meets my requirement to sync a directory with an S3 bucket.
I'm trying to use it to upload an entire directory to the bucket. I have a task that produces a directory as an output, followed by a put step. If I use inputs (the output from the previous task) in the put step, I get a "more than one pattern found" error.
Could you please let me know how to pass my output directory from the build task to the put step?

Thanks!!

Using a git resource for files

I might be missing something basic here, as I am quite new to Concourse, but I am trying to upload the contents of a git repo to S3, with some exclusions. It seems that when providing the git repo as an input to the S3 resource, it always uploads under the repo name, i.e. site/, I imagine because the repo is cloned into that folder, which then gets synced up:

---
resource_types:
  - name: s3-put
    type: docker-image
    source:
      repository: 18fgsa/s3-resource-simple
resources:
  - name: site
    type: git 
    source:
      uri: ssh://***/site.git
      branch: master
      private_key: ((git-private-key))
  - name: my-s3-bucket
    type: s3-put
    source:
      access_key_id: ((aws-access-key-id))
      secret_access_key: ((aws-secret-access-key)) 
      bucket: my-s3-bucket
      options:
        - "--exclude '*'"
        - "--include 'site/*'" #only this seems to work
      region: eu-west-1
jobs:
  - name: deploy
    serial: true
    plan:
      - get: holding-site
      - put: my-s3-bucket

Ideally I would want it to sync into the root of the bucket.

Thanks in advance for any help

Using v1.2.10 causes already working `put` step to fail

Since the new changes to this resource, the put step fails with the following error in our pipeline job.

+ dirname /opt/resource/out
+ source /opt/resource/emit.sh
/opt/resource/out: 49: /opt/resource/out: source: not found

Notes

  • We use this resource to upload our static assets to an S3 bucket and it was working fine with v1.2.6.
