TDR prototype: file export

This project is part of the Transfer Digital Records project. It is a prototype for an application that a Digital Archivist might use to export files from S3 once a transfer has been finalized.

Running the project

The full export has several steps:

  • Download files from S3, and create a directory of files to export
  • Zip the files
  • Upload the encrypted file to a different S3 bucket

You can run the steps separately, or run them together with Docker.

Step 1: download and package files

Configure your AWS credentials in the ~/.aws/credentials file. The download step will use this configuration to authenticate requests to S3.

Set the mandatory environment variables on the command line or in IntelliJ:

  • GRAPHQL_SERVER: the base URL of the API, e.g. http://localhost:8080 in development
  • GRAPHQL_PATH: the path of the GraphQL API endpoint, e.g. graphql in development
  • CONSIGNMENT_ID: the database ID of the consignment to export

Then run sbt download/run.
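As a minimal sketch, assuming a POSIX shell, the mandatory variables can be checked before the download starts. The function name run_download is hypothetical and not part of this project; it only confirms that the three variables listed above are non-empty and then calls the same sbt task.

```shell
# Hypothetical helper: refuse to start the download if a mandatory
# variable is missing, then run the sbt task described in this README.
run_download() {
  for name in GRAPHQL_SERVER GRAPHQL_PATH CONSIGNMENT_ID; do
    # Indirect lookup of the variable whose name is held in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing mandatory environment variable: $name" >&2
      return 1
    fi
  done
  sbt download/run
}

# Example usage with the development values mentioned above
# (the consignment ID is a placeholder):
#   export GRAPHQL_SERVER=http://localhost:8080
#   export GRAPHQL_PATH=graphql
#   export CONSIGNMENT_ID=1234
#   run_download
```

The variables need to be exported, as in the usage comment, so that the sbt process can see them.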

By default, this will download the contents of a specific S3 bucket to a temporary directory, and create a BagIt bag in another temporary directory.

You can also set some optional environment variables to configure the download:

  • INPUT_BUCKET_NAME: name of the S3 bucket to download files from
  • INPUT_FOLDER_NAME: name of the parent S3 folder (defaults to the consignment ID)
  • FILE_DOWNLOAD_DIR: the local folder to download files to
  • BAG_DIR: the local folder to save the BagIt bag to
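One possible way to use the optional directory variables is to point them at fresh scratch directories for each run. This is only a sketch: the function name prepare_export_dirs is hypothetical, and whether the download step expects these directories to exist beforehand is an assumption that should be checked against the code.

```shell
# Hypothetical helper: create two separate scratch directories and
# export the optional variables so that a later sbt run picks them up.
prepare_export_dirs() {
  FILE_DOWNLOAD_DIR="$(mktemp -d)" || return 1
  BAG_DIR="$(mktemp -d)" || return 1
  export FILE_DOWNLOAD_DIR BAG_DIR
}

# Example usage:
#   prepare_export_dirs && sbt download/run
#   echo "Bag written to $BAG_DIR"
```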

Step 2: zip the files

Use tar to create a .tar.gz file:

tar -zcvf name-of-output-file.tar.gz /path/of/directory/to/zip
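The sketch below wraps the tar command in a hypothetical zip_bag function and then lists the archive to confirm it is readable. Unlike the command above, it uses -C so that the archive contains only the directory name rather than its full path; whether the export expects a particular layout inside the archive is not stated here, so treat that choice as an assumption.

```shell
# Hypothetical helper: archive a directory and check the result.
# Usage: zip_bag /path/of/directory/to/zip name-of-output-file.tar.gz
zip_bag() {
  bag_dir="$1"
  output="$2"
  # -C switches to the parent directory first, so entries in the
  # archive start with the directory name instead of the full path.
  tar -zcf "$output" -C "$(dirname "$bag_dir")" "$(basename "$bag_dir")" || return 1
  # List the contents without extracting; this fails if tar cannot
  # read the archive back.
  tar -ztf "$output" > /dev/null
}
```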

Step 3: upload the encrypted file

Run the upload, setting ARCHIVE_FILEPATH to the path of the file to upload:

ARCHIVE_FILEPATH=/path/of/file/to/upload \
  sbt exportZip/run

You can also set the optional EXPORT_BUCKET environment variable to choose the S3 bucket to upload the file to.
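A small wrapper can catch a wrong file path before sbt starts. The following is a sketch under that assumption; upload_archive is a hypothetical name, and the bucket argument is passed on as EXPORT_BUCKET only when it is given.

```shell
# Hypothetical helper: check that the archive exists, then run the
# upload task with the variables described in this section.
# Usage: upload_archive /path/of/file/to/upload [name-of-s3-bucket]
upload_archive() {
  archive="$1"
  bucket="${2:-}"
  if [ ! -f "$archive" ]; then
    echo "Archive not found: $archive" >&2
    return 1
  fi
  if [ -n "$bucket" ]; then
    ARCHIVE_FILEPATH="$archive" EXPORT_BUCKET="$bucket" sbt exportZip/run
  else
    # No bucket given: leave EXPORT_BUCKET alone and rely on the default.
    ARCHIVE_FILEPATH="$archive" sbt exportZip/run
  fi
}
```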

Run all steps in Docker

  • Build the jar files with sbt clean assembly

  • Build the image with docker build . --tag exportfiles

  • Run the Docker image, setting environment variables:

    docker run \
      --env ACCESS_KEY_ID=your_aws_key_id \
      --env SECRET_ACCESS_KEY=your_aws_secret_key \
      --env GRAPHQL_SERVER=https://graphql-api-hostname.amazonaws.com \
      --env GRAPHQL_PATH=some/api/path \
      --env CONSIGNMENT_ID=1234 \
      --env EXPORT_BUCKET=name-of-s3-bucket \
      exportfiles:latest
    

    You can also set INPUT_BUCKET_NAME and INPUT_FOLDER_NAME to specify the S3 bucket and folder to download.
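As an alternative sketch, the variables can be exported in the host shell and passed through by name. When docker run is given --env NAME without a value, it copies that variable from the host environment, and a variable that is not set on the host is simply not set in the container. The function name run_export_container is hypothetical.

```shell
# Hypothetical helper: run the export image, passing each variable
# through from the current environment instead of writing its value
# on the command line.
run_export_container() {
  docker run \
    --env ACCESS_KEY_ID \
    --env SECRET_ACCESS_KEY \
    --env GRAPHQL_SERVER \
    --env GRAPHQL_PATH \
    --env CONSIGNMENT_ID \
    --env EXPORT_BUCKET \
    --env INPUT_BUCKET_NAME \
    --env INPUT_FOLDER_NAME \
    exportfiles:latest
}

# Example usage (values are placeholders):
#   export CONSIGNMENT_ID=1234
#   export INPUT_BUCKET_NAME=name-of-input-bucket
#   run_export_container
```

Passing variables by name keeps the AWS secret out of the command line itself, and leaving the optional variables unset falls back to the defaults described in step 1.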

Contributors

suzannehamilton, tomjking
