uc-historic-data-importer

Imports a UC MongoDB backup into HBase.

A Makefile wraps some of the gradle and docker-compose commands to give a more unified basic set of operations. These can be checked by running:

make help

Data Transformations

When we import the entire set of historic data, we encounter a number of older record structures that differ from those we see when new data is sent across Kafka. To deal with this, we transform the data in certain circumstances.

Details of all the transforms we perform and the reasons why can be found on the data transforms page.

Build

Ensure a JVM is installed and run the gradle wrapper.

make build

Run full local stack

A full local stack can be run using the provided Dockerfile and Docker Compose configuration. The Dockerfile uses a multi-stage build so no pre-compilation is required.

make up

The environment can be stopped without losing any data:

make down

Or completely removed including all data volumes:

make destroy

Run in an IDE

First bring up the containerized versions of hbase, aws and dks:

make up-ancillary

Then add their Docker network names and IP addresses to your hosts file:

make hosts
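If the IDE run cannot reach a service, it can help to confirm that the expected names really are in the hosts file. The following is a minimal sketch, not part of this repository: `hosts_missing` is a hypothetical helper, and the names you pass to it are assumptions that should be taken from docker-compose.yaml.

```shell
# Sketch: check that each given name appears as a host field in a hosts file.
# hosts_missing is a hypothetical helper, not a target or script in this repo.
# Usage: hosts_missing HOSTS_FILE NAME...
hosts_missing() {
    hosts_file=$1
    shift
    missing=0
    for name in "$@"; do
        # Skip comment lines, stop at trailing comments, and compare
        # every host field (field 1 is the IP address) with the name.
        if ! awk -v n="$name" '
                $1 ~ /^#/ { next }
                {
                    for (i = 2; i <= NF; i++) {
                        if ($i ~ /^#/) break
                        if ($i == n) found = 1
                    }
                }
                END { exit (found ? 0 : 1) }' "$hosts_file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `hosts_missing /etc/hosts hbase aws-s3` prints one line per name that is absent and returns non-zero if any are missing; the two names here are only illustrative.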

Create a run configuration with the environment variable SPRING_CONFIG_LOCATION pointing to resources/application-ide.properties and a main class of app.UcHistoricDataImporterApplication, and run this.

Getting logs

The services are listed in the docker-compose.yaml file and logs can be retrieved for all services, or for a subset.

docker-compose logs aws-s3

The logs can be followed so new lines are automatically shown.

docker-compose logs -f aws-s3

The dks logs live inside the containers, so these targets are provided to retrieve them after a run:

make dks-logs-http
make dks-logs-https

UC laptops

Setting up a UC laptop is a one-time activity.

Java

Install a JDK underneath your home directory. One way to do this is with sdkman.

Once sdkman is installed and initialised, you can install a JDK with, for example:

sdk install java 8.0.222-zulu

Make sure that JAVA_HOME is set after this completes. Start a new shell first; if it is still not set, you may need to add a line like this to your .bashrc:

export JAVA_HOME=/Users/<your-username>/.sdkman/candidates/java/current

Then apply it to your current session by executing:

exec bash
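A quick way to confirm the result is to check that JAVA_HOME is set and contains an executable `java`. This is a minimal sketch under that assumption; `check_java_home` is a hypothetical helper, not something shipped with the project.

```shell
# Sketch: report whether JAVA_HOME points at a usable JDK.
# check_java_home is a hypothetical helper, not part of this repository.
check_java_home() {
    # Fail early when the variable is unset or empty.
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    # A JDK or JRE home is expected to contain bin/java.
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME is set but $JAVA_HOME/bin/java is not executable"
        return 1
    fi
    echo "JAVA_HOME looks usable: $JAVA_HOME"
}
```

Running `check_java_home` in a new shell should report a usable JDK before you move on to the gradle steps.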

Gradle wrapper

Update the project's gradle wrapper properties file to include a gradle repository that can be accessed from a UC laptop. From the project root directory:

cd setup
./wrapper.sh ../gradle/wrapper/gradle-wrapper.properties

A backup of the original file will be created at ./gradle/wrapper/gradle-wrapper.properties.backup.1

Gradle.org certificates

The gradle.org certificate chain must be inserted into your local java truststore:

cd setup # if not already there.
./certificates.sh path-to-truststore
# e.g.
./certificates.sh $JAVA_HOME/jre/lib/security/cacerts

Again, a backup will be created. In the example above it is at:

$JAVA_HOME/jre/lib/security/cacerts.backup.1
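Both setup scripts leave a `.backup.1` copy next to the file they modify, so one way to confirm that a script ran is to compare the file with its backup. The sketch below assumes only that naming convention; `verify_backup` is a hypothetical helper and is not provided by the setup directory.

```shell
# Sketch: confirm a setup script left a backup and actually changed the file.
# verify_backup is a hypothetical helper; the .backup.1 suffix is the one
# described in the text above.
verify_backup() {
    target=$1
    backup="$target.backup.1"
    if [ ! -f "$backup" ]; then
        echo "no backup found at $backup"
        return 1
    fi
    # cmp -s is silent and returns 0 when the two files are byte-identical.
    if cmp -s "$target" "$backup"; then
        echo "$target is identical to its backup; the script may not have changed it"
        return 1
    fi
    echo "$target was modified; original kept at $backup"
}
```

For the example above this would be called as `verify_backup "$JAVA_HOME/jre/lib/security/cacerts"`. If something goes wrong, copying the backup over the modified file restores the original.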

Run a gradle build

From the project root, first ensure no gradle daemons are running:

./gradlew --stop

Then run a gradle build:

./gradlew build

To ensure the dockerized setup is functional, first generate the self-signed certificates needed for local development (from the project root):

./truststores.sh

or

make jks

Then bring up the containers:

make up

Note that this step downloads a number of large Docker images, so you are at the mercy of the Quarry House wifi.
