Giter Club home page Giter Club logo

constellationonyarn's Introduction

ConstellationOnYarn

This repo contains some experimental code for putting the Consellation runtime system on YARN/HDFS

The nl.esciencecenter package contains a simple example application that reads an input file and computes a hash of each block.

Prerequisites

To run this example you need a machine which has the following software installed:

  • Java 1.7 or higher
  • ant 1.7 or higher
  • Hadoop 2.5 or higher

Note that Hadoop must be set up properly, i.e., the $HADOOP_HOME environment variable must be set up. In addition, make sure you use the Java version that is used to run Hadoop to compile this example. Using a different version may result in unsupported version exceptions.

Compilation

After cloning this example, change to the directory and compile using:

ant

This compiles the example, and creates a ./dist directory containing a ConstellationOnYARN.jar plus all dependencies.

Running the example

To run the example you will first need to put an input file in HDFS. As usual with Hadoop, bigger is better. You can use an ubuntu source image for example:

wget http://ftp.acc.umu.se/mirror/cdimage.ubuntu.com/releases/xenial/release/source/ubuntu-16.04-src-1.iso

Next put this image into HDFS for example (after replacing /user/jason with your home directory in HDFS):

hdfs dfs -copyFromLocal ubuntu-16.04-src-1.iso /user/jason

Next run the example using (after replacing /user/jason with your home directory in HDFS twice):

java -cp ./dist/ConstellationOnYarn.jar:`yarn classpath` nl.esciencecenter.ConstellationSubmitter /user/jason/ ./dist /user/jason/ubuntu-16.04-src-1.iso true 1

This starts the nl.esciencecenter.ConstellationSubmitter example locally, which connects to the YARN scheduler and submits the example. This class expects the following command-line parameters:

  • The HDFS root directory to which the dependencies should be staged-in, /user/jason in this example.
  • The local directory containing the dependencies that need to be staged in. Use ./dist to run this example.
  • The input file (on HDFS) to process, /user/jason/ubuntu-16.04-src-1.iso in this example.
  • true/false, indicating whether an attempt should be made to create precise activity contexts.
  • The number of YARN Workers to submit, 1 in this example.

The example first copies the ./dist directory to HDFS, as these files are needed on the YARN nodes to run the example. It then stubmits the job to YARN, and waits for it to finish.

The example code

The example code can be found in 'src' and consists of the following packages:

  • nl.esciencecenter the main example code
  • nl.esciencecenter.constellation the constellation application
  • nl.esciencecenter.yarn a somewhat generic library to access YARN

Like most YARN applications, running this application requires three parts:

  • nl.esciencecenter.ConstellationSubmitter: the main application that connects to the YARN scheduler and submits the nl.esciencecenter.ApplicationMaster
  • nl.esciencecenter.ApplicationMaster: the master application that runs on a YARN node, aquires the nodes to run one or more nl.esciencecenter.constellation.ConstellationWorker applications and starts them, starts a nl.esciencecenter.constellation.ConstellationMaster itself, and waits for everything to finish.
  • nl.esciencecenter.constellation.ConstellationWorker: the worker application that is started on the YARN nodes to processes one or more nl.esciencecenter.constellation.SHA1Jobs.

In addition, the following classes are used:

  • nl.esciencecenter.constellation.ConstellationMaster: the Constellation master application which starts a Constellation, submits the jobs, and gathers the results.
  • nl.esciencecenter.constellation.SHA1Job: a job that is created by the ConstellationMaster and executed on one of the ConstellationWorkder. Each of these jobs reads a block of data from a file in HDFS, and calculates the SHA1 hash of this block.
  • nl.esciencecenter.constellation.SHA1Result: a result object that is returned by each SHA1Job to the ConstellationMaster.
  • nl.esciencecenter.yarn.YarnSubmitter: a generic class used by 'ConstellationOnYarn' to submit the application to YARN.
  • nl.esciencecenter.yarn.YarnMaster: a generic class used by 'ApplicationMaster' to submit the workers to YARN.
  • nl.esciencecenter.yarn.Utils: various utility methods used by 'ApplicationSubmitter' and 'ApplicationMaster'.

constellationonyarn's People

Contributors

cerieljacobs avatar jmaassen avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.