
spark-on-hpc

Introduction

Spark-on-HPC dynamically provisions Apache Spark clusters and runs Spark jobs on an HPC system under its traditional resource manager. Currently, PBS, OAR, and Torque on a Linux cluster are supported.

Features

  • Runs under PBS or OAR resource limits, i.e. number of nodes, number of cores, memory, and walltime
  • Supports multiple Spark jobs per user (a master port is selected randomly for each job)
  • Only the master and workers of the same job are allowed to connect to each other, enforced by a shared secret

Requirements

  • Linux (should work with most distributions)
  • Apache Spark 1.3.0+

Installation

  • Download and unpack the Spark package into the SPARK_HOME directory
  • Download and unzip the Spark-on-HPC package, then change to the SPARK_ON_HPC root directory
cd $SPARK_ON_HPC
  • Copy the scripts into $SPARK_HOME/sbin
cp pbs/spark-sbin/* $SPARK_HOME/sbin

or

cp oar/spark-sbin/* $SPARK_HOME/sbin

Usage

Root permission is NOT required.

  • Once installed, create a job directory.
cd $HOME
mkdir test
cd test

PBS

  • Copy one of the example job scripts from the package. There are two examples: one for a single-node job and one for a multi-node job.
cp $SPARK_ON_HPC/examples/test_spark_multi/spark_multi.sh test_spark_job.sh
  • Edit the script. Typically, the directives, shell variables, and spark-submit arguments need to be changed. For example, the following PBS directives request 5 nodes (1 master + 4 workers), each with 2 cores and 1 GB of memory, on the queue "test".
#PBS -l nodes=5:ppn=2
#PBS -l vmem=1gb
#PBS -q test
  • In the job script, set SPARK_HOME to where the Spark package is installed, and SPARK_JOB_DIR to the directory where the configuration and log files will be created. Note that PBS_O_WORKDIR is the directory from which the qsub command is executed.
export SPARK_HOME=$HOME/spark-1.4.1-bin-1.2.1
export SPARK_JOB_DIR=$PBS_O_WORKDIR
  • In the job script, change the spark-submit arguments. For example, to run JavaSparkPi with 10 tasks:
$SPARK_HOME/bin/spark-submit --master $SPARK_URL --class org.apache.spark.examples.JavaSparkPi $SPARK_HOME/lib/spark-examples-1.4.1-hadoop1.2.1.jar 10 > $PBS_O_WORKDIR/pi.txt
  • Submit a job
qsub test_spark_job.sh

The directories conf, logs, and work will be created in SPARK_JOB_DIR while Spark is running. Examine them if necessary, in addition to the normal job stdout and stderr files.
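Putting the steps above together, a minimal job script has roughly this shape. This is only a sketch: the Spark version, paths, and queue name mirror the examples above, and the cluster start-up commands contained in the packaged example script are elided, not reproduced here.

```shell
#!/bin/bash
#PBS -l nodes=5:ppn=2
#PBS -l vmem=1gb
#PBS -q test

# Paths are examples; point them at your own installation.
export SPARK_HOME=$HOME/spark-1.4.1-bin-1.2.1
export SPARK_JOB_DIR=$PBS_O_WORKDIR

# ... the cluster start-up commands from the packaged example script
#     belong here; they launch the master and workers and set SPARK_URL ...

# Run the application against the freshly started cluster.
$SPARK_HOME/bin/spark-submit --master $SPARK_URL \
  --class org.apache.spark.examples.JavaSparkPi \
  $SPARK_HOME/lib/spark-examples-1.4.1-hadoop1.2.1.jar 10 > $PBS_O_WORKDIR/pi.txt
```

Start from the packaged example rather than this sketch, since the example contains the exact start-up lines for your version of the package.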

OAR

# create a job folder
mkdir jobs
# copy an example submission script
cp spark-on-oar/oar/spark-multi.sh jobs/
# Edit the $SPARK_HOME env variable in spark-multi.sh to point to your spark folder
# then submit with oarsub, e.g.:
oarsub -l nodes=2/cpu=1/core=1,walltime=0:20 -n sparkPi spark-multi.sh

By default, spark-multi.sh submits a SparkPi job with parameter 10. Replace this line with your own spark-submit invocation. Spark writes all its logs to your jobs folder.
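For instance, a custom submission might replace the SparkPi line in spark-multi.sh with something like the following. The class name and jar path are hypothetical placeholders, not part of the package.

```shell
# Hypothetical application; SPARK_URL is set earlier in spark-multi.sh.
$SPARK_HOME/bin/spark-submit --master $SPARK_URL \
  --class com.example.MyApp \
  $HOME/jobs/my-app.jar arg1 arg2
```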

How to run spark-on-hpc manually for testing purposes

Set environment variables SPARK_JOB_DIR, SPARK_HOME, and PBS_NODEFILE.
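For example, one way to fabricate that environment on a single machine (assuming, as a guess, that the scripts read one hostname per line from PBS_NODEFILE) is:

```shell
# Point SPARK_HOME at an unpacked Spark distribution (example path).
export SPARK_HOME=$HOME/spark-1.4.1-bin-1.2.1

# Use a scratch directory for the conf, logs, and work output.
export SPARK_JOB_DIR=$HOME/test
mkdir -p "$SPARK_JOB_DIR"

# Fabricate a PBS-style nodefile: one hostname per line,
# here localhost twice to mimic a two-node allocation.
export PBS_NODEFILE=$SPARK_JOB_DIR/nodefile
printf 'localhost\nlocalhost\n' > "$PBS_NODEFILE"
```

With these set, the sbin scripts copied during installation can then be run by hand.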
