ansible-spark

Ansible role to deploy a Spark cluster.

Installation

  1. Install the dependencies. This role requires Java to be installed on the nodes; the simplest way to provide it is a role from Ansible Galaxy:
sudo ansible-galaxy install geerlingguy.java
  1. Clone this repo into /etc/ansible/roles:
sudo git clone git@github.com:slaclab/ansible-role-spark.git /etc/ansible/roles/ansible-spark
  1. Modify your hosts inventory file (either /etc/ansible/hosts or a file of your own, passed to the playbook with -i) so that it looks something like this:
[all:vars]
ansible_user=centos
ansible_ssh_private_key_file=~/private-key.pem

[masters]
dhcp-os-129-163.slac.stanford.edu

[zookeepers:children]
masters

[spark-masters:children]
masters

[slaves]
dhcp-os-129-155.slac.stanford.edu
dhcp-os-129-160.slac.stanford.edu
dhcp-os-129-161.slac.stanford.edu
dhcp-os-129-162.slac.stanford.edu

[spark-workers:children]
slaves
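
Before running anything, it can help to confirm which hosts each group actually contains. `ansible-inventory -i <file> --graph` prints the full group tree; the sketch below is a dependency-free alternative that lists only the hosts written directly under one group header. The function name `hosts_in_group` is hypothetical, and it deliberately does not expand `:children` groups.

```shell
# List the hosts written directly under one inventory group.
# Usage: hosts_in_group <inventory-file> <group>
# Note: hypothetical helper; :children groups are not expanded.
hosts_in_group() {
  awk -v group="[$2]" '
    # A section header either enters the requested group or leaves it.
    /^\[/ { in_group = ($0 == group); next }
    # Inside the group, every non-empty, non-comment line names a host.
    in_group && NF && $0 !~ /^[#;]/ { print $1 }
  ' "$1"
}
```

With the inventory above, `hosts_in_group /etc/ansible/hosts slaves` would print the four worker hostnames, one per line.
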
  1. Create a playbook file somewhere (e.g. ~/spark.yml):
- name: spark master setup
  hosts: all
  roles:
    - role: geerlingguy.java
    - role: ansible-spark
  1. Run the playbook:
ansible-playbook [-i <path>/<to>/<hosts/inventory>] ~/spark.yml
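
The optional `-i` argument and a pre-flight syntax check can be wrapped in a small helper. This is a sketch rather than part of the role: the function name `run_spark_playbook` and the `ANSIBLE_PLAYBOOK` override variable are hypothetical, while `--syntax-check` is a standard `ansible-playbook` flag that parses the playbook without executing any tasks.

```shell
# Run a playbook, optionally against a custom inventory, after a syntax check.
# Usage: run_spark_playbook <playbook> [inventory]
# ANSIBLE_PLAYBOOK is a hypothetical override for the binary (handy for dry runs).
run_spark_playbook() {
  playbook=$1
  inventory=${2:-}
  bin=${ANSIBLE_PLAYBOOK:-ansible-playbook}

  # Rebuild the argument list so that -i is passed only when an inventory is given.
  if [ -n "$inventory" ]; then
    set -- -i "$inventory" "$playbook"
  else
    set -- "$playbook"
  fi

  # Parse the playbook first; run it only if parsing succeeds.
  "$bin" --syntax-check "$@" && "$bin" "$@"
}
```

`run_spark_playbook ~/spark.yml` uses the default inventory, while `run_spark_playbook ~/spark.yml ./hosts` passes `-i ./hosts`.
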
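  1. Validate the cluster:

Navigate a web browser to http://<master>:8080/ (the default port of the standalone master's web UI, assuming the role does not override it) and you should see the Spark master panel with all of the workers defined in the inventory registered. Port 7077 is the master's service port used by spark-submit, not a web page.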

  1. Test the cluster:
/opt/spark/spark-2.0.0-bin-hadoop2.7/bin/spark-submit \
  --master spark://<master>:7077 \
  --supervise \
  --class org.apache.spark.examples.SparkPi \
  /opt/spark/spark-2.0.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.0.0.jar 1000
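
SparkPi reports its result on a line beginning with `Pi is roughly`, so a scripted smoke test can look for that line instead of reading the log by eye. A minimal sketch, assuming the driver's output reaches the local terminal (client deploy mode); the function name `pi_estimate` is hypothetical.

```shell
# Read spark-submit output on stdin and print the value SparkPi reported.
# Exits non-zero when no "Pi is roughly" line is present.
pi_estimate() {
  sed -n 's/^Pi is roughly //p' | grep .
}
```

Pipe the spark-submit command above into `pi_estimate`; a non-zero exit status suggests the job did not run to completion.
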

to-do

  • Integrate ZooKeeper
  • Add Ubuntu support

Contributors

  • yee379