spark-k8s

Run a Spark cluster in a Kubernetes cluster and execute custom modules that are imported on the worker nodes.

This content is based on a tutorial. See my references.

Tested on GCP.

Steps:

  1. Copy spark-k8s-config.yaml to your home directory and configure it.
  2. Deploy the infrastructure with Terraform, build your Spark base image, and deploy a Spark cluster:
       python3 cli.py
  3. Run PySpark on the head node:
       kubectl exec -it spark-master-0 -- pyspark
  4. Test it:
       from example_module import func
       x = sc.parallelize(range(100), 20)
       y = x.map(str)
       z = y.map(func)
       z.collect()
  5. Run more complex software by recursively copying directory trees into each Spark pod's /work directory:
       python3 cli.py --update-work-dir [dir to copy]
  6. Submit a job with spark-submit:
       spark-submit \
         --master spark://spark-master:7077 \
         --supervise \
         --py-files work/regmem.py,work/az_blob_util.py,work/regmem_cnn.py,work/lanczos.py,work/nlp.py,work/miner.py \
         --files work/shakespeare_tokens.pkl \
         --conf "spark.python.worker.memory=3g" \
         work/spark-k8s-experiment-14-online-learning.py 1> stdout 2> stderr
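Before copying a module to the workers, it can help to dry-run the test pipeline in plain Python. The sketch below mirrors the PySpark test session with the built-in map, so no cluster is needed. Note that func here is a hypothetical stand-in: the real example_module.func is not shown in this README, and dry_run is a name made up for illustration.

```python
# Hypothetical stand-in for example_module.func. The real function is not
# shown in this README, so this one simply tags each string.
def func(s: str) -> str:
    return f"item-{s}"


def dry_run(n: int = 100) -> list:
    """Mirror the PySpark test pipeline with built-in map; no cluster needed."""
    x = range(n)        # stands in for sc.parallelize(range(n), 20)
    y = map(str, x)     # mirrors x.map(str)
    z = map(func, y)    # mirrors y.map(func)
    return list(z)      # mirrors z.collect()


if __name__ == "__main__":
    result = dry_run()
    print(len(result), result[:3])
```

If the function works here but fails on the cluster, the likely cause is that the module or one of its dependencies is missing on the workers, not the function logic itself.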

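The spark-submit line gets long as the list of modules grows. As a rough sketch, the argument list could be assembled in Python instead of by hand. build_submit_command is a hypothetical helper, not part of cli.py, and it covers only the --master, --py-files, --files and --conf options used in this README (add --supervise yourself if you need it).

```python
import shlex


def build_submit_command(script, py_files, files=(), conf=None,
                         master="spark://spark-master:7077"):
    """Assemble a spark-submit argument list (hypothetical helper)."""
    cmd = ["spark-submit", "--master", master]
    if py_files:
        # --py-files takes a single comma-separated list of modules
        cmd += ["--py-files", ",".join(py_files)]
    if files:
        # --files takes a single comma-separated list of data files
        cmd += ["--files", ",".join(files)]
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(script)  # the application script goes last
    return cmd


if __name__ == "__main__":
    cmd = build_submit_command(
        "work/spark-k8s-experiment-14-online-learning.py",
        py_files=["work/regmem.py", "work/nlp.py"],
        files=["work/shakespeare_tokens.pkl"],
        conf={"spark.python.worker.memory": "3g"},
    )
    # Print a shell-safe command line for inspection or copy-paste
    print(shlex.join(cmd))
```

Returning a list rather than a string means the result can be passed directly to subprocess.run without shell quoting issues.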