qjam is a framework for distributed computing.

It leverages ssh public key setups to automatically bootstrap and start worker
nodes without manual intervention.


DEPENDENCIES
--------------------------------

The following packages are required to use qjam:

  * Python Nose (sudo aptitude install python-nose)
  * Python numpy (sudo aptitude install python-numpy)


EXAMPLES
--------------------------------

You need to have passwordless public-key ssh access to localhost as your
current username. If `ssh localhost' at a terminal gets you to a new prompt,
you're golden. If not, set up ssh keys and/or an ssh agent as appropriate.
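
The `ssh localhost' check can also be scripted. The following is a minimal sketch, not part of qjam: the helper names are hypothetical, and it assumes an OpenSSH client, whose BatchMode option makes ssh fail instead of prompting for a password.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails rather than prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def has_passwordless_ssh(host):
    """Return True if running `true` on host over ssh succeeds non-interactively."""
    try:
        return subprocess.call(ssh_check_command(host)) == 0
    except OSError:
        # The ssh client is not installed or not on $PATH.
        return False


if __name__ == "__main__":
    for host in ["localhost"]:
        status = "ok" if has_passwordless_ssh(host) else "needs ssh key setup"
        print("%s: %s" % (host, status))
```

Running the same check against every machine you plan to pass to qjam catches key problems before the bootstrap step does.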

By default, the username used to access remote machines via ssh is your
username on the master machine. You can specify an alternate username for the
remote machines in two ways:

  * Set the QJAM_USER environment variable.
  * Use a 'User' directive in your ~/.ssh/config.
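
One way to picture how these two options interact is sketched below. This is an illustrative model only, not qjam's actual code: the function name ssh_target is hypothetical, and the assumption is that an explicit QJAM_USER is passed to ssh as user@host, while without it ssh itself applies any 'User' directive and otherwise uses the local username.

```python
import os


def ssh_target(host, environ=None):
    """Return the destination string to hand to ssh for the given host."""
    environ = os.environ if environ is None else environ
    user = environ.get("QJAM_USER")
    if user:
        # Assumption: an explicit override is passed as user@host.
        return "%s@%s" % (user, host)
    # No override: ssh consults ~/.ssh/config, then uses the local username.
    return host


print(ssh_target("machine2", {"QJAM_USER": "alice"}))  # alice@machine2
print(ssh_target("machine2", {}))                      # machine2
```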

With the proper ssh setup, these example programs can be run directly from the
command line:

  bin/sum-matrix-example.py
    This is a simple example that explains how to use the qjam API to execute
    code across many machines.

    Usage: bin/sum-matrix-example.py localhost machine2 machine3

  examples/newton_sqrt.py
    Approximates the square root of a number using Newton's method.

    Usage: python examples/newton_sqrt.py 123456789
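
For reference, the underlying iteration is x_next = (x + a/x) / 2. The sketch below is a plain single-machine version written for illustration; it is not the code in examples/newton_sqrt.py, and the function name, tolerance, and iteration cap are assumptions.

```python
def newton_sqrt(a, tolerance=1e-12, max_iterations=100):
    """Approximate sqrt(a) with Newton's method applied to f(x) = x*x - a."""
    if a < 0:
        raise ValueError("cannot take the square root of a negative number")
    if a == 0:
        return 0.0
    x = float(a)  # any positive starting guess converges
    for _ in range(max_iterations):
        next_x = 0.5 * (x + a / x)
        # Stop once the relative change between iterates is negligible.
        if abs(next_x - x) <= tolerance * next_x:
            return next_x
        x = next_x
    return x


print(newton_sqrt(123456789))
```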


ARCHITECTURE
--------------------------------

There are three main pieces to the qjam framework: the Master, the
RemoteWorker, and the Worker.

The Worker is a program that is copied to all of the remote machines during the
bootstrapping process. It is responsible for waiting for instructions from the
Master, and upon receiving work, processing that work and returning the result.

The RemoteWorker is a special Python class that communicates with the remote
machines. One RemoteWorker has a single target machine that can be reached via
ssh. There can be many RemoteWorkers with the same target (say, in the case
where there are many cores on a machine), but only one target per
RemoteWorker. At creation, the RemoteWorker bootstraps the remote machine by
copying the requisite files to run the Worker program, via ssh. After the
bootstrapping process completes, the RemoteWorker starts a Worker process on
the remote machine and attaches to it. The RemoteWorker is the proxy between
the Master and the Worker.

The Master is a Python class that divides up work and assigns the work units
among its pool of RemoteWorker instances. These RemoteWorker instances relay
the work to the Worker programs running on the remote machines and wait for the
results.
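
The relationship between the three pieces can be modeled in a single process. The sketch below is a toy illustration, not qjam's implementation: the Toy* class names and the pickled dictionary message format are assumptions, and the real RemoteWorker talks to a separate Worker process over ssh rather than to a local object.

```python
import pickle


class ToyWorker(object):
    """Stands in for the Worker program: decode a request, do the work, reply."""

    def handle(self, request_bytes):
        task = pickle.loads(request_bytes)
        return pickle.dumps({"result": sum(task["values"])})


class ToyRemoteWorker(object):
    """Stands in for RemoteWorker: a proxy bound to exactly one target."""

    def __init__(self, target):
        self.target = target
        self.worker = ToyWorker()  # real qjam would start this on the target

    def run(self, values):
        reply = self.worker.handle(pickle.dumps({"values": values}))
        return pickle.loads(reply)["result"]


class ToyMaster(object):
    """Stands in for Master: hands work units to its pool in round-robin order."""

    def __init__(self, remote_workers):
        self.remote_workers = remote_workers

    def run(self, work_units):
        pool = self.remote_workers
        return [pool[i % len(pool)].run(unit) for i, unit in enumerate(work_units)]


# Two proxies may share one target, e.g. one per core on the same machine.
master = ToyMaster([ToyRemoteWorker("localhost"), ToyRemoteWorker("localhost")])
results = master.run([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(results)  # [6, 15, 24]
```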

Usage of the qjam framework is simple. There is one primary point of entry on a
Master instance: master.run(module, params, dataset). The 'module' argument is
a Python module object that contains a function 'mapfunc(params, dataset)' that
will be called by the worker on the params and slices of the whole dataset. The
'params' argument specifies an arbitrary Python object that is passed directly
to all workers. The 'dataset' argument is an instance of a DataSet type
(defined in qjam.dataset) that helps qjam determine how to slice the input
data.
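
The mapfunc contract can be tried out without any remote machines. The sketch below imitates the call pattern locally and is not the qjam API: slice_evenly and run_locally are hypothetical helpers, a plain list stands in for a DataSet, and combining the per-slice results by addition is an assumption.

```python
import types

# Stand-in for a user module; the only requirement is that it exposes mapfunc().
sum_module = types.ModuleType("sum_module")


def mapfunc(params, dataset):
    """Scale every value in this slice by params and add them up."""
    return sum(params * x for x in dataset)


sum_module.mapfunc = mapfunc


def slice_evenly(data, num_slices):
    """Split data into num_slices contiguous pieces of near-equal size."""
    size, extra = divmod(len(data), num_slices)
    slices, start = [], 0
    for i in range(num_slices):
        end = start + size + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices


def run_locally(module, params, data, num_workers):
    """Local imitation of master.run(): one mapfunc call per slice."""
    partials = [module.mapfunc(params, s) for s in slice_evenly(data, num_workers)]
    return sum(partials)  # assumption: partial results are combined by addition


result = run_locally(sum_module, 2, list(range(10)), 3)
print(result)  # 90
```

Note that every call to mapfunc receives the same params but a different slice of the data.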

The difference between 'params' and 'dataset' lies in how they are transferred.
Multiple calls to master.run() can use the same 'params' and 'dataset', but the
slices of 'dataset' are sent to the remote machines only on the first call;
after that they are cached locally at the nodes and are not retransferred.
'params', by contrast, is transferred in full on every call to master.run().
Therefore, 'params' is best used for data that changes between calls to
master.run(), and 'dataset' for data that does not change.
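
The transfer behavior described above can be sketched with a toy worker that counts what it receives. This is an illustration only: CachingWorker and its counters are hypothetical, and the source does not describe how qjam identifies cached slices, so the string key used here is an assumption.

```python
class CachingWorker(object):
    """Toy stand-in for a worker node that keeps dataset slices between runs."""

    def __init__(self):
        self.cache = {}            # slice key -> slice data held on the node
        self.slice_transfers = 0
        self.params_transfers = 0

    def run(self, mapfunc, params, key, fetch_slice):
        self.params_transfers += 1       # params are sent on every call
        if key not in self.cache:        # a slice is sent only the first time
            self.cache[key] = fetch_slice()
            self.slice_transfers += 1
        return mapfunc(params, self.cache[key])


def scaled_sum(params, dataset):
    return sum(params * x for x in dataset)


worker = CachingWorker()
data = [1, 2, 3]
first = worker.run(scaled_sum, 1, "slice-0", lambda: data)
second = worker.run(scaled_sum, 10, "slice-0", lambda: data)
print(first, second)                                     # 6 60
print(worker.slice_transfers, worker.params_transfers)   # 1 2
```

An iterative algorithm fits this pattern well: the large, fixed input goes in 'dataset', and the small value that changes each iteration goes in 'params'.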


TESTS
--------------------------------

The tests for qjam are located in the 'tests' directory. To run the test suite,
type:

  nosetests

in the root directory. The tests are a good way to learn more about the
architecture of qjam.


RUNNING ON STANFORD AI LAB MACHINES
----------------------------------------

The yggdrasil machines do not ship with Python 2.6 or Numpy, so you must first
install both locally. As of 2010/11/29, yggdrasil[1-4] all have them installed
at the prefix /tmp/py26. Adapt the /tmp/update_yggdrasil_py26.sh script to
install them on other yggdrasils. Set the env var
QJAM_REMOTE_PYTHON=/tmp/py26/bin/python in your calls to qjam to use this
Python.

Make sure /tmp/py26/bin is in your $PATH, and set $PYTHONPATH to the
top-level 'qjam' directory.

Examples:

yggdrasil1% QJAM_REMOTE_PYTHON=/tmp/py26/bin/python2.6 \
            bin/sum-matrix-example.py \
            yggdrasil1 yggdrasil2 yggdrasil3 yggdrasil4

yggdrasil1% QJAM_REMOTE_PYTHON=/tmp/py26/bin/python2.6 \
            python examples/newton_sqrt.py 123456789 yggdrasil1

yggdrasil1% QJAM_REMOTE_PYTHON=/tmp/py26/bin/python2.6 \
            nosetests tests/
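
The same environment can be assembled from Python before launching one of the commands above. This is a minimal sketch under stated assumptions: the helper name yggdrasil_env is hypothetical, and the interpreter path follows the python2.6 binary used in the examples.

```python
import os


def yggdrasil_env(base_env, prefix="/tmp/py26", qjam_root="."):
    """Return a copy of base_env set up for the locally installed Python."""
    env = dict(base_env)
    env["QJAM_REMOTE_PYTHON"] = prefix + "/bin/python2.6"
    # Put the local Python's bin directory ahead of the existing PATH entries.
    path_parts = [prefix + "/bin"]
    if env.get("PATH"):
        path_parts.append(env["PATH"])
    env["PATH"] = os.pathsep.join(path_parts)
    # PYTHONPATH points at the top-level qjam directory.
    env["PYTHONPATH"] = os.path.abspath(qjam_root)
    return env


env = yggdrasil_env(os.environ)
print(env["QJAM_REMOTE_PYTHON"])
```

The resulting dictionary can be passed as the env argument of subprocess.call() when starting an example or the test suite.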


CONTRIBUTORS
--------------------------------

ali01, jbenet, msparks, sqs
