
hpc-collab's People

Contributors

ssenator

hpc-collab's Issues

create a small library of cluster templates, preferably with a small constructor to create a cluster from a pre-existing template

Templates of interest:

  1. minimal cluster, similar to a stripped-down "vc"
  2. use case: (re)create a problem found on a production cluster, so that the packaged hpc-collab cluster may be used to a) demonstrate the problem, b) (in)validate the fix for the problem, c) provide archival artifacts, potentially serving as points of integration into larger integration & validation test suites
  3. minimal cluster,
  4. clusters with login front-ends in different security zones

Nodes must be able to be halted, destroyed and then re-integrated, possibly with a "light provision" or "no provision" alternative

This issue arose from a discussion regarding the USRC Resilience team's requirements for fault detection and signature generation.

This may justify a lighter-weight underlying provider, such as libvirt/ovirt or docker/container.

This would allow node configuration changes to be tested in a CI/CD pipeline, self-validating an hpc-collab node recipe & configuration change.

Another alternative: returning the node to an earlier full state.
Perhaps an underlying vagrant snapshot/suspend mechanism could be utilized, provided:
there are sufficient hooks to trigger a mini-provision, consisting mostly of verification, and
there is a mechanism to do multi-machine snapshot/resume while preserving dependencies.
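
A minimal sketch of the multi-machine snapshot/resume idea, assuming the nodes are already listed in the order they must come up. It only builds command lines; it does not run them. The node names, the snapshot name and the `make verify` step run over `vagrant ssh` are hypothetical stand-ins for whatever verification hook a mini-provision would use.

```python
from typing import List

def snapshot_commands(nodes_in_boot_order: List[str], name: str) -> List[List[str]]:
    """Save snapshots dependents-first, so a node is captured before
    the nodes it relies on are paused."""
    return [["vagrant", "snapshot", "save", node, name]
            for node in reversed(nodes_in_boot_order)]

def restore_commands(nodes_in_boot_order: List[str], name: str,
                     verify_target: str = "verify") -> List[List[str]]:
    """Restore prerequisites first, following each restore with a
    verification-only step (hypothetical 'make verify' inside the node)."""
    commands: List[List[str]] = []
    for node in nodes_in_boot_order:
        commands.append(["vagrant", "snapshot", "restore", node, name])
        # Assumption: the node recipe exposes a verify target.
        commands.append(["vagrant", "ssh", node, "-c", f"make {verify_target}"])
    return commands

if __name__ == "__main__":
    order = ["fs", "db", "login"]  # hypothetical node names
    for cmd in snapshot_commands(order, "clean") + restore_commands(order, "clean"):
        print(" ".join(cmd))
```

Saving in reverse order and restoring in forward order is one possible way to preserve dependencies; whether it is sufficient depends on how tightly the services are coupled.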

documentation and tool to CreateClusterFromRecipeTemplate

  • Issue #1/AddNodeCfg
  • Issue #2/Refactor ... common
  • Issue #22/External source of truth cfg mgmt
  • Makefile/Autoconf/CMake/Ansible-glue to generate Vagrantfile for a meta-make cluster recipe (using templates)
    • dynamically constructs other shared views of common data, such as
      /etc/ethers, /etc/networks, /etc/hosts and the Vagrantfile IP and MAC addresses
    • Use this file system import into the cluster as a consistent checkout that interfaces with an external configuration management system:
      populate user data, slurmdb and possibly the full node config from an external source of truth, e.g. HESIOD, or, vice versa, populate an external directory entry from the local recipe's file system configuration (alternate option)
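
One way to picture the "shared views into common data" idea is a single node table from which each file is rendered. The sketch below is an assumption about how that could look, not the project's tooling: the `Node` type, the function names and the sample addresses are hypothetical. The Vagrantfile fragment is left out because its format is not shown here.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class Node:
    name: str
    ip: str
    mac: str

def render_hosts(nodes: Iterable[Node]) -> str:
    """/etc/hosts view: one 'IP<TAB>hostname' line per node."""
    return "".join(f"{n.ip}\t{n.name}\n" for n in nodes)

def render_ethers(nodes: Iterable[Node]) -> str:
    """/etc/ethers view: one 'MAC<TAB>hostname' line per node."""
    return "".join(f"{n.mac}\t{n.name}\n" for n in nodes)

if __name__ == "__main__":
    # Hypothetical sample data.
    cluster = [
        Node("fs", "192.168.56.10", "52:54:00:00:00:10"),
        Node("login", "192.168.56.11", "52:54:00:00:00:11"),
    ]
    print(render_hosts(cluster), end="")
    print(render_ethers(cluster), end="")
```

Because every view is derived from the same table, an address changed in one place cannot drift out of sync between files.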

leaner shell provisioner: provision.sh => Makefile/cfg-mgmt: provisioning is within node recipe (data/config/cfg-mgr/ansible) rather than procedural control (provision.sh)

Issue #2, Issue #22, Issue #52

  • Makefile target rules (build, install, configure, verify) rather than in provision.sh driver
  • precise ordering
    • precisely specify order of operations of service hierarchy in sub-directories of {build,config,install,verify}
    • including the {inc,lib}/*.sh headers, which should not be loaded in alphabetic order (ex. dynamic before cfgfs)
    • each directory contains a "cursor", "next" and a "requires"
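
The "requires" relation above can be resolved into a precise order with a topological sort. This is a minimal sketch, assuming the requirements have already been read into a mapping from each unit to the units it requires; the function name and the in-memory mapping are hypothetical, and the on-disk "cursor"/"next"/"requires" layout is not modeled.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

def load_order(requires: dict[str, set[str]]) -> list[str]:
    """Return the units so that every unit appears after all the
    units it requires. Raises CycleError on circular requirements."""
    return list(TopologicalSorter(requires).static_order())

if __name__ == "__main__":
    # From the text: 'dynamic' must load before 'cfgfs',
    # even though alphabetic order would put 'cfgfs' first.
    print(load_order({"cfgfs": {"dynamic"}, "dynamic": set()}))
    try:
        load_order({"a": {"b"}, "b": {"a"}})
    except CycleError as err:
        print("circular requires:", err.args[1])
```

An explicit dependency sort also reports circular requirements up front, which a hand-maintained ordering would only reveal at provision time.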

Document and/or create a small 'AddNodeCfg' process & tool

Current process is as follows:

  1. Add an entry to clusters/<cluster-id>/common/etc/{hosts,ethers} for the new host, picking a unique
    IP and MAC address
  2. Put in a corresponding entry into clusters/<cluster-id>/Vagrantfile.<cluster-id>.nodes
  3. Add an entry into clusters/<cluster-id>/common/etc/slurm.conf, for the node itself and whichever partition(s) are appropriate.
  4. Update clusters/<cluster-id>/cfg/<new-node>/requires/<pre-requisite-nodes>/<pre-requisite-svcs> up
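
Step 1 of the process above is the easiest to get wrong by hand, since the new IP and MAC must not collide with existing entries. The following is a sketch of that check only, assuming the hosts and ethers files have been read into strings; the function names and sample addresses are hypothetical, and steps 2 through 4 are not covered.

```python
from typing import List, Tuple

def _rows(text: str) -> List[List[str]]:
    """Split a hosts/ethers style file into fields, skipping comments and blanks."""
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            rows.append(line.split())
    return rows

def _with_newline(text: str) -> str:
    return text if not text or text.endswith("\n") else text + "\n"

def add_node_entries(hosts_text: str, ethers_text: str,
                     name: str, ip: str, mac: str) -> Tuple[str, str]:
    """Return updated (hosts, ethers) text with the new node appended.
    Raises ValueError if the name, IP or MAC is already in use."""
    host_rows, ether_rows = _rows(hosts_text), _rows(ethers_text)
    if ip in {r[0] for r in host_rows}:
        raise ValueError(f"IP already in use: {ip}")
    if name in {n for r in host_rows for n in r[1:]}:
        raise ValueError(f"host name already in use: {name}")
    if mac.lower() in {r[0].lower() for r in ether_rows}:
        raise ValueError(f"MAC already in use: {mac}")
    return (_with_newline(hosts_text) + f"{ip}\t{name}\n",
            _with_newline(ethers_text) + f"{mac.lower()}\t{name}\n")

if __name__ == "__main__":
    hosts = "192.168.56.10\tfs\n"        # hypothetical existing entries
    ethers = "52:54:00:00:00:10\tfs\n"
    new_hosts, new_ethers = add_node_entries(
        hosts, ethers, "n1", "192.168.56.11", "52:54:00:00:00:11")
    print(new_hosts, new_ethers, sep="")
```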

Modulefile to set path & aliases, as a superset of setpath.*

Nodename aliases vary depending on whether the node has been fully provisioned. If the node is not fully provisioned, the alias maps to 'make '. If the node is fully provisioned, the alias maps to 'ssh -o UserKnownHostsFile=/dev/null '
