AIStore on Kubernetes

AIStore is a lightweight, scalable object storage solution designed for AI applications. This repository serves as a complete toolkit for setting up AIStore in a Kubernetes environment, accommodating both managed Kubernetes services and bare-metal Kubernetes setups.

Overview

This repository includes a variety of resources to facilitate your deployment:

  • Documentation/Guide: This section provides detailed, step-by-step instructions for deploying AIStore on Kubernetes (K8s), covering essential deployment scenarios and considerations.
  • Ansible Playbooks: These playbooks are designed to streamline the setup of Kubernetes worker nodes for hosting AIStore deployments.
  • Kubernetes Operator: AIS K8s Operator simplifies critical tasks such as bootstrapping, deployment, scaling, graceful shutdowns, and upgrades. It extends Kubernetes' native API, automating the lifecycle management of AIStore clusters.
  • Helm Charts: [In development] Helm charts for deploying AIS resources that are controlled by the operator (an alternative to Ansible).
  • Monitoring: This guide provides detailed instructions on how to monitor AIStore using both command-line tools and a Kubernetes-based monitoring stack.

A Simple System Overview

The diagram illustrates an AIStore deployment on Kubernetes spread across multiple nodes, each containing a proxy pod and a target pod. The proxy routes client requests to the target pods, which handle data storage and retrieval. The target pods use Persistent Volume Claims (PVCs) bound to Persistent Volumes (PVs) that correspond to physical storage disks. The AIS Operator oversees the entire setup, managing all operations related to the cluster.

[Figure: system-overview]
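
The layout described above can be sketched as a small toy model in Python. This is only an illustration under stated assumptions: the class names (`ProxyPod`, `TargetPod`), the `build_cluster` helper, the PVC naming scheme, and the hash-modulo placement are all hypothetical and chosen for brevity. They are not taken from AIStore and do not describe how AIStore actually places or routes objects.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TargetPod:
    """Stores objects; in a real cluster the data would live on PVC-backed disks."""
    name: str
    pvcs: List[str]
    objects: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class ProxyPod:
    """Routes client requests to a target pod; holds no object data itself."""
    name: str
    targets: List[TargetPod]

    def pick_target(self, object_name: str) -> TargetPod:
        # Assumption: plain hash-modulo placement, used here only for brevity.
        digest = hashlib.sha256(object_name.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.targets)
        return self.targets[index]

    def put(self, object_name: str, data: bytes) -> str:
        target = self.pick_target(object_name)
        target.objects[object_name] = data
        return target.name

    def get(self, object_name: str) -> bytes:
        return self.pick_target(object_name).objects[object_name]


def build_cluster(node_count: int, disks_per_node: int) -> List[ProxyPod]:
    """Create one proxy and one target per node; each target gets one PVC per disk."""
    targets = [
        TargetPod(
            name=f"target-{n}",
            pvcs=[f"pvc-node{n}-disk{d}" for d in range(disks_per_node)],
        )
        for n in range(node_count)
    ]
    # Every proxy sees the same list of targets.
    return [ProxyPod(name=f"proxy-{n}", targets=targets) for n in range(node_count)]


if __name__ == "__main__":
    proxies = build_cluster(node_count=3, disks_per_node=2)
    stored_on = proxies[0].put("dataset/shard-0001.tar", b"example bytes")
    # Any proxy resolves the same object name to the same target.
    assert proxies[2].get("dataset/shard-0001.tar") == b"example bytes"
    print(f"object stored on {stored_on}")
```

The point of the sketch is the separation of roles: proxies only decide where a request goes, while targets own the data and the volume claims.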

Small Scale Experimental Deployments

This repository focuses mainly on production deployments of AIStore across multiple nodes, each with multiple drives. If you don't need that scale, consider the different deployment options available.

Deployment Guide

To successfully implement a multi-node deployment of AIStore in a production environment, thorough planning and strategic configuration decisions are essential. We recommend reviewing our Key Deployment Scenarios to determine the specific needs and objectives for your cluster. For a clear and detailed roadmap, our Step-by-Step Deployment Guide provides extensive instructions and best practices for setting up AIStore clusters on Kubernetes.

AIStore Operator

The AIS Operator is a key component of ais-k8s. It manages AIStore resources within Kubernetes and extends the Kubernetes API to automate the cluster lifecycle, simplifying tasks such as bootstrapping, deployment, scaling, graceful shutdown, and upgrades.
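
Kubernetes operators generally work by comparing a declared desired state with the observed state and acting on the difference. The toy Python sketch below illustrates that general pattern only. `ClusterState` and `reconcile` are hypothetical names, and nothing here reflects the AIS Operator's actual code, resource fields, or behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical, simplified view of a cluster: pod counts and a version tag."""
    proxies: int
    targets: int
    version: str


def reconcile(desired: ClusterState, observed: ClusterState) -> List[str]:
    """Return the actions needed to move `observed` toward `desired`."""
    actions: List[str] = []
    for role in ("proxies", "targets"):
        want, have = getattr(desired, role), getattr(observed, role)
        if want > have:
            actions.append(f"scale up {role}: {have} -> {want}")
        elif want < have:
            actions.append(f"gracefully shut down {have - want} {role}")
    if desired.version != observed.version:
        actions.append(f"upgrade: {observed.version} -> {desired.version}")
    return actions


if __name__ == "__main__":
    desired = ClusterState(proxies=3, targets=3, version="v2")
    observed = ClusterState(proxies=3, targets=2, version="v1")
    for action in reconcile(desired, observed):
        print(action)
```

When the two states already match, `reconcile` returns an empty list, which is the steady state an operator aims for.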

Important: Our deployment guide focuses on using the AIStore Operator for an easy and integrated setup process.
