fluid's Introduction

📅 Community Meeting
The Fluid project holds a bi-weekly online community meeting. To join, or to review the notes and recordings of previous meetings, please see the meeting schedule and meeting minutes.

What is Fluid?

Fluid is an open source Kubernetes-native Distributed Dataset Orchestrator and Accelerator for data-intensive applications, such as big data and AI applications. It is hosted by the Cloud Native Computing Foundation (CNCF) as a sandbox project.

For more information, please refer to our papers:

  1. Rong Gu, Kai Zhang, Zhihao Xu, et al. Fluid: Dataset Abstraction and Elastic Acceleration for Cloud-native Deep Learning Training Jobs. IEEE ICDE, pp. 2183-2196, May, 2022. (Conference Version)

  2. Rong Gu, Zhihao Xu, Yang Che, et al. High-level Data Abstraction and Elastic Data Caching for Data-intensive AI Applications on Cloud-native Platforms. IEEE TPDS, pp. 2946-2964, Vol 34(11), 2023. (Journal Version)

What is NEW!
Latest Release: Apr. 17th, 2024. Fluid v1.0.0. Please check the CHANGELOG for details.
v0.9.0 Release: May. 26th, 2023. Fluid v0.9.0. Please check the CHANGELOG for details.
v0.8.0 Release: Sep. 3rd, 2022. Fluid v0.8.0. Please check the CHANGELOG for details.
v0.7.0 Release: Mar. 2nd, 2022. Fluid v0.7.0. Please check the CHANGELOG for details.
v0.6.0 Release: Aug. 11th, 2021. Fluid v0.6.0. Please check the CHANGELOG for details.
Apr. 27th, 2021. Fluid accepted by CNCF! The Fluid project was accepted as an official CNCF Sandbox Project by the CNCF Technical Oversight Committee (TOC) with a majority vote after the review process. A new beginning for Fluid!

Features

  • Dataset Abstraction

    Implements the unified abstraction for datasets from multiple storage sources, with observability features to help users evaluate the need for scaling the cache system.

  • Scalable Cache Runtime

    Offers a unified access interface for data operations with different runtimes, enabling access to third-party storage systems.

  • Automated Data Operations

    Provides various automated data operation modes to facilitate integration with automated operations systems.

  • Elasticity and Scheduling

    Enhances data access performance by combining data caching technology with elastic scaling, portability, observability, and data affinity-scheduling capabilities.

  • Runtime Platform Agnostic

    Supports a variety of environments and can run different storage clients based on the environment, including native, edge, Serverless Kubernetes clusters, and Kubernetes multi-cluster environments.

Key Concepts

Dataset: A Dataset is a set of logically related data that can be used by computing engines, such as Spark for big data analytics and TensorFlow for AI applications. Leveraging data intelligently often creates core industry value. Managing Datasets may require features along several dimensions, such as security, version management, and data acceleration. We hope to start with data acceleration to support the management of Datasets.

Runtime: The Runtime enforces dataset isolation and sharing, provides version management, and enables data acceleration. It does so by defining a set of interfaces that handle Datasets throughout their lifecycle, so that management and acceleration functionality can be implemented behind these interfaces.
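To make the relationship between the two concepts more concrete, here is a minimal Go sketch of a lifecycle interface with one toy implementation behind it. Every name in it (`Dataset`, `Runtime`, `Bind`, `Preload`, `Release`, `memoryRuntime`) is hypothetical and chosen only for illustration. It is not Fluid's actual Go API or CRD schema, and the real lifecycle has more stages than the three shown.

```go
package main

import (
	"errors"
	"fmt"
)

// Dataset is a hypothetical, simplified stand-in for a set of logically
// related data. It is NOT Fluid's real Dataset type.
type Dataset struct {
	Name   string
	Source string // illustrative location of the underlying storage
}

// Runtime is a hypothetical lifecycle interface. Callers depend only on
// these methods, so caching or acceleration logic can change behind them.
type Runtime interface {
	Bind(ds Dataset) error     // start managing a dataset
	Preload(name string) error // warm the cache for a bound dataset
	Release(name string) error // stop managing a dataset
}

var errNotBound = errors.New("dataset is not bound to this runtime")

// memoryRuntime is a toy implementation that only tracks cache state.
type memoryRuntime struct {
	cached map[string]bool // dataset name -> whether its cache is warm
}

func newMemoryRuntime() *memoryRuntime {
	return &memoryRuntime{cached: map[string]bool{}}
}

func (r *memoryRuntime) Bind(ds Dataset) error {
	if _, ok := r.cached[ds.Name]; ok {
		return fmt.Errorf("dataset %q is already bound", ds.Name)
	}
	r.cached[ds.Name] = false
	return nil
}

func (r *memoryRuntime) Preload(name string) error {
	if _, ok := r.cached[name]; !ok {
		return errNotBound
	}
	r.cached[name] = true
	return nil
}

func (r *memoryRuntime) Release(name string) error {
	if _, ok := r.cached[name]; !ok {
		return errNotBound
	}
	delete(r.cached, name)
	return nil
}

func main() {
	var rt Runtime = newMemoryRuntime()
	ds := Dataset{Name: "demo", Source: "https://example.com/data"}

	// Walk the dataset through its lifecycle using only the interface.
	fmt.Println("bind:", rt.Bind(ds))
	fmt.Println("preload:", rt.Preload(ds.Name))
	fmt.Println("release:", rt.Release(ds.Name))
	fmt.Println("preload after release:", rt.Preload(ds.Name))
}
```

Because `main` talks only to the `Runtime` interface, swapping `memoryRuntime` for another implementation would not change the calling code. That separation is the point the Runtime description makes.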

Prerequisites

  • Kubernetes version > 1.16, with CSI support
  • Golang 1.18+
  • Helm 3
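
The version requirements above can be checked mechanically. The following is a minimal sketch, assuming you have already collected the version strings yourself (for example from `kubectl version`, `go version`, and `helm version`). The helper names `parseMajorMinor` and `checkPrerequisites` are hypothetical and not part of Fluid. The sketch does not check CSI support, which has to be verified against the cluster itself.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajorMinor extracts the major and minor numbers from strings such as
// "v1.28.3", "1.18" or "go1.21.0". Patch levels and suffixes are ignored.
func parseMajorMinor(v string) (int, int, error) {
	v = strings.TrimPrefix(strings.TrimPrefix(v, "go"), "v")
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("version %q has no minor component", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("bad major version in %q: %w", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("bad minor version in %q: %w", v, err)
	}
	return major, minor, nil
}

// checkPrerequisites returns one message per unmet requirement:
// Kubernetes > 1.16 (strict), Go 1.18 or newer, and Helm major version 3.
func checkPrerequisites(k8s, golang, helm string) []string {
	var problems []string

	if major, minor, err := parseMajorMinor(k8s); err != nil {
		problems = append(problems, err.Error())
	} else if !(major > 1 || (major == 1 && minor > 16)) {
		problems = append(problems, "Kubernetes must be newer than 1.16, got "+k8s)
	}

	if major, minor, err := parseMajorMinor(golang); err != nil {
		problems = append(problems, err.Error())
	} else if !(major > 1 || (major == 1 && minor >= 18)) {
		problems = append(problems, "Go must be 1.18 or newer, got "+golang)
	}

	if major, _, err := parseMajorMinor(helm); err != nil {
		problems = append(problems, err.Error())
	} else if major != 3 {
		problems = append(problems, "Helm must be version 3, got "+helm)
	}

	return problems
}

func main() {
	// Sample version strings; replace them with the ones from your environment.
	problems := checkPrerequisites("v1.28.3", "go1.21.0", "v3.12.0")
	if len(problems) == 0 {
		fmt.Println("all version prerequisites met")
		return
	}
	for _, p := range problems {
		fmt.Println("unmet:", p)
	}
}
```

Note that the Kubernetes comparison is strict, so a 1.16 cluster is reported as unmet, while the Go comparison accepts 1.18 itself.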

Quick Start

You can follow our Get Started guide to quickly set up a test Kubernetes cluster.

Documentation

See our documentation at docs for more in-depth installation steps and instructions for production use.

You can also visit the Fluid Homepage for related documents.

Quick Demo

Demo 1: Accelerate Remote File Accessing with Fluid

Demo 2: Machine Learning with Fluid

Demo 3: Accelerate PVC with Fluid

Demo 4: Preload dataset with Fluid

Demo 5: On-the-fly dataset cache scaling

Roadmap

See ROADMAP.md for the roadmap details. It may be updated from time to time.

Community

Feel free to reach out if you have any questions. The maintainers of this project are reachable via:

DingTalk:

WeChat Official Account:

Slack:

  • Join the CNCF Slack and navigate to the #fluid channel for discussion.

Contributing

Contributions are highly welcomed and greatly appreciated. See CONTRIBUTING.md for details on submitting patches and the contribution workflow.

Adopters

If you are interested in Fluid and would like to share your experiences with others, you are warmly welcome to add your information to the ADOPTERS.md page. We will continue to discuss new requirements and feature designs with you in advance.

Open Source License

Fluid is under the Apache 2.0 license. See the LICENSE file for details. It is vendor-neutral.

Report Vulnerability

Security is a top priority for us at Fluid. If you come across a security-related issue, please send an email to [email protected]. Also see our SECURITY.md file for details.

Code of Conduct

Fluid adopts the CNCF Code of Conduct.
