sealfs's Introduction

SEALFS

sealfs is the storage system of sealos. It aims to be a high-performance, highly reliable, and auto-scaling distributed file system suited to cloud-native environments.

System Architecture

The architecture of sealfs is decentralized, with no single metadata node. sealfs aims to maximize read and write performance and to solve the problems of storing large numbers of small files.

Main Components

sealfs consists of the following three components:

Server

The server component is responsible for storing files and metadata. sealfs places data and metadata on different disks, since metadata is the most frequently accessed ("hot") data in a distributed file system. This lets users choose faster hardware for storing metadata.

Client

The client component implements the file system in user mode. It intercepts file requests and uses hash algorithms to store and address them.
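
As a rough illustration of hash-based addressing, the sketch below maps a file path to one server by hashing the path and taking the result modulo the number of servers. This is a minimal sketch under stated assumptions, not the sealfs implementation: the function name `locate_server` and the modulo placement are hypothetical, and the issues further down indicate that the project is moving toward consistent hashing.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: pick the server responsible for `path`.
/// Returns `None` when the server list is empty.
pub fn locate_server<'a>(path: &str, servers: &'a [String]) -> Option<&'a String> {
    if servers.is_empty() {
        return None;
    }
    // Hash the path; every client computes the same value for the same path.
    let mut hasher = DefaultHasher::new();
    path.hash(&mut hasher);
    // Simple modulo placement (an assumption made for this sketch).
    let index = (hasher.finish() % servers.len() as u64) as usize;
    servers.get(index)
}
```

Because the mapping depends only on the path and the server list, no metadata node has to be consulted to find a file. The drawback of plain modulo placement is that adding or removing a server remaps most paths.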

Manager

The manager component is responsible for coordinating the cluster.

The system architecture is shown in the following diagram:

[Figure: sealfs system architecture]

User Mode All The Way

Given suitable hardware, sealfs aims to run entirely in user mode, from file request hijacking on the client side, through the network, to the storage, for maximum performance.

For more design details, see:

Design Document

RoadMap

Currently, we are focused on improving performance throughout the system. Other design goals, such as high reliability and high availability, have lower priority.

  • First version features:
    • Client:
      • FUSE file system interface
      • System call hijacking (user-mode file system)
      • location algorithm
      • batch processing
    • Server:
      • bypass file system
      • file storage
      • disk manager
      • catalogue manager
      • metadata persistent memory storage
      • file index
      • file lock
      • persistent data structure
    • Manager:
      • heartbeat manager
    • Network:
      • RDMA
      • socket network
    • Test:
      • IO500
      • function test

Compile

Rust version: 1.68

make build

Quick Start

Start Manager

# edit manager.yaml
vi examples/manager.yaml

# start manager with manager.yaml
SEALFS_CONFIG_PATH=./examples ./target/debug/manager &

Start Servers on a Node

./target/debug/server --manager-address <manager_ip>:<manager_port> --server-address <server_ip>:<server_port> --database-path <local_database_dir> --storage-path <local_storage_dir> --log-level warn &

Start Client on a Node

./target/debug/client --log-level warn daemon

Create & Mount Disk

./target/debug/client --log-level warn create test1 100000
./target/debug/client --log-level warn mount ~/fs test1

LICENSE

Apache License 2.0

sealfs's People

Contributors

dinoallo, hhoflittlefish777, lavenderqaq, luanshaotong, mond77, tricky511, uran0sh, zzjin

sealfs's Issues

refactor the fuser code

Refactor the fuser main loop to use tokio, to avoid switching between threads and coroutines.
Introducing a separate async FUSE library may be a better option.

Encapsulate Linux native aio

We chose an asynchronous I/O model for storing files. First, Linux native AIO, which is supported by relatively old kernel versions, will be encapsulated. io_uring will be encapsulated later.

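Set up a standalone cluster node information server

Set up a standalone server that provides information about the nodes in the cluster, i.e. which service nodes (daemons) the cluster contains.

  1. Each client must be able to fetch all currently online servers from this server. Implement get_servers() in src/client/cluster_info.cpp. Dynamic membership changes do not need to be considered for now.

  2. A daemon must be able to register its information with the server. The interface for this has not been abstracted yet.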

ClientAsync has some problems

"Error: Error waiting for callback" occurs when issuing 100,000 concurrent RPCs with the async client, while Client (which has since been replaced by ClientAsync) works normally.

There must be some hidden problems.

Replace unwrap with error handling

When an error occurs, unwrap causes the program to panic. The unwrap calls in the code need to be replaced with proper error handling.
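
The replacement described above can be sketched as follows: instead of chaining `unwrap` calls, the function returns a `Result` and propagates failures with `?`. This is an illustrative sketch only; `ConfigError` and `parse_port` are hypothetical names and do not come from the sealfs code base.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum ConfigError {
    MissingPort,
    InvalidPort(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingPort => write!(f, "address has no ':<port>' part"),
            ConfigError::InvalidPort(e) => write!(f, "invalid port: {}", e),
        }
    }
}

impl std::error::Error for ConfigError {}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::InvalidPort(e)
    }
}

/// A panicking version would be
/// `address.rsplit_once(':').unwrap().1.parse::<u16>().unwrap()`.
/// Here both failure cases are returned to the caller instead.
pub fn parse_port(address: &str) -> Result<u16, ConfigError> {
    let (_, port) = address.rsplit_once(':').ok_or(ConfigError::MissingPort)?;
    Ok(port.parse::<u16>()?)
}
```

The caller can then decide whether to log the error, retry, or return it further up, rather than having the whole process abort.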

Bug: duplicate RPC request id

The id of an RPC request is the index of its element in the resource pool, so duplicate ids may appear. The response to a timed-out request from the first round may be parsed and delivered to a second-round request that reuses the same id. A mechanism is needed to ensure that responses to earlier requests are discarded.
One possible approach is to add a UID to the request header.
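
One way to picture the UID idea is the sketch below: each slot in the pool is stamped with a fresh uid when it is acquired, and a response is accepted only if its uid matches the one currently stored in the slot. All names here (`RequestId`, `RequestPool`, `accept_response`) are hypothetical, and the single-threaded pool is a simplification; the real RPC layer would need synchronization.

```rust
/// Identifier carried in the (hypothetical) request header.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RequestId {
    pub slot: usize, // index in the resource pool, reused across rounds
    pub uid: u64,    // unique per request, never reused
}

struct Slot {
    uid: u64,
    in_use: bool,
}

pub struct RequestPool {
    slots: Vec<Slot>,
    next_uid: u64,
}

impl RequestPool {
    pub fn new(capacity: usize) -> Self {
        let slots = (0..capacity)
            .map(|_| Slot { uid: 0, in_use: false })
            .collect();
        RequestPool { slots, next_uid: 1 }
    }

    /// Claim a free slot and stamp it with a fresh uid.
    pub fn acquire(&mut self) -> Option<RequestId> {
        let slot = self.slots.iter().position(|s| !s.in_use)?;
        let uid = self.next_uid;
        self.next_uid += 1;
        self.slots[slot] = Slot { uid, in_use: true };
        Some(RequestId { slot, uid })
    }

    /// Give a slot back without a response, e.g. after a timeout.
    pub fn release(&mut self, id: RequestId) {
        if let Some(s) = self.slots.get_mut(id.slot) {
            if s.in_use && s.uid == id.uid {
                s.in_use = false;
            }
        }
    }

    /// Accept a response only if it matches the request that currently
    /// owns the slot; stale responses from earlier rounds are rejected.
    pub fn accept_response(&mut self, id: RequestId) -> bool {
        match self.slots.get_mut(id.slot) {
            Some(s) if s.in_use && s.uid == id.uid => {
                s.in_use = false;
                true
            }
            _ => false,
        }
    }
}
```

With this scheme a late response to a timed-out request still carries the old uid, so it no longer matches the slot's current owner and is dropped.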

Implement open_file

The open function is currently not supported, so files cannot be opened and read.

  1. Implement Connection::open_remote_file in src/client/connection.cpp
  2. Implement Server::open_file in src/deamon/server.cpp
  3. Register open_file in Server::operation_filter in src/deamon/server.cpp
  4. Implement Engine::open_file in src/deamon/engine.cpp

Test:

echo "test" >> test
cat test

Implement thread safe treemap

In the consistent hashing module, we use rwlock + btreemap to cache data, but this has the following problems. Even with a third-party rwlock library, we can only guarantee fair reads and writes, whereas the heartbeat cache update scenario needs write priority. The performance of rwlock + btreemap is also not very good. This is important work. If you are interested, you can write a utility class and put it in common. We will also need a benchmark to demonstrate the performance.
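
For readers unfamiliar with the pattern, the sketch below shows a consistent hash ring guarded by the standard library's `RwLock` around a `BTreeMap`. It illustrates the baseline the issue describes, not the requested write-priority replacement: `HashRing` and its methods are hypothetical names, the ring has no virtual nodes, and the fairness policy of `std::sync::RwLock` depends on the platform.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical ring: position on the ring -> server address.
pub struct HashRing {
    ring: RwLock<BTreeMap<u64, String>>,
}

impl HashRing {
    pub fn new() -> Self {
        HashRing { ring: RwLock::new(BTreeMap::new()) }
    }

    /// Writers (heartbeat updates) take the exclusive lock.
    pub fn add_server(&self, address: &str) {
        let mut ring = self.ring.write().expect("ring lock poisoned");
        ring.insert(hash_of(address), address.to_string());
    }

    pub fn remove_server(&self, address: &str) {
        let mut ring = self.ring.write().expect("ring lock poisoned");
        ring.remove(&hash_of(address));
    }

    /// Readers take the shared lock and walk clockwise from the key's
    /// position, wrapping around to the first entry if needed.
    pub fn locate(&self, path: &str) -> Option<String> {
        let ring = self.ring.read().expect("ring lock poisoned");
        let key = hash_of(path);
        ring.range(key..)
            .next()
            .or_else(|| ring.iter().next())
            .map(|(_, address)| address.clone())
    }
}
```

A benchmark for a replacement structure could compare `locate` throughput under concurrent `add_server` and `remove_server` calls against this baseline.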

Optimize heartbeat module

The current heartbeat design is very simple and needs to be optimized in several respects:

  1. Nodes should not frequently go online and offline during a network partition.
  2. Cache consistency: data consistency between clients and servers.
  3. Avalanche failures caused by large numbers of nodes becoming unavailable.
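
As one possible direction for the first point, the sketch below keeps a node online until no heartbeat has arrived for a configurable grace period, so a single delayed heartbeat does not flip its state. This is an assumption-laden illustration, not the sealfs heartbeat module: `HeartbeatTable`, its methods, and the grace-period approach are all hypothetical.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical table of the last heartbeat seen from each server.
pub struct HeartbeatTable {
    last_seen: HashMap<String, Instant>,
    grace: Duration,
}

impl HeartbeatTable {
    pub fn new(grace: Duration) -> Self {
        HeartbeatTable { last_seen: HashMap::new(), grace }
    }

    /// Record a heartbeat received from `server` at time `now`.
    pub fn record(&mut self, server: &str, now: Instant) {
        self.last_seen.insert(server.to_string(), now);
    }

    /// A server counts as online while its last heartbeat is at most
    /// `grace` old; unknown servers are offline.
    pub fn is_online(&self, server: &str, now: Instant) -> bool {
        self.last_seen
            .get(server)
            .map_or(false, |seen| now.saturating_duration_since(*seen) <= self.grace)
    }
}
```

Passing `now` explicitly keeps the logic testable without sleeping. Choosing the grace period is a trade-off between reacting quickly to real failures and tolerating short network hiccups.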

Global unique id

The globally unique ID is an important distributed component and needs to be implemented. It should use the snowflake algorithm, with some practical optimizations.
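
A snowflake-style generator packs a millisecond timestamp, a node id, and a per-millisecond sequence number into one 64-bit integer. The sketch below uses the common 41/10/12 bit split; the custom epoch, the bit widths, and the `Snowflake` type are assumptions for illustration, not values chosen by the sealfs project.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

const NODE_BITS: u64 = 10;
const SEQUENCE_BITS: u64 = 12;
const MAX_SEQUENCE: u64 = (1 << SEQUENCE_BITS) - 1;
// Assumed custom epoch: 2023-01-01T00:00:00Z in milliseconds.
const CUSTOM_EPOCH_MS: u64 = 1_672_531_200_000;

pub struct Snowflake {
    node_id: u64,
    last_ms: u64,
    sequence: u64,
}

impl Snowflake {
    pub fn new(node_id: u64) -> Self {
        assert!(node_id < (1 << NODE_BITS), "node id must fit in NODE_BITS");
        Snowflake { node_id, last_ms: 0, sequence: 0 }
    }

    /// Milliseconds since the custom epoch (0 if the clock is earlier).
    fn now_ms() -> u64 {
        let since_unix = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_millis() as u64)
            .unwrap_or(0);
        since_unix.saturating_sub(CUSTOM_EPOCH_MS)
    }

    pub fn next_id(&mut self) -> u64 {
        // Never let the timestamp go backwards if the clock is adjusted.
        let mut now = Self::now_ms().max(self.last_ms);
        if now == self.last_ms {
            self.sequence = (self.sequence + 1) & MAX_SEQUENCE;
            if self.sequence == 0 {
                // Sequence exhausted for this millisecond: wait for the next one.
                while now <= self.last_ms {
                    now = Self::now_ms();
                }
            }
        } else {
            self.sequence = 0;
        }
        self.last_ms = now;
        (now << (NODE_BITS + SEQUENCE_BITS)) | (self.node_id << SEQUENCE_BITS) | self.sequence
    }
}
```

Ids from one generator are strictly increasing, and ids from different nodes differ in the node bits, provided each node is given a distinct `node_id`. How node ids are assigned (for example by the manager) is left open here.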

cargo build error

root@ubuntu-1:~/github/sealfs# cargo build
   Compiling sealfs v0.1.0 (/root/github/sealfs)
error[E0412]: cannot find type `Database` in this scope
  --> src/server/storage_engine/default_engine.rs:33:18
   |
33 |     pub file_db: Database,
   |                  ^^^^^^^^ not found in this scope

error[E0412]: cannot find type `Database` in this scope
  --> src/server/storage_engine/default_engine.rs:34:17
   |
34 |     pub dir_db: Database,
   |                 ^^^^^^^^ not found in this scope

error[E0412]: cannot find type `Database` in this scope
  --> src/server/storage_engine/default_engine.rs:35:23
   |
35 |     pub file_attr_db: Database,
   |                       ^^^^^^^^ not found in this scope

error[E0425]: cannot find value `file_db` in this scope
   --> src/server/storage_engine/default_engine.rs:127:13
    |
127 |             file_db,
    |             ^^^^^^^ a field by this name exists in `Self`

error[E0425]: cannot find value `dir_db` in this scope
   --> src/server/storage_engine/default_engine.rs:128:13
    |
128 |             dir_db,
    |             ^^^^^^ a field by this name exists in `Self`

error[E0425]: cannot find value `file_attr_db` in this scope
   --> src/server/storage_engine/default_engine.rs:129:13
    |
129 |             file_attr_db,
    |             ^^^^^^^^^^^^ a field by this name exists in `Self`

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:172:21
    |
172 |         if let Some(value) = self.file_attr_db.db.get(path.as_bytes())? {
    |                     ^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
    = note: all local variables must have a statically known size
    = help: unsized locals are gated as an unstable feature

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:172:16
    |
172 |         if let Some(value) = self.file_attr_db.db.get(path.as_bytes())? {
    |                ^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `client::_::_serde::__private::Some`
   --> /root/.rustup/toolchains/1.62.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:518:17
    |
518 | pub enum Option<T> {
    |                 ^ required by this bound in `client::_::_serde::__private::Some`

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:172:30
    |
172 |         if let Some(value) = self.file_attr_db.db.get(path.as_bytes())? {
    |                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `std::option::Option`
   --> /root/.rustup/toolchains/1.62.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:518:17
    |
518 | pub enum Option<T> {
    |                 ^ required by this bound in `std::option::Option`

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:173:74
    |
173 |             debug!("read_dir getting attr, path: {}, value: {:?}", path, value);
    |                                                                          ^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `ArgumentV1::<'a>::new_debug`
   --> /root/.rustup/toolchains/1.62.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:340:5
    |
340 |     arg_new!(new_debug, Debug);
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `ArgumentV1::<'a>::new_debug`
    = note: this error originates in the macro `format_args` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:197:21
    |
197 |             Ok(Some(value)) => {
    |                     ^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
    = note: all local variables must have a statically known size
    = help: unsized locals are gated as an unstable feature

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:197:16
    |
197 |             Ok(Some(value)) => {
    |                ^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `client::_::_serde::__private::Some`
   --> /root/.rustup/toolchains/1.62.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:518:17
    |
518 | pub enum Option<T> {
    |                 ^ required by this bound in `client::_::_serde::__private::Some`

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:258:18
    |
258 |             Some(value) => {
    |                  ^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
    = note: all local variables must have a statically known size
    = help: unsized locals are gated as an unstable feature

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:258:13
    |
258 |             Some(value) => {
    |             ^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `client::_::_serde::__private::Some`
   --> /root/.rustup/toolchains/1.62.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:518:17
    |
518 | pub enum Option<T> {
    |                 ^ required by this bound in `client::_::_serde::__private::Some`

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:257:35
    |
257 |         let mut file_attr = match self.file_attr_db.db.get(path.as_bytes())? {
    |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `std::option::Option`
   --> /root/.rustup/toolchains/1.62.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:518:17
    |
518 | pub enum Option<T> {
    |                 ^ required by this bound in `std::option::Option`

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
   --> src/server/storage_engine/default_engine.rs:273:13
    |
273 |             None => {
    |             ^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `client::_::_serde::__private::None`
   --> /root/.rustup/toolchains/1.62.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/option.rs:518:17
    |
518 | pub enum Option<T> {
    |                 ^ required by this bound in `client::_::_serde::__private::None`

Some errors have detailed explanations: E0277, E0412, E0425.
For more information about an error, try `rustc --explain E0277`.
error: could not compile `sealfs` due to 16 previous errors

[Feature] Read from command line

We need to read the configuration file path or parameters from the command line, but currently we read directly from the default configuration file.
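
A dependency-free sketch of reading such parameters is shown below. It parses the `--manager-address` and `--server-address` flags that appear in the Quick Start; the `ServerArgs` struct and `parse_args` function are hypothetical, and a real implementation might prefer an argument-parsing crate.

```rust
/// Hypothetical settings for a server process; field names mirror the
/// `--manager-address` and `--server-address` flags from the Quick Start.
#[derive(Debug, PartialEq)]
pub struct ServerArgs {
    pub manager_address: String,
    pub server_address: String,
}

/// Parse `--flag value` pairs from an iterator of arguments.
pub fn parse_args<I>(args: I) -> Result<ServerArgs, String>
where
    I: IntoIterator<Item = String>,
{
    let mut manager_address = None;
    let mut server_address = None;
    let mut iter = args.into_iter();
    while let Some(flag) = iter.next() {
        // Every flag in this sketch takes exactly one value.
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?;
        match flag.as_str() {
            "--manager-address" => manager_address = Some(value),
            "--server-address" => server_address = Some(value),
            _ => return Err(format!("unknown flag: {}", flag)),
        }
    }
    Ok(ServerArgs {
        manager_address: manager_address.ok_or("--manager-address is required")?,
        server_address: server_address.ok_or("--server-address is required")?,
    })
}
```

In a binary this would be called as `parse_args(std::env::args().skip(1))`, skipping the program name.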

Get Metadata from manager

The client needs to obtain the servers' heartbeat information from the manager in order to build the hash ring. Add support for consistent hashing.

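Configurable remote server

The code currently runs only as a local single node (the server address is hard-coded to 127.0.0.1).
The host and port need to be configurable as parameters.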

Some questions about running io500 tests

Code Bug Issues

  1. getdents_remote --- need to consider data length (SubDirectory needs to avoid serialization)

Performance Issues

  1. rpc --- probably the current bottleneck (40x difference)
  2. path --- get_realpath needs to be optimized (2x difference)
  3. fd --- the implementation of file_desc in intercept needs to be optimized (10% difference)

RoadMap

At present, it is still a very simple system, and our first goal is to improve the performance of disk I/O and the network. Building a general-purpose file system has lower priority. See readme.md for the work planned for the first version. Follow the project if you are interested.

  • Client:
    • FUSE file system interface
    • System call hijacking (user-mode file system) (in progress)
    • location algorithm (needs optimization)
    • batch processing
  • Server:
    • bypass file system
    • file storage
    • disk manager
    • catalogue manager
    • metadata persistent memory storage
    • file index
    • file lock
    • persistent data structure
  • Manager:
    • heartbeat manager (needs optimization)
  • Network:
    • RDMA
    • socket network
  • Test:
    • IO500
    • function test

[BUG] Compilation error occurs because libibverbs is missing

What happened:
The ibverbs library is missing at compile time.

  = note: /usr/bin/ld: cannot find -libverbs
          collect2: error: ld returned 1 exit status
          

error: could not compile `sealfs` due to previous error

Anything else we need to know?:
I manually installed libibverbs-dev, after which compilation succeeded:

sudo apt-get install libibverbs-dev

Environment:

  • version: main
  • OS (e.g: cat /etc/os-release):
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
  • Kernel (e.g. uname -a):
Linux hanqi-code 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

others
Do we need to add libibverbs-dev to the makefile?

/kind bug

Keys in dir_db may conflict

.put(format!("{}-{}-{}", parent_dir, file_name, ft), file_name)?;

First, create a directory at path "/a".
Then create a file at path "/a/a".
The db now contains a record with key "/a-a-f".

Next, creating a directory at path "/a-a-f" may lead to a record conflict.
Even creating a file "/a-a-a" may cause a directory read error at

for item in self.dir_db.db.iterator(IteratorMode::From(
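
One possible way to avoid this class of conflict is to join the key parts with a separator that cannot occur in a path component. The sketch below uses the NUL character, which POSIX file names cannot contain. It is a suggestion under stated assumptions, not the fix adopted by the project: `dir_entry_key` and `dir_scan_prefix` are hypothetical helpers, and `legacy_key` only mirrors the format string quoted in this issue.

```rust
/// NUL cannot appear in a POSIX path component, so it cannot be
/// confused with part of a directory or file name.
const SEP: char = '\0';

/// The scheme quoted in the issue, kept here to show the collision.
pub fn legacy_key(parent_dir: &str, file_name: &str, file_type: &str) -> String {
    format!("{}-{}-{}", parent_dir, file_name, file_type)
}

/// Hypothetical replacement: the same three parts, joined by `SEP`.
pub fn dir_entry_key(parent_dir: &str, file_name: &str, file_type: &str) -> String {
    format!("{}{}{}{}{}", parent_dir, SEP, file_name, SEP, file_type)
}

/// Prefix under which all entries of `parent_dir` are stored, suitable
/// as the starting point of an ordered scan.
pub fn dir_scan_prefix(parent_dir: &str) -> String {
    format!("{}{}", parent_dir, SEP)
}
```

With the legacy scheme, the file "a-a" in "/a" and the file "a" in "/a-a" both map to "/a-a-a-f". With the separator-based keys they differ, and a scan starting at `dir_scan_prefix("/a")` can stop at the first key that no longer has that prefix without picking up entries of "/a-a".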
