
LeoFS - A Storage System for a Data Lake and the Web

Join the chat at https://gitter.im/leo-project/leofs


Overview

LeoFS is an enterprise open source storage system: a highly available, distributed, eventually consistent object/blob store. If you need a storage system that can hold a huge number of files of various kinds, such as photos, movies, and log data, LeoFS is a good fit.

LeoFS supports the following features:

  • Multi Protocol
    • S3-API Support
      • LeoFS is an Amazon S3 compatible storage system.
      • Switching to LeoFS can reduce your costs compared with more expensive public-cloud solutions.
    • REST-API Support
    • NFS Support
      • NFS support has been provided since LeoFS v1.1; its current status is beta.
  • Large Object Support
    • LeoFS can handle large objects.
  • Multi Data Center Replication
    • LeoFS is a highly scalable, fault-tolerant distributed file system without a single point of failure (SPOF).
    • A LeoFS cluster can be viewed as one huge-capacity store. It consists of a set of loosely connected nodes.
    • We can build a global-scale storage system that is easy to operate.

Architecture

[Figure: LeoFS architecture diagram]

LeoFS consists of three core components, LeoStorage, LeoGateway, and LeoManager, all of which depend on Erlang.

LeoGateway handles HTTP requests and responses from clients that use the REST API or the S3 API. It also has a built-in object cache (memory and disk).

LeoStorage handles GET, PUT, and DELETE operations on objects and their metadata. It also has replicator, recoverer, and queueing mechanisms that keep a storage node running and realise eventual consistency.

LeoManager continuously monitors the LeoGateway and LeoStorage nodes. It mainly tracks node status and the RING's checksum in order to maintain high availability and data consistency.

You can access a LeoFS system using Amazon S3 clients and the SDK.

Slide

The presentation "Scaling and High Performance Storage System: LeoFS" was given at Erlang User Conference 2014 in Stockholm in June 2014.

Goals

  • LeoFS aims to provide high reliability, high scalability, and a high cost-performance ratio:
    • High Reliability
      • Nine nines: an operating ratio of 99.9999999%
    • High Scalability
      • Build a huge cluster at low cost
    • High Cost Performance
      • Fast: over 10 Gbps
      • Lower cost than other storage
      • Easy management and easy operation
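
To put the nine-nines target in concrete terms, the arithmetic below converts an availability percentage into the downtime it allows per year. This is only a back-of-the-envelope sketch: the function name and the 365-day year are assumptions made for illustration, not something taken from LeoFS.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def max_downtime_seconds(availability_percent, period_seconds=SECONDS_PER_YEAR):
    """Return the downtime (in seconds) allowed by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_seconds


downtime = max_downtime_seconds(99.9999999)
print(f"{downtime * 1000:.1f} ms per year")  # about 31.5 ms
```

In other words, a 99.9999999% operating ratio leaves a budget of roughly 30 milliseconds of unavailability per year.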

Further Reference

Build LeoFS with LeoFS Packages

LeoFS packages are already available on the Web, so you can easily install LeoFS in your environment.

Here is the installation manual.

Build LeoFS From Source (For Developers)

Here, we explain how to build LeoFS from source code. First, install the following packages, which are needed to build Erlang and LeoFS.

Build Dependencies

## [CentOS]
$ sudo yum install cmake check-devel gcc gcc-c++ make
## [Ubuntu]
$ sudo apt-get install gcc g++ cmake make check libtool
### For Docker
$ apt-get install lsb-release

Install Erlang

You can install Erlang with kerl.

$ curl -O https://raw.githubusercontent.com/kerl/kerl/master/kerl
$ chmod a+x kerl
$ mkdir -p ~/bin
$ mv kerl ~/bin/
$ echo 'export PATH="$PATH:$HOME/bin"' >> ~/.bashrc
$ source ~/.bashrc
  • Install Erlang (Erlang/OTP 19.3)
$ kerl build 19.3 19.3
$ kerl list builds
19.3,19.3

$ kerl install 19.3 /path/to/19.3
$ kerl list installations
19.3 /path/to/19.3

$ source /path/to/19.3/activate
$ kerl active
The current active installation is:
/path/to/19.3

Install LeoFS

Then, clone the LeoFS source and its libraries from GitHub.

$ git clone https://github.com/leo-project/leofs.git
$ cd leofs
$ git checkout -b develop remotes/origin/develop
$ ./rebar get-deps
$ ./git_checkout.sh develop

Then, build LeoFS with the following commands.

$ make && make release_for_test

Now you can find the LeoFS packages as follows.

$ ls package/
leo_gateway/  leo_manager_0/  leo_manager_1/  leo_storage/  README.md

Then, you can start and access LeoFS with the following commands. You can also operate LeoFS easily with the leofs-adm script.

$ package/leo_manager_0/bin/leo_manager start
$ package/leo_manager_1/bin/leo_manager start
$ package/leo_storage/bin/leo_storage start
$ package/leo_gateway/bin/leo_gateway start
$ ./leofs-adm status
 [System Confiuration]
-----------------------------------+----------
 Item                              | Value
-----------------------------------+----------
 Basic/Consistency level
-----------------------------------+----------
                    system version | 1.3.4
                        cluster Id | leofs_1
                             DC Id | dc_1
                    Total replicas | 1
          number of successes of R | 1
          number of successes of W | 1
          number of successes of D | 1
 number of rack-awareness replicas | 0
                         ring size | 2^128
-----------------------------------+----------
 Multi DC replication settings
-----------------------------------+----------
 [mdcr] max number of joinable DCs | 2
 [mdcr] total replicas per a DC    | 1
 [mdcr] number of successes of R   | 1
 [mdcr] number of successes of W   | 1
 [mdcr] number of successes of D   | 1
-----------------------------------+----------
 Manager RING hash
-----------------------------------+----------
                 current ring-hash |
                previous ring-hash |
-----------------------------------+----------

 [State of Node(s)]
-------+--------------------------+--------------+----------------+----------------+----------------------------
 type  |           node           |    state     |  current ring  |   prev ring    |          updated at
-------+--------------------------+--------------+----------------+----------------+----------------------------
  S    | [email protected]      | attached     |                |                | 2017-06-02 14:59:20 +0900
-------+--------------------------+--------------+----------------+----------------+----------------------------

$ ./leofs-adm start
OK

$ ./leofs-adm status
 [System Confiuration]
-----------------------------------+----------
 Item                              | Value
-----------------------------------+----------
 Basic/Consistency level
-----------------------------------+----------
                    system version | 1.3.4
                        cluster Id | leofs_1
                             DC Id | dc_1
                    Total replicas | 1
          number of successes of R | 1
          number of successes of W | 1
          number of successes of D | 1
 number of rack-awareness replicas | 0
                         ring size | 2^128
-----------------------------------+----------
 Multi DC replication settings
-----------------------------------+----------
 [mdcr] max number of joinable DCs | 2
 [mdcr] total replicas per a DC    | 1
 [mdcr] number of successes of R   | 1
 [mdcr] number of successes of W   | 1
 [mdcr] number of successes of D   | 1
-----------------------------------+----------
 Manager RING hash
-----------------------------------+----------
                 current ring-hash | 433fe365
                previous ring-hash | 433fe365
-----------------------------------+----------

 [State of Node(s)]
-------+--------------------------+--------------+----------------+----------------+----------------------------
 type  |           node           |    state     |  current ring  |   prev ring    |          updated at
-------+--------------------------+--------------+----------------+----------------+----------------------------
  S    | [email protected]      | running      | 433fe365       | 433fe365       | 2017-06-02 15:00:10 +0900
  G    | [email protected]      | running      | 433fe365       | 433fe365       | 2017-06-02 15:00:12 +0900
-------+--------------------------+--------------+----------------+----------------+----------------------------
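
When scripting around leofs-adm, it can help to turn the status output into a dictionary, for example to confirm that the current and previous ring hashes agree once the system has started. The following is a minimal sketch that assumes the `key | value` layout shown in the transcript; `parse_system_config` is a hypothetical helper, not part of LeoFS, and the sample text is abbreviated from the output above.

```python
def parse_system_config(status_text):
    """Return the `key | value` rows printed before the node table."""
    config = {}
    for line in status_text.splitlines():
        if "[State of Node(s)]" in line:
            break  # the node table has a different layout; stop here
        if "|" not in line:
            continue  # headings, blank lines, and +---- separator rows
        key, _, value = line.partition("|")
        key, value = key.strip(), value.strip()
        if key != "Item":  # skip the column header row
            config[key] = value
    return config


# Abbreviated sample copied from the status output (spelling kept verbatim).
SAMPLE = """\
 [System Confiuration]
-----------------------------------+----------
 Item                              | Value
-----------------------------------+----------
                    Total replicas | 1
-----------------------------------+----------
                 current ring-hash | 433fe365
                previous ring-hash | 433fe365
-----------------------------------+----------

 [State of Node(s)]
 type  |           node           |    state
"""

config = parse_system_config(SAMPLE)
print(config["current ring-hash"] == config["previous ring-hash"])  # True
```

Before leofs-adm start is run, the ring-hash rows are empty in the output, so the same helper would return empty strings for them.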

Build a LeoFS Cluster

You can easily build a LeoFS cluster. See here.

Configure LeoFS

For the configuration of LeoFS, see here.

Benchmarking

You can benchmark LeoFS with Basho Bench; here is the documentation on benchmarking LeoFS.

Integration Test

You can use leofs_test to check whether LeoFS has any issues before installing it in your dev/staging/production environment(s).

Milestones

Version 1

  • DONE - v1.0
    • Multi Data Center Replication
    • Increase compatibility with S3 APIs #5
      • Other bucket operations
  • DONE - v1.1
    • NFS v3 Support (alpha)
    • Improve Web GUI Console (Option)
  • DONE - v1.2
    • NFS v3 Support (beta)
    • Watchdog
    • Automated data-compaction
  • DONE - v1.3
    • NFS v3 Support (stable)
    • Improve compatibility with S3 APIs #6
  • DONE - v1.4
    • Improvement of the core features
    • Integration with distributed computing frameworks #1
      • Hadoop integration
      • Spark integration

Version 2

  • WIP - v2.0
    • Erasure Code
    • Improve Data Security for GDPR and Enterprise Storages
    • Improve compatibility with S3 APIs #7
    • NFS v3 Support (stable)
      • Improve performance of the list objects, the ls command
    • Improvement of the Multi Data Center Replication
    • Searching objects by a custom-metadata
  • v2.1
    • Hinted Hand-off
    • Improve compatibility with S3 APIs #8
      • Object expiration in a bucket
      • Object Versioning
    • Kubernetes Persistent Volumes Support
    • Integration with distributed computing frameworks #2
      • Hadoop integration
      • Spark integration
    • Improve Web GUI console, LeoFS Center (option)
  • v2.2
    • Data Deduplication
    • Improve compatibility with S3 APIs #9

Versioning Policy

LeoFS has adhered to the versioning policy since v1.3.3.

Licensing

LeoFS is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.



leofs's Issues

Improve re-launch storage process

  • When the storage process is stopped normally, it writes its current status to a file. On restart, the storage process reads that file to restore its status.
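
The idea in this issue can be sketched as a pair of save/load helpers. This is an illustration only, written in Python rather than Erlang; the function names, the JSON format, and the shape of the state dictionary are all assumptions and do not reflect how LeoFS stores its status.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write the state next to `path` first, then rename it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename: no half-written file is left at `path`


def load_state(path, default=None):
    """Read the saved state, or return `default` if no file exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default


with tempfile.TemporaryDirectory() as workdir:
    state_file = os.path.join(workdir, "storage_state.json")  # hypothetical file name
    save_state(state_file, {"status": "stopped"})   # on a normal stop
    print(load_state(state_file))                   # on the next start
```

Writing to a temporary file and renaming it means a crash during shutdown leaves either the old file or the new one, never a truncated one.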

Make compaction control (suspend/resume) finer-grained

  • ASIS: per AVS (a raw file on the OS filesystem)
  • TOBE: per time unit (e.g. 5 sec, 1 min, ...)
  • TODO
    • To prevent gen_server from blocking, compaction must run in another process.
    • To make the time unit controllable, the compaction process must invoke receive regularly.
  • Files to be modified
    • leo_object_storage_server
    • leo_compaction_manager_fsm
    • leo_object_storage_api
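
The TODO items above boil down to doing the work in small slices and checking for control messages between slices. The sketch below shows that pattern in Python rather than Erlang: a non-blocking queue poll stands in for a `receive` with a zero timeout, and a "chunk" stands in for a time slice. The names `run_compaction`, `compact_chunk`, and the `"suspend"` message are hypothetical and are not taken from the LeoFS modules listed above.

```python
import queue


def run_compaction(chunks, control, compact_chunk):
    """Compact `chunks` one at a time, polling `control` between chunks.

    Returns the chunks that were not compacted (empty when finished), so the
    caller can pass them back in later to resume.
    """
    pending = list(chunks)
    while pending:
        try:
            message = control.get_nowait()  # non-blocking poll for a control message
        except queue.Empty:
            message = None
        if message == "suspend":
            return pending  # stop between chunks; nothing is left half-done
        compact_chunk(pending.pop(0))
    return pending


control = queue.Queue()
compacted = []


def compact_chunk(chunk):
    compacted.append(chunk)
    if chunk == 2:
        control.put("suspend")  # simulate an operator suspending mid-run


remaining = run_compaction([1, 2, 3, 4], control, compact_chunk)
print(compacted, remaining)  # [1, 2] [3, 4]

remaining = run_compaction(remaining, control, compact_chunk)  # resume
print(compacted, remaining)  # [1, 2, 3, 4] []
```

The smaller each slice of work is, the sooner a suspend request takes effect, which is the finer granularity this issue asks for.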

Improve compaction mechanism

  • Able to compact data gradually
    • The target raw file(s) for data compaction can be specified
  • Able to suspend/resume compaction
    • Need to implement the compact status, compact suspend, compact resume, and compact start commands in LeoFS-Manager

build error following quick start

I am writing an installer for my dev env using information found on the quickstart guide.

#!/bin/bash
#http://www.leofs.org/docs/getting_started.html
#http://www.erlang.org/download_release/16
#https://github.com/leo-project/leofs

sudo apt-get install libncurses5-dev
wget http://www.erlang.org/download/otp_src_R14B04.tar.gz
tar xzf otp_src_R14B04.tar.gz
cd otp_src_R14B04
./configure --prefix=/usr/local/erlang/R14B04 --enable-smp-support --enable-m64-build --enable-halfword-emulator --enable-kernel-poll --without-javac --disable-native-libs --disable-hipe --disable-sctp --enable-threads
make
sudo make install

git clone https://github.com/leo-project/leofs.git
cd leofs
make
make release

The second-to-last line of the script (make) produces this output:

==> leo_commons (get-deps)
==> proper (get-deps)
==> jiffy (get-deps)
==> leo_logger (get-deps)
WARNING: deprecated port_envs option used
Option 'port_envs' has been deprecated
in favor of 'port_env'.
'port_envs' will be removed soon.

==> meck (get-deps)
WARNING: deprecated port_envs option used
Option 'port_envs' has been deprecated
in favor of 'port_env'.
'port_envs' will be removed soon.

==> bitcask (get-deps)
WARNING: deprecated port_envs option used
Option 'port_envs' has been deprecated
in favor of 'port_env'.
'port_envs' will be removed soon.

==> eleveldb (get-deps)
==> leo_backend_db (get-deps)
==> leo_object_storage (get-deps)
==> leo_mq (get-deps)
==> leo_redundant_manager (get-deps)
==> bear (get-deps)
==> folsom (get-deps)
==> leo_statistics (get-deps)
==> leo_s3_libs (get-deps)
==> leo_manager (get-deps)
==> lz4 (get-deps)
==> leo_ordning_reda (get-deps)
==> leo_storage (get-deps)
WARNING: deprecated port_envs option used
Option 'port_envs' has been deprecated
in favor of 'port_env'.
'port_envs' will be removed soon.

==> cherly (get-deps)
==> ecache (get-deps)
==> cowboy (get-deps)
==> leo_gateway (get-deps)
==> rel (get-deps)
==> leofs (get-deps)
==> leo_commons (compile)
ERROR: OTP release R15B01 does not match required regex R14B04|R15B02|R15B03
ERROR: compile failed while processing /home/paul/leofs/deps/leo_commons: rebar_abort
make: *** [compile] Error 1
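
The failure is a version check: the release string of the Erlang found at build time (R15B01) does not match the required pattern, even though the script installs R14B04. The check can be reproduced in isolation with the small sketch below. The pattern is copied from the error message; the helper name and the use of an unanchored regex search are assumptions about how such a check might behave, not the build tool's actual code.

```python
import re

# Pattern copied verbatim from the error message above.
REQUIRED_OTP_PATTERN = r"R14B04|R15B02|R15B03"


def otp_release_ok(release, pattern=REQUIRED_OTP_PATTERN):
    """Return True if the OTP release string matches the required pattern."""
    return re.search(pattern, release) is not None


for release in ("R14B04", "R15B01", "R15B02"):
    print(release, otp_release_ok(release))
```

Only R15B01 is rejected here, which is consistent with the build picking up a different Erlang installation than the one compiled by the script.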

basho_bench

Could you detail a bit how you got the basho_bench driver working?

  • ibrowse is up and running
  • basho_bench http raw driver works

but the leofs driver gives me:

10:21:59.096 [debug] Supervisor folsom_sup started folsom_sample_slide_sup:start_link() at pid <0.70.0>
10:21:59.097 [debug] Supervisor folsom_sup started folsom_meter_timer_server:start_link() at pid <0.71.0>
10:21:59.098 [debug] Supervisor folsom_sup started folsom_metrics_histogram_ets:start_link() at pid <0.72.0>
10:21:59.098 [info] Application folsom started on node nonode@nohost
10:21:59.111 [debug] Supervisor basho_bench_sup started basho_bench_stats:start_link() at pid <0.65.0>
10:21:59.114 [debug] ID 1 generating range 0 to 20833
10:21:59.209 [error] Failed to initialize driver basho_bench_driver_leofs: {'EXIT',{undef,[{basho_bench_driver_leofs,new,[1],[]},{basho_bench_worker,worker_idle_loop,1,[{file,"src/basho_bench_worker.erl"},{line,201}]}]}}

Gateway always responds with 403

Can't get a simple developer model deployment to run. Everything comes up, but it spits out a 403 Forbidden when I access it using s3cmd or DragonFS. Are there any logs I can look at to figure out why it's failing? I'm new to Erlang, but have loads of dev experience otherwise.

Support for "s3cmd (s3-client/tool)"

  • Results:
    New settings:
      Access Key: 05236
      Secret Key: 802562235
      Encryption password: 
      Path to GPG program: /usr/bin/gpg
      Use HTTPS protocol: False
      HTTP Proxy server name: 
      HTTP Proxy server port: 0

    Test access with supplied credentials? [Y/n] Y
    Please wait, attempting to list all buckets...
    ERROR: Test failed: no element found: line 1, column 0

Benchmark docs unclear

A bit unsure on how to explain these things so I'll put the few issues we discussed here:

  • The bucket must be created before the tests are run, using the special test user (the command and key should be given directly in the documentation)
  • The bucket name cannot contain _ (underscore), so the "test_bucket" provided as an example simply doesn't work, and it's not obvious from this page alone that it won't
  • The benchmark file given as an example at the end doesn't work without extra configuration; you have to edit the file to set the right port and bucket name first
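
The underscore restriction is easy to check before running a benchmark. The sketch below covers only the rule reported in this issue; `bucket_name_ok` is a hypothetical helper, "test-bucket" is just an illustrative alternative name, and other naming rules may apply that are not checked here.

```python
def bucket_name_ok(name):
    """Return False for bucket names containing an underscore (the rule reported above)."""
    return "_" not in name


print(bucket_name_ok("test_bucket"))  # False
print(bucket_name_ok("test-bucket"))  # True
```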
