tifs-hashfs's Introduction

HashFs based on TiKV (derived from TiFS)

This repo contains my experimental implementation of a POSIX (FUSE) based distributed filesystem that automatically deduplicates data by addressing data blocks by their hashes.

It contains executables for

  1. direct connection to the TiKV cluster
  2. a gRPC client and server, to avoid many long round trips when operating over a slow internet connection.

Features:

  • deduplication by using hashed blocks
  • reference counting on the hashed blocks
  • access to the internally computed hashes through special, automatically listed hash files
  • vectored upload and download of blocks to speed up transfers
  • very basic snapshot mechanism (no write-protection yet)
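
To illustrate the first two features, here is a minimal in-memory sketch of a block store that deduplicates by hash and tracks reference counts. It is not the code used in this repo: the names BlockStore, put and release are hypothetical, the store lives in a HashMap rather than in TiKV, and DefaultHasher is only a dependency-free stand-in for a real cryptographic digest.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a cryptographic digest. DefaultHasher is NOT collision-resistant
/// and is used here only to keep the sketch free of external crates.
fn block_hash(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory block store: hash -> (payload, reference count).
#[derive(Default)]
struct BlockStore {
    blocks: HashMap<u64, (Vec<u8>, u64)>,
}

impl BlockStore {
    /// Store a block, or only bump its reference count if the payload is already known.
    fn put(&mut self, data: &[u8]) -> u64 {
        let key = block_hash(data);
        let entry = self.blocks.entry(key).or_insert_with(|| (data.to_vec(), 0));
        entry.1 += 1;
        key
    }

    /// Drop one reference; the payload is removed once the count reaches zero.
    fn release(&mut self, key: u64) {
        let now_unused = match self.blocks.get_mut(&key) {
            Some(entry) => {
                entry.1 -= 1;
                entry.1 == 0
            }
            None => false,
        };
        if now_unused {
            self.blocks.remove(&key);
        }
    }

    /// Number of distinct payloads currently stored.
    fn unique_blocks(&self) -> usize {
        self.blocks.len()
    }
}
```

Writing the same payload twice stores it once and yields a count of two. Every write therefore needs a lookup and a counter update, which is the kind of bookkeeping that can slow writes down.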

Experiences:

  • Despite some optimizations, block reference counting adds overhead that significantly slows down write operations. In my tests I could not get much more than 20 Mbit/s over Gigabit Ethernet (LAN).

Credits to the TiFS project

TiFS is a distributed POSIX filesystem based on TiKV, with partition tolerance and strict consistency. The remainder of this README is taken unchanged from TiFS.

Environment

Build

  • Linux: libfuse and build-essential are required. On Ubuntu/Debian:
sudo apt install -y libfuse-dev libfuse3-dev build-essential
  • macOS
brew install --cask osxfuse

Runtime

  • Linux: fuse3 and openssl are required. On Ubuntu/Debian:
sudo apt-get install -y libfuse3-dev fuse3 libssl-dev
  • macOS
brew install --cask osxfuse

On Catalina or earlier versions, you need to load osxfuse into the kernel:

/Library/Filesystems/osxfuse.fs/Contents/Resources/load_osxfuse

Installation

Container

You can use the image on Docker Hub or build it from the Dockerfile.

Binary (linux-amd64 or darwin-amd64)

mkdir tmp
cd tmp
wget https://github.com/Hexilee/tifs/releases/download/v0.2.1/tifs-linux-amd64.tar.gz
tar -xvf tifs-linux-amd64.tar.gz
sudo ./install.sh

install.sh may fail on macOS Catalina or Big Sur because of SIP (System Integrity Protection).

In that case, you can use target/release/tifs directly to mount tifs.

Example

target/release/tifs tifs:127.0.0.1:2379 ~/mnt

Source code

git clone https://github.com/Hexilee/tifs.git
cd tifs
sudo make install

Usage

You need a TiKV cluster to run tifs. tiup is a convenient way to deploy one: install it and run tiup playground.

Container

docker run -d --device /dev/fuse \
    --cap-add SYS_ADMIN \
    -v <mount point>:/mnt:shared \
    hexilee/tifs:0.2.2 --mount-point /mnt --pd-endpoints <endpoints>

TLS

You need ca.crt, client.crt and client.key to access a TiKV cluster over TLS.

Self-signed certificates can be generated conveniently with sign-cert.sh (based on easy-rsa).

Place them in a directory and run the following docker command.

docker run -d --device /dev/fuse \
    --cap-add SYS_ADMIN \
    -v <cert dir>:/root/.tifs/tls \
    -v <mount point>:/mnt:shared \
    hexilee/tifs:0.3.1 --mount-point /mnt --pd-endpoints <endpoints>

Binary

mkdir <mount point>
mount -t tifs tifs:<pd endpoints> <mount point>
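
The mount source has the form tifs:<pd endpoints>. As a rough sketch of how such a string could be split, the hypothetical helper below strips the tifs: prefix and returns the endpoints. It assumes that multiple endpoints are comma-separated, which this README does not state, and it is not taken from the tifs code base.

```rust
/// Split a mount source such as "tifs:127.0.0.1:2379" into PD endpoints.
/// Assumption: multiple endpoints are separated by commas.
fn parse_source(source: &str) -> Option<Vec<String>> {
    // Reject anything that does not carry the "tifs:" scheme prefix.
    let rest = source.strip_prefix("tifs:")?;
    let endpoints: Vec<String> = rest
        .split(',')
        .map(str::trim)
        .filter(|endpoint| !endpoint.is_empty())
        .map(String::from)
        .collect();
    if endpoints.is_empty() {
        None
    } else {
        Some(endpoints)
    }
}
```

With this helper, the source from the example above, tifs:127.0.0.1:2379, yields the single endpoint 127.0.0.1:2379.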

TLS

mount -t tifs -o tls=<tls config file> tifs:<pd endpoints> <mount point>

By default, the TLS config is expected at ~/.tifs/tls.toml; refer to tls.toml for detailed configuration.

Other Custom Mount Options

direct_io

Enable global direct I/O to bypass the page cache.

mount -t tifs -o direct_io tifs:<pd endpoints> <mount point>

blksize

The block size, 64KiB by default. Human-readable values are accepted.

mount -t tifs -o blksize=512 tifs:<pd endpoints> <mount point>
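
The block size determines how many blocks a single read or write touches. The sketch below computes the block indices covered by a byte range. The function name touched_blocks is hypothetical and the arithmetic is the generic fixed-size-block calculation, not code from this repo.

```rust
use std::ops::RangeInclusive;

/// Default block size stated in this README (64 KiB).
const DEFAULT_BLKSIZE: u64 = 64 * 1024;

/// Indices of the blocks touched by the byte range [offset, offset + len).
/// Returns None for an empty range, a zero block size, or on overflow.
fn touched_blocks(offset: u64, len: u64, blksize: u64) -> Option<RangeInclusive<u64>> {
    if len == 0 || blksize == 0 {
        return None;
    }
    let first = offset / blksize;
    // Last byte of the range is offset + len - 1.
    let last = offset.checked_add(len - 1)? / blksize;
    Some(first..=last)
}
```

For example, a 100-byte write at offset 1000 with blksize=512 spans blocks 1 and 2, while the same write with the default block size stays within block 0. Smaller blocks mean more blocks, and more per-block bookkeeping, for each operation.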

maxsize

The quota for the filesystem capacity. Human-readable values are accepted.

mount -t tifs -o maxsize=1GiB tifs:<pd endpoints> <mount point>
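
Both blksize and maxsize accept human-readable values such as 512, 64KiB or 1GiB. The following sketch shows one way such values could be converted to bytes. The function parse_size is hypothetical, and it assumes that only plain integers and the binary suffixes KiB, MiB, GiB and TiB are accepted; the parser actually used by tifs may accept more forms.

```rust
/// Parse sizes such as "512", "64KiB" or "1GiB" into a number of bytes.
/// Assumption: only plain integers and binary suffixes are supported here.
fn parse_size(text: &str) -> Option<u64> {
    let text = text.trim();
    // Split the input at the first character that is not an ASCII digit.
    let digits_end = text
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(text.len());
    let (number, suffix) = text.split_at(digits_end);
    let value: u64 = number.parse().ok()?;
    let factor: u64 = match suffix.trim() {
        "" | "B" => 1,
        "KiB" => 1 << 10,
        "MiB" => 1 << 20,
        "GiB" => 1 << 30,
        "TiB" => 1 << 40,
        _ => return None,
    };
    // checked_mul guards against overflow for very large inputs.
    value.checked_mul(factor)
}
```

Under these assumptions, maxsize=1GiB corresponds to 1073741824 bytes and the default block size of 64KiB to 65536 bytes.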

Development

cargo build
mkdir ~/mnt
RUST_LOG=debug target/debug/tifs --mount-point ~/mnt

Then you can open another shell and play with tifs in ~/mnt.

You may need to enable user_allow_other in /etc/fuse.conf.

For development on FreeBSD, make sure the following dependencies are installed:

pkg install llvm protobuf pkgconf fusefs-libs3 cmake

For now, user_allow_other and auto unmount do not work on FreeBSD; you need to run as root and unmount manually.

Contribution

Design

Please refer to design.md.

FUSE

There is little documentation about FUSE; refer to the example for the meaning of the FUSE API.

Deploy TiKV

Please refer to the tikv-deploy.md.

TODO

  • FUSE API

    • init
    • lookup
    • getattr
    • setattr
    • readlink
    • readdir
    • open
    • release
    • read
    • write
    • mkdir
    • rmdir
    • mknod
    • lseek
    • unlink
    • symlink
    • rename
    • link
    • statfs
    • create
    • fallocate
    • getlk
    • setlk
  • Testing and Benchmarking

    • pjdfstest
    • fio
  • Real-world usage

    • vim
    • emacs
    • git
    • gcc
    • rustc
    • cargo build
    • npm install
    • sqlite
    • tikv on tifs
    • client runs on FreeBSD: simple case works

tifs-hashfs's People

Contributors

abracax, cre4ture, dinoallo, hexilee, tespent, yzydavid