CannyFS

CannyFS is a shim file system: it mirrors an existing file system. It is implemented using FUSE, and the source code is a heavily modified fork of the libfuse (https://github.com/libfuse/libfuse) example passthrough_fh.c (https://github.com/libfuse/libfuse/blob/master/example/passthrough_fh.c), which used to be called fusexmp_fh.c.

What makes CannyFS different is that almost every write operation is reported back to the caller as completed right away, while the operation itself is carried out asynchronously. The calling process can therefore proceed with useful work instead of waiting. This matters most if your I/O subsystem has high latency (because it is heavily loaded or reached over a network), and/or if your process issues a very high number of I/O requests, with a lot of flushing or writing to different files. Typical examples are tasks that walk complete directory trees, touching or writing to every file within them.
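
The "acknowledge now, write later" idea can be illustrated with a small shell sketch. This is a conceptual analogy only, not how CannyFS works internally (CannyFS is C++ on top of FUSE). The helpers async_write and drain are hypothetical names made up for this example, and the sleep merely stands in for a slow backing store.

```shell
#!/usr/bin/env bash
# Conceptual sketch only. async_write and drain are hypothetical helpers,
# they are not part of CannyFS.

pids=""

# Report success immediately and perform the actual write in the background.
async_write() {
    local target=$1 data=$2
    # The sleep simulates a high-latency backing file system.
    ( sleep 0.2; printf '%s\n' "$data" > "$target" ) &
    pids="$pids $!"
    return 0   # the caller is told "done" before the write has happened
}

# Wait for every deferred write and count the ones that failed.
drain() {
    local failures=0 pid
    for pid in $pids; do
        wait "$pid" || failures=$((failures + 1))
    done
    pids=""
    return "$failures"
}

workdir=$(mktemp -d)
for i in 1 2 3; do
    async_write "$workdir/file$i" "payload $i"   # returns at once
done
echo "caller is free to do other work here"
if drain; then
    echo "all deferred writes completed"
else
    echo "some deferred writes failed" >&2
fi
```

Note that a failed write only becomes visible in drain, long after async_write has reported success. That is the trade-off behind checking for errors once the work is done.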

Preprint paper (not currently submitted anywhere) with some benchmarks: https://arxiv.org/abs/1612.06830

Intended usage mode

  1. Mount a directory with CannyFS, in non-daemon (foreground) mode so that error messages written to stderr are easy to see.
  2. Do your work.
  3. Kill the CannyFS process.
  4. Check that the CannyFS process exited gracefully and reported no errors.
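
Steps 3 and 4 can be scripted along the following lines. This is a sketch under assumptions: stop_and_check is a hypothetical helper, not part of CannyFS, and it assumes that stderr was redirected to a log file at mount time and that a non-empty log means something needs attention. The exact messages CannyFS prints are not specified here, so the log should be read rather than parsed.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of CannyFS: stop a background process and
# report whether it wrote anything to its stderr log.
stop_and_check() {
    local pid=$1 errlog=$2
    kill "$pid" 2>/dev/null || true   # step 3: ask the process to terminate
    wait "$pid" 2>/dev/null || true   # block until it has actually exited
    if [ -s "$errlog" ]; then         # step 4: assumed convention, non-empty log = trouble
        echo "messages were written to stderr, inspect $errlog" >&2
        return 1
    fi
    return 0
}

# Intended use (commented out because it needs cannyfs and a mount point):
#   cannyfs -f -omodules=subdir,subdir=$HOME mountpoint 2> cannyfs.err &
#   canny_pid=$!
#   ... do your work ...
#   stop_and_check "$canny_pid" cannyfs.err
```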

Compiling CannyFS

The packages tbb, boost, and fuse are needed in addition to what a typical Linux distribution installs by default. On Ubuntu 16.04, for example, this can be enough to get you going:

sudo apt-get install libtbb-dev libfuse-dev libboost-dev libboost-filesystem-dev libboost-system-dev

Then compile with g++ 5.x at the very least; 6.x is highly recommended (C++14 support is needed).

g++ cannyfs.cpp -std=c++14 -O3 -lfuse -ltbb -lpthread -lboost_filesystem -lboost_system -D_FILE_OFFSET_BITS=64 -o cannyfs

Example script

The following script creates a mount that mirrors your home directory, with settings suitable for a Linux system (where default pipe buffers are typically 65536 bytes long). The zip file archive.zip contains a large number of small files and therefore takes a long time to extract, especially over an NFS or CIFS mount. With the target mounted through CannyFS, unzip can enqueue I/O operations to several target files instead of blocking on the completion of each file in turn.

cannyfs needs to be in your PATH. If the binary is in the current directory instead, change both the command and the kill jobspec to ./cannyfs:

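#!/bin/bash
mkdir mountpoint
# -f keeps cannyfs in the foreground so its messages stay visible on stderr
cannyfs -f -o big_writes -o max_write=65536 -omodules=subdir,subdir=$HOME mountpoint &
# Linux-specific way of checking for mountpoint, arbitrary sleep generally works fine as well
until mountpoint -q mountpoint; do sleep 1; done
cd mountpoint
unzip "$HOME/archive.zip"
# Leave the mount before stopping CannyFS, otherwise rmdir looks for the directory inside it
cd ..
kill %cannyfs
# Wait for the CannyFS process to exit before removing the directory
wait
rmdir mountpoint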

You can also use the handy cannywrapper.sh script, assuming cannyfs is in your PATH. It wraps a single shell command in a CannyFS jail, where all file system I/O is tunneled through CannyFS. This is less than ideal, since some operations on special file systems are not compatible, but it is fine for common I/O workhorse tools such as rsync.

Contributors

cnettel, samuell, peralonso

cannyfs's Issues

Allow fifos, character files, ...

If you use cannywrapper.sh, the wrapping of e.g. /dev and /proc messes up a lot of tools. Allowing simple pass-through of character files that do not change should be simple enough.

It's a bit more messy when filesystem metadata changes.

It is even more messy to allow creation of fifos in cannied directories. ... ...

cachemissing required for proper operations (and related cases)

Some flags, eagerness flags among them, only work correctly when other flags are set as well. For example, eager rmdir and eager unlink require inaccuratestat and cachemissing to be turned on. Otherwise, getattr will return the true stat information from the backing file system. Since libfuse itself issues a getattr immediately after a delete, this is disastrous: libfuse then caches the existence of the file, causing e.g. later create operations on the same name to fail.
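
Until such dependencies are enforced by CannyFS itself, a wrapper script could check them before mounting. The sketch below is hypothetical and not part of CannyFS. The spellings eagerrmdir and eagerunlink are assumed for illustration; only inaccuratestat and cachemissing are named as written above, and how the flags are actually passed to cannyfs is left out.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check, not part of CannyFS.
# Flag spellings "eagerrmdir" and "eagerunlink" are assumptions.

# Succeeds if the first argument appears among the remaining ones.
has_flag() {
    local wanted=$1 flag
    shift
    for flag in "$@"; do
        [ "$flag" = "$wanted" ] && return 0
    done
    return 1
}

# Takes the list of enabled flags and reports every missing prerequisite.
check_flags() {
    local flag prereq rc=0
    for flag in eagerrmdir eagerunlink; do
        has_flag "$flag" "$@" || continue
        for prereq in inaccuratestat cachemissing; do
            if ! has_flag "$prereq" "$@"; then
                echo "$flag requires $prereq" >&2
                rc=1
            fi
        done
    done
    return "$rc"
}
```

A call such as check_flags eagerunlink inaccuratestat cachemissing succeeds, while check_flags eagerrmdir cachemissing fails and names the missing prerequisite on stderr.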
