
fusecompress's Introduction

Hi there 👋


I'm Hexxellor. I love coding, SHMUPs, cryptography, and civil rights. Not necessarily in that order.

fusecompress's People

Contributors

uli


fusecompress's Issues

possible data corruption on read after direct write

Found while investigating issue #24: direct_decompress() should fall back 
to decompression when reading from a file in write mode, but will not do 
so if cache_skipped is enabled. This may cause severe data corruption 
because it resets the offset at which direct_compress() will write next. 
The good news is that this is easy to fix.

Original issue reported on code.google.com by [email protected] on 5 Sep 2008 at 1:23

compile problem with lzma-4.999.5alpha

Hi

I downloaded your fusecompress 0.9.x source from Google Code and have a problem
compiling it. "make release" complains:

compress_lzma.o: In function `lzmaCompress':
/usr/src/fusecompress-read-only/compress_lzma.c:53: undefined reference to
`lzma_easy_encoder_single'
compress_lzma.o: In function `lzmaOpen':
/usr/src/fusecompress-read-only/compress_lzma.c:220: undefined reference to
`lzma_easy_encoder_single'
collect2: ld returned 1 exit status
make: *** [fusecompress] Error 1

Could you take a look?

Thanks
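
For reference, the undefined symbol suggests that lzma_easy_encoder_single() was an alpha-only API that later liblzma releases dropped. A possible workaround (a sketch against the stable liblzma API, not necessarily the project's actual fix; the buffer names stand in for whatever compress_lzma.c already uses) is the one-shot buffer encoder:

#include <lzma.h>
#include <stdint.h>

/* compress one buffer in a single call; out_size should be at least
 * lzma_stream_buffer_bound(in_size) so the call cannot run out of space */
static int lzma_compress_buffer(const uint8_t *in, size_t in_size,
                                uint8_t *out, size_t out_size,
                                size_t *out_used)
{
    *out_used = 0;
    lzma_ret ret = lzma_easy_buffer_encode(LZMA_PRESET_DEFAULT,
                                           LZMA_CHECK_CRC32, NULL,
                                           in, in_size,
                                           out, out_used, out_size);
    return ret == LZMA_OK ? 0 : -1;
}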

Original issue reported on code.google.com by [email protected] on 9 Nov 2008 at 11:14

seek to end triggers decompression

Some programs open a file and then seek to the end to determine the file
size. If you do that on a file whose size is not a multiple of the page
size, FUSE will do a read(!) on the last few bytes. There is no way to
determine whether this is a valid read or just FUSE being stupid, so we
have to decompress the entire file, which takes quite some time for, say,
DVD images...

(If you want to work around this in your program, use stat() instead, which
is probably a good idea anyway.)
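
A minimal sketch of the stat() approach (plain POSIX, nothing fusecompress-specific assumed):

#include <sys/stat.h>
#include <sys/types.h>

/* get the size without open()+lseek(SEEK_END), so FUSE never issues the
 * spurious read of the last partial page */
static off_t file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : (off_t) -1;
}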

Original issue reported on code.google.com by [email protected] on 31 Aug 2008 at 8:59

change options at runtime

There should be a way to change options, such as the default compressor, 
the cache size, or excluded file types at runtime.

Original issue reported on code.google.com by [email protected] on 3 May 2009 at 2:49

Filesystem crashes when writing large files via samba share

What steps will reproduce the problem?
1. Create a Samba share with a fusecompress filesystem on it
2. Write a large (Many Gigs) file to it via samba over the network
3.

What is the expected outcome? What happens instead?
It should write and compress the file. Instead, the filesystem crashes.

What FuseCompress version are you using? What distribution? What kernel?
What FUSE release?
after 0.9.1.1 - svn pull
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
2.6.18-128.1.10.el5
2.7.3


Please provide any additional information below.
[root@valvcsnas001vm ~]# ./fusecompress_mount.sh
fusecompress: ERR direct_decompress 565: short read in compressed file
valvcshd001vm/valvcshd001vm/20090611_12358/1-unmount.dat (probably corrupt)
fusecompress: ERR direct_decompress 565: short read in compressed file
valvcshd001vm/valvcshd001vm/20090611_12358/1-unmount.dat (probably corrupt)
./fusecompress_mount.sh: line 10: 13023 Segmentation fault     
/opt/fusecompress/bin/fusecompress -o gz,nodetach
/storage/midev_vm_backend/ /storage/midev/vm/


Original issue reported on code.google.com by [email protected] on 11 Jun 2009 at 5:08

direct_rename() assertion failure

direct_rename() tries to reroute all pointers to the original file object to
the new one. Subsequently it asserts that there are no more accesses to the
original file. When running dbench I once saw that assertion fail. I have
not managed to reproduce this since then. May be a locking issue.

Original issue reported on code.google.com by [email protected] on 5 Aug 2008 at 4:14

rsync defies exclusion list

rsync creates temporary files with mangled names while transferring data, which 
it then renames after completion. Example:

.eglibc-2_10.tar.bz2.Yg4BWG -> eglibc-2_10.tar.bz2

These temp files fail the exclusion test in fusecompress and are directly 
compressed, wasting a large amount of CPU cycles. Since rsync is such a common 
tool, it might be worthwhile to add a special workaround. Or perhaps matching 
for the extension within the filename is OK.
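
A minimal sketch of matching an excluded extension anywhere in the name (the helper and its name are made up for illustration, not the existing exclusion code):

#include <string.h>

/* with substring matching, ".eglibc-2_10.tar.bz2.Yg4BWG" is caught by the
 * ".bz2" entry even though the mangled temp name does not end in it */
static int name_is_excluded(const char *name, const char *const *exts)
{
    for (; *exts; exts++)
        if (strstr(name, *exts))
            return 1;
    return 0;
}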

Original issue reported on code.google.com by [email protected] on 2 Nov 2011 at 3:29

intelligent corruption handling in fsck

fsck currently only offers two options: delete or do nothing. It should be
able to create a valid (compressed or uncompressed) file from the
recoverable remains of the old one, if there are any.

Original issue reported on code.google.com by [email protected] on 11 Aug 2008 at 1:37

stupid apps can DOS filesystem in case of corrupt files

Some fuckwad apps (let's call them Shwoogle Pisspot, for lack of a more 
spiteful name) ignore even the most hardcore errors and simply keep 
rambling on even if you clearly told them that there is a fucking I/O 
error! While this by itself would be OK if it only created a lot of CPU 
load, it seems that this behavior blocks all other processes from 
accessing files. Even killing the offending process won't help, it stays 
in D state, and there is still no way to access any files anymore. Only 
killing fusecompress and remounting the filesystem helps.

Original issue reported on code.google.com by [email protected] on 3 Sep 2008 at 10:51

rm -fr breaks (due to temp files?)

Once in a while rm -fr on a directory throws an error saying "Directory not
empty". The problem goes away after doing operations on a number of other
files in the filesystem. It might be due to temp files that have not been
removed yet.

Original issue reported on code.google.com by [email protected] on 27 Nov 2008 at 10:39

Don't do setpriority() or make nice level configurable.

I don't like the way fusecompress sets its own nice level to 10; I've been patching this 
down to +1 for a while now and think that it should either be configurable 
(--enable-autonice, --with-nice=, or something) or simply stripped out completely.
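
A sketch of what a configurable nice level might look like (the option and the variables here are hypothetical, not existing fusecompress switches):

#include <stdio.h>
#include <sys/resource.h>

/* only renice when the user asked for it, e.g. via a hypothetical
 * "-o nice=1" option; with no option given, leave the priority alone */
static void apply_nice_level(int level_given, int level)
{
    if (level_given && setpriority(PRIO_PROCESS, 0, level) == -1)
        perror("setpriority");
}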

Original issue reported on code.google.com by sverd.johnsen on 24 Jun 2011 at 11:24

fsck "quick check" option

fsck reads all files in full to determine their integrity, which takes an
awfully long time. There ought to be a "quick check" option that only
removes stale temp files.

Original issue reported on code.google.com by [email protected] on 12 Aug 2008 at 5:07

compression modules: redundant code

All compression modules implement their own compress/decompress methods. It
would be possible to replace them with a single generic implementation
using the compression module's open/read/write/close methods.
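
A rough sketch of such a generic routine, assuming each module exposes open/write/close hooks (the struct and its fields are invented for illustration; the real module interface may differ):

#include <stdio.h>

/* hypothetical per-module interface */
struct compressor {
    void *(*open)(FILE *fp, const char *mode);
    int   (*write)(void *handle, const void *buf, unsigned len);
    int   (*close)(void *handle);
};

/* one generic compression loop shared by all modules */
static int generic_compress(const struct compressor *c, FILE *src, FILE *dst)
{
    char buf[4096];
    size_t n;
    void *h = c->open(dst, "wb");
    if (!h)
        return -1;
    while ((n = fread(buf, 1, sizeof(buf), src)) > 0)
        if (c->write(h, buf, (unsigned) n) < 0) {
            c->close(h);
            return -1;
        }
    return c->close(h);
}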

Original issue reported on code.google.com by [email protected] on 11 Aug 2008 at 11:45

fusecompress_offline

FuseCompress needs an offline compression/decompression tool, like the one
in the 1.99.x tree.

Original issue reported on code.google.com by [email protected] on 5 Aug 2008 at 4:17

read returns stale cached data

Cached data is not invalidated on writes, leading fusecompress to return
stale data when files with cached data are modified and then read back.

Original issue reported on code.google.com by [email protected] on 11 Nov 2008 at 2:25

hardcoded cache size

Uncompressed cache size is hardcoded to 100 MB ATM. Should be made
configurable.

Original issue reported on code.google.com by [email protected] on 25 Aug 2008 at 4:08

small file descriptor leak

There seems to be a tiny file descriptor leak somewhere that only becomes
apparent in filesystems running for a long time without being remounted. In
the case observed, about a dozen descriptors had been leaked in a
fusecompress filesystem that had been running for several weeks.

Original issue reported on code.google.com by [email protected] on 27 Nov 2008 at 5:09

rare but severe slowdowns when reading large files linearly

When reading large amounts of data (several dozen GB) from files with
sizes around 1 GB, fusecompress occasionally grinds to a near-standstill
while it seems to reread the entire file over and over again, once for each
page read. This is very strange because it happens when copying using "cp",
which invariably reads data linearly in blocks of 128 KB. One possible
explanation would be that FUSE, which always seems to request data
pagewise, does so out-of-order within such a 128 KB block in rare cases.

Investigating...

Original issue reported on code.google.com by [email protected] on 5 Sep 2008 at 10:42

fusecompress_offline does not work when used across mountpoints

fusecompress_offline uses the same code as fusecompress to (un)compress 
files. It creates a temporary file in the current directory and renames it 
to the target name when done. This only works if the current directory is 
on the same filesystem as the target file, which is always the case with 
fusecompress, but not with fusecompress_offline.
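
One possible fix, sketched here with a hypothetical helper (not the project's actual code): build the temp file in the target's own directory, so the final rename() never crosses a filesystem boundary.

#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* create "<directory of target>/.fc_tmp.XXXXXX" instead of a temp file in
 * the current working directory; returns the fd, or -1 with errno set */
static int create_temp_near(const char *target, char *tmpl, size_t tmpl_len)
{
    char copy[PATH_MAX];
    snprintf(copy, sizeof(copy), "%s", target);   /* dirname() modifies its arg */
    snprintf(tmpl, tmpl_len, "%s/.fc_tmp.XXXXXX", dirname(copy));
    return mkstemp(tmpl);
}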

Original issue reported on code.google.com by [email protected] on 3 May 2009 at 9:41

compression exclusion list does not work when using rsync

rsync creates temporary files prefixed with a dot and with a random string 
attached at the end, which prevents the exclusion mechanism in 
fusecompress from catching incompressible files, leading to severely 
reduced performance.

Original issue reported on code.google.com by [email protected] on 3 May 2009 at 9:36

  • Merged into: #49

memleak: cache array not freed

While the cached pages themselves are freed correctly when purging a file,
the cache page array is not freed in do_decompress() and direct_open_delete().

Original issue reported on code.google.com by [email protected] on 6 Oct 2008 at 1:19

handling errors in compressed data

Error handling in the compression modules is not good. For cut-off data, for
instance, LZO returns an error on the first try, but zero on subsequent
tries. LZMA always returns zero. This can lead to (virtually) endless loops
in some applications. We are trying to return a serious error (EIO) if we
detect corrupt compressed data, but that does not keep some programs from
trying over and over.

Original issue reported on code.google.com by [email protected] on 5 Aug 2008 at 4:16

hardlink read/write inconsistency

The second read() on file "b" in test/link.c doesn't get through to us. FUSE
seems to assume instead that the data is still the same as when reading it
the first time. This is a bug in FUSE, NTFS-3G exhibits the same problem. It
is triggered by Firefox and probably other software doing funny stuff with
hardlinks.

Try compiling test/link.c with -DSTRICT. Will work on kernel filesystems,
but not in FUSE.

Original issue reported on code.google.com by [email protected] on 5 Aug 2008 at 4:13

hangs with FUSE 2.7.x when doing lots of renames

Using libfuse 2.7.2 or 2.7.4 (probably all 2.7.x releases), fusecompress
gets stuck badly when under rename() stress. Running

while true ; do mv -v x/* y ; mv -v y/* x ; done

hangs after only a few operations. The problem only occurs when mounting
the fusecompress FS on top of its backing directory (i.e. old syntax,
"fusecompress /foo"). Trying to kill fusecompress or the process doing the
rename() is not possible; attaching to those processes does not work either
(strace and gdb hang, too).

There is no such problem when using a differently named backing directory
(new syntax, e.g. "fusecompress /foo /bar") or when using libfuse 2.8.0pre1.

This is most likely a bug in libfuse.

Original issue reported on code.google.com by [email protected] on 12 Aug 2008 at 8:59

test suite broken on OSX

The test suite currently has two major issues that prevent it from working
on OSX:

1. It uses the old-style "mount-over" syntax, which does not work on OSX.
2. It uses "fusermount -u" to unmount filesystems, which is not available
on OSX. (It should use umount instead)

Original issue reported on code.google.com by [email protected] on 29 Jul 2009 at 6:14

I/O error reading valid file with garbage data at the end

FUSE always reads whole pages, and expects short reads for files that have
a tail. This usually works, but it seems that at least LZO is not
comfortable with it if there is trailing garbage at the end of the valid
compressed data. This is actually two bugs:

1. fsck should detect such files and truncate them.
2. fusecompress should not request reads beyond the end of the file from
compression modules.
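
The second point amounts to clamping the request before it reaches a compression module; a small sketch (variable roles as named, nothing fusecompress-specific):

#include <sys/types.h>

/* clamp a FUSE read to the uncompressed size, so the module is never asked
 * for data past EOF and the caller gets a short read instead */
static size_t clamp_read(off_t offset, size_t size, off_t full_size)
{
    if (offset >= full_size)
        return 0;
    if (offset + (off_t) size > full_size)
        return (size_t) (full_size - offset);
    return size;
}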

Original issue reported on code.google.com by [email protected] on 4 Sep 2008 at 9:52

long unmount time due to large unpurged files

When handling large files, unmounting sometimes takes a very long time 
because uncompressed files have not been purged yet and have to be 
compressed at the last moment. It may be beneficial to purge large 
uncompressed files more aggressively.

Original issue reported on code.google.com by [email protected] on 5 Sep 2008 at 11:43

fusecompress_offline hangs for 50 seconds if it cannot create a temp file

What steps will reproduce the problem?
1. Run fusecompress_offline from a directory you are not allowed to write to.

What is the expected output? What do you see instead?

It should come back with an error; instead, it hangs. According to the comment 
in the code, that is because mkstemp() "may fail" and needs to be retried. I 
find that hard to believe... At any rate, it will make file_create_temp() hang 
for almost a minute!
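
A sketch of failing fast instead of looping (assuming the retry loop exists only to paper over mkstemp() failures; the helper name is made up):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* mkstemp() already tries many suffixes internally, so a failure such as
 * EACCES or EROFS is permanent: report it immediately rather than retrying
 * for close to a minute */
static int create_temp_fd(char *tmpl)
{
    int fd = mkstemp(tmpl);
    if (fd == -1)
        fprintf(stderr, "mkstemp(%s): %s\n", tmpl, strerror(errno));
    return fd;
}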


Original issue reported on code.google.com by [email protected] on 10 Jun 2011 at 12:30

error logging limit

Stupid applications keep hammering on broken files despite getting the most 
ghastly error codes back, filling up the log files. Set a limit for the 
maximum number of errors reported per file.
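
A sketch of such a cap (the counter field, the file struct, and the log macro are hypothetical stand-ins for whatever fusecompress actually uses):

#define MAX_LOGGED_ERRORS 10   /* arbitrary per-file limit */

/* hypothetical: file->errors counts reports already made for this file,
 * LOG_ERR() is the existing error logger */
if (file->errors < MAX_LOGGED_ERRORS) {
    LOG_ERR("read error on %s", file->name);
    if (++file->errors == MAX_LOGGED_ERRORS)
        LOG_ERR("suppressing further errors for %s", file->name);
}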

Original issue reported on code.google.com by [email protected] on 13 Sep 2008 at 9:01

Let lzma be optional when compiling fusecompress


liblzma is alpha software and should not be a hard requirement in a stable,
production-quality environment (which is the goal of the fusecompress 0.9.x
code). Many stable distros don't ship this library.

Please add an option to compile fusecompress WITHOUT liblzma.
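
One way the build could handle this, sketched with assumed names (HAVE_LZMA, the module table, and a --without-lzma switch are not the actual configure setup):

/* compress.c (sketch): only register the lzma module when it was enabled
 * at configure time, e.g. via a hypothetical --without-lzma switch that
 * leaves HAVE_LZMA undefined */
static const struct compressor_entry compressors[] = {
#ifdef HAVE_LZMA
    { "lzma", &module_lzma },
#endif
    { "lzo",  &module_lzo  },
    { "gz",   &module_gzip },
};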


Original issue reported on code.google.com by [email protected] on 12 Nov 2008 at 6:29

fusecompress eats up memory when reading a file

I have one compressed folder, which I mount like this:

##      MOUNT THE ASP DATA STORE (COMPRESSED)
if [ `mount | grep fusecompress | grep /mnt/data/ascii | wc -l` -eq 0 ]
    then
        fusecompress -o fc_c:lzo /mnt/data/.ascii /mnt/data/ascii
fi

I then filled this folder with a bunch (~2500) of ASCII files, about
29MB each. The copying *to* this folder went smoothly, and actually
surprisingly quick.

However, when I read a file (just by copying it *from* the compressed
folder to my home directory), fusecompress quickly eats up all the RAM I
have (4 GB) and even more, until swap is also completely taken by
fusecompress and my machine basically crashes because there's no memory
left. The copy process never finishes.


What FuseCompress version are you using? What distribution? What kernel?
What FUSE release?

Ubuntu Lucid Lynx, AMD64.
fusecompress 2.6-2
kernel 2.6.32-22
libfuse 2.8.1

The compressed folder lies on a xfs filesystem on a software raid-0.

Original issue reported on code.google.com by [email protected] on 19 May 2010 at 5:14

fsck tool

FuseCompress does not currently have a tool for verifying the integrity of the
filesystem metadata.

Original issue reported on code.google.com by [email protected] on 5 Aug 2008 at 4:16

lock fails in do_decompress()

fusecompress.debug: compress.c:178: do_decompress: Assertion
`pthread_mutex_lock(&file->lock) == 0' failed.

Seems to be hard to reproduce.

Original issue reported on code.google.com by [email protected] on 5 Aug 2008 at 8:26

file format not endianness-agnostic

fusecompress saves the uncompressed size as a 64-bit integer of type off_t
in the file header, in native endianness, making it impossible to exchange
data between big and little-endian hosts. IMO it would be best to
standardize on little-endian.
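
A sketch of a little-endian on-disk size field using the glibc byte-order helpers (the header layout shown is illustrative, not the real one):

#include <endian.h>      /* htole64()/le64toh(); glibc-specific header */
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* write: always store the uncompressed size little-endian */
static int write_size(FILE *fp, off_t uncompressed)
{
    uint64_t le = htole64((uint64_t) uncompressed);
    return fwrite(&le, sizeof(le), 1, fp) == 1 ? 0 : -1;
}

/* read: convert back to host order, whatever the host's endianness */
static int read_size(FILE *fp, off_t *uncompressed)
{
    uint64_t le;
    if (fread(&le, sizeof(le), 1, fp) != 1)
        return -1;
    *uncompressed = (off_t) le64toh(le);
    return 0;
}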

Original issue reported on code.google.com by [email protected] on 15 Aug 2008 at 11:24

ctime of mounted files is incorrect

What steps will reproduce the problem?
1. Create a fusecompress filesystem with some files in it.
2. Run stat on a file in the mounted fusecompress filesystem. Note the
"change time".
3. Unmount the filesystem and run stat against the same file.

What is the expected outcome? What happens instead?
The ctimes should be the same between the mounted and unmounted versions of
the file. Instead, the ctime and mtime are always identical.

What FuseCompress version are you using? What distribution? What kernel?
What FUSE release?
This is revision 90 from SVN.

Please provide any additional information below.
This causes problems with rsync.  rsync uses the ctime, so if it is not
correct, rsync will sometimes update files that do not need to be updated.
The situation where you get a problem is a file for which the mtime and
ctime are different, for example, if you modify the file data then later 
modify the file permissions.  Such a file, when it's on a fusecompress
filesystem, will always be updated by rsync.

Commenting out a line in fusecompress.c fixed this for me:

Index: fusecompress.c
===================================================================
--- fusecompress.c  (revision 90)
+++ fusecompress.c  (working copy)
@@ -112,7 +112,7 @@
    // (tar checks this item and it is loudly when the result
    // is different than what it exepects)
    //
-   stbuf->st_ctime = stbuf->st_mtime;
+   //stbuf->st_ctime = stbuf->st_mtime;

    UNLOCK(&file->lock);
    return 0;


Original issue reported on code.google.com by [email protected] on 28 Apr 2009 at 4:26

fusecompress_cat

A tool is needed that dumps compressed files (decompressing to stdout, not on 
disk) without going through FUSE. It could be built into fusecompress_offline 
as well, similar to the -dc option in gzip.

Original issue reported on code.google.com by [email protected] on 3 May 2009 at 9:38

directory timestamps modified behind user's back

Due to the background operations (deduping and BG compression) being performed 
asynchronously, the directory timestamps are sometimes updated even though we 
don't want that.

Since the operations causing these timestamp updates are fast (link() for 
dedup, rename() for compression), we could lock the directory while doing them 
and write the old timestamps back again after completion. Problem: There is no 
mechanism in place for locking entire directories.

Without locking, we would risk losing legitimate updates caused by 
modifications to entries in the directory that happen while we do our thing.

Original issue reported on code.google.com by [email protected] on 8 Nov 2011 at 12:04
