rust-snappy's Introduction

snap

A pure Rust implementation of the Snappy compression algorithm. Includes streaming compression and decompression using the Snappy frame format. This implementation is ported from both the reference C++ implementation and the Go implementation.

Licensed under the BSD 3-Clause.

Documentation

https://docs.rs/snap

Usage

Add this to your Cargo.toml:

[dependencies]
snap = "1"

Example: compress data on stdin

This program reads data from stdin, compresses it and emits it to stdout. This example can be found in examples/compress.rs:

use std::io;

fn main() {
    let stdin = io::stdin();
    let stdout = io::stdout();

    let mut rdr = stdin.lock();
    // Wrap the stdout writer in a Snappy writer.
    let mut wtr = snap::write::FrameEncoder::new(stdout.lock());
    io::copy(&mut rdr, &mut wtr).expect("I/O operation failed");
}

Example: decompress data on stdin

This program reads data from stdin, decompresses it and emits it to stdout. This example can be found in examples/decompress.rs:

use std::io;

fn main() {
    let stdin = io::stdin();
    let stdout = io::stdout();

    // Wrap the stdin reader in a Snappy reader.
    let mut rdr = snap::read::FrameDecoder::new(stdin.lock());
    let mut wtr = stdout.lock();
    io::copy(&mut rdr, &mut wtr).expect("I/O operation failed");
}

Example: the szip tool

szip is a tool that behaves like gzip, except that it uses Snappy compression. It can be installed with Cargo:

$ cargo install szip

To compress a file, run szip file. To decompress a file, run szip -d file.sz. See szip --help for more details.

Testing

This crate is tested against the reference C++ implementation of Snappy. Currently, compression is byte-for-byte equivalent with the C++ implementation. This seems like a reasonable starting point, although it is not necessarily a goal to always maintain byte-for-byte equivalence.

Tests against the reference C++ implementation can be run with cargo test --features cpp. Note that you will need to have the C++ Snappy library in your LD_LIBRARY_PATH (or equivalent).

To run tests, you'll need to explicitly run the test crate:

$ cargo test --manifest-path test/Cargo.toml

To test that this library matches the output of the reference C++ library, use:

$ cargo test --manifest-path test/Cargo.toml --features cpp

Tests are in a separate crate because of the dependency on the C++ reference library. Namely, Cargo does not yet permit optional dev dependencies.

Minimum Rust version policy

This crate's minimum supported rustc version is 1.39.0.

The current policy is that the minimum Rust version required to use this crate can be increased in minor version updates. For example, if crate 1.0 requires Rust 1.20.0, then crate 1.0.z for all values of z will also require Rust 1.20.0 or newer. However, crate 1.y for y > 0 may require a newer minimum version of Rust.

In general, this crate will be conservative with respect to the minimum supported version of Rust.

Performance

The performance of this implementation should roughly match the performance of the C++ implementation on x86_64. Below are the results of the microbenchmarks (as defined in the C++ library):

group                         snappy/cpp/                            snappy/snap/
-----                         -----------                            ------------
compress/zflat00_html         1.00     94.5±0.62µs  1033.1 MB/sec    1.02     96.1±0.74µs  1016.2 MB/sec
compress/zflat01_urls         1.00   1182.3±8.89µs   566.3 MB/sec    1.04  1235.3±11.99µs   542.0 MB/sec
compress/zflat02_jpg          1.00      7.2±0.11µs    15.9 GB/sec    1.01      7.3±0.06µs    15.8 GB/sec
compress/zflat03_jpg_200      1.10    262.4±1.84ns   727.0 MB/sec    1.00    237.5±2.95ns   803.2 MB/sec
compress/zflat04_pdf          1.02     10.3±0.18µs     9.2 GB/sec    1.00     10.1±0.16µs     9.4 GB/sec
compress/zflat05_html4        1.00    399.2±5.36µs   978.4 MB/sec    1.01    404.0±2.46µs   966.8 MB/sec
compress/zflat06_txt1         1.00    397.3±2.61µs   365.1 MB/sec    1.00    398.5±3.06µs   364.0 MB/sec
compress/zflat07_txt2         1.00    352.8±3.20µs   338.4 MB/sec    1.01    355.2±5.01µs   336.1 MB/sec
compress/zflat08_txt3         1.01   1058.8±6.85µs   384.4 MB/sec    1.00   1051.8±6.74µs   386.9 MB/sec
compress/zflat09_txt4         1.00   1444.1±8.10µs   318.2 MB/sec    1.00  1450.0±13.36µs   316.9 MB/sec
compress/zflat10_pb           1.00     85.1±0.58µs  1328.6 MB/sec    1.02     87.0±0.90µs  1300.2 MB/sec
compress/zflat11_gaviota      1.07    311.9±4.27µs   563.5 MB/sec    1.00    291.9±1.86µs   602.3 MB/sec
decompress/uflat00_html       1.03     36.9±0.28µs     2.6 GB/sec    1.00     36.0±0.25µs     2.7 GB/sec
decompress/uflat01_urls       1.04    437.4±2.89µs  1530.7 MB/sec    1.00    419.9±3.10µs  1594.6 MB/sec
decompress/uflat02_jpg        1.00      4.6±0.05µs    24.9 GB/sec    1.00      4.6±0.03µs    25.0 GB/sec
decompress/uflat03_jpg_200    1.08    122.4±1.06ns  1558.6 MB/sec    1.00    112.8±1.35ns  1690.8 MB/sec
decompress/uflat04_pdf        1.00      5.7±0.05µs    16.8 GB/sec    1.10      6.2±0.07µs    15.3 GB/sec
decompress/uflat05_html4      1.01    164.1±1.71µs     2.3 GB/sec    1.00    162.6±2.16µs     2.3 GB/sec
decompress/uflat06_txt1       1.08    146.6±1.01µs   989.5 MB/sec    1.00    135.3±1.11µs  1072.0 MB/sec
decompress/uflat07_txt2       1.09    130.2±0.93µs   916.6 MB/sec    1.00    119.2±0.96µs  1001.8 MB/sec
decompress/uflat08_txt3       1.07    387.2±2.30µs  1051.0 MB/sec    1.00    361.9±6.29µs  1124.7 MB/sec
decompress/uflat09_txt4       1.09    536.1±3.47µs   857.2 MB/sec    1.00    494.0±5.05µs   930.2 MB/sec
decompress/uflat10_pb         1.00     32.5±0.19µs     3.4 GB/sec    1.05     34.0±0.48µs     3.2 GB/sec
decompress/uflat11_gaviota    1.00    142.1±2.05µs  1236.7 MB/sec    1.00    141.5±0.92µs  1242.3 MB/sec

Notes: These benchmarks were run with Snappy/C++ 1.1.8. Both the C++ and Rust benchmarks were run with the same benchmark harness. Benchmarks were run on an Intel i7-6900K.

Additionally, here are the benchmarks from the Go implementation of Snappy (which has a hand-rolled implementation in assembly), run on the same machine. Note that these were run using Go's microbenchmark tool, so the numbers may not be directly comparable, but they should serve as a useful signpost:

Benchmark_UFlat0           25040             45180 ns/op        2266.49 MB/s
Benchmark_UFlat1            2648            451475 ns/op        1555.10 MB/s
Benchmark_UFlat2          229965              4788 ns/op        25709.01 MB/s
Benchmark_UFlat3        11355555               101 ns/op        1973.65 MB/s
Benchmark_UFlat4          196551              6055 ns/op        16912.64 MB/s
Benchmark_UFlat5            6016            189219 ns/op        2164.68 MB/s
Benchmark_UFlat6            6914            166371 ns/op         914.16 MB/s
Benchmark_UFlat7            8173            142506 ns/op         878.41 MB/s
Benchmark_UFlat8            2744            436424 ns/op         977.84 MB/s
Benchmark_UFlat9            1999            591141 ns/op         815.14 MB/s
Benchmark_UFlat10          28885             37291 ns/op        3180.04 MB/s
Benchmark_UFlat11           7308            163366 ns/op        1128.26 MB/s
Benchmark_ZFlat0           12902             91231 ns/op        1122.43 MB/s
Benchmark_ZFlat1             997           1200579 ns/op         584.79 MB/s
Benchmark_ZFlat2          136762              7832 ns/op        15716.53 MB/s
Benchmark_ZFlat3         4896124               245 ns/op         817.27 MB/s
Benchmark_ZFlat4          117643             10129 ns/op        10109.44 MB/s
Benchmark_ZFlat5            2934            394742 ns/op        1037.64 MB/s
Benchmark_ZFlat6            3008            382877 ns/op         397.23 MB/s
Benchmark_ZFlat7            3411            344916 ns/op         362.93 MB/s
Benchmark_ZFlat8             966           1057985 ns/op         403.36 MB/s
Benchmark_ZFlat9             854           1429024 ns/op         337.20 MB/s
Benchmark_ZFlat10          13861             83040 ns/op        1428.08 MB/s
Benchmark_ZFlat11           4070            293952 ns/op         627.04 MB/s

To run benchmarks, including the reference C++ implementation, do the following:

$ cd bench
$ cargo bench --features cpp -- --save-baseline snappy

To compare them, as shown above, install critcmp and run (assuming you saved the baseline above under the name snappy):

$ critcmp snappy -g '.*?/(.*$)'

Finally, the Go benchmarks were run with the following command on commit ff6b7dc8:

$ go test -cpu 1 -bench Flat -download

Comparison with other Snappy crates

  • snappy - These are bindings to the C++ library. No support for the Snappy frame format.
  • snappy_framed - Implements the Snappy frame format on top of the snappy crate.
  • rsnappy - Written in pure Rust, but lacks documentation and the Snappy frame format. Performance is unclear and tests appear incomplete.
  • snzip - Was created and immediately yanked from crates.io.

rust-snappy's People

Contributors

aaron1011, arthurprs, atouchet, bruceg, burntsushi, emk, icmccorm, llogiq, matklad, object905, pawanjay176, rikardfalkeborn, shadlock0133, shepmaster

rust-snappy's Issues

szip compression on a directory

When running szip mydir (both cargo install version and github HEAD) where mydir is a directory, it will report mydir: Is a directory (os error 21) with exit code 0, while gzip exits with 2; the more annoying issue is that szip will create a dummy mydir.sz.
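One way to avoid the stray output file, sketched here under the assumption that the tool opens its input before creating the output, is to reject directories up front and let the caller turn the error into a non-zero exit code. The function name `open_regular_input` is hypothetical and is not taken from szip's source.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Open `src` for reading only after confirming it is not a directory, so
/// the caller can fail before creating any output file.
fn open_regular_input(src: &Path) -> io::Result<File> {
    if src.metadata()?.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{}: is a directory", src.display()),
        ));
    }
    File::open(src)
}
```

With a check like this, the `.sz` file would only be created once the input is known to be readable.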

Reader/Writer inner get_mut()

flate2 provides get_mut(); obtaining an inner mutable reference can be helpful when using a reader/writer over a streaming protocol where the buffer size would be infinite.

The current access to the low-level Encoder/Decoder is great, and can be used to work around the fact that I can't manage the internal writer/reader "mid-flight". However, it shifts a lot of responsibility onto the developer to retain the Reader/Writer block semantics (checksums, etc.). Providing an "I know what I'm doing" get_mut() function would go a long way to support this, as you can imagine.

Does this make sense? Or maybe there's something I missed? I can try to provide a more descriptive use case / sample code if necessary.

(Thank you for this awesome crate!)
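To make the request concrete, here is a minimal sketch of the accessor pattern being asked for. `BufferingWriter` is a hypothetical stand-in for a framed encoder, not a type from this crate; it only illustrates how `get_ref()` and `get_mut()` could let a caller drain the inner buffer between flushes.

```rust
use std::io::{self, Write};

/// Hypothetical buffering writer, standing in for a framed encoder.
struct BufferingWriter<W: Write> {
    inner: W,
    pending: Vec<u8>,
}

impl<W: Write> BufferingWriter<W> {
    fn new(inner: W) -> Self {
        BufferingWriter { inner, pending: Vec::new() }
    }

    /// Shared access to the wrapped writer.
    fn get_ref(&self) -> &W {
        &self.inner
    }

    /// Mutable access to the wrapped writer. Touching it while data is
    /// still pending could reorder bytes, so callers should flush first.
    fn get_mut(&mut self) -> &mut W {
        &mut self.inner
    }
}

impl<W: Write> Write for BufferingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Hold the bytes until the next flush.
        self.pending.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.write_all(&self.pending)?;
        self.pending.clear();
        self.inner.flush()
    }
}
```

With a `Vec<u8>` as the inner writer, a caller could flush, ship the bytes over the wire, and then call `get_mut().clear()` so the buffer never grows without bound.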

add 32 bit testing to travis

A bug reported in #3 exposed a problem that only exists when using 32 bit architectures. It's easy enough to test locally with rustup target add i686-unknown-linux-gnu (and an appropriate installation of gcc-multilib), but this should be automated to prevent regressions.

decompress_len output size, unexpected value

Happy Friday!

My question is regarding decompress_len.
Perhaps I'm using it wrong, but it returns a value significantly smaller than the actual decompressed length.

Given this:

#[cfg(test)]
mod tests {

    use std::io::Write;
    use snap::raw::decompress_len;

    #[test]
    fn test_decompress_len() {
        let data = (0..5000000)
            .map(|_| b"oh what a beautiful morning, oh what a beautiful day!!".to_vec())
            .flat_map(|v|v)
            .collect::<Vec<u8>>();

        let compressed = {
            let buffer = Vec::new();
            let mut encoder = snap::write::FrameEncoder::new(buffer);
            encoder.write_all(&data).unwrap();
            encoder.get_ref().to_vec()
        };
        let estimated_decompression_len = decompress_len(&compressed).unwrap();
        println!("Estimated size: {}, actual: {}", estimated_decompression_len, data.len());
        assert!(estimated_decompression_len >= data.len());
    }
}

estimated_decompression_len ends up as 895,
but the actual length is 270000000.

I must be doing something wrong, appreciate any pointers. :-)
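One possible explanation, offered as a sketch rather than a confirmed diagnosis: the raw Snappy format starts with the uncompressed length stored as a little-endian varint, while the test above produces frame-format output, which begins with the stream identifier bytes 0xff 0x06 0x00 0x00 followed by "sNaPpY". If decompress_len parses only the raw-format header (an assumption here), it would read those first bytes as a length. The helper `read_varu64` below is hypothetical and not part of the crate.

```rust
/// Decode a little-endian base-128 varint from the start of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before a terminating byte or the varint exceeds 10 bytes.
fn read_varu64(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // Each byte carries 7 payload bits, least significant group first.
        value |= u64::from(byte & 0x7f) << (7 * i as u32);
        if byte & 0x80 == 0 {
            // A clear high bit marks the final byte of the varint.
            return Some((value, i + 1));
        }
    }
    None
}
```

Applied to the first bytes of a frame-format stream, this consumes 0xff 0x06 and yields 127 + 6 * 128 = 895, which matches the value reported above. If that is what is happening, the length would need to be taken from raw-format data (for example, the output of snap::raw::Encoder) rather than from FrameEncoder output.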

`write::FrameDecoder`

Hi!
What is the status of write::FrameDecoder?
I would be very grateful if you implemented it, as it could reduce a large allocation in my code.
Thanks for the great pure-rust compression algorithm!

Is there a reason for the MAX_INPUT_SIZE limit?

Is there a reason for the MAX_INPUT_SIZE limit?

/// We don't permit compressing a block bigger than what can fit in a u32.
const MAX_INPUT_SIZE: u64 = std::u32::MAX as u64;

I was testing out the cramjam Python package (which uses this Rust implementation) for compressing some large arrays and ran into this limitation. The program reading the data relies on the C++ snappy implementation, but there I don't see the same limitation.
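For context, the Snappy format description specifies the uncompressed length header as a varint with a maximum of 2^32 - 1, which is presumably where the limit comes from. One workaround, sketched here as an assumption about how a caller might cope and not as something this crate does for raw blocks, is to split the input into pieces below the limit and compress each piece separately. The function name `split_into_blocks` is hypothetical.

```rust
/// Split `data` into consecutive pieces of at most `max_len` bytes each, so
/// that every piece can be handed to a block compressor with a size limit.
/// Panics if `max_len` is zero.
fn split_into_blocks(data: &[u8], max_len: usize) -> Vec<&[u8]> {
    data.chunks(max_len).collect()
}
```

The reader then has to know where each compressed piece ends. The Snappy frame format solves exactly that problem by wrapping small blocks in length-prefixed chunks.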

Concatenating multiple Encoder streams

Hi @BurntSushi ,

Thanks for designing and maintaining this great tool. I also wanted to add that I am a big fan of all your work in rust-lang!

Recently, @rob-p and I were thinking about a use case where data from multiple rust-snappy compressed streams is concatenated into one. We are working with a multithreaded scenario where multiple producer threads push data to a single consumer thread, which writes it to a file. Currently we use rust-snappy to compress the data on the single-threaded consumer, but even though rust-snappy compression is fast, the single-threaded compression slows down the whole pipeline. You may have already thought about or discussed this; if possible, we'd like to get your thoughts on how we can move the compression back into the producer threads, so that the consumer can simply concatenate the compressed streams and dump the bytes into a file without the overhead of compression.

I hope the use case makes sense, and I look forward to hearing back from you.
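The pipeline described above can be sketched with the standard library alone. The Snappy framing format description says a stream identifier chunk may appear again later in a stream and should be ignored, which is meant to allow concatenating compressed files, so each producer could emit a complete stream of its own. In this sketch `encode_stream` is a hypothetical placeholder that copies bytes unchanged; a real producer might instead write through snap::write::FrameEncoder into a Vec<u8>.

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Placeholder for real compression, so the sketch needs only std.
fn encode_stream(input: &[u8]) -> Vec<u8> {
    input.to_vec()
}

/// Spawn one producer per input, let each encode its own data, and have the
/// calling thread append the finished buffers to `out` in arrival order.
fn write_concatenated<W: Write>(inputs: Vec<Vec<u8>>, out: &mut W) -> std::io::Result<()> {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let mut handles = Vec::new();
    for input in inputs {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // The expensive work happens here, in parallel.
            let encoded = encode_stream(&input);
            // A send error only means the consumer stopped early.
            let _ = tx.send(encoded);
        }));
    }
    // Drop the original sender so `rx` ends once all producers finish.
    drop(tx);
    for encoded in rx {
        out.write_all(&encoded)?;
    }
    for handle in handles {
        handle.join().expect("producer panicked");
    }
    Ok(())
}
```

Buffers are written in arrival order, so a caller that needs the original ordering would have to tag each buffer with an index and reorder on the consumer side.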

Async support

Any chance this crate will support AsyncRead and AsyncWrite eventually?

Question: PyO3 wrapped snappy, slower than python-snappy

Hi there,

First, thanks for a lovely lib, both here and all your other contributions in Rust! 💯

As a learning exercise I've made a Python lib wrapping various de/compression libs in Rust: cramjam. In running the benchmarks for the Snappy comparison, it appears slower than python-snappy by a fair margin (granted, both are quite quick). Given your benchmarks, I was expecting them to be a bit closer.

Here I expose the interface and was curious if you would be able to give me any pointers on things I may have missed.

Appreciate the help. 😃

Heap buffer overflow

Found using cargo-fuzz:

To reproduce, add the following test case in src/tests.rs

// Crashes found with fuzzing
testerrored!(err_heap_buffer_overflow, &b"\x23\x00\x00\x04\x00\xff\x01\x03\x01\x03\x03\x03\x00\x00\x00\xfc\xc4\xff\x01\x03"[..],
             Error::Literal {
                 len: 50462661,
                 src_len: 0,
                 dst_len: 23,
             });

To run under Linux (this requires a nightly build): RUSTFLAGS="-Zsanitizer=address" ASAN_OPTIONS=detect_odr_violation=0 cargo test --target x86_64-unknown-linux-gnu
Note that the test passes if not running with Address Sanitizer.

=================================================================

==6875==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000006033 at pc 0x559a2e2cf009 bp 0x7f94747fc790 sp 0x7f94747fbf40                                                                    
WRITE of size 16 at 0x604000006033 thread T40 (tests::err_heap)                                       
    #0 0x559a2e2cf008 in __asan_memmove /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_interceptors.cc:461
    #1 0x559a2db04bc7 in snap::decompress::{{impl}}::read_copy /home/rikard/code/rust-snappy/src/decompress.rs:317
    #2 0x559a2db04bc7 in snap::decompress::Decompress::decompress::h74fcb923230eef32 /home/rikard/code/rust-snappy/src/decompress.rs:143
    #3 0x559a2db02eb1 in snap::decompress::Decoder::decompress::h8d4cf6fc6f8a2c36 /home/rikard/code/rust-snappy/src/decompress.rs:98
    #4 0x559a2db32750 in snap::tests::err_heap_buffer_overflow::hf80b5d87bb01e9a2 /home/rikard/code/rust-snappy/src/tests.rs:114
    #5 0x559a2e1e4041 in test::run_test::{{closure}} /checkout/src/libtest/lib.rs:1478
    #6 0x559a2e1e4041 in core::ops::function::FnOnce::call_once<closure,(())> /checkout/src/libcore/ops/function.rs:223
    #7 0x559a2e1e4041 in _$LT$F$u20$as$u20$test..FnBox$LT$T$GT$$GT$::call_box::h4cd0226556414a83 /checkout/src/libtest/lib.rs:139
    #8 0x559a2e2228cc in __rust_maybe_catch_panic /checkout/src/libpanic_unwind/lib.rs:98
    #9 0x559a2e1d524c in std::panicking::try<(),std::panic::AssertUnwindSafe<closure>> /checkout/src/libstd/panicking.rs:459
    #10 0x559a2e1d524c in std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure>,()> /checkout/src/libstd/panic.rs:361
    #11 0x559a2e1d524c in test::run_test::run_test_inner::{{closure}} /checkout/src/libtest/lib.rs:1417
    #12 0x559a2e1d524c in std::sys_common::backtrace::__rust_begin_short_backtrace::h13ca8dac9b7f5353 /checkout/src/libstd/sys_common/backtrace.rs:136
    #13 0x559a2e1d6012 in std::thread::{{impl}}::spawn::{{closure}}::{{closure}}<closure,()> /checkout/src/libstd/thread/mod.rs:394
    #14 0x559a2e1d6012 in std::panic::{{impl}}::call_once<(),closure> /checkout/src/libstd/panic.rs:296
    #15 0x559a2e1d6012 in std::panicking::try::do_call::hd57949c7b5b244b9 /checkout/src/libstd/panicking.rs:480
    #16 0x559a2e2228cc in __rust_maybe_catch_panic /checkout/src/libpanic_unwind/lib.rs:98
    #17 0x559a2e1ddd12 in std::panicking::try<(),std::panic::AssertUnwindSafe<closure>> /checkout/src/libstd/panicking.rs:459
    #18 0x559a2e1ddd12 in std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure>,()> /checkout/src/libstd/panic.rs:361
    #19 0x559a2e1ddd12 in std::thread::{{impl}}::spawn::{{closure}}<closure,()> /checkout/src/libstd/thread/mod.rs:393
    #20 0x559a2e1ddd12 in _$LT$F$u20$as$u20$alloc..boxed..FnBox$LT$A$GT$$GT$::call_box::hcecece177889931b /checkout/src/liballoc/boxed.rs:682
    #21 0x559a2e21a83b in alloc::boxed::{{impl}}::call_once<(),()> /checkout/src/liballoc/boxed.rs:692
    #22 0x559a2e21a83b in std::sys_common::thread::start_thread /checkout/src/libstd/sys_common/thread.rs:21
    #23 0x559a2e21a83b in std::sys::imp::thread::Thread::new::thread_start::h8596ddab359bf413 /checkout/src/libstd/sys/unix/thread.rs:84
    #24 0x7f9477d87048 in start_thread (/usr/lib/libpthread.so.0+0x7048)
    #25 0x7f94778b0f0e in __GI___clone (/usr/lib/libc.so.6+0xedf0e)

0x604000006033 is located 0 bytes to the right of 35-byte region [0x604000006010,0x604000006033)
allocated by thread T40 (tests::err_heap) here:                                                       
    #0 0x559a2e2e539d in calloc /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_malloc_linux.cc:74
    #1 0x559a2e222c6a in alloc_system::platform::{{impl}}::alloc_zeroed /checkout/src/liballoc_system/lib.rs:149
    #2 0x559a2e222c6a in __rg_alloc_zeroed /checkout/src/librustc_asan/lib.rs:27
    #3 0x559a2daeb8c3 in _$LT$alloc..heap..Heap$u20$as$u20$alloc..allocator..Alloc$GT$::alloc_zeroed::h0bce0d66bdc59af2 /checkout/src/liballoc/heap.rs:134
    #4 0x559a2dad7f09 in _$LT$alloc..raw_vec..RawVec$LT$T$C$$u20$A$GT$$GT$::allocate_in::hf46d911cf2794885 /checkout/src/liballoc/raw_vec.rs:95
    #5 0x559a2dad4dc8 in alloc::raw_vec::{{impl}}::with_capacity_zeroed<u8> /checkout/src/liballoc/raw_vec.rs:147
    #6 0x559a2dad4dc8 in _$LT$u8$u20$as$u20$alloc..vec..SpecFromElem$GT$::from_elem::hddc89846c51d523b /checkout/src/liballoc/vec.rs:1467
    #7 0x559a2db326d1 in alloc::vec::from_elem<u8> /checkout/src/liballoc/vec.rs:1446
    #8 0x559a2db326d1 in snap::tests::err_heap_buffer_overflow::hf80b5d87bb01e9a2 /home/rikard/code/rust-snappy/src/tests.rs:114
    #9 0x559a2e1e4041 in test::run_test::{{closure}} /checkout/src/libtest/lib.rs:1478
    #10 0x559a2e1e4041 in core::ops::function::FnOnce::call_once<closure,(())> /checkout/src/libcore/ops/function.rs:223
    #11 0x559a2e1e4041 in _$LT$F$u20$as$u20$test..FnBox$LT$T$GT$$GT$::call_box::h4cd0226556414a83 /checkout/src/libtest/lib.rs:139
    #12 0x559a2e2228cc in __rust_maybe_catch_panic /checkout/src/libpanic_unwind/lib.rs:98
    #13 0x559a2e1d524c in std::panicking::try<(),std::panic::AssertUnwindSafe<closure>> /checkout/src/libstd/panicking.rs:459
    #14 0x559a2e1d524c in std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure>,()> /checkout/src/libstd/panic.rs:361
    #15 0x559a2e1d524c in test::run_test::run_test_inner::{{closure}} /checkout/src/libtest/lib.rs:1417
    #16 0x559a2e1d524c in std::sys_common::backtrace::__rust_begin_short_backtrace::h13ca8dac9b7f5353 /checkout/src/libstd/sys_common/backtrace.rs:136
    #17 0x559a2e1d6012 in std::thread::{{impl}}::spawn::{{closure}}::{{closure}}<closure,()> /checkout/src/libstd/thread/mod.rs:394
    #18 0x559a2e1d6012 in std::panic::{{impl}}::call_once<(),closure> /checkout/src/libstd/panic.rs:296
    #19 0x559a2e1d6012 in std::panicking::try::do_call::hd57949c7b5b244b9 /checkout/src/libstd/panicking.rs:480
    #20 0x559a2e2228cc in __rust_maybe_catch_panic /checkout/src/libpanic_unwind/lib.rs:98
    #21 0x559a2e1ddd12 in std::panicking::try<(),std::panic::AssertUnwindSafe<closure>> /checkout/src/libstd/panicking.rs:459
    #22 0x559a2e1ddd12 in std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure>,()> /checkout/src/libstd/panic.rs:361
    #23 0x559a2e1ddd12 in std::thread::{{impl}}::spawn::{{closure}}<closure,()> /checkout/src/libstd/thread/mod.rs:393
    #24 0x559a2e1ddd12 in _$LT$F$u20$as$u20$alloc..boxed..FnBox$LT$A$GT$$GT$::call_box::hcecece177889931b /checkout/src/liballoc/boxed.rs:682
    #25 0x559a2e21a83b in alloc::boxed::{{impl}}::call_once<(),()> /checkout/src/liballoc/boxed.rs:692
    #26 0x559a2e21a83b in std::sys_common::thread::start_thread /checkout/src/libstd/sys_common/thread.rs:21
    #27 0x559a2e21a83b in std::sys::imp::thread::Thread::new::thread_start::h8596ddab359bf413 /checkout/src/libstd/sys/unix/thread.rs:84
    #28 0x7f9477d87048 in start_thread (/usr/lib/libpthread.so.0+0x7048)

Thread T40 (tests::err_heap) created by T0 here:
    #0 0x559a2e23ccf1 in __interceptor_pthread_create /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_interceptors.cc:305
    #1 0x559a2e21a505 in std::sys::imp::thread::Thread::new::h839f712eddd9ed31 /checkout/src/libstd/sys/unix/thread.rs:72
    #2 0x559a2e1f48d4 in std::thread::{{impl}}::spawn<closure,()> /checkout/src/libstd/thread/mod.rs:404
    #3 0x559a2e1f48d4 in test::run_test::run_test_inner::ha06748790ed0239c /checkout/src/libtest/lib.rs:1441
    #4 0x559a2e1f349a in test::run_test::h04d610473c4005ff /checkout/src/libtest/lib.rs:1477
    #5 0x559a2e1ec3b3 in test::run_tests<closure> /checkout/src/libtest/lib.rs:1133
    #6 0x559a2e1ec3b3 in test::run_tests_console::h176cb26f5f786689 /checkout/src/libtest/lib.rs:961
    #7 0x559a2e1e43bc in test::test_main::h6f6a791440c5d636 /checkout/src/libtest/lib.rs:288
    #8 0x559a2e1e4a95 in test::test_main_static::he789761c24fac0b3 /checkout/src/libtest/lib.rs:324
    #9 0x559a2e2228cc in __rust_maybe_catch_panic /checkout/src/libpanic_unwind/lib.rs:98
    #10 0x559a2e21c05b in std::panicking::try<(),closure> /checkout/src/libstd/panicking.rs:459
    #11 0x559a2e21c05b in std::panic::catch_unwind<closure,()> /checkout/src/libstd/panic.rs:361
    #12 0x559a2e21c05b in std::rt::lang_start::hc026c6b655c62503 /checkout/src/libstd/rt.rs:61
    #13 0x7f94777e34c9 in __libc_start_main (/usr/lib/libc.so.6+0x204c9)

SUMMARY: AddressSanitizer: heap-buffer-overflow /checkout/src/libcompiler_builtins/compiler-rt/lib/asan/asan_interceptors.cc:461 in __asan_memmove
Shadow bytes around the buggy address:
  0x0c087fff8bb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8bc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8bd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8be0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8bf0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c087fff8c00: fa fa 00 00 00 00[03]fa fa fa fa fa fa fa fa fa
  0x0c087fff8c10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8c20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8c30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8c40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff8c50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==6875==ABORTING

Seems to be incompatible with python and java versions of snappy

A simple working example demonstrating szip and python's python-snappy wheel are incompatible:

  • Assume a file "data.json" exists
  • In python3, write

import snappy
data = open('data.json', 'r').read()
c = snappy.compress(data.encode('UTF-8'))
open('data.json.sz', 'wb').write(c)

  • This results in a binary file on disk. Running szip -d data.json.sz yields:

$ szip -d out.json.sz
out.json.sz: snappy: corrupt input (expected stream header but got unexpected chunk type byte 148)

  • However python's snappy thinks the file is fine; the following gives the original contents right back:

import snappy
snappy.decompress(open('data.json.sz', 'rb').read())

So szip cannot interpret python's compressed output. The reverse is also true:

  • Assume a file "data.json" exists (and data.json.sz does not)
  • Run szip data.json
  • In python3, write

import snappy
data = open('data.json.sz', 'rb').read()
c = snappy.decompress(data)

to get this error:

Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python3.7/site-packages/snappy/snappy.py", line 92, in uncompress
    return _uncompress(data)
snappy.UncompressError: Error while decompressing: invalid input

I should add this isn't just a quirk of python; the behavior of the Java library I'm using ( https://github.com/xerial/snappy-java ) works fine and is interoperable with python's snappy. So this seems to be the odd one out.

I don't know much about the internals of this algorithm but maybe it's possible they're implementing different specifications of the snappy format?
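The error message above ("expected stream header") hints at a likely cause, stated here as an assumption and not as a confirmed diagnosis: szip reads and writes the Snappy frame format, while the Python call appears to produce the raw block format. A framed stream always begins with a fixed 10-byte stream identifier chunk, so the two are easy to tell apart. The function name `looks_like_snappy_frame` is hypothetical.

```rust
/// The stream identifier chunk that opens a Snappy frame-format stream:
/// chunk type 0xff, a 3-byte little-endian length of 6, then "sNaPpY".
const STREAM_IDENTIFIER: [u8; 10] =
    [0xff, 0x06, 0x00, 0x00, b's', b'N', b'a', b'P', b'p', b'Y'];

/// Returns true if `data` starts with the frame-format stream identifier.
fn looks_like_snappy_frame(data: &[u8]) -> bool {
    data.starts_with(&STREAM_IDENTIFIER)
}
```

A file whose first byte is 148, as in the szip error above, fails this check, which is consistent with it being raw-format data.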

Use std::arch for crc checking if available

There's a comment in the source already but I thought I'd open an issue for this too, now that x86 intrinsics will be available to us in stable Rust 1.27. I've got a crc function I've been using that looks something like:

fn crc32c(buf: &[u8]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse4.2") {
            return unsafe {
                crc32c_sse42(buf)
            }
        }
    }

    crc32c_slice8(buf)
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn crc32c_sse42(mut buf: &[u8]) -> u32 {
    use std::arch::x86_64::*;
    // LE::read_u32 below assumes the byteorder crate is a dependency.
    use byteorder::{ByteOrder, LE};

    let mut crc: u32 = !0;

    while buf.len() >= 4 {
        let b = LE::read_u32(&buf[0..4]);
        crc = _mm_crc32_u32(crc, b);

        buf = &buf[4..];
    }

    for &b in buf {
        crc = _mm_crc32_u8(crc, b);
    }

    !crc
}

With SSE4.2 support that's approximately, roughly somewhere in the ballpark of much faster than our fallback implementation.

I'm not sure yet how we'd want to approach the target_feature stuff for a library. I'm guessing a static check for SSE4.2 support would be better than a runtime check? I also excluded 32-bit because I didn't need it.
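When adding a hardware path like the one above, a slow but obviously correct reference is handy for checking that both paths agree. The following is a minimal bit-at-a-time CRC-32C sketch. The name `crc32c_bitwise` is hypothetical, and this is not the crate's fallback implementation.

```rust
/// Bit-at-a-time CRC-32C (Castagnoli), using the reflected polynomial
/// 0x82F63B78. Slow, but useful as a reference for faster versions.
fn crc32c_bitwise(buf: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in buf {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones if the low bit is set, all zeros otherwise.
            let mask = (crc & 1).wrapping_neg();
            // Shift out one bit and fold in the polynomial when it was 1.
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}
```

The standard check value for CRC-32C is 0xE3069283 for the ASCII input "123456789", which gives a quick sanity test for any accelerated version.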

Integer overflow in bounds check on 32-bit targets

On 64-bit there is no problem.

Cause of the bug:

if self.s + len > self.src.len() || self.d + len > self.dst.len() {

Code to produce panic in debug mode and segfault in release mode:

extern crate snap;

fn main()
{
    let mut decoder = snap::Decoder::new();
    let corrupt_data = generate_overflowing_compressed();
    let _decompressed = decoder.decompress_vec(corrupt_data.as_slice()).unwrap();
}

fn generate_overflowing_compressed() -> Vec<u8>
{
    let mut overflowing: Vec<u8> = Vec::new();

    // Header; output size doesn't matter
    write_varu64_to_vec(&mut overflowing, 1234);

    // We need one valid literal to trigger segfault
    overflowing.push(0x00);
    overflowing.push(0x00);

    // Invalid literal with length 2^32-1
    overflowing.push(0b11_11_11_00);
    overflowing.push(0xFE); // 1 will be added to length
    overflowing.push(0xFF);
    overflowing.push(0xFF);
    overflowing.push(0xFF);

    overflowing
}

fn write_varu64_to_vec(data: &mut Vec<u8>, mut n: u64)
{
    while n >= 0b1000_0000 {
        data.push((n as u8) | 0b1000_0000);
        n >>= 7;
    }
    data.push(n as u8);
}
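On a 32-bit target `usize` is 32 bits wide, so adding a length of 2^32 - 1 to a non-zero offset wraps around in release builds and the quoted comparison passes when it should fail. A minimal sketch of an overflow-safe check follows. The function name `range_in_bounds` is hypothetical, and this is not necessarily how the crate fixed the bug.

```rust
/// Returns true if `len` bytes starting at `pos` lie within a buffer of
/// `buf_len` bytes. `checked_add` turns overflow into `None`, so a huge
/// `len` is rejected instead of wrapping around to a small end offset.
fn range_in_bounds(pos: usize, len: usize, buf_len: usize) -> bool {
    match pos.checked_add(len) {
        Some(end) => end <= buf_len,
        None => false,
    }
}
```

The original condition would then read as a negation of two such checks, one for the source buffer and one for the destination.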

Add analogue to `flate2::read::GzEncoder` that compresses in `Read`

Thank you for writing a pure Rust port of snappy! We use it heavily at Faraday for fast compression and decompression of huge files in many places.

We recently discovered a need for a "snappy framed" equivalent of the flate2::read::GzEncoder type. That is, a snappy encoder which implements Read, consuming uncompressed data and outputting compressed data.

I'm volunteering to write this and send you a PR with tests and docs, assuming you're interested. :-) Thank you as always for all your great Rust tools.
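To illustrate the shape of such a type, here is a small sketch of an encoder that sits on the `Read` side. `LengthPrefixReader` is hypothetical and uses a toy encoding (one length byte followed by the payload) in place of Snappy frame chunks; only the buffering pattern is the point.

```rust
use std::io::{self, Read};

/// Hypothetical read-side "encoder": pulls raw bytes from `inner` and
/// yields them re-packaged as blocks of one length byte plus payload.
struct LengthPrefixReader<R> {
    inner: R,
    scratch: Vec<u8>, // raw bytes read from `inner`
    encoded: Vec<u8>, // encoded block waiting to be handed out
    pos: usize,       // how much of `encoded` has been handed out
}

impl<R: Read> LengthPrefixReader<R> {
    /// `block_size` must be between 1 and 255 so the length fits in a byte.
    fn new(inner: R, block_size: usize) -> Self {
        assert!((1..=255).contains(&block_size));
        LengthPrefixReader {
            inner,
            scratch: vec![0; block_size],
            encoded: Vec::new(),
            pos: 0,
        }
    }
}

impl<R: Read> Read for LengthPrefixReader<R> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        if self.pos == self.encoded.len() {
            // Previous block fully consumed: encode the next one.
            let n = self.inner.read(&mut self.scratch)?;
            if n == 0 {
                return Ok(0); // inner reader is exhausted
            }
            self.encoded.clear();
            self.encoded.push(n as u8);
            self.encoded.extend_from_slice(&self.scratch[..n]);
            self.pos = 0;
        }
        // Hand out as much of the pending block as the caller can take.
        let n = out.len().min(self.encoded.len() - self.pos);
        out[..n].copy_from_slice(&self.encoded[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}
```

Because the adapter implements `Read`, it can be passed to anything that consumes a reader, such as `io::copy` or an HTTP upload body, without first materializing the whole compressed output.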

FrameDecoder panics on malformed input

Encountered a panic (thread 'main' panicked at 'attempt to subtract with overflow') while decoding malformed input bytes.

use snap::read::FrameDecoder;
use std::io::Read;

fn main() {
    let data = [255, 6, 0, 0, 115, 78, 97, 80, 112, 89, 0, 0, 0, 0, 38, 1, 255, 0].as_ref();
    let mut reader = FrameDecoder::new(data);
    let mut decoded = Vec::new();
    reader.read_to_end(&mut decoded).unwrap();
}

The panic occurred here:

let sn = len - 4;

After the correct stream identifier chunk, the malformed input has 4 zero bytes, which leads to len becoming 0. The unchecked subtraction on line 191 then panics.
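A checked subtraction would turn this panic into an ordinary error. The sketch below assumes the 4 being subtracted is the size of the chunk's checksum. The function name `payload_len` and the `String` error type are hypothetical, chosen only to keep the example self-contained.

```rust
/// Length of the payload that follows a 4-byte checksum in a chunk of
/// `chunk_len` bytes, or an error if the chunk is too short to hold one.
fn payload_len(chunk_len: usize) -> Result<usize, String> {
    chunk_len.checked_sub(4).ok_or_else(|| {
        format!("chunk length {} is too short for a 4-byte checksum", chunk_len)
    })
}
```

In the decoder, the error branch would map to a corrupt-input I/O error instead of unwinding.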

The stack trace:

thread 'main' panicked at 'attempt to subtract with overflow', /Users/pawan/.cargo/registry/src/github.com-1ecc6299db9ec823/snap-1.0.0/src/read.rs:191:30
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/libunwind.rs:86
   1: backtrace::backtrace::trace_unsynchronized
             at /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:78
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1063
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:204
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:224
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:470
  11: rust_begin_unwind
             at src/libstd/panicking.rs:378
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:85
  13: core::panicking::panic
             at src/libcore/panicking.rs:52
  14: <snap::read::FrameDecoder<R> as std::io::Read>::read
             at /Users/pawan/.cargo/registry/src/github.com-1ecc6299db9ec823/snap-1.0.0/src/read.rs:191
  15: std::io::read_to_end_with_reservation
             at /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/io/mod.rs:394
  16: std::io::read_to_end
             at /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/io/mod.rs:361
  17: std::io::Read::read_to_end
             at /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/io/mod.rs:659
  18: upgrade_test::main
             at src/main.rs:8
  19: std::rt::lang_start::{{closure}}
             at /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/rt.rs:67
  20: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  21: std::panicking::try::do_call
             at src/libstd/panicking.rs:303
  22: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:86
  23: std::panicking::try
             at src/libstd/panicking.rs:281
  24: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  25: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  26: std::rt::lang_start
             at /rustc/4fb7144ed159f94491249e86d5bbd033b5d60550/src/libstd/rt.rs:67
  27: upgrade_test::main

I think we can fix this by doing a checked subtraction and returning an error if there's an underflow.
I'd be happy to raise a PR if you can confirm it's an issue :)
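
A minimal sketch of the checked-subtraction idea, assuming the 4 being subtracted is the length of the checksum at the start of a data chunk. `payload_len` and `CHECKSUM_LEN` are hypothetical names for illustration, not functions or constants in snap.

```rust
use std::io;

/// Length of the checksum that prefixes the payload of a data chunk.
const CHECKSUM_LEN: usize = 4;

/// Returns the payload length of a data chunk (chunk length minus checksum),
/// or an `InvalidData` error when the chunk is too short to hold a checksum.
/// Hypothetical helper, not part of snap.
fn payload_len(len: usize) -> io::Result<usize> {
    len.checked_sub(CHECKSUM_LEN).ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::InvalidData,
            format!(
                "chunk length {} is shorter than the {}-byte checksum",
                len, CHECKSUM_LEN
            ),
        )
    })
}
```

With this shape, a chunk length of 0 (as in the input above) surfaces as an `io::Error` from `read` instead of a panic.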

Fuzzing

I was trying to use cargo fuzz on rust-snappy using the following script:

#![no_main]
extern crate libfuzzer_sys;
extern crate snap;

use std::io::{Read, Write};

#[export_name="rust_fuzzer_test_input"]
pub extern fn go(data: &[u8]) {
    let mut compressed = Vec::with_capacity(data.len());
    {
        let mut wtr = snap::Writer::new(&mut compressed);
        wtr.write_all(data).unwrap();
    }
    let mut uncompressed = Vec::with_capacity(data.len());
    {
        let mut rdr = snap::Reader::new(&compressed[..]);
        rdr.read_to_end(&mut uncompressed).unwrap();
    }
    assert!(data == &uncompressed[..]);
}

This resulted in the following output (probably a false positive):

INFO: Seed: 2122059377
INFO: Loaded 0 modules (0 guards): 
Loading corpus dir: corpus
INFO: -max_len is not provided, using 64
INFO: A corpus is not provided, starting from an empty corpus
#0	READ units: 1
=================================================================
==5253==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd15b2ee48 at pc 0x559029269b64 bp 0x7ffd15b2ee10 sp 0x7ffd15b2ee08
ACCESS of size 0 at 0x7ffd15b2ee48 thread T0
    #0 0x559029269b63  (~/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0xffb63)
    #1 0x559029189113  (~/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0x1f113)
    #2 0x5590291b1379  (~/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0x47379)
    #3 0x5590291b51fa  (~/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0x4b1fa)
    #4 0x5590291b329f  (~/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0x4929f)
    #5 0x5590293646bb  (~/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0x1fa6bb)

Address 0x7ffd15b2ee48 is located in stack of thread T0 at offset 40 in frame
    #0 0x5590292077bf  (~/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0x9d7bf)

  This frame has 3 object(s):
    [32, 40) 'arg' <== Memory access at offset 40 is inside this variable
    [64, 72) '_8'
    [96, 104) 'r'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/steinberg/src/rust-snappy/fuzz/target/x86_64-unknown-linux-gnu/debug/fuzzer_script_1+0xffb63) 
Shadow bytes around the buggy address:
  0x100022b5dd70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022b5dd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022b5dd90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022b5dda0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022b5ddb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x100022b5ddc0: 00 00 00 00 f1 f1 f1 f1 00[f2]f2 f2 00 f2 f2 f2
  0x100022b5ddd0: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022b5dde0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022b5ddf0: 00 00 00 00 f1 f1 f1 f1 00 00 00 f2 f2 f2 f2 f2
  0x100022b5de00: 00 f2 f2 f2 00 00 00 f2 f2 f2 f2 f2 00 00 00 00
  0x100022b5de10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==5253==ABORTING
MS: 0 ; base unit: 0000000000000000000000000000000000000000


artifact_prefix='artifacts/'; Test unit written to artifacts/crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
Base64: 

Example decompressor fails against Mark Twain sample

Maybe I'm doing something wrong but I get a StreamHeader panic when I try decompressing the test file.

adam@pc: ~/devel/rust-snappy master
$ RUST_BACKTRACE=1 cargo run --release --example decompress <data/Mark.Twain-Tom.Sawyer.txt.rawsnappy
   Compiling libc v0.2.48
   Compiling byteorder v1.3.1
   Compiling rand_core v0.4.0
   Compiling lazy_static v1.2.0
   Compiling rand_core v0.3.1
   Compiling rand_core v0.2.2
   Compiling snap v0.2.5 (/home/adam/devel/rust-snappy)
   Compiling rand v0.5.6
   Compiling quickcheck v0.7.2
    Finished release [optimized + debuginfo] target(s) in 5.40s
     Running `target/release/examples/decompress`
thread 'main' panicked at 'I/O operation failed: Custom { kind: Other, error: StreamHeader { byte: 216 } }', src/libcore/result.rs:1009:5
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at src/libstd/sys_common/backtrace.rs:59
             at src/libstd/panicking.rs:211
   3: std::panicking::default_hook
             at src/libstd/panicking.rs:227
   4: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:491
   5: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:398
   6: rust_begin_unwind
             at src/libstd/panicking.rs:325
   7: core::panicking::panic_fmt
             at src/libcore/panicking.rs:95
   8: core::result::unwrap_failed
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b/src/libcore/macros.rs:26
   9: decompress::main
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b/src/libcore/result.rs:835
             at examples/decompress.rs:12
  10: std::rt::lang_start::{{closure}}
             at /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b/src/libstd/rt.rs:74
  11: std::panicking::try::do_call
             at src/libstd/rt.rs:59
             at src/libstd/panicking.rs:310
  12: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:102
  13: std::rt::lang_start_internal
             at src/libstd/panicking.rs:289
             at src/libstd/panic.rs:398
             at src/libstd/rt.rs:58
  14: main
  15: __libc_start_main
  16: _start
FAIL: 101

adam@pc: ~/devel/rust-snappy master
$ cargo version
cargo 1.32.0 (8610973aa 2019-01-02)

Read trait for raw::Encoder/Decoder

Hello,

Picking up from the rabbit-hole conversation I got us into about Read for raw::Encoder/Decoder:

I'm trying to expose various compression and decompression algorithms written in Rust to Python. Most of them, including Snappy's frame format, provide encoders and decoders that implement Read or Write. This lets me give Python users the ability to pass file-like objects, bytes-like objects, or NumPy arrays, all of which implement Read/Write on the Rust side, to every compression and decompression variant. This is done by giving each variant a common entry signature like the following:

fn compress<W: Write + ?Sized, R: Read>(input: R, output: &mut W) -> Result<usize, Error>;

As mentioned in the previous thread, there are users with good use cases for an equivalent API for the Snappy raw format.
However, as we know, the raw encoder's compress and the raw decoder's decompress only accept byte slices.

I've attempted to implement my own RawEncoder/RawDecoder that implement the Read trait (here), but this seems to require a full read into a buffer before the data can be passed to snap::raw::Encoder::compress. It thus ends up slower than python-snappy, whereas before, using byte slices, it was "just as fast" at the cost of less flexibility for the Python user.

Should you find yourself unconvinced this would add value to rust-snappy, I would be very happy if you could point out any of my (perhaps many) performance mistakes when implementing it myself in RawEncoder/RawDecoder.
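
One likely reason for the full read: the Snappy raw format begins with the total uncompressed length, encoded as a little-endian base-128 varint, so the total size has to be known before the first output byte can be produced. The sketch below only illustrates that constraint; `write_varint` and `raw_header` are hypothetical helpers, not snap APIs, and no compression is performed.

```rust
use std::io::{self, Read};

/// Appends `n` as a little-endian base-128 varint: 7 bits per byte, with the
/// high bit set on every byte except the last.
fn write_varint(mut n: u64, out: &mut Vec<u8>) {
    while n >= 0x80 {
        out.push((n as u8) | 0x80);
        n >>= 7;
    }
    out.push(n as u8);
}

/// Reads all of `input`, then builds the length header that a raw stream
/// would start with. The whole input is buffered because its length is
/// needed up front. Returns `(header, buffered_input)`.
fn raw_header<R: Read>(mut input: R) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut data = Vec::new();
    input.read_to_end(&mut data)?;
    let mut header = Vec::new();
    write_varint(data.len() as u64, &mut header);
    Ok((header, data))
}
```

A length of 608, for example, encodes as the two bytes 0xe0 0x04.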

szip all files in a directory recursively?

I'm really loving szip for data-munging tasks!

I keep hitting a use-case where I need to *.szip all the files in a directory. For example, I might have a directory of hundreds of giant *.csv files output by xsv split, and I need them compressed before uploading them to S3.

I'd love to be able to run:

szip -r /path/to/dir/

...and have it do a high-performance directory walk and parallel szip.

But I admit this might be too specialized a use-case for szip, so I'm happy to go ahead and build a separate szipdir tool to handle this use case if that makes more sense. What do you think?
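
For discussion, a rough sketch of what the walk-and-parallelize part could look like using only the standard library. `collect_files` and `for_each_parallel` are hypothetical helpers; the per-file compression step is left to the caller, and nothing here reflects how szip is actually structured.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Recursively collects regular files under `dir`. Symlinks are skipped,
/// because `DirEntry::file_type` does not follow them.
fn collect_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let ty = entry.file_type()?;
        if ty.is_dir() {
            collect_files(&path, out)?;
        } else if ty.is_file() {
            out.push(path);
        }
    }
    Ok(())
}

/// Splits `files` into roughly `workers` contiguous chunks and calls `f` on
/// every path, one scoped thread per chunk.
fn for_each_parallel<F>(files: &[PathBuf], workers: usize, f: F)
where
    F: Fn(&Path) + Sync,
{
    let workers = workers.max(1);
    let chunk = ((files.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        for part in files.chunks(chunk) {
            let f = &f;
            s.spawn(move || {
                for p in part {
                    f(p.as_path());
                }
            });
        }
    });
}
```

The closure passed to `for_each_parallel` would be the place to open each file and copy it through a frame encoder into a sibling output file.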

Undefined behavior in `Decompress::read_copy`

I've found an instance of undefined behavior under the Tree Borrows model in the implementation of Decompress::read_copy. It occurs within the loop on lines 299-312 of ./src/decompress.rs.

let mut dstp = self.dst.as_mut_ptr().add(self.d);           
let mut srcp = dstp.sub(offset);
loop {
    debug_assert!(dstp >= srcp);
    let diff = (dstp as usize) - (srcp as usize);
    if diff >= 16 {
        break;
    }
    // srcp and dstp can overlap, so use ptr::copy.
    debug_assert!(self.d + 16 <= self.dst.len());
    ptr::copy(srcp, dstp, 16);
    self.d += diff as usize;
    dstp = dstp.add(diff);
}

Both dstp and srcp are derived from the field self.dst and share the same access tags. They're considered Reserved under Tree Borrows until the first call to ptr::copy, after which they transition to Active.

The field self.dst is also reborrowed immutably on each iteration. This happens in the second debug assertion, as a result of the expression self.dst.len(). From the perspective of the dstp and srcp pointers, self.dst.len() reborrows self.dst, which counts as a foreign read access. This is fine during the first iteration of the loop, because Reserved tolerates foreign reads. However, Active does not allow foreign reads, so both srcp and dstp become invalid to use during the second iteration, making the second ptr::copy undefined behavior.

This can be fixed by saving the value of self.dst.len() in a variable outside the loop and then referencing this value during iteration.
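
A standalone sketch of the proposed fix, assuming the loop can be lifted into a free function for illustration. `widen_pattern` is a hypothetical name, and the bounds checks use `assert!` so the sketch is safe to call on its own; it has not been run under Miri here.

```rust
use std::ptr;

/// Repeats the `offset` bytes that precede position `d` until the distance
/// between source and destination is at least 16, and returns the new `d`.
/// Hypothetical standalone version of the loop quoted above.
fn widen_pattern(dst: &mut [u8], mut d: usize, offset: usize) -> usize {
    assert!(offset >= 1 && offset <= d, "offset must point into written data");
    // Read the length once, before any raw pointer is derived from `dst`,
    // so the loop never touches `dst` through a fresh reborrow.
    let dst_len = dst.len();
    assert!(d <= dst_len);
    let base = dst.as_mut_ptr();
    unsafe {
        let mut dstp = base.add(d);
        let srcp = dstp.sub(offset);
        loop {
            let diff = (dstp as usize) - (srcp as usize);
            if diff >= 16 {
                break;
            }
            // Uses the saved length instead of calling `dst.len()` again.
            assert!(d + 16 <= dst_len, "not enough room for a 16-byte copy");
            // srcp and dstp can overlap, so use ptr::copy.
            ptr::copy(srcp, dstp, 16);
            d += diff;
            dstp = dstp.add(diff);
        }
    }
    d
}
```

For a 3-byte pattern at the start of a 64-byte buffer, the distance grows 3, 6, 12, 24, so the loop stops with 24 bytes of repeated pattern written.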

3 MB of test files are uploaded to crates.io

Hi, awesome crate you have here!

I noticed while using cargo-vendor that the "data" directory of this repository is uploaded to crates.io, turning a 126kB (44.1kB compressed) crate into a 3MB (1MB when compressed) crate. It doesn't appear like these test files are referenced in the uploaded crate, so it would make sense to somehow prevent them being uploaded.

Based on this documentation, it seems like adding the following to Cargo.toml would prevent these data files from being uploaded with the crate, significantly slimming down the download:

exclude = [
    "data/*"
]

If you'd like, I can open a PR with that change, but unfortunately I don't think that there's a way for me to properly test it.

Thanks!

Resetting Encoder structs

The encoder structs create a new raw::Encoder each time you instantiate them. As stated by the docs, it's recommended to save the raw::Encoder. To prevent creating new raw::Encoders for each new encoding, could this crate provide a way to reset the higher level Read and Write encoders?
Thanks!

look into failing roundtrip tests on macos

CI build failure: https://github.com/BurntSushi/rust-snappy/runs/2751122501?check_suite_focus=true

The failing tests are tests that check that this library generates the same compressed output as the reference Snappy C++ implementation. So a failing test here doesn't necessarily indicate a correctness issue, but that perhaps the reference implementation changed or got some improvements that we should investigate.

My next steps would probably be:

  • Compare the versions on macOS and Linux CI. Are they the same? If so, try to understand why the failures might be happening on macOS and not Linux. If they aren't the same, try reproducing the problem on Linux with the same Snappy version used on macOS.
  • Review the commit log for the reference Snappy C++ implementation and see if there are any relevant recentish changes.

For now, I've disabled running the cpp roundtrip tests on macOS in CI.

Unexpected Chunk Type Byte 224

Rust Edition: 2021
Snap Version 1
OS Win 10

When reading a valid Snappy-compressed byte sequence, I get the error:

snappy: corrupt input (expected stream header but got unexpected chunk type byte 224)

The byte in question is 0xe0 which should not raise the error in

Err(b) if 0x02 <= b && b <= 0x7F => {

full bytes

b"\xe0\x04\xf0c\rg~_E\x10\x8d\xc6\x06\x18\x81\xa3\x03"\xd0\x04"\xcd\x04\x08\xba\x9e\xe4\xae\x16\x10\x16\x18\x02";\n\x1bnpc_dota_hero_dragon_knight\x12\x0eArchon Forever\x18\x00 \xb7\xf9\x8b\x96\x90\x80\x80\x88\x01(\x02"8\n\x1fnpc_do\x11=\x80abyssal_underlord\x12\x07VERSACE\x18\x00 \xb6\xea\xa5\xbe\x19:\x00\x126w\x00\x80lich\x12\x14Professor Impossible\x18\x00 \xcb\xa3\x86\xb2\x11:\x08>\n 6:\x00\x98ancient_apparition\x12\x0c#2parasite#2\x18\x00 \xc3\xec\x92\x8c\x11@\x08/\n\x136@\x00<slark\x12\nLogicalAp\x01q\x14\xd3\x90\xcb\xd8\x93\x80\t\xe5\x083\n\x1c:1\x00hpirit_breaker\x12\x05Lunch\x18\x00 \x93\xcf\xc0\xa0\tf\x10\x03"2\n\x1865\x00hjuggernaut\x12\x08Alnasty-\x18\x00 \xec\xb0\x93\xc3\x114\x008>4\x00\x80tidehunter\x12\x0eThelastmailman\x18\x00 \xd6\xc1\xf8\x8f\x11:\x006>\xa3\x00Xcrystal_maiden\x12\x08kshamme!\xc0\x0c\xbd\xc1\xef\xa7\x118\x00<>r\x00\xccwindrunner\x12\x12[M] StrawberryKiwi\x18\x00 \xc4\xf3\x84\xa1\x90\x80\x80\x88\x01(\x03(\x00X\xe6\xff\x9c\x85\x06"

code

    let mut buf: Vec<u8> = vec![];
    let x = FrameDecoder::new(peek.message.reader()).read(&mut buf);
    println!("{}", x.unwrap());
    println!("{:#x?}", buf); 

struct Peek {
    tick: u32,
    message_type: u32,
    tell: u64,
    size: u32,
    message: Bytes,
    compression: bool,
}

I am able to decompress this set of bytes in Python, and I am struggling to figure out why it isn't working with this library. All help is appreciated, and if you have any other questions, please let me know.
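
For reference, a small sketch of how the Snappy framing format assigns chunk type bytes, and of the stream identifier that a framed stream must begin with. The error text says a stream header was expected, which is consistent with the input not starting with that identifier. One possible explanation, not confirmed in this report, is that the bytes are in the raw format rather than the frame format. `ChunkKind`, `classify`, and `starts_with_stream_identifier` are hypothetical names, not snap APIs.

```rust
/// Chunk categories defined by the Snappy framing format.
#[derive(Debug, PartialEq, Eq)]
enum ChunkKind {
    Compressed,          // 0x00
    Uncompressed,        // 0x01
    ReservedUnskippable, // 0x02..=0x7F
    ReservedSkippable,   // 0x80..=0xFD
    Padding,             // 0xFE
    StreamIdentifier,    // 0xFF
}

/// Maps a chunk type byte to its category.
fn classify(b: u8) -> ChunkKind {
    match b {
        0x00 => ChunkKind::Compressed,
        0x01 => ChunkKind::Uncompressed,
        0x02..=0x7F => ChunkKind::ReservedUnskippable,
        0x80..=0xFD => ChunkKind::ReservedSkippable,
        0xFE => ChunkKind::Padding,
        0xFF => ChunkKind::StreamIdentifier,
    }
}

/// Returns true if `data` begins with the 10-byte stream identifier chunk:
/// type 0xFF, a 3-byte little-endian length of 6, then the bytes "sNaPpY".
fn starts_with_stream_identifier(data: &[u8]) -> bool {
    data.starts_with(b"\xff\x06\x00\x00sNaPpY")
}
```

Under this mapping 0xe0 falls in the reserved skippable range, but a frame decoder checks for the stream identifier before it looks at any other chunk type, so the first byte is rejected either way.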
