snappy's Introduction

Snappy, a fast compressor/decompressor.

Introduction

Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. (For more information, see "Performance", below.)

Snappy has the following properties:

  • Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code. See "Performance" below.
  • Stable: Over the last few years, Snappy has compressed and decompressed petabytes of data in Google's production environment. The Snappy bitstream format is stable and will not change between versions.
  • Robust: The Snappy decompressor is designed not to crash in the face of corrupted or malicious input.
  • Free and open source software: Snappy is licensed under a BSD-type license. For more information, see the included COPYING file.

Snappy has previously been called "Zippy" in some Google presentations and the like.

Performance

Snappy is intended to be fast. On a single core of a Core i7 processor in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more. (These numbers are for the slowest inputs in our benchmark suite; others are much faster.) In our tests, Snappy usually is faster than algorithms in the same class (e.g. LZO, LZF, QuickLZ, etc.) while achieving comparable compression ratios.

Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and other already-compressed data. Similar numbers for zlib in its fastest mode are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are capable of achieving yet higher compression ratios, although usually at the expense of speed. Of course, compression ratio will vary significantly with the input.

Although Snappy should be fairly portable, it is primarily optimized for 64-bit x86-compatible processors, and may run slower in other environments. In particular:

  • Snappy uses 64-bit operations in several places to process more data at once than would otherwise be possible.
  • Snappy assumes unaligned 32 and 64-bit loads and stores are cheap. On some platforms, these must be emulated with single-byte loads and stores, which is much slower.
  • Snappy assumes little-endian throughout, and needs to byte-swap data in several places if running on a big-endian platform.
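The last two points can be illustrated with a small sketch of portable little-endian access. The helper names (LoadLE32, StoreLE32, RoundTripsAtOddOffset) are hypothetical and are not Snappy's internal functions; the sketch only shows the general technique of going through memcpy and explicit shifts so that the code is valid for unaligned addresses and gives the same result on big-endian hosts.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration; these are not Snappy's internals.

// Reads a 32-bit little-endian value from a possibly unaligned address.
// The bytes are copied out with memcpy (no alignment requirement) and then
// assembled with shifts, so the result does not depend on host endianness.
inline std::uint32_t LoadLE32(const void* p) {
  unsigned char b[4];
  std::memcpy(b, p, sizeof(b));
  return static_cast<std::uint32_t>(b[0]) |
         (static_cast<std::uint32_t>(b[1]) << 8) |
         (static_cast<std::uint32_t>(b[2]) << 16) |
         (static_cast<std::uint32_t>(b[3]) << 24);
}

// Writes a 32-bit value in little-endian byte order to a possibly
// unaligned address.
inline void StoreLE32(void* p, std::uint32_t v) {
  const unsigned char b[4] = {
      static_cast<unsigned char>(v),
      static_cast<unsigned char>(v >> 8),
      static_cast<unsigned char>(v >> 16),
      static_cast<unsigned char>(v >> 24)};
  std::memcpy(p, b, sizeof(b));
}

// Stores and reloads a value at an odd (misaligned) offset.
inline bool RoundTripsAtOddOffset(std::uint32_t v) {
  unsigned char buf[8] = {};
  StoreLE32(buf + 1, v);
  assert(buf[1] == (v & 0xff));  // least significant byte is written first
  return LoadLE32(buf + 1) == v;
}
```

On little-endian x86, optimizing compilers typically reduce such helpers to a single load or store instruction, while platforms without cheap unaligned access or with big-endian byte order pay for the extra byte handling.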

Experience has shown that even heavily tuned code can be improved. Performance optimizations, whether for 64-bit x86 or other platforms, are of course most welcome; see "Contact", below.

Building

You need the CMake version specified in CMakeLists.txt or later to build:

git submodule update --init
mkdir build
cd build && cmake ../ && make

Usage

Note that Snappy, both the implementation and the main interface, is written in C++. However, several third-party bindings to other languages are available; see the home page for more information. Also, if you want to use Snappy from C code, you can use the included C bindings in snappy-c.h.

To use Snappy from your own C++ program, include the file "snappy.h" from your calling file, and link against the compiled library.

There are many ways to call Snappy, but the simplest possible is

snappy::Compress(input.data(), input.size(), &output);

and similarly

snappy::Uncompress(input.data(), input.size(), &output);

where "input" and "output" are both instances of std::string.

There are other interfaces that are more flexible in various ways, including support for custom (non-array) input sources. See the header file for more information.

Tests and benchmarks

When you compile Snappy, the following binaries are compiled in addition to the library itself. You do not need them to use the compressor from your own library, but they are useful for Snappy development.

  • snappy_benchmark contains microbenchmarks used to tune compression and decompression performance.
  • snappy_unittests contains unit tests, verifying correctness on your machine in various scenarios.
  • snappy_test_tool can benchmark Snappy against a few other compression libraries (zlib, LZO, LZF, and QuickLZ), if they were detected at configure time. To benchmark using a given file, give the compression algorithm you want to test Snappy against (e.g. --zlib) and then a list of one or more file names on the command line.

If you want to change or optimize Snappy, please run the tests and benchmarks to verify you have not broken anything.

The testdata/ directory contains the files used by the microbenchmarks, which should provide a reasonably balanced starting point for benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they are used to verify correctness in the presence of corrupted data in the unit test.)

Contributing to the Snappy Project

In addition to the aims listed at the top of the README, Snappy explicitly supports the following:

  1. C++11
  2. Clang (gcc and MSVC are best-effort).
  3. Low level optimizations (e.g. assembly or equivalent intrinsics) for:
       • x86
       • x86-64
       • ARMv7 (32-bit)
       • ARMv8 (AArch64)
  4. Supports only the Snappy compression scheme as described in format_description.txt.
  5. CMake for building

Changes adding features or dependencies outside of the core area of focus listed above might not be accepted. If in doubt, post a message to the Snappy discussion mailing list.

We are unlikely to accept contributions to the build configuration files, such as CMakeLists.txt. We are focused on maintaining a build configuration that allows us to test that the project works in a few supported configurations inside Google. We are not currently interested in supporting other requirements, such as different operating systems, compilers, or build systems.

Contact

Snappy is distributed through GitHub. For the latest version and other information, see https://github.com/google/snappy.


snappy's Issues

snappy 1.2.0 removed __ZN6snappy11RawCompressEPKcmPcPm from libsnappy

Thank you for opening the issue tracker!

MacPorts recently updated to snappy 1.2.0 and @justinbb reported to us today that snappy 1.2.0 removes a symbol which existed in 1.1.10. The user discovered this by trying to run qgis3 after updating snappy to 1.2.0:

Symbol not found: __ZN6snappy11RawCompressEPKcmPcPm
Referenced from: <6B4C63E9-F153-35F2-B6E3-9B70D5205EB7> /opt/local/lib/libleveldb.1.23.0.dylib
Expected in:     <21F5B8CD-DE83-3E89-BC9F-C6F6CD66E239> /opt/local/lib/libsnappy.1.1.10.dylib
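For readers unfamiliar with mangled names, the symbol in the error can be decoded with the demangler that ships with GCC and Clang. The sketch below is illustrative only: the Demangle helper is hypothetical, it relies on abi::__cxa_demangle from <cxxabi.h> (not available with MSVC), and it assumes the extra leading underscore is the usual Mach-O symbol prefix.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang runtimes only)
#include <string>

// Hypothetical helper: returns the human-readable form of an Itanium-ABI
// mangled symbol, or the input unchanged if it cannot be demangled.
inline std::string Demangle(const std::string& symbol) {
  const char* name = symbol.c_str();
  // Mach-O symbol tables carry one extra leading underscore; skip it so
  // the name starts with the "_Z" prefix the demangler expects.
  if (symbol.compare(0, 3, "__Z") == 0) ++name;

  int status = 0;
  char* out = abi::__cxa_demangle(name, nullptr, nullptr, &status);
  if (status != 0) return symbol;
  assert(out != nullptr);  // a zero status comes with an allocated buffer
  std::string result(out);
  std::free(out);  // the buffer is malloc-allocated and owned by the caller
  return result;
}
```

Calling Demangle("__ZN6snappy11RawCompressEPKcmPcPm") yields snappy::RawCompress(char const*, unsigned long, char*, unsigned long*), that is, the four-argument overload taking an input pointer, an input length, an output pointer, and a pointer to the output length.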

The versioning of libsnappy from snappy 1.2.0 on macOS as installed by MacPorts was:

% otool -L /opt/local/lib/libsnappy.1.dylib
/opt/local/lib/libsnappy.1.dylib:
	/opt/local/lib/libsnappy.1.dylib (compatibility version 1.0.0, current version 1.1.10)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1200.3.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)

The library current version of 1.1.10 was a mistake corrected in #178. After applying that fix, the versioning is now:

% otool -L /opt/local/lib/libsnappy.1.dylib
/opt/local/lib/libsnappy.1.dylib:
	/opt/local/lib/libsnappy.1.dylib (compatibility version 1.0.0, current version 1.2.0)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1200.3.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)

Translating, this means that you are providing minor version 1.2.0 of major version 1 of libsnappy and it claims it is backward compatible with all minor versions of major version 1 of the library back to version 1.0.0. In other words, if I compiled a program against libsnappy major version 1 minor version 1.0.0, that program should still work if the runtime version of libsnappy is major version 1 minor version 1.2.0. But the failure to launch qgis3 due to the removed symbol shows that claim is not accurate.
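The version bookkeeping described above can be sketched in a few lines. This is a simplified model under stated assumptions, not dyld's code: it assumes the usual Mach-O convention that an X.Y.Z version is packed into 32 bits (16 bits for X, 8 each for Y and Z) and that the load-time check compares only the library's compatibility version with the compatibility version recorded in the client at link time. The names PackVersion and LoadTimeVersionOk are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Packs X.Y.Z into the assumed Mach-O layout: 16 bits X, 8 bits Y, 8 bits Z.
inline std::uint32_t PackVersion(unsigned x, unsigned y, unsigned z) {
  assert(x < 65536 && y < 256 && z < 256);
  return (static_cast<std::uint32_t>(x) << 16) |
         (static_cast<std::uint32_t>(y) << 8) |
         static_cast<std::uint32_t>(z);
}

// Simplified model of the load-time check: the library is accepted when its
// compatibility version is at least the one the client recorded when it was
// linked. The library's current version is not consulted, and no symbol is
// inspected at this stage.
inline bool LoadTimeVersionOk(std::uint32_t client_recorded_compat,
                              std::uint32_t library_compat) {
  return library_compat >= client_recorded_compat;
}
```

Under this model, a client linked against compatibility version 1.0.0 passes the check against a library that also advertises 1.0.0, whatever its current version is, so a removed symbol is only noticed later, when the symbol itself is looked up.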

Isn't it customary that removing a public symbol from a library would be accompanied by increasing its major version? In other words, snappy 1.2.0 should have included libsnappy.2.dylib not libsnappy.1.dylib. This would have clearly communicated (via a message that libsnappy.1.dylib could not be found when opening a program linking with that library) that everything linking with the library needed to be rebuilt to remove any references to the removed symbol(s).

Caveat: if __ZN6snappy11RawCompressEPKcmPcPm was a private symbol that nobody should have been using, then the above does not apply and the fault would lie with whoever used your private symbol and should be solved by them not using that private symbol.

The problem is also mentioned at apache/arrow#41058 which refers to conda-forge/snappy-feedstock#35 which points the blame for the ABI break on 766d24c.

As an alternative to increasing the major library version to indicate the compatibility break, you could reinstate the removed symbol to restore compatibility and avoid the need to increase the major library version; if possible, this would be the least disruptive solution.

I have identified inappropriate URLs within urls.10K

In "main/snappy/testdata/urls.10K", I have identified 23 instances of offending URLs (adult content), on lines 323, 493, 1484, 1485, 1637, 1741, 2874, 3128, 4090, 4858, 5145, 6418, 6470, 7131, 7319, 7337, 7683, 7838, 8741, 8834, 9134, 9779, 9837. There may be more instances of inappropriate content that I have not come across.

std::less_equal needs header <functional>

std::less_equal is defined in <functional>, which snappy.cc does not include.

I found this PR: #172. Why hasn't it been merged?

That's my error message:

[build] E:\work_e\others\snappy\snappy.cc(1233): error C2039: 'less_equal': is not a member of 'std' [E:\work_e\others\snappy\build\snappy.vcxproj]
[build]   C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\vector(17): note: see declaration of 'std'
[build] E:\work_e\others\snappy\snappy.cc(1233): error C2065: 'less_equal': undeclared identifier [E:\work_e\others\snappy\build\snappy.vcxproj]
[build] E:\work_e\others\snappy\snappy.cc(1233): error C2059: syntax error: 'const' [E:\work_e\others\snappy\build\snappy.vcxproj]
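A minimal sketch of the portable pattern follows. The helpers NotPast and InBuffer are hypothetical and are not taken from snappy.cc; the point is only that any translation unit using std::less_equal should include <functional> itself rather than rely on another standard header pulling it in, which is what appears to fail with this MSVC toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>  // declares std::less_equal

// Hypothetical helpers for illustration; not code from snappy.cc.

// Returns true when p does not lie past limit. std::less_equal provides a
// well-defined total order over pointers, which the built-in <= operator
// only guarantees for pointers into the same array.
inline bool NotPast(const char* p, const char* limit) {
  return std::less_equal<const char*>()(p, limit);
}

// Returns true when p lies within [buf, buf + n], end pointer included.
inline bool InBuffer(const char* buf, std::size_t n, const char* p) {
  assert(buf != nullptr);
  return NotPast(buf, p) && NotPast(p, buf + n);
}
```

With the explicit include in place, the code no longer depends on which other headers a given standard library implementation happens to include transitively.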

undefined symbol: typeinfo for snappy::Source

At least two dependent projects fail to link against snappy-1.2.1:

The snzip-1.0.5 project fails:

c++  -O2 -pipe -fstack-protector-strong -fno-strict-aliasing   -Wall -I/usr/local/include   -fstack-protector-strong  -L/usr/local/lib -o snzip snzip.o snzip-format.o  framing-format.o framing2-format.o  hadoop-snappy-format.o iwa-format.o  snappy-java-format.o snappy-in-java-format.o  comment-43-format.o crc32.o raw_format.o  crc32_sse4_2.o  -lsnappy
ld: error: undefined symbol: typeinfo for snappy::Source
>>> referenced by raw_format.cpp
>>>               raw_format.o:(typeinfo for snzip::FileSource)
>>> did you mean: vtable for snappy::Source
>>> defined in: /usr/local/lib/libsnappy.so

And the proxygen-2024.05.20.00 project fails:

ld: error: /usr/local/lib/libfolly.so.0.58.0-dev: undefined reference to typeinfo for snappy::Source [--no-allow-shlib-undefined]

`-fno-rtti` incompatible with UBSAN fuzzing setup

We have Apache Arrow set up on OSS-Fuzz. Arrow can optionally use Snappy to read Parquet files, but unfortunately Snappy cannot be enabled on some OSS-Fuzz builders because of incompatible compiler options.

See attempted CI run here:
https://github.com/google/oss-fuzz/actions/runs/10306444556/job/28529568675

and in particular these errors:

[1/5] Building CXX object CMakeFiles/snappy.dir/snappy-sinksource.cc.o
FAILED: CMakeFiles/snappy.dir/snappy-sinksource.cc.o 
/usr/local/bin/clang++ -DHAVE_CONFIG_H -I/work/snappy_ep-prefix/src/snappy_ep -O1 -fno-omit-frame-pointer -gline-tables-only -Wno-error=enum-constexpr-conversion -Wno-error=incompatible-function-pointer-types -Wno-error=int-conversion -Wno-error=deprecated-declarations -Wno-error=implicit-function-declaration -Wno-error=implicit-int -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=array-bounds,bool,builtin,enum,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unsigned-integer-overflow,unreachable,vla-bound,vptr -fno-sanitize-recover=array-bounds,bool,builtin,enum,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr -fsanitize=fuzzer-no-link -stdlib=libc++ -Qunused-arguments -fcolor-diagnostics -fPIC -Wall -Wextra -Werror -fno-exceptions -fno-rtti -O2 -g -DNDEBUG -ggdb   -Wno-error -std=gnu++17 -MD -MT CMakeFiles/snappy.dir/snappy-sinksource.cc.o -MF CMakeFiles/snappy.dir/snappy-sinksource.cc.o.d -o CMakeFiles/snappy.dir/snappy-sinksource.cc.o -c /work/snappy_ep-prefix/src/snappy_ep/snappy-sinksource.cc
clang++: error: invalid argument '-fsanitize=vptr' not allowed with '-fno-rtti'

[etc.]

It would be nice if Snappy didn't force -fno-rtti unconditionally.
