
kanzi-cpp's Introduction

Kanzi

Kanzi is a modern, modular, portable, expandable and efficient lossless data compressor implemented in C++.

  • modern: state-of-the-art algorithms are implemented and multi-core CPUs can take advantage of the built-in multi-threading.
  • modular: an entropy codec and a combination of transforms can be provided at runtime to best match the kind of data to compress.
  • portable: many OSes, compilers and C++ versions are supported (see below).
  • expandable: clean design with heavy use of interfaces as contracts makes integrating and expanding the code easy. No dependencies.
  • efficient: the code is optimized for efficiency (trade-off between compression ratio and speed).

Unlike most mainstream lossless data compressors, Kanzi uses a variety of compression algorithms and, as a result, supports a wider range of compression ratios. Most mainstream compressors do not take advantage of the many cores and threads available on modern CPUs (what a waste!). Kanzi is concurrent by design and uses threads to compress several blocks in parallel. It is not compatible with standard compression formats.

Kanzi is a lossless data compressor, not an archiver. It uses checksums (optional but recommended) to validate data integrity but does not have a mechanism for data recovery. It also lacks data deduplication across files. However, Kanzi generates a bitstream that is seekable (one or several consecutive blocks can be decompressed without the need for the whole bitstream to be decompressed).

For more details, see Wiki and Q&A

See how to reuse the C and C++ APIs: here
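As a rough sketch of the C++ streaming API (the constructor arguments below mirror the example code quoted in the issues section further down: entropy codec name, transform name, block size, checksum flag, job count; exact signatures may vary between Kanzi versions, so treat this as illustrative only):

#include <fstream>
#include "io/CompressedOutputStream.hpp"
#include "io/CompressedInputStream.hpp"

using namespace kanzi;

// Illustrative round trip: compress a buffer into "output.knz", then read it back.
void roundTrip(const char* data, int length) {
    {
        std::ofstream ofs("output.knz", std::ofstream::binary);
        // Entropy codec "ANS0", no transform, 4 MB blocks, no checksum, 1 job
        CompressedOutputStream cos(ofs, "ANS0", "NONE", 4 * 1024 * 1024, false, 1);
        cos.write(data, length);
        cos.close();
    }
    {
        std::ifstream ifs("output.knz", std::ifstream::binary);
        CompressedInputStream cis(ifs, 1); // 1 job
        char* out = new char[length];
        cis.read(out, length);
        cis.close();
        delete[] out;
    }
}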

There is a Java implementation available here: https://github.com/flanglet/kanzi

There is a Go implementation available here: https://github.com/flanglet/kanzi-go


Why Kanzi

There are many excellent, open-source lossless data compressors available already.

While gzip is starting to show its age, zstd and brotli are open-source, standardized and used daily by millions of people. Zstd is incredibly fast and probably the best choice in many cases. Still, there are a few scenarios where Kanzi can be a better choice:

  • gzip, lzma, brotli and zstd are all LZ-based, which means they can only reach certain compression ratios. Kanzi also makes use of BWT and CM, which can compress beyond what LZ can do.

  • These LZ-based compressors are well suited for software distribution (one compression / many decompressions) due to their fast decompression (but low compression speed at high compression ratios). There are other scenarios where compression speed is critical: when data is generated just before being compressed and consumed (one compression / one decompression) or during backups (many compressions / one decompression).

  • Kanzi has built-in customized data transforms (multimedia, utf, text, dna, ...) that can be chosen and combined at compression time to better compress specific kinds of data (see the command-line example after this list).

  • Kanzi can take advantage of the multiple cores of a modern CPU to improve performance.

  • Implementing a new transform or entropy codec (to either test an idea or improve compression ratio on specific kinds of data) is simple.
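For example, a specific transform chain and entropy codec can be requested at compression time from the command line. This is an illustrative invocation only: the set of accepted transform and entropy names, and the '+' chaining syntax, depend on the Kanzi version, so check the help output (-h) for the exact list.

./kanzi -c -i enwik8 -t TEXT+RLT -e TPAQ -j 4 -o enwik8.knz
./kanzi -d -i enwik8.knz -o enwik8.bak -j 4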

Benchmarks

Test machine:

AWS c5a.8xlarge: AMD EPYC 7R32 (32 vCPUs), 64 GB RAM

Ubuntu clang++ version 15.0.7 + tcmalloc

Ubuntu 24.04 LTS

Kanzi version 2.3.0 C++ implementation

On this machine, Kanzi uses up to 16 threads (half of the vCPUs by default).

bzip3 and zpaq use 16 threads. zstd uses 16 threads for compression and 1 for decompression; the other compressors are single-threaded.

The default block size at level 9 is 32 MB, which severely limits the number of threads in use (especially with enwik8), but all tests are performed with default values.
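The defaults can be overridden explicitly. For instance, an invocation along these lines forces a larger block size and a specific thread count (illustrative only, not the exact command used to produce the tables below):

./kanzi -c -i silesia.tar -l 9 -b 64m -j 16 -f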

silesia.tar

Download at http://sun.aei.polsl.pl/~sdeor/corpus/silesia.zip

| Compressor | Encoding (sec) | Decoding (sec) | Size |
|------------|----------------|----------------|------|
| Original | | | 211,957,760 |
| Kanzi -l 1 | 0.263 | 0.231 | 80,277,212 |
| Lz4 1.9.5 -4 | 0.321 | 0.330 | 79,912,419 |
| Zstd 1.5.6 -2 -T16 | 0.151 | 0.271 | 69,556,157 |
| Kanzi -l 2 | 0.267 | 0.253 | 68,195,845 |
| Brotli 1.1.0 -2 | 1.749 | 0.761 | 68,041,629 |
| Gzip 1.12 -9 | 20.09 | 1.403 | 67,652,449 |
| Kanzi -l 3 | 0.446 | 0.287 | 65,613,695 |
| Zstd 1.5.6 -5 -T16 | 0.356 | 0.289 | 63,131,656 |
| Kanzi -l 4 | 0.543 | 0.373 | 61,249,959 |
| Zstd 1.5.5 -9 -T16 | 0.690 | 0.278 | 59,429,335 |
| Brotli 1.1.0 -6 | 8.388 | 0.677 | 58,571,909 |
| Zstd 1.5.6 -13 -T16 | 3.244 | 0.272 | 58,041,112 |
| Brotli 1.1.0 -9 | 70.07 | 0.761 | 56,376,419 |
| Bzip2 1.0.8 -9 | 16.94 | 6.734 | 54,572,500 |
| Kanzi -l 5 | 1.627 | 0.883 | 54,039,773 |
| Zstd 1.5.6 -19 -T16 | 20.87 | 0.303 | 52,889,925 |
| Kanzi -l 6 | 2.312 | 1.227 | 49,567,817 |
| Lzma 5.4.5 -9 | 95.97 | 3.172 | 48,745,354 |
| Kanzi -l 7 | 2.686 | 2.553 | 47,520,629 |
| bzip3 1.3.2.r4-gb2d61e8 -j 16 | 2.682 | 3.221 | 47,237,088 |
| Kanzi -l 8 | 7.260 | 8.021 | 43,167,429 |
| Kanzi -l 9 | 18.99 | 21.07 | 41,497,835 |
| zpaq 7.15 -m5 -t16 | 213.8 | 213.8 | 40,050,429 |

enwik8

Download at https://mattmahoney.net/dc/enwik8.zip

Tested on Ubuntu 22.04.4 LTS, i7-7700K CPU @ 4.20GHz, 32 GB RAM, clang-15, 4 threads (default)

| Compressor | Encoding (ms) | Decoding (ms) | Size |
|------------|---------------|---------------|------|
| Original | | | 100,000,000 |
| Kanzi -l 1 | 251 | 87 | 43,746,017 |
| Kanzi -l 2 | 268 | 114 | 37,816,913 |
| Kanzi -l 3 | 512 | 175 | 33,865,383 |
| Kanzi -l 4 | 546 | 249 | 29,597,577 |
| Kanzi -l 5 | 1030 | 500 | 26,528,023 |
| Kanzi -l 6 | 1537 | 799 | 24,076,674 |
| Kanzi -l 7 | 2695 | 2045 | 22,817,373 |
| Kanzi -l 8 | 7217 | 7314 | 21,181,983 |
| Kanzi -l 9 | 11336 | 11574 | 20,035,138 |

Round-trip scores for LZ

Below is a table showing silesia.tar compressed with different LZ compressors (LZ stage only, no entropy coding) in single-threaded mode.

The efficiency score is computed as follows: score(lambda) = compTime + 2 x decompTime + 10^(-lambda) x compSize, where the times are measured in seconds and the compressed size in bytes.

A lower score is better. The best score in each column is shown in bold.
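As an illustration, here is a small standalone C++ helper (not part of Kanzi) that reproduces the score columns from the raw measurements; for the 'Kanzi 2.3 -t lz' row below, score(6) = 0.83 + 2 x 0.24 + 10^(-6) x 83355862 ≈ 84.67, which matches the table:

#include <cmath>
#include <cstdio>

// Round-trip efficiency score: lower is better.
// compTime and decompTime are in seconds, compSize is in bytes.
static double score(double compTime, double decompTime, double compSize, int lambda) {
    return compTime + 2.0 * decompTime + std::pow(10.0, -lambda) * compSize;
}

int main() {
    // Values for "Kanzi 2.3 -t lz -j 1" from the table below
    std::printf("score(5) = %.2f\n", score(0.83, 0.24, 83355862.0, 5)); // 834.87
    std::printf("score(6) = %.2f\n", score(0.83, 0.24, 83355862.0, 6)); // 84.67
    std::printf("score(7) = %.2f\n", score(0.83, 0.24, 83355862.0, 7)); // 9.65
    return 0;
}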

Tested on Ubuntu 22.04.4 LTS, i7-7700K CPU @ 4.20GHz, 32 GB RAM, clang-15

| Compressor | Encoding (sec) | Decoding (sec) | Size | Score(5) | Score(6) | Score(7) |
|------------|----------------|----------------|------|----------|----------|----------|
| FastLZ -2 | 1.85 | 0.84 | 101114153 | 1014.66 | 104.63 | 13.63 |
| Lizard 1.1.0 -11 | 0.76 | 0.24 | 93967850 | 940.91 | 95.20 | 10.63 |
| lzav | 0.52 | 0.19 | 89232384 | 893.23 | 90.14 | 9.83 |
| Lz4 1.9.5 -2 -T1 | 0.81 | 0.21 | 89208908 | 893.32 | 90.44 | 10.15 |
| Lzturbo 1.2 -11 -p0 | 1.09 | 0.34 | 88657053 | 888.35 | 90.43 | 10.64 |
| s2 -cpu 1 | 0.81 | 0.40 | 86646819 | 868.08 | 88.25 | 10.27 |
| LZ4x 1.60 -2 | 1.13 | 0.22 | 87883674 | 880.40 | 89.44 | 10.35 |
| Lizard 1.1.0 -12 | 1.48 | 0.23 | 86340434 | 865.35 | 88.29 | 10.58 |
| LZ4x 1.60 -3 | 1.36 | 0.24 | 85483806 | 856.67 | 87.32 | 10.38 |
| Kanzi 2.3 -t lz -j 1 | 0.83 | 0.24 | 83355862 | 834.87 | 84.67 | **9.65** |
| Lzturbo 1.2 -12 -p0 | 2.40 | 0.22 | 83179291 | 834.63 | 86.02 | 11.16 |
| Kanzi 2.3 -t lzx -j 1 | 1.09 | 0.22 | 81485228 | **816.39** | **83.02** | 9.68 |
| Lz4 1.9.5 -3 -T1 | 2.33 | 0.21 | 81441623 | 817.17 | 84.19 | 10.90 |

References:

FastLZ, Lizard, LZ4, S2, LZAV, LZ4x, LZTurbo

lz4@97291fc50

kanzi@af12d07f2

More benchmarks

Comprehensive lzbench benchmarks

More round-trip scores

Build Kanzi

The C++ code can be built on Windows with Visual Studio, and on Linux, macOS and Android with g++ and/or clang++. There are no dependencies. Porting to other operating systems should be straightforward.

Visual Studio 2008

Unzip the file "Kanzi_VS2008.zip" in place. The solution generates a 32-bit Windows binary. Multithreading is not supported with this version.

Visual Studio 2022

Unzip the file "Kanzi_VS2022.zip" in place. The solution generates a 64-bit Windows binary and library. Multithreading is supported with this version.

mingw-w64

Go to the source directory and run 'make clean && mingw32-make.exe kanzi'. The Makefile contains all the necessary targets. Tested successfully on Win64 with mingw-w64 g++ 8.1.0. Multithreading is supported with g++ version 5.0.0 or newer. Builds successfully with C++11, C++14, C++17.

Linux

Go to the source directory and run 'make clean && make kanzi'. The Makefile contains all the necessary targets. Builds successfully on Ubuntu with many versions of g++ and clang++. Multithreading is supported with g++ version 5.0.0 or newer. Builds successfully with C++98, C++11, C++14, C++17, C++20.

macOS

Go to the source directory and run 'make clean && make kanzi'. The Makefile contains all the necessary targets. Builds successfully on macOS with several versions of clang++. Multithreading is supported.

BSD

The Makefile uses GNU make syntax. First, make sure gmake is present (or install it: 'pkg_add gmake'). Go to the source directory and run 'gmake clean && gmake kanzi'. The Makefile contains all the necessary targets. Multithreading is supported.

Makefile targets

clean:     removes objects, libraries and binaries
kanzi:     builds the kanzi executable
lib:       builds static and dynamic libraries
test:      builds test binaries
all:       kanzi + lib + test
install:   installs libraries, headers and executable
uninstall: removes installed libraries, headers and executable
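For example, a typical Linux build and system-wide install might look like this (a sketch only; installation paths and required privileges depend on the system):

make clean
make kanzi lib
sudo make install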

Credits

Matt Mahoney, Yann Collet, Jan Ondrus, Yuta Mori, Ilya Muravyov, Neal Burns, Fabian Giesen, Jarek Duda, Ilya Grebnov

Disclaimer

Use at your own risk. Always keep a copy of your original files.

kanzi-cpp's People

Contributors

flanglet, pschichtel


kanzi-cpp's Issues

Add brunsli JPEG compressor as a transform

Hi Frederic! Do you think it might be possible for you to add a new transform for jpg files using brunsli? Especially if it can detect them inside compounded data, like a tar archive.

I'm available for testing! BTW, reporting that kanzi is working flawlessly on ARM aarch64, compiled with clang 👍👌

I can't build the binary in MSVC 2022

Hi

62 errors in project
libapi.cpp

C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include\utility(655,42): error C2440: '=': cannot convert from '_Other' to '_Ty'
with
[
_Other=int
]
and
[
_Ty=std::_Iterator_base12 *
]
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include\utility(655,42): error C2440: '=': cannot convert from '_Other' to '_Ty'
with
[
_Other=int
]
and
[
_Ty=std::_Container_proxy *
]

please help, Thanks

An error occurred using TPAQ and TPAQX

#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>
#include "types.hpp"
#include "InputStream.hpp"
#include "OutputStream.hpp"
#include "io/CompressedInputStream.hpp"
#include "io/CompressedOutputStream.hpp"

using namespace kanzi;
using namespace std;

uint64 testCompress(byte block[], uint length, string name = "FPAQ") {
    // Create an OutputStream
    OutputStream* os = new ofstream("compressed.knz", ofstream::out | ofstream::binary);

    // Create a CompressedOutputStream
    CompressedOutputStream cos(*os, name, "NONE", 4194304, false, 1);

    // Compress block
    cos.write((const char*)block, length);

    // Close CompressedOutputStream
    cos.close();

    // Get number of bytes written
    uint64 written = cos.getWritten();
    delete os;
    return written;
}

uint64 testDecompress(byte block[], uint length) {
    // Create an InputStream
    InputStream* is = new ifstream("compressed.knz", ifstream::in | ifstream::binary);

    // Create a CompressedInputStream
    CompressedInputStream cis(*is, 1);

    // Decompress block
    cis.read((char*)block, length);

    // Close CompressedInputStream
    cis.close();

    // Get number of bytes read
    uint64 read = cis.getRead();
    delete is;
    return cis.gcount();
    // return read;
}

int myhash(char* buf, int n) {
    unsigned long long res = 0, mod = 1000000007;
    for (int i = 0; i < n; ++i) {
        res = buf[i] + res * mod;
    }
    return res;
}

void test_(string name) {
    int n = 15 * 1024 * 1024;
    byte* in = new byte[n];
    for (int i = 0; i < n; ++i) in[i] = i & 255;
    testCompress(in, n, name);
    byte* out = new byte[n];
    testDecompress(out, n);
    int h_in = myhash((char *)in, n);
    int h_out = myhash((char *)out, n);
    fprintf(stderr, "%10s: %X, %X, %s\n", name.c_str(), h_in, h_out, h_in == h_out ? "true" : "false");
}

void test() {
    test_("None");
    test_("Huffman");
    test_("ANS0");
    test_("ANS1");
    test_("Range");
    test_("FPAQ");
    test_("TPAQ");
    test_("TPAQX");
    test_("CM");
}

int main() {
    test();
    return 0;
}

Hi Frederic!
I used similar code to use this library, but an error occurred with TPAQ and TPAQX, so I wrote this test program and got the following result:

      None: D1780000, D1780000, true
   Huffman: D1780000, D1780000, true
      ANS0: D1780000, D1780000, true
      ANS1: D1780000, D1780000, true
     Range: D1780000, D1780000, true
      FPAQ: D1780000, D1780000, true
      TPAQ: D1780000, 53447A22, false
     TPAQX: D1780000, 82DF5A3, false
        CM: D1780000, D1780000, true

I don't know if there is something wrong with my parameter settings that caused this to happen.

But when I run the compiled program (./kanzi) with the following parameters to compress my data, I get the correct result

./kanzi -v 5 -t NONE -e TPAQX -j 1 -c -i in -o out
./kanzi -v 5 -t NONE -e TPAQX -j 1 -d -i out -o out.bak

Block size too big?

Same 252K file with random bytes. Since the block size is way bigger than the file size, it shouldn't matter, at least I would think so.

blocksize 15m compression time < 1s
blocksize 16m compression time > 160s

This is on a VM with "only" 700 MB of memory; I cannot reproduce it on an 8 GB Linux machine.

$ ./kanzi -c -l 9 -b 15m -i in -f

Kanzi 2.0 (c) Frederic Langlet

1 file to compress

Compressing in: 257514 => 258387 (100.34%) in 998 ms
$ ./kanzi -c -l 9 -b 16m -i in -f

Kanzi 2.0 (c) Frederic Langlet

1 file to compress

Compressing in: 257514 => 258387 (100.34%) in 160.8 s

Any way to build unpack-only C-library?

Hello!
Is there any simple way (i.e. without removing a bunch of code) to build a minimal static C library that is only capable of unpacking, i.e. without the pack features?
Thank you!

Benchmark Feedback

Hi flanglet, I wanted to share my results with you, in order to see where Kanzi is positioned among some top performers.

Oh, and before I forget, is the name of that famous monkey used for your project?

Okay, crunching DNA...

D:\TEXTORAMIC_benchmarking_2019-Apr-29>dir *.kanzi

06/30/2018  02:43 PM       382,715,759 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE1.BLOCK128MB.Kanzi
06/30/2018  12:41 PM       383,311,150 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE1.Kanzi
06/30/2018  02:44 PM       255,457,674 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE2.BLOCK128MB.Kanzi
06/30/2018  12:41 PM       261,736,754 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE2.Kanzi
06/30/2018  02:45 PM       251,208,169 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE3.BLOCK128MB.Kanzi
06/30/2018  12:47 PM       257,686,441 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE3.Kanzi
06/30/2018  02:49 PM        58,966,467 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE4.BLOCK128MB.Kanzi
06/30/2018  12:50 PM       125,295,300 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE4.Kanzi
06/30/2018  02:52 PM        56,291,224 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE5.BLOCK128MB.Kanzi
06/30/2018  12:52 PM       119,932,618 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE5.Kanzi
06/30/2018  02:58 PM        55,168,123 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE6.BLOCK128MB.Kanzi
06/30/2018  12:55 PM       118,568,658 SILVA_123_SSURef_Nr99_tax_silva.fasta.MODE6.Kanzi

In the above box, I tested the default 1 MB block and 128 MB blocks; for modes 1..3 the gain is poor, why so? Maybe you decided to go after symmetry!?

D:\TEXTORAMIC_benchmarking_2019-Apr-29>DUMP_HEX_header.exe SILVA_123_SSURef_Nr99_tax_silva.fasta
First 512 bytes of: 'SILVA_123_SSURef_Nr99_tax_silva.fasta' 947,966,398 bytes:
0000  3e 48 50 34 35 31 37 34 39 2e 36 2e 31 37 39 34 20 45 75 6b 61 72 79 6f 74 61 3b 4f 70 69 73 74  >HP451749.6.1794 Eukaryota;Opist
0020  68 6f 6b 6f 6e 74 61 3b 4e 75 63 6c 65 74 6d 79 63 65 61 3b 46 75 6e 67 69 3b 44 69 6b 61 72 79  hokonta;Nucletmycea;Fungi;Dikary
0040  61 3b 42 61 73 69 64 69 6f 6d 79 63 6f 74 61 3b 50 75 63 63 69 6e 69 6f 6d 79 63 6f 74 69 6e 61  a;Basidiomycota;Pucciniomycotina
0060  3b 50 75 63 63 69 6e 69 6f 6d 79 63 65 74 65 73 3b 50 75 63 63 69 6e 69 61 6c 65 73 3b 50 75 63  ;Pucciniomycetes;Pucciniales;Puc
0080  63 69 6e 69 61 63 65 61 65 3b 50 75 63 63 69 6e 69 61 3b 50 75 63 63 69 6e 69 61 20 74 72 69 74  ciniaceae;Puccinia;Puccinia trit
00a0  69 63 69 6e 61 0a 43 43 55 47 47 55 55 47 41 55 43 43 55 47 43 43 41 47 55 41 47 55 43 41 55 41  icina.CCUGGUUGAUCCUGCCAGUAGUCAUA
00c0  55 47 43 55 55 47 55 43 55 43 41 41 41 47 41 55 55 41 41 47 43 43 41 55 47 43 41 55 47 55 43 55  UGCUUGUCUCAAAGAUUAAGCCAUGCAUGUCU
00e0  41 41 47 55 41 55 41 41 41 43 41 41 43 55 41 55 41 43 41 47 55 47 0a 41 41 41 43 55 47 43 47 41  AAGUAUAAACAACUAUACAGUG.AAACUGCGA
0100  41 55 47 47 43 55 43 41 55 55 41 41 41 55 43 41 47 55 55 41 55 41 47 55 55 55 41 55 55 55 47 41  AUGGCUCAUUAAAUCAGUUAUAGUUUAUUUGA
0120  55 47 41 55 41 43 43 55 55 41 43 55 41 43 41 55 47 47 41 55 41 41 43 55 47 55 47 47 55 41 41 55  UGAUACCUUACUACAUGGAUAACUGUGGUAAU
0140  55 43 55 41 47 41 47 0a 43 55 41 41 55 41 43 41 55 47 43 55 47 41 41 41 41 47 43 43 43 43 41 41  UCUAGAG.CUAAUACAUGCUGAAAAGCCCCAA
0160  43 43 55 55 55 47 47 41 41 47 47 47 47 55 47 55 41 55 55 55 41 55 55 41 47 41 55 41 41 41 41 41  CCUUUGGAAGGGGUGUAUUUAUUAGAUAAAAA
0180  41 43 43 41 41 55 47 47 43 55 55 55 43 47 47 47 55 43 55 43 55 55 55 47 0a 47 55 47 41 55 55 43  ACCAAUGGCUUUCGGGUCUCUUUG.GUGAUUC
01a0  41 55 41 41 55 41 41 43 55 55 43 55 43 47 41 41 55 43 47 43 41 55 47 47 43 43 55 55 47 55 47 43  AUAAUAACUUCUCGAAUCGCAUGGCCUUGUGC
01c0  43 47 47 55 47 41 55 47 43 55 55 43 41 55 55 43 41 41 41 55 41 55 43 55 47 43 43 43 55 41 55 43  CGGUGAUGCUUCAUUCAAAUAUCUGCCCUAUC
01e0  41 41 43 55 55 55 43 47 41 0a 55 47 47 55 41 47 47 41 55 41 47 41 47 47 43 43 55 41 43 43 41 55  AACUUUCGA.UGGUAGGAUAGAGGCCUACCAU

This file is taken from Kirill's SCB:
http://kirill-kryukov.com/study/naf/
https://github.com/KirillKryukov/naf
http://kirr.dyndns.org/sequence-compression-benchmark/

The used 'SILVA' corpus is here:
https://www.arb-silva.de/no_cache/download/archive/release_132/Exports/

The test machine is my slowest laptop, an i5-2430M @ 2.40GHz with 16 GB DDR3-1333, Windows 7:

D:\TEXTORAMIC_benchmarking_2019-Apr-29>lzbench173 -c4 -i1,15 -o3 -etornado,16/blosclz,9/brieflz/crush,2/csc,5/density,3/fastlz,2/gipfeli/lzo1b,999/libdeflate,1,12/lz4hc,1,12/lizard,19,29,39,49/lzf,1/lzfse/lzg,9/lzjb/lzlib,9/lzma,9/lzrw,5/lzsse2,17/lzsse4,17/lzsse8,17/lzvn/pithy,9/quicklz,3/snappy/slz_zlib,3/ucl_nrv2b,9/ucl_nrv2d,9/ucl_nrv2e,9/xpack,1,9/xz,9/yalz77,12/yappy,99/zlib,1,5,9/zling,4/shrinker/wflz/lzmat SILVA_123_SSURef_Nr99_tax_silva.fasta
lzbench 1.7.3 (64-bit Windows)   Assembled by P.Skibinski
The results sorted by column number 4:
Compressor name         Compress. Decompress.  Orig. size  Compr. size  Ratio Filename
lzlib 1.8 -9             0.57 MB/s   115 MB/s   947966398     49927392   5.27 SILVA_123_SSURef_Nr99_tax_silva.fasta
tornado 0.6a -16         0.67 MB/s   317 MB/s   947966398     51876911   5.47 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzma 16.04 -9            0.92 MB/s   219 MB/s   947966398     53482886   5.64 SILVA_123_SSURef_Nr99_tax_silva.fasta
xz 5.2.3 -9              1.01 MB/s   169 MB/s   947966398     53483925   5.64 SILVA_123_SSURef_Nr99_tax_silva.fasta
csc 2016-10-13 -5        1.62 MB/s   158 MB/s   947966398     63122428   6.66 SILVA_123_SSURef_Nr99_tax_silva.fasta
lizard 1.0 -29           0.51 MB/s  1666 MB/s   947966398     92138920   9.72 SILVA_123_SSURef_Nr99_tax_silva.fasta

Nakamichi 'Ryuugan-ditto-1TB'       1197 MB/s                 92184094        ! outside lzbench !

lizard 1.0 -49           0.49 MB/s  1672 MB/s   947966398     96151827  10.14 SILVA_123_SSURef_Nr99_tax_silva.fasta
crush 1.0 -2             0.16 MB/s   521 MB/s   947966398    120354575  12.70 SILVA_123_SSURef_Nr99_tax_silva.fasta
xpack 2016-06-02 -9      7.32 MB/s   689 MB/s   947966398    144557535  15.25 SILVA_123_SSURef_Nr99_tax_silva.fasta
zling 2016-01-10 -4        29 MB/s   256 MB/s   947966398    163033199  17.20 SILVA_123_SSURef_Nr99_tax_silva.fasta
libdeflate 0.7 -12       2.07 MB/s   737 MB/s   947966398    168227973  17.75 SILVA_123_SSURef_Nr99_tax_silva.fasta
ucl_nrv2e 1.03 -9        0.28 MB/s   341 MB/s   947966398    176282228  18.60 SILVA_123_SSURef_Nr99_tax_silva.fasta
zlib 1.2.11 -9           1.59 MB/s   335 MB/s   947966398    177147955  18.69 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzsse2 2016-05-14 -17    0.53 MB/s  3080 MB/s   947966398    177807250  18.76 SILVA_123_SSURef_Nr99_tax_silva.fasta
ucl_nrv2d 1.03 -9        0.28 MB/s   356 MB/s   947966398    178368952  18.82 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzsse4 2016-05-14 -17    0.15 MB/s  3095 MB/s   947966398    178506100  18.83 SILVA_123_SSURef_Nr99_tax_silva.fasta
ucl_nrv2b 1.03 -9        0.28 MB/s   357 MB/s   947966398    181389542  19.13 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzsse8 2016-05-14 -17    0.21 MB/s  2870 MB/s   947966398    181980090  19.20 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzg 1.0.8 -9             0.25 MB/s   752 MB/s   947966398    186464811  19.67 SILVA_123_SSURef_Nr99_tax_silva.fasta
lz4hc 1.8.0 -12          1.80 MB/s  2056 MB/s   947966398    194162877  20.48 SILVA_123_SSURef_Nr99_tax_silva.fasta
lizard 1.0 -19           1.51 MB/s  2267 MB/s   947966398    194639562  20.53 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzo1b 2.09 -999          1.79 MB/s   801 MB/s   947966398    194821390  20.55 SILVA_123_SSURef_Nr99_tax_silva.fasta
lizard 1.0 -39           1.49 MB/s  2179 MB/s   947966398    200121731  21.11 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzmat 1.01               1.83 MB/s   314 MB/s   947966398    218007177  23.00 SILVA_123_SSURef_Nr99_tax_silva.fasta
zlib 1.2.11 -5             11 MB/s   193 MB/s   947966398    244043474  25.74 SILVA_123_SSURef_Nr99_tax_silva.fasta
yalz77 2015-09-19 -12      26 MB/s   344 MB/s   947966398    245078611  25.85 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzfse 2017-03-08           39 MB/s   385 MB/s   947966398    262755056  27.72 SILVA_123_SSURef_Nr99_tax_silva.fasta
yappy 2014-03-22 -99       27 MB/s  2411 MB/s   947966398    284086695  29.97 SILVA_123_SSURef_Nr99_tax_silva.fasta
libdeflate 0.7 -1         102 MB/s   372 MB/s   947966398    295835903  31.21 SILVA_123_SSURef_Nr99_tax_silva.fasta
xpack 2016-06-02 -1        25 MB/s   296 MB/s   947966398    297319392  31.36 SILVA_123_SSURef_Nr99_tax_silva.fasta
zlib 1.2.11 -1             33 MB/s   177 MB/s   947966398    329027419  34.71 SILVA_123_SSURef_Nr99_tax_silva.fasta
brieflz 1.1.0              90 MB/s   152 MB/s   947966398    341108765  35.98 SILVA_123_SSURef_Nr99_tax_silva.fasta
quicklz 1.5.0 -3           39 MB/s   809 MB/s   947966398    378438165  39.92 SILVA_123_SSURef_Nr99_tax_silva.fasta
gipfeli 2016-07-13        143 MB/s   332 MB/s   947966398    393326740  41.49 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzrw 15-Jul-1991 -5        23 MB/s   420 MB/s   947966398    397501507  41.93 SILVA_123_SSURef_Nr99_tax_silva.fasta
pithy 2011-12-24 -9       201 MB/s   809 MB/s   947966398    408863225  43.13 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzvn 2017-03-08            27 MB/s   818 MB/s   947966398    410636374  43.32 SILVA_123_SSURef_Nr99_tax_silva.fasta
snappy 1.1.4              156 MB/s   688 MB/s   947966398    431965139  45.57 SILVA_123_SSURef_Nr99_tax_silva.fasta
density 0.12.5 beta -3    103 MB/s   326 MB/s   947966398    446112238  47.06 SILVA_123_SSURef_Nr99_tax_silva.fasta
slz_zlib 1.0.0 -3         149 MB/s   198 MB/s   947966398    491482457  51.85 SILVA_123_SSURef_Nr99_tax_silva.fasta
lz4hc 1.8.0 -1             65 MB/s  1256 MB/s   947966398    493265115  52.03 SILVA_123_SSURef_Nr99_tax_silva.fasta
fastlz 0.1 -2             215 MB/s   370 MB/s   947966398    543326925  57.31 SILVA_123_SSURef_Nr99_tax_silva.fasta
blosclz 2015-11-10 -9      92 MB/s   283 MB/s   947966398    548624289  57.87 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzf 3.6 -1                215 MB/s   426 MB/s   947966398    552336487  58.27 SILVA_123_SSURef_Nr99_tax_silva.fasta
lzjb 2010                 214 MB/s   387 MB/s   947966398    626861264  66.13 SILVA_123_SSURef_Nr99_tax_silva.fasta
wflz 2015-09-16           162 MB/s   536 MB/s   947966398    687187700  72.49 SILVA_123_SSURef_Nr99_tax_silva.fasta
shrinker 0.1               48 MB/s  5658 MB/s   947966398    943719706  99.55 SILVA_123_SSURef_Nr99_tax_silva.fasta
memcpy                   6034 MB/s  6041 MB/s   947966398    947966398 100.00 SILVA_123_SSURef_Nr99_tax_silva.fasta

Oodle 'Leviathan' impresses most this time around; second best, decompression-speed-wise, seems to be the supermonster lzturbo 39.

D:\TEXTORAMIC_benchmarking_2019-Apr-29>"turbobench_v18.05_-_build_04_May_2018" SILVA_123_SSURef_Nr99_tax_silva.fasta -ebzip2/lzlib,9d30fb273/lzham,4fb258:x4:d30/lzma,9d30:fb273:mf=bt4/oodle,89,91,95,99,111,115,119,129,139/lzturbo,19,12,10,29,22,20,39,32,30,59/brotli,11d30/trle -I3 -J31 -k1 -B2G
    41170037     4.3       0.20     423.47   brotli 11d30           SILVA_123_SSURef_Nr99_tax_silva.fasta
    41175596     4.3       0.50     270.33   lzma 9d30:fb273:mf=bt4 SILVA_123_SSURef_Nr99_tax_silva.fasta
    43226170     4.6       0.28    1289.30   lzturbo 39             SILVA_123_SSURef_Nr99_tax_silva.fasta
    44350957     4.7       4.72      24.33   lzturbo 59             SILVA_123_SSURef_Nr99_tax_silva.fasta
    46338866     4.9       0.07    1887.44   oodle 139 'Leviathan'  SILVA_123_SSURef_Nr99_tax_silva.fasta
    47401385     5.0       0.05    1124.12   oodle 129 'Hydra'      SILVA_123_SSURef_Nr99_tax_silva.fasta
    47655604     5.0       0.09    1134.04   oodle 89 'Kraken'      SILVA_123_SSURef_Nr99_tax_silva.fasta
    62538908     6.6       0.29    1567.57   lzturbo 29             SILVA_123_SSURef_Nr99_tax_silva.fasta
    64352852     6.8       0.15    1556.81   oodle 99 'Mermaid'     SILVA_123_SSURef_Nr99_tax_silva.fasta
    72130095     7.6       0.41    1501.21   oodle 95 'Mermaid'     SILVA_123_SSURef_Nr99_tax_silva.fasta
    85823872     9.1       0.16    1735.64   oodle 119 'Selkie'     SILVA_123_SSURef_Nr99_tax_silva.fasta

    92184094                       1197      Nakamichi 'Ryuugan-ditto-1TB' ! outside turbobench !

    98454437    10.4       0.46    1692.66   oodle 115 'Selkie'     SILVA_123_SSURef_Nr99_tax_silva.fasta
   129524831    13.7       6.79      21.35   bzip2                  SILVA_123_SSURef_Nr99_tax_silva.fasta
   134312319    14.2      44.34    1097.28   lzturbo 32             SILVA_123_SSURef_Nr99_tax_silva.fasta
   187031693    19.7      46.16    1814.22   lzturbo 22             SILVA_123_SSURef_Nr99_tax_silva.fasta
   194672481    20.5       0.24    2840.00   lzturbo 19             SILVA_123_SSURef_Nr99_tax_silva.fasta
   235566020    24.8      70.18    2681.77   lzturbo 12             SILVA_123_SSURef_Nr99_tax_silva.fasta
   285590842    30.1     111.84     476.37   lzturbo 30             SILVA_123_SSURef_Nr99_tax_silva.fasta
   292872859    30.9      94.61    1815.50   oodle 91 'Mermaid'     SILVA_123_SSURef_Nr99_tax_silva.fasta
   420135691    44.3     353.19     832.80   lzturbo 20             SILVA_123_SSURef_Nr99_tax_silva.fasta
   541288106    57.1     131.91    1749.63   oodle 111 'Selkie'     SILVA_123_SSURef_Nr99_tax_silva.fasta
   594113486    62.7     362.27    1454.73   lzturbo 10             SILVA_123_SSURef_Nr99_tax_silva.fasta
   891632906    94.1     115.44     985.34   trle                   SILVA_123_SSURef_Nr99_tax_silva.fasta

Notes:

  • Latest available ‘oo2core_6_win64.dll’ was used;
  • The tweak value spacespeedtradeoff=[64..1024] “tweak size vs decode time” has not been played with!
  • Oodle Levels: HyperFast = -4..-1; None = 0; SuperFast = 1; VeryFast = 2; Fast = 3; Normal = 4; Optimal1,2,3,4 = 5,6,7,8;
  • Oodle Compressors: Kraken = 8 (Default); Leviathan = 13 (Best); Mermaid = 9 (Crazy fast); Selkie = 11 (Fastest); Hydra = 12 (Tuneable composite of above).

And some strong crunchers outside RAM-2-RAM benches:

 48,764,228 SILVA_123_SSURef_Nr99_tax_silva.fasta.L9Dict1024.xz     ! "xz_v5.2.3_x64.exe" -z -k -f -9 -e -v -v --lzma2=dict=1024MiB --threads=1 SILVA_123_SSURef_Nr99_tax_silva.fasta !
 49,105,765 SILVA_123_SSURef_Nr99_tax_silva.fasta.2GB.L22.zst       ! zstd-v1.4.0-win64.exe --ultra -22 --zstd=wlog=31,clog=30,hlog=30,slog=26 SILVA_123_SSURef_Nr99_tax_silva.fasta !
 51,465,690 SILVA_123_SSURef_Nr99_tax_silva.fasta.method511.zpaq    ! "zpaq_v7.05_x64.exe" add SILVA_123_SSURef_Nr99_tax_silva.fasta.method511.zpaq SILVA_123_SSURef_Nr99_tax_silva.fasta -method 511 -threads 1 !
 63,601,401 SILVA_123_SSURef_Nr99_tax_silva.fasta.O16.PPMd_varI     ! PPMd_varI_rev2_Intel15_32bit.exe e -o16 -m256 -fSILVA_123_SSURef_Nr99_tax_silva.fasta.O16.PPMd_varI SILVA_123_SSURef_Nr99_tax_silva.fasta !
 69,514,622 SILVA_123_SSURef_Nr99_tax_silva.fasta.method211.zpaq    ! "zpaq_v7.05_x64.exe" add SILVA_123_SSURef_Nr99_tax_silva.fasta.method211.zpaq SILVA_123_SSURef_Nr99_tax_silva.fasta -method 211 -threads 1 !
 92,184,094 SILVA_123_SSURef_Nr99_tax_silva.fasta.Nakamichi
 96,851,916 SILVA_123_SSURef_Nr99_tax_silva.fasta.rar560_m5_m1g     ! rar-x64-560.exe a -m5 -ma5 -md1g SILVA_123_SSURef_Nr99_tax_silva.fasta.rar560_m5_m1g SILVA_123_SSURef_Nr99_tax_silva.fasta !
168,268,686 SILVA_123_SSURef_Nr99_tax_silva.fasta.ST6Block1024.bsc  ! "bsc_v3.1.0_x64.exe" e SILVA_123_SSURef_Nr99_tax_silva.fasta SILVA_123_SSURef_Nr99_tax_silva.fasta.ST6Block1024.bsc -b1024 -m6 -cp -Tt !
174,231,115 SILVA_123_SSURef_Nr99_tax_silva.fasta.O6.PPMd_varI      ! PPMd_varI_rev2_Intel15_32bit.exe e -o6 -m256 -fSILVA_123_SSURef_Nr99_tax_silva.fasta.O6.PPMd_varI SILVA_123_SSURef_Nr99_tax_silva.fasta !
194,978,369 SILVA_123_SSURef_Nr99_tax_silva.fasta.12.lz4            ! lz4_v1_9_0_win64.exe -12 SILVA_123_SSURef_Nr99_tax_silva.fasta SILVA_123_SSURef_Nr99_tax_silva.fasta.12.lz4 !
947,966,398 SILVA_123_SSURef_Nr99_tax_silva.fasta

And for good measure latest Zstd and LZ4:

D:\TEXTORAMIC_benchmarking_2019-Apr-29>zstd-v1.4.0-win64.exe -b1e22 -i9 --priority=rt "SILVA_123_SSURef_Nr99_tax_silva.fasta"
Note : switching to real-time priority .fasta...
 1#9_tax_silva.fasta : 947966398 -> 178911916 (5.299), 124.4 MB/s , 573.7 MB/s
Note : switching to real-time priority .fasta...
 2#9_tax_silva.fasta : 947966398 -> 197639514 (4.796), 150.4 MB/s , 461.8 MB/s
Note : switching to real-time priority .fasta...
 3#9_tax_silva.fasta : 947966398 -> 149661634 (6.334), 214.5 MB/s , 618.8 MB/s
Note : switching to real-time priority .fasta...
 4#9_tax_silva.fasta : 947966398 -> 147498216 (6.427), 196.0 MB/s , 644.2 MB/s
Note : switching to real-time priority .fasta...
 5#9_tax_silva.fasta : 947966398 -> 188131516 (5.039),  69.8 MB/s , 497.8 MB/s
Note : switching to real-time priority .fasta...
 6#9_tax_silva.fasta : 947966398 -> 170660167 (5.555),  73.9 MB/s , 549.6 MB/s
Note : switching to real-time priority .fasta...
 7#9_tax_silva.fasta : 947966398 -> 158332784 (5.987),  49.0 MB/s , 602.0 MB/s
Note : switching to real-time priority .fasta...
 8#9_tax_silva.fasta : 947966398 -> 152178033 (6.229),  36.9 MB/s , 621.6 MB/s
Note : switching to real-time priority .fasta...
 9#9_tax_silva.fasta : 947966398 -> 140101920 (6.766),  24.4 MB/s , 663.2 MB/s
Note : switching to real-time priority .fasta...
10#9_tax_silva.fasta : 947966398 -> 136449466 (6.947),  23.4 MB/s , 669.4 MB/s
Note : switching to real-time priority .fasta...
11#9_tax_silva.fasta : 947966398 -> 135240559 (7.009),  21.5 MB/s , 696.6 MB/s
Note : switching to real-time priority .fasta...
12#9_tax_silva.fasta : 947966398 -> 124191769 (7.633),  4.10 MB/s , 778.8 MB/s
Note : switching to real-time priority .fasta...
13#9_tax_silva.fasta : 947966398 -> 101866858 (9.306),  4.33 MB/s , 893.3 MB/s
Note : switching to real-time priority .fasta...
14#9_tax_silva.fasta : 947966398 ->  97726488 (9.700),  3.93 MB/s , 841.1 MB/s
Note : switching to real-time priority .fasta...
15#9_tax_silva.fasta : 947966398 ->  86286159 (10.99),  2.06 MB/s , 825.5 MB/s
Note : switching to real-time priority .fasta...
16#9_tax_silva.fasta : 947966398 ->  74265519 (12.76),  2.08 MB/s , 774.7 MB/s
Note : switching to real-time priority .fasta...
17#9_tax_silva.fasta : 947966398 ->  67723270 (14.00),  1.73 MB/s , 771.4 MB/s
Note : switching to real-time priority .fasta...
18#9_tax_silva.fasta : 947966398 ->  66641989 (14.22),  1.51 MB/s , 735.9 MB/s
Note : switching to real-time priority .fasta...
19#9_tax_silva.fasta : 947966398 ->  62603077 (15.14),  1.04 MB/s , 769.3 MB/s
Note : switching to real-time priority .fasta...
20#9_tax_silva.fasta : 947966398 ->  55306862 (17.14),  0.99 MB/s , 790.0 MB/s
Note : switching to real-time priority .fasta...
21#9_tax_silva.fasta : 947966398 ->  52169242 (18.17),  0.84 MB/s , 824.5 MB/s
Note : switching to real-time priority .fasta...
22#9_tax_silva.fasta : 947966398 ->  50823066 (18.65),  0.70 MB/s , 844.3 MB/s
D:\TEXTORAMIC_benchmarking_2019-Apr-29>lz4_v1_9_0_win64.exe -b1e12 -i9 --no-frame-crc SILVA_123_SSURef_Nr99_tax_silva.fasta
Benchmarking levels from 1 to 12
 1#9_tax_silva.fasta : 947966398 -> 429351653 (2.208), 318.9 MB/s ,1459.3 MB/s
 2#9_tax_silva.fasta : 947966398 -> 429351653 (2.208),  92.8 MB/s ,1595.4 MB/s
 3#9_tax_silva.fasta : 947966398 -> 412845449 (2.296),  13.8 MB/s ,1740.3 MB/s
 4#9_tax_silva.fasta : 947966398 -> 341175162 (2.779),  37.1 MB/s ,1963.4 MB/s
 5#9_tax_silva.fasta : 947966398 -> 292454847 (3.241),  25.2 MB/s ,2140.2 MB/s
 6#9_tax_silva.fasta : 947966398 -> 257848296 (3.676),  16.7 MB/s ,2184.8 MB/s
 7#9_tax_silva.fasta : 947966398 -> 230959312 (4.104),   3.3 MB/s ,2405.7 MB/s
 8#9_tax_silva.fasta : 947966398 -> 210632835 (4.501),   7.2 MB/s ,2488.3 MB/s
 9#9_tax_silva.fasta : 947966398 -> 200556101 (4.727),   4.4 MB/s ,2513.3 MB/s
10#9_tax_silva.fasta : 947966398 -> 213436118 (4.441),   5.7 MB/s ,2494.9 MB/s
11#9_tax_silva.fasta : 947966398 -> 194190162 (4.882),   2.6 MB/s ,2611.8 MB/s
12#9_tax_silva.fasta : 947966398 -> 194162877 (4.882),   2.3 MB/s ,2556.4 MB/s

And the console log of Nakamichi during compression:

D:\_TEXTUAL_MADNESS_bare-minimum_2019-Feb-17\TESTDATAFILES>timer64 "Nakamichi_Ryuugan-ditto-1TB_RAM_(5GB)_Intel150.exe" SILVA_123_SSURef_Nr99_tax_silva.fasta SILVA_123_SSURef_Nr99_tax_silva.fasta.Nakamichi 27 208000 E
...
Nakamichi 'Ryuugan-ditto-1TB', written by Kaze, inspired by Haruhiko Okumura sharing, based on Nobuo Ito's LZSS source, babealicious suggestion by m^2 enforced, muffinesque suggestion by Jim Dempsey enforced.
Note0: Nakamichi 'Dragoneye' is 100% FREE, licenseless that is.
Note1: Hamid Buzidi's LzTurbo ([a] FASTEST [Textual] Decompressor, Levels 19/29/39) retains kingship, his TurboBench (2017-Apr-07) proves the supremacy of LzTurbo, Turbo-Amazing!
Note2: Conor Stokes' LZSSE2 ([a] FASTEST Textual Decompressor, Level 17) is embedded, all credits along with many thanks go to him.
Note3: The matchfinder is either 'Railgun_Trolldom' (matches longer than 18, except 36 and 64) or Leprechaun's B-tree order 3.
Note4: Instead of '_mm_loadu_si128' '_mm_lddqu_si128' is used.
Note5: Maximum compression ratio is 44:1, for 704 bytes long matches within 1TB Sliding Window.
Note6: Please send me (at [email protected]) decompression results obtained on machines with fast CPU-RAM subsystems.
Note7: In this compile, clock() was replaced with time() - to counter bigtime stats misreporting.
Note8: Multi-way hashing allows each KeySize to occupy its own HASH pool, thus less RAM is in use - the LEAF is smaller.
Note9: In this revision, B-tree heuristics are in use, allowing skipping many unnecessary memmem() invocations.
NoteA: The file being compressed should be 64 bytes or longer due to Building-Blocks being in range 4..18, 36, 64.
NoteB: In this compile, the keysizes in the LEAF are not HEXed i.e. not doubled.
NoteC: In this latest (2019-Apr-24) compile, keysizes 36/64 are no longer hashed with SHA3-224, it is slow for this case.
Current priority class is REALTIME_PRIORITY_CLASS.
Allocating Source-Buffer 904 MB ...
Allocating Source-Buffer 904 MB (REVERSED) ...
Allocating Target-Buffer 936 MB ...
Allocating Verification-Buffer 904 MB ...
Leprechaun: Memory pool for B-tress is 208,000 MB.
Leprechaun: In this revision 1,024MB 10-way hash is used which results in 10 x 134,217,728 external B-Trees of order 3.
Leprechaun: In this revision, 128 passes are to be executed.
Leprechaun: Allocating HASH memory 10,737,418,305 bytes ... OK
Leprechaun: Allocating/ZEROing 218,103,808,014 bytes swap file ... OK
Leprechaun: Size of input file: 947,966,398

Leprechaun: Inserting keys/BBs of order 004 into B-trees, free RAM in B-tree pool is 00,207,988 MB; Pass #128 of 128 ... DONE; 00,000,246,906 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 006 into B-trees, free RAM in B-tree pool is 00,207,891 MB; Pass #128 of 128 ... DONE; 00,002,136,886 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 008 into B-trees, free RAM in B-tree pool is 00,207,647 MB; Pass #128 of 128 ... DONE; 00,006,517,040 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 010 into B-trees, free RAM in B-tree pool is 00,206,980 MB; Pass #128 of 128 ... DONE; 00,017,724,196 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 012 into B-trees, free RAM in B-tree pool is 00,205,190 MB; Pass #128 of 128 ... DONE; 00,042,176,707 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 014 into B-trees, free RAM in B-tree pool is 00,202,079 MB; Pass #128 of 128 ... DONE; 00,081,573,960 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 016 into B-trees, free RAM in B-tree pool is 00,197,932 MB; Pass #128 of 128 ... DONE; 00,125,366,437 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 018 into B-trees, free RAM in B-tree pool is 00,192,687 MB; Pass #128 of 128 ... DONE; 00,179,346,212 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 036 into B-trees, free RAM in B-tree pool is 00,174,474 MB; Pass #128 of 128 ... DONE; 00,273,214,316 B-trees have been rooted so far.
Leprechaun: Inserting keys/BBs of order 064 into B-trees, free RAM in B-tree pool is 00,125,820 MB; Pass #128 of 128 ... DONE; 00,391,919,211 B-trees have been rooted so far.

Leprechaun: Total Searches-n-Inserts Per Second: 10,456,803 SNIPS
Leprechaun: RAM needed to house B-trees (relative to the file being ripped): 89N = 80,513MB
Leprechaun: Total IOPS for 10,698,482,483 'freads' and 10,081,382,386 'fwrites' (of packets 170 bytes long) during loading traversing all orders: 179,076 IOPS

Compressing 947,966,398 bytes ...
|; Each rotation means 64KB are encoded; Speed: 0,000,285 B/s; Done 100%; Compression Ratio: 10.28:1; Matches(16/24/48): 1,445,744/2,822,352/1,782,082; 128[+] long matches: 1,668,715; ETA: 0.00 days
NumberOfFullLiterals (lower-the-better): 10148
Tsuyo_HEURISTIC_APPLIED_thrice_back-to-back: 0
NumberOf(Tiny)Matches[Micro]Window (4)[16B]: 265094
NumberOfMatches[Bheema]Window [128GB window]: 2474051
RAM-to-RAM performance: 285 B/s.
Compressed to 92,184,094 bytes.
Source-file-Hash(FNV1A_YoshimitsuTRIAD) = 0xf401,08aa
Target-file-Hash(FNV1A_YoshimitsuTRIAD) = 0x18e9,9edd
Decompressing 92,184,094 (being the compressed stream) bytes ...
RAM-to-RAM performance: 1151 MB/s.
Verification (input and output sizes match) OK.
Verification (input and output blocks match) OK.

Kernel  Time =433452.226 =   12%
User    Time =2349298.385 =   68%
Process Time =2782750.611 =   80%    Virtual  Memory =  34755 MB
Global  Time =3437074.672 =  100%    Physical Memory =  12161 MB

Or, 947,966,398/3,437,074 = 275 B/s compression rate on SSD Crucial MX200 250GB (DRAM DDR3-1600 512MB), and i5-2430M @2.40GHz 16GB DDR3 1333MHz, Windows 7.

Directory as input

The input option for a directory seems to compress each file individually instead of merging them all into a single file. Is compressing a whole directory into one file not supported?

Blocksize integer overflow

While 1G is the maximum documented blocksize...

$ echo "Block size" | kanzi -c -i stdin -t rolzx -b 1G -o stdout > block
$ echo "Block size" | kanzi -c -i stdin -t rolzx -b 2G -o stdout > block
Could not create the compressor: Minimum block size is 1 KB (1024 bytes), got -2147483648 bytes
$ echo "Block size" | kanzi -c -i stdin -t rolzx -b 3G -o stdout > block
Could not create the compressor: Minimum block size is 1 KB (1024 bytes), got -1073741824 bytes
$ echo "Block size" | kanzi -c -i stdin -t rolzx -b 4G -o stdout > block
Could not create the compressor: Minimum block size is 1 KB (1024 bytes), got 0 bytes
$ echo "Block size" | kanzi -c -i stdin -t rolzx -b 5G -o stdout > block

Sorry, it is again not a compression-related bug ;)

Incorrect -t -e values can lead to core dump

okay
$ ./kanzi -c -t empty -e tpaqx -i in -f -v 0
Could not create the compressor: Unknown transform type: 'EMPTY'

$ ./kanzi -c -t none -e tpaqx -i in -f -v 0

not okay
$ ./kanzi -c -t '' -e tpaqx -i in -f -v 0
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr: __pos (which is 1) > this->size() (which is 0)
Aborted (core dumped)

$ ./kanzi -c -t none -e '' -i in -f -v 0
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr: __pos (which is 1) > this->size() (which is 0)
Aborted (core dumped)

Empty file compression

Again not a big issue, but if there is "nothing to do" the output file "test.knz" should probably not be created.

$ touch test
$ ./kanzi -c -i test

Kanzi 2.0 (c) Frederic Langlet

1 file to compress

Input file test is empty ... nothing to do
$ ./kanzi -d -i test.knz -o stdout
Invalid stream type. Error code: 15

Windows binary

Dear @flanglet,
Could you be so kind as to generate an .exe for the rest of us who are mere Windows users without a compiler?

Kanzi 2.1 command line tool crashes upon use

Kanzi 2.1 crashes when entering any command (including -h) with "The application was unable to start correctly (0xc000007b). Click OK to close the application."
Version 2.0 works as expected.
