Giter Club home page Giter Club logo

lzsse's Introduction

LZSSE

LZSS designed for a branchless SSE decompression implementation.

Three variants:

  • LZSSE2, for high compression files with small literal runs.
  • LZSSE4, for a more balanced mix of literals and matches.
  • LZSSE8, for lower compression data with longer runs of matches.

All three variants have an optimal parser implementation, which uses a quite strong match finder (very similar to LzFind) combined with a Storer-Szymanski style parse. LZSSE4 and LZSSE8 have "fast" compressor implementations, which use a simple hash table based matching and a greedy parse.

Currently LZSSE8 is the recommended variant to use in the general case, as it generally performs well in most cases (and you have the option of both optimal parse and fast compression). LZSSE2 is recommended if you are only using text, especially heavily compressible text, but is slow/doesn't compress as well on less compressible data and binaries.

The code is approaching production readiness and LZSSE2 and LZSSE8 have received a reasonable amount of testing.

See these blog posts An LZ Codec Designed for SSE Decompression and Compressor Improvements and LZSSE2 vs LZSSE8 for a description of how the compression algorithm and implementation function. There are also benchmarks, but these may not be upto date (in particular the figures in the initial blog post no longer represent compression performance).

lzsse's People

Contributors

conorstokes avatar nemequ avatar turtlesimos avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

lzsse's Issues

Excellent. An attempt for a console tool.

Hi Conor,
many thanks for the undreamt performance of your LZSSE2, simply the FASTEST decompressor!

Not an issue but feedback, wish this site had [Feedback] section as well.
My wish was to have LZSSE2 in form of console tool as well, so I attempted to do so, but not as it had to be, your etude - your tool, that's the right combination, however I wanted level 17 in my textual comparisons so I just embedded LZSSE2 into my fastest (old, 1MB sliding window) Nakamichi, the result:

LZSSE2 excels at:

  • much faster decompression;
  • better compression ratio, especially when considering 64KB vs 1MB;
  • 100x faster compression.

Overall, significantly better everywhere, LZSSE2 is superior to Tengu, hands down.
In my benchmarks with Hamid's TurboBench from (Feb 21), LZSSE2 level 16 decompresses 2x faster than Nakamichi 'Goldenboy'! However with Haswell and above I expect 3x, even 4x.
For more tests (console dumps), you may see my compression logs/notes (far from finished) at:
www.sanmayce.com/Downloads/The-Last-Stand_booklet.pdf

Also, in the www.sanmayce.com/Downloads/TEXTUAL_MADNESS.zip package I made one .bat file running 12 compressors for a given file, thus giving quick look where one is ranked:

Performers:

  • LZ4 for Windows 32-bits v1.4, by Yann Collet (Sep 17 2013).
  • 7-Zip (A) 9.20, Copyright (c) 1999-2010 Igor Pavlov, 2010-11-18.
  • bsc, Block Sorting Compressor, Version 3.1.0. Copyright (c) 2009-2012 Ilya Grebnov, 8 July 2012.
  • lzturbo 1.2 Copyright (c) 2007-2014 Hamid Buzidi, Aug 11 2014.
  • zpaq v7.05 journaling archiver, compiled Apr 17 2015, http://mattmahoney.net/zpaq
  • CABARC, Microsoft (R) Cabinet Tool - Version 5.1.2600.0, Copyright (c) Microsoft Corporation.
  • Compress, version: (N)compress 4.2.4.4, compiled: Fri, Aug 23, 2013 11:56:09. Authors: Peter Jannesen, Dave Mack, Spencer W. Thomas, Jim McKie, Steve Davies, Ken Turkowski, James A. Woods, Joe Orost.
  • zstd command line interface 64-bits v0.5.1, by Yann Collet
  • xz (XZ Utils) 5.2.1, liblzma 5.2.1, XZ Utils home page: http://tukaani.org/xz/
  • brotli, Feb-10-2016 source
  • Nakamichi 'Tengu-Tsuyo', written by Kaze, based on Nobuo Ito's LZSS source, babealicious suggestion by m2 enforced, muffinesque suggestion by Jim Dempsey enforced.
  • GLZA v0.4.1, Copyright 2014-2016 Kennon Conrad
D:\TEXTUAL_MADNESS\Nakamichi_(Tengu-Tsuyo)_1MB-Sliding-Window_vs_LZSSE2\_The_Usual_Suspects>Bundle_of_11_compressors.bat Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt

03/31/2016  04:25 AM         6,225,580 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt
03/31/2016  04:25 AM         1,869,174 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.128MB.7z
                             2,388,411 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.Level17.LZSSE2
03/31/2016  04:25 AM         2,704,954 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.256MB.lzturbo12-19.lzt
03/31/2016  04:26 AM         2,438,567 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.256MB.lzturbo12-29.lzt
03/31/2016  04:26 AM         1,899,416 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.256MB.lzturbo12-39.lzt
03/31/2016  05:39 AM         1,590,341 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.glze
03/31/2016  04:27 AM         1,886,657 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.L11_W24.brotli
03/31/2016  04:26 AM         1,896,775 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.L21.zst
03/31/2016  04:25 AM         2,740,872 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.L9.lz4
03/31/2016  04:25 AM         2,258,711 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.L9.zip
03/31/2016  04:26 AM         1,881,697 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.LZX21.cab
03/31/2016  04:26 AM         6,229,861 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.method08.zpaq
03/31/2016  04:26 AM         2,272,354 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.method28.zpaq
03/31/2016  04:26 AM         1,435,525 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.method58.zpaq
03/31/2016  04:26 AM         2,380,947 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.MSZIP.cab
03/31/2016  04:25 AM         1,582,176 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.ST6Block256.bsc
03/31/2016  04:48 AM         2,577,921 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.Tengu-Tsuyo.Nakamichi
03/31/2016  04:27 AM         1,870,860 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.xz
03/31/2016  04:26 AM         2,464,941 Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.Z

Level17 gives excellent tightness and incredible decompression speed (i5-2430M @3ghz, DDR3 @666MHz):

--------------------------------------------------------------------
| decompressor            | compression size | decompression speed |
--------------------------------------------------------------------
| LZSSE2 level 17         |        2,388,411 |           2277 MB/s |
| LzTurbo v1.2 -19        |        2,704,954 |        1962.67 MB/s |
| Nakamichi 'Tengu-Tsuyo' |        2,577,921 |           1152 MB/s |
--------------------------------------------------------------------

The SSE4.1 and AVX .cod files are included (Assembly, that is), do you see register utilization/distribution as you intended?
In AVX code I see 4466 lines for LZSSE2_Decompress procedure, while the SSE4.1 amounts to 4819, how does this translate into speed, say, on Haswell?
On i5-2430M @3ghz, DDR3 @666MHz I see no speed difference, at all:

D:\Conor>Nakamichi_Tengu-Tsuyo_XMM_PREFETCH_4096_Intel_15.0_64bit_SSE41.exe alice29.txt
Nakamichi 'Tengu-Tsuyo', written by Kaze, based on Nobuo Ito's LZSS source, babealicious suggestion by m^2 enforced, muffinesque suggestion by Jim Dempsey enforced.
Note: Conor Stokes' LZSSE2(FASTEST decompressor) is embedded, all credits along with many thanks go to him.
Limitation: Uncompressed 8192 MB of filesize.
Current priority class is HIGH_PRIORITY_CLASS.
Allocating Source-Buffer 0 MB ...
Allocating Target-Buffer 32 MB ...
Compressing 152,089 bytes ...
-; Each rotation means 64KB are encoded; Done 100%
NumberOfFullLiterals (lower-the-better): 31
NumberOf(Tiny)Matches[Tiny]Window (4): 161
NumberOf(Short)Matches[Tiny]Window (8): 98
NumberOf(Medium)Matches[Tiny]Window (12): 21
RAM-to-RAM performance: 23 KB/s.
Compressed to 73,235 bytes.
LZSSE2: Compressing with LZSSE2 (level 17) 152,089 bytes ...
LZSSE2: Compressed to 56526 bytes.
LZSSE2: RAM-to-RAM performance: 8736 KB/s.
LZSSE2: Allocating Verification-Buffer 0 MB ...
LZSSE2: Decompressing 56,526 bytes ...
LZSSE2: RAM-to-RAM performance: 53 MB/s.
LZSSE2: Verification (input and output match) OK.

D:\Conor>Nakamichi_Tengu-Tsuyo_XMM_PREFETCH_4096_Intel_15.0_64bit_SSE41.exe University_of_Canterbury_The_Calgary_Corpus.tar
Nakamichi 'Tengu-Tsuyo', written by Kaze, based on Nobuo Ito's LZSS source, babealicious suggestion by m^2 enforced, muffinesque suggestion by Jim Dempsey enforced.
Note: Conor Stokes' LZSSE2(FASTEST decompressor) is embedded, all credits along with many thanks go to him.
Limitation: Uncompressed 8192 MB of filesize.
Current priority class is HIGH_PRIORITY_CLASS.
Allocating Source-Buffer 3 MB ...
Allocating Target-Buffer 35 MB ...
Compressing 3,265,536 bytes ...
/; Each rotation means 64KB are encoded; Done 100%
NumberOfFullLiterals (lower-the-better): 2966
NumberOf(Tiny)Matches[Tiny]Window (4): 5396
NumberOf(Short)Matches[Tiny]Window (8): 3633
NumberOf(Medium)Matches[Tiny]Window (12): 27429
RAM-to-RAM performance: 5 KB/s.
Compressed to 1,333,349 bytes.
LZSSE2: Compressing with LZSSE2 (level 17) 3,265,536 bytes ...
LZSSE2: Compressed to 1142536 bytes.
LZSSE2: RAM-to-RAM performance: 463 KB/s.
LZSSE2: Allocating Verification-Buffer 3 MB ...
LZSSE2: Decompressing 1,142,536 bytes ...
LZSSE2: RAM-to-RAM performance: 1089 MB/s.
LZSSE2: Verification (input and output match) OK.

D:\Conor>Nakamichi_Tengu-Tsuyo_XMM_PREFETCH_4096_Intel_15.0_64bit_SSE41.exe Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt
Nakamichi 'Tengu-Tsuyo', written by Kaze, based on Nobuo Ito's LZSS source, babealicious suggestion by m^2 enforced, muffinesque suggestion by Jim Dempsey enforced.
Note: Conor Stokes' LZSSE2(FASTEST decompressor) is embedded, all credits along with many thanks go to him.
Limitation: Uncompressed 8192 MB of filesize.
Current priority class is HIGH_PRIORITY_CLASS.
Allocating Source-Buffer 5 MB ...
Allocating Target-Buffer 37 MB ...
Compressing 6,225,580 bytes ...
-; Each rotation means 64KB are encoded; Done 100%
NumberOfFullLiterals (lower-the-better): 57
NumberOf(Tiny)Matches[Tiny]Window (4): 5321
NumberOf(Short)Matches[Tiny]Window (8): 586
NumberOf(Medium)Matches[Tiny]Window (12): 18
RAM-to-RAM performance: 5 KB/s.
Compressed to 2,577,921 bytes.
LZSSE2: Compressing with LZSSE2 (level 17) 6,225,580 bytes ...
LZSSE2: Compressed to 2388411 bytes.
LZSSE2: RAM-to-RAM performance: 1443 KB/s.
LZSSE2: Allocating Verification-Buffer 5 MB ...
LZSSE2: Decompressing 2,388,411 bytes ...
LZSSE2: RAM-to-RAM performance: 2277 MB/s.
LZSSE2: Verification (input and output match) OK.

D:\Conor>Nakamichi_Tengu-Tsuyo_XMM_PREFETCH_4096_Intel_15.0_64bit_SSE41.exe dickens
Nakamichi 'Tengu-Tsuyo', written by Kaze, based on Nobuo Ito's LZSS source, babealicious suggestion by m^2 enforced, muffinesque suggestion by Jim Dempsey enforced.
Note: Conor Stokes' LZSSE2(FASTEST decompressor) is embedded, all credits along with many thanks go to him.
Limitation: Uncompressed 8192 MB of filesize.
Current priority class is HIGH_PRIORITY_CLASS.
Allocating Source-Buffer 9 MB ...
Allocating Target-Buffer 41 MB ...
Compressing 10,192,446 bytes ...
\; Each rotation means 64KB are encoded; Done 100%
NumberOfFullLiterals (lower-the-better): 128
NumberOf(Tiny)Matches[Tiny]Window (4): 6259
NumberOf(Short)Matches[Tiny]Window (8): 712
NumberOf(Medium)Matches[Tiny]Window (12): 89
RAM-to-RAM performance: 5 KB/s.
Compressed to 3,964,930 bytes.
LZSSE2: Compressing with LZSSE2 (level 17) 10,192,446 bytes ...
LZSSE2: Compressed to 3872373 bytes.
LZSSE2: RAM-to-RAM performance: 6017 KB/s.
LZSSE2: Allocating Verification-Buffer 9 MB ...
LZSSE2: Decompressing 3,872,373 bytes ...
LZSSE2: RAM-to-RAM performance: 3692 MB/s.
LZSSE2: Verification (input and output match) OK.

And to mix Level 17 with the TurboBench' results:

D:\Conor>turbobenchs.exe alice29.txt -ezlib,9/fastlz,2/lzturbo,19,29,39/bzip2/chameleon,2/snappy_c/zstd,1,21/density,3/lz4,16/lz5,15/lzham,4/brieflz/brotli,11/crush,2/lzma,9/zpaq,2/lzf/yappy/trle/memcpy/lzsse2,1,16 -g -k0 -B2G
...
TurboBench:  - Thu Mar 31 03:24:07 2016

      C Size  ratio%     C MB/s     D MB/s   Name            File
       43206    28.4      10.65      26.32   bzip2           alice29.txt.tbb
       46514    30.6       0.48     178.72   brotli 11       alice29.txt.tbb
       48467    31.9       2.86      49.62   lzma 9          alice29.txt.tbb
       49714    32.7       4.59     360.40   zstd 21         alice29.txt.tbb
       50001    32.9       2.52     498.65   lzturbo 39      alice29.txt.tbb
       51230    33.7       1.80      90.21   lzham 4         alice29.txt.tbb
       54174    35.6      10.91     208.91   zlib 9          alice29.txt.tbb
       55992    36.8       3.79     189.17   crush 2         alice29.txt.tbb
       56526                         53 MB/s lzsse2 17       !outside TurboBench!
       56530    37.2       5.64    1810.58   lzsse2 16       alice29.txt.tbb
       59041    38.8     102.90     320.86   zstd 1          alice29.txt.tbb
       59168    38.9       5.99      51.56   zpaq 2          alice29.txt.tbb
       61582    40.5       2.57     578.29   lz5 15          alice29.txt.tbb
       62405    41.0       2.75     731.20   lzturbo 29      alice29.txt.tbb
       62982    41.4       3.15    1653.14   lzturbo 19      alice29.txt.tbb
       63667    41.9      18.05    1382.63   lz4 16          alice29.txt.tbb
       67441    44.3      51.19     147.95   brieflz         alice29.txt.tbb
       73235                       9280 MB/s Nakamichi 'TT'  !outside TurboBench!
       74370    48.9      11.61    1246.63   lzsse2 1        alice29.txt.tbb
       80799    53.1      55.61    1216.71   yappy           alice29.txt.tbb
       82930    54.5     165.49     336.48   lzf             alice29.txt.tbb
       84753    55.7     160.60     288.05   fastlz 2        alice29.txt.tbb
       88021    57.9     228.71     620.77   snappy_c        alice29.txt.tbb
       93187    61.3      45.17      95.83   density 3       alice29.txt.tbb
      102175    67.2     783.96    1134.99   chameleon 2     alice29.txt.tbb
      149618    98.4     131.00    1178.98   trle            alice29.txt.tbb
      152093   100.0   11699.15   10863.50   memcpy          alice29.txt.tbb

D:\Conor>turbobenchs.exe University_of_Canterbury_The_Calgary_Corpus.tar -ezlib,9/fastlz,2/lzturbo,19,29,39/bzip2/chameleon,2/snappy_c/zstd,1,21/density,3/lz4,16/lz5,15/lzham,4/brieflz/brotli,11/crush,2/lzma,9/zpaq,2/lzf/yappy/trle/memcpy/lzsse2,1,16 -g -k0 -B2G
...
TurboBench:  - Thu Mar 31 03:24:44 2016

      C Size  ratio%     C MB/s     D MB/s   Name            File
      851327    26.1       1.74      56.80   lzma 9          University_of_Canterbury_The_Calgary_Corpus.tar.tbb
      863894    26.5       0.38     223.57   brotli 11       University_of_Canterbury_The_Calgary_Corpus.tar.tbb
      890859    27.3      11.58      23.16   bzip2           University_of_Canterbury_The_Calgary_Corpus.tar.tbb
      902605    27.6       1.55     643.33   lzturbo 39      University_of_Canterbury_The_Calgary_Corpus.tar.tbb
      911442    27.9       2.64     408.96   zstd 21         University_of_Canterbury_The_Calgary_Corpus.tar.tbb
      913218    28.0       1.60     125.16   lzham 4         University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1022374    31.3       0.71     230.05   crush 2         University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1060310    32.5       9.42     216.94   zlib 9          University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1077695    33.0       4.89      59.12   zpaq 2          University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1109125    34.0       1.70     614.17   lz5 15          University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1114744    34.1       1.77     864.13   lzturbo 29      University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1142536                       1089 MB/s lzsse2 17       !outside TurboBench!
     1143886    35.0       6.18    1894.16   lzsse2 16       University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1188009    36.4     160.68     492.99   zstd 1          University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1231146    37.7       2.01    2177.02   lzturbo 19      University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1240210    38.0       4.05    1704.35   lz4 16          University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1298840    39.8      87.50     185.24   brieflz         University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1333349                       1152 MB/s Nakamichi 'TT'  !outside TurboBench!
     1405646    43.0      13.62    1544.72   lzsse2 1        University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1596322    48.9     203.79     411.69   lzf             University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1607581    49.2     188.43     367.82   fastlz 2        University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1610706    49.3     199.48     191.91   density 3       University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1624766    49.8      67.02    1497.95   yappy           University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     1682025    51.5     270.12     724.07   snappy_c        University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     2084324    63.8     939.99    1426.62   chameleon 2     University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     2810417    86.1     193.27    5070.71   trle            University_of_Canterbury_The_Calgary_Corpus.tar.tbb
     3265540   100.0    6557.30    5094.44   memcpy          University_of_Canterbury_The_Calgary_Corpus.tar.tbb

D:\Conor>turbobenchs.exe Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt -ezlib,9/fastlz,2/lzturbo,19,29,39/bzip2/chameleon,2/snappy_c/zstd,1,21/density,3/lz4,16/lz5,15/lzham,4/brieflz/brotli,11/crush,2/lzma,9/zpaq,2/lzf/yappy/trle/memcpy/lzsse2,1,16 -g -k0 -B2G
...
TurboBench:  - Thu Mar 31 03:26:16 2016

      C Size  ratio%     C MB/s     D MB/s   Name            File
     1757268    28.2       9.84      19.54   bzip2           Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     1871038    30.1       1.39      60.73   lzma 9          Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     1879568    30.2       0.39     242.60   brotli 11       Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     1896779    30.5       1.71     377.10   zstd 21         Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     1899349    30.5       1.40     584.89   lzturbo 39      Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     1900373    30.5       1.20     140.49   lzham 4         Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2176735    35.0       0.18     206.36   crush 2         Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2268527    36.4       3.27      58.43   zpaq 2          Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2373311    38.1       9.58     200.85   zlib 9          Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2388411                       2277 MB/s lzsse2 17       !outside TurboBench!
     2391442    38.4       5.45    1961.43   lzsse2 16       Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2432148    39.1       1.61     465.15   lz5 15          Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2438540    39.2       1.46     778.68   lzturbo 29      Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2577921                       1152 MB/s Nakamichi 'TT'  !outside TurboBench!
     2616408    42.0     137.02     435.72   zstd 1          Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2704928    43.4       1.69    1962.67   lzturbo 19      Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2735066    43.9      15.21    1617.87   lz4 16          Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     2889607    46.4      72.81     152.13   brieflz         Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     3147772    50.6     260.04     241.30   density 3       Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     3281175    52.7      13.60    1379.78   lzsse2 1        Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     3569529    57.3     998.65    1802.43   chameleon 2     Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     3598717    57.8     182.00     362.88   lzf             Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     3608159    58.0      60.16    1158.89   yappy           Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     3618356    58.1     155.23     310.36   fastlz 2        Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     3923093    63.0     232.06     581.94   snappy_c        Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     6224732   100.0     134.80    1299.16   trle            Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb
     6225584   100.0    6672.65    5997.67   memcpy          Complete_Works_of_Seneca_-_Lucius_Annaeus_Seneca.txt.tbb


D:\Conor>turbobench_singlefile_noNakamichi_2G.bat dickens

D:\Conor>turbobenchs.exe dickens -ezlib,9/fastlz,2/lzturbo,19,29,39/bzip2/chameleon,2/snappy_c/zstd,1,21/density,3/lz4,16/lz5,15/lzham,4/brieflz/brotli,11/crush,2/lzma,9/zpaq,2/lzf/yappy/trle/memcpy/lzsse2,1,16 -g -k0 -B2G
...
TurboBench:  - Thu Mar 31 03:29:44 2016

      C Size  ratio%     C MB/s     D MB/s   Name            File
     2799524    27.5      10.03      19.52   bzip2           dickens.tbb
     2830650    27.8       1.13      64.07   lzma 9          dickens.tbb
     2832113    27.8       0.37     211.91   brotli 11       dickens.tbb
     2847745    27.9       0.95     143.84   lzham 4         dickens.tbb
     2863997    28.1       1.37     297.34   zstd 21         dickens.tbb
     2864646    28.1       1.15     412.65   lzturbo 39      dickens.tbb
     3350093    32.9       0.14     230.43   crush 2         dickens.tbb
     3403391    33.4       2.99      55.74   zpaq 2          dickens.tbb
     3667758    36.0       1.15     607.63   lzturbo 29      dickens.tbb
     3755591    36.8       1.38     363.13   lz5 15          dickens.tbb
     3854739    37.8      10.07     209.30   zlib 9          dickens.tbb
     3872373                       3692 MB/s lzsse2 17       !outside TurboBench!
     3872652    38.0       5.43    1799.20   lzsse2 16       dickens.tbb
     3964930                       1152 MB/s Nakamichi 'TT'  !outside TurboBench!
     4278337    42.0     137.44     435.24   zstd 1          dickens.tbb
     4377516    42.9       1.35    2002.45   lzturbo 19      dickens.tbb
     4431261    43.5      15.18    1667.61   lz4 16          dickens.tbb
     4647463    45.6      73.97     151.79   brieflz         dickens.tbb
     4954934    48.6     251.39     239.81   density 3       dickens.tbb
     5259012    51.6      13.14    1408.38   lzsse2 1        dickens.tbb
     5827189    57.2     941.57    1758.83   chameleon 2     dickens.tbb
     5857340    57.5     178.72     360.04   lzf             dickens.tbb
     5872803    57.6      59.65    1199.39   yappy           dickens.tbb
     5995421    58.8     175.58     317.15   fastlz 2        dickens.tbb
     6337838    62.2     233.77     633.70   snappy_c        dickens.tbb
    10188455   100.0     135.68    1319.41   trle            dickens.tbb
    10192450   100.0    6390.25    4785.19   memcpy          dickens.tbb

D:\Conor>

Very strange, decompression speed differs a lot between Hamid's bench and mine, my trials are 64, with 'dickens' Intel 15.0 is 2x faster than GCC 5.3.0, or I am wrong?!
Also, no clue, why with 'alice29' my bench gives the miserable 53 MB/s whereas TurboBench reports 1810.58MB/s?! That's why told you that my knowledge is inferior, I failed to offer reliable bench. Maybe, I will change clock() with:

#if defined(_icl_mumbo_jumbo_)
// GetRDTSC() taken from strchr.com
#if defined(_M_IX86)
unsigned long long __forceinline GetRDTSC(void) {
   __asm {
      ; Flush the pipeline
      XOR eax, eax
      CPUID
      ; Get RDTSC counter in edx:eax
      RDTSC
   }
}
#elif defined(_M_X64)
unsigned long long __forceinline GetRDTSC(void) {
    return __rdtsc();
}
#else
unsigned long long __forceinline GetRDTSC(void) {
    return GetTickCount();
}
#endif
#endif

#if defined(_gcc_mumbo_jumbo_)
static __inline__ unsigned long long GetRDTSC()
{
  unsigned hi, lo;
  __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
  return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}
#endif

I still don't understand, even partially, the decompression code, yet, at first glance the code generated by Intel 15.0 is tight and makes full use of registers, no?!

Cannot say keep up the fantastic work since you made good already.

Best,
Sanmayce

Block size > input size causes mismatch

For the following program if the BLOCK_SIZE > the length of the uncompressed data then I get the wrong size returned from LZSSE2_Decompress(). If I set BLOCK_SIZE == the uncompressed data size then I get the expected result. The header indicates any block size >= the uncompressed data size is acceptable.

#include <cstddef>
#include "lzsse2/lzsse2.h"
#include <iostream>
#include <cstring>

using namespace std;

#define STARTING_DATA_SIZE 2580
#define BLOCK_SIZE (STARTING_DATA_SIZE * 2)

unsigned char starting_data[] = {
        0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0xc0,
        0x01, 0x00, 0x50, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x04,
        0x80, 0x00, 0x00, 0x40, 0x03, 0x00, 0x00, 0x04, 0x00, 0x10, 0x08, 0x01,
        0x20, 0x04, 0x20, 0x28, 0x90, 0x10, 0x44, 0x04, 0x00, 0x00, 0x08, 0x08,
        0x00, 0x80, 0x00, 0x80, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
        0x01, 0x00, 0x00, 0x40, 0x00, 0x62, 0x01, 0x04, 0x00, 0x00, 0x08, 0x08,
        0x10, 0x04, 0x00, 0x0c, 0x80, 0x04, 0x00, 0x80, 0x08, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x20, 0x00, 0x00, 0x10, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x20, 0x08, 0x00, 0x02, 0x04, 0x80, 0x02, 0x01,
        0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x08, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x10, 0x02, 0x00, 0x00,
        0x00, 0x90, 0x08, 0x00, 0x21, 0x40, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x88, 0x00, 0x00, 0x00, 0x80, 0x00, 0x02, 0x00, 0x01, 0x00, 0x00, 0x40,
        0x00, 0x00, 0x10, 0x10, 0x40, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x02,
        0x00, 0x00, 0x00, 0x28, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x01,
        0x00, 0x00, 0x10, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x06, 0x00, 0x40, 0x42, 0x00, 0x00, 0x01, 0x00, 0x20, 0x00, 0x09,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x91, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x30, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x40, 0x18, 0x00, 0x00, 0x20, 0x00, 0x80, 0x00, 0x04,
        0x80, 0x01, 0x04, 0x00, 0x00, 0x04, 0x02, 0x04, 0x00, 0x20, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x7c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x01, 0x00, 0x00, 0x40, 0x01, 0x02,
        0x00, 0x00, 0x00, 0x02, 0x08, 0x04, 0x80, 0x02, 0x00, 0x00, 0x03, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x08, 0x01, 0x20, 0x04, 0x00, 0x08, 0x00, 0x04,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x80, 0x00, 0x10,
        0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x01, 0x05, 0x00, 0x00, 0x80, 0x62,
        0x00, 0x04, 0x00, 0x00, 0x00, 0x80, 0x10, 0x00, 0x00, 0x00, 0x00, 0x04,
        0x00, 0x00, 0x08, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00,
        0x00, 0x20, 0x00, 0x80, 0x90, 0x01, 0x00, 0x00, 0x00, 0x00, 0x20, 0x08,
        0x04, 0x00, 0x04, 0x00, 0x02, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x08, 0x02, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x30, 0x02, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x01, 0x40,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x88, 0x10, 0x00, 0x00, 0x88, 0x10,
        0x00, 0x20, 0x00, 0x00, 0x00, 0x40, 0x00, 0x01, 0x00, 0x02, 0x24, 0x00,
        0x40, 0x00, 0x00, 0x00, 0x00, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x10, 0x00, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x42, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x80,
        0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x20, 0x00,
        0x20, 0x00, 0x00, 0x00, 0x02, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x80, 0x00, 0x01, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x02, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x5d, 0x00,
        0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0xc0,
        0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x04,
        0x80, 0x00, 0x00, 0x40, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x01,
        0x20, 0x04, 0x20, 0x28, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x08,
        0x00, 0x80, 0x00, 0x80, 0x00, 0x10, 0x00, 0x40, 0x00, 0x00, 0x80, 0x00,
        0x01, 0x00, 0x00, 0x10, 0x00, 0x62, 0x00, 0x04, 0x00, 0x00, 0x08, 0x00,
        0x10, 0x00, 0x10, 0x00, 0x00, 0x04, 0x40, 0x02, 0x08, 0x00, 0x00, 0x00,
        0x08, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x20, 0x00, 0x02, 0x10, 0x00,
        0x00, 0x00, 0x40, 0x00, 0x20, 0x08, 0x00, 0x02, 0x04, 0x82, 0x02, 0x00,
        0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x20, 0x10, 0x00, 0x00, 0x08, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x08, 0x84, 0x00, 0x00, 0x10, 0x02, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x21, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x48, 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x04,
        0x00, 0x00, 0x10, 0x00, 0x40, 0x40, 0x40, 0x00, 0x00, 0x00, 0x00, 0x03,
        0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x10, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x42, 0x00, 0x00, 0x01, 0x10, 0x00, 0x00, 0x01,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x01, 0x02, 0x20, 0x00, 0x00, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x04, 0x18, 0x00, 0x00, 0x20, 0x00, 0x90, 0x80, 0x00,
        0x00, 0x01, 0x04, 0x20, 0x00, 0x04, 0x12, 0x00, 0x02, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x6b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00,
        0x00, 0x01, 0x00, 0x00, 0x08, 0x84, 0x80, 0x00, 0x00, 0x00, 0x03, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x08, 0x01, 0x20, 0x04, 0x00, 0x08, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x80, 0x00, 0x10,
        0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x62,
        0x00, 0x04, 0x80, 0x00, 0x08, 0x00, 0x10, 0x00, 0x00, 0x40, 0x00, 0x04,
        0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x10, 0x40, 0x00, 0x00, 0x00, 0x00, 0x20, 0x08,
        0x00, 0x00, 0x04, 0x00, 0x02, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x88, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x10, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x80, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
        0x40, 0x00, 0x20, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x42, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80,
        0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x20, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00,
        0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3f, 0x00,
        0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xc0,
        0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x04,
        0x80, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x08, 0x00, 0x00, 0x08, 0x01,
        0x28, 0x04, 0x00, 0x28, 0x00, 0x14, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x80, 0x00, 0x80, 0x09, 0x10, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
        0x01, 0x00, 0x01, 0x00, 0x00, 0x62, 0x00, 0x04, 0x00, 0x00, 0x08, 0x80,
        0x10, 0x00, 0x00, 0x80, 0x00, 0x04, 0x00, 0x00, 0x08, 0x00, 0x08, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00,
        0x04, 0x10, 0x00, 0x00, 0x20, 0x08, 0x00, 0x00, 0x04, 0x00, 0x02, 0x00,
        0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x04, 0x00, 0x10, 0x00, 0x08, 0x00,
        0x00, 0x24, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x02, 0x00, 0x00,
        0x01, 0x00, 0x04, 0x00, 0x01, 0x00, 0x00, 0x08, 0x00, 0x00, 0x08, 0x00,
        0x08, 0x00, 0x00, 0x02, 0x80, 0x00, 0x02, 0x00, 0x01, 0x00, 0x00, 0x00,
        0x04, 0x00, 0x10, 0x00, 0x40, 0x00, 0x40, 0x00, 0x10, 0x00, 0x00, 0x02,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x10, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00,
        0x00, 0x04, 0x00, 0x01, 0x42, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x80, 0x00, 0x00,
        0x00, 0x04, 0x04, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x02, 0x00, 0x58, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00,
        0x00, 0x40, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x08, 0x04, 0x80, 0x00, 0x40, 0x00, 0x03, 0x00,
        0x00, 0x00, 0x00, 0x08, 0x08, 0x01, 0x20, 0x04, 0x00, 0x18, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x40, 0x80, 0x00, 0x10,
        0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 0x62,
        0x00, 0x04, 0x00, 0x00, 0x08, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x04,
        0x00, 0x00, 0x09, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x80,
        0x00, 0x00, 0x00, 0x02, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x08,
        0x00, 0x00, 0x04, 0x00, 0x02, 0x00, 0x10, 0x00, 0x02, 0x00, 0x80, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x08, 0x10, 0x00, 0x04, 0x00, 0x04, 0x00, 0x00,
        0x00, 0x00, 0x10, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0c, 0x80, 0x00, 0x00, 0x80, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
        0x40, 0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x08, 0x00, 0x08, 0x02, 0x00,
        0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x42, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80,
        0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00,
        0x00, 0x00, 0x00, 0x80, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04,
        0x00, 0x00, 0x00, 0x80, 0x40, 0x00, 0x00, 0x00, 0x04, 0x00, 0x04, 0x00,
        0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x55, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0xc0,
        0x00, 0x00, 0x40, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x04,
        0x80, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x01,
        0x20, 0x04, 0x00, 0x28, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x40, 0x00,
        0x00, 0x80, 0x00, 0x80, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
        0x01, 0x00, 0x00, 0x00, 0x00, 0x62, 0x00, 0x0c, 0x00, 0x00, 0x08, 0x00,
        0x10, 0x00, 0x40, 0x04, 0x00, 0x04, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x20, 0x02, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x20, 0x08, 0x00, 0x00, 0x06, 0x00, 0x02, 0x00,
        0x00, 0x00, 0x02, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x08,
        0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x02, 0x80, 0x01,
        0x08, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x08, 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x01, 0x40, 0x00, 0x00,
        0x08, 0x00, 0x10, 0x00, 0x40, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x02,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x80, 0x00, 0x10, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x42, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x01, 0x80, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x20, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x00,
        0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x02, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x4f, 0x00, 0x00, 0x80, 0x10, 0x08, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x01, 0x82,
        0x00, 0x00, 0x00, 0x02, 0x08, 0x04, 0x80, 0x00, 0x00, 0x00, 0x03, 0x00,
        0x00, 0x08, 0x00, 0x00, 0x08, 0x01, 0x20, 0x04, 0x00, 0x08, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x80, 0x00, 0x80, 0x08, 0x10,
        0x00, 0x00, 0x20, 0x00, 0x80, 0x00, 0x01, 0x04, 0x00, 0x00, 0x00, 0x62,
        0x00, 0x04, 0x00, 0x00, 0x00, 0x80, 0x14, 0x00, 0x00, 0x00, 0x00, 0x04,
        0x00, 0x02, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00,
        0x00, 0x21, 0x04, 0x00, 0x90, 0x01, 0x00, 0x00, 0x00, 0x00, 0x20, 0x08,
        0x04, 0x00, 0x04, 0x00, 0x02, 0x00, 0x00, 0x40, 0x02, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x20, 0x00, 0x08, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x10, 0x02, 0x00, 0x04, 0x00, 0x02, 0x08, 0x80, 0x01, 0x40,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x10, 0x00, 0x00, 0x80, 0x10,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x01, 0x00, 0x00, 0x04, 0x04,
        0x40, 0x00, 0x20, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x10, 0x00, 0x00, 0x30,
        0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x42, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x80,
        0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x80,
        0x20, 0x00, 0x00, 0x20, 0x02, 0x20, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00,
        0x02, 0x00, 0x00, 0x80, 0x00, 0x01, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x02, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x64, 0x00,
        0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x40, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x08, 0x04,
        0x80, 0x00, 0x00, 0x00, 0x03, 0x40, 0x00, 0x08, 0x00, 0x01, 0x08, 0x01,
        0x60, 0x0c, 0x00, 0x08, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x80, 0x00, 0x80, 0x08, 0x10, 0x00, 0x00, 0x00, 0x00, 0x90, 0x00,
        0x01, 0x04, 0x00, 0x00, 0x00, 0x62, 0x00, 0x04, 0x80, 0x00, 0x08, 0x00,
        0x14, 0x00, 0x00, 0x40, 0x00, 0x04, 0x00, 0x02, 0x08, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x21, 0x00, 0x00, 0x10, 0x00,
        0x00, 0x08, 0x00, 0x00, 0x20, 0x08, 0x00, 0x00, 0x04, 0x00, 0x02, 0x00,
        0x00, 0x40, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x08, 0x00,
        0x00, 0x24, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x02, 0x00, 0x00,
        0x20, 0x02, 0x00, 0x80, 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x20, 0x00,
        0x08, 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x40,
        0x00, 0x00, 0x10, 0x00, 0x00, 0x04, 0x40, 0x00, 0x20, 0x00, 0x00, 0x02,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x10, 0x00, 0x00, 0x30, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00,
        0x00, 0x04, 0x00, 0x00, 0x42, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x01,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x01, 0x02, 0x00, 0x00, 0x00, 0x40,
        0x08, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x20, 0x00, 0x20, 0x20, 0x20,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x80, 0x00, 0x00,
        0x80, 0x00, 0x06, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x61, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x40, 0x00, 0x00,
        0x00, 0x01, 0x00, 0x00, 0x08, 0x04, 0x80, 0x00, 0x00, 0x00, 0x03, 0x00,
        0x00, 0x11, 0x00, 0x01, 0x0a, 0x01, 0x20, 0x04, 0x00, 0x08, 0x00, 0x00,
        0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x80, 0x04, 0x10,
        0x00, 0x00, 0x00, 0x00, 0x80, 0x00, 0x01, 0x04, 0x00, 0x00, 0x00, 0x62,
        0x00, 0x04, 0x84, 0x00, 0x08, 0x00, 0x10, 0x00, 0x00, 0x40, 0x00, 0x04,
        0x00, 0x00, 0x08, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x20, 0x00,
        0x00, 0x20, 0x00, 0x00, 0x12, 0x00, 0x02, 0x08, 0x00, 0x02, 0x20, 0x08,
        0x00, 0x02, 0x04, 0x00, 0x02, 0x00, 0x00, 0x00, 0x02, 0x00, 0x80, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x10,
        0x08, 0x00, 0x10, 0x02, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x01, 0x00,
        0x00, 0x20, 0x04, 0x00, 0x00, 0x08, 0x08, 0x00, 0x00, 0x00, 0x80, 0x00,
        0x04, 0x00, 0x40, 0x08, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
        0x40, 0x00, 0x01, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x30, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x20, 0x10, 0x00, 0x00, 0x20,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x42, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80,
        0x01, 0x03, 0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x20, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x80, 0x10, 0x04, 0x00, 0x00, 0x00,
        0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x5e, 0x00
};

int main(int argc, char *argv[]) {
    char *buffer = (char*) malloc(BLOCK_SIZE);
    char *compressed_buffer = (char*) malloc(BLOCK_SIZE);

    LZSSE2_OptimalParseState *state = LZSSE2_MakeOptimalParseState(static_cast< size_t >( BLOCK_SIZE));
    const auto size = static_cast<const uint32_t>(LZSSE2_CompressOptimalParse(state, starting_data, STARTING_DATA_SIZE, compressed_buffer, BLOCK_SIZE, 16));
    cerr << "compressed size = " << size << endl;
    char *uncompressed_buffer = (char*) malloc(BLOCK_SIZE);
    const size_t uncompressed_size = LZSSE2_Decompress(compressed_buffer, size, uncompressed_buffer, BLOCK_SIZE);
    cerr << "uncompressed_size = " << uncompressed_size << endl;

    if(uncompressed_size != STARTING_DATA_SIZE) {
        cerr << "sizes don't match" << endl;
    }

    LZSSE2_FreeOptimalParseState(state);
}

LZSSE writes outside buffer for very short data

I've started fuzzing the decompressor, and it found this issue pretty much instantly.

Here is the program I'm using. It writes the uncompressed size at the beginning of the files as a size_t; for the test archives to work size_t must be 8 (though changing a few instances of size_ts to int64_t should make them work elsewhere).

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#include "lzsse2.h"

static void
print_help(const char* program_name) {
  printf("Usage: %s c|d\n", program_name);
  printf("Compress or decompress from stdin to stdout\n");
}

#define IO_BLOCK_SIZE ((size_t) (1024 * 1024))

int
main(int argc, char** argv) {
  bool compress;

  if (argc != 2) {
    print_help(argv[1]);
    return EXIT_SUCCESS;
  }

  compress = *(argv[1]) == 'c';

  uint8_t* input = (uint8_t*) malloc(IO_BLOCK_SIZE);
  size_t input_length = 0;

  while (!feof(stdin)) {
    const size_t bytes_read = fread(input + input_length, 1, IO_BLOCK_SIZE, stdin);
    input_length += bytes_read;
    input = (uint8_t*) realloc(input, input_length + IO_BLOCK_SIZE);
    assert(input != NULL);
  }

  uint8_t* output = NULL;
  size_t output_length = 0;

  if (compress) {
    unsigned int level = 7;
    if (argv[1][1] != '\0') {
      level = atoi(argv[1] + 1);
    }

    LZSSE2_OptimalParseState* state = LZSSE2_MakeOptimalParseState(input_length);
    assert(state != NULL);

    output = (uint8_t*) malloc(input_length);

    output_length = LZSSE2_CompressOptimalParse(state, input, input_length, output, input_length, level);
    assert(output_length != 0);
    assert(output_length <= input_length);
    LZSSE2_FreeOptimalParseState(state);

    fwrite(&input_length, sizeof(size_t), 1, stdout);
  } else {
    if (input_length <= sizeof(size_t))
      goto cleanup;

    const size_t decompressed_length = *((size_t*) input);
    output = (uint8_t*) malloc(decompressed_length);
    if (output == NULL)
      goto cleanup;

    output_length = LZSSE2_Decompress (input + sizeof(size_t), input_length - sizeof(size_t), output, decompressed_length);
    if (output_length == 0)
      goto cleanup;
  }

  fwrite(output, 1, output_length, stdout);

 cleanup:
  free(output);
  free(input);

  return EXIT_SUCCESS;
}

Here are some archives which cause a crash when compiled with AddressSanitizer: http://code.coeusgroup.com/afl-results/52558fa6-cf92-4446-a84b-636a02faecff.tar.xz

I'm posting this publicly in spite of https://github.com/nemequ/compfuzz/#disclosure-policy because I don't see this being an issue in real life (you would need a malloc implementation which puts tiny buffers immediately before a page boundary, which only tools designed to find issues like this do), and I doubt anyone is using LZSSE in production yet. If I find more realistic issues I'll disclose them privately.

Provide C API

The existing headers are really close to C… I think they would actually be usable as C if the functions just had C linkage.

You also have a #pragma once and #ifdef-based guards, you could get rid of one… IIRC #pragma once requires C11 (though AFAIK all compilers support it), so it would be nice if you got rid of the #pragma once.

typo in description

In README.md there is a typo of Szymanski's surname. You have "Storer-Szymanksi" and should be "Storer-Szymanski".

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.