Comments (8)
This is really inherent in the format; you couldn't change this without also
breaking existing decompressors.
In any case, it's sort of meaningless to compress such large chunks in one
call, given that the format internally breaks all data up into 64 kB blocks
that are processed individually anyway. In other words, it will buy you nothing
in increased compressibility.
Original comment by [email protected]
on 14 May 2013 at 9:21
from snappy.
The point is that this bug is undocumented and hidden from the user. One of two
resolutions seems reasonable to me:
1. Change all the APIs from size_t to uint32 to avoid false advertising
2. Add automatic 4GB chunking to your wrapper layer
Thanks,
-Dan
Original comment by [email protected]
on 15 May 2013 at 1:38
#2 is out of the question, really, since that would break the format (the
length is stored only once). We could probably change the types in the APIs,
but that would mean ABI breakage, which is also bad.
Original comment by [email protected]
on 15 May 2013 at 1:42
Thanks. For a future version I would suggest extending the format as follows:
check the leading uint32; if it is 0, assume the new format with a 64-bit size
follows, and otherwise fall back to the existing 32-bit algorithm. That would
salvage all existing stored data while still allowing you to extend the format.
I would also add a version number to the header so that you can change the
format in the future without breaking backwards compatibility.
Original comment by [email protected]
on 15 May 2013 at 1:54
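The sentinel scheme proposed above can be sketched as follows. This is a hypothetical header layout for illustration only, not the real snappy wire format: a leading 32-bit length of 0 signals that a 64-bit length follows, while any non-zero value is read as a legacy 32-bit length, so existing data still parses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical backward-compatible length header (NOT the actual snappy
// format): non-zero leading uint32 = legacy 32-bit length; zero = sentinel,
// a 64-bit length follows. Zero-length payloads always take the 64-bit path
// so the sentinel stays unambiguous.
std::string EncodeLength(uint64_t n) {
  std::string out;
  if (n != 0 && n < (1ull << 32)) {
    uint32_t n32 = static_cast<uint32_t>(n);  // legacy 4-byte header
    out.append(reinterpret_cast<const char*>(&n32), sizeof n32);
  } else {
    uint32_t sentinel = 0;                    // 0 marks the extended header
    out.append(reinterpret_cast<const char*>(&sentinel), sizeof sentinel);
    out.append(reinterpret_cast<const char*>(&n), sizeof n);
  }
  return out;
}

bool DecodeLength(const std::string& in, uint64_t* n, size_t* header_bytes) {
  if (in.size() < 4) return false;
  uint32_t n32;
  std::memcpy(&n32, in.data(), 4);
  if (n32 != 0) {                       // legacy 32-bit path
    *n = n32;
    *header_bytes = 4;
    return true;
  }
  if (in.size() < 12) return false;     // sentinel seen: need 8 more bytes
  std::memcpy(n, in.data() + 4, 8);     // extended 64-bit path
  *header_bytes = 12;
  return true;
}
```

The encoding assumes the same byte order on both ends; a real design would fix the endianness, and (as the comment above suggests) reserve a version field as well.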
We're really not going to break the format for something people shouldn't
really do anyway, sorry -- if you have that much data, stream it somehow
instead of doing one Compress() call. The format has been stable for over eight
years now, and forwards- and backwards-compatibility is an important feature.
I could probably add a comment saying that there's a hard limit of 4GB, though?
Original comment by [email protected]
on 15 May 2013 at 1:57
I would argue that 8 years ago there was not such a proliferation of server
hardware with tens or hundreds of GB of RAM. Compressing an array greater than
4GB in memory today is commonplace, which is why I was so stunned by such a
"year 2000" bug. But anyway, we get the message. No change in sight. Will
pursue other alternatives. Thanks for your quick responses.
Original comment by [email protected]
on 15 May 2013 at 2:20
I suppose the 32-bit limitation exists because it would be a waste of space to
use 64-bit pointers internally for the relatively small number of cases that
need them; for a compression library, that kind of overhead is generally
unacceptable.
If you want to compress large datasets exceeding 4 GB, the normal thing is to
split them into chunks, compress each chunk separately, and then decompress the
chunks and join them by hand. It is more work, but the compression ratios will
benefit quite a lot.
Original comment by [email protected]
on 20 May 2013 at 5:22
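The split-compress-rejoin approach described above can be sketched like this. The framing here is hypothetical (a simple [uint64 length][payload] frame per chunk), and the codec is left pluggable so the sketch stays self-contained; in practice each chunk would go through snappy's one-shot Compress/Uncompress calls.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

// Minimal sketch of "split, compress per chunk, rejoin". The codec is
// pluggable so the framing logic can be exercised without linking snappy;
// in real use you would pass thin wrappers around snappy::Compress and
// snappy::Uncompress. Each frame is: [uint64 compressed_size][payload].
using Codec = std::function<std::string(const std::string&)>;

constexpr size_t kChunkSize = 1u << 20;  // 1 MiB chunks; anything < 4 GB works

std::string CompressChunked(const std::string& input, const Codec& compress) {
  std::string out;
  for (size_t pos = 0; pos < input.size(); pos += kChunkSize) {
    std::string packed = compress(input.substr(pos, kChunkSize));
    uint64_t len = packed.size();
    out.append(reinterpret_cast<const char*>(&len), sizeof len);
    out += packed;
  }
  return out;
}

std::string DecompressChunked(const std::string& input, const Codec& decompress) {
  std::string out;
  size_t pos = 0;
  while (pos + sizeof(uint64_t) <= input.size()) {
    uint64_t len;
    std::memcpy(&len, input.data() + pos, sizeof len);
    pos += sizeof len;
    out += decompress(input.substr(pos, len));
    pos += len;
  }
  return out;
}
```

Because snappy already blocks its input into 64 kB pieces internally, chunking at a megabyte or larger costs essentially nothing in compression ratio, while lifting the per-call size limit.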
faltet: It's not actually about that; we already use 16-bit offsets rather than
32-bit for the compression itself (we split the input into 64 kB chunks that
are compressed individually), so the encoder could handle this with minimal
extra work. But since the decoder happened not to support it at some point, we
don't want to introduce a backwards incompatibility for a very small gain.
In any case, it's pretty clear that this is not going to change, so I'm closing
the bug.
Original comment by [email protected]
on 21 May 2013 at 9:34
- Changed state: WontFix