Comments (9)
The root cause is that LZF compression on big-endian (BE) systems, carried out by the UnsafeChunkEncoderBE class, incorrectly computes the length of matching patterns in the stream when comparing word and double-word sequences (methods _findMatchLength and _findLongMatchLength in com/ning/compress/lzf/impl/UnsafeChunkEncoderBE.java). Specifically, the problem lies in the way the trailing null bytes are computed by the method _leadingBytes.
No endian difference exists for a word or double-word value residing in a register, so the trailing zeros of such a value are always obtained through Long.numberOfTrailingZeros, irrespective of the platform's endianness. [Please refer to http://docs.oracle.com/javase/7/docs/api/java/lang/Long.html#numberOfTrailingZeros%28long%29 to see that the API is endian-neutral.]
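To illustrate the point that the bit-counting methods operate on the value and not on the memory layout, here is a minimal sketch. It is not the library's code or the patch that was applied; the class MatchLengthSketch and the helpers load and matchingBytes are hypothetical names used only for this illustration. It loads the same two byte arrays as longs in both byte orders (via ByteBuffer rather than Unsafe) and counts how many leading bytes, in memory order, they share.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class MatchLengthSketch {
    // Hypothetical helper: read 8 bytes as a long using an explicit byte order.
    static long load(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getLong();
    }

    // Hypothetical helper: number of bytes, counted from the first byte in
    // memory, that two 8-byte sequences have in common. The bit-counting
    // calls only see the long values; which end of the value holds the
    // first memory byte depends on how the value was loaded.
    static int matchingBytes(long a, long b, ByteOrder order) {
        long diff = a ^ b;
        if (diff == 0L) {
            return 8; // all eight bytes are equal
        }
        int zeroBits = (order == ByteOrder.LITTLE_ENDIAN)
                ? Long.numberOfTrailingZeros(diff)  // first memory byte is the low-order byte
                : Long.numberOfLeadingZeros(diff);  // first memory byte is the high-order byte
        return zeroBits >> 3; // whole bytes only
    }

    public static void main(String[] args) {
        byte[] x = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] y = {1, 2, 3, 9, 5, 6, 7, 8}; // differs from x at index 3
        ByteOrder[] orders = {ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN};
        for (ByteOrder order : orders) {
            int n = matchingBytes(load(x, order), load(y, order), order);
            System.out.println(order + ": " + n);
        }
    }
}
```

With the sample arrays above, both byte orders report 3 matching bytes, since the arrays first differ at index 3.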
Please let me know if you need any additional information on this (including a draft proposal for a fix).
Thanks,
Gireesh.
from compress.
Ouch. Yes, this is a serious problem and has to be fixed.
from compress.
Quick note: no progress yet, since I was out on vacation for the past 3 weeks. Hoping to address this soon.
from compress.
Any update on this? Were you able to make progress?
Please review: https://bugzilla.redhat.com/show_bug.cgi?id=1115264
The same issue was reported in the Fedora project and has been resolved; the patch is available at that link.
Please let me know if I can be of any further assistance.
from compress.
Unfortunately I have not had time to look into this due to other crises (not related to this project). But I hope to get back to it as soon as possible, since this is a critical problem. Thank you for the link, that should be helpful.
from compress.
@gireeshpunathil Sorry for the long delay; I finally got back and patched things as suggested. I also simplified handling a bit, without (I hope!) changing behavior itself. However, I don't have direct access to a big-endian system right now (will try to find one).
I did release 1.0.3, so if you can easily double-check the tests, that'd be great. If not, I'll eventually find a way to verify it.
from compress.
Thanks, I will do the testing and will let you know the result.
from compress.
@cowtowncoder, I tested with your fix, and confirm that it fixes the reported problem. Thank you very much!
from compress.
@gireeshpunathil thank you for doing all the detective work here!
from compress.