Comments (3)

pieroxy commented on July 24, 2024

I'm not sure I understand the question. LZString makes no assumption about its input. Its input is a sequence of 16-bit integers, in the form of a UTF-16 string. I "hacked" LZ by creating a special dictionary entry which introduces a new token in the input. That way, the decoder doesn't have to know in advance how many different characters there will be in the end. The less diverse the characters in the input, the better the compression. So in its current form, LZString already compresses ASCII much better than other encodings. But that's just because ASCII has a small alphabet (128 characters, or 256 for an 8-bit extension), whereas UTF-* has much more variety.

Half of the compression happens in the LZ part, a dictionary-based compression algorithm that produces a bit stream. This stream is then either encoded in Base64 or stuffed into a string using 15 bits per character.
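
As an illustration of the "15 bits per character" idea, here is a minimal sketch of packing an arbitrary bit stream into a string. The names `packBits` and `unpackBits` are hypothetical and are not part of the lz-string API. The offset of 32 is an assumption made for this sketch: it skips control characters and keeps every code unit below the surrogate range.

```javascript
// Hypothetical helpers for illustration only; not the lz-string implementation.
const BITS_PER_CHAR = 15;
const OFFSET = 32; // assumption: skip control characters, stay below 0xD800

// Pack bytes into a string, 15 payload bits per UTF-16 code unit.
function packBits(bytes) {
  let out = "";
  let acc = 0;     // bit accumulator
  let accBits = 0; // number of valid bits currently in acc
  for (const byte of bytes) {
    acc = (acc << 8) | byte;
    accBits += 8;
    while (accBits >= BITS_PER_CHAR) {
      accBits -= BITS_PER_CHAR;
      out += String.fromCharCode(((acc >> accBits) & 0x7fff) + OFFSET);
      acc &= (1 << accBits) - 1; // keep only the bits not yet emitted
    }
  }
  if (accBits > 0) {
    // Left-align the remaining bits in a final, zero-padded character.
    out += String.fromCharCode(((acc << (BITS_PER_CHAR - accBits)) & 0x7fff) + OFFSET);
  }
  return out;
}

// Reverse of packBits; the caller must supply the original byte length.
function unpackBits(str, byteLength) {
  const bytes = new Uint8Array(byteLength);
  let acc = 0;
  let accBits = 0;
  let n = 0;
  for (let i = 0; i < str.length && n < byteLength; i++) {
    acc = (acc << BITS_PER_CHAR) | (str.charCodeAt(i) - OFFSET);
    accBits += BITS_PER_CHAR;
    while (accBits >= 8 && n < byteLength) {
      accBits -= 8;
      bytes[n++] = (acc >> accBits) & 0xff;
      acc &= (1 << accBits) - 1;
    }
  }
  return bytes;
}
```

With this scheme, 8 bytes (64 bits) fit in 5 characters instead of 8, which matters when a quota is counted in characters rather than bytes.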

I'm not sure where a gain in compression ratio could be obtained. Plus, it would then be of limited utility. JS, CSS and HTML templates should be in UTF-8 anyway, just as a good practice.

I suggest you read the home page http://pieroxy.net/blog/pages/lz-string/index.html (especially the goal section.) This was originally meant for localStorage where the quota is in characters, not in bytes. Hence the effort to try and stuff a bitstream into an actual UTF-16 string.

from lz-string.

urbien commented on July 24, 2024

Sorry if I was not clear or if this seems like a dumb idea. My thinking was that I could arrange for the server to return ASCII rather than UTF-8 and receive it in XHR as an ArrayBuffer by setting responseType = 'arraybuffer'. That way, if lz-string took an ArrayBuffer rather than a string as input, I thought it could achieve better compression while still producing UTF-16-compatible output for saving in localStorage.

I understand lz-string assumes strings consist of 16-bit Unicode characters. I know that the FT Labs guys experimented with stuffing 2 ASCII chars into each UTF-16 char for 2x compression in localStorage. This made me think about ASCII and lz-string.
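
For readers unfamiliar with the two-ASCII-chars-per-UTF-16-char trick mentioned above, a rough sketch follows. It is only an illustration and not FT Labs' actual code. The names `packAscii` and `unpackAscii` are hypothetical, and the sketch assumes the input is 7-bit ASCII and contains no NUL characters, because NUL is used as padding for odd-length input.

```javascript
// Hypothetical helpers for illustration only.
// Assumption: input is 7-bit ASCII with no NUL (U+0000) characters.

// Store two ASCII characters in each UTF-16 code unit (high byte, low byte).
function packAscii(text) {
  let out = "";
  for (let i = 0; i < text.length; i += 2) {
    const hi = text.charCodeAt(i);
    // Pad odd-length input with NUL in the low byte.
    const lo = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    if (hi > 0x7f || lo > 0x7f) {
      throw new RangeError("input must be 7-bit ASCII");
    }
    // hi <= 0x7f keeps the result below 0x8000, so no surrogates are produced.
    out += String.fromCharCode((hi << 8) | lo);
  }
  return out;
}

// Reverse of packAscii.
function unpackAscii(packed) {
  let out = "";
  for (let i = 0; i < packed.length; i++) {
    const unit = packed.charCodeAt(i);
    out += String.fromCharCode(unit >> 8);
    const lo = unit & 0xff;
    if (lo !== 0) out += String.fromCharCode(lo); // drop the NUL padding
  }
  return out;
}
```

This halves the character count without doing any real compression, so it is a different gain from the dictionary-based one discussed in this thread.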

But maybe changing lz-string to accept a buffer of 8-bit ASCII chars is non-trivial, or maybe you feel the compressed result would not be much smaller. Anyway, I appreciate your response and the great utility you have provided to the dev community!

Btw, did anyone create a jsPerf for lz-string vs. uncompressed read/write to localStorage?

pieroxy commented on July 24, 2024

There are two parts to LZString: the compression part (based on LZW) and the encoding part (stuffing 15 bits of data into each character of a UTF-16 string).

Obviously the second part is oblivious to the input type. The first one, however, is more sensitive to it. Maybe preinitializing the dictionary with all 256 chars could optimize something, but I seriously doubt it: in code files, at most ~80-90 different chars are used. In other words, your bit space is vastly underfilled and, more likely than not, your tokens could be encoded on 7 bits, maybe even 6 bits in some cases. That is precisely the optimization I made to LZW, since I was starting with a 16-bit token space (impossible to preinitialize the dictionary with every token). So no, I don't think it would give interesting results. But that is just a hunch ;-)
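
The "7 bits, maybe even 6" estimate above can be checked on any given file with a few lines of code. This is a minimal sketch; `alphabetStats` is a hypothetical helper name, and it only measures the size of a fixed-width code for the distinct characters actually present, not what LZString would really emit.

```javascript
// Hypothetical helper for illustration only.
// Counts distinct characters and the fixed number of bits needed to tell them apart.
function alphabetStats(text) {
  const distinct = new Set(text).size; // a string iterates by code point
  // At least 1 bit, even for an alphabet of zero or one symbol.
  const bitsPerSymbol = Math.max(1, Math.ceil(Math.log2(distinct)));
  return { distinct, bitsPerSymbol };
}
```

An input using about 90 distinct characters needs 7 bits per symbol, and anything at or below 64 distinct characters fits in 6, well under the 8 bits that a dictionary preinitialized with all 256 byte values would start from.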

AFAIK, no jsPerf focuses on I/O in localStorage.
