Comments (4)
Thanks for the report !
I was going to say that the indexOf method could find the wrong JSZip.signature.DATA_DESCRIPTOR, but the current method has the same flaw. If the file being extracted is itself a zip file with data descriptors, it won't be unzipped correctly.
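To make the flaw concrete, here is a minimal sketch with a toy payload. The helper name `findDescriptorNaive` and the byte layout are made up for illustration; only the four signature bytes ("PK\x07\x08") come from the zip format.

```javascript
// Signature bytes that introduce a zip data descriptor ("PK\x07\x08").
var DATA_DESCRIPTOR = "\x50\x4b\x07\x08";

// Hypothetical helper: assume the entry's data ends at the first signature found.
function findDescriptorNaive(data, start) {
  return data.indexOf(DATA_DESCRIPTOR, start);
}

// Toy payload: the entry's data is itself a zip carrying its own data descriptor.
var innerZip = "inner-bytes" + DATA_DESCRIPTOR + "CCCCSSSSUUUU";
var outer = "HDR" + innerZip + DATA_DESCRIPTOR + "outer-descriptor";

var dataStart = 3;                               // entry data begins after the toy header
var expectedEnd = dataStart + innerZip.length;   // where the outer descriptor really is
var found = findDescriptorNaive(outer, dataStart);

// The naive search stops at the inner descriptor, so the entry is cut short.
var truncated = found < expectedEnd;
```

Any search for the first matching signature has this problem, whether it uses indexOf or a manual loop, which is why relying on the sizes recorded elsewhere in the archive is the safer route.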
Your patch is a nice improvement over the existing code but the whole findDataUntilDataDescriptor method could be deleted. I have a related patch (which removes this method and fixes the nested data descriptors bug) waiting on my machine but I didn't finish or push it, sorry about that :-(
I just pushed it on my branch issue30. I'll create a pull request for review as soon as I'm sure the unit tests are ok everywhere (I'll do that tomorrow). Is that ok for you ?
A note about the inflate and deflate files : the implementation might change (for a more robust one) or new compression methods might be added so the compress/uncompress interface must remain generic and easy to implement.
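As a rough sketch of what "generic and easy to implement" could look like: every method exposes the same compress/uncompress pair and is looked up by its identifier. The names `compressionRegistry`, `lookupCompression` and the `REVERSE` method are hypothetical and only illustrate the shape, not the actual JSZip code.

```javascript
// Hypothetical registry: each entry has a 2-byte identifier and the same two functions.
var compressionRegistry = {
  STORE: {
    magic: "\x00\x00",
    compress: function (content) { return content; },   // stored as-is
    uncompress: function (content) { return content; }
  }
};

// Find a method by its identifier, or return null if it is unknown.
function lookupCompression(magic) {
  for (var name in compressionRegistry) {
    if (Object.prototype.hasOwnProperty.call(compressionRegistry, name) &&
        compressionRegistry[name].magic === magic) {
      return compressionRegistry[name];
    }
  }
  return null;
}

// Adding a method only means adding one entry. REVERSE is a made-up toy, not a real zip method.
function reverseString(s) { return s.split("").reverse().join(""); }
compressionRegistry.REVERSE = {
  magic: "\xff\xff",
  compress: reverseString,
  uncompress: reverseString
};
```

With this shape, the loading code never needs to know which implementation it is calling, so swapping the inflate implementation does not touch the callers.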
Lazily decompressed files are an interesting feature (and I don't have any sleeping patch for this !). A way to convert a compressed string into an object without loading the whole decompressed string in memory could be nice too (a new method on ZipObject and a lazily decompressed file may be the easiest way to implement it).
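One possible shape for a lazily decompressed file, as a sketch only: keep the compressed data and run the uncompress function on first access. `LazyFile` and its fields are hypothetical names, and the uncompress function is passed in so the sketch does not depend on any particular implementation.

```javascript
// Hypothetical wrapper: holds compressed data and decompresses on first read.
function LazyFile(name, compressedData, uncompress) {
  this.name = name;
  this._compressed = compressedData;
  this._uncompress = uncompress;
  this._content = null;
  this.decompressCount = 0;   // only here to show that the work happens once
}

LazyFile.prototype.asText = function () {
  if (this._content === null) {
    this._content = this._uncompress(this._compressed);
    this._compressed = null;  // the compressed copy is no longer needed
    this.decompressCount++;
  }
  return this._content;
};

// Stand-in "decompressor" for the example: upper-cases its input.
var lazy = new LazyFile("a.txt", "hello", function (s) { return s.toUpperCase(); });
```

Files that are never read are never decompressed, which is where the memory saving would come from when only a few entries of a large archive are needed.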
from jszip.
Thanks for the quick reply,
That sort of segues into the next "improvement" (in my mind) I did, which is not bothering to extract the compressed data as a substring, instead just passing the whole zip string and an offset to the inflate method.
I've put the change into your new jszip-load.js like so:
var fileStats = {start: reader.index, cdata: reader.stream}; // Basically a position in the entire zip file
//this.compressedFileData = reader.readString(this.compressedSize);
compression = findCompression(this.compressionMethod);
if (compression === null) { // no compression found
    throw new Error("Corrupted zip : compression " + pretty(this.compressionMethod) +
                    " unknown (inner file : " + this.fileName + ")");
}
//this.uncompressedFileData = compression.uncompress(this.compressedFileData);
this.uncompressedFileData = compression.uncompress(fileStats);
and in jszip-inflate.js I changed the inflate method to do this:
function zip_inflate (fileStats) {
    console.log ("inflating zip file v2");
    var out, buff;
    var i, j;
    zip_inflate_start();
    zip_inflate_data = fileStats.cdata;
    zip_inflate_pos = fileStats.start;
    buff = new Array(1024);
    var bigout = [];
    out = [];
    var k = 0;
    while((i = zip_inflate_internal(buff, 0, buff.length)) > 0) {
        out.length = 0;
        for(j = 0; j < i; j++) {
            out[j] = String.fromCharCode(buff[j]);
        }
        bigout[k] = out.join("");
        k++;
    }
    zip_inflate_data = null; // G.C.
    return bigout.join("");
}
Basically there are two changes here. The first is changing the read-character routine to use buffers which are joined rather than concatenated as strings. Online sources say this is kinder to memory, especially in older browsers (though it's hard to find sources that discuss memory efficiency rather than speed). The join is a two-stage affair because, monitoring memory use in Task Manager, it seemed to use less peak memory than a one-stage join in the five main browsers I've been trying to get this working with (Chrome, IE, FF, Opera, Safari).
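Stripped of the inflate internals, the two-stage join looks roughly like the sketch below. `bytesToString` and its `chunkSize` parameter are names chosen for the example; it makes no claim about memory use, it only shows the pattern.

```javascript
// Convert byte values to a string: join characters per chunk, then join the chunks.
function bytesToString(bytes, chunkSize) {
  chunkSize = chunkSize || 1024;
  var pieces = [];   // one joined string per chunk (stage 2 input)
  var chars = [];    // reused per-chunk character buffer (stage 1)
  for (var start = 0; start < bytes.length; start += chunkSize) {
    var end = Math.min(start + chunkSize, bytes.length);
    chars.length = 0;
    for (var i = start; i < end; i++) {
      chars[i - start] = String.fromCharCode(bytes[i]);
    }
    pieces.push(chars.join(""));
  }
  return pieces.join("");
}
```

The character buffer is reused for every chunk, so at most one chunk's worth of single-character strings is held in an array at any time.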
The second change is that zip_inflate_data is set to the cdata field of the object I pass in (and that is just the entire zip file as a string), and I set zip_inflate_pos to the start position of the file I want decompressed. This, for my files at least, seems to work straight off the bat. I thought I'd have to go hunting for an end character or know the end point or something but that seems to be dealt with in the inflation routines. Again, this is just for the few big old zips I've tested... I'd guess you'd know better whether this trips up any other type of inflating... you did warn there are other, and will be other, ways of inflating data.
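The idea of decoding in place from an offset, with the stream itself saying where it ends, can be shown with a toy format. This is not DEFLATE: the run-length format and the `decodeRunsAt` helper are invented for the sketch, and the only point is that no substring of the archive is created before decoding.

```javascript
// Toy format: pairs of (count, character), terminated by a count of 0.
// Decodes directly from `data` starting at `start`, without slicing it first.
function decodeRunsAt(data, start) {
  var pos = start;
  var out = [];
  while (true) {
    if (pos >= data.length) {
      throw new Error("Unexpected end of data");
    }
    var count = data.charCodeAt(pos++);
    if (count === 0) {
      break;                                  // end-of-stream marker
    }
    var ch = data.charAt(pos++);
    out.push(new Array(count + 1).join(ch));  // repeat ch `count` times
  }
  return { text: out.join(""), end: pos };    // `end` is the offset just past the stream
}

// The encoded stream sits in the middle of a larger string.
var archive = "HEADER" + "\x03a\x02b\x00" + "TRAILER";
var decoded = decodeRunsAt(archive, 6);
```

The decoder needs only the start offset; it reports where it stopped, which the caller can compare against the compressed size recorded in the zip headers as a sanity check.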
(From this point I'm now exploring sending each of those out[] buffers to a routine that strips out data it doesn't want before doing a join, mainly by doing delimiter counts, hopefully reducing the memory footprint - frankly the whole of my 'memory efficient' quest revolves around creating as few new strings as possible - and making them as small as they need to be if I do so)
from jszip.
One other reason I've fixated upon the use of Strings within the code is that JavaScript strings use 2 bytes per character, whereas for a binary file such as a zip file just 1 should be sufficient. As such, I wondered what would happen if I read in the initial zip as an ArrayBuffer and, rather than turning it into a String in the JSZip.utils functions, changed the extraction to work on the ArrayBuffer (actually the Uint8Array view of it). I've managed to get this working with various modifications (that shouldn't break it for processing Strings) and I can now read in and conditionally unzip files Chrome et al wouldn't touch a couple of weeks ago. Would you be interested in me branching the code, or just mailing you what I've got with comments?
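A sketch of how one reader could serve both representations: hide the per-byte access behind a single function so the parsing code does not care whether it was given a String or a Uint8Array. `makeByteReader`, `byteAt` and `readInt` are hypothetical names for the example, not the modified code described above.

```javascript
// Wrap a binary string or a Uint8Array behind one byte-level interface.
function makeByteReader(data) {
  var isString = typeof data === "string";
  return {
    length: data.length,
    // Return the byte value (0-255) at index i.
    byteAt: function (i) {
      return isString ? data.charCodeAt(i) & 0xff : data[i];
    },
    // Read a little-endian unsigned integer of `size` bytes starting at `offset`.
    readInt: function (offset, size) {
      var result = 0;
      for (var i = offset + size - 1; i >= offset; i--) {
        result = result * 256 + this.byteAt(i);
      }
      return result;
    }
  };
}

// The same four bytes ("PK\x03\x04", the local file header signature) in both forms.
var fromString = makeByteReader("PK\x03\x04");
var fromBytes = makeByteReader(new Uint8Array([0x50, 0x4b, 0x03, 0x04]));
```

Both readers return the same value from `readInt(0, 4)`, so code written against `byteAt` and `readInt` keeps working for Strings while the Uint8Array path stores one byte per element.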
from jszip.
If you can push your changes on a branch, that would be great !
from jszip.