mafintosh / tar-stream
tar-stream is a streaming tar parser and generator.
License: MIT License
This is great, but it needs to error on invalid headers!
If you can give me some hints I can put in a pull request.
My guess is to error if the checksum is incorrect,
unless the block is all nulls (as you can have null blocks in between files).
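The proposed check can be sketched with Node built-ins only (assumptions: plain ustar layout with the 8-byte octal checksum at offset 148, and that an all-zero 512-byte block is legal padding; `classifyBlock` is an invented name, not tar-stream's API):

```javascript
// Sketch: classify a 512-byte tar block as a null block (legal padding),
// a header with a valid checksum, or an invalid header.
function classifyBlock (block) {
  if (block.every((b) => b === 0)) return 'null-block'
  let sum = 0
  for (let i = 0; i < 512; i++) {
    // the checksum field (bytes 148-155) is summed as if it were spaces
    sum += (i >= 148 && i < 156) ? 0x20 : block[i]
  }
  const stored = parseInt(block.toString('ascii', 148, 156), 8)
  return sum === stored ? 'ok' : 'invalid'
}
```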
GNU tar uses @LongLink entries to signal that the next entry has a long name. This is used in the GCC and Linux kernel .tar.gz files, so when using tar-stream to decompress them, some files are not extracted and at other times there are EACCES errors.
I perform tar file manipulation in my application: adding or replacing files in the tar archive. I built a wrapper that tries to keep it managed.
I'm manipulating tar files the way you suggest in the readme, by opening the tar, putting each entry into a new tar stream and writing that. However I don't really want a new tar file, so I've taken to moving the existing tar file to tmp (using fs.rename) and opening it from there. That way I can always just write to where the file is supposed to be.
It seems that when I try to move a tar to tmp while it is being accessed by tar-stream, I receive an error. So, if two requests happen almost immediately after one another, I get:
Error: ENOENT, rename '/<where-file-is-supposed-to-be>.tar'
at Error (native)
At the fs.rename call.
What should I do about this? Is there a way to check whether a file operation is currently being performed?
Should I wait in that case using setTimeout?
fs.rename(self.fullPath, tmpName, function (err) {
  if (err) {
    // TEMP: throw err
    throw err;
  }
  // ...
});
Is there a way to manipulate the tar file in place, without moving it? Wouldn't that just cause the same problem? I'm sorry if this is an easy question.
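Rather than polling with setTimeout, one way around the race, as a rough sketch (assumption: it's acceptable to serialize the rewrite jobs; `enqueue` is an invented helper): keep a queue so the rename/rewrite operations for a given tar file never overlap.

```javascript
// Minimal job queue: each job receives a `done` callback, and the next
// job only starts after the previous one calls it.
const jobs = []
let running = false

function enqueue (job) {
  jobs.push(job)
  if (!running) runNext()
}

function runNext () {
  const job = jobs.shift()
  if (!job) {
    running = false
    return
  }
  running = true
  job(runNext)
}
```

Each request would enqueue its whole move/extract/repack sequence as one job, so fs.rename never runs while a previous read stream still has the file open.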
Hi,
I must generate a dynamic tar file from an S3 directory and, as Matteo Collina suggested, I'm using your fantastic module together with pump in the following way:
// this contains all files present in the directory
var files = [];

async.eachSeries(files, function (data, callback) {
  // an S3 file stream created with this module: https://github.com/jb55/s3-blob-store
  var stream = store.createReadStream({ key: data.Key });
  var pack = tar.pack();
  var entry = pack.entry({ name: data.Key, size: data.Size }, function (err) {
    if (err) console.log(err);
  });
  pump(stream, entry, function (err) {
    if (err) console.log(err);
    if (tmp_count === files)
      pack.finalize();
    callback();
  });
  pump(pack, res, function (err) {
    if (err) console.log(err);
  });
}, function (err) {
  if (err) console.log(err);
  req.end();
});
This code generates the following stack trace:
Error: invalid header
at Object.exports.decode (/tar-fs/node_modules/tar-stream/headers.js:205:40)
at onheader (/tar-fs/node_modules/tar-stream/extract.js:103:39)
at Extract._write (/tar-fs/node_modules/tar-stream/extract.js:206:8)
at Extract._continue (/tar-fs/node_modules/tar-stream/extract.js:170:28)
at oncontinue (/tar-fs/node_modules/tar-stream/extract.js:61:10)
at ondrain (/tar-fs/node_modules/tar-stream/extract.js:81:5)
at Extract._write (/tar-fs/node_modules/tar-stream/extract.js:206:8)
at doWrite (/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:237:10)
at writeOrBuffer (/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:227:5)
at Writable.write (/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:194:11)
What do you think? Am I making errors in my code, or is there a problem in the module?
Regards
I've got a tar file with a few large (>10GB) files in it. When tar-stream gets to the first big file, it chokes with an Invalid tar header error. I can't provide the tar file itself, but hopefully this is helpful:
It hums along through a few small files, and then it hits the header for the first large file (provided here base64-encoded):
cnMtZHMwNTk1NDhfMjAxNi0xMS0yOFQxOTAwMTEuMDAwWi9nb29kZWdncy1nYXJiYW56by9vcmRlcl9pdGVtcy5ic29uAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAwMDA2NDQAMDAwMTc1MQAwMDAxNzUxAIAAAAAAAAACfZ6FHjEzMDE3MTAxMDc1ADAyMzMxNgAgMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB1c3RhciAgAG1hZG1pbgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYWRtaW5zAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
which it claims to parse and tells me is {name: 'rs-ds059548_2016-11-28T190011.000Z/goodeggs-garbanzo/order_items.bson', size: 2}
... but the size according to bsdtar on OSX is 10697475358
...
and then it chokes parsing the next tar header (presumably because it has used the entirely wrong offset).
If I can provide more data (without providing the file itself), please let me know.
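The 0x80 first byte visible in that header's size field is the GNU base-256 extension for sizes that don't fit in 11 octal digits. A decoder sketch using only Node built-ins (`decodeSizeField` is a made-up name, not tar-stream's API):

```javascript
// Decode a ustar/GNU "size" field: plain octal, or base-256 when the
// high bit of the first byte is set (used for files >= 8 GiB).
function decodeSizeField (field) {
  if (field[0] & 0x80) {
    let n = field[0] & 0x7f
    for (let i = 1; i < field.length; i++) n = n * 256 + field[i]
    return n
  }
  return parseInt(field.toString('ascii'), 8) || 0
}
```

Running it over the size bytes from the dumped header above (80 00 00 00 00 00 00 02 7d 9e 85 1e) yields 10697475358, exactly the size bsdtar reports, which suggests the header is fine and the parser is simply missing this extension.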
hi,
the pack.entry function only accepts buffers, not streams. I'm not really sure if this is even implementable. Any ideas about this?
I've written simple code; it just copies a .tar to another one, logging file names:
const tar = require('tar-stream');
const pack = tar.pack();
const extract = tar.extract();
const path = require('path');
const fs = require('fs');
extract.on('entry', (header, stream, next) => {
  console.log(header.name);
  stream.pipe(pack.entry(header, next));
});

extract.on('finish', () => {
  // all entries done - lets finalize it
  pack.finalize();
});

const tarPath = './example.tar';
const tarPathParsed = path.parse(tarPath);
const outputPath = `${tarPathParsed.dir}/${tarPathParsed.name}.new${tarPathParsed.ext}`;

let oldTarballStream = fs.createReadStream(tarPath);
let newTarballStream = fs.createWriteStream(outputPath);

// pipe the old tarball to the extractor
oldTarballStream.pipe(extract);

newTarballStream.on('close', () => {
  console.log(`${outputPath} has been written`);
});

// pipe the new tarball to another stream
pack.pipe(newTarballStream);
Also, I've created an example.tar with a single file named Тестовый файл.txt (Cyrillic characters in the file name). When I ran my code above, I got example.new.tar with 2 files, both named PaxHeader. One of them contains:

38 path=Тестовый файл.txt

The other PaxHeader contains the full content of Тестовый файл.txt.

Moreover, once I re-ran the code on example.new.tar (with those 2 PaxHeaders), I got a tarball with, again, 2 PaxHeaders, but one of them was:

38 path=Тестовый файл.txt
38 path=Тестовый файл.txt

The other, again, was exactly my original Тестовый файл.txt.

I believe it's a bug in pack().
The following code tries to pipe some nonsense into an extractor:
var tar = require('tar-stream');
var stream = require('stream');
var extract = tar.extract();
extract.on('error', function (e) {
  console.log(e);
});
extract.on('entry', function (header, stream, next) {
  console.log(header);
});
extract.on('finish', function () {
  console.log('finish');
});

var input = new stream.PassThrough();
input.pipe(extract);
input.end(Buffer.from('some random content'));
The only output is finish, but it should really emit some sort of error.
I have tarballs from the registry (and elsewhere) that consistently fail with invalid tar header errors but are valid archives.
To replicate
wget https://registry.npmjs.org/eslint-config-metashop/-/eslint-config-metashop-1.5.0.tgz
Use the following test case
const Tar = require('tar-stream');
const Gunzip = require('gunzip-maybe');
const Fs = require('fs');
const gunzip = Gunzip();
const extract = Tar.extract();
const inputFile = Fs.createReadStream(process.argv[2]);
extract.on('error', function (err) {
  console.log(err);
});

extract.on('entry', function (header, stream, callback) {
  stream.on('end', function () {
    return callback();
  });
  stream.resume();
});

inputFile.on('error', function (err) {
  console.log(err);
});
inputFile.pipe(gunzip).pipe(extract);
The error I get is something like
Error: Invalid tar header. Maybe the tar is corrupted or it needs to be gunzipped?
at Object.exports.decode (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/headers.js:265:40)
at Extract.onheader [as _onparse] (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:124:39)
at Extract._write (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:248:8)
at Extract._continue (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:212:28)
at oncontinue (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:65:10)
at Extract.onheader [as _onparse] (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:132:7)
at Extract._write (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:248:8)
at Extract._continue (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:212:28)
at oncontinue (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:65:10)
at Extract.onheader [as _onparse] (/Users/adam_baldwin/Documents/projects/tarHash/node_modules/tar-stream/extract.js:132:7)
Doing some debugging, it appears that the stream has advanced too far into the archive, and the data it's trying to parse as the next header is actually file contents. That's as far as we've got so far.
I see that the default size is set to 0. Why is that? If you leave size undefined, shouldn't the size be set to the length of the input (string or stream)?
Apologies, still learning Node.js libraries. The Packing example does not run because myStream is not defined. I assume I'm supposed to open a file, get a stream for it, and assign it to myStream. Could you add that to the example so people can copy, paste, run, then edit?
var tar = require('tar-stream')
var pack = tar.pack() // pack is a streams2 stream
// add a file called my-test.txt with the content "Hello World!"
pack.entry({ name: 'my-test.txt' }, 'Hello World!')
// add a file called my-stream-test.txt from a stream
var entry = pack.entry({ name: 'my-stream-test.txt' }, function(err) {
// the stream was added
// no more entries
pack.finalize()
})
myStream.pipe(entry)
// pipe the pack stream somewhere
pack.pipe(process.stdout)
On iOS 10 (Mobile Safari 10.0) and Desktop Safari 9.1.2, I get:
Error: Invalid tar header. Maybe the tar is corrupted or it needs to be gunzipped?
This: decodeOct(buf, 148) is returning NaN.
In the README.md file, the Extracting example has

next(); // ready for next entry

If you do this it produces an error that next() is not defined. Looking at your test code and a simple test I wrote, this line should say

callback(); // ready for next entry
hi,
i encountered a bug with this package: when decompressing with gunzip-maybe or zlib.createGunzip(), the output stops after a few file entries. Inside the extract.on('entry') handler everything goes well: https://gist.github.com/chpio/6d0cedae59d8416d0aed
ohh, i don't know if it's relevant, but i noticed the stops occur after a large file entry.
It doesn't look like a tar pack is an EventEmitter, which means no 'error' event. So where do errors go? How do you handle them?
Hey, I've experienced a nasty bug. When supplying an input stream of length A while there is already directly supplied content of length B, the entry callback is never called. Working example:
var fs = require("fs");
var tar = require('tar-stream');
var fileName = "demo";
var pack = tar.pack();
fs.writeFileSync(fileName, new Array(1674).join("X"), "utf-8");
pack.entry({ name: "specific-length.txt" }, new Array(13399).join("X"));
fs.stat(fileName, function (err, stat) {
  if (err) {
    return console.log(err);
  }

  var packOptions = {
    mode: stat.mode,
    mtime: stat.mtime,
    name: fileName,
    size: stat.size
  };

  var rs = fs.createReadStream(fileName);
  var entry = pack.entry(packOptions, function (err) {
    console.log("This never happens");
    pack.finalize();
    pack.pipe(fs.createWriteStream("output.tar"));
  });

  console.log("We pipe the stream here and expect a callback...");
  return rs.pipe(entry);
});
13399 followed by 1674 are the lengths I stumbled upon. I presume this happens at specific intervals based on stream buffer sizes and such. Looking into the source, there seems to be a disconnect between the Sink's and Pack's drain dynamics: the callback is saved and never called. I didn't understand the code well enough to actually fix it. :-(
Tested on node v0.10.22, tar-stream 0.2.5.
Regarding this item in the README.md: https://github.com/mafintosh/tar-stream/blob/master/README.md#extracting
Shouldn't you call resume after registering the end/data handlers?
Calling resume first can flush the stream before the end listener is registered.
Thanks for this library!
Hi,
With the following archive :
http://download.oracle.com/otn-pub/java/jdk/7u79-b15/server-jre-7u79-windows-x64.tar.gz
The file type of the first entry is wrong. It should be a directory, not a file:
jdk1.7.0_79/ file
jdk1.7.0_79/COPYRIGHT file
jdk1.7.0_79/LICENSE file
jdk1.7.0_79/README.html file
jdk1.7.0_79/release file
Here's my small program :
var tar = require('tar-stream');
var fs = require('fs');
var extract = tar.extract();
extract.on('entry', function (header, stream, callback) {
  console.log(header.name + ' ' + header.type);
  stream.on('end', function () {
    callback(); // ready for next entry
  });
  stream.resume(); // just auto drain the stream
});

extract.on('finish', function () {
  // all entries read
  console.log('finish extract');
});

var readStream = fs.createReadStream('jdk.tar');
readStream.on('error', function (err) {
  console.log('read error', err);
});
readStream.pipe(extract);
Symlink entries behave differently depending on whether the link target is supplied as a stream or in the header: sometimes the symlink has zero size, and other times it points nowhere but has a size equal to the length of the destination path. We need to check what the spec says and make this behave consistently.
This is the strangest bug, but I've narrowed it down to Go binaries. tar-stream seems to silently fail to add them; anything else I add to the tar works fine.
Steps to reproduce:
Create file test.go:

package main

import "fmt"

func main() {
	fmt.Println("test")
}
Build go file:
$ go build test.go
Create file test.js:
var fs = require("fs");
var tar = require('tar-stream');
var fileName = "./test";
var pack = tar.pack();
fs.stat(fileName, function (err, stat) {
  if (err) {
    return console.log(err);
  }

  var packOptions = {
    mode: stat.mode,
    mtime: stat.mtime,
    name: 'test',
    size: stat.size
  };

  var rs = fs.createReadStream(fileName);
  var entry = pack.entry(packOptions, function (err) {
    console.log("This happens");
    pack.finalize();
  });

  console.log("We pipe the stream here and expect a callback...");
  return rs.pipe(entry);
});
Expected result: This happens gets logged and the pack is finalized.
Actual result: This happens is never logged and the pack is not finalized.
stream.entry({ name: `files/test.html`, mode: parseInt('777', 8) }, "test......");
But the result is:
➜ la files
total 16
-rwxr-xr-x@ 1 tony staff 1.4K 7 31 11:30 test.html
readable-stream is versioned so that ~1.0.0 is Streams2 and ~1.1.0 is Streams3. The prefix for the readable-stream dependency was changed to ^ in 60cef01, which allows tar-stream to depend on the 1.1.x Streams3 versions of readable-stream.
BufferList is not used anywhere. However 'bl' is still listed in package.json.
Hello, it would be awesome to support this feature, as described at http://www.gnu.org/software/tar/manual/html_node/Standard.html.
For example: tar a 1000MB folder into 500MB volumes...
Is it possible?
As shown in this pull request, I'm converting a cpio file generated with the get_init_cpio tool of the Linux kernel to a tar file. The generated tar file works correctly with vagga, but it crashes Docker with an "Invalid tar header" error, and the same file makes file-roller (the Ubuntu/Gnome archive manager) core dump.
Inspecting the content of the generated file directly with the tar command, I get the following output:
[piranna@Mabuk:~/Proyectos/NodeOS]
(vagga) > tar -tvf node_modules/nodeos-barebones/out/latest
tar: Sustituyendo `.' por un nombre miembro vacío ("Substituting `.' for empty member name")
d--x--x--x 0/0 0 2015-10-28 12:05
-r-xr-xr-x 0/0 651800 2015-10-28 12:05 lib/libc.so
lr-xr-xr-x 0/0 8 2015-10-28 12:05 lib/ld-musl-x86_64.so.1 -> libc.so
tar: Saltando a la siguiente cabecera ("Skipping to next header")
-r--r--r-- 0/0 1250352 2015-10-28 12:05 lib/libstdc++.so.6.0.17
lr--r--r-- 0/0 20 2015-10-28 12:05 lib/libstdc++.so.6 -> libstdc++.so.6.0.17
tar: Saltando a la siguiente cabecera ("Skipping to next header")
l--x------ 0/0 9 2015-10-28 12:05 init -> bin/node
tar: Un bloque de ceros aislado en 25824 ("A lone zero block at 25824")
tar: Saliendo con fallos debido a errores anteriores ("Exiting with failure status due to previous errors")
[piranna@Mabuk:~/Proyectos/NodeOS]
(vagga) > echo $?
2
There are two missing entries (the ones with the tar: Saltando a la siguiente cabecera ("Skipping to next header") message), corresponding to the lib/libgcc_s.so.1 and bin/node files. Their stat objects as given by cpio-stream are:
{ ino: 724,
mode: 33060,
uid: 0,
gid: 0,
nlink: 1,
mtime: Wed Oct 28 2015 12:05:19 GMT+0100 (CET),
size: 96712,
devmajor: 3,
devminor: 1,
rdevmajor: 0,
rdevminor: 0,
_nameLength: 18,
_sizeStrike: 96712,
_nameStrike: 18,
name: 'lib/libgcc_s.so.1' }
{ ino: 727,
mode: 33133,
uid: 0,
gid: 0,
nlink: 1,
mtime: Wed Oct 28 2015 01:40:48 GMT+0100 (CET),
size: 11216736,
devmajor: 3,
devminor: 1,
rdevmajor: 0,
rdevminor: 0,
_nameLength: 9,
_sizeStrike: 11216736,
_nameStrike: 10,
name: 'bin/node' }
I'm not sure what could be the reason for this problem, since it seems unrelated to file name length, file size, permissions, or the files being binary... :-/ You can find the tar file if you want to inspect it yourself at https://dropfile.to/gWBaf
I mentioned this previously, but I'd like to be able to use tar-stream in this manner (or similar):
intar.pipe(tarStream(onentry, onfinish)).pipe(transformedTar)
I made a feeble attempt to get this to work outside of tar-stream in the dockerify lazy-stream branch, but had only mixed success. The tests that all passed previously now only pass in node 0.10.
I hope, though, that this can serve as a start to figure out how the above could be achieved.
when doing
tar.pack('folder-a').pipe(tar.extract('folder-b'));
where the contents of folder-a are over roughly 1GB, I get FATAL ERROR: JS Allocation failed - process out of memory
Ideally, I would think I could add as many entries with streams as I want, then call finalize right afterward, and everything would work (i.e. it would automatically wait for all the input streams to complete before actually creating the package). The documentation seems to imply that this isn't the case, though. Can I or can't I do that? If not, why not? Can we make it so finalize can be called without explicitly waiting for the streams?
Hey from Node.js here!
Starting with Node 10, this package will emit deprecation warnings. See this guide on what you should do in order to migrate to Buffer.alloc/Buffer.from.
See nodejs/node#19079 for discussion around this change and why we can't make new Buffer work.
The result.tar generated by the following code fails to be unpacked:
const tar = require('tar-stream');
const writeStream = require('fs').createWriteStream('result.tar');
const pack = tar.pack();
pack.pipe(writeStream);
// the specific pattern I found:
// here, a '0' represents an ASCII character and a '哈' represents a unicode character
const directory = './0000000哈哈000哈哈0000哈哈00哈00哈0哈哈哈哈哈0哈/0000哈哈哈/';
const name = directory + 'somefile.txt';
const entry = pack.entry({ name }, 'any text', (...args) => console.log(args));
pack.finalize();
showing this after executing tar -xf result.tar in the terminal:
tar: Ignoring malformed pax extended attribute
tar: Error exit delayed from previous errors.
or something like this when double-clicked on Mac OS:
Error 1: Operation not allowed
I'm working on Mac OS and have tried the code on node versions 6.9.1 and 7.5.0, producing the same result.
tar-stream works perfectly with almost all other unicode patterns, so I think there might be a bug?
In the docs on extraction, the line pack.pipe(extract); is at the end of the example, but pack isn't defined and it doesn't make sense to me. Is that supposed to be there?
When the input data ends while tar-stream waits for more data to extract, it doesn't raise an error. This means that a truncated tar file will extract without reporting errors, while creating incomplete files.
If there is still missing data for a file or a partially read header, an error should be raised instead.
This issue is likely also the cause for #71. If only a short file is processed (shorter than a tar header), no errors are raised since the partial header data is never processed.
If I want to add a directory, do I have to traverse the whole directory tree and add in the directories and files inside there, or is there an easier way?
Hey there, the tar .pipe() command doesn't work as I expect, which would be similar to fs.createReadStream().pipe()... The problem is that it never closes.
So, for example, this program using tar-stream will hang:
var tar = require('tar-stream'),
    spawn = require('child_process').spawn;

var pack = tar.pack();
pack.entry({ name: 'hello.txt' }, 'Hello world!');

var cat = spawn('cat');
pack.pipe(cat.stdin);
cat.stdout.pipe(process.stdout);

pack.on('end', function () {
  console.log('This is never fired.');
});
While this program with fs.createReadStream will close as expected:
var fs = require('fs'),
    spawn = require('child_process').spawn;

var fileStream = fs.createReadStream('test.js');
var cat = spawn('cat');
fileStream.pipe(cat.stdin);
cat.stdout.pipe(process.stdout);

fileStream.on('end', function () {
  console.log('This does fire!');
});
It appears that your streaming tar parser does not support sparse files. Am I mistaken?
If it helps, GNU tar implements sparse support at:
http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c?id=63f2e969ddc162da7ae49a955bba9c6a2a0e77dc#n354
My question is relevant to this:
http://stackoverflow.com/questions/40166300/cross-platform-sparse-file-compression-with-nodejs
Also poking at cpio-stream for sparse support options too: finnp/cpio-stream#9
Why does this code, https://github.com/aaricpittman/yarn-pack-test/blob/master/stand-alone.js, corrupt the image files when the files are read with the 'binary' encoding, but work fine when I don't specify the encoding?
Can you confirm (or not) that there is an 8GB limit (per file) for tar creation with the implementation of tar that you use (ustar, is it)?
If so, is there any way around this that you know of? Or another library I could use?
Many thanks
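For context, the classic ustar size field is 12 bytes holding 11 octal digits plus a terminator, which caps a plain ustar entry at one byte short of 8 GiB; larger files need the GNU base-256 or PAX size extensions, so if a library only writes plain octal sizes, ~8 GiB is indeed the per-file ceiling:

```javascript
// Largest size representable in 11 octal digits: 8^11 - 1 bytes
const maxUstarSize = parseInt('7'.repeat(11), 8)

console.log(maxUstarSize)                       // 8589934591
console.log(maxUstarSize === 8 * 1024 ** 3 - 1) // true: one byte short of 8 GiB
```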
It seems that pack.js line 98 will always fail to pack streams unless the size is set in the header data.
I maintain ctalkington/node-archiver and would like to consolidate efforts on the creation of tar archives. Would you be open to an alternative method that collects the stream before writing the header, in cases where the size isn't passed along and the source is a stream? If so, I can work up a PR.
Is there an option to follow symlinks, similar to tar's -L or -h option?
I had a bit of trouble figuring out how to serve a directory as a .tar.gz in an Express app. Here is a snippet showing how I accomplished it.
As a gist, https://gist.github.com/MadLittleMods/7eedb4001c52acec104e91dbd80618b5
const Promise = require('bluebird');
const path = require('path');
const fs = require('fs-extra');
const stat = Promise.promisify(fs.stat);
const glob = Promise.promisify(require('glob'));
const tarstream = require('tar-stream');
const zlib = require('zlib');
const express = require('express');
function targzGlobStream(globString, options) {
  const stream = tarstream.pack();

  const addFileToStream = (filePath, size) => {
    return new Promise((resolve, reject) => {
      const entry = stream.entry({
        name: path.relative(options.base || '', filePath),
        size: size
      }, (err) => {
        if (err) return reject(err);
        resolve();
      });

      fs.createReadStream(filePath)
        .pipe(entry);
    });
  };

  const getFileMap = glob(globString, Object.assign({ nodir: true }, options))
    .then((files) => {
      const fileMap = {};
      const stattingFilePromises = files.map((file) => {
        return stat(file)
          .then((fileStats) => {
            fileMap[file] = fileStats;
          });
      });

      return Promise.all(stattingFilePromises)
        .then(() => fileMap);
    });

  getFileMap.then((fileMap) => {
    // We can only add one file at a time
    return Object.keys(fileMap).reduce((promiseChain, file) => {
      return promiseChain.then(() => {
        return addFileToStream(file, fileMap[file].size);
      });
    }, Promise.resolve());
  })
  .then(() => {
    stream.finalize();
  });

  return stream.pipe(zlib.createGzip());
}

const app = express();

app.get('/logs.tar.gz', function (req, res) {
  const logDirPath = path.join(process.cwd(), './logs/');
  const tarGzStream = targzGlobStream(path.join(logDirPath, '**/*'), {
    base: logDirPath
  });

  res
    .set('Content-Type', 'application/gzip')
    .set('Content-Disposition', 'attachment; filename="logs.tar.gz"');

  tarGzStream.pipe(res);
});
when trying to do the following:
var readStream = fs.createReadStream(tarballPath);
var extractStream = tar.extract(nodeModulesPath);
readStream
.pipe(zlib.createGunzip())
.pipe(extractStream);
readStream.on('error', callback);
extractStream.on('error', callback);
extractStream.on('finish', callback);
I keep getting the following error in some cases. The only difference between the failing and non-failing cases is whether the tarball already exists on disk before running the script (doesn't fail) or I create the tarball and then try to extract it, but only after the finish event on tar.pack(nodeModulesPath) (fails). I've run an fs.existsSync call before the code above to confirm that the tarball exists and the node modules path does not.
TypeError: Cannot read property 'corked' of undefined
at Writable.end (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:429:12)
at emptyStream (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:18:5)
at onheader (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:135:34)
at Extract._write (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:207:8)
at doWrite (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:279:12)
at writeOrBuffer (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:266:5)
at Writable.write (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:211:11)
at write (_stream_readable.js:601:24)
at flow (_stream_readable.js:610:7)
at Gunzip.pipeOnReadable (_stream_readable.js:642:5)
This appears to be caused by the fact that _writableState is on the _parent property of the Source instance and not on the stream object itself.
I tried adding a line to the source instantiation to pass this property through so it's available in s.end(), but then I get the following error:
Error: write after end
at writeAfterEnd (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:161:12)
at Writable.write (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:208:5)
at Writable.end (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/node_modules/readable-stream/lib/_stream_writable.js:426:10)
at Extract._write (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:203:12)
at Extract._continue (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:171:28)
at oncontinue (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:62:10)
at onheader (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:143:5)
at Extract._write (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:207:8)
at Extract._continue (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:171:28)
at oncontinue (/Users/andrewdeandrade/code/unpm-install/node_modules/tar-fs/node_modules/tar-stream/extract.js:62:10)
My first intuition was that https://github.com/mafintosh/tar-stream/blob/master/extract.js#L31 should read PassThrough.call(this, self);, but that didn't fix the problem either (though it still might be something you want to add).
I tried checking the value of _writableState all the way through the instantiation stack. It's defined at the end of the constructor function for Writable(), but is undefined after the Writable.call(options) in the instantiation function of Duplex.
Any ideas on what could be causing this and how to fix it?
I only found it by accident. Maybe a Related
section or something.
Running a slightly modified version of the "Modifying existing tarballs" example code on the compressed tar file http://registry.npmjs.org/which/-/which-1.0.5.tgz, which other tar tools say is valid, generates:
processing entry package/bin/
_stream_readable.js:476
dest.on('unpipe', onunpipe);
^
TypeError: Cannot call method 'on' of undefined
at PassThrough.Readable.pipe (_stream_readable.js:476:8)
at null.<anonymous> (bug.js:27:9)
at EventEmitter.emit (events.js:106:17)
at onheader (node_modules\tar-stream\extract.js:101:9)
at Extract._write (node_modules\tar-stream\extract.js:172:7)
at doWrite (_stream_writable.js:226:10)
at writeOrBuffer (_stream_writable.js:216:5)
at Writable.write (_stream_writable.js:183:11)
at write (_stream_readable.js:583:24)
at flow (_stream_readable.js:592:7)
So this is either a documentation/example improvement to outline that pack.entry returns a stream or nothing depending on the header.type field,
or
a bug: extract.on('entry') does seem to provide a valid (but empty) stream to read from, but pack.entry does not return a sink to pipe this empty stream into, because the entry is a directory.
Modified example code below
// Location of problem tar file http://registry.npmjs.org/which/-/which-1.0.5.tgz
var tarStream = require('tar-stream');
var zlib = require('zlib');
var fs = require('fs');
var input = "which-1.0.5.tgz";
var output = "rewritten_which-1.0.5.tgz";
var inputTarfile = fs.createReadStream(input);
var outputTarfile = fs.createWriteStream(output);
var gunzip = zlib.createGunzip();
var gzip = zlib.createGzip();
// Stream copy the tar.gz
var tarExtract = tarStream.extract();
var tarPack = tarStream.pack();
tarExtract.on('entry', function (header, stream, callback) {
  console.log('processing entry ' + header.name);
  // write the unmodified entry to the pack stream
  stream.pipe(tarPack.entry(header, callback));
});

tarExtract.on('finish', function () {
  // all entries copied, add new entry
  tarPack.finalize();
});
//read input
inputTarfile.pipe(gunzip).pipe(tarExtract);
// write output
tarPack.pipe(gzip).pipe(outputTarfile);
Possibly a better example for the documentation, if this approach to fixing the bug is taken:
var extract = tar.extract();
var pack = tar.pack();
var path = require('path');

extract.on('entry', function (header, stream, callback) {
  // let's prefix all names with 'tmp'
  header.name = path.join('tmp', header.name);
  var entrySink = pack.entry(header, callback);
  // If no entrySink was returned then the entry was not a 'file' or 'contiguous-file',
  // so there is nothing to pipe data into
  if (typeof entrySink !== "undefined") {
    // write the new entry to the pack stream
    stream.pipe(entrySink);
  }
});

extract.on('finish', function () {
  // all entries done - lets finalize it
  pack.finalize();
});

// pipe the old tarball to the extractor
oldTarball.pipe(extract);

// pipe the new tarball to another stream
pack.pipe(newTarball);
Hi, sorry to raise this issue here, but I think you folks maintaining this module would understand it best, so I thought I might get some help.
I am using this module to read a .tgz file and to read every file's content from the tar. I am kind of stuck.
Here is what I am trying to do:
.tgz file structure:
root_folder
|-- _sub_folder1
| |-- file1
| |-- file2
| ....
|-- _sub_folder2
...
(in CoffeeScript) Read every sub folder and file:

extract = require('tar-stream').extract()
fs.createReadStream(FILE_PATH).pipe(zlib.createUnzip()).pipe(extract)
  .on 'entry', (header, stream, callback) ->
    console.log "header -->", header.name, header.size, header.type
    if header.type == "directory"
      # go inside this directory and find all files
      # read content of every file......
      # what should I do here ??
    else if header.type == "file"
      # read content of this file......
    stream.resume()
  .on 'error', ->
    console.log "error"
  .on 'finish', ->
    console.log "finished"
output:
header --> offline_2014-08-06_16:54:28/ 0 directory
this entry end
What version of the tar spec does tar-stream implement?
const stream = tar.pack()
stream.entry({ name: '/foo/test.txt' }, 'hello');
stream.finalize();
Now I have a tar stream; how do I gzip it?
Bumped my head against the wall for a while until I figured out that object serialization doesn't happen automatically when writing objects to an entry.
We could do something similar to Line 127 in 8e3b174: call JSON.stringify on the object to prevent the user from committing an empty buffer to the file?