Comments (20)
OK, it seems I've found what's happening here: the `tar: Saltando a la siguiente cabecera` ("Skipping to next header") errors only happen when the immediately preceding entry is a symlink:
tar: Sustituyendo `.' por un nombre miembro vacío
d--x--x--x 0/0 0 2015-10-28 17:00
-r-xr-xr-x 0/0 651800 2015-10-28 17:00 lib/libc.so
-r--r--r-- 0/0 96712 2015-10-28 17:00 lib/libgcc_s.so.1
-r--r--r-- 0/0 1250352 2015-10-28 17:00 lib/libstdc++.so.6.0.17
-r-xr-xr-x 0/0 11216736 2015-10-28 01:40 bin/node
tar: Sustituyendo `.' por un nombre miembro vacío
d--x--x--x 0/0 0 2015-10-28 17:02
-r-xr-xr-x 0/0 651800 2015-10-28 17:02 lib/libc.so
lr-xr-xr-x 0/0 8 2015-10-28 17:02 lib/ld-musl-x86_64.so.1 -> libc.so
tar: Saltando a la siguiente cabecera
-r--r--r-- 0/0 1250352 2015-10-28 17:02 lib/libstdc++.so.6.0.17
-r-xr-xr-x 0/0 11216736 2015-10-28 01:40 bin/node
tar: Saliendo con fallos debido a errores anteriores
As you can see, the second listing has a symlink, `lib/ld-musl-x86_64.so.1 -> libc.so`, and the next entry, `lib/libgcc_s.so.1`, fails. The file size also changes (13220352 -> 13220864) by exactly 512 bytes, which is the size of a tar block and therefore of the symlink entry's header.
Thoughts?
from tar-stream.
I've compared the blocks of a tar file containing just one file against another containing a symlink and a file, and besides the timestamp I find almost no differences... :-( There's a field after the timestamp that changes, but I don't know what it means: for the root dir it changes from `0x30, 0x30, 0x37, 0x32, 0x31, 0x37` to `0x30, 0x30, 0x37, 0x32, 0x31, 0x34`, while for the file it changes from `0x30, 0x31, 0x32, 0x32, 0x36, 0x32` to `0x30, 0x31, 0x32, 0x32, 0x35, 0x37`. Besides that and the inclusion of the symlink block there's no other change... :-(
OK, the number that's changing is the checksum... now I'm totally lost about what's happening here :-( I know there are some differences between implementations and variants of the `tar` file format... Maybe Docker, file-roller and `tar` all expect the tar file to be in a different format from the one generated by tar-stream?
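For reference, the field in question behaves like the ustar header checksum: the sum of all 512 header bytes, with the 8-byte checksum field itself (offset 148) counted as ASCII spaces, stored as octal digits. Read as ASCII, the byte sequences quoted earlier are the octal strings 007217 -> 007214 and 012262 -> 012257. A minimal sketch of the calculation, assuming a plain ustar header; `headerChecksum` and `storedChecksum` are hypothetical helper names, not part of tar-stream:

```javascript
// Sketch: compute the checksum of one 512-byte tar header block.
// In the ustar layout the checksum field occupies bytes 148..155 and is
// treated as eight ASCII spaces (0x20) while summing.
function headerChecksum(block) {
  if (block.length !== 512) throw new Error('a tar header is exactly 512 bytes');
  let sum = 0;
  for (let i = 0; i < 512; i++) {
    sum += (i >= 148 && i < 156) ? 0x20 : block[i];
  }
  return sum;
}

// Read the checksum stored in the header (octal digits, NUL/space terminated).
function storedChecksum(block) {
  const text = block.toString('ascii', 148, 156).replace(/[\0 ]+$/, '');
  return parseInt(text, 8);
}

// Example: a header whose only non-zero field is a one-byte name.
const block = Buffer.alloc(512);
block.write('a', 0, 'ascii');
const sum = headerChecksum(block); // 97 + 8 * 32 = 353
block.write(sum.toString(8).padStart(6, '0') + '\0 ', 148, 'ascii');
console.log(sum.toString(8), storedChecksum(block) === sum);
```

Any change to another header field (mode, size, mtime, ...) changes this sum, so a differing checksum is a symptom of some other difference rather than a cause in itself.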
I'm also encountering this "Invalid tar header" error using [email protected] and [email protected]. When I replace tar-stream with a [email protected] and [email protected] implementation, it works without error.
Repro steps to follow...
Repro project is here: https://github.com/cgmartin/node-tar-dockerode-repro (using Mac OS X)
Narrowed it down to a difference in file group ownership, which explains why everything was working fine for some other team members, but not for me...
When the files in the tar directory have GID 0 ("wheel" on my system), the invalid tar header error occurs. When I change the same files to a non-zero GID, it works!
Looks like the 0 GID matches up with the stat info that @piranna reported, too.
@cgmartin thanks for taking the time to put this together. do i need docker running to run your example?
@mafintosh no, thank you! Yes, those repro steps will need docker running. I was using Docker version 1.9.0, build 76d6bc9 on Yosemite. I'll try running some different gnu/bsd cli tar tools against the generated files (using the other `test-tarfile.js` script) to see if it can be reproduced outside of the docker API.
I just tried this on another mac system (El Capitan) with the same docker version and wasn't able to reproduce it. It worked with GID 0. This is definitely an odd one...
@cgmartin i'll try to run docker and see if i can reproduce
I was unable to repro with different tar cli tools on the files.
On the Yosemite 10.10.5 machine (w/FileVault enabled):
Tried different node versions (0.10.40, 0.12.3, 4.2.1), but got the same failed results running the `test-dockerode.js` scripts. I also tried different GIDs for the files: it is not just 0; it seems like ANY group that my user does not belong to causes the header error in my case. But if I use a GID that my user belongs to, it works fine. The tar + fstream method continues to work regardless of GID or other changes.
On the El Capitan 10.11.1 machine (w/FileVault disabled):
Tried the same node versions, with varied GID permissions and could not reproduce.
Both systems are using Docker 1.9.0 build 76d6bc9.
Just let me know if there are any other scenarios you'd like me to try - happy to help narrow down further.
Would you be able to compare both tar files with hexdump? That way we could see what's different between them...
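A rough way to narrow such a comparison down before reading a full hexdump is to check which 512-byte blocks differ. The sketch below assumes both archives fit in memory; `diffBlocks` is a hypothetical helper, and the two buffers are placeholder data rather than real archives:

```javascript
// Sketch: report which 512-byte blocks differ between two tar archives
// held in memory (for real files, load them with fs.readFileSync first).
const BLOCK = 512;

function diffBlocks(a, b) {
  const differing = [];
  const blocks = Math.ceil(Math.max(a.length, b.length) / BLOCK);
  for (let i = 0; i < blocks; i++) {
    const start = i * BLOCK;
    // subarray clamps to the buffer length, so a missing block is empty
    const blockA = a.subarray(start, start + BLOCK);
    const blockB = b.subarray(start, start + BLOCK);
    if (!blockA.equals(blockB)) differing.push(i);
  }
  return differing;
}

// Tiny demonstration with two fake three-block "archives".
const first = Buffer.alloc(3 * BLOCK);
const second = Buffer.from(first);
second[BLOCK + 124] = 0x38; // change one byte inside block 1
console.log(diffBlocks(first, second));
```

If one archive contains an extra header block, every later block is shifted by 512 bytes, so this positional comparison is only meaningful up to the first inserted block.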
Any update on this? Is there something I can do to help here? I need this working to port NodeOS to Docker... :-/
@piranna i forgot where we left this. do we have a simple test case that showcases this bug without using docker? then i can probably easily fix it
Yes, it seems GNU tar and file-roller also have problems parsing tar files generated by tar-stream that contain symlinks, as the error happens on the entries following those symlinks. A tar file's structure is just concatenated records, so maybe it could be tested by generating two tar files, one with a single file and another with a symlink and a file, and looking for differences...
great. could you gist an example that produces a simple problematic tar ball? then i'll take a look at it
I have been trying to reproduce the problem in a small script, without luck :-( I've checked, and it doesn't happen on all symlinks, as you can see in the attached file (decompress it and keep only `barebones.tar`, that's the problematic one), which was generated with the cpio2tar script I posted before. The output I get is:
> tar -tvi < barebones.tar
tar: Sustituyendo `.' por un nombre miembro vacío
d--x--x--x 0/0 0 2016-03-04 16:04
dr-xr-xr-x 0/0 0 2016-03-04 16:04 bin
d--------- 0/0 0 2016-03-04 16:04 dev
d--x--x--x 0/0 0 2016-03-04 16:04 lib
d--x------ 0/0 0 2016-03-04 16:04 sbin
---x------ 0/0 4080 2016-03-01 10:57 init
l--x------ 0/0 10 2016-03-04 16:04 sbin/init -> /bin/node
-r-xr-xr-x 0/0 609336 2016-03-04 16:04 lib/libc.so
lr-xr-xr-x 0/0 8 2016-03-04 16:04 lib/ld-musl-i386.so.1 -> libc.so
tar: Saltando a la siguiente cabecera
-r--r--r-- 0/0 1878024 2016-03-04 16:04 lib/libstdc++.so.6.0.21
lr--r--r-- 0/0 20 2016-03-04 16:04 lib/libstdc++.so.6 -> libstdc++.so.6.0.21
tar: Saltando a la siguiente cabecera
tar: Saliendo con fallos debido a errores anteriores
I've seen a similar problem, you can try this to produce the error.
mkdir -p data
echo "123 + 456" > data/sample.in
echo "579" > data/sample.out
tar zcvf data.tar.gz -C data .
node extract.js
And here is my extract.js
var fs = require('fs'),
    gunzip = require('gunzip-maybe'),
    tarStream = require('tar-stream');

var read = fs.createReadStream('data.tar.gz')
var extract = tarStream.extract()
read.pipe(gunzip()).pipe(extract)

extract.on('entry', function(entry, stream, cb) {
  console.log('entry (:', entry.name)
  stream.pipe(fs.createWriteStream(entry.name))
  stream.on('end', cb)
})

extract.on('error', function(err) {
  console.log('error ):', err)
})

extract.on('finish', function() {
  console.log('yay !!')
})
edit:
BTW, it's quite easy to avoid:
if (entry.type === 'file') {
  // Do something awesome
  cb();
} else {
  // Do nothing
  cb();
}
This problem happens when using tar-stream for packing, not for extracting. Why do you say they are related?
It seems I've been able to track this down with the help of @luii, and in fact it's nothing new. It seems GNU tar (booooh...) requires the size field of a symlink entry header to be zero, while we are setting it to the length of the target path (octal 22, i.e. 18 bytes, in this case). Here are snippets showing the same symlink from two tar files, one generated with tar-stream and the other with `tar`:
00003e00 64 6f 63 73 2f 6c 61 79 65 72 31 00 00 00 00 00 |docs/layer1.....|
00003e10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003e60 00 00 00 00 30 30 30 37 37 37 20 00 30 30 31 37 |....000777 .0017|
00003e70 35 30 20 00 30 30 30 31 34 34 20 00 30 30 30 30 |50 .000144 .0000|
00003e80 30 30 30 30 30 32 32 20 31 32 37 30 34 35 32 31 |0000022 12704521|
00003e90 33 36 31 20 30 31 34 33 37 37 20 00 32 4e 6f 64 |361 014377 .2Nod|
00003ea0 65 4f 53 20 6c 61 79 65 72 20 31 2e 70 6e 67 00 |eOS layer 1.png.|
00003eb0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003f00 00 75 73 74 61 72 00 30 30 00 00 00 00 00 00 00 |.ustar.00.......|
00003f10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003f40 00 00 00 00 00 00 00 00 00 30 30 30 30 30 30 20 |.........000000 |
00003f50 00 30 30 30 30 30 30 20 00 00 00 00 00 00 00 00 |.000000 ........|
00003f60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00006600 64 6f 63 73 2f 6c 61 79 65 72 31 00 00 00 00 00 |docs/layer1.....|
00006610 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00006660 00 00 00 00 30 30 30 30 37 37 37 00 30 30 30 31 |....0000777.0001|
00006670 37 35 30 00 30 30 30 30 31 34 34 00 30 30 30 30 |750.0000144.0000|
00006680 30 30 30 30 30 30 30 00 31 32 37 30 34 35 32 31 |0000000.12704521|
00006690 33 36 31 00 30 31 35 35 36 33 00 20 32 4e 6f 64 |361.015563. 2Nod|
000066a0 65 4f 53 20 6c 61 79 65 72 20 31 2e 70 6e 67 00 |eOS layer 1.png.|
000066b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00006700 00 75 73 74 61 72 20 20 00 70 68 69 6c 69 70 70 |.ustar  .philipp|
00006710 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00006720 00 00 00 00 00 00 00 00 00 75 73 65 72 73 00 00 |.........users..|
00006730 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
This is probably because I'm using the stream API, since the cpio-stream module sends the symlink path as its content.
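One way to picture why a non-zero size on a symlink header breaks the entries that follow, assuming the reader skips ceil(size / 512) data blocks after every header regardless of its type (an assumption about the reader's behaviour, not something taken from the GNU tar source): since no data block follows the symlink header, the next header gets consumed as if it were data. The helpers `fakeHeader` and `listNames` and the entry `docs/readme.txt` are made up for illustration, and the headers carry only a name, size and typeflag:

```javascript
// Sketch of a naive tar reader. Assumption: after each header it skips
// ceil(size / 512) data blocks, whatever the entry type is.
const BLOCK_SIZE = 512;

// Build a minimal header: name at offset 0, octal size at 124, typeflag at 156.
function fakeHeader(name, size, typeflag) {
  const header = Buffer.alloc(BLOCK_SIZE);
  header.write(name, 0, 'ascii');
  header.write(size.toString(8).padStart(11, '0'), 124, 'ascii');
  header.write(typeflag, 156, 'ascii');
  return header;
}

function listNames(archive) {
  const names = [];
  let offset = 0;
  while (offset + BLOCK_SIZE <= archive.length) {
    const header = archive.subarray(offset, offset + BLOCK_SIZE);
    names.push(header.toString('ascii', 0, header.indexOf(0)));
    const size = parseInt(header.toString('ascii', 124, 135), 8);
    // Skip the header itself plus the data blocks its size field announces.
    offset += BLOCK_SIZE + Math.ceil(size / BLOCK_SIZE) * BLOCK_SIZE;
  }
  return names;
}

// Symlink header (typeflag '2') followed directly by an empty regular file.
const withSize = Buffer.concat([
  fakeHeader('docs/layer1', 18, '2'),
  fakeHeader('docs/readme.txt', 0, '0'),
]);
const zeroSize = Buffer.concat([
  fakeHeader('docs/layer1', 0, '2'),
  fakeHeader('docs/readme.txt', 0, '0'),
]);
console.log(listNames(withSize)); // the entry after the symlink is swallowed
console.log(listNames(zeroSize)); // both entries are listed
```

In a real archive the skipped block would be followed by file data rather than another header, which would fit the "Skipping to next header" messages in the listings above.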
Indeed, cpio-stream is setting the symlink's size to the length of the path, but I don't think that's (much of) a cpio-stream bug; rather it's one in GNU tar. Anyway, I think tar-stream should explicitly set the `size` header to zero for symlinks to prevent this.
/cc @finnp
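A minimal sketch of that idea, assuming the standard ustar field offsets: the encoder ignores whatever size the caller passes for a symlink and writes zero instead. `encodeHeader` and `octal` are hypothetical names, and this is not tar-stream's actual implementation:

```javascript
// Sketch: encode a ustar header, forcing size to 0 for symlinks.
// Field offsets follow the ustar layout; this is not tar-stream's code.
function octal(value, width) {
  // width - 1 zero-padded octal digits followed by a NUL terminator
  return value.toString(8).padStart(width - 1, '0') + '\0';
}

function encodeHeader(entry) {
  const isSymlink = entry.type === 'symlink';
  const header = Buffer.alloc(512);
  header.write(entry.name, 0, 100, 'ascii');
  header.write(octal(entry.mode, 8), 100, 'ascii');
  header.write(octal(0, 8), 108, 'ascii');             // uid
  header.write(octal(0, 8), 116, 'ascii');             // gid
  header.write(octal(isSymlink ? 0 : entry.size, 12), 124, 'ascii');
  header.write(octal(entry.mtime, 12), 136, 'ascii');
  header.write('        ', 148, 'ascii');              // checksum placeholder
  header.write(isSymlink ? '2' : '0', 156, 'ascii');   // typeflag
  if (isSymlink) header.write(entry.linkname, 157, 100, 'ascii');
  header.write('ustar\0' + '00', 257, 'ascii');        // magic + version
  let sum = 0;
  for (const byte of header) sum += byte;              // placeholder counts as spaces
  header.write(octal(sum, 7) + ' ', 148, 'ascii');
  return header;
}

const link = encodeHeader({
  name: 'docs/layer1',
  type: 'symlink',
  linkname: 'NodeOS layer 1.png',
  mode: 0o777,
  mtime: 0,
  size: 18, // length of the link target, as passed in by the caller
});
console.log(link.toString('ascii', 124, 135)); // prints "00000000000"
```

Forcing the value in the encoder keeps callers such as cpio-stream working unchanged even when they report the link target's length as the entry size.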
Already fixed in master, closing.