
jarchivelib's Introduction

jarchivelib


A simple archiving and compression library for Java that provides a thin and easy-to-use API layer on top of the powerful and feature-rich org.apache.commons.compress.

Usage

Using the ArchiverFactory

Create a new Archiver to handle zip archives

Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.ZIP);

Create a new Archiver to handle tar archives with gzip compression

Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.TAR, CompressionType.GZIP);

Alternatively you can use string representations of the archive and compression types.

Archiver archiver = ArchiverFactory.createArchiver("zip");

The ArchiverFactory can also detect archive types based on file extensions and hand you the correct Archiver. This example returns an Archiver instance that handles tar.gz files (it would also recognize the .tgz extension):

Archiver archiver = ArchiverFactory.createArchiver(new File("archive.tar.gz"));

Using Archivers

Extract

To extract the zip archive /home/jack/archive.zip to /home/jack/archive:

File archive = new File("/home/jack/archive.zip");
File destination = new File("/home/jack/archive");

Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.ZIP);
archiver.extract(archive, destination);

Create

To create a new tar archive with gzip compression, archive.tar.gz, in /home/jack/, containing the entire directory /home/jack/archive:

String archiveName = "archive";
File destination = new File("/home/jack");
File source = new File("/home/jack/archive");

Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.TAR, CompressionType.GZIP);
File archive = archiver.create(archiveName, destination, source);

Notice that you can omit the filename extension in the archive name; the archiver appends it automatically if it is missing.

Stream

To access the contents of an archive as a stream, rather than extracting them directly onto the filesystem:

ArchiveStream stream = archiver.stream(archive);
ArchiveEntry entry;

while((entry = stream.getNextEntry()) != null) {
    // access each archive entry individually using the stream
    // or extract it using entry.extract(destination)
    // or fetch meta-data using entry.getName(), entry.isDirectory(), ...
}
stream.close();

Dependencies

  • commons-compress 1.21

Compatibility

  • Java 7, 8, 9, 10, 14
  • Currently only tested for *nix file systems.

OSGi

jarchivelib compiles to a bundle and is OSGi compatible.

jarchivelib 0.8.x and below

  • Java 6 and 7

Known limitations

  • Permissions are not stored when creating archives
  • There is no support for Windows permissions
  • JAR files are treated like streamed zip files and can not restore permissions

jarchivelib's People

Contributors

benson-basis, cliffcotino, dependabot[bot], gbrova, muff1nman, thrau, valfirst


jarchivelib's Issues

Move to github

I will move the focus of the project from bitbucket to github.

Backwards compatibility for Java 1.6

Due to popular demand, the source will be downgraded to make it compatible with Java 1.6.

This means:

  • remove try-with-resource blocks
  • re-implement closeQuietly method for closing streams
  • remove diamond operator

Producing archives with non-standard extensions

Hello

Would you consider taking a PR that made it possible to specify an alternative file extension? I'd like to use jarchivelib to create "dump" files for Neo4j, but we don't want to advertise the file format in the extension.

Thanks
-Ben

Strange external processes spawned

I was having an issue on Linux where my Maven process wouldn't finish properly. After a couple of hours of digging, it looked like extra threads were being spun up and not properly cleaned up. Unfortunately, the root cause was jarchivelib; I didn't have the issue on Windows. After digging through your code, I suspect it's the spawning of "chmod" as an external process to apply permissions. Java 7 has pretty good support for setting POSIX permissions entirely in Java, with no need for an external process. I've reverted back to Apache commons-compress for now, but wanted to give you a heads-up on the issue.
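As the report suggests, on Java 7+ the external chmod call can be replaced with java.nio. A minimal stdlib sketch (not jarchivelib's actual implementation) of applying POSIX permissions in-process:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class PosixPermissionsDemo {
    /** Applies a symbolic mode string (e.g. "rwxr-xr-x") without spawning chmod. */
    static void setMode(Path file, String mode) throws IOException {
        // fromString expects a 9-character rwxrwxrwx-style string
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        Files.setPosixFilePermissions(file, perms);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("perm-demo", ".sh");
        setMode(tmp, "rwxr-xr-x");
        System.out.println(Files.isExecutable(tmp)); // prints "true" on POSIX systems
        Files.delete(tmp);
    }
}
```

This only works on POSIX file systems (Files.setPosixFilePermissions throws UnsupportedOperationException elsewhere), which matches the library's stated *nix-only testing.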

Get extracted size?

Hi,

is it possible to get the size that an extracted tarball (tgz) would have on the filesystem, through one of the lib's methods?

Cheers.

Wrap permission accessors for commons' ArchiveEntry

There is no unified interface in commons' ArchiveEntry to access permission flags as they differ greatly for each different archive type.

With some type of wrapper hierarchy that focuses on POSIX permissions, it should be possible to cover a high percentage of use cases. No idea how to map non-POSIX permissions yet though (ntfs/fat being the most important).

Not Honoring Unix Permissions

The CompressArchiveEntry does not take into account the Unix permissions of commons' ArchiveEntry. These can be read from a zip entry via getUnixMode(), for example. Tar files also support executable files (from what I remember), but I can't see any way to pull that data from a TarArchiveEntry.

Unpack from a stream

I have a use case where I have an input stream (and I know the archive type) and I'd like to unpack it to a directory. It looks as if I can't do that; am I missing something?
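Until a streaming extraction API exists, one workaround is to spool the stream to a temporary file and fall back to the File-based API. A stdlib sketch (the jarchivelib calls are shown as comments, since they need the library on the classpath):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StreamSpool {
    /**
     * Copies an input stream to a temporary file so that the
     * File-based Archiver API can be used on it afterwards.
     */
    static Path spoolToTempFile(InputStream in, String suffix) throws IOException {
        Path tmp = Files.createTempFile("archive", suffix);
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        // afterwards, e.g.:
        // Archiver archiver = ArchiverFactory.createArchiver(tmp.toFile());
        // archiver.extract(tmp.toFile(), destinationDir);
        return tmp;
    }
}
```

The suffix matters because ArchiverFactory can probe the archive type from the file extension, as shown in the usage section above.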

Compressors can't (de)compress into directories

Currently they can only compress into files directly, which is impractical and inconsistent. Like the Archiver, the Compressor API should allow compressing and decompressing files into directories. Filename extensions need to be handled accordingly. This might be awkward when removing the filename extension of the compression type from the file (e.g. when a "gz" compressor is created, but the filename extension is ".gzip").

Add "NONE" compressor

It would be nice to have a compressor that explicitly denotes that there is no compression.

Main reason: it would be possible to always use the same method to create archivers using some dynamic mechanism.

org.rauschig:jarchivelib maven jar contains commons-compress classes

org.rauschig:jarchivelib:jar:0.6.1 maven artifact simultaneously depends on and contains org.apache.commons:commons-compress:jar:1.8 (you can see this here, in the "Packages" section).

If maven-enforcer-plugin is configured with BanDuplicateClasses rule, plain Java (non-OSGI) projects depending on jarchivelib will fail to build:

[WARNING] Rule 2: org.apache.maven.plugins.enforcer.BanDuplicateClasses failed with message:
Duplicate classes found:

  Found in:
    org.rauschig:jarchivelib:jar:0.6.0:compile
    org.apache.commons:commons-compress:jar:1.8:compile

Looks like this had been done to package jarchivelib as an OSGI bundle. Using Import-Package: org.apache.commons.compress.*;version=1.8 instead of Embed-Dependency will probably solve this problem and will ensure that the users can add dependency on externally-packaged commons-compress bundle, e.g. from Eclipse Orbit. See this stackoverflow question for a similar solution.

Get next entry after using skip?

Hello,

so I'm trying to do seeking inside a .tar.gz and I've attempted to get offset of a certain file I need, and I know its size.

            byte[] fileBytes = new byte[entrySize];
            stream2.read(fileBytes, entryOffset, entrySize);

However, this doesn't seem to produce a usable file at all.

Any ideas? Or maybe I could skip(entryOffset) and then getCurrentEntry?

Thanks.
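The snippet above misuses the offset parameter: in InputStream.read(byte[] b, int off, int len), off is an offset into the buffer b, not a position in the stream. To advance within the stream, use skip() first (noting that for a .tar.gz, offsets refer to the uncompressed tar stream, and skipping still decompresses everything up to that point). A stdlib sketch with a hypothetical readAt helper:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipReadDemo {
    /** Reads len bytes starting at streamOffset by skipping forward first. */
    static byte[] readAt(InputStream in, long streamOffset, int len) throws IOException {
        long toSkip = streamOffset;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip); // skip() may skip fewer bytes than asked
            if (skipped <= 0) throw new IOException("could not skip to offset");
            toSkip -= skipped;
        }
        byte[] buf = new byte[len];
        int read = 0;
        while (read < len) {
            int n = in.read(buf, read, len - read); // offset is into buf, not the stream
            if (n < 0) throw new IOException("unexpected end of stream");
            read += n;
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("hello world".getBytes());
        System.out.println(new String(readAt(in, 6, 5))); // prints "world"
    }
}
```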

Wrap ZipFile as ArchiveInputStream

From the commons-compress documentation for zip:

This means the ZIP format cannot really be parsed correctly while reading a non-seekable stream, which is what ZipArchiveInputStream is forced to do.
[...]
If possible, you should always prefer ZipFile over ZipArchiveInputStream.

This is necessary for #14 to retrieve file attributes properly.

Adding entries to existing archives

This would be a great feature, but it will require actually representing archives by some interface, which will in turn involve more API changes, or indeed a redesign of the entire Archiver API.

Back to the Future - Java 7

Reverting back to Java 6 was fun and all, but preserving file attributes is painful, and not properly possible, without Java 7. To fully support attribute preservation, I need java.nio.

For jarchivelib v2.0.0 I will be using Java 7.

Add top-level folder to archive

Suppose I have the following file structure:

└── rootFolder
    ├── folder1
    │   └── someFile
    ├── folder2
    │   └── someFile
    └── folder3
        └── someFile

and I want to add folder1 and folder2 to an archive.

The following code doesn't quite work, because it will just add someFile twice (without adding the roots folder1 and folder2):

Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.ZIP);
archiver.create("archive", dest, new File("rootFolder/folder1"), new File("rootFolder/folder2"));

produces

└── archive.zip
    └── someFile

But I'd like to obtain:

└── archive.zip
    ├── folder1
    │   └── someFile
    └── folder2
        └── someFile

Any thoughts on how to achieve this?
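One workaround, until jarchivelib supports this directly, is to build the zip with java.util.zip and name each entry relative to the source folder's parent, so the folder names survive as top-level directories. A hedged sketch (zipDirs is a hypothetical helper, not part of the jarchivelib API):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipWithRoots {
    /** Zips each source directory so its own name becomes a top-level folder in the archive. */
    static void zipDirs(Path zipFile, Path... dirs) throws IOException {
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path dir : dirs) {
                Path base = dir.getParent(); // relativize against the parent to keep "folder1/..."
                try (Stream<Path> files = Files.walk(dir)) {
                    for (Path p : (Iterable<Path>) files::iterator) {
                        if (Files.isDirectory(p)) {
                            continue;
                        }
                        zip.putNextEntry(new ZipEntry(base.relativize(p).toString().replace('\\', '/')));
                        Files.copy(p, zip);
                        zip.closeEntry();
                    }
                }
            }
        }
    }
}
```

With the example structure, zipDirs(dest, rootFolder/folder1, rootFolder/folder2) would produce entries folder1/someFile and folder2/someFile rather than two bare someFile entries.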

Remove commons-io dependency

The last remnant of commons-io is the use of FileUtils to force-delete files and directories; this could be removed.

File extension handling

The way file types and suffixes are handled grew into an unclean implementation.

Desired behaviour of e.g. archive-compressors that handle ".tgz" rather than composite extensions such as ".tar.gz", has to be thought through and a clean concept derived.

Consequently, the APIs of Archiver and ArchiverFactory have to be more precise about how to deal with probing of file types and with creating archives where the file suffix is appended automatically.

Separate archiver/compressor auto-detection behavior

The commons stream factories provide methods for creating input streams based on auto-detection. This functionality is currently used without much consideration: Archivers can extract anything that is auto-detectable by the ArchiveStreamFactory, because CommonsArchiver creates the streams by passing the InputStream of the file to extract, rather than the archive format.
While this may be convenient behavior in some cases, it's confusing and unclean, and it can cause issues where archive/compression types are not actually auto-detectable (all the new compression formats, or JAR, for example).

ArchiveStreamFactory cannot tell JAR archives from ZIP archives and will not auto-detect JARs.

Auto-detection should be segregated and properly documented.

Support for symlinks

Currently links are not supported at all. Depending on the archive type, weird things will happen.

Add Archiver.stream(ArchiveStream) support

Debian archive files (.deb) are an "ar" file containing a set of tar.gz files.

I'd love to be able to pass the ArchiveStream from the "ar" into a new Archiver.stream() call to extract a file from the embedded tar.gz.

I'm not familiar enough with Streams to know if that would work. Archiver.stream() appears to take a File only.

Thoughts?

Deleting a file fails after Compressor.decompress()

Simple and clean API but I have an issue.

I am using a test file named fileName.gz. It is not a compressed file but a text file with the .gz extension. This file seems to be locked by a process when I try to delete it.

compressor.decompress(f, new File(f.getParent())) throws an exception as expected, but afterwards I cannot delete the file.

Here is my code:

private boolean decompress(File f) {
    try {
        Compressor compressor = CompressorFactory.createCompressor(f);
        System.out.println("COMPRESSED FILE");

        compressor.decompress(f, new File(f.getParent()));
        FileUtils.deleteQuietly(f);

        return true;
    } catch (IllegalArgumentException e) {
        // ignored
    } catch (FileNotFoundException e) {
        // ignored
    } catch (Exception e) {
        System.out.println(f.canExecute()); // is true
        System.out.println(f.canRead());    // is true
        System.out.println(f.canWrite());   // is true
        System.out.println(f.exists());     // is true

        System.out.println(f.delete());     // is false

        System.err.println(e);
    }

    return false;
}

Dedicated pack200 test

Testing Pack200 works differently from testing gz or bzip2. Create tests that explicitly assert the Pack200 integration functionality.
