arx's Introduction

What is arx

Arx is a file archive format based on the Jubako container format.

It allows you to create, read, and extract file archives (as zip or tar do).

Arx (and Jubako) are in active development. While they work pretty well, I do not recommend using arx for backups. However, you can use it to transfer data or explore archives.

How it works

Jubako is a versatile container format that stores data, compressed or not, in a structured way. Its main advantage (apart from its versatility) is that it is designed to allow quick retrieval of data from the archive without decompressing the whole archive.

Arx uses the Jubako format to create arx archives, which:

  • Store file data compressed.
  • Store files in a directory/tree structure.
  • Allow random access to the archive to get specific files.
  • Can be mounted to explore and use (read-only) the files in the archive without decompressing it.
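
To make the random-access point concrete, here is a toy sketch in Python. It is not the Jubako layout: the function names, the cluster size, and the use of zlib (from the standard library, standing in for zstd) are all assumptions made for illustration. It only shows one general way to get this property: pack files into independently compressed clusters, keep an index, and decompress a single cluster to read a single file.

```python
import zlib

def build_archive(files, cluster_size=2):
    """Pack files into independently compressed clusters plus an index.

    Hypothetical layout for illustration only; not the Jubako format.
    """
    names = sorted(files)
    clusters = []
    index = {}
    for start in range(0, len(names), cluster_size):
        raw = b""
        for name in names[start:start + cluster_size]:
            data = files[name]
            # Remember where the file lives: (cluster id, offset, size).
            index[name] = (len(clusters), len(raw), len(data))
            raw += data
        clusters.append(zlib.compress(raw))
    return clusters, index

def read_file(clusters, index, name):
    """Read one file by decompressing only the cluster that holds it."""
    cluster_id, offset, size = index[name]
    raw = zlib.decompress(clusters[cluster_id])
    return raw[offset:offset + size]

files = {"a.txt": b"alpha", "b.txt": b"beta", "c.txt": b"gamma"}
clusters, index = build_archive(files)
print(read_file(clusters, index, "c.txt"))  # b'gamma'
```

Reading c.txt touches only the second cluster; the first one stays compressed.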

Install arx

Binaries for Windows, macOS and Linux are available for every release. You can also install arx using Cargo:

cargo install arx

Use arx

Create an archive

Creating an archive is simple:

arx create -o my_archive.arx -r my_directory

It will create one file, my_archive.arx, which will contain the my_directory directory.

Extract an archive

Extracting (decompressing) an archive is done with:

arx extract my_archive.arx -C my_out_dir

Listing the content of an archive

You can list the files in the archive with:

arx list my_archive.arx

If you want to access the content of a single file:

arx dump my_archive.arx my_directory/path/to/my_file > my_file
# or
arx dump my_archive.arx my_directory/path/to/my_file -o my_file
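
If you want to script these steps, a minimal Python sketch could look like the following. It only assembles the command lines shown above (the create, list and dump subcommands with their -o and -r options); the helper names are hypothetical, and run() assumes an arx binary is available on your PATH.

```python
import shlex
import shutil
import subprocess

def create_cmd(archive, directory):
    """Command line to create `archive` from `directory` (as shown above)."""
    return ["arx", "create", "-o", archive, "-r", directory]

def list_cmd(archive):
    """Command line to list the files in `archive`."""
    return ["arx", "list", archive]

def dump_cmd(archive, member, output):
    """Command line to write one archived file to `output`."""
    return ["arx", "dump", archive, member, "-o", output]

def run(cmd):
    """Run a command and return its standard output (hypothetical helper)."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

# Print the commands instead of running them, so this works without arx.
for cmd in (
    create_cmd("my_archive.arx", "my_directory"),
    list_cmd("my_archive.arx"),
    dump_cmd("my_archive.arx", "my_directory/path/to/my_file", "my_file"),
):
    print(shlex.join(cmd))
```

Building each command as a list of arguments (rather than one shell string) avoids quoting problems with paths that contain spaces.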

Mounting the archive

On Linux, you can mount the archive using FUSE.

mkdir mount_point
arx mount my_archive.arx mount_point

If you don't provide a mount_point, arx will create a temporary one for you:

arx mount my_archive.arx # Will create my_archive.arx.xxxxxx

arx keeps running until you unmount mount_point.

Converting a zip archive to an arx archive

zip2arx -o my_archive.arx my_zip_archive.zip

Converting a tar archive to an arx archive

tar2arx -o my_archive.arx my_tar_archive.tar.gz

or

tar2arx -o my_archive.arx https://example.com/my_tar_archive.tar.gz

Performance

The following compares the performance of Arx with other archive formats.

  • Arx, Tar, and Squashfs compress the content using zstd, level 5.
  • Zip is compressed using level 9.
  • FS is the plain file system (no archive). Archive creation and extraction are simulated with cp -a.

Tests were run on different data sets:

  • the whole Linux kernel (linux-5.19)
  • the drivers directory in the Linux kernel
  • the Documentation directory in the Linux kernel

The source directory is stored on an SSD. All tests are run on a tmpfs (archives and extracted files are stored in memory).

Mount diff time is the time needed to diff the mounted archive against the source directory:

arx mount archive.arx mount_point &
time diff -r mount_point/linux-5.19 linux-5.19
umount mount_point

The tar and zip archives are mounted with the archivemount tool. Squashfs is mounted by the kernel. SquashfsFuse is mounted through the FUSE API. Arx mount is also implemented with the FUSE API.

Linux doc

Documentation directory only of the Linux source code:

Type          Creation    Size       Extract     Listing     Mount diff  Dump
Arx           150ms963μs  11.10 MB   038ms395μs  004ms051μs  299ms764μs  005ms618μs
FS            150ms639μs  38.45 MB   106ms821μs  006ms962μs  077ms414μs  498μs
Squashfs      103ms076μs  10.60 MB   098ms787μs  005ms365μs  261ms533μs  002ms088μs
SquashfsFuse  097ms863μs  10.60 MB   -           -           748ms597μs  -
Tar           141ms079μs  9.68 MB    065ms744μs  041ms015μs  02m41s      042ms143μs
Zip           01s083ms    15.22 MB   388ms720μs  037ms044μs  03m06s      014ms088μs

The following table gives each format's value divided by the Arx value (time / Arx time, and size / Arx size for the Size column). A ratio greater than 100% means Arx is better.

Type          Creation  Size  Extract  Listing  Mount diff  Dump
FS            100%      346%  278%     172%     26%         9%
Squashfs      68%       95%   257%     132%     87%         37%
SquashfsFuse  65%       95%   -        -        250%        -
Tar           93%       87%   171%     1012%    53997%      750%
Zip           718%      137%  1012%    914%     62350%      251%
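
As a worked example of how these ratios are derived, the Python sketch below parses the duration strings used in the tables and recomputes one cell. The function names are hypothetical, and the parser only assumes the units that appear above (m, s, ms, μs). Cells shown with coarser precision (such as 02m41s or 01s083ms) will not reproduce the published ratio exactly, since the digits that were dropped are not available.

```python
import re

# Microseconds per unit; both the Greek mu and the micro sign are accepted.
UNIT_US = {"m": 60_000_000, "s": 1_000_000, "ms": 1_000,
           "\u03bcs": 1, "\u00b5s": 1}
TOKEN = re.compile(r"(\d+)(ms|[\u03bc\u00b5]s|m|s)")
WHOLE = re.compile(r"(?:\d+(?:ms|[\u03bc\u00b5]s|m|s))+")

def parse_duration_us(text):
    """Convert a string such as '041ms015μs' or '02m41s' to microseconds."""
    if not WHOLE.fullmatch(text):
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(number) * UNIT_US[unit]
               for number, unit in TOKEN.findall(text))

def ratio_percent(other, arx):
    """Ratio other / arx, as a rounded percentage."""
    return round(100 * parse_duration_us(other) / parse_duration_us(arx))

# Listing column of the Linux doc table: Tar versus Arx.
print(ratio_percent("041ms015μs", "004ms051μs"))  # 1012
```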

Linux Driver

Drivers directory only of the Linux source code:

Type          Creation    Size       Extract     Listing     Mount diff  Dump
Arx           01s060ms    98.23 MB   241ms699μs  009ms516μs  01s290ms    007ms193μs
FS            778ms095μs  799.02 MB  523ms191μs  021ms578μs  467ms559μs  495μs
Squashfs      829ms886μs  121.70 MB  435ms851μs  012ms289μs  01s629ms    002ms190μs
SquashfsFuse  829ms237μs  121.70 MB  -           -           03s823ms    -
Tar           911ms042μs  97.96 MB   515ms178μs  472ms060μs  -           504ms231μs
Zip           20s498ms    141.91 MB  03s665ms    098ms194μs  -           034ms481μs

The following table gives each format's value divided by the Arx value (time / Arx time, and size / Arx size for the Size column). A ratio greater than 100% means Arx is better.

Type          Creation  Size  Extract  Listing  Mount diff  Dump
FS            73%       813%  216%     227%     36%         7%
Squashfs      78%       124%  180%     129%     126%        30%
SquashfsFuse  78%       124%  -        -        296%        -
Tar           86%       100%  213%     4961%    -           7010%
Zip           1932%     144%  1516%    1032%    -           479%

Linux Source Code

Type          Creation    Size       Extract     Listing     Mount diff  Dump
Arx           02s104ms    170.97 MB  435ms846μs  022ms238μs  02s829ms    010ms613μs
FS            01s605ms    1.12 GB    01s046ms    043ms358μs  943ms546μs  493μs
Squashfs      01s430ms    201.43 MB  725ms532μs  024ms050μs  03s272ms    002ms374μs
SquashfsFuse  01s417ms    201.43 MB  -           -           13s864ms    -
Tar           01s479ms    168.77 MB  938ms758μs  799ms550μs  -           802ms427μs
Zip           31s810ms    252.96 MB  06s260ms    256ms137μs  -           045ms722μs

The following table gives each format's value divided by the Arx value (time / Arx time, and size / Arx size for the Size column). A ratio greater than 100% means Arx is better.

Type          Creation  Size  Extract  Listing  Mount diff  Dump
FS            76%       674%  240%     195%     33%         5%
Squashfs      68%       118%  166%     108%     116%        22%
SquashfsFuse  67%       118%  -        -        490%        -
Tar           70%       99%   215%     3595%    -           7561%
Zip           1511%     148%  1436%    1152%    -           431%

Kernel compilation is the time needed to compile the whole kernel with the default configuration (-j8). For arx, the kernel is compiled from the sources in the archive mounted at mount_point.

Kernel compilation is done in "real" conditions. The source directory or the arx archive is stored on an SSD.

Type  Compilation
Arx   40m
FS    32m

Arx archives are a bit bigger (about 1%) than tar.zst archives but 15% smaller than Squashfs images. Creation takes somewhat longer with arx, although the times remain comparable, and full extraction is faster with arx in the tables above.

Listing files or accessing individual files in the archive is far faster with arx or Squashfs than with tar. Access time is almost constant, regardless of the size of the archive. For tar, however, the time to access an individual file grows sharply as the archive size increases.

Mounting an arx archive makes the archive usable without extracting it. A simple diff -r takes about 3 to 4 times longer than a plain diff between two directories, but this is a particular use case (all files are accessed "sequentially" and only once).

For the Linux documentation, however, the arx mount diff is about 540 times faster than the tar one (about 300 ms for arx versus 2m41s for tar in the table above). The bigger the tar archive, the bigger this ratio. I haven't tried to do a tar mount diff for the full kernel.

For kernel compilation, the overhead is about 25%. On the other hand, you can compile the kernel without storing 1.3 GB of source on your hard drive.
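
The percentages quoted in this summary can be rechecked from the tables with a few lines of Python. This is only a sketch of the arithmetic: percent_change is a hypothetical helper name, and the inputs are the full-kernel archive sizes and the compilation times listed above.

```python
def percent_change(value, baseline):
    """Relative difference of value against baseline, in percent."""
    return 100 * (value - baseline) / baseline

# Archive sizes in MB, from the "Linux Source Code" table.
arx_mb, tar_mb, squashfs_mb = 170.97, 168.77, 201.43
print(f"arx vs tar:       {percent_change(arx_mb, tar_mb):+.1f}%")       # +1.3%
print(f"arx vs squashfs:  {percent_change(arx_mb, squashfs_mb):+.1f}%")  # -15.1%

# Kernel compilation times in minutes (arx mount versus plain file system).
print(f"compile overhead: {percent_change(40, 32):+.1f}%")               # +25.0%
```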

arx's People

Contributors

mgautierfr

Forkers

jeandelest

arx's Issues

Build from git on arm64 stops at 217/222 while cargo install arx works

Hi,

I ran into the same problem (must be a relative utf-8 path) when trying to archive a chroot (so with broken symlinks).
I tested the provided x86_64 binaries and they solve the problem.

I am trying to build arx on an arm64 (Ubuntu 22.04) system (the final target is an Asustor NAS, but that part is my problem).
With cargo install arx, everything builds and the binaries run, but WITH the ERROR (they are ready to be adapted to the NAS, which is my part).
Because of the error, I tried to compile from git (where the problem is solved), so I cloned the repo
and ran: cargo build --release (I am not a Rust dev).
Everything seems to go ahead without error until it freezes:
Building [=======================> ] 217/222: mount_fuse_arx(bin), auto_mount(bin), zip2arx(bin), tar2arx(...

I also tried a cargo clean to restart from scratch before testing again:

root@AS3302Tv2aphil:/opt/arx# cargo build --release
   Compiling libc v0.2.155
   Compiling cfg-if v1.0.0
   Compiling proc-macro2 v1.0.86
   Compiling version_check v0.9.4
   Compiling unicode-ident v1.0.12
   Compiling pkg-config v0.3.30
   Compiling byteorder v1.5.0
   Compiling memchr v2.7.4
   Compiling jobserver v0.1.32
   Compiling quote v1.0.36
   Compiling once_cell v1.19.0
   Compiling cc v1.1.6
   Compiling syn v2.0.72
   Compiling ahash v0.8.11
   Compiling typenum v1.17.0
   Compiling crossbeam-utils v0.8.20
   Compiling zerocopy v0.7.35
   Compiling generic-array v0.14.7
   Compiling utf8parse v0.2.2
   Compiling anstyle-parse v0.2.5
   Compiling getrandom v0.2.15
   Compiling zstd-sys v2.0.12+zstd.1.5.6
   Compiling anstyle-query v1.1.1
   Compiling colorchoice v1.0.2
   Compiling serde v1.0.204
   Compiling anstyle v1.0.8
   Compiling log v0.4.22
   Compiling adler v1.0.2
   Compiling is_terminal_polyfill v1.70.1
   Compiling allocator-api2 v0.2.18
   Compiling miniz_oxide v0.7.4
   Compiling hashbrown v0.14.5
   Compiling anstream v0.6.15
   Compiling rustix v0.38.34
   Compiling aho-corasick v1.1.3
   Compiling subtle v2.6.1
   Compiling clap_lex v0.7.2
   Compiling heck v0.5.0
   Compiling strsim v0.11.1
   Compiling serde_derive v1.0.204
   Compiling regex-syntax v0.8.4
   Compiling zstd-safe v6.0.6
   Compiling bitflags v2.6.0
   Compiling linux-raw-sys v0.4.14
   Compiling autocfg v1.3.0
   Compiling num-traits v0.2.19
   Compiling regex-automata v0.4.7
   Compiling clap_derive v4.5.11
   Compiling clap_builder v4.5.11
   Compiling crossbeam-epoch v0.9.18
   Compiling powerfmt v0.2.0
   Compiling rayon-core v1.12.1
   Compiling deranged v0.3.11
   Compiling clap v4.5.11
   Compiling crossbeam-deque v0.8.5
   Compiling crypto-common v0.1.6
   Compiling blake3 v1.5.3
   Compiling equivalent v1.0.1
   Compiling indexmap v2.2.6
   Compiling serde_spanned v0.6.7
   Compiling toml_datetime v0.6.7
   Compiling zerocopy-derive v0.6.6
   Compiling crossbeam-channel v0.5.13
   Compiling uuid v1.10.0
   Compiling backtrace v0.3.73
   Compiling fuser v0.13.0
   Compiling crc-catalog v2.4.0
   Compiling portable-atomic v1.7.0
   Compiling iana-time-zone v0.1.60
   Compiling arrayvec v0.7.4
   Compiling constant_time_eq v0.3.0
   Compiling fastrand v2.1.0
   Compiling gimli v0.29.0
   Compiling arrayref v0.3.8
   Compiling either v1.13.0
   Compiling rayon v1.10.0
   Compiling tempfile v3.10.1
   Compiling chrono v0.4.38
   Compiling addr2line v0.22.0
   Compiling crc v3.2.1
   Compiling zerocopy v0.6.6
   Compiling dropout v0.1.0
   Compiling toml_edit v0.22.17
   Compiling bstr v1.9.2
   Compiling block-buffer v0.10.4
   Compiling lru v0.11.1
   Compiling ring v0.17.8
   Compiling bzip2-sys v0.1.11+1.0.8
   Compiling page_size v0.5.0
   Compiling memmap2 v0.8.0
   Compiling object v0.36.2
   Compiling fxhash v0.2.1
   Compiling crc32fast v1.4.2
   Compiling lazy_static v1.5.0
   Compiling static_assertions v1.1.0
   Compiling rustc-demangle v0.1.24
   Compiling same-file v1.0.6
   Compiling spmc v0.3.0
   Compiling smallvec v1.13.2
   Compiling pathdiff v0.2.1
   Compiling unicode-width v0.1.13
   Compiling console v0.15.8
   Compiling walkdir v2.5.0
   Compiling flate2 v1.0.30
   Compiling toml v0.8.16
   Compiling digest v0.10.7
   Compiling epochs v0.2.4
   Compiling os_info v3.8.2
   Compiling cpufeatures v0.2.12
   Compiling number_prefix v0.4.0
   Compiling roff v0.2.2
   Compiling relative-path v1.9.3
   Compiling anyhow v1.0.86
   Compiling tinyvec_macros v0.1.1
   Compiling tinyvec v1.8.0
   Compiling clap_mangen v0.2.23
   Compiling indicatif v0.17.8
   Compiling zstd v0.12.4
   Compiling jubako v0.3.0-dev (https://github.com/jubako/jubako.git#dbf5e898)
   Compiling human-panic v1.2.3
   Compiling clap_complete v4.5.10
   Compiling lzma-sys v0.1.20
   Compiling rustls-pki-types v1.7.0
   Compiling thiserror v1.0.63
   Compiling untrusted v0.9.0
   Compiling zstd-safe v5.0.2+zstd.1.5.2
   Compiling spin v0.9.8
   Compiling libarx v0.2.1 (/opt/arx/libarx)
   Compiling bzip2 v0.4.4
   Compiling unicode-normalization v0.1.23
   Compiling regex v1.10.5
   Compiling thiserror-impl v1.0.63
   Compiling inout v0.1.3
   Compiling is-terminal v0.4.12
   Compiling termcolor v1.4.1
   Compiling percent-encoding v2.3.1
   Compiling rustls v0.23.12
   Compiling unicode-bidi v0.3.15
   Compiling base64ct v1.6.0
   Compiling rand_core v0.6.4
   Compiling humantime v2.1.0
   Compiling password-hash v0.4.2
   Compiling idna v0.5.0
   Compiling form_urlencoded v1.2.1
   Compiling env_logger v0.10.2
   Compiling cipher v0.4.4
   Compiling rustls-webpki v0.102.6
   Compiling sha2 v0.10.8
   Compiling hmac v0.12.1
   Compiling num-conv v0.1.0
   Compiling time-core v0.1.2
   Compiling zeroize v1.8.1
   Compiling pbkdf2 v0.11.0
   Compiling time v0.3.36
   Compiling zstd v0.11.2+zstd.1.5.2
   Compiling url v2.5.2
   Compiling aes v0.8.4
   Compiling bgzip v0.2.2
   Compiling xz2 v0.1.7
   Compiling webpki-roots v0.26.3
   Compiling sha1 v0.10.6
   Compiling xattr v1.3.1
   Compiling filetime v0.2.23
   Compiling constant_time_eq v0.1.5
   Compiling base64 v0.22.1
   Compiling zip v0.6.6
   Compiling tar v0.4.41
   Compiling niffler v2.5.0
   Compiling zip2arx v0.2.1 (/opt/arx/zip2arx)
   Compiling arx v0.2.1 (/opt/arx/arx)
   Compiling ureq v2.10.0
   Compiling tar2arx v0.2.1 (/opt/arx/tar2arx)
    Building [=======================> ] 217/222: mount_fuse_arx(bin), auto_mount(bin), zip2arx(bin), tar2arx(...

Can you help me compile from git, or tell me when the new source will be available through cargo install arx (which worked previously)?

Philippe.

Feature request: delete a file

What most formats don't even care to solve is how to remove files from an archive efficiently without recreating the whole archive again. With concepts like defragmenting unused space etc.

Is this something this project could target, or is this out of scope? 🤔

"must be a relative utf-8 path" when giving a relative path for the directory (with a downloaded binary of arx)

Same issue but with a local path for the directory (with a downloaded binary of arx):

arx create -o irstea_documents.arx -r Documents

Compressed Cluster : ############################################+ 515 / 516
Uncompressed Cluster : --------------------------------------------- 0 / 1
Well, this is embarrassing.

arx had a problem and crashed. To help us diagnose the problem you can send us a crash report.

We have generated a report file at "/tmp/report-ddf531c4-69ad-476d-9f5b-c602edea3b20.toml". Submit an issue or email with the subject of "arx Crash Report" and include the report as an attachment.

We take privacy seriously, and do not perform any automated error collection. In order to improve the software, we rely on people to submit reports.

Thank you kindly!

then:

christo:~$ cat /tmp/report-ddf531c4-69ad-476d-9f5b-c602edea3b20.toml
"name" = "arx"
"operating_system" = "Ubuntu 22.04 (jammy) [64-bit]"
"crate_version" = "0.2.1"
"explanation" = """
Panic occurred in file 'libarx/src/create/fs_adder.rs' at line 88
"""
"cause" = ""/home/christo/Documents/Pôle" must be a relative utf-8 path"
"method" = "Panic"
"backtrace" = """

0: 0x651652a4739c - libarx::create::entry_store_creator::DirEntry::add_entry::hfac430b68bc9369a
1: 0x651652a479d8 - libarx::create::entry_store_creator::DirEntry::add::h6a26588866a85f93
2: 0x651652a47c2d - libarx::create::entry_store_creator::DirEntry::add::h6a26588866a85f93
3: 0x651652a5eee1 - arx::create::create::h0d31a0a1c81133f8
4: 0x651652a036c0 - arx::main::h8e785563c91bb87e
5: 0x651652a74d63 - std::sys_common::backtrace::__rust_begin_short_backtrace::h512139b71798be16
6: 0x651652a0e9d2 - main
7: 0x7345bdc29d90 - __libc_start_call_main
at ./csu/../sysdeps/nptl/libc_start_call_main.h:58
8: 0x7345bdc29e40 - __libc_start_main_impl
at ./csu/../csu/libc-start.c:392
9: 0x6516529f7255 - _start
10: 0x0 - """
christo:~$

Originally posted by @cchristofr in #31 (comment)

must be a relative utf-8 path

hi,
I'm trying arx and get an error when I pass the input as an absolute path :

$ cd /tmp/
$ mkdir test.arx.d/
$ cp ./arx test.arx.d/

$ # OK
$ ./arx -vvv create --outfile ./test.arx ./test.arx.d/
[INFO  arx::create] Creating archive Some("./test.arx")
[INFO  arx::create] With files ["./test.arx.d/"]
[DEBUG arx::create] Saved place is 0
$ rm -f test.arx

$ # KO
$ ./arx -vvv create --outfile ./test.arx /tmp/test.arx.d/
[INFO  arx::create] Creating archive Some("./test.arx")
[INFO  arx::create] With files ["/tmp/test.arx.d/"]
Well, this is embarrassing.
arx had a problem and crashed. To help us diagnose the problem you can send us a crash report.
We have generated a report file at "/tmp/report-a142a59f-fe43-41af-ad9e-c5fc2f1541c6.toml". Submit an issue or email with the subject of "arx Crash Report" and include the report as an attachment.
- Homepage: https://github.com/jubako/arx
- Authors: Matthieu Gautier <[email protected]>
We take privacy seriously, and do not perform any automated error collection. In order to improve the software, we rely on people to submit reports.
Thank you kindly!
$ cat /tmp/report-a142a59f-fe43-41af-ad9e-c5fc2f1541c6.toml
"name" = "arx"
"crate_version" = "0.2.1"
"explanation" = """
Panic occurred in file '/home/runner/work/arx/arx/libarx/src/create/fs_adder.rs' at line 146
"""
"cause" = "\"/tmp/test.arx.d/\" must be a relative utf-8 path."
"method" = "Panic"
"backtrace" = """
   0: 0x642f3c0fa4c5 - arx::create::create::h0d31a0a1c81133f8
   1: 0x642f3c09e6c0 - arx::main::h8e785563c91bb87e
   2: 0x642f3c10fd63 - std::sys_common::backtrace::__rust_begin_short_backtrace::h512139b71798be16
   3: 0x642f3c0a99d2 - main
   4: 0x75fe15335c88 - <unresolved>
   5: 0x75fe15335d4c - __libc_start_main
   6: 0x642f3c092255 - _start
   7:        0x0 - <unresolved>"""

the problem remains if I install arx with cargo.
regards, lacsaP.
