oll3 / bita
Differential file synchronization over HTTP
Home Page: https://crates.io/crates/bita
License: MIT License
Can be used in integration tests and by the paranoid user to verify that a clone was executed successfully: compare the blake2 checksum stored in the archive with the checksum of the output file after cloning.
Allow using Blake3 for hashing chunks and source, for a potential compress/clone speedup.
This can be useful when we need to provide authorization/authentication tokens to gain access to a server.
I want to move the bita generation job from my Jenkins to an ephemeral environment, namely GitHub Actions. To this end, it would be very convenient to have binaries I can download, instead of compiling bita from scratch on every run.
Trying to update to v0.9.0 right now, and I've run into a slight problem: whereas previously I could create and own a Chunker, the new ChunkerConfig::new_chunker only returns a Box<dyn Chunker> - which, no matter how hard I try, I cannot make Send.
Code is taken pretty much straight from the in-place-cloner example:
async fn bitar_update_archive(
    instance_name: &str,
    target_path: &PathBuf,
    url: String,
) -> Result<()> {
    // Open archive which source we want to clone
    let reader = bitar::ReaderRemote::from_url(url.parse()?);
    let mut source_archive = bitar::Archive::try_init(reader).await?;
    // Open our target file
    let mut target = OpenOptions::new()
        .read(true)
        .create(true)
        .write(true)
        .open(&target_path)
        .await?;
    send_progress_message(instance_name, "Scanning local chunks".into());
    // Scan the target file for chunks and build a chunk index
    let mut output_index = bitar::ChunkIndex::new_empty();
    {
        let chunker = source_archive.chunker_config().new_chunker(&mut target);
        let mut chunk_stream = chunker.map_ok(|(offset, chunk)| (offset, chunk.verify()));
        while let Some(r) = chunk_stream.next().await {
            let (offset, verified) = r?;
            let (hash, chunk) = verified.into_parts();
            output_index.add_chunk(hash, chunk.len(), &[offset]);
        }
    }
    ...
Trying to run that method from within iced's runtime results in an error like this (perform_update indirectly calls the method above):
error: future cannot be sent between threads safely
--> src\instance.rs:223:44
|
223 | iced::Command::perform(perform_update(self.clone()), Message::Dummy),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ future returned by `perform_update` is not `Send`
|
= help: the trait `std::marker::Send` is not implemented for `dyn Chunker`
note: future is not `Send` as this value is used across an await
--> src\update.rs:143:29
|
141 | let chunker = source_archive.chunker_config().new_chunker(&mut target);
| ------- has type `Box<dyn Chunker>` which is not `Send`
142 | let mut chunk_stream = chunker.map_ok(|(offset, chunk)| (offset, chunk.verify()));
143 | while let Some(r) = chunk_stream.next().await {
| ^^^^^^^^^^^^^^^^^^^^^^^^^ await occurs here, with `chunker` maybe used later
...
155 | }
| - `chunker` is later dropped here
note: required by a bound in `iced::Command::<T>::perform`
--> C:\Users\MCO\.cargo\registry\src\github.com-1ecc6299db9ec823\iced_futures-0.3.0\src\command.rs:25:53
|
25 | future: impl Future<Output = T> + 'static + Send,
| ^^^^ required by this bound in `iced::Command::<T>::perform`
I could probably monkeypatch that by moving it into its own thread (yuck), but I'd rather ask if you can think of a fix on your end.
A cache which describes which chunks exist and where they live could be saved and associated with the target while cloning. This cache could then be used on the next clone operation to avoid having to scan the previous target (now possibly used as a seed) for chunks.
I am new to Rust and am interested in using this crate for a project. I was able to build the source code in VS Code but am unable to run it; I get an "unknown error" when I try. I suspect I am not using it right. Would it be possible to have a small getting-started section for newbies like me, covering bita in Visual Studio Code? I am also running this on Windows. The real goal is to let bita manage installer packages by only delivering the incremental changes from the previous version.
Allow cloning using the output file as seed, updating the output file in-place.
should be called ChunkMonkey though
Let the user decide whether (and how many times) we should retry a failing request to the remote before bailing. Should also allow setting a delay between retries.
It looks like what you need from a remote filesystem is efficient read-only range requests, so it should be possible to support SMB2 as the remote filesystem.
So that a clone request can be aborted after some given time when the remote doesn't answer.
I'm very interested in using bita for a software update project I'm working on, but it would be extremely useful if it were possible to be able to verify the release image with a cryptographic signature (probably PGP). My understanding is that simply signing the archive file wouldn't be very useful because you'd need to download the entire archive in order to verify the signature, defeating the point of the incremental downloads.
However, because the dictionary contains cryptographic hashes of all the chunks, I believe it would be sufficient to simply provide a signature that authenticates the dictionary, which would then validate the integrity of all the associated chunks. I would propose using the Sequoia library to generate an <archive filename>.sig file at compression time that contains a detached signature of the dictionary; bita/bitar would then download and use that signature, if present, when fetching the archive in order to authenticate the dictionary.
Does that sound like a reasonable approach?
Since v0.11 (4fea65b), prost-build requires protoc to be installed while compiling. This means any downstream application or library of bitar also needs it in order to compile.
Seeing how "the new recommendation for libraries is to commit their generated protobuf" (tokio-rs/prost#657), it's probably best for bitar to ship the generated code, rather than generating it at compile-time.
I was able to create an archive that I cannot clone on bita 0.9.0. Steps:
bita compress --fixed-size 64 --compression none -i agreement.hlp agreement.cba
bita clone agreement.cba agreement_new.hlp
I started with no compression and a fixed-size chunking strategy to try to keep it simple, but based on some of the code I suspect there is a different set of configurations that are better supported than fixed-size/no compression. I intend to look further into this, but wanted to get the issue filed.
Add a CLI option for compressing with a fixed chunk size.
thread '' panicked at /home/lynxi/.cargo/registry/src/mirrors.ustc.edu.cn-61ef6e0cd06fb9b8/bitar-0.11.0/src/archive.rs:286:46:
index out of bounds: the len is 5898 but the index is 5898
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 3493789840
When testing bita for applying changes to an ext4 formatted partition, I've come across the following error:
Error: Size of output device (1430 MiB (1499661312 bytes)) differ from size of archive target file (142 MiB (149934080 bytes))
root@uh-qemu-x86-64:~# bita clone --seed-output updatehub-image.ext4.bita /dev/vda3
Archive:
Built with version: 0.7.1
Archive size: 29 MiB (30866271 bytes)
Header checksum: 1635e9a121ca79f2cc1fb5e1bf23da9c8a64a5d95e146ba6299c900200d74f0cede30d88f6589abd34d9d07aef580a3d1041b3e722fc3ebc69a12421e03d0511
Chunking algorithm: RollSum
Rolling hash window size: 64 bytes
Chunk minimum size: 16 KiB (16384 bytes)
Chunk maximum size: 16 MiB (16777216 bytes)
Chunk average target size: 64 KiB (65536 bytes) (mask: 0b111111111111111)
Source:
Source checksum: cf895d5a02488c426a6a60dfda220f201a19bc1ba0c7eb4a5c8113319b21e81b2f0bc45923b5d3d21ebc288c03ed134fea5b76a2e42f9c5dc995177bef6e68b0
Chunks in source: 1463 (unique: 1455)
Average chunk size: 100 KiB (102909 bytes)
Source size: 142 MiB (149934080 bytes)
Cloning archive updatehub-image.ext4.bita to /dev/vda3...
Error: Size of output device (1430 MiB (1499661312 bytes)) differ from size of archive target file (142 MiB (149934080 bytes))
Which was caused by the following piece of code:
Lines 136 to 145 in e6d3a9e
I believe that error should only be raised if the partition is smaller than the archive; a bigger partition should not be an issue.
Commenting out that block of code in order to test the clone as a whole did work for applying the desired changes, but indexing the partition took more time than I would expect. I think the whole output device is being processed in order to perform the clone. If that's the case, wouldn't it be possible (for cases like this) to index only the chunks up to the archive's total source size?
Hello,
Bita almost perfectly fits my needs, except it seems to be exclusively a CLI tool. Is there any chance it can be used as a library from another Rust project?