oll3 / bita
Differential file synchronization over HTTP
Home Page: https://crates.io/crates/bita
License: MIT License
Can be used in integration tests and by the paranoid user to verify that a clone was executed successfully: compare the blake2 checksum stored in the archive with the checksum of the output file after cloning.
Allow using Blake3 for hashing chunks and source, for a potential compress/clone speedup.
This can be useful when we need to provide authorization/authentication tokens to gain access to a server.
I want to move the bita generation job from my Jenkins to an ephemeral environment, namely GitHub Actions. To this end, it would be very convenient to have binaries I can download, instead of compiling bita from scratch on every run.
Trying to update to v0.9.0 right now, and I've run into a slight problem: whereas previously I could create and own a Chunker, the new ChunkerConfig::new_chunker only returns a Box<dyn Chunker> - which, no matter how hard I try, I cannot make Send.
Code is taken pretty much straight from the in-place-cloner example:
async fn bitar_update_archive(
    instance_name: &str,
    target_path: &PathBuf,
    url: String,
) -> Result<()> {
    // Open archive which source we want to clone
    let reader = bitar::ReaderRemote::from_url(url.parse()?);
    let mut source_archive = bitar::Archive::try_init(reader).await?;
    // Open our target file
    let mut target = OpenOptions::new()
        .read(true)
        .create(true)
        .write(true)
        .open(&target_path)
        .await?;
    send_progress_message(instance_name, "Scanning local chunks".into());
    // Scan the target file for chunks and build a chunk index
    let mut output_index = bitar::ChunkIndex::new_empty();
    {
        let chunker = source_archive.chunker_config().new_chunker(&mut target);
        let mut chunk_stream = chunker.map_ok(|(offset, chunk)| (offset, chunk.verify()));
        while let Some(r) = chunk_stream.next().await {
            let (offset, verified) = r?;
            let (hash, chunk) = verified.into_parts();
            output_index.add_chunk(hash, chunk.len(), &[offset]);
        }
    }
    ...
Trying to run that method from within iced's runtime results in an error like this (perform_update indirectly calls the method above):
error: future cannot be sent between threads safely
--> src\instance.rs:223:44
|
223 | iced::Command::perform(perform_update(self.clone()), Message::Dummy),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ future returned by `perform_update` is not `Send`
|
= help: the trait `std::marker::Send` is not implemented for `dyn Chunker`
note: future is not `Send` as this value is used across an await
--> src\update.rs:143:29
|
141 | let chunker = source_archive.chunker_config().new_chunker(&mut target);
| ------- has type `Box<dyn Chunker>` which is not `Send`
142 | let mut chunk_stream = chunker.map_ok(|(offset, chunk)| (offset, chunk.verify()));
143 | while let Some(r) = chunk_stream.next().await {
| ^^^^^^^^^^^^^^^^^^^^^^^^^ await occurs here, with `chunker` maybe used later
...
155 | }
| - `chunker` is later dropped here
note: required by a bound in `iced::Command::<T>::perform`
--> C:\Users\MCO\.cargo\registry\src\github.com-1ecc6299db9ec823\iced_futures-0.3.0\src\command.rs:25:53
|
25 | future: impl Future<Output = T> + 'static + Send,
| ^^^^ required by this bound in `iced::Command::<T>::perform`
I could probably monkeypatch that by moving it into its own thread (yuck), but I'd rather ask if you can think of a fix on your end.
A cache which describes which chunks exist and where they live could be saved and associated with the target while cloning. This cache could then be used on the next clone operation to avoid having to scan the previous target (now possibly used as a seed) for chunks.
I am new to Rust and am interested in using this crate for a project. I was able to build the source code in VS Code but am unable to run it; I get an "unknown error" when I try. I suspect I am not using it right. Would it be possible to have a small getting-started section for newbies like me, covering bita in Visual Studio Code? I am also running this on Windows. The real goal is to let bita manage installer packages by only delivering the incremental changes from the previous version.
Allow cloning using the output file as seed, updating the output file in-place.
should be called ChunkMonkey though
Let the user decide whether (and how many times) we should retry a failing request to the remote before bailing. Should also allow setting a delay between retries.
It looks like what you need from a remote filesystem is efficient read-only range requests, so it should be possible to support SMB2 as the remote filesystem.
So that a clone request can be aborted after some given time when the remote doesn't answer.
I'm very interested in using bita for a software update project I'm working on, but it would be extremely useful if it were possible to be able to verify the release image with a cryptographic signature (probably PGP). My understanding is that simply signing the archive file wouldn't be very useful because you'd need to download the entire archive in order to verify the signature, defeating the point of the incremental downloads.
However, because the dictionary contains cryptographic hashes of all the chunks, I believe it would be sufficient to simply provide a signature that authenticates the dictionary, which would then validate the integrity of all the associated chunks. I would propose using the Sequoia library to generate an <archive filename>.sig file at compression time that contains a detached signature of the dictionary; bita/bitar would then download and use that signature, if present, when fetching the archive in order to authenticate the dictionary.
Does that sound like a reasonable approach?
Since v0.11 (4fea65b), prost-build requires protoc to be installed while compiling. This means any downstream application or library of bitar also needs it in order to compile.
Seeing how "the new recommendation for libraries is to commit their generated protobuf" (tokio-rs/prost#657), it's probably best for bitar to ship the generated code, rather than generating it at compile-time.
I was able to create an archive that I cannot clone on bita 0.9.0. Steps:
bita compress --fixed-size 64 --compression none -i agreement.hlp agreement.cba
bita clone agreement.cba agreement_new.hlp
I started with no compression and a fixed-size chunking strategy to try to keep it simple, but based on some of the code I suspect there is a different set of configurations that are better supported than fixed-size/no compression. I intend to look further into this, but wanted to get the issue filed.
Add a CLI option for compressing with a fixed chunk size.
thread '' panicked at /home/lynxi/.cargo/registry/src/mirrors.ustc.edu.cn-61ef6e0cd06fb9b8/bitar-0.11.0/src/archive.rs:286:46:
index out of bounds: the len is 5898 but the index is 5898
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 3493789840
When testing bita for applying changes to an ext4 formatted partition, I've come across the following error:
Error: Size of output device (1430 MiB (1499661312 bytes)) differ from size of archive target file (142 MiB (149934080 bytes))
root@uh-qemu-x86-64:~# bita clone --seed-output updatehub-image.ext4.bita /dev/vda3
Archive:
Built with version: 0.7.1
Archive size: 29 MiB (30866271 bytes)
Header checksum: 1635e9a121ca79f2cc1fb5e1bf23da9c8a64a5d95e146ba6299c900200d74f0cede30d88f6589abd34d9d07aef580a3d1041b3e722fc3ebc69a12421e03d0511
Chunking algorithm: RollSum
Rolling hash window size: 64 bytes
Chunk minimum size: 16 KiB (16384 bytes)
Chunk maximum size: 16 MiB (16777216 bytes)
Chunk average target size: 64 KiB (65536 bytes) (mask: 0b111111111111111)
Source:
Source checksum: cf895d5a02488c426a6a60dfda220f201a19bc1ba0c7eb4a5c8113319b21e81b2f0bc45923b5d3d21ebc288c03ed134fea5b76a2e42f9c5dc995177bef6e68b0
Chunks in source: 1463 (unique: 1455)
Average chunk size: 100 KiB (102909 bytes)
Source size: 142 MiB (149934080 bytes)
Cloning archive updatehub-image.ext4.bita to /dev/vda3...
Error: Size of output device (1430 MiB (1499661312 bytes)) differ from size of archive target file (142 MiB (149934080 bytes))
Which was caused by the following piece of code:
Lines 136 to 145 in e6d3a9e
I believe that error should only be raised if the partition is smaller than the archive; a bigger partition should not be an issue.
Commenting out that block of code in order to test the clone as a whole did work for applying the desired changes, but indexing the partition took more time than I would expect. I think the whole output device is being processed in order to perform the clone. If that's the case, wouldn't it be possible (for cases like this) to index only the chunks up to the archive's total source size?
Hello,
Bita almost perfectly fits my needs, except it seems to be exclusively a CLI tool. Is there any chance it can be used as a library from another Rust project?