plakarlabs / plakar

plakar is a backup solution

License: ISC License

Go 100.00%

plakar's Issues

snapshots merging

It would be nice to have two ways of merging snapshots:

1- a union merge: take snapshot A and snapshot B, then create snapshot C as A | B (taking the newest file on collision)

2- a sequence merge: take snapshots A, B and C, then create a snapshot D which represents the latest state after seeing A, B and C in sequence

For 2, if you always snapshot the same directory, then the last snapshot is already a sequence merge. But if, say, A is /etc, B is /etc where I removed a file, and C is just /etc/nginx with a change, then D should be A without the files removed in B and with the change in C.

`plakar rm` is broken

Describe the bug
There is a regression with plakar rm causing it to fault when destroying an existing snapshot:

% plakar rm d79b854f-8528-41a1-91a5-a3500321919e
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x1027f7d3c]

goroutine 1 [running]:
github.com/poolpOrg/plakar/storage.(*Snapshot).Purge(0x140001700e0, 0x1400022e079, 0x24)
        /Users/gilles/Wip/github.com/poolpOrg/plakar/storage/snapshot.go:433 +0x3c
main.cmd_rm(0x102a54070, 0x140001537a0, 0x140000121d0, 0x1, 0x1)
        /Users/gilles/Wip/github.com/poolpOrg/plakar/cmd/plakar/cmd_rm.go:55 +0x250
main.main()
        /Users/gilles/Wip/github.com/poolpOrg/plakar/cmd/plakar/plakar.go:265 +0xa24

To Reproduce
Steps to reproduce the behavior:

1- create a snapshot
2- delete the snapshot

Expected behavior
The command should simply return and the snapshot should no longer be available.

Desktop (please complete the following information):

OS: any

Additional context
This was introduced with the rework into packages.

snapshot instance should rebuild a filesystem view

many commands allow passing a path along with a snapshot id (e.g. plakar ls deadbeef:/usr/bin).

snapshot indexes contain various mappings which allow restoring all pathnames but which are hard to work with when trying to figure out the hierarchy itself.

snapshot instance should rebuild a tree structure from these mappings or, if not too costly, this hierarchy should be built within the index so that it can be reloaded more easily. It should be possible to implement a function that lists what files and directories are within a specific path without having to compare strings.
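
A rough Go sketch of what such a tree could look like, assuming the index can hand over a flat list of file pathnames. `Node`, `BuildTree` and `List` are hypothetical names, not plakar's API. Listing a directory walks the tree one path component at a time instead of prefix-matching every pathname in the snapshot.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Node is one entry of the rebuilt hierarchy (hypothetical type).
type Node struct {
	IsDir    bool
	Children map[string]*Node
}

// split breaks a pathname into its non-empty components.
func split(path string) []string {
	var parts []string
	for _, name := range strings.Split(path, "/") {
		if name != "" {
			parts = append(parts, name)
		}
	}
	return parts
}

// BuildTree rebuilds a hierarchy from a flat list of file pathnames.
// Any node that has something below it is marked as a directory.
func BuildTree(pathnames []string) *Node {
	root := &Node{IsDir: true, Children: map[string]*Node{}}
	for _, p := range pathnames {
		cur := root
		for _, name := range split(p) {
			cur.IsDir = true
			child, ok := cur.Children[name]
			if !ok {
				child = &Node{Children: map[string]*Node{}}
				cur.Children[name] = child
			}
			cur = child
		}
	}
	return root
}

// List returns the sorted entries directly inside path. The second
// result is false if path does not exist or is not a directory.
func (n *Node) List(path string) ([]string, bool) {
	cur := n
	for _, name := range split(path) {
		child, ok := cur.Children[name]
		if !ok {
			return nil, false
		}
		cur = child
	}
	if !cur.IsDir {
		return nil, false
	}
	names := make([]string, 0, len(cur.Children))
	for name := range cur.Children {
		names = append(names, name)
	}
	sort.Strings(names)
	return names, true
}

func main() {
	tree := BuildTree([]string{
		"/usr/bin/ls", "/usr/bin/cat", "/usr/lib/libc.so", "/etc/hosts",
	})
	names, _ := tree.List("/usr/bin")
	fmt.Println(names) // [cat ls]
}
```

Whether this tree is rebuilt on load or stored in the index is the open question above; the sketch only covers the rebuild-on-load variant.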

Allow providing an option to limit disk I/O

Similarly to #7, plakar will try to read/write as fast as it can to the storage layer, letting disk performance throttle operations.

There should be an option to limit this because it puts a lot of pressure on disks and may impact other services running on the machine, but also because if the storage layer uses a remote server and is not throttled by disk, it may exhaust the bandwidth.

I have not given this much thought, but maybe something along the lines of $ plakar -limit-read 100Mb/s -limit-write 200Mb/s ... would be a possible solution.

Index should provide reverse lookups

It should be possible to perform reverse lookups on various fields.

For instance, it is currently possible to figure out which chunks are part of an object, but not which objects share the same chunk without scanning all objects.

This is not necessary for snapshotting and recovering but it makes some commands harder to implement and the cost of having these reverse indexes is relatively low.
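
As an illustration of the object/chunk case, here is a small Go sketch where the reverse mapping is maintained at insertion time. `Index`, `AddObject`, `ChunksOf` and `ObjectsSharing` are hypothetical names and do not reflect plakar's actual index layout.

```go
package main

import (
	"fmt"
	"sort"
)

// Index keeps the forward mapping (object -> chunks) together with
// the reverse mapping (chunk -> objects). Hypothetical type.
type Index struct {
	objectChunks map[string][]string
	chunkObjects map[string]map[string]struct{}
}

// NewIndex returns an empty index.
func NewIndex() *Index {
	return &Index{
		objectChunks: map[string][]string{},
		chunkObjects: map[string]map[string]struct{}{},
	}
}

// AddObject records an object and updates the reverse mapping for
// each of its chunks.
func (ix *Index) AddObject(object string, chunks []string) {
	ix.objectChunks[object] = chunks
	for _, c := range chunks {
		if ix.chunkObjects[c] == nil {
			ix.chunkObjects[c] = map[string]struct{}{}
		}
		ix.chunkObjects[c][object] = struct{}{}
	}
}

// ChunksOf is the forward lookup that already exists conceptually.
func (ix *Index) ChunksOf(object string) []string {
	return ix.objectChunks[object]
}

// ObjectsSharing is the reverse lookup: every object that contains
// the given chunk, without scanning all objects.
func (ix *Index) ObjectsSharing(chunk string) []string {
	objects := make([]string, 0, len(ix.chunkObjects[chunk]))
	for o := range ix.chunkObjects[chunk] {
		objects = append(objects, o)
	}
	sort.Strings(objects)
	return objects
}

func main() {
	ix := NewIndex()
	ix.AddObject("obj1", []string{"c1", "c2"})
	ix.AddObject("obj2", []string{"c2", "c3"})
	fmt.Println(ix.ObjectsSharing("c2")) // [obj1 obj2]
}
```

The extra cost is one map entry per (chunk, object) pair, which matches the expectation above that reverse indexes are relatively cheap.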

directory watcher: suggested by proullon on Discord

Is your feature request related to a problem? Please describe.
nope.

Describe the solution you'd like
a watcher monitoring a directory and triggering a push on change.

Describe alternatives you've considered
none.

Additional context
none.

replace `from/to` with `on` in the CLI

Is your feature request related to a problem? Please describe.

Depending on the direction of the command, the CLI expects either the from or the to parameter when working with a non-default plakar. This is confusing because the user needs to know the direction of each command, and because the CLI code has to handle special cases.

Describe the solution you'd like

Replace both with on, which works in all cases.

allow providing an option to throttle CPU

currently plakar does a lot of parallelisation to maximise resource utilisation and speed up operations.

on my machine it'll happily consume all cores, but it may be desirable to provide a knob that will artificially slow it down and leave resources available to others.

some ideas:

$ plakar -cpu 4 to limit usage to 4 cores
$ plakar -concurrency 10000 to limit concurrency to 10000 goroutines

Replace Rabin CDC with FastCDC

The chunker used in plakar is Restic's Rabin CDC, which was fine to bootstrap the project but is, in theory, much slower than FastCDC.

Switch to a FastCDC implementation if it exists, or write one.

https://www.usenix.org/conference/atc16/technical-sessions/presentation/xia
