nextpertise / autorsyncbackup
AutoRsyncBackup is a rsync+hardlink backup solution written in Python as a wrapper around rsync.
License: GNU General Public License v3.0
If rsync produces an error, the client will not be backed up and there will be an empty folder, which results in a full backup set on the next run.
This could be avoided by rotating folders only after a successful backup.
If we postpone rotating until all rsync jobs have finished, we can also compress the time window in which clients are affected by the backup.
Lock file, so there will be at most one backup per host at the same time.
Yes, bash can do multiprocessing by using the "&" sign.
For this task we need a global config file for:
Now that we can choose between the SSH and rsync protocols, I'd like to see this reflected in the e-mail report. It could be handy while debugging.
In some error cases I see entries of previous backup attempts. This ticket will get a follow-up once I have a reproducible situation.
Rsync supports dry runs. This is a nice feature to check whether the remote host is correctly configured without performing the actual backup.
This feature should be implemented after post-backup rotating (#3).
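A possible shape for the feature, assuming a hypothetical `build_rsync_command` helper; only `--dry-run` itself is a real rsync flag:

```python
def build_rsync_command(rsyncpath, source, dest, dryrun=False):
    """Assemble an rsync command line, optionally as a dry run."""
    cmd = [rsyncpath, "-a"]      # archive mode as a typical baseline
    if dryrun:
        cmd.append("--dry-run")  # rsync only reports what it would do
    return cmd + [source, dest]
```

Run with `dryrun=True`, rsync exits non-zero on connection or configuration problems without transferring anything, which is exactly the check this issue asks for.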
Add a flag for verbosity (-v) so you don't need to tail a logfile. Maybe we can change the current debug boolean into a loglevel int variable.
Rsync has SSH support, so let's make autorsyncbackup compatible with SSH.
The job class initializes class variables, not instance variables.
Due to a peculiarity in Python this mostly turns out alright: at runtime the scalar class variables are rebound per instance and thus effectively become instance variables.
That won't work for the lists, though. They are .append()-ed in place, so their contents are shared among all instances.
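A self-contained illustration of the pitfall (the `Job` class here is a stand-in, not the project's real one):

```python
class Job:
    hostname = None  # scalar class variable: rebinding hides it per instance
    fileset = []     # mutable class variable: shared by every instance

    def __init__(self, hostname, path):
        self.hostname = hostname    # creates a new instance attribute
        self.fileset.append(path)   # mutates the one shared class-level list

a = Job("host-a", "/etc")
b = Job("host-b", "/var")
# a.fileset and b.fileset are the very same list and contain both paths.
```

The fix is to create the list inside `__init__` (`self.fileset = [path]`) so each instance gets its own.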
When running autorsyncbackup on Debian Buster with Python 3.7 in debug mode, these warnings pop up on screen:
/usr/lib/python3/dist-packages/paramiko/kex_ecdh_nist.py:39: CryptographyDeprecationWarning: encode_point has been deprecated on EllipticCurvePublicNumbers and will be removed in a future version. Please use EllipticCurvePublicKey.public_bytes to obtain both compressed and uncompressed point encoding.
m.add_string(self.Q_C.public_numbers().encode_point())
/usr/lib/python3/dist-packages/paramiko/kex_ecdh_nist.py:96: CryptographyDeprecationWarning: Support for unsafe construction of public numbers from encoded data will be removed in a future version. Please use EllipticCurvePublicKey.from_encoded_point
self.curve, Q_S_bytes
/usr/lib/python3/dist-packages/paramiko/kex_ecdh_nist.py:111: CryptographyDeprecationWarning: encode_point has been deprecated on EllipticCurvePublicNumbers and will be removed in a future version. Please use EllipticCurvePublicKey.public_bytes to obtain both compressed and uncompressed point encoding.
hm.add_string(self.Q_C.public_numbers().encode_point())
/usr/lib/python3/dist-packages/paramiko/ecdsakey.py:164: CryptographyDeprecationWarning: Support for unsafe construction of public numbers from encoded data will be removed in a future version. Please use EllipticCurvePublicKey.from_encoded_point
self.ecdsa_curve.curve_class(), pointinfo
INFO: Successfully connected to host via ssh protocol (hostname.fqdn.com)
Autorsyncbackup was installed using the guide in Readme.md on a vanilla up-to-date Debian Buster install.
Should you need any further information, please do not hesitate to contact me.
@sebastic Does this warning also occur on your installation?
Add a flag to examine the last backup state of the given job file.
Example:
autorsyncbackup -j /etc/autorsyncbackup/example.job -s
-j = job file
-s = show job's last state
If the job failed, the return code should be 1; otherwise it should be 0.
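The flag could be wired up roughly like this; the option names come from the example above, everything else is an assumption:

```python
import argparse

parser = argparse.ArgumentParser(prog="autorsyncbackup")
parser.add_argument("-j", dest="job", help="job file")
parser.add_argument("-s", dest="state", action="store_true",
                    help="show job's last state")

def exit_code(last_run_ok):
    """Map the last backup state to the requested return codes."""
    return 0 if last_run_ok else 1
```

The caller would then look up the job's last run and finish with `sys.exit(exit_code(ok))`.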
The e-mail produced is HTML-only and cannot be read by plain-text mail agents.
Add a non-HTML section to the produced e-mail.
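With the standard library this is a `multipart/alternative` message, plain-text part first; a sketch, not the project's actual mail code:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_report(plain_body, html_body):
    """Build a report readable by both plain-text and HTML mail agents."""
    msg = MIMEMultipart("alternative")
    # Order matters: agents prefer the last part they can render,
    # so the plain-text fallback goes first and the HTML version last.
    msg.attach(MIMEText(plain_body, "plain"))
    msg.attach(MIMEText(html_body, "html"))
    return msg
```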
Set an Error/Warning/Info/Debug prefix in the logger lib.
Currently you can't see in the report e-mail what is actually backed up. It would be nice if we printed the list of paths for each host.
DEBUG: open db [/var/lib/autorsyncbackup//autorsyncbackup.db]
DEBUG: Check for table `jobrunhistory`
DEBUG: Check for table jobcommandhistory
ERROR: Could not insert job details for host (host.example.com) into the database (/var/lib/autorsyncbackup//autorsyncbackup.db)
After using autorsyncbackup for a while, I see backups in the daily and monthly directories but none in the weekly directory.
Needs more investigation.
Python 2 will reach EOL next month.
Distributions are in the process of removing Python 2 because of that.
The next Debian stable release (bullseye) will most likely not include the Python 2 modules required for autorsyncbackup, which will hence need to use their Python 3 variants.
Since the last PR I got the following error:
Traceback (most recent call last):
File "/usr/local/bin/autorsyncbackup", line 141, in <module>
runBackup(options.job, options.dryrun)
File "/usr/local/bin/autorsyncbackup", line 88, in runBackup
jobrunhistory.deleteHistory()
TypeError: unbound method deleteHistory() must be called with jobrunhistory instance as first argument (got nothing instead)
Today we got a failed backup due to a config file typo:
Reading main config from /etc/autorsyncbackup/main.yaml
Writing to logfile /var/log/autorsyncbackup/autorsyncbackup.log
Starting AutoRsyncBackup
Traceback (most recent call last):
File "/usr/local/bin/autorsyncbackup", line 48, in <module>
director.processBackupStatus(job)
File "/usr/local/share/autorsyncbackup/src/director.py", line 236, in processBackupStatus
job.backupstatus['fileset'] = ':'.join(job.fileset)
TypeError: sequence item 0: expected string, int found
We should not only check the YAML file for validity but also type-check the individual variables.
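A sketch of such a type check after YAML parsing. The field names (`fileset`, `dailyrotation`, `hostname`) follow the logs and traceback in this tracker, but the schema itself is an assumption:

```python
SCHEMA = {
    "hostname": str,
    "dailyrotation": int,
    "fileset": list,
}

def check_types(config):
    """Return a list of human-readable type errors for a parsed job config."""
    errors = []
    for key, expected in SCHEMA.items():
        if key in config and not isinstance(config[key], expected):
            errors.append("%s: expected %s, got %s"
                          % (key, expected.__name__, type(config[key]).__name__))
    # The traceback above: every fileset entry must be a string,
    # otherwise ':'.join(job.fileset) blows up much later.
    for item in config.get("fileset", []):
        if not isinstance(item, str):
            errors.append("fileset entry %r is not a string" % (item,))
    return errors
```

Reporting these errors at config-load time would turn the late `TypeError` into a clear per-job message in the e-mail report.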
Before releasing version 2.0 to the world we need to write a manual. It would be nice to publish it to Read the Docs.
To anyone: feel free to start with this!
sshd on Debian 12 disabled the older ssh-rsa signature algorithm from the pubkey handshake.
This makes it impossible to log in using the old paramiko Debian 11 has in its repo (2.7.2).
Debian 12, on the other hand, uses a newer version of paramiko (2.12.0-2) which has the exact opposite effect: it can't log in to older clients (<= Debian 9).
This change seems to be the culprit:
https://www.paramiko.org/changelog.html#2.9.0
The fix for plain ssh is:
Host *
HostKeyAlgorithms=+ssh-rsa
PubkeyAcceptedKeyTypes +ssh-rsa
It seems that offering the newer rsa-sha2 algorithms first makes the older sshd choke; trying ssh-rsa first would solve this.
I tried using "disabled_algorithms", but this only let me connect to either older or newer clients, never both.
Any suggestions that would enable us to connect to both new and legacy clients?
It should be possible to configure both SSH and rsync: the rsync protocol is faster, but the SSH protocol has the ability to run remote hook commands.
Group options for Rsync
current situation:
username
password
share
should be:
rsync_username
rsync_password
rsync_share
Group options for SSH
current situation:
username
sshpublickey
share
should be:
ssh_username
ssh_privatekey
Remove check for share (there is no share when rsyncing via SSH)
INFO: /etc/autorsyncbackup/host.domain.job: No share is set, skipping job.
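A hypothetical job file illustrating the proposed grouping (key names taken from the lists above; values are placeholders):

```yaml
hostname: host.example.com

# rsync protocol
rsync_username: backup
rsync_password: secret
rsync_share: backup

# SSH protocol (no share check needed)
ssh_username: backup
ssh_privatekey: /etc/autorsyncbackup/id_rsa
```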
Hook points (pre- and post-backup scripts)
Autorsyncbackup is incompatible with Debian Bookworm, mainly due to deprecations in newer versions of Python packages like Jinja2:
DeprecationWarning: 'jinja2.Markup' is deprecated and will be removed in Jinja 3.1. Import 'markupsafe.Markup' instead.
Debian Bookworm currently provides 3.1.2:
:~# apt-cache show python3-jinja2
Package: python3-jinja2
Source: jinja2
Version: 3.1.2-1
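A common compatibility shim for the `jinja2.Markup` removal; it assumes markupsafe is available wherever Jinja2 is, which holds since markupsafe is a Jinja2 dependency:

```python
try:
    # Jinja2 >= 3.1 removed jinja2.Markup; markupsafe provides it.
    from markupsafe import Markup
except ImportError:
    # Older Jinja2 versions still export it directly.
    from jinja2 import Markup

# Behaviour is identical either way:
safe = Markup.escape("<b>bold</b>")
```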
In the current implementation we report based on the last SQL record in the database. We don't check whether this is actually the record corresponding to the last backup run.
We need to validate the data by adding an additional database field to jobrunhistory.
root@backup2:/etc/autorsyncbackup# autorsyncbackup -d -j s02.server.com.job -v
Starting AutoRsyncBackup
Reading main config from /etc/autorsyncbackup/main.yaml
DEBUG: /etc/autorsyncbackup/main.yaml: No rsyncpath is set, using default value: /usr/bin/rsync
DEBUG: /etc/autorsyncbackup/main.yaml: No jobconfigdirectory is set, using default value: /etc/autorsyncbackup/
DEBUG: /etc/autorsyncbackup/main.yaml: No jobspooldirectory is set, using default value: /var/spool/autorsyncbackup/
DEBUG: /etc/autorsyncbackup/main.yaml: No backupdir is set, using default value: /var/data/backups/autorsyncbackup/
DEBUG: /etc/autorsyncbackup/main.yaml: No logfile is set, using default value: /var/log/autorsyncbackup/autorsyncbackup.log
DEBUG: Writing to logfile /var/log/autorsyncbackup/autorsyncbackup.log
DEBUG: /etc/autorsyncbackup/main.yaml: No speedlimitkb is set, using default value: 0
DEBUG: /etc/autorsyncbackup/main.yaml: No dailyrotation is set, using default value: 8
DEBUG: /etc/autorsyncbackup/main.yaml: No weeklyrotation is set, using default value: 5
DEBUG: /etc/autorsyncbackup/main.yaml: No monthlyrotation is set, using default value: 13
DEBUG: /etc/autorsyncbackup/main.yaml: No smtphost is set, using default value: localhost
DEBUG: s02.server.job: No enabled tag is set, using default value: True
DEBUG: s02.server.com.job: No SSH jobconfig variable set.
DEBUG: s02.server.com.job: No weeklyrotation is set, using default
DEBUG: s02.server.com.job: No monthlyrotation is set, using default
DEBUG: s02.server.job: No weeklybackup is set, using default
DEBUG: s02.server.com.job: No monthlybackup is set, using default
Error while connecting to host (5) - @ERROR: auth failed on module share
rsync error: error starting client-server protocol (code 5) at main.c(1534) [Receiver=3.0.9]
Add the hostname to the message: "Error while connecting to host (hostname) (error code) (error)".
The SQLite db file currently grows with every backup job run. We need to add a "workingDirectory" field to the backup record, set to "daily", "weekly" or "monthly". Then we can calculate which backups are out of date and can be cleaned up.
When a hook script fails and continueonerror = False,
the result is that the backup may not run (in case the hook script is executed before the backup starts).
In that case there is no backup status to parse, yet the code tries to match the regexes, which fail and raise an exception.
Currently we need to read the whole e-mail to see which backups have failed. It would be nice if we could see this in the top overview.
As an extra, it would be nice if the hostname were clickable, as an anchor to the log below.
autorsyncbackup uses python-paramiko for SSH connectivity, which needs to be updated or replaced to work with the ssh daemons on a jessie system.
In several places in the code base there is an exit statement, with the result that the program stops even if there are more clients to back up.
Keep in mind that this fix needs to be compatible with the SSH issue.
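The refactor could follow this shape: raise instead of exiting, and catch per client so the remaining backups still run (all names here are illustrative):

```python
def run_all(jobs, backup_one):
    """Run every job; collect failures instead of aborting on the first."""
    failures = []
    for job in jobs:
        try:
            backup_one(job)              # was: code paths calling exit()
        except Exception as exc:
            failures.append((job, exc))  # report later, e.g. in the e-mail
    return failures
```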
Add a Python version file and a --version CLI option.
In the current master version we experience that it takes a long time to unlink the latest backup. I expect that we can speed this process up by using the expired backup as the rsync destination.
For example, host1.domain.tld
has a daily retention of 4 days; currently an empty directory called "current" is created, and after rsyncing the oldest backup is purged. See the table below:
Directory | New directory |
---|---|
current | 2015-11-26_18-06-42_backup.0 |
2015-11-25_18-06-42_backup.0 | 2015-11-25_18-06-42_backup.1 |
2015-11-24_20-01-23_backup.1 | 2015-11-24_20-01-23_backup.2 |
2015-11-23_18-51-07_backup.2 | 2015-11-23_18-51-07_backup.3 |
2015-11-24_09-08-00_backup.3 | 2015-11-24_09-08-00_backup.4 |
2015-11-25_08-15-56_backup.4 | Purged |
A better approach would be to rsync to the directory of the oldest backup; then we don't need to delete the oldest directory, and the backup might be faster as well (depending on the amount of changes). See the table below:
Directory/Action | New directory |
---|---|
2015-11-25_08-15-56_backup.4 | current |
* Do backup * | |
current | 2015-11-26_18-06-42_backup.0 |
2015-11-25_18-06-42_backup.0 | 2015-11-25_18-06-42_backup.1 |
2015-11-24_20-01-23_backup.1 | 2015-11-24_20-01-23_backup.2 |
2015-11-23_18-51-07_backup.2 | 2015-11-23_18-51-07_backup.3 |
2015-11-24_09-08-00_backup.3 | 2015-11-24_09-08-00_backup.4 |
Bash can't read a global variable set in a subshell, so we need to refactor the subshell logic.
Also see: http://stackoverflow.com/questions/15541321/set-a-variable-from-a-subshell
Problems detected in the following functions:
bkdir=`checkBackupEnvironment`
folder=`createFolderCurrent "${bkdir}"`
hardlink=`getHardlinkOption "${bkdir}"`
DEBUG: insert into jobcommandhistory (jobrunid, local, before, returncode, continueonerror, script, stdout, stderr) values (19344, 0, 1, 1, 0, 'sudo /usr/local/bin/automysqlbackup', '[]', '[error(110, 'Connection timed out')]')
This query failed because there is a ' in the query string.
The result is that the backup mail returns the record of the previous run.
What we need to do:
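Use parameterized queries so sqlite3 quotes the values itself; a sketch with the column names from the DEBUG line above:

```python
import sqlite3

def insert_command_history(conn, row):
    # Placeholders (?) let sqlite3 handle quoting, so embedded single
    # quotes in stdout/stderr can no longer break the statement.
    conn.execute(
        "insert into jobcommandhistory "
        "(jobrunid, local, before, returncode, continueonerror, "
        " script, stdout, stderr) values (?, ?, ?, ?, ?, ?, ?, ?)",
        row,
    )

conn = sqlite3.connect(":memory:")
conn.execute("create table jobcommandhistory (jobrunid, local, before, "
             "returncode, continueonerror, script, stdout, stderr)")
# The single quote in the stderr value is now harmless:
insert_command_history(conn, (19344, 0, 1, 1, 0,
                              "sudo /usr/local/bin/automysqlbackup",
                              "[]", "[error(110, 'Connection timed out')]"))
```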
ERROR: Error while connecting to host (host.domain.tld) - Authentication failed.
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
self.run()
File "/usr/src/autorsyncbackup/src/lib/jobthread.py", line 15, in run
self.executeJob(self.q)
File "/usr/src/autorsyncbackup/src/lib/jobthread.py", line 29, in executeJob
self.director.executeRsync(job, latest)
File "/usr/src/autorsyncbackup/src/lib/director.py", line 47, in executeRsync
self.executeJobs(job, job.afterRemoteHooks)
File "/usr/src/autorsyncbackup/src/lib/director.py", line 39, in executeJobs
raise CommandException('Hook %s failed to execute' % c['script'])
CommandException: Hook /bin/ls -l failed to execute