Backup Software

A while ago I went looking for a backup solution for my home zoo. After reviewing a number of alternatives (afbackup, backup-manager, backup2l, backintime, backupninja, boxbackup, cdbackup, cedar-backup2, chiark-backup, dvbackup, dvdbackup, faubackup, flexbackup, jpilot-backup, rdiff-backup, slbackup, storebackup, vbackup) from Wikipedia, I settled on BackupPC. These are the reasons why I chose this software:

  1. Web interface. I know that is not essential for backup software, which could be a pure CLI utility, but on the other hand, if you already have an Apache server running, a web interface is a plus. You can configure BackupPC from the browser, force a backup, view the logs, browse the currently available backed-up versions of a file in a tree viewer, and retrieve the necessary version just by clicking on it.
  2. It supports both Linux and Windows boxes. Linux hosts are backed up via SSH1)+tar or SSH2)+rsync, Windows hosts via Samba+tar or SSH+rsync.
  3. It supports full and incremental backups of different levels.
  4. It performs per-file data compression before storing it in the pool.
  5. The daemon periodically checks host availability and performs a backup when the host comes online (a useful feature for notebooks and other mobile devices).

The main idea of BackupPC is the following: it walks the data tree to be backed up and retrieves only the files that are new or modified since the last backup. The files are fetched to the local filesystem and organized in a pool, so if several clients have the same file, it is stored only once. For home users this feature is not very beneficial, as usually all boxes hold different, unrelated data. This is the behaviour of most backup software (backintime, rdiff-backup, etc.).

Another modern approach is the one provided by CrashPlan, which offers peer-to-peer backup with your friends (who must also have this software installed). The drawbacks of this approach might be:

  • The window when both you and your peer are online may be small, so you risk missing the daily backup, for example. You can have several peers to minimise this risk, but you pay for it with increased outbound traffic (multiplied by the number of peers).
  • If your peer is located far away from you, it might not be easy to get a full backup back (ask your friend to burn and mail you a DVD?).
  • You also “pay” for the traffic, both in- and outbound.

Further reading:

Universal utility to access cloud

Given that Mail.Ru has abandoned their Linux client, and Dropbox has limited the number of clients to 3 and requires the local filesystem to be ext4, there is a need to search for a universal alternative:

Project    | Supports MailRu?                               | Supports DropBox? | Notes
rclone     | :YES: here, but does not work with 2FA enabled | :YES: here        | :MINUS: No two-way synchronization, see https://forum.rclone.org/t/upback-two-way-synchronization-utility-based-on-rclone/4692/5
CloudCross | :HELP: affected by 2FA?                        | :YES:             | :MINUS: Cannot synchronize part of the tree

Utility to optimally distribute files into multiple fixed-sized volumes

Implements First Fit Decreasing (FFD):

Implements First Fit:

  • dirsplit from cdrkit – performs a given number of random shuffles (500 by default) and chooses the best one (source).
    • :ADD: Takes into account that data in an ISO image occupies more space because of additional filesystem structures.

Closed source (hence algorithm is unknown):

    • :DEL: Has problems with Cyrillic when creating an ISO. Cannot create a UDF ISO. So the only way to overcome these is to create an .irp filelist for InfraRecorder.
    • :HELP: When using the two modes other than “In order” (the “Re-order …” ones), the result is different every time (looks like a randomized First Fit?).
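For reference, the First Fit Decreasing heuristic mentioned above can be sketched as follows (a minimal illustration, ignoring the per-file filesystem overhead that dirsplit accounts for):

```python
def first_fit_decreasing(sizes, capacity):
    """Pack item sizes into fixed-capacity volumes: sort largest-first,
    then put each item into the first volume that still has room,
    opening a new volume if none does."""
    volumes = []  # each volume is a list of item sizes
    for size in sorted(sizes, reverse=True):
        for volume in volumes:
            if sum(volume) + size <= capacity:
                volume.append(size)
                break
        else:  # no existing volume had room
            volumes.append([size])
    return volumes
```

Plain First Fit is the same loop without the initial sort; dirsplit's shuffle approach instead tries many random orders and keeps the best packing found.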

My comparison:

  • DVD Span: 16176MB left on last volume
  • dirsplit (-a 5000 iterations): 18084MB left on last volume

See also:

Strategy I use with dirsplit for 50GB3) BD DL disks:
  1. Create filelists:
    dirsplit -s47730M -a50000 -e1 -p /mnt/iso/video_ /mnt/video

    Note that the split size was found empirically.

  2. Verify the resulting ISO sizes. The maximum ISO size must not exceed the size of the target media (otherwise make a correction and rerun the previous step):
    for file in /mnt/iso/video_*.list
    do
        echo -n "[$file]: "
        extents=`genisoimage -no-rr -allow-limited-size -graft-points -q -print-size -path-list $file`
        echo "$(($extents * 2048)) B = $(($extents / 512)) MB = $(($extents / 512 / 1024)) GB = $extents extents"
    done | sort -r -n -k 2,2

    which prints something like this:

    [/mnt/iso/video_2.list]: 50048901120 B = 47730 MB = 46 GB = 24437940 extents
    [/mnt/iso/video_3.list]: 50048890880 B = 47730 MB = 46 GB = 24437935 extents
    [/mnt/iso/video_1.list]: 50048743424 B = 47730 MB = 46 GB = 24437863 extents
    [/mnt/iso/video_4.list]: 50048741376 B = 47730 MB = 46 GB = 24437862 extents
    [/mnt/iso/video_5.list]: 47073021952 B = 44892 MB = 43 GB = 22984874 extents
  3. Generate ISO:
    for file in /mnt/iso/video_*.list
    do
        echo "[$file]"
        genisoimage -no-rr -allow-limited-size -graft-points -path-list $file -V "${file%.*}_`date +%F`" -o ${file%.*}.iso
    done
  4. Burn the ISOs starting from the biggest (the one that comes first in step 2), so that if the largest image burns successfully you can be confident the remaining ones will fit too.

Recovery

BackupPC questions answered

Additional methods for resolving host IP except nmblookup or ping

While setting up my notebook I ran into the following problem: unfortunately, the NetBIOS name of the Windows box was not visible from the Linux box. At first I thought the Samba server had to be configured appropriately, but playing with the wins support = yes and domain master = yes configuration options did not give the desired result. I also checked the registry settings for LanManager (see 1, 2, 3 and 4), thinking that it was running in “hidden” mode. Finally I traced the network traffic and found that broadcast UDP packets do not pass through the WiFi router towards the Windows box, which I found strange. So in this configuration nmblookup does not reliably resolve the Windows host's IP address, and ping does not work either, because of the domain search problem.

I would suggest the following:
  • Try ping host first, then ping host. (with a trailing dot, which bypasses the resolver's search-domain list) and take the result from whichever succeeds
  • Implement one more check, nslookup host.
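The fallback logic above can be sketched in Python (an illustration only, not BackupPC's actual Perl code; socket.gethostbyname stands in for ping/nslookup):

```python
import socket

def resolve_host(host):
    """Try the plain name first, then the name with a trailing dot
    (a fully-qualified form that bypasses the resolver's search-domain
    list), and return the first IPv4 address found."""
    for candidate in (host, host + "."):
        try:
            return socket.gethostbyname(candidate)
        except socket.gaierror:
            continue
    return None  # neither form resolved
```

A real BackupPC hook would go into $Conf{NmbLookupFindHostCmd} territory; this only demonstrates the try-plain-then-dotted order.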

Also look for the latest updates on the mailing list.

How to test the network share is accessible by given user?

Use the following command:

smbclient -U user%pass //host/share$ -c dir

The problem when trying to connect to the share: “Connection to [host] failed (Error NT_STATUS_UNSUCCESSFUL)”

If you are using KAV/KIS, you need to set the network type to “Local network” for your WiFi connection in the Firewall → Settings → Networks tab, as advised here.

When restoring files using web interface the downloaded ZIP archive is always broken

You are using the broken libarchive-zip-perl v1.30. The solution is either to use compression level 0 or to downgrade to v1.18. Also reported in issue #54827.

How to uncompress zlib data?

BackupPC compresses logs using zlib, but the resulting files are not true .gz or .zip files.

The trick is to prepend the first eight bytes of a gzip header (magic number, deflate compression method, flags and a zero mtime); the two-byte zlib header is then consumed as the remaining XFL and OS header fields, and gunzip can decode the raw deflate stream that follows:
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - RestoreLOG.z | gunzip

For example:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - /var/lib/backuppc/pc/centurion/XferLOG.1.z | gunzip | egrep -v '^  (pool|create)' | less

1), 2) optional
3) Maximum ISO size 50048901120 B
software/backup.txt · Last modified: 2015/03/15 22:26 by dmitry