Friday, October 21, 2011

Using TAR/LZO compression cross platform for backups, a practical guide

Windows Platform


Using LZOP and 7Z from the command line to create a UNIX/Linux compatible
tar archive with LZO compression.

to compress:

first, find the list of all your shadow copy volumes:


vssadmin list shadows


you should find an entry for the drive you are looking for:

Contents of shadow copy set ID: {b8ae8ecc-f536-439a-ac9f-de1c508786c6}
  Contained 1 shadow copies at creation time: 10/20/2011 8:49:49 AM
     Shadow Copy ID: {3c244889-abf3-4a82-b0be-d999cf78a13f}
        Original Volume: (D:)\\?\Volume{4afdb26c-c432-11e0-8885-806e6f6e6963}\
        Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy4
        Originating Machine: XXXXXXXX.XXX.XXXXXX
        Service Machine: XXXXXXXX.XXX.XXXXXX
        Provider: 'Microsoft Software Shadow Copy provider 1.0'
        Type: ClientAccessible
        Attributes: Persistent, Client-accessible, No auto release, No writers, Differential
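
if the drive you want doesn’t have a shadow copy yet, you can create one first (vssadmin create shadow is only available on the Server editions of Windows):

vssadmin create shadow /for=D:

then re-run “vssadmin list shadows” to pick up the new Shadow Copy Volume path.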


Now create a soft link to that volume:

mklink /D E:\backups\ddrive\ \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy4\

you have to add the trailing \ on the end of the shadow copy volume path or it won’t work right.

Now use a combination of 7zip and lzop to create a unix-linux compatible tar archive with lzo compression:

7z a -mx=0 -ttar -so ddrive.tar E:\backups\ddrive | lzop -1vfo N:\hyperv\ddrive.tar.lzo

If you don’t put “ddrive.tar” in the 7z command it won’t work right, even though you
are writing to standard output; 7z still requires an archive name argument even when
the archive actually goes to stdout via -so.

I haven’t seen any particular use in specifying any compression level other than -1 with lzop, and
with the size of the files we are using we want the compression to go as quickly as possible or
it will take days to complete a full backup. (lzma and lzma2 offer much better compression ratios
if you have unlimited time to create the archive file.)
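
once the archive is written you can remove the soft link without touching the data behind it (rmdir on a directory symlink deletes only the link itself):

rmdir E:\backups\ddrive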

to test (via checksums inside the tar file) you can use:

lzop -dc N:\hyperv\XXX.53\ddrive.tar.lzo | 7z t -siddrive.tar

important note: notice there is no space in between “-si” and the fictional filename “ddrive.tar”, unlike the creation of
the archive where you specify standard output as
-so ddrive.tar

you have to omit the space when testing or decompressing or you get “Error: not implemented” (wtf)


similarly, to decompress to the current directory, you could use

lzop -dc N:\hyperv\XXX.53\ddrive.tar.lzo | 7z e -siddrive.tar

optionally using -o to set an output directory.
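
for example, to extract into a (hypothetical) D:\restore folder instead; note that 7z also wants no space between -o and the path:

lzop -dc N:\hyperv\XXX.53\ddrive.tar.lzo | 7z e -siddrive.tar -oD:\restore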

*if you’re asking yourself why not just use the cygwin port of “tar” and the --lzop option:
you can, provided you are not using a soft link to a volume shadow copy. In the current release I had to use 7z and lzop
together to be able to compress from a mklink folder; tar won’t see it at all.
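
for an ordinary folder (not a shadow copy link), the cygwin one-liner would look something like this, with paths adjusted for cygwin’s /cygdrive mapping:

tar --lzop -cvf /cygdrive/n/hyperv/ddrive.tar.lzo /cygdrive/e/backups/ddrive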




Solaris/OpenSolaris/OpenIndiana Platform


Here things are a little simpler. You can use tar by itself to do the compression in one step:

tar --lzop -cvf /zpool1/backup/esxi.tar.lzo /zpool1/nfs/vm/esxibackup >/root/backup_vms.log
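
to sanity-check the archive afterwards, GNU tar can list it back the same way:

tar --lzop -tvf /zpool1/backup/esxi.tar.lzo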

As a practical example, the below bash script will:

update the PATH to include lzop binary
sync the local date/time with the domain
mount a windows share (on our Backup Exec 2011 R3 windows server)
clean up (delete) any previous backup clone
clean up (delete) any previous backup snapshot
create a new backup snapshot
create a new backup clone
archive all of our ESXi VMs to a file via the windows share on the backup server
clean up (delete) the backup clone we just created
clean up (delete) the backup snapshot we just created
email me the log file of the backup, so I can tell if it completed.

The IP address has been masked with XXX in place of a number for security reasons.

--------------
root@opensolaris1:~# cat ~/backup_vms_to_53.sh
#!/bin/bash

export PATH=/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin:/usr/local/bin

ntpdate XXX.XXX.XXX.1

mount -F smbfs //administrator:putyouradminpasswordhere@XXX.XXX.XXX.53/backup /zpool1/nfs/vm/XXX.53

zfs destroy zpool1/nfs/vm/esxibackup
zfs destroy zpool1/nfs/vm/esxi@backup

zfs snapshot zpool1/nfs/vm/esxi@backup
zfs clone zpool1/nfs/vm/esxi@backup zpool1/nfs/vm/esxibackup

cd /zpool1/nfs/vm/XXX.53/XXX.149

tar --lzop -cvf /zpool1/nfs/vm/XXX.53/esxi.tar.lzo /zpool1/nfs/vm/esxibackup >/root/backup_vms_to_53.log


zfs destroy zpool1/nfs/vm/esxibackup
zfs destroy zpool1/nfs/vm/esxi@backup

mutt -s "XXX.XXX to XXX.53 backup complete" bmalicoat@mail.aticousa.com -a /root/backup_vms_to_53.log

--------------
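
to run this nightly, a crontab entry along these lines does the job (the 1 AM start time is arbitrary):

0 1 * * * /root/backup_vms_to_53.sh >/dev/null 2>&1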


Recovering your data



extracting a single file from a tar archive using tar itself is annoying and impractical. if you want to get something
out of a tar archive without extracting the entire archive (wtf tar) you can decompress with lzop and
pipe it into 7z, which will let you extract based on wildcard pattern matching:

lzop -dc esxi.tar.lzo | 7z e -siesxi.tar -ir!*.log -o.\temp

which would extract all the *.log files from the backup to a "temp" folder in
the current directory without recreating the directory structure from the tar archive.
(specify a different destination with -o if you prefer)
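
you can also list the contents first, with the same -si trick, to figure out which pattern you need:

lzop -dc esxi.tar.lzo | 7z l -siesxi.tar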

This works fine in 7-Zip 9.22 on your Windows Backup Exec server. The bad news?
This does NOT work in the p7zip 4.55 port on OpenIndiana. (if someone has an alternate syntax that DOES work
please let me know) You may be able to get the source for 7z 9.22 to compile in Solaris but
I'm not going to go down that road when it works ok in windows.

so if you need to recover a VMDK and VMX file:
extract it on your Backup Exec windows server, then copy it to your OpenIndiana server.

if you don't use windows in your environment as your backup server (I would imagine this
means you have all disk-only backups and have ditched tapes entirely, or you use
different backup software than the windows-only Backup Exec), you still have options:


if you have the space on your OpenIndiana server, you can decompress the tar file:

lzop -d esxi.tar.lzo

then 7z extract the file(s) you're looking for:

7z e esxi.tar
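
note that “7z e” flattens everything into the output folder; use “x” instead if you want to keep the directory structure from the tar archive, optionally with the same wildcard filters:

7z x esxi.tar -ir!*.vmdk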

another option would be to compress with just 7z, not using tar or lzo, but
in our environment it takes too long to finish the archiving to be practical.
(lzo compression really is that much faster than everything else that exists
in both windows and unix/linux environments)

what would be nice:
a 7-Zip GUI that could read/decompress and write/compress tar/lzo archives.

Free way to back up ESXi VMs

If you find yourself in the position that your (small) company lacks the funds to either:
A) license ESX over ESXi, or

B) license a remote agent for Backup Exec to do backups of your ESX host

Then you have to consider alternative ways of backing up your VMs.
I would definitely recommend storing your VMs on an NFS share (as opposed to iSCSI) because it gives you the opportunity to do file-level backups instead of block-level backups.

If you have Backup Exec and live in a Windows Server centric environment, you might consider
using an NFS share on a Server 2008 R2 host. Backup Exec comes with Remote Agent for Windows licenses and can use VSS (Volume Shadow Copy Services) to make a snapshot of the drive that your NFS share is on and copy the NFS folder. This assumes you already have a copy of Backup Exec (tested using 2011 R3 and it was fine) or some other backup software that takes advantage of VSS.
 
If you don't have Backup Exec or similar VSS-using software, your only free option (that I'm aware of) is to use Solaris/OpenSolaris/OpenIndiana. ZFS snapshots and clones make it (relatively) easy to write bash shell scripts you can schedule with crontab to do full/differential backups with rsync. The destination of the backups is pretty flexible as well, as you can easily mount an NFS or Windows share on your backup fileserver from OpenSolaris or OpenIndiana. (If your company doesn't have budget for Backup Exec, likely you are using OpenIndiana because it's free for commercial use.)

If you are backing up a ZFS volume (filesystem in OpenIndiana) to another ZFS volume on another OpenIndiana server, differential backups are straightforward. You can use the --update option in rsync and it will only back up those VM files that have actually changed. If your VMDK files contain a lot of empty space (pre-allocated vs. sparse) then the --compress-level=1 option in rsync works great (for OpenIndiana to OpenIndiana; don't expect it to work cross-platform, it didn't for me). It cuts down on your network traffic if there is not a dedicated connection between the servers.
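
a sketch of that rsync invocation for the OpenIndiana-to-OpenIndiana case (the destination server XXX.XXX.XXX.54 and path are hypothetical):

rsync -avz --update --compress-level=1 /zpool1/nfs/vm/esxibackup root@XXX.XXX.XXX.54:/zpool1/backup/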

Another option (on top of the above) is to compress the VMs server side BEFORE transferring them to the backup destination. This also helps to cut down on network traffic (we have a half gigabit, half 100Mbit network and not all destinations are on the same switch, so it's a big help here). I'll detail using LZO compression in the next post.