Friday, October 21, 2011

Using TAR/LZO compression cross platform for backups, a practical guide

Windows Platform


Using LZOP and 7-Zip from the command line to create a UNIX/Linux-compatible
tar archive with LZO compression.

to compress:

first, find the list of all your shadow copy volumes:


vssadmin list shadows


you should find an entry for the drive you are looking for:

Contents of shadow copy set ID: {b8ae8ecc-f536-439a-ac9f-de1c508786c6}
  Contained 1 shadow copies at creation time: 10/20/2011 8:49:49 AM
     Shadow Copy ID: {3c244889-abf3-4a82-b0be-d999cf78a13f}
        Original Volume: (D:)\\?\Volume{4afdb26c-c432-11e0-8885-806e6f6e6963}\
        Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy4
        Originating Machine: XXXXXXXX.XXX.XXXXXX
        Service Machine: XXXXXXXX.XXX.XXXXXX
        Provider: 'Microsoft Software Shadow Copy provider 1.0'
        Type: ClientAccessible
        Attributes: Persistent, Client-accessible, No auto release, No writers, Differential


Now create a soft link to that volume:

mklink /D  E:\backups\ddrive\ \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy4\

you have to add the \ on the end of the shadow copy path or it won't work right.

Now use a combination of 7zip and lzop to create a unix-linux compatible tar archive with lzo compression:

7z a -mx=0 -ttar -so ddrive.tar E:\backups\ddrive | lzop -1vfo N:\hyperv\ddrive.tar.lzo

If you don’t put “ddrive.tar” in the 7zip part it won’t work right, even though you
are writing to standard output.

I haven't seen any particular benefit in specifying a compression level other than -1 with lzop, and
with the size of the files we are working with we want the compression to go as quickly as possible or
a full backup will take days to complete. (LZMA and LZMA2 offer much better compression ratios
if you have unlimited time to create the archive file.)

to test (via checksums inside the tar file) you can use:

lzop -dc N:\hyperv\196.53\ddrive.tar.lzo | 7z t -siddrive.tar

important note: notice there is no space in between "-si" and the placeholder filename "ddrive.tar", unlike the creation of
the archive, where you specify standard output as
-so ddrive.tar

you have to omit the space when testing or decompressing or you get "Error: not implemented" (wtf)


similarly, to decompress to the current directory, you could use

lzop -dc N:\hyperv\196.53\ddrive.tar.lzo | 7z e -siddrive.tar

optionally using -o to set an output directory.

*if you're asking yourself why you don't just use the Cygwin port of tar and its --lzop option:
you can, provided you are not using a soft link to a volume shadow copy. In the current release I had to use 7z and lzop
together to be able to compress from a mklink folder; tar won't see it at all.
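For reference, this is roughly what the Cygwin tar one-liner would look like against a normal folder (the paths here are made up, and I haven't re-verified this exact command, so treat it as a sketch):

tar --lzop -cvf /cygdrive/e/backups/ddrive.tar.lzo /cygdrive/d/data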




Solaris/OpenSolaris/OpenIndiana Platform


Here things are a little simpler. You can use tar by itself to do the compression in one step:

tar --lzop -cvf /zpool1/backup/esxi.tar.lzo /zpool1/nfs/vm/esxibackup >/root/backup_vms.log

As a practical example, the below bash script will:

update the PATH to include lzop binary
sync the local date/time with the domain
mount a windows share (on our Backup Exec 2011 R3 windows server)
clean up (delete) any previous backup clone
clean up (delete) any previous backup snapshot
create a new backup snapshot
create a new backup clone
archive all of our ESXi VM's to a file via the windows share on the backup server
clean up (delete) the backup clone we just created
clean up (delete) the backup snapshot we just created
email me the log file of the backup, so I can tell if it completed.

The IP address has been masked with XXX in place of a number for security reasons.

--------------
root@opensolaris1:~# cat ~/backup_vms_to_53.sh
#!/bin/bash

export PATH=/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin:/usr/local/bin

ntpdate XXX.XXX.XXX.1

mount -F smbfs //administrator:putyouradminpasswordhere@XXX.XXX.XXX.53/backup /zpool1/nfs/vm/XXX.53

zfs destroy zpool1/nfs/vm/esxibackup
zfs destroy zpool1/nfs/vm/esxi@backup

zfs snapshot zpool1/nfs/vm/esxi@backup
zfs clone zpool1/nfs/vm/esxi@backup zpool1/nfs/vm/esxibackup

cd /zpool1/nfs/vm/196.53/196.149

tar --lzop -cvf /zpool1/nfs/vm/XXX.53/esxi.tar.lzo /zpool1/nfs/vm/esxibackup >/root/backup_vms_to_53.log


zfs destroy zpool1/nfs/vm/esxibackup
zfs destroy zpool1/nfs/vm/esxi@backup

mutt -s "XXX.XXX to XXX.53 backup complete" bmalicoat@mail.aticousa.com -a /root/backup_vms_to_53.log

--------------


Recovering your data



extracting a single file from a tar archive using tar itself is annoying and impractical. If you want to get something
out of a tar archive without extracting the entire archive (wtf tar), you can decompress with lzop and
pipe it into 7z, which will let you extract based on wildcard pattern matching:

lzop -dc esxi.tar.lzo | 7z e -siesxi.tar -ir!*.log -o.\temp

which would extract all the *.log files from the backup to a "temp" folder in
the current directory without recreating the directory structure from the tar archive.
(specify a different destination with -o if you prefer)

This works fine in 7zip 9.22 on Windows on your Backup Exec server. The bad news?
This does NOT work in 7zip 4.55 in OpenIndiana. (If someone has an alternate syntax that DOES work,
please let me know.) You may be able to get the source for 7z 9.22 to compile on Solaris, but
I'm not going to go down that road when it works fine in Windows.

so if you need to recover a VMDK and VMX file:
extract them on your Backup Exec Windows server, then copy them to your OpenIndiana server.
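For example (the VM name and output folder below are placeholders), on the Windows server something like this should pull just one VM's VMX and VMDK files out of the archive:

lzop -dc esxi.tar.lzo | 7z e -siesxi.tar -ir!myvm*.vmx -ir!myvm*.vmdk -o.\restore

then copy the extracted files back over to the OpenIndiana NFS share.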

if you don't use Windows in your environment as your backup server (I would imagine this
means you have all disk-only backups and have ditched tapes entirely, or you use
different backup software than the Windows-only Backup Exec), there is still a fallback:


if you have the space on your OpenIndiana server, you can decompress the tar file:

lzop -d esxi.tar.lzo

then use 7z to extract the file(s) you're looking for:

7z e esxi.tar

another option would be to compress with just 7z, not using tar or lzo, but
in our environment it takes too long to finish the archiving to be practical.
(lzo compression really is that much faster than everything else that exists
in both Windows and Unix/Linux environments)

what would be nice:
a 7zip GUI that could read/decompress and write/compress tar/lzo archives.

free way to backup ESXi VM's

If you find yourself in the position that your (small) company doesn't have the funds to either:
A) license ESX over ESXi

or B) license a remote agent for Backup Exec to do backups of your ESX host

Then you have to consider alternative ways of backing up your VM's.
I would definitely recommend storing your VM's on an NFS share (as opposed to iSCSI) because it gives you the opportunity to do file-level backups instead of block-level backups.

If you have Backup Exec and live in a Windows Server centric environment, you might consider
using an NFS share on a Server 2008 R2 host. Backup Exec comes with Remote Agent for Windows licenses and can use VSS (Volume Shadow Copy Service) to take a snapshot of the drive that your NFS share is on and copy the NFS folder. This assumes you already have a copy of Backup Exec (tested using 2011 R3 and it was fine) or some other backup software that takes advantage of VSS.
 
If you don't have Backup Exec or similar VSS-using software, your only free option (that I'm aware of) is to use Solaris/OpenSolaris/OpenIndiana. ZFS snapshots and clones make it (relatively) easy to write bash shell scripts you can schedule with crontab to do full/differential backups with rsync. The destination of the backups is pretty flexible as well, since you can easily mount an NFS or Windows share on your backup fileserver from OpenSolaris or OpenIndiana. (If your company doesn't have the budget for Backup Exec, you are likely using OpenIndiana anyway because it's free for commercial use.)

If you are backing up a ZFS volume (filesystem in OpenIndiana) to another ZFS volume on another OpenIndiana server, differential backups are straightforward. You can use the --update option in rsync and it will only back up those VM files that have actually changed. If your VMDK files contain a lot of empty space (pre-allocated vs sparse), then the --compress-level=1 option in rsync works great (for OpenIndiana to OpenIndiana; don't expect it to work cross-platform, it didn't for me). It cuts down on your network traffic if there is not a dedicated connection between servers.
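As a rough sketch (the destination hostname and paths below are made up, and this assumes SSH access between the two OpenIndiana boxes; daemon mode would also work), a differential pass against a backup clone might look like:

zfs snapshot zpool1/nfs/vm/esxi@backup
zfs clone zpool1/nfs/vm/esxi@backup zpool1/nfs/vm/esxibackup
rsync --recursive --times --update --delete -z --compress-level=1 /zpool1/nfs/vm/esxibackup/ root@backupserver:/zpool1/backup/esxi/
zfs destroy zpool1/nfs/vm/esxibackup
zfs destroy zpool1/nfs/vm/esxi@backup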

Another option (on top of the above) is to compress the VM's server side BEFORE transferring them to the backup destination. This also helps to cut down on network traffic (we have a half gigabit, half 100Mbit network and not all destinations are on the same switch, so this is a big help here). I'll detail using LZO compression in the next post.

Thursday, July 7, 2011

32bit Solaris performance issues

[2011-05-27]
We have 3 OpenSolaris/OpenIndiana fileservers at work. One of them is an Intel Core 2 Duo with 8 GB of DDR2 RAM (it has 64-bit OpenSolaris and works like a champ). The second is a really old Dell Pentium 4 with 2.5 GB of memory, and the third is an old HP 4U server (I'll note down the model number and a pic later). After multiple benchmarks and digging, I've come to the conclusion:
Never use 32-bit Solaris for an ESXi NFS target!!!!
Why, you ask? It seems the 32-bit kernel has been put out to pasture in terms of feature updates (and rightly so) and will scarcely use any of the server's memory for the ARC (the ZFS memory cache). Write performance doesn't seem to take a hit (that I've noticed) but read performance is awful. Even just doing a
echo ::memstat | mdb -k
you can't even see the ZFS file cache memory allocation on a 32-bit installation. Doing kstat monitoring, you see about 64 MB to 128 MB (pitiful) of memory being used short term for the ARC and then cleared (yeah, the data doesn't even stay in memory).
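If you want to watch this yourself, the stock kstat interface exposes the ARC counters; something along these lines prints the current and maximum ARC sizes in bytes (statistic names may differ slightly between releases):

kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max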
so… if you are like us and severely constrained by budget, resist the temptation to use that old 32-bit machine (even if it was an expensive 4U server at the time) as an NFS target for ESXi; your VM's will thank you.
Incidentally, we use the Pentium 4 with a RAIDZ array of 1TB drives connected via USB to store database backups to disk. You heard me right, a USB RAIDZ array. What kind of write performance do you get with such a beast? A whopping 4 megabytes/sec. Yes, 4. But that's enough for our purposes to keep transaction/full dumps of our Sybase database for an entire year. (Done in addition to tape backups.)
So… 32-bit Solaris/OpenSolaris/OpenIndiana should be relegated to disk backups replacing a tape drive, more or less. Just my 2 cents.
(the 4U HP server has 9 GB of RAM, of which a whopping 900 megabytes or so actually get used by the OS, what a waste)
[2011-07-16]
I read somewhere online that Oracle has chosen to drop 32-bit support from subsequent Solaris versions (11+). I'm guessing that the OpenIndiana folks are keeping it because they're trying to appeal to desktop users in addition to the server crowd.

ESXi performance issues

[2011-05-23]
Lately at work I've been putting some thought into how to make ESXi in our environment perform better. Don't get me wrong, it's doing what it's designed to do and advertises to be able to do. It's just that I would love to get closer to replacing ALL our servers with VM's except for hypervisors and fileservers.
For lots of VM's, the CPU speed and disk read/write speed are not an issue. For example, we have a Windows Server 2003 VM hosting a website that reads and writes data to/from the production database server, and it's fine as a VM.
However, we do compiles on a (different) Server 2003 machine, and what takes about 1h5min on the physical machine is taking almost 2 hours in an exact copy (VMware Converter) VM. The CPU is rarely maxed out, so the only conclusion is that it's the continual (small) reads and writes that get bogged down going against the NFS share on a Solaris RAIDZ array. RAIDZ is kind of awful for small frequent reads and writes (performance goes way up on big file transfers) and our "fileserver" is really a desktop machine with 8GB of memory, so… you get what you pay for?
Another issue may be too many hands in the same cookie jar. Obviously, when you're using a single RAIDZ array for 5+ VM's and they're all doing disk I/O at roughly the same time, your performance is going to go to crap. If we had the resources, I'd like to see how 5+ separate mirrored zpools would do in comparison (I would expect a significant increase, but at the moment there is no way to test it). Another consideration might be a single large RAIDZ for VM's not needing the performance, and 2 or 3 mirrored zpools of SSD's. The only problem is the SSD's are definitely out of the budget unless they come down in price a bit first.
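For the record, the kind of layout I have in mind would be something like the sketch below (the device names are placeholders, not our real disks): a couple of small mirrored pools for the I/O-heavy VM's, each shared out over NFS, alongside the existing RAIDZ for everything else.

zpool create vmpool1 mirror c1t2d0 c1t3d0
zpool create vmpool2 mirror c1t4d0 c1t5d0
zfs create vmpool1/nfs
zfs set sharenfs=on vmpool1/nfs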
Not to mention we also use the same RAIDZ array for CIFS shares for code, Excel templates, documents, backups, all kinds of things.
So… what steps can be taken to get that 2 hour compile time reasonably closer to the 1 hour and 5 minutes of a physical (1U rack) server?
(more later and as additional optimization attempts are done)
[2011-07-07]
I just re-read this blog and I realized the obvious solution I didn't mention: adding a SATA SSD directly to the hypervisor. This would be a great solution except for the fact that the hypervisor in question is a consumer HP desktop PC with no open drive bays, and the $500-ish for the SSD is not in the budget.

cwrsync and openindiana -> windows server 2008 R2

[2011-05-15]
Today I ran into an issue with cwrsync going from OpenIndiana to Windows Server 2008 R2. If you're not familiar with it, cwrsync is a port of rsync to Windows by the folks at
http://www.itefix.no/i2/cwrsync
I’m sure there are other ports, but cwrsync is free, and seems to support all the functionality (including ssh transport and rsync daemon connections) that the linux/unix versions do, so I’ve decided to go with that.
If you're not familiar with rsync, it's well worth your effort to look into it if you do any kind of file or directory synchronization across servers. (Yes, there are a number of uses even locally, but that's not my focus.) Windows has its own rough equivalent, robocopy, but robocopy does not play as well with Linux and Unix. (You can use robocopy to/from Linux but it requires a Samba share.)
Anyway, my issue was that I was really stressing the server by trying to do 3 massive directory synchronizations from three source OpenIndiana servers (hosting the ESXi VM's) to a Windows Server 2008 R2 machine via rsync sender / cwrsync server receiver.
I was getting a network error at seemingly random places on one or more of the OI (openindiana) boxes.
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: read error: Connection reset by peer (131)
rsync error: error in rsync protocol data stream (code 12) at io.c(759) [sender=3.0.6]
I did some research and all I could come up with was setting the timeout= in the rsyncd.conf file on the Windows Server 2008 R2 machine.
rsyncd.conf
——————-
use chroot = false
strict modes = false
hosts allow = *
log file = rsyncd.log
uid = 0
gid = 0
timeout = 3000
contimeout = 3000
[ydrive]
path = /cygdrive/y/esxi
read only = false
transfer logging = no
timeout = 3000
contimeout = 3000
[zdrive]
path = /cygdrive/z/esxi
read only = false
transfer logging = no
timeout = 3000
contimeout = 3000
I tried a value of 30 (which is supposed to be 30 seconds), which didn't work, then 300 (which didn't work), and then finally (keeping my fingers crossed) 3000 seconds. Don't ask me why, if the value is in seconds, you have to set timeout= to such a high value to get it to work, but we'll see if that was indeed the case. (The file transfer in question takes 20+ hours to complete over gigabit.)
Why on earth are we transferring all that data off OI onto 2008 R2, you ask? (Well, I would be.) We have a Neo tape drive and Backup Exec software only installs on Windows now. (Not referring to the remote agent, I mean the machine that drives the physical tape drive.) So… we need to
from each OI box:
create snapshot
clone snapshot
rsync --progress --times --update --recursive --delete -z --compress-level=1 --inplace /zpool1/nfs/vm/esxibackup/* XXX.XXX.XXX.XXX::ydrive
from the three OpenIndiana boxes that host our VM's (dedicated fileservers).
This is the only way I'm aware of to (easily) back up all your VM's while they're still running without manually copying VMX files, then creating snapshots, then copying the VMDK files minus deltas, then deleting the snapshot of every single VM. I realize some people have scripts to do this, but for me, trying to get that to work flawlessly on a weekly basis for 40+ VM's is not a good solution. The snapshot->clone->rsync solution at least guarantees that we get an "exact moment in time" copy. There might be issues with a VM running a *non* journaling filesystem, but we don't have anything without one, so it works for us.  (note: we don't use VM's for production databases or mail servers)
Anyway, I hope the timeout=3000 idea helps if you run into a similar situation.
I only experienced it when the server was getting really hammered from three other machines rsync’ing to it simultaneously, but your mileage may vary.
(I'm not at work right now, I'll log on and post the rsyncd.conf bits and the results of the timeout=3000 change tomorrow)
[2011-05-17] followup:
The timeout= didn’t help. Two of the rsync’s still crashed with similar errors
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on “FHQ/FHQ_1-flat.vmdk” (in zdrive): Permission denied (13)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.8]
rsync: read error: Connection reset by peer (131)
rsync error: error in rsync protocol data stream (code 12) at io.c(759) [sender=3.0.6]
later that night. I am trying various options. It almost seems as if I am either overloading the gigabit switch we're using or the target server. I am experimenting now with using --compress-level=9 and --sparse (--sparse and --inplace are mutually exclusive) to see if that helps and will update this blog tomorrow. (I believe using --sparse instead of --inplace would actually make the difference. There might be an issue with cwrsync and trying to do in-place updates to existing files; we'll see.)
[2011-05-18]
still getting similar error messages even after changing to --sparse and (thus) not doing an --inplace anymore. I thought that maybe it had to do with enabling compression on the target Windows directory (Windows built-in compression) causing too much stress on the server (with three simultaneous rsync's going), but that doesn't really seem to be the case either (after re-testing with compression off). So I'm left with some problem inherent to cwrsync, specific (?) to Server 2008 R2 and doing multiple inbound rsyncs at the same time. My solution for now is to stagger the backups from the three NFS servers during the week so only one is running at a time, such that we can have a weekly backup to tape on Sundays. I'm sticking a fork in it, cause I'm done messing with cwrsync trying to get it to work. Shame really, cause it would be a much more elegant solution to be able to schedule all three servers to back up during overlapping time windows. C'est la vie.
[2011-05-23]
No problems since staggering the backups so that only one is running at a time.
Works out to be a fairly hands free (hopefully) trouble free backup solution for all the VM’s to disk and tape backup.
[2011-05-27]
It appears --whole-file (essentially turning off the file delta algorithm built into rsync) works much better with cwrsync than trying to let it update only the parts that changed.
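Roughly, the sender-side command now looks like this (same IP masking as before; this is a sketch of the earlier command with --whole-file swapped in for --inplace, minus the compression switches):

rsync --progress --times --update --recursive --delete --whole-file /zpool1/nfs/vm/esxibackup/* XXX.XXX.XXX.XXX::ydrive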
[2011-07-07]
Just a quick update. Since switching to --whole-file, the backups are working like a charm. They run from bash scripts set up in crontab. I also installed "mutt" to enable emailing myself and another system admin when a backup is finished, with a confirmation message and the rsync log file as an attachment. (It cuts down on logging in to check on backup status.)
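For what it's worth, the scheduling side is nothing fancy, just a root crontab entry on each OI box along these lines (the day and time here are examples, not our real stagger):

# run the VM backup script Friday at 10pm; the script writes its own log and mails it via mutt
0 22 * * 5 /root/backup_vms_to_53.sh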