I'm working on a Windows ZFS administration tool. Basically, it lets you quickly monitor multiple Solaris/OpenSolaris/OpenIndiana boxes, shows you a list of zpools, zpool status, ZFS filesystems, and snapshots, and lets you delete selected snapshots. (Most of it is functional now; I just have to automate checking pool status on a polling interval and sending an email in case a zpool goes offline.)
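For a rough idea of what that polling check amounts to, here is a minimal shell sketch of the same logic (this is not the tool's code; the mailx call and the recipient address are assumptions):
------------------------------------------------
#!/bin/bash
# Sketch: email an alert if any zpool on this box is unhealthy.
# "zpool status -x" prints "all pools are healthy" when everything is fine.
ADMIN_EMAIL="admin@example.com"   # hypothetical recipient

status=$(zpool status -x)

if [ "$status" != "all pools are healthy" ]; then
    echo "$status" | mailx -s "zpool problem on $(hostname)" "$ADMIN_EMAIL"
fi
------------------------------------------------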
Friday, July 13, 2012
Thursday, July 12, 2012
Tiers of (Hyper-V) disaster recovery
Tiers of (Hyper-V) disaster recovery:
---------------------------------------------
Tier: omg we’re f*cked
No backups of VHD or VM config files.
No alternate hypervisors to run VM’s on.
Make sure your resume is up to date and
your phone number is unlisted.
Backup Software Cost: nada/zip/zero. Use
all that extra money for beer. Hurray!
Hardware Cost: hardware? Who needs a
server? Just use the free hyper-v server install on a desktop PC.
Tier: living la vida loca
Filesystem-level backups of VHD files through a program that uses the VSS service (example: Hyperoo)
You will have to manually restore, then
re-create VM configurations (including reconfiguring IP addresses of NICs in
each VM)
Make sure you have some sort of documentation showing you made your boss aware that restoring from a hardware failure will take days or weeks.
Backup Software Cost: somewhere in the
range of $158 and up (depending on software used)
Hardware Cost: Backup server, disk
space, tape drive(s). If you are lucky you can re-purpose an old machine you
virtualized for no (additional) cost.
NOTE: database dumps and transaction
dumps for SQL Server/Oracle/etc databases are handled outside of VM backup
mechanism. (i.e. you don’t rely on VM backups to back up your relational
databases)
NOTE: you do NOT (!) use a VM for a
fileserver and store production data inside a VM. (File server
backups/replication are handled outside of VM’s)
Tier: Tums and Pepto-Bismol are your friends
VM level backups of VM’s to disk/tape
through a program that is VM aware ( Veeam, Backup Exec, etc)
You can use your backup software to selectively restore VMs to a different hypervisor than the one they were backed up from.
Time to restore all your VM’s can vary
but might take several hours.
Backup images may be several hours/days
old (lost transactions)
Backup Software Cost: $1,000 to $6,000
(depends on how many sockets/servers you are backing up)
Hardware Cost: Backup server, disk
space, tape drive(s). If you are lucky you can re-purpose an old machine you
virtualized for no (additional) cost.
NOTE: database dumps and transaction
dumps for SQL Server/Oracle/etc databases are handled outside of VM backup
mechanism. (i.e. you don’t rely on VM backups to back up your relational
databases)
NOTE: you do NOT (!) use a VM for a
fileserver and store production data inside a VM. (File server
backups/replication are handled outside of VM’s)
Tier: I may have a nervous twitch in my eye but I can sleep at night
VM level backups of VM’s to disk/tape
through a program that is VM aware (Veeam, Backup Exec, etc)
VM level replication to a(n) alternate
hypervisor(s) using replication software (Veeam, etc)
You can use your backup software to selectively restore VMs to a different hypervisor than the one they were backed up from.
You can failover a VM from a dead
hypervisor to a replica on another hypervisor.
Time to restore all your VM’s can vary
but might take a few minutes to several hours.
Backup images may be a couple hours old
(lost transactions), but if you are replicating frequently, especially
important VM’s storing data locally, you can minimize this impact somewhat.
Backup Software Cost: $1,000 to $6,000
(depends on how many sockets/servers you are backing up)
Hardware Cost: Backup server, disk
space, tape drive(s). If you are lucky you can re-purpose an old machine you
virtualized for no (additional) cost. At least two hypervisors required. (maybe
three if you need off-site recovery options)
NOTE: database dumps and transaction
dumps for SQL Server/Oracle/etc databases are handled outside of VM backup
mechanism. (i.e. you don’t rely on VM backups to back up your relational
databases)
NOTE: you do NOT (!) use a VM for a
fileserver and store production data inside a VM. (File server
backups/replication are handled outside of VM’s)
Tier: I leave work at 5 on the dot every day.
Clustered/SCVMM environment. Multiple hypervisors in a cluster enabling live migration and load balancing.
Automatic/managed failover from a failed hypervisor to other members in the cluster.
Time to fail over and bring all your VMs back up in a crash-recovery state can be measured in minutes.
VM level backups of VM’s to disk/tape
through a program that is VM aware (Veeam, Backup Exec, etc)
Backup Software Cost: $1,000 to $6,000
(depends on how many sockets/servers you are backing up)
Hardware Cost: Backup server, disk
space, tape drive(s). At least two hypervisors required. (maybe three if you
need off-site recovery options). SAN (iSCSI target) server required also.
The only point of failure in this scenario would be your SAN storage failing (which can obviously happen). That would require you to restore from backups to another SAN and reconfigure the Hyper-V cluster.
Tier: The sound of the waves relaxes me as I sip my Pina Colada
Clustered/SCVMM environment. Multiple hypervisors in a cluster enabling live migration and load balancing.
Automatic/managed failover from a failed hypervisor to other members in the cluster.
SAN (iSCSI) storage replicated in HA
(high availability) fashion with automatic IP failover. (software or hardware
solutions)
Time to fail over and bring all your VMs back up in a crash-recovery state can be measured in minutes.
VM level backups of VM’s to disk/tape
through a program that is VM aware (Veeam, Backup Exec, etc) (Although at this
point, these backups might be more for “point in time” recovery for
non-disaster reasons)
Cost: If you are in this category, your
company probably has more annual revenue than most small countries, so cost is
not an issue.
Friday, June 22, 2012
Custom Building a Hyper-V server
When it comes to the hardware you are going to use to virtualize your infrastructure, you basically have two choices: buy it or build it yourself. Obviously you can run a few machines on a six-core desktop machine you put 16 GB of memory into, but that's only suitable for testing purposes -- not production use.
If your company or client has the money, then I would definitely say buy prebuilt hardware certified for your choice of VMware or Hyper-V (read: HP or IBM), backed by an enterprise-class SAN. Most small to midsize companies do not have the budget to accommodate those kinds of infrastructure purchases. That leaves us with building our own hypervisor from parts. Luckily it's not as complicated as it sounds, but there are a few gotchas to keep in mind along the way. (I will be documenting the project we are working on at work toward that end in this post as time permits.)
Case: SUPERMICRO
CSE-836TQ-R800B Black 3U Rackmount Server Case $809.99
At $810, this might seem a bit steep for a 3U case, but considering it has 16 hot-swap 3.5" SATA/SAS bays with a backplane and dual/redundant 800W power supplies, it's a good investment. All the fans and the (screwless) rail mount kit come included as well. Overall, a nice case.
Motherboard: SUPERMICRO MBD-X9DR3-F-O $469.99
There are comparable Intel motherboards for dual
CPU Ivy Bridge Xeon chips, but if you are buying a Supermicro case you’ll
probably get easier installation. This MB uses the same Intel C600 series chipsets that the other ones do. It has two 6 Gb/s SATA ports, four 3 Gb/s SATA ports, and two SCU ports (SFF-8087 SAS plugs, four lanes each) for a total of 14 SATA connections.
Memory: Kingston 4GB 240-pin DDR3 1333 (PC3 10600) ECC Registered Server Memory, model KVR1333D3D8R9S/4GHB -- $279.92 for 32 GB (we intend to expand to 64 GB later)
RAID card:
Areca ARC-1223-8I $470.00
Yes, I suppose you could stick with RAID 10 and the onboard Intel SCU chipset, but you will get better performance out of the Areca card. In our case, we are using one motherboard SATA connection for the DVD drive and two motherboard SATA connections for the two-drive boot mirror (software RAID), leaving us room to build an 8-drive RAID 5 or RAID 6 array for VM file storage. The six leftover drive bays can be populated with larger, slower drives and used for disk backup purposes or what have you.
NIC card: Intel
E1G44HTBLK 10/100/1000Mbps PCI-Express 2.0 Server Adapter I340-T4 $230
The Supermicro motherboard comes with two Gigabit Ethernet ports, but since we have a Cisco router that supports 802.3ad link aggregation, we bonded the four ports on this PCIe x4 card to get decent bandwidth for all the VMs that are going to run on this server.
CPUs: 2 Intel Xeon E5-2620 Sandy Bridge-EP 2.0GHz $420 each
Sandy Bridge-EP technology, 12 cores total plus Hyper-Threading and Turbo Boost -- what more can you ask for? Incidentally, the E5 models below the 2620 are cheaper but do not have Turbo Boost or Hyper-Threading, so I would say this is the bare minimum CPU you want to go with.
Grand Total (excluding tax and shipping): $3,099.90 -- not bad at all for what it's capable of.
Obviously given the price of hard drives right now
your final cost will go up but not by a ridiculous amount if you stick with 3.5”
SATA drives. (If you have the money for a server full of 2.5” SAS enterprise
quality hard drives or SSD array, you probably would be going with an HP or IBM
prebuilt server)
Notes on installation:
The motherboard went into the case pretty smoothly. All the cables worked out except for one of the case fans, for which we had to buy a separate fan power cable extension. Memory: no issues. CPU: make sure you get the exact (fanless) heatsink or it won't fit.
The included SATA/SAS breakout cables were all of sufficient length to attach to the backplane. The bottom SATA connectors of the backplane are a little hard to reach with big hands.
Windows Server 2008 R2 will not boot from the SCU. At least we couldn't figure out how to get it to, even using the Intel SCU array drivers included on the CD. You can install to and boot from a SATA RAID 1 with no problems though, so we went with two 500 GB boot drives connected to the MB SATA ports.
You will have to install the Intel NIC drivers
before the Ethernet connections will work in 2008 R2 (no big deal, you can
download the latest online from Intel)
We had a moment of panic when (after installing everything and doing Windows updates) starting up a CentOS 5.2 VM put the server into an infinite blue-screen-of-death crash loop. There is a hotfix available from Microsoft that cleared up the issue, but it made crystal clear that Hyper-V is only half-baked when it comes to Linux support. (At least that's how it feels coming from a VMware setup where almost everything just works for any guest OS.)
I’ll be updating this entry as we move over our
production workload to the server. I’ll probably do a separate entry for the
Veeam replication & backup solution that is necessitated by not using an
enterprise grade SAN solution to store your VM’s on iSCSI targets.
If you have any issues or questions feel free to
drop me a line.
Monday, January 9, 2012
Quasi-realtime filesystem replication using ZFS snapshots
Doing rsync backups of our ESXi NFS share was taking a considerable amount of time (18+ hours). This wouldn't really be an issue except that it impacts performance of the NFS server. Also, your backups end up considerably "stale" (approaching a week old), which makes recovery of critical data questionable.
If you need absolutely immediate replication, you could consider doing a mirror of iSCSI targets on ZFS over a high-speed connection. I've read of people doing that but never had a need to myself. The advantage there is that ZFS would handle resilvering the mirror automagically if the connection to the external iSCSI target dropped for a period of time. This seems a little overkill for our situation, however.
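For the curious, here is a minimal sketch of what that iSCSI-mirror idea might look like on the ZFS side. The pool name and device names are made up, and it assumes the iSCSI initiator has already been pointed at the remote target so the remote LUN shows up as a local disk:
------------------------------------------------
# Hypothetical: mirror a local disk against an iSCSI-backed disk.
# c0t1d0 is a local disk; c2t600144F0ABCD1234d0 is the (made-up) device name
# the remote iSCSI LUN appeared as after initiator configuration.
zpool create tank mirror c0t1d0 c2t600144F0ABCD1234d0

# If the link to the remote target drops and comes back, ZFS resilvers the
# mirror on its own; check progress with:
zpool status tank
------------------------------------------------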
So, I came up with a quick and dirty script that:
makes a snapshot
replicates it
deletes the older (previous) snapshot
As a testing phase, I have it scheduled to run every 5 minutes:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /root/replicate_vms.sh zpool1/nfs/vm/esxi XXX.XXX.XXX.XXX
(IP changed to XXX for security reasons)
You may want to change the location of the log file (/root/replicate_vms.log) to suit your purposes, but you should end up with something like:
root@XXXXXXXXX:/zpool1/nfs/vm# cat ~/replicate_vms.log
zpool1/nfs/vm/esxi@2012-01-09 08:39:41 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:40:44 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:43:16 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:45:01 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:50:00 replicated to XXX.XXX.XXX.XXX successfully.
pre-requisites:
replication is one-way only
target filesystem must be set read-only (zfs set readonly=on)
the source server's SSH public key must be in /root/.ssh/authorized_keys on the target server (and the target's host key accepted into the source's known_hosts) so ssh does not prompt for a password
target server must have: "PermitRootLogin yes" in /etc/ssh/sshd_config
source and target server must have "lzop" program in /usr/local/bin (you can download and build it from lzop site)
source and target filesystems must be "primed" by doing the first snapshot and zfs send | zfs receive manually (a minimal example follows this list)
snapshot names must use a YYYY-MM-DD HH:MM:SS timestamp so a plain sort orders them chronologically:
zfs snapshot "<filesystem>@`date "+%Y-%m-%d %H:%M:%S"`"
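Here is a rough sketch of that one-time priming step, using made-up names (zpool1/nfs/vm/esxi as the dataset, 10.0.0.2 as the target, and an example timestamp) in place of your own:
------------------------------------------------
# 1. Take the first snapshot on the source, in the sortable timestamp format
#    the script expects (the snapshot name will contain a space):
zfs snapshot "zpool1/nfs/vm/esxi@`date "+%Y-%m-%d %H:%M:%S"`"

# 2. Send the full stream to the target; quote the snapshot name because of the space.
#    (Substitute whatever timestamp step 1 actually produced.)
zfs send "zpool1/nfs/vm/esxi@2012-01-09 08:00:00" | ssh root@10.0.0.2 "zfs receive zpool1/nfs/vm/esxi"

# 3. Make the target read-only so nothing touches it between incremental receives:
ssh root@10.0.0.2 "zfs set readonly=on zpool1/nfs/vm/esxi"
------------------------------------------------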
possible issues with this strategy:
If the volume of updates made to the filesystem exceeds the connection bandwidth between source and target, there will be no way for it to "keep up" with live updates. I don't expect that to happen unless you are trying to replicate over the internet or a 100 Mbit connection, or perhaps if you have large databases or file servers running in your VMs on the source NFS share.
If you are replicating over the internet and you don't trust your VPN security 100%, you could
add an additional layer of encryption on top of ssh using crypt or some other command line utility
that supports standard input and standard output. ssh + lzop + crypt = pretty darn secure.
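As an illustration only, here is what that extra layer could look like if you swapped the script's send/receive pipeline for one that runs through openssl (standing in for crypt; the passphrase file path is made up and would need to exist on both hosts):
------------------------------------------------
# Hypothetical variant of the script's send line with an extra encryption layer.
# /root/.replication_pass is a made-up passphrase file present on both hosts.
zfs send -i "$last_remote_snapshot" "$new_local_snapshot" \
  | /usr/local/bin/lzop -1c \
  | openssl enc -aes-256-cbc -salt -pass file:/root/.replication_pass \
  | ssh root@$2 "openssl enc -d -aes-256-cbc -pass file:/root/.replication_pass | /usr/local/bin/lzop -dc | zfs receive $1"
------------------------------------------------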
here is the script:
replicate_vms.sh
------------------------------------------------
#!/bin/bash
# replicate_vms.sh <filesystem> <target-ip>
# Takes a new snapshot of $1, sends the increment since the last snapshot the
# target already has, and cleans up the previous snapshot on both sides.
export PATH=/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin

# most recent local snapshot, new snapshot name, and most recent remote snapshot
# (snapshot names sort chronologically because of the timestamp format)
last_local_snapshot="`zfs list -t snapshot -o name | grep "$1" | sort | tail --lines=1`"
new_local_snapshot="$1@`date "+%Y-%m-%d %H:%M:%S"`"
last_remote_snapshot=`ssh root@$2 "zfs list -t snapshot -o name | grep $1" | sort | tail --lines=1`

echo "last previous snapshot: " $last_local_snapshot
echo "new snapshot: " $new_local_snapshot
echo "last remote snapshot: " $last_remote_snapshot

# snapshot, then send the increment compressed with lzop over ssh
zfs snapshot "$new_local_snapshot"
echo "zfs send -i \"$last_remote_snapshot\" \"$new_local_snapshot\" | ssh root@$2 \"zfs receive $1\""
zfs send -i "$last_remote_snapshot" "$new_local_snapshot" | /usr/local/bin/lzop -1c | ssh root@$2 "/usr/local/bin/lzop -dc | zfs receive $1"

# verify the new snapshot arrived on the target before destroying the old ones
new_last_remote_snapshot=`ssh root@$2 "zfs list -t snapshot -o name | grep $1" | sort | tail --lines=1`
if [ "$new_local_snapshot" == "$new_last_remote_snapshot" ]; then
    echo "$new_local_snapshot replicated to $2 successfully." >> /root/replicate_vms.log
    zfs destroy "$last_local_snapshot"
    ssh root@$2 "zfs destroy \"$last_remote_snapshot\""
else
    echo "$new_local_snapshot failed to replicate to $2! ERROR!" >> /root/replicate_vms.log
fi
-----------------------------------------------------