I'm working on a Windows ZFS administration tool. Basically, it lets you quickly monitor multiple Solaris/OpenSolaris/OpenIndiana boxes, shows you a list of zpools, zpool status, ZFS filesystems, and snapshots, and lets you delete selected snapshots. (Most of it is functional now; I just have to automate checking pool status on a polling interval and sending an email in case a zpool goes offline.)
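For a rough idea of what that polling check amounts to, here is a minimal shell sketch of the same logic (this is not the tool's code; the mailx call and the recipient address are assumptions):
------------------------------------------------
#!/bin/bash
# Sketch: email an alert if any zpool on this box is unhealthy.
# "zpool status -x" prints "all pools are healthy" when everything is fine.
ADMIN_EMAIL="admin@example.com"   # hypothetical recipient

status=$(zpool status -x)

if [ "$status" != "all pools are healthy" ]; then
    echo "$status" | mailx -s "zpool problem on $(hostname)" "$ADMIN_EMAIL"
fi
------------------------------------------------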
Friday, July 13, 2012
Thursday, July 12, 2012
Tiers of (Hyper-V) disaster recovery
Tiers of (Hyper-V) disaster recovery:
---------------------------------------------
Tier: omg we’re f*cked
No backups of VHD or VM config files.
No alternate hypervisors to run VM’s on.
Make sure your resume is up to date and
your phone number is unlisted.
Backup Software Cost: nada/zip/zero. Use
all that extra money for beer. Hurray!
Hardware Cost: hardware? Who needs a
server? Just use the free hyper-v server install on a desktop PC.
Tier: living la vida loca
Filesystem-level backups of VHD files through a program that uses the VSS service (example: Hyperoo)
You will have to manually restore, then
re-create VM configurations (including reconfiguring IP addresses of NICs in
each VM)
Make sure you have some sort of documentation showing you made your boss aware that restoring from a hardware failure will take days or weeks.
Backup Software Cost: somewhere in the
range of $158 and up (depending on software used)
Hardware Cost: Backup server, disk
space, tape drive(s). If you are lucky you can re-purpose an old machine you
virtualized for no (additional) cost.
NOTE: database dumps and transaction
dumps for SQL Server/Oracle/etc databases are handled outside of VM backup
mechanism. (i.e. you don’t rely on VM backups to back up your relational
databases)
NOTE: you do NOT (!) use a VM for a
fileserver and store production data inside a VM. (File server
backups/replication are handled outside of VM’s)
Tier: Tums and Pepto-Bismol are your friends
VM level backups of VM’s to disk/tape
through a program that is VM aware ( Veeam, Backup Exec, etc)
You can use your backup software to selectively restore VMs to a different hypervisor than the one they were backed up from.
Time to restore all your VM’s can vary
but might take several hours.
Backup images may be several hours/days
old (lost transactions)
Backup Software Cost: $1,000 to $6,000
(depends on how many sockets/servers you are backing up)
Hardware Cost: Backup server, disk
space, tape drive(s). If you are lucky you can re-purpose an old machine you
virtualized for no (additional) cost.
NOTE: database dumps and transaction
dumps for SQL Server/Oracle/etc databases are handled outside of VM backup
mechanism. (i.e. you don’t rely on VM backups to back up your relational
databases)
NOTE: you do NOT (!) use a VM for a
fileserver and store production data inside a VM. (File server
backups/replication are handled outside of VM’s)
Tier: I may have a nervous twitch in my eye but I can sleep at night
VM level backups of VM’s to disk/tape
through a program that is VM aware (Veeam, Backup Exec, etc)
VM level replication to a(n) alternate
hypervisor(s) using replication software (Veeam, etc)
You can use your backup software to selectively restore VMs to a different hypervisor than the one they were backed up from.
You can failover a VM from a dead
hypervisor to a replica on another hypervisor.
Time to restore all your VM’s can vary
but might take a few minutes to several hours.
Backup images may be a couple hours old
(lost transactions), but if you are replicating frequently, especially
important VM’s storing data locally, you can minimize this impact somewhat.
Backup Software Cost: $1,000 to $6,000
(depends on how many sockets/servers you are backing up)
Hardware Cost: Backup server, disk
space, tape drive(s). If you are lucky you can re-purpose an old machine you
virtualized for no (additional) cost. At least two hypervisors required. (maybe
three if you need off-site recovery options)
NOTE: database dumps and transaction
dumps for SQL Server/Oracle/etc databases are handled outside of VM backup
mechanism. (i.e. you don’t rely on VM backups to back up your relational
databases)
NOTE: you do NOT (!) use a VM for a
fileserver and store production data inside a VM. (File server
backups/replication are handled outside of VM’s)
Tier: I leave work at 5 on the dot every day.
Clustered/SCVMM environment. Multiple hypervisors in a cluster enabling live migration and load balancing.
Automatic/managed failover from a failed hypervisor to other members in the cluster.
Time to fail over and bring all your VMs back up in a crash-recovery state can be measured in minutes.
VM level backups of VM’s to disk/tape
through a program that is VM aware (Veeam, Backup Exec, etc)
Backup Software Cost: $1,000 to $6,000
(depends on how many sockets/servers you are backing up)
Hardware Cost: Backup server, disk
space, tape drive(s). At least two hypervisors required. (maybe three if you
need off-site recovery options). SAN (iSCSI target) server required also.
The only point of failure in this scenario would be your SAN storage failing (which can obviously happen). That would require you to restore from backups to another SAN and reconfigure the Hyper-V cluster.
Tier: The sound of the waves relaxes me as I sip my Pina Colada
Clustered/SCVMM environment. Multiple hypervisors in a cluster enabling live migration and load balancing.
Automatic/managed failover from a failed hypervisor to other members in the cluster.
SAN (iSCSI) storage replicated in HA
(high availability) fashion with automatic IP failover. (software or hardware
solutions)
Time to fail over and bring all your VMs back up in a crash-recovery state can be measured in minutes.
VM level backups of VM’s to disk/tape
through a program that is VM aware (Veeam, Backup Exec, etc) (Although at this
point, these backups might be more for “point in time” recovery for
non-disaster reasons)
Cost: If you are in this category, your
company probably has more annual revenue than most small countries, so cost is
not an issue.
Friday, June 22, 2012
Custom Building a Hyper-V server
When it comes to the hardware you are going to use to virtualize your infrastructure, you basically have two choices: buy it or build it yourself. Obviously you can run a few machines on a six-core desktop machine you put 16 GB of memory into, but that's only suitable for testing purposes -- not production use.
If your company or client has the money, then I would definitely say buy prebuilt hardware certified for your choice of VMware or Hyper-V (read: HP or IBM), backed by an enterprise-class SAN. Most small to midsize companies do not have the budget to accommodate those kinds of infrastructure purchases. That leaves us with building our own hypervisor from parts. Luckily it's not as complicated as it sounds, but there are a few gotchas to keep in mind along the way. (I will be documenting the project we are working on at work toward that end in this post as time permits.)
Case: SUPERMICRO
CSE-836TQ-R800B Black 3U Rackmount Server Case $809.99
At $810, this might seem a bit steep for a 3U case, but considering it has 16 hot-swap 3.5" SATA/SAS bays with a backplane and dual/redundant 800W power supplies, it's a good investment. All the fans and the (screwless) rail mount kit come included as well. Overall, a nice case.
Motherboard: SUPERMICRO MBD-X9DR3-F-O $469.99
There are comparable Intel motherboards for dual
CPU Ivy Bridge Xeon chips, but if you are buying a Supermicro case you’ll
probably get easier installation. This MB uses the same Intel C600 series chipsets that the other ones do. It has two 6 Gb/s SATA ports, four 3 Gb/s SATA ports, and two SCU ports (SFF-8087 SAS plugs, four lanes each) for a total of 14 SATA connections.
Memory: Kingston 4GB 240-pin DDR3 1333 (PC3 10600) ECC Registered Server Memory, model KVR1333D3D8R9S/4GHB -- $279.92 for 32 GB (we intend to expand to 64 GB later)
RAID card:
Areca ARC-1223-8I $470.00
Yes, I suppose you could stick with RAID 10 and the onboard Intel SCU chipset, but you will get better performance out of the Areca card. In our case, we are using one motherboard SATA connection for the DVD drive and two motherboard SATA connections for the two-drive boot mirror (software RAID), leaving us room to build an 8-drive RAID 5 or RAID 6 array for VM file storage. The six leftover drive bays can be populated with larger, slower drives and used for disk backup purposes or what have you.
NIC card: Intel
E1G44HTBLK 10/100/1000Mbps PCI-Express 2.0 Server Adapter I340-T4 $230
The Supermicro motherboard comes with two Gigabit Ethernet ports, but since we have a Cisco router that supports 802.3ad link aggregation, we bonded the four ports on this PCIe x4 card to get decent bandwidth for all the VMs that are going to run on this server.
CPUs: 2 Intel Xeon E5-2620 Sandy Bridge-EP 2.0GHz $420 each
Sandy Bridge-EP technology, 12 cores total plus Hyper-Threading and Turbo Boost -- what more can you ask for? Incidentally, the E5 models below the 2620 are cheaper but do not have Turbo Boost or Hyper-Threading, so I would say this is the bare minimum CPU you want to go with.
Grand Total (excluding tax and shipping): $3,099.90 -- not bad at all for what it's capable of.
Obviously given the price of hard drives right now
your final cost will go up but not by a ridiculous amount if you stick with 3.5”
SATA drives. (If you have the money for a server full of 2.5” SAS enterprise
quality hard drives or SSD array, you probably would be going with an HP or IBM
prebuilt server)
Notes on installation:
The motherboard went into the case pretty smoothly. All the cables worked out except for one of the case fans, for which we had to buy a separate fan power cable extension. Memory: no issues. CPU: make sure you get the exact (fanless) heatsink or it won't fit.
The included SATA/SAS breakout cables were all of sufficient length to attach to the backplane. The bottom SATA connectors of the backplane are a little hard to reach with big hands.
Windows Server 2008 R2 will not boot from the SCU. At least we couldn't figure out how to get it to, even using the Intel SCU array drivers included on the CD. You can install to and boot from a SATA RAID 1 with no problems though, so we went with two 500 GB boot drives connected to the MB SATA ports.
You will have to install the Intel NIC drivers
before the Ethernet connections will work in 2008 R2 (no big deal, you can
download the latest online from Intel)
We had a moment of panic when (after installing everything and doing Windows updates) starting up a CentOS 5.2 VM put the server into an infinite blue-screen-of-death crash loop. There is a hotfix available from Microsoft that cleared up the issue, but it made crystal clear that Hyper-V is only half-baked when it comes to Linux support. (At least that's how it feels coming from a VMware setup where almost everything just works for any guest OS.)
I’ll be updating this entry as we move over our
production workload to the server. I’ll probably do a separate entry for the
Veeam replication & backup solution that is necessitated by not using an
enterprise grade SAN solution to store your VM’s on iSCSI targets.
If you have any issues or questions feel free to
drop me a line.
Monday, January 9, 2012
Quasi-realtime filesystem replication using ZFS snapshots
Doing rsync backups of our ESXi NFS share was taking a considerable amount of time (18+ hours). This wouldn't really be an issue except that it impacts performance of the NFS server. Also, your backups end up considerably "stale" (approaching a week old), which makes recovery of critical data questionable.
If you need absolutely immediate replication, you could consider doing a mirror of iSCSI targets on ZFS over a high-speed connection. I've read of people doing that but never had a need to myself. The advantage there is that ZFS would handle resilvering the mirror automagically if the connection to the external iSCSI target dropped for a period of time. This seems a little overkill for our situation, however.
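For the curious, here is a minimal sketch of what that iSCSI-mirror idea might look like on the ZFS side. The pool name and device names are made up, and it assumes the iSCSI initiator has already been pointed at the remote target so the remote LUN shows up as a local disk:
------------------------------------------------
# Hypothetical: mirror a local disk against an iSCSI-backed disk.
# c0t1d0 is a local disk; c2t600144F0ABCD1234d0 is the (made-up) device name
# the remote iSCSI LUN appeared as after initiator configuration.
zpool create tank mirror c0t1d0 c2t600144F0ABCD1234d0

# If the link to the remote target drops and comes back, ZFS resilvers the
# mirror on its own; check progress with:
zpool status tank
------------------------------------------------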
So, I came up with a quick and dirty script that:
makes a snapshot
replicates it
deletes the older (previous) snapshot
As a testing phase, I have it scheduled to run every 5 minutes:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /root/replicate_vms.sh zpool1/nfs/vm/esxi XXX.XXX.XXX.XXX
(IP changed to XXX for security reasons)
You may want to change the location of the log file (/root/replicate_vms.log) to suit your purposes, but you should end up with something like:
root@XXXXXXXXX:/zpool1/nfs/vm# cat ~/replicate_vms.log
zpool1/nfs/vm/esxi@2012-01-09 08:39:41 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:40:44 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:43:16 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:45:01 replicated to XXX.XXX.XXX.XXX successfully.
zpool1/nfs/vm/esxi@2012-01-09 08:50:00 replicated to XXX.XXX.XXX.XXX successfully.
pre-requisites:
replication is one-way only
target filesystem must be set read-only (zfs set readonly=on)
the source server's SSH public key must be in /root/.ssh/authorized_keys on the target server (and the target's host key accepted into the source's known_hosts) so ssh does not prompt for a password
target server must have: "PermitRootLogin yes" in /etc/ssh/sshd_config
source and target server must have "lzop" program in /usr/local/bin (you can download and build it from lzop site)
source and target filesystems must be "primed" by doing the first snapshot and zfs send | zfs receive manually (a minimal example follows this list)
snapshot names must use a YYYY-MM-DD HH:MM:SS timestamp so a plain sort orders them chronologically:
zfs snapshot "<filesystem>@`date "+%Y-%m-%d %H:%M:%S"`"
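Here is a rough sketch of that one-time priming step, using made-up names (zpool1/nfs/vm/esxi as the dataset, 10.0.0.2 as the target, and an example timestamp) in place of your own:
------------------------------------------------
# 1. Take the first snapshot on the source, in the sortable timestamp format
#    the script expects (the snapshot name will contain a space):
zfs snapshot "zpool1/nfs/vm/esxi@`date "+%Y-%m-%d %H:%M:%S"`"

# 2. Send the full stream to the target; quote the snapshot name because of the space.
#    (Substitute whatever timestamp step 1 actually produced.)
zfs send "zpool1/nfs/vm/esxi@2012-01-09 08:00:00" | ssh root@10.0.0.2 "zfs receive zpool1/nfs/vm/esxi"

# 3. Make the target read-only so nothing touches it between incremental receives:
ssh root@10.0.0.2 "zfs set readonly=on zpool1/nfs/vm/esxi"
------------------------------------------------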
possible issues with this strategy:
If the volume of updates made to the filesystem exceeds the connection bandwidth between source and target, there will be no way for it to "keep up" with live updates. I don't expect that to happen unless you are trying to replicate over the internet or a 100 Mbit connection, or perhaps if you have large databases or file servers running in your VMs on the source NFS share.
If you are replicating over the internet and you don't trust your VPN security 100%, you could
add an additional layer of encryption on top of ssh using crypt or some other command line utility
that supports standard input and standard output. ssh + lzop + crypt = pretty darn secure.
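As an illustration only, here is what that extra layer could look like if you swapped the script's send/receive pipeline for one that runs through openssl (standing in for crypt; the passphrase file path is made up and would need to exist on both hosts):
------------------------------------------------
# Hypothetical variant of the script's send line with an extra encryption layer.
# /root/.replication_pass is a made-up passphrase file present on both hosts.
zfs send -i "$last_remote_snapshot" "$new_local_snapshot" \
  | /usr/local/bin/lzop -1c \
  | openssl enc -aes-256-cbc -salt -pass file:/root/.replication_pass \
  | ssh root@$2 "openssl enc -d -aes-256-cbc -pass file:/root/.replication_pass | /usr/local/bin/lzop -dc | zfs receive $1"
------------------------------------------------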
here is the script:
replicate_vms.sh
------------------------------------------------
#!/bin/bash
# replicate_vms.sh <filesystem> <target-ip>
# Takes a new snapshot of $1, sends the increment since the last snapshot the
# target already has, and cleans up the previous snapshot on both sides.
export PATH=/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin

# most recent local snapshot, new snapshot name, and most recent remote snapshot
# (snapshot names sort chronologically because of the timestamp format)
last_local_snapshot="`zfs list -t snapshot -o name | grep "$1" | sort | tail --lines=1`"
new_local_snapshot="$1@`date "+%Y-%m-%d %H:%M:%S"`"
last_remote_snapshot=`ssh root@$2 "zfs list -t snapshot -o name | grep $1" | sort | tail --lines=1`

echo "last previous snapshot: " $last_local_snapshot
echo "new snapshot: " $new_local_snapshot
echo "last remote snapshot: " $last_remote_snapshot

# snapshot, then send the increment compressed with lzop over ssh
zfs snapshot "$new_local_snapshot"
echo "zfs send -i \"$last_remote_snapshot\" \"$new_local_snapshot\" | ssh root@$2 \"zfs receive $1\""
zfs send -i "$last_remote_snapshot" "$new_local_snapshot" | /usr/local/bin/lzop -1c | ssh root@$2 "/usr/local/bin/lzop -dc | zfs receive $1"

# verify the new snapshot arrived on the target before destroying the old ones
new_last_remote_snapshot=`ssh root@$2 "zfs list -t snapshot -o name | grep $1" | sort | tail --lines=1`
if [ "$new_local_snapshot" == "$new_last_remote_snapshot" ]; then
    echo "$new_local_snapshot replicated to $2 successfully." >> /root/replicate_vms.log
    zfs destroy "$last_local_snapshot"
    ssh root@$2 "zfs destroy \"$last_remote_snapshot\""
else
    echo "$new_local_snapshot failed to replicate to $2! ERROR!" >> /root/replicate_vms.log
fi
-----------------------------------------------------