iSCSI Target Server Choices

I manage a small set of Citrix XenServer hosts for various infrastructure functions. For storage, I've been running Openfiler for about 3 years now; since the last reboot, my uptime is 1614 days! It's pretty solid, but the interface seems buggy, and there's a lot in there I don't use. When I do need to go change something, it's been so long between uses that I have to re-read the documentation to figure out what the heck it's doing. I've got a new XenServer cluster coming online soon, and I've been researching, thinking, and dreaming about what I'm going to use for VM storage this time.

Openfiler really has been mostly great. My server load always runs about 1.13, which somewhat bugs me, mostly due to Conary (its package manager) running. Openfiler is almost never updated, which isn't a bad thing, since the machine is inside our firewall without internet access unless I set a specific NAT rule for it. I'm running it on an old Dell 310 server with two 2TB drives in RAID1; it's got 4GB of RAM and boots from the same drives Openfiler runs its magic on (this server was originally implemented as a quick fix, to get us off local Xen storage so we could do rolling restarts). That's not a problem, but now, 3 years later, I notice the latest version IS THE SAME version I have installed and have been running for the last 1614 days… So maybe it's time to find something new.

So I built out a nice Dell 530 server: dual 16GB flash cards, dual 120GB write-intensive SSDs, a bunch of 2TB SATA drives, dual six-core procs, 32GB of RAM, dual power supplies, and a nice RAID card. The system arrived, and by then I had gotten a lot of good feedback on NAS4Free, both online (Googling, lots of Reddit threads) and even in-person recommendations. I was pretty excited about it, honestly. I'm a little unfamiliar with FreeBSD, but I've used it on and off in my now 20-year Linux career. I went ahead and installed it to the 16GB flash, as recommended, disabled RAID on the server, and set up all the drives as plain SATA. I booted the system and got rolling. It was really simple, seems easy to use, and does WAY more than I could ever actually want in a storage device. I set up a big LUN with ZFS and iSCSI, added the write-intensive SSDs as cache (more on cache vs. log devices after the list below), installed all the recent updates, and was ready… Then I read the documentation a bit.

  • iSCSI can't make use of an SSD write cache… Well, I guess I get an all-SSD LUN.
    • “A dedicated log device will have no effect on CIFS, AFP, or iSCSI as these protocols rarely use synchronous writes.”
  • Don't use more than 50% of your storage space with ZFS and iSCSI… WHAT?
    • “At 90% capacity, ZFS switches from performance- to space-based optimization, which has massive performance implications. For maximum write performance and to prevent problems with drive replacement, add more capacity before a pool reaches 80%. If you are using iSCSI, it is recommended to not let the pool go over 50% capacity to prevent fragmentation issues.”
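
For context: in ZFS terms, a "cache" device is L2ARC, which only helps reads; the dedicated "log" device (SLOG) from the quote above is what would help synchronous writes, and iSCSI rarely issues those. A rough sketch of the difference, with pool and device names made up for illustration:

zpool create tank mirror ada0 ada1 mirror ada2 ada3 # striped mirrors from the SATA drives
zpool add tank cache ada4 ada5 # SSDs as L2ARC: read cache only
zpool add tank log mirror ada4 ada5 # or instead as a mirrored SLOG for sync writes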

So, this was some sad news: no write caching, and I can't use more than 50% of my disk space. But I decided to press on, and went home for the night. The next morning I got a friendly email from my new server saying it had some critical updates. Cool, I thought, so I installed the updates; now it wants to reboot. So I let NAS4Free reboot. Two days later, more critical updates and another required reboot… This is a bad thing for me. I run servers that really need to be up 24/7/365. Yes, we run everything clustered and redundant, and we can reboot a server without anyone noticing, but not the entire storage device; that kills the point of having my VMs all stay up. This is still okay, because we have a second VM cluster, which has “the sister machines” to all our cluster nodes going into it. I just don't want to have to fully shut down a VM cluster so the storage host can reboot once or twice a week. Kudos to the NAS4Free guys, though; it's a really good thing they are so active. It's just not going to be the device for me.

So, I ripped it apart. I created a 2x RAID1 SSD set and a RAID10 set out of the 2TB drives, and installed my best friend, Debian. Debian is rock solid; I only need to reboot for kernel updates, and those are few. I installed iscsitarget, set up my block devices using LVM, and bam! Within 30 minutes I had an iSCSI target set up and connected to Xen. A rough sketch of that is below.
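
Here's roughly what the iscsitarget side looks like. This is a sketch, not my actual config: the volume group, LV, size, and IQN below are made up for illustration, and on Debian the IET config commonly lives at /etc/iet/ietd.conf.

# Carve a block device out of LVM:
# lvcreate -L 500G -n xen_sr vg0
#
# /etc/iet/ietd.conf: export that logical volume as a target.
Target iqn.2014-01.com.example:storage.xen-sr
Lun 0 Path=/dev/vg0/xen_sr,Type=blockio

Restart iscsitarget and the Xen hosts can discover and log in to that IQN.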

Reliability? I see a lot of ZFS fanboys touting that hardware RAID sucks, ZFS is awesome, good luck recovering your data, etc. I really haven't had problems with RAID in the 15+ years I've been using it. We buy vendor-supported hardware; if something dies, Dell sends me a new one. I back up onsite and offsite, and I haven't had to restore from a backup (other than testing restores) in years. I think this will all be okay.

Next article, I'll write about setting up my iSCSI target, since there aren't many decent articles out there. It's really pretty simple. I even have multipath I/O working.

Setting up Memcached for HTML::Mason

Updated the corporate website today to include memcached. It was hitting our legacy application's MSSQL database (which we still have to use) a ton, and slowing down the *choke* Windows application.

Anyway, memcached saved the day! Way fewer hits on the database, and it only took a few simple hooks to implement! I know I could have used Mason's cache, but that isn't distributed; it would live only on this web server, not shared across the others.

We use HTML::Mason for the site, so just a few simple hooks did the job.
1) Preloaded the Cache::Memcached module into my mod_perl (a sketch of this follows the list).
2) Most of the website is driven off part-number lookups. Even non-parts are actually parts in our database; they just have content associated with them. So in the part-retrieve Mason page, I added a line to load up memcached.
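
For step 1, the preload might look something like this; the startup.pl path is just an example, so check your Apache config for where yours actually lives:

# In httpd.conf: PerlRequire /etc/apache2/startup.pl
# In startup.pl, preload once in the parent so all the children share it:
use Cache::Memcached ();

And here's the line from step 2, in the part-retrieve page: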

# Connect to both memcached boxes; keys get hashed across the server list.
my $memd = Cache::Memcached->new({
    'servers' => [ "10.10.1.44:11211", "10.10.1.40:11211" ],
});

I get a $pn variable in from all the other places, so I check for its existence in the cache first.

my $mPart = $memd->get($pn); # undef on a cache miss

Then I just add a hook around my standard DB call: do the pull and a set on a cache miss, and assign from the cache if we hit the else.

my $partList;
if (!$mPart) {
    # Cache miss: pull from the DB, binding $pn instead of interpolating it.
    $partList = $dbh->selectrow_arrayref(
        "SELECT blablabla FROM priceBook WHERE itemID = ?", undef, $pn);
    $memd->set($pn, $partList, 600); # Expire cache at 10 minutes (600 seconds).
} else {
    # Cache hit: use the stored arrayref as-is.
    $partList = $mPart;
}

Wednesday Server Moves

Gosh, the server this site and a few others are on is pretty worn out. It's an old P4 1.8GHz with 1GB of RAM, stuck in a 4U chassis, and it's running some pretty old software.

Time to upgrade. A Dell 2950 is almost ready to go; hopefully tomorrow I'll swap it out. I won't move this site to it yet, of course, as I'll have tons of work moving over all my content.