iSCSI Target Server Choices

I manage a small set of Citrix XenServer hosts for various infrastructure functions. For storage, I’ve been running Openfiler for about three years now; since the last reboot, my uptime is 1614 days! It’s pretty solid, but the interface seems buggy, and there’s a lot in there I don’t use. When I do need to go change something, it’s been so long between uses that I have to re-read documentation to figure out what the heck it’s doing. I’ve got a new XenServer cluster coming online soon, and I’ve been researching, thinking, and dreaming about what I’m going to use for VM storage this time.

Openfiler really has been mostly great. My server load sits around 1.13 constantly, which somewhat bugs me, mostly due to conary (its package manager) running. Openfiler is almost never updated, which isn’t a bad thing, since the machine is inside our firewall without internet access unless I set a specific NAT rule for it. I’m running it on an old Dell 310 server with two 2TB drives in RAID1; it has 4GB of RAM and boots from the same drives Openfiler works its magic on (this server was originally implemented as a quick fix to get us off local Xen storage so we could do rolling restarts). That’s not a problem, but now, three years later, I notice the latest version IS THE SAME version I have installed and have been running for the last 1614 days… So maybe it’s time to find something new.

So I built out a nice Dell 530 server: dual 16GB flash cards, dual 120GB write-intensive SSDs, a bunch of 2TB SATA drives, dual six-core procs, 32GB of RAM, dual power supplies, and a nice RAID card. The system arrived, and I had a lot of good feedback on NAS4Free, both online (googling, lots of reddit threads) and even in-person recommendations. I was pretty excited about it, honestly. I’m a little unfamiliar with FreeBSD, but I have used it on and off in my now 20-year Linux career. I went ahead and installed it to the 16GB flash, as recommended, disabled RAID on the server, and set up all the drives as plain SATA. I booted the system and got rolling. It was really simple, seems easy to use, and does WAY more than I could actually want in a storage device. I set up a big LUN with ZFS and iSCSI, added the write-intensive SSDs as cache (the rough zpool equivalent is sketched after the list below), installed all the recent updates, and was ready… Then I read the documentation a bit.

  • iSCSI can’t make use of an SSD write cache… Well, I guess I get an all-SSD LUN.
    • “A dedicated log device will have no effect on CIFS, AFP, or iSCSI as these protocols rarely use synchronous writes.”
  • Don’t use more than 50% of your storage space with ZFS and iSCSI… WHAT?
    • “At 90% capacity, ZFS switches from performance- to space-based optimization, which has massive performance implications. For maximum write performance and to prevent problems with drive replacement, add more capacity before a pool reaches 80%. If you are using iSCSI, it is recommended to not let the pool go over 50% capacity to prevent fragmentation issues.”
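For reference, the pool I clicked together in the NAS4Free GUI amounts to roughly this at the shell; just a sketch, with placeholder device names, pool name, and zvol size rather than my real layout:

# 2TB SATA drives as mirror pairs, SSDs as L2ARC read cache and a log device (SLOG)
zpool create tank mirror ada0 ada1 mirror ada2 ada3 cache ada4 log ada5
# zvol to hand out as the iSCSI LUN
zfs create -V 1T tank/xen_lun0

Per the quote above, that log device does nothing for iSCSI, which is exactly the problem.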

So, this was some sad news: no write caching, and I can’t use more than 50% of my disk space. But I decided to press on, and I went home for the night. The next morning I got a friendly email from my new server saying it had some critical updates. Cool, I thought, so I installed them, and now it wants to reboot. So I let NAS4Free reboot. Two days later, more critical updates and another reboot required… That’s a bad thing for me. I run servers that really need to be up 24/7/365. Yes, we run everything clustered and redundant, and we can reboot a server without anyone noticing, but not the entire storage device; that kills the point of keeping my VMs up. It’s still workable, because we have a second VM cluster, which holds “the sister machines” to all our cluster nodes. I just don’t want to have to fully shut down a VM cluster so the storage host can reboot once or twice a week. Kudos to the NAS4Free guys, though; it’s a really good thing they are so active. It’s just not going to be the device for me.

So, I ripped it apart. I created two RAID1 SSD sets and a RAID10 set out of the 2TB drives, and installed my best friend, Debian. Debian is rock solid; I only need to reboot for kernel updates, and those are very few. I installed iscsitarget, set up my block devices using LVM, and bam! Within 30 minutes I had an iSCSI target set up and connected to Xen.
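I’ll save the full write-up for the next post, but the whole thing really is only a handful of commands. A rough sketch of what it looked like; the device name, VG/LV names, sizes, and IQN here are placeholders rather than my production values:

# Carve the RAID10 array up with LVM
pvcreate /dev/sdb
vgcreate vg_xen /dev/sdb
lvcreate -L 1T -n xen_lun0 vg_xen

# Export the LV with iscsitarget (IET): /etc/iet/ietd.conf
Target iqn.2014-01.local.storage:xen.lun0
    Lun 0 Path=/dev/vg_xen/xen_lun0,Type=blockio

Plus ISCSITARGET_ENABLE=true in /etc/default/iscsitarget and a service restart. Type=blockio keeps IET doing direct block I/O instead of going through the page cache, which is generally what you want when the VMs on the initiator side do their own caching.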

Reliability? I see a lot of ZFS fanboys claiming that hardware RAID sucks, ZFS is awesome, good luck recovering your data, and so on. I really haven’t had problems with RAID in the 15+ years I’ve been using it. We buy vendor-supported hardware; if something dies, Dell sends me a new one. I back up onsite and offsite, and I haven’t had to restore from a backup (other than testing restores) in years. I think this will all be okay.

Next article, I’ll write about setting up my iSCSI target; there aren’t many decent articles out there on it. It’s really pretty simple. I even have multipath I/O working.

LeatherCraft 1539 18-Inch Multi-Compartment Tool Carrier

I’m constantly struggling to keep my network tool bag organized. It’s great right after I organize it, but it only takes one job for it to be completely disorganized again.

I have a ton of tools and carry everything I need to install network cabling, turn up T1 lines, and troubleshoot analog lines; I can even do basic electrical work with what’s in my bag. It’s exploding with stuff, since I carry everything I could possibly need.

So I’ve been looking into a new bag.

The one I’m probably going to settle on is the Custom LeatherCraft 1539 18-inch Multi-Compartment Tool Carrier. It looks like it’ll do the trick. My current bag is a “bucketmouth” type bag, so there’s not much organization, and it’s so full of stuff I can’t zip it, so maybe this one will be great!

Kamailio – Changing the From URI for Level3

So Level3 uses the E.164 recommendation for sending caller information, which means the calling number arrives with a + prefix. The problem is that a common desk phone (Polycom/Cisco/Yealink/Aastra) will either try to make an IP call to a number like that or just fail; it seems like only cell phones handle the + character in a number.

So to keep that plus out of the network, I added the following code to my kamailio.cfg to “filter” the + out of the From URI before the call heads on to the phones.

# Stash the original From URI, then strip the +1 prefix Level3 sends
$avp(s:from) = $fu;
$avp(s:from) = $(fu{re.subst,/\+1//g});
# If the substitution left us with an empty value, fall back to the original From URI
if ($(avp(s:from){s.len}) == 0) { $avp(s:from) = $fu; }
uac_replace_from("$avp(s:from)");
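One note for anyone copying this: uac_replace_from() comes from the uac module, so the config also needs the module loaded and a restore mode chosen. Roughly (with "auto", the original From is restored automatically on in-dialog requests):

loadmodule "uac.so"
modparam("uac", "restore_mode", "auto")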

Maybe there is a better way, but this is working in production. Let me know if anyone has a better method!

Setting up Memcached for HTML::Mason

Updated the corporate website today to include memcached. It was hitting our legacy application’s MSSQL database (which we still have to use) a ton and slowing down the *choke* Windows application.

Anyway, memcached saved the day! Way fewer hits on the database, and it only took a few simple hooks to implement. I know I could have used Mason’s built-in cache, but that isn’t distributed, so the other web servers wouldn’t share it.

We use HTML::Mason for the site, so just a few simple hooks did the job.
1) Preloaded the Cache::Memcached module into my mod_perl (sketch below).
2) Most of the website is driven off part-number lookups. Even non-parts are actually parts in our database; they just have content associated with them. So the part-retrieve Mason page is where the caching hooks went.
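For step 1, the preload just pulls the module in at Apache startup so every mod_perl child has it compiled. A minimal sketch, assuming a standard startup.pl pulled in with PerlRequire (the path is only an example):

# startup.pl, loaded via something like "PerlRequire /etc/apache2/startup.pl"
use Cache::Memcached ();

Then, in the part-retrieve component, the first hook creates the client: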

# One client pointed at both memcached nodes, so every web server shares the same cache
my $memd = new Cache::Memcached {
    'servers' => [ "10.10.1.44:11211", "10.10.1.40:11211" ],
};

I get a $pn variable in from everywhere else, so I check for its existence in the cache:
$mPart = $memd->get($pn);

Then I just wrapped my standard DB call: if the part isn’t cached, pull it from the database and set it in memcached; otherwise, use the cached copy.
if (!$mPart) {
    # Cache miss: pull from the database, then stash it in memcached
    $partList = $dbh->selectrow_arrayref(
        "SELECT blablabla FROM priceBook WHERE itemID = ?", undef, $pn);
    $memd->set($pn, $partList, 600); # Expire cache at 10 minutes (600 seconds).
} else {
    # Cache hit: use the copy from memcached
    $partList = $mPart;
}

Fully IPv6 at home

Finally finished out the IPv6 rollout at the house. The tunnelbroker.net tunnel from my 2801 to he.net is up, and the domain hosted here is fully IPv6, with email, rDNS, and web all going well. The IRC client is running v6 too! I’m only using a handful of addresses out of the massive /64 they assigned.
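For anyone curious, the router side is just the stock protocol-41 tunnel that tunnelbroker.net hands out; a sketch with placeholder addresses (the real values come straight off the tunnel details page):

interface Tunnel0
 description HE.net IPv6 tunnel (tunnelbroker.net)
 no ip address
 ipv6 enable
 ipv6 address 2001:470:XXXX:XXXX::2/64
 tunnel source <my public IPv4>
 tunnel destination <HE tunnel server IPv4>
 tunnel mode ipv6ip
!
ipv6 route ::/0 Tunnel0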

Passed the he.net certification too! As of tonight I’m at the highest level you can get; there are only 1,207 people certified at that level, out of probably 12,000 total working on certs.

Now for the hard part: dual-stacking the office, the datacenter, and all of our customers! The crappy thing is that the majority of our customers are on old routing hardware with no future for v6, so at some point we’ll have to spend some money.

Progress at work: my BGP peer with he.net is up for announcing our /32. Our upstream providers won’t be giving us native IPv6 until the middle of next year, so tunneling is all we’ve got, unless we add more bandwidth from another upstream provider that has native IPv6… a possibility. Need quotes!
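For reference, the config for that peering is pretty minimal; a hedged sketch with a placeholder ASN, prefix, and peer address (6939 is HE’s AS, and the static null route just gives the network statement something to match):

ipv6 route 2001:db8::/32 Null0
!
router bgp 64496
 no bgp default ipv4-unicast
 neighbor 2001:470:XXXX::1 remote-as 6939
 !
 address-family ipv6
  neighbor 2001:470:XXXX::1 activate
  network 2001:db8::/32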

Freight Forwarder IT Workers Google Group

I’m basically just adding this so Google will pick it up. I’ve made a Google group for IT workers, employees, and managers in the freight forwarding, NVOCC, and import/export field to talk about IT-related happenings in the industry and to share ideas. Right now, there isn’t a forum or group like this anywhere.

It would be nice to see how other people have solved common problems, etc.
