Monthly Archives: April 2008

Bad Form HP… Bad Form Indeed

So, I'm sitting here in my newly reconfigured den/office/computer room/whatever (more on that later) waiting for CD images to download. Multiple CD images. I'm trying to upgrade the firmware for a SATA RAID card I eBay'd a few weeks back to handle the new 2TB RAID array I was building. It seemed to work fine, but after some use, it showed a nasty habit. The firmware seems to be buggy somehow, as I receive messages in my server's dmesg output about the adapter kernel panicking. I found it strange that a RAID card could kernel panic, but it makes sense when you consider the fact that firmware is just software for a hardware device. The card seems to recover itself about 80% of the time, but that other 20% involves hardware hangups, and hard reboots. No fun… especially when this box is providing storage for multiple other machines.

Back to the firmware. I'm trying to download the latest firmware for this card, but HP isn't making it at all easy. Instead of putting a link to a specific file, or set of files, they just have links to their “Firmware Maintenance” CDs, which contain updated firmware for damn near everything they've unleashed upon the world in the past X years worth of product releases. Thing is, there's no documentation anywhere I can find about what each CD covers, what firmware versions are present, or anything like that.

Case in point – I downloaded the newest CD listed on the support page for the RAID card, and it didn't have any firmware for the card listed whatsoever. 650MB downloaded and a CD wasted, all for nothing. I try the oldest CD they have listed, and sure enough, it has firmware on it for my card. I was left wondering if there was a newer firmware somewhere on one of those intermediate CDs though. So here I am, downloading more CD images, and wasting more CDs. Grand! What's really great is that if I want to actually install one of these firmware updates, I need to find three working floppy disks that I can use to actually perform the upgrade because the embedded boot process won't install things automatically. Why? Even though the server is an HP-branded server, it's not of the proper vintage, so the boot loader/installer thing won't work. What a pile of crap.

Now that I'm done venting about stupidity (while downloading yet another CD)…

I've been feeling pretty run down over the past month or so. I attribute it mostly to what I would consider poor sleep. I typically sleep for a good 7-8 hours, but I haven't woken up feeling rested in quite a while. I have one working theory as to why, and it involves the server I'm working on right now, strangely enough. The simple fact is that the thing is loud. It's buried in my walk-in closet with my other server gear, but it still had no issue polluting my bedroom (which is attached to said closet) with gratuitous amounts of noise. It seems like my poor sleep roughly coincides with the introduction of that machine into my closet, so I decided to shift things around a bit to see if it would help. I switched the roles of my two bedrooms – the larger bedroom became my office/computer room/underground lair, and the smaller room became the bedroom. It remains to be seen whether it will help much, but I liked this arrangement better when I used to have it. It also allows me to move the second cat litter box out of the bathroom, which is reason enough to move everything around in my opinion. They've never really been very good at keeping litter off the tile, which sucks. I'm not really a fan of walking on cat litter, especially when I'm fresh out of the shower and my feet are wet.

Longest Post About Xen Ever.

WARNING: Gratuitous technobabble ahead. Moms and non-Linux types may want to just go on with their day and skip this one.

In my last post I alluded to doing some new fun stuff with AoE and Xen. One of the things that I like so much about Xen is its ability to combine the roles typically performed by multiple machines down into one machine. A few things that require specialized hardware, such as my MythTV recording backend machine, are not really feasible to put into virtualized environments, but there are quite a few tasks that work very well. One of my goals with my setups is to move everything I can onto a single “cluster,” with a single machine providing shared storage, and two (or more) machines running Xen environments. That way, I can add new instances with distinct functionality without adding (much) hardware.

One of the roles that has been resistant to my Xen transition is my firewall machine. My home network is divided into three separate logical networks, each with a different purpose. At one point, the firewall machine had four ethernet cards in it – one per network, plus one that connected to the internet. I was able to cut that down to two and eventually one physical interface by acquiring some VLAN-aware network gear. By using VLANs at the network level, I was able to push all four networks through one physical link. Having reached that goal, I knew it was feasible to run my firewall machine in a Xen instance. Actually, I knew it was possible since I’d read about other people doing it, but those stories came with very little in the way of implementation detail. I figured that if I could get the same VLAN information into a Xen instance, my goal would be attainable. In my ideal setup, I would be able to pass VLAN encapsulated traffic to an instance, traffic not encapsulated in a VLAN, or both.
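For anyone wondering how four networks can share one cable, here is a minimal Python sketch of 802.1Q tagging. A 4-byte tag is inserted after the source MAC address, and its low 12 bits carry the VLAN ID. The function names and the sample frame are made up for illustration; real tagging is done by the kernel's 8021q module and the switch, not by code like this.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def tag_frame(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MACs."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    # Tag control info: 3 bits priority, 1 bit DEI (left 0), 12 bits VLAN ID
    tci = (priority << 13) | vlan_id
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


def vlan_of(frame: bytes):
    """Return the VLAN ID of a tagged frame, or None if it is untagged."""
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    return (tci & 0x0FFF) if tpid == TPID_8021Q else None


# Hypothetical untagged frame: zeroed MACs, IPv4 EtherType, dummy payload
untagged = bytes(6) + bytes(6) + b"\x08\x00" + b"payload"
tagged = tag_frame(untagged, vlan_id=10)
```

A frame that arrives without the tag is what I mean by non-encapsulated traffic. A trunk port carries tagged frames for several VLAN IDs at once.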

Getting VLANs to work properly inside of a Xen instance was not as easy as I would have liked, however. I gave it a cursory attempt a few weeks ago, and it failed pretty badly, so I moved on to something prettier and shinier. I didn’t really dig much into why it failed, but it turns out that it was pretty simple. In my initial attempt, I tried to run the standard Xen network-bridge script against a VLAN interface defined by the 8021q kernel module. When I attempted to create the Xen network bridge on this interface, everything seemed to work, but in the end, only the bridge itself was present. The VLAN interface and its associated virtual interface were nowhere to be found. I tweaked a few things, with no change in the end result. I must not have been feeling particularly curious that day, and I let the attempts end in failure.

I decided to give things another shot last weekend. I searched around and found a few other people making the same attempts I was. One guy even published a few scripts that allowed him to connect VLAN interfaces to his network bridges so that each instance could have its own VLAN. While somewhat similar to my goal of passing both encapsulated and non-encapsulated traffic into my Xen instances, it wasn’t exactly the same. I looked over his scripts in the attempt to figure out what they were doing, and then looked at the default scripts that came with Xen. After an hour or two of digging through code, I realized why my previous attempt failed. When the network-bridge script does its thing, it takes an interface, puts it into an inactive state, renames it, gives its old name to one of the virtual ethernet interfaces provided by the netloop kernel module, and then attaches both of the aforementioned interfaces to a newly-created bridge interface. The deactivation of the physical interface is where things went awry. The script makes a call to the ‘ifdown’ script, which deactivates the interface. On a normal physical interface, there isn’t much to do short of just downing the interface, but with a VLAN interface, it actually destroys the interface in the process of disabling it. I had found my linchpin… so I thought.
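The sequence described above can be approximated as a list of shell commands, generated here by a small Python helper. This is a rough sketch, not the actual network-bridge script. The names `peth0`, `veth0`, `vif0.0` and `xenbr0` follow the usual Xen 3 conventions but are assumptions here, as is attaching the `vif0.0` end of the netloop pair to the bridge. The `use_ifdown` switch is a hypothetical illustration of the difference between running the distro's `ifdown` hooks and merely clearing the link's UP flag.

```python
def network_bridge_steps(netdev="eth0", bridge="xenbr0", use_ifdown=True):
    """Return shell commands approximating the bridge setup described above."""
    pdev = "p" + netdev  # new name for the real interface, e.g. peth0
    vdev = "veth0"       # dom0-facing end of a netloop virtual pair (assumed)
    vif = "vif0.0"       # bridge-facing end of the same pair (assumed)

    # ifdown runs the distro's teardown hooks, which for a VLAN interface
    # remove the interface entirely; "ip link set ... down" only marks it down.
    if use_ifdown:
        down = f"ifdown {netdev}"
    else:
        down = f"ip link set dev {netdev} down"

    return [
        down,
        f"ip link set dev {netdev} name {pdev}",  # real interface steps aside
        f"ip link set dev {vdev} name {netdev}",  # virtual one takes its name
        f"brctl addbr {bridge}",
        f"brctl addif {bridge} {pdev}",
        f"brctl addif {bridge} {vif}",
        f"ip link set dev {pdev} up",
        f"ip link set dev {vif} up",
        f"ip link set dev {bridge} up",
        f"ip link set dev {netdev} up",
    ]


if __name__ == "__main__":
    for cmd in network_bridge_steps("eth0.10", "xenbr10", use_ifdown=False):
        print(cmd)
```

The helper only prints commands and never runs them, so it is safe to poke at on any machine.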

I whipped up a quick patch to alter the behavior of the network-bridge script so it wouldn’t make the call to ifdown, which preserved the VLAN interface. I did a few tests, and things worked as I felt they should. I set up all of the VLAN interfaces in the domain0 environment, configured the network-bridge script to create a Xen bridge for each of them, and then let it rip. All of the interfaces that needed to be renamed were renamed, all the proper bridges were created, and everything was connected where it should have been connected. Everything looked great – except it completely didn’t work. After I changed the network port on my test machine to trunk all VLANs instead of just utilizing my main VLAN, everything stopped working, even though I had the right network stuff configured on the VLAN interfaces in the domain0 environment. I did some simple ping/tcpdump tests, and everything looked right, except the ping traffic was never making it into the dom0 like it should have. Outbound pings were visible in the proper VLAN interface, they egressed through the proper physical interface, made it to the destination machine, came back in through the proper physical interface… and then disappeared. The packets never made it back to the “physical” VLAN interface in the dom0, which prevented any kind of success.

After a good amount of expletives and some pacing around the apartment, I had an idea. My broken setup was connecting two separate types of virtual interfaces to the physical ethernet device in the whole initialization process – the VLAN interfaces and the bridge for trunked VLAN traffic. I decided to see if my missing packets were entering the bridge instead of the VLAN interface, and sure enough, they were. I had found another linchpin.

Seeing as how the traffic flow seemed to favor the bridge over the VLAN interfaces, I thought I was stuck. If the traffic is going into the bridge, it’s game over for the non-VLAN-encapsulated traffic, right? Wrong. The whole purpose of the network-bridge script is to masquerade the physical interface, replace it with a virtual one that is only visible to the dom0 environment, and connect them both to a bridge that is used to pass data in and out of the dom0 and domU environments. The virtual interface acts as a normal interface in the dom0, so I figured it would be worth a try to configure the VLAN interfaces on the virtual interface instead of the physical one, and then build their bridges from that. I was a bit pessimistic at this point and really didn’t expect it to work, but it sure did. Traffic started flowing properly to the dom0 environment, which meant the domUs would probably work too. I brought up a test environment, and all of the networking stuff worked as intended. It could see both encapsulated and non-encapsulated VLAN traffic.
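The layout that ended up working can be sketched roughly like this: VLAN interfaces stacked on the dom0's virtual `eth0` (the one left behind after the main bridge is built), with a plain bridge per VLAN. The VLAN IDs 10, 20 and 30 and the `xenbrNN` bridge names are placeholders, and the real setup may well have built its bridges differently. The helper only generates `ip` and `brctl` command strings.

```python
def vlan_bridge_commands(parent="eth0", vlan_ids=(10, 20, 30)):
    """Return commands that stack VLAN interfaces on `parent`, one bridge each."""
    cmds = []
    for vid in vlan_ids:
        vlan_if = f"{parent}.{vid}"  # e.g. eth0.10
        bridge = f"xenbr{vid}"       # hypothetical bridge naming scheme
        cmds += [
            # Create the VLAN interface on the dom0's virtual eth0,
            # not on the renamed physical interface.
            f"ip link add link {parent} name {vlan_if} type vlan id {vid}",
            f"brctl addbr {bridge}",
            f"brctl addif {bridge} {vlan_if}",
            f"ip link set dev {vlan_if} up",
            f"ip link set dev {bridge} up",
        ]
    return cmds


if __name__ == "__main__":
    print("\n".join(vlan_bridge_commands()))
```

A domU attached to one of the per-VLAN bridges would see only that VLAN's traffic with the tag already stripped, while one attached to the main bridge would see the whole trunk.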

It was Miller Time at that point, and pretty late to boot. I had my celebratory beer and hit the sack soon after. I waited until this past weekend to attempt my goal of moving my firewall setup into a Xen instance. I’ve become pretty good at moving Linux environments between physical machines and virtual ones, so the actual move was pretty painless. Once everything was ready, I stopped the networking stuff on my physical firewall machine and brought it up in the virtual one. The only issue to speak of was my cable modem locking in on the MAC address of the ethernet card in my physical firewall machine. I changed the virtual MAC address inside of the firewall Xen instance to match the physical firewall interface, and everything fell into place. I now have a firewall instance that I can migrate between my Xen host machines with no interruption. Great success!
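One way to pin a MAC address on a guest interface is the `vif` line of the domain's config file, which is itself Python syntax. The sketch below builds such an entry. The MAC `00:11:22:33:44:55` and the bridge name `xenbr0` are placeholders, and `vif_entry` is a made-up helper, not part of Xen. The MAC can also be changed from inside the guest, which may be closer to what I actually did.

```python
import re

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def vif_entry(bridge, mac=None):
    """Build one entry for a Xen domain config's `vif` list."""
    parts = []
    if mac is not None:
        mac = mac.lower()
        if not MAC_RE.match(mac):
            raise ValueError(f"not a MAC address: {mac!r}")
        parts.append(f"mac={mac}")
    parts.append(f"bridge={bridge}")
    return ", ".join(parts)


# Placeholder MAC standing in for the old physical firewall's card
vif = [vif_entry("xenbr0", mac="00:11:22:33:44:55")]
```

Keeping the MAC in the config means it follows the instance when it migrates between hosts.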

I know I promised diagrams, but I’m not going to put them here. This post is already long enough as it is. I plan on making a wiki entry for the actual technical details (not just the technical description) of the process, so stay tuned for that. My diagrams will be in there.

Update From the Team of One

It seems that posting here regularly is getting more and more problematic. It seems as though I’m busy a lot of the time, but most of the stuff I’m busy with is mundane and would be boring to talk about, or it’s stuff that shouldn’t be talked about in a public forum. I’ve been a lot busier with human-oriented things at work as opposed to the purely technical side of things. It’s definitely taking a while to get accustomed to. I still lack team members for my group, but that’s my fault as much as anybody else’s. I still need to come up with a list of qualifications that are required/desired for people interested in joining the team. It seems easy enough to do, but every time I think of it, I come up with more and more items to put on that list. It may be a never-ending task… 🙂

Other than work, there isn’t a whole lot going on. I’ve been doing some more playing with AoE/Xen at home, and have succeeded in getting some things working that should open some doors regarding what I can do with those tools, but that’s a subject for another post… One with drawings and diagrams. 🙂

Oh yah… It’s getting warm! Yay for spring!