Xen + AoE + drbd = New Redundant Hotness
A few weeks ago, my buddy Justin posed an interesting problem, one that I've been pondering myself for some time. He's somewhat of a Xen zealot like myself, and is doing some Xen setups that are similar in construction to mine, with a central shared storage array and two or more dom0 machines where the child instances will live. The prospect of migrating domUs between dom0s is quite appealing to him, but he, like myself, realized a critical flaw in the setup. If the storage array fails or requires uptime-affecting maintenance of some sort, the whole setup grinds to a halt. That doesn't really fit the goals he and I are both after.
After a bit of thought, I looked to a project Justin had mentioned a few months back called drbd. Its designers deem it "network raid 1", and that's a pretty accurate description. It's essentially a system that mirrors data between two different machines either in an active-standby or active-active configuration. One of its primary goals is to provide storage as close to 100% of the time as is possible. Its usefulness would vary widely depending on the application. Having a normal file system on it shared between two machines could be nothing short of a nightmare, since neither server knows what the other is doing until changes are already written to the shared storage. A clustered file system would work well with it though. As I began to learn more about how it works, I realized it could potentially be a great solution for my predicament. Since either of the two machines could provide storage at any given time, it would have no problem fulfilling the near 100% uptime requirement.
What really makes the solution stand out to me isn't just drbd itself, but the combination of drbd and AoE. AoE is, by design, a connectionless protocol. When the kernel module is loaded, it does a device discovery to see what devices are available for its use, and listens thereafter for new devices. The information it learns is pretty much limited to a MAC address where the storage device is located and the vblade "addresses" within that device that are available. There's nothing within the protocol that forbids multiple targets from advertising the same vblade "address", and it's up to the AoE initiator in the kernel module to choose where it's sending data. Because of this, you could have two linux vblade targets running on both "ends" of a drbd setup, and there'd be no conflicts whatsoever. The recommended setup in drbd is to consider a write operation as finished only when data has been written to disk on both "ends" of the drbd setup. Combine that with the fact that AoE will only send commands to one MAC address at a time, and it's pretty much guaranteed that both vblade targets will be connected to the same data at all times, even though they're on different machines. I can think of a scenario or two where data would be out of sync, but it would require that disk write operations be done in parallel, and I'm fairly certain that they aren't.
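To make the idea concrete, here is a toy Python model of the behavior described above. It is only a sketch of the reasoning, not how drbd or the AoE driver is implemented: the class names, the `vblade_running` flag and the MAC addresses are all made up for illustration. A write counts as done only after both ends hold the data, and the initiator talks to one target at a time, rediscovering when that target goes away.

```python
class StorageHost:
    """One end of the mirror: a machine holding its own copy of every block."""

    def __init__(self, mac):
        self.mac = mac
        self.blocks = {}            # block number -> data on this machine's disk
        self.vblade_running = True  # is this machine exporting the device?


class Mirror:
    """Toy synchronous mirror: a write is finished only once both ends hold it."""

    def __init__(self, first, second):
        self.hosts = [first, second]

    def write(self, block_no, data):
        for host in self.hosts:
            host.blocks[block_no] = data
        return True  # acknowledged only after the loop has hit both ends


class Initiator:
    """Toy AoE client: talks to one MAC at a time, rediscovers when it vanishes."""

    def __init__(self, mirror):
        self.mirror = mirror
        self.target = None

    def discover(self):
        exporting = [h for h in self.mirror.hosts if h.vblade_running]
        if not exporting:
            raise RuntimeError("no vblade target is advertising the device")
        self.target = exporting[0]

    def _ensure_target(self):
        if self.target is None or not self.target.vblade_running:
            self.discover()

    def write(self, block_no, data):
        self._ensure_target()
        self.mirror.write(block_no, data)

    def read(self, block_no):
        self._ensure_target()
        return self.target.blocks[block_no]


array = StorageHost("00:00:00:00:00:01")   # made-up MAC addresses
mythbox = StorageHost("00:00:00:00:00:02")
client = Initiator(Mirror(array, mythbox))

client.write(7, b"domU data")
array.vblade_running = False               # kill the target the client was using
print(client.read(7), "served by", client.target.mac)
```

Because the mirror has already put the block on both machines before the write returns, the read after the failover sees the same data no matter which target answers.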
The fact that the same data is on both machines and that AoE allows for a quick and painless transfer between vblade targets is what makes this such a simple and effective solution for me. There may be a few seconds of lag while the AoE initiator realizes that the machine it was talking to has disappeared, but that will pass as soon as it does another device discovery and sees the other vblade target. This is perfectly acceptable in my usage scenarios thus far.
I took the plunge a week or so ago and started converting my storage at home to use drbd. It was pretty simple to convert from my LVM-based setup, since all I had to do was create one additional LVM partition for every partition I wanted to sync between machines. These additional LVM partitions store the metadata that drbd uses to track changes and to keep things in sync. This configuration also allows me to revert back to using "naked" LVM partitions as my vblade storage targets if I decide I don't like drbd in the future. I used my MythTV recording backend as the second drbd server, since it has a lot of space for extra drives and is on pretty much all the time. I put in a 120GB drive, and let everything fly. Once the initial synchronization was complete, I did a few tests, and everything worked as intended. I could kill vblade targets on either machine, and after a few seconds, the initiators would look at the other machine and use it for storage. Success!
As of this weekend I've also converted my setup at work to use the same general configuration. The primary storage consists of a big RAID array, with a secondary machine using a single drive as a backup. I figure that in most cases running in an active-active setup wouldn't be necessary, so I'm going to stick with active-standby, and only start the vblade targets on the secondary machine when I'm planning on a reboot or other maintenance event. I've also considered running in an active-off state (with periodic resyncs), so that there wouldn't be any performance hit from waiting for the second server to complete its writes. This would probably be a less desirable setup since the data could (and very likely would) be out of date if I were to suffer an unplanned outage such as a hardware failure. Nothing I run currently is terribly needy in terms of disk write performance, so I'm not terribly concerned at this point.
Longest Post About Xen Ever.
WARNING: Gratuitous technobabble ahead. Moms and non-linux types may want to just go on with their day and skip this one.
In my last post I alluded to doing some new fun stuff with AoE and Xen. One of the things that I like so much about Xen is its ability to combine the roles typically performed by multiple machines down into one machine. A few things that require specialized hardware, such as my MythTV recording backend machine, are not really feasible to put into virtualized environments, but there are quite a few tasks that work very well. One of my goals with my setups is to move everything I can onto a single "cluster," with a single machine providing shared storage, and two (or more) machines running Xen environments. That way, I can add new instances with distinct functionality without adding (much) hardware.
One of the roles that has been resistant to my Xen transition is my firewall machine. My home network is divided into three separate logical networks, each with a different purpose. At one point, the firewall machine had four ethernet cards in it - one per network, plus one that connected to the internet. I was able to cut that down to two and eventually one physical interface by acquiring some VLAN-aware network gear. By using VLANs at the network level, I was able to push all four networks through one physical link. Having reached that goal, I knew it was feasible to run my firewall machine in a Xen instance. Actually, I knew it was possible since I'd read about other people doing it, but those stories came with very little in the way of implementation detail. I figured that if I could get the same VLAN information into a Xen instance, my goal would be attainable. In my ideal setup, I would be able to pass VLAN encapsulated traffic to an instance, traffic not encapsulated in a VLAN, or both.
Getting VLANs to work properly inside of a Xen instance was not as easy as I would have liked, however. I gave it a cursory attempt a few weeks ago, and it failed pretty badly, so I moved on to something prettier and shinier. I didn't really dig much into why it failed, but it turns out that it was pretty simple. In my initial attempt, I tried to run the standard Xen network-bridge script against a VLAN interface defined by the 8021q kernel module. When I attempted to create the Xen network bridge on this interface, everything seemed to work, but in the end, only the bridge itself was present. The VLAN interface and its associated virtual interface were nowhere to be found. I tweaked a few things, with no change in the end result. I must not have been feeling particularly curious that day, and I let the attempts end in failure.
I decided to give things another shot last weekend. I searched around and found a few other people making the same attempts I was. One guy even published a few scripts that allowed him to connect VLAN interfaces to his network bridges so that each instance could have its own VLAN. While somewhat similar to my goal of passing both encapsulated and non-encapsulated traffic into my Xen instances, it wasn't exactly the same. I looked over his scripts in the attempt to figure out what they were doing, and then looked at the default scripts that came with Xen. After an hour or two of digging through code, I realized why my previous attempt failed. When the network-bridge script does its thing, it takes an interface, puts it into an inactive state, renames it, gives its old name to one of the virtual ethernet interfaces provided by the netloop kernel module, and then attaches both of the aforementioned interfaces to a newly-created bridge interface. The deactivation of the physical interface is where things went awry. The script makes a call to the 'ifdown' script, which deactivates the interface. On a normal physical interface, there isn't much to do short of just downing the interface, but with a VLAN interface, it actually destroys the interface in the process of disabling it. I had found my linchpin... so I thought.
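The failure mode is easier to see in a toy model. The Python sketch below only mimics the sequence of steps described above on a plain dictionary of interfaces; it is not the real network-bridge script, and the function names, the `kind` field and the "p" prefix used for the renamed interface are assumptions made for illustration.

```python
def ifdown(interfaces, name):
    """Toy ifdown: a plain NIC is just marked down, a VLAN interface is destroyed."""
    iface = interfaces[name]
    if iface["kind"] == "vlan":
        del interfaces[name]
    else:
        iface["up"] = False


def network_bridge(interfaces, name, bridge):
    """Toy version of the bridge setup: deactivate, rename, swap in a virtual
    interface under the old name, then attach both to a new bridge."""
    ifdown(interfaces, name)
    members = []
    if name in interfaces:
        # Rename the real interface out of the way and bring it back up.
        interfaces["p" + name] = interfaces.pop(name)
        interfaces["p" + name]["up"] = True
        # A virtual interface takes over the old name.
        interfaces[name] = {"kind": "virtual", "up": True}
        members = ["p" + name, name]
    # The bridge gets created either way, even with nothing left to attach.
    interfaces[bridge] = {"kind": "bridge", "up": True, "members": members}
    return interfaces


plain = network_bridge({"eth0": {"kind": "physical", "up": True}}, "eth0", "xenbr0")
tagged = network_bridge({"eth0.10": {"kind": "vlan", "up": True}}, "eth0.10", "xenbr10")
print(sorted(plain))
print(sorted(tagged))
```

With a physical interface the bridge ends up with two members. With a VLAN interface the toy ifdown removes it before the rename step, so only an empty bridge is left, which matches what the first attempt looked like.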
I whipped up a quick patch to alter the behavior of the network-bridge script so it wouldn't make the call to ifdown, which preserved the VLAN interface. I did a few tests, and things worked as I felt they should. I set up all of the VLAN interfaces in the domain0 environment, configured the network-bridge script to create a Xen bridge for each of them, and then let it rip. All of the interfaces that needed to be renamed were renamed, all the proper bridges were created, and everything was connected where it should have been connected. Everything looked great - except it completely didn't work. After I changed the network port on my test machine to trunk all VLANs instead of just utilizing my main VLAN, everything stopped working, even though I had the right network stuff configured on the VLAN interfaces in the domain0 environment. I did some simple ping/tcpdump tests, and everything looked right, except the ping traffic was never making it into the dom0 like it should have. Outbound pings were visible in the proper VLAN interface, they egressed through the proper physical interface, made it to the destination machine, came back in through the proper physical interface... and then disappeared. The packets never made it back to the "physical" VLAN interface in the dom0, which prevented any kind of success.
After a good amount of expletives and some pacing around the apartment, I had an idea. My broken setup was connecting two separate types of virtual interfaces to the physical ethernet device in the whole initialization process - the VLAN interfaces and the bridge for trunked VLAN traffic. I decided to see if my missing packets were entering the bridge instead of the VLAN interface, and sure enough, they were. I had found another linchpin.
Seeing as how the traffic flow seemed to favor the bridge over the VLAN interfaces, I thought I was stuck. If the traffic is going into the bridge, it's game over for the non-VLAN-encapsulated traffic, right? Wrong. The whole purpose of the network-bridge script is to masquerade the physical interface, replace it with a virtual one that is only visible to the dom0 environment, and connect them both to a bridge that is used to pass data in and out of the dom0 and domU environments. The virtual interface acts as a normal interface in the dom0, so I figured it would be worth a try to configure the VLAN interfaces on the virtual interface instead of the physical one, and then build their bridges from that. I was a bit pessimistic at this point and really didn't expect it to work, but it sure did. Traffic started flowing properly to the dom0 environment, which meant the domUs would probably work too. I brought up a test environment, and all of the networking stuff worked as intended. It could see both encapsulated and non-encapsulated VLAN traffic.
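A rough way to picture the difference between the two layouts is to trace where an inbound frame ends up. The Python sketch below is only a model of what the tcpdump tests showed, not of the kernel's actual receive path, and the interface names (`peth0`, `xenbr0`, `eth0`) and the `deliver` function are assumptions chosen for illustration.

```python
def deliver(vlan_id, vlan_parent):
    """Return the interfaces an inbound frame passes through in this toy model.

    vlan_id     -- VLAN tag on the frame, or None for untagged traffic
    vlan_parent -- "peth0" if the VLAN interfaces hang off the physical NIC,
                   "eth0" if they hang off the dom0's virtual interface
    """
    # The bridge claims everything that arrives on the physical port.
    path = ["peth0", "xenbr0"]
    if vlan_id is not None and vlan_parent == "peth0":
        # The VLAN interface sits on the physical NIC, but the bridge
        # already took the frame, so it never shows up there.
        return path
    # The bridge hands the frame to the dom0's virtual interface.
    path.append("eth0")
    if vlan_id is not None:
        # VLAN interfaces built on the virtual interface see the tagged frame.
        path.append(f"eth0.{vlan_id}")
    return path


print(deliver(10, "peth0"))
print(deliver(10, "eth0"))
print(deliver(None, "eth0"))
```

In this model, tagged traffic only reaches a VLAN interface when that interface is stacked on the virtual interface, while untagged traffic reaches the dom0 either way.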
It was Miller Time at that point, and pretty late to boot. I had my celebratory beer and hit the sack soon after. I waited until this past weekend to attempt my goal of moving my firewall setup into a Xen instance. I've become pretty good at moving Linux environments between physical machines and virtual ones, so the actual move was pretty painless. Once everything was ready, I stopped the networking stuff on my physical firewall machine and brought it up in the virtual one. The only issue to speak of was my cable modem locking in on the MAC address of the ethernet card in my physical firewall machine. I changed the virtual MAC address inside of the firewall Xen instance to match the physical firewall interface, and everything fell into place. I now have a firewall instance that I can migrate between my Xen host machines with no interruption. Great success!
I know I promised diagrams, but I'm not going to put them here. This post is already long enough as it is. I plan on making a wiki entry for the actual technical details (not just the technical description) of the process, so stay tuned for that. My diagrams will be in there.
I fail at posting
Wow. It's been like three weeks since I posted last. I think I've had more than five people ask me if I'm going to post again, or whether I'm going to close the website down. I'm not closing the website down. I've actually been somewhat busy, and things have been somewhat eventful. Combine those with a general lack of motivation for posting, and you get three weeks of inactivity.
Right now I feel like crap. I had some sort of illness late last week, and its remnants have firmly implanted themselves in my lungs, giving me one of my death coughs from hell. I had a splitting headache for most of the weekend because of it. Today my head doesn't hurt as bad since I'm not coughing as frequently. I am, however, coughing more violently, and it seems to be making my throat swell up. Every time I swallow, it feels like I'm trying to swallow a golf ball or something else similar in size. Ugh.
In another sucky occurrence, one of my cats, Marshall, broke my good laptop. It's been broken before in a similar fashion, but this time it's really hosed. I had the laptop next to my bed on my makeshift nightstand, and he was sitting on the far side of it. He apparently felt the need to visit, and he tried climbing over the laptop. He planted a paw on the top of the screen and tried pushing on it, and it opened far more than my makeshift repairs would allow it to go, and the plastic around the hinges pretty much disintegrated. The laptop still works, but it's pretty much useless as a laptop. This has allowed me to convert it into a more powerful and much more compact mythtv frontend than the full-sized machine I had before. I have it set up in a diskless network-boot configuration, so it can sit right next to my center channel speaker and not be adversely affected. I've still got my old beast of a thinkpad that I acquired from Jon, but I'd sure like to get a new laptop. Perhaps I'll look around after Christmas.
I've got a few more things to talk about, but I feel like lying down for a bit right now. I'll try to touch upon them soon.
Fast Times at Liquid Web Inc.
Wow. For whatever reason, the past couple of days at work have absolutely flown by. Perhaps it's because the team has been a little light due to illness, and on top of that Joe has been busy trying to get the Listening Ear back on its feet after the catastrophe that struck this past weekend. Maybe the lack of staff has kept me busier, which leads to faster days... who knows. Whatever it is, 4PM has seemed to come really early the past two days, and I can't complain.
Speaking of Joe, he was the big celebrity both on TV and in the paper detailing what happened over the weekend. I was able to grab the news cast he was on with MythTV, so I've got a nice mpeg2 video of it if he wants to save it as his 15 minutes of fame, or put it on the Ear homepage. Gotta love technology.
Woohoo, friday already.
Yah, friday comes quick when you start your week on wednesday. Still, it didn't come soon enough. Work has been way busy this week.... lots of highly annoying problems, and until about 3 hours ago, it seemed as though the phones never stopped ringing. Not cool.
I got the new and improved MythTV box working last night for the most part, and it's really cool to finally be able to use it on a TV, like it was designed to do. I still have to get the little remote control working and a few other things I haven't thought of yet, but the record/playback functionality works well. It's pretty tough to tell whether you're looking at the real TV signal or the regurgitated mpeg2 stream that Myth makes. I guess that's how it's supposed to be though, so good job Myth folks!
A day off, finally
Yes, I finally have a day off... after 7 days of working the ass end of the day, I finally have a day where I didn't have to do anything. Sooo... I slept for 12 hours! Hah. And took an hour nap around 6pm! Hah! Ok. Yah. I'm still playing around with the laptop, mostly in the Gentoo area. I got MythTV going on it, and combined with the wireless, it rocks a lot. I can watch live or recorded TV anywhere that my wireless (or wired) LAN can take me. Pretty cool.
Speaking of wireless, I got it working in Gentoo, but with kind of a hack. I had to use DriverLoader, which is a program that allows you to use Windows drivers in linux where native linux drivers haven't been made yet. It's kind of handy, but I'd rather have a native open source driver. There have been a couple times where the connection has flaked out, but I don't know whether it was because of the hacked DriverLoader system, or just an anomaly in the wireless connection. Could be both, who knows.
The wired network card also has a little oddity going on... When I try to mount an NFS share and send files to it, it's slow as freakin hell, but copying from the share is pretty damn fast. I was noticing that there are no transmit statistics being taken for that interface in ifconfig, so there may be something goofy with the driver, or my installation of it. Whatever the problem is, it's really lame. I guess it could also be the cable... I don't think I've tried changing that out yet. I guess I'll try that soon.
Wow, I'm tired again.
Yes, the title says it all. I'm freakin exhausted. And once I finish this last godforsaken midnight shift, I get to drive down to my aunt's house for Thanksgiving part deux. I probably won't get to sleep until like 9 or 10.... if anybody's counting, that's 27 hours of being awake, barring any naps or falling asleep at the wheel. Good thing I have monday and tuesday off - I'll probably wake up sometime tuesday once I get to bed tonight.
I've been working on the Gentoo install on my laptop over the past few hours... (it's probably the only thing keeping me awake!) I've made a bit of progress, but there are a few things that I still need to get working before I'm happy. I've got sound, video, and the synaptics touchpad/usb mouse working right, but I still need to get the wireless ethernet working. There seem to be a couple of open source projects with drivers, but they're still in their infancy and don't seem to be very stable yet. There's also DriverLoader, but I don't have my WLAN drivers here at work and I couldn't find them on the web. Blah. Once I get wireless going I'll be happy, and can go onto other things, like installing MythTV on here and using it as a mobile frontend system... :-D
That's all for now... let's hope I don't pass out at the wheel or something.
Fun times in Ann Arbor
Yesterday I went down to Ann Arbor to participate in Steve's going away gathering. We had a great time. We had a good sized crew... Most of them were people I knew through Jim and Steve, all cool people. We sat around for a while and showed everyone the wonders of MythTV. Most parties involved, especially Jim, thought it was really cool. He wants to make a Myth system now, heheh... After our nerding out was finished, we went on to the Red Hawk and had dinner and beer... well, by the time all was said and done, there were 11 people there, and I would think that we were quite a pain for the wait staff. We also ran up a $250 tab, so that probably made the people of Red Hawk happy. On our way over to the next place, we of the MSU contingent made it clear that we wanted to urinate on some UofM building just out of principle, even though we never did. We almost did get into a fight though, which was really strange. We were just walking down a sidewalk behind these two guys, and one of them turns to us and says "hey look at my back!" He proceeded to pull up his t-shirt and show us some tattoos on his back, and we were all like WTF. Apparently he didn't think we reacted appropriately, because he turned around with an aggressive stance like he was going to hit someone. Luckily for him, his buddy said something before he got his ass whooped. Yah, there were two of them, and 11 of us (8 being guys). It would have only taken one or two of us to beat the tar out of them, so having 8 pretty much sealed it. They shoved off, and nothing happened from it. The tattoo guy had to have been on something - his eyes were glazed over and dull... that would be the only explanation I could accept for a scrawny guy trying to antagonize a group of people like ours. Oh well.
Our drinking destination last night ended up being Good Time Charlies (I think that was the name). We drank there for probably 3 hours or so, and had a great time. They stuck us on the patio, which was probably a good idea, since we were definitely the assholes in the place that night. We were there to get hammered, no two ways about it. And we did! We all started out with our beer of choice and a shot... Jim recommended that I get a shot of Goldschlager. It was pretty good actually... but I could see someone getting mega trashed off that stuff. Kinda cinnamon-y with no bad aftertaste. I guess we felt like punishing ourselves, because we ordered some shooters from the dark side of the menu... Matt made a few of us get a "Hairy Gorilla Fart". Yah, that thing sucked. Good for getting you hammered, that's about it. Tasted like ass - which is probably why it's called Hairy Gorilla Fart. We finished up our night there, much to Bob's dismay, as he was stealthily working on the hot blondes at the next table (so stealthily that I don't think they noticed). We headed over to Jimmy Johns for some much needed stomach buffer material. Once we consumed our fill, we headed back to Steve's place, and crashed out.
All of us who stayed over at Steve's place woke up way too early. The blinds on Steve's doorwall were broken, so when the sun came up, we were punished by its morning fury. We got up reluctantly, and went over to Denny's for breakfast. Good stuff. After breakfast we parted company. Overall, it was a blast.
Vacation is Good...
Well, I guess I'm not really on vacation, since I had to work both days this past weekend to get these past two days off. But, they were weekdays and I wasn't at work, so I guess it's vacation. They've been pretty sweet so far, with lots of relaxing and hanging out with friends. I had lunch with Jon, Tom, and Darin yesterday, and that was good. Good food and good laughs were had by all. I also got the MythTV box finished up, with sound and full TV-out functionality. I'm quite stoked... I never got TV-out working right in linux with that card, so making it work is a plus. Now the Myth box just needs a serious upgrade so it can play back the video that it records, but that's not critical.
Today started off kinda sucky... The cat woke me up at 6:30 with incessant meow-ing, which kinda pissed me off. I told her to shut up a few times, which didn't work, so I got up and fed her, then promptly went back to bed. After I woke up, I kinda lounged around for a while, then went out and shot baskets in the blistering sun. Yah, it was hot. Then I decided it would be great to take a nap, which I did. It was excellent. After I woke up, I went out with the whole group of friends for a Happy Birthday dinner at Outback (yah, I turned 24 today, wheee), which was good. I don't think I've ever had a bad meal at Outback. After that, a few of us split off and checked out T3, which was sweet. I posted a review of it already, so I won't go into any depth here. Tomorrow is drunken relaxation on the sandbar at LaRouche Land, and that will be excellent as well. All attractive single girls are welcome to stop by.
MythTV progress and headaches
Well, I guess the headaches came before the progress, so I'll document them first. MythTV uses a package called XMLTV to grab the TV listings for its database. In turn, the North American version of XMLTV grabs its listings from a site called zap2it.com, and from what I can tell, just parses the TV listings straight out of their HTML code. Well, zap2it changed the format of their site, which broke XMLTV, which breaks a lot of MythTV. Pain in the butt. People who have a working channel listing are merely inconvenienced, but people like me who didn't have a working channel listing are screwed for the time being.
The good thing is that Steve was kind enough to mail me his database (containing channel listings, albeit the wrong ones), and I was able to play around a bit. The recording part works very well, using around 1GB per hour of recorded video. Not too bad for a high quality mpeg2 stream. Streaming that recorded video through mythtv is a bit slow and choppy though... I think it's something in my configuration on the client machine, so it's probably something I can fix. Playing the videos with mplayer or windows media player is fine though, so if all else fails, I still have that. Very cool. Now I just need to buy a monster drive so I can record a ton of stuff. And upgrade the MythTV box so it can actually play back the video it records...
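For sizing that monster drive, the arithmetic is simple enough to sketch in a few lines of Python. The only figure taken from above is the rough 1GB per hour; the function name, the `reserved_gb` parameter and the 250GB example drive are hypothetical.

```python
def recording_hours(drive_gb, gb_per_hour=1.0, reserved_gb=0.0):
    """Estimate how many hours of recordings fit on a drive.

    drive_gb    -- total drive capacity in GB
    gb_per_hour -- space used per recorded hour (roughly 1GB, per the post)
    reserved_gb -- space set aside for the OS, database, etc. (an assumption)
    """
    usable = drive_gb - reserved_gb
    if usable <= 0:
        return 0.0
    return usable / gb_per_hour


# A hypothetical 250GB drive with 50GB held back for everything else.
print(recording_hours(250, reserved_gb=50))
```

At about 1GB per hour the number of hours is basically the number of free gigabytes, so the estimate mostly depends on how much of the drive is held back for other things.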