Mike Neir's Page
Viewing 5 posts tagged with 'nfs'

Xen + AoE = New Hotness

Thursday, June 14 2007, 8:08 AM

In my continued experimentation with hot migration of Xen environments, I think I've found a pretty awesome solution. It involves a system called ATA over Ethernet (AoE). This system transmits ATA commands over Ethernet, so it allows a remote disk to be treated like local block storage. The system was originally designed by a company called Coraid for use with their own proprietary disk arrays, but they produced a piece of software that replicates the same functionality on a normal Linux machine.

I had been experimenting with NFS root filesystems, but there were a few things I didn't like about them. First off, creating the kernel was a pain. With all of the effort I mentioned in my previous post on the subject, keeping the kernel updated would be a real chore if you were using CentOS 5 like I am. Second, the kernel didn't seem to perform any caching of the NFS filesystems, so there was a large amount of traffic flowing over the network from all of the filesystem reads that the Xen environments were doing. Third, all of the root filesystem reads/writes were visible to the Xen instances, so their bandwidth counters (and their associated graphs in my Cacti system) were skewed by a large amount.

These issues don't seem to occur with AoE. The filesystems are imported on the host, so the stock CentOS Xen kernel doesn't have to be modified in any way. This also renders the network traffic required in maintaining the filesystems invisible to the Xen domains. The filesystem acts as a normal block device, so it is cached like a normal local disk is cached.

That's not to say there weren't issues. At first, the vblade daemon (the Linux 'server' component of the AoE system) seemed pretty unstable. It seemed to randomly lock up, causing all of my Xen domains to crash, and forcing a reboot of the host server. I think it was just the way I was using it though. I was running the vblade program and backgrounding it, instead of using the vbladed script that was provided. I think it was locking things up when I disconnected the terminal in which I started the vblade instances. When the controlling PTY died, it caused the vblade instances to die in a bad way due to a lack of standard input and output channels. The vbladed script controls all of the input and output paths, so there's no worry if the terminal disconnects. Since I started using vbladed about three weeks ago, I haven't had a single failure.
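The terminal-hangup problem described above can be illustrated with a small sketch. This is not what the vbladed script does internally; it only shows the same idea of giving a long-running process its own input and output paths and its own session, so that closing the terminal does not take it down. The argument order (shelf, slot, interface, device) follows vblade's usage line, while the helper names, the shelf/slot numbers, and the device path are made up for illustration.

```python
import subprocess


def vblade_command(shelf, slot, netif, device):
    """Build the vblade argument list: shelf, slot, network interface, block device."""
    return ["vblade", str(shelf), str(slot), netif, device]


def launch_detached(argv):
    """Start argv with no ties to the terminal that launched it.

    stdin/stdout/stderr point at /dev/null instead of the PTY, and
    start_new_session=True puts the child in its own session so a
    hangup on the controlling terminal is not delivered to it.
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )


# Hypothetical export: shelf 0, slot 1, served on eth1, backed by an LVM volume.
cmd = vblade_command(0, 1, "eth1", "/dev/vg0/xen-test")
print(cmd)
# launch_detached(cmd)  # left commented out: needs vblade installed and root privileges
```

Running the export through `launch_detached` rather than a plain shell background job is the part that matters; the command itself is unchanged.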

I'm currently running vbladed against the LVM partitions I used with my NFS root filesystems. Off the bat, I thought this would come up a little short because I didn't have a swap partition available to the Xen domains. Then I remembered that I could use a regular flat file as swap space, so the problem went away.

Since the vblade server allows you to export a whole block device, be it a whole disk, a single partition, an LVM partition, or a whole RAID array, it opens up some interesting possibilities. On the remote system, you can access the exported block device as if it were a disk, partitioning it as you see fit, while on the exporting system it could be one of many LVM partitions. This allows for the possibility of creating a "mini hard drive" for each Xen instance, each with its own root filesystem, swap space, and whatever else is deemed necessary. I haven't implemented this because I want to be able to use my LVM partitions with NFS if stability becomes an issue, but it would be a pretty neat setup.

Hot Migration Action

Tuesday, May 15 2007, 6:26 AM

As described in my last few posts, I've recently acquired a good amount of new server hardware. Well, everything is in my possession now except a few sticks of RAM, and it's all set up at work. I ended up picking up the RAID enclosure I mentioned earlier, along with disks to fill it. It ended up being quite a bargain, with the enclosure, drive trays, and an external SCSI cable only costing around $50 plus shipping. Here's all the new gear mounted in a rack at work... My stuff is the white stuff in a sea of black servers.

[servers]

I've got the RAID enclosure connected to the dual P3 1.0GHz machine I bought (furthest away on the bottom), and combined, there's 18 drive bays available to the SCSI system. I've got fifteen 36GB drives (plus one hot spare) in a RAID5 storage array and two 18GB drives in RAID1 for the OS installation. The RAID5 array weighs in at about 500GB, so I have plenty of room to keep stuff that I don't want to lose.
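The capacity figures above check out with some quick arithmetic. A minimal sketch, assuming the usual rule that RAID5 gives up one drive's worth of space to parity and RAID1 stores a single drive's worth of data; the function names are made up for illustration.

```python
def raid5_usable_gb(drives, size_gb):
    """Usable space of a RAID5 array: one drive's worth of capacity goes to parity."""
    if drives < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drives - 1) * size_gb


def raid1_usable_gb(size_gb):
    """Usable space of a two-drive mirror: the size of a single drive."""
    return size_gb


print(raid5_usable_gb(15, 36))  # 504, i.e. "about 500GB"
print(raid1_usable_gb(18))      # 18
print(15 + 1 + 2)               # 18 bays: RAID5 members + hot spare + RAID1 pair
```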

I'm currently seeing how well my Xen domains function with NFS root filesystems. So far it looks pretty good. I've got the domains that host my web site (among other things) and the MySQL domain running off the RAID5 array via NFS, and I haven't noticed any slowdowns whatsoever. The only unexpected thing I've come across is a few weird incompatibilities with the Gentoo init scripts, specifically when they try to bring up networking devices. The boot just hangs when trying to initialize eth1, which is the interface that the NFS root filesystem is accessed through. My firewall script also kills things, but I should be able to fix that.

Having things running over NFS allows for live migration of running domains. I tried it out a few hours ago, and it's surprisingly painless, given that the appropriate functionality is enabled in the Xen daemon. One command sends a running domain between physical Xen hosts, which is pretty damned neat. I can see this being tremendously useful in a high-availability sort of environment. If a host machine needs maintenance, you can simply transfer the running child domain to another host, do your business, and transfer it back with only a fraction of a second of downtime.

Xen+NFS Root Filesystem Madness

Saturday, May 12 2007, 11:52 PM

As part of my continuing experimentation with Xen, I decided a while back to try running the child environments (domUs) from an NFS root filesystem, so I could play with hot-migrating domUs between Xen hosts. I just started playing with it a couple nights ago, and what a pain in the ass it's been.

First off, I'm using CentOS 5 as the Xen host operating system (dom0) because it's got Xen support built right in. Handy, right? Sure. It does not, however, have support for NFS root filesystems built into the Xen kernels it supplies. Not a big deal - I compile my own kernels all the time. I added the proper options into the kernel - IP Autoconfiguration support, NFS client support, and NFS root filesystem support - and I went on my way.

That wasn't the end of the trouble. While I could get the domU to use the NFS share as its root filesystem, it wasn't accessing it properly. The root user had no permissions to write to anything, so everything was broken. This is typical of an NFS share with the "root_squash" option enabled, but I specified that my share be exported with the opposite setting enabled ("no_root_squash"). No matter what I did, I couldn't find out why root squashing was happening. I could mount the share just fine from another machine, and root squashing wasn't happening there.

I decided to look at the differences in the mount parameters between my broken domU and the working system. There were a few differences, but the thing that was causing problems was "sec=null". That setting disables all authentication for the mount, and all access is mapped to the anonymous user specified on the NFS server.
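Comparing mount parameters by eye gets tedious, so here is a minimal sketch of doing it mechanically. It takes two comma-separated option strings, such as the options column of /proc/mounts, and reports only the options that differ. The two sample strings are made up for illustration; they are not the actual options from either machine.

```python
def parse_mount_options(options):
    """Turn 'rw,sec=null' into {'rw': None, 'sec': 'null'}."""
    parsed = {}
    for item in options.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else None  # bare flags like 'rw' carry no value
    return parsed


def diff_mount_options(first, second):
    """Return {option: (value_in_first, value_in_second)} for options that differ."""
    a, b = parse_mount_options(first), parse_mount_options(second)
    missing = "<absent>"
    return {
        key: (a.get(key, missing), b.get(key, missing))
        for key in sorted(set(a) | set(b))
        if a.get(key, missing) != b.get(key, missing)
    }


# Hypothetical option strings for the broken domU and the working machine.
broken = "rw,sec=null,addr=192.168.3.10"
working = "rw,sec=sys,addr=192.168.3.10"
print(diff_mount_options(broken, working))  # {'sec': ('null', 'sys')}
```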

I had found my problem, but the solution eluded me. I tried every way I could think of to change the mount parameters, but nothing worked. Then I stumbled across this post to the Linux Kernel Mailing List. Apparently, something was broken in the NFS kernel code in the 2.6.18 release that has to do with properly identifying what NFS server version one is connecting to. CentOS uses the 2.6.18 kernel. I tried applying the patch described in the post, and voila! Everything works!

With everything working, I was able to play with a few other things. I have two physical networks in my Xen boxes, one public and one private. All domUs are connected to the public network on eth0, and the private network is connected on eth1. I want to mount the NFS shares on the private network, but the default Xen configuration directives only seem to allow mounting NFS roots via eth0. I got around this by specifying the IP configuration stuff in the "extra" directive instead of the ip, netmask, and gateway directives. Here's the relevant portion of the config file.

nfs_root="/xen/domains/test"
nfs_server="192.168.3.10"
root="/dev/nfs"
extra="ip=192.168.3.150:192.168.3.10:192.168.3.4:255.255.255.0::eth1:"
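The string in the "extra" directive is the kernel's ip= boot parameter, whose colon-separated fields are, in order: client IP, server IP, gateway, netmask, hostname, device, and autoconfiguration method. Since Xen domain config files are Python anyway, here is a minimal sketch of assembling that string from named fields rather than counting colons by hand. The helper name kernel_ip_param is made up for illustration; the addresses are the ones from the config above.

```python
def kernel_ip_param(client, server="", gateway="", netmask="",
                    hostname="", device="", autoconf=""):
    """Assemble the kernel's ip= parameter; empty fields stay empty between colons."""
    fields = [client, server, gateway, netmask, hostname, device, autoconf]
    return "ip=" + ":".join(fields)


extra = kernel_ip_param(
    client="192.168.3.150",   # address of the domU on the private network
    server="192.168.3.10",    # NFS server
    gateway="192.168.3.4",
    netmask="255.255.255.0",
    device="eth1",            # bring up the private interface, not eth0
)
print(extra)  # ip=192.168.3.150:192.168.3.10:192.168.3.4:255.255.255.0::eth1:
```

The hostname and autoconf fields are left empty here, which is why the result contains a double colon before eth1 and ends with a trailing colon.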

Now for more experimentation!

A day off, finally

Tuesday, December 02 2003, 3:22 AM

Yes, I finally have a day off... after 7 days of working the ass end of the day, I finally have a day where I didn't have to do anything. Sooo... I slept for 12 hours! Hah. And took an hour nap around 6pm! Hah! Ok. Yah. I'm still playing around with the laptop, mostly in the Gentoo area. I got MythTV going on it, and combined with the wireless, it rocks a lot. I can watch live or recorded TV anywhere that my wireless (or wired) LAN can take me. Pretty cool.

Speaking of wireless, I got it working in Gentoo, but with kind of a hack. I had to use DriverLoader, which is a program that allows you to use Windows drivers in Linux where native Linux drivers haven't been made yet. It's kind of handy, but I'd rather have a native open source driver. There have been a couple times where the connection has flaked out, but I don't know whether it was because of the hacked DriverLoader system, or just an anomaly in the wireless connection. Could be both, who knows.

The wired network card has a little oddity going on as well... When I mount an NFS share and send files to it, it's slow as freakin hell, but copying from the share is pretty damn fast. I noticed that ifconfig shows no transmit statistics for that interface, so there may be something goofy with the driver, or my installation of it. Whatever the problem is, it's really lame. I guess it could also be the cable... I don't think I've tried changing that out yet. I guess I'll try that soon.

Woohoo, it's the weekend

Saturday, June 21 2003, 1:46 AM

Well, the craziness I highlighted in my last post didn't happen, thank god, but there was plenty of drama to take its place. Our network provider was having all kinds of trouble with their router yesterday, providing nice slow ping times (a bad thing) and lots of customers calling about slow ping times and a few instances of downtime (another bad thing).

Today wasn't so bad... pretty calm actually. One potentially sucky thing was brought to light today though... Apparently our whole support office, an adjoining room, and a couple of adjoining walls all share one circuit, which is bad. Right now there are 8 computers, some other equipment, a fishtank + lights + pump, and a refrigerator all on that circuit. Yah, and we're supposed to add two more computers in there somehow... Riiiight.

I'm currently converting my file server over to Gentoo... yes, I've gone off the deep end, because Gentoo rocks. It's compiling some Xwindows stuff right now, and it'll probably be compiling shit until well into the afternoon/evening tomorrow. I got the NFS and Samba servers going first though, so it's sharing files like it normally would.
