Dave LeCompte (really) (tsmaster) wrote,

LVM giveth and LVM taketh away

The short version of this post is that I'm currently (it's 2:49am right now, really) transferring files from an LVM volume set to a portable hard drive. I trust I have enough space on the hard drive. This is a post-catastrophic-failure recovery, so I feel fortunate that I'm able to recover anything at all.

The even shorter version of this post is: back up your data!

I don't know how long I'm going to make the long version. It is, after all, 2:49am.

Yesterday (Friday) morning, I became a little concerned when I noticed that I couldn't access my file server from work. I usually keep an SSH window open to allow me to do Linux-y stuff as needed. And I have some other ways to take notes and later refer to those notes, stored on this file server. And the physical machine that provides these handy services is also my web server, for all that matters. I've got nothing but fluff up on my various websites, but still.

So, Friday morning, the machine was down. I wondered if it was a power drop, so I pinged it a little while later, to see if the machine had come back up. Nothing. I wasn't able to get home much before midnight, and I was hoping it was going to be something as simple as physically flipping a switch, restarting the machine. When I flipped the switch, instead of the "click" that I was expecting that accompanies a switch, well, switching, there was no click. The power switch wobbled between off and on without resistance. That's not good.

First thing this morning, I went to a fairly local computer store to get a new power supply. Turns out, their website lied when it said they'd be open at 9am. So, I drove to Fry's, a less-local store. They were open, and sold me a power supply (plus other stuff, like a PS3 controller, but that's not important right now).

I took my haul back home and swapped the new power supply in. No joy. I tried some other options, and came to the conclusion that at the very least, something in the {motherboard, CPU, RAM} bundle was permanently failing. Well, that's OK, I can get a new motherboard... I spent some time thinking about whether to go to one of the local shops to get whatever they had in stock, or to order something online, like from NewEgg.com.

In the end, I took apart a somewhat-"spare" machine (in use, but not super critical), and took its MB/CPU/RAM bundle and started grafting the fileserver hard drives onto the spare motherboard, and that into the fileserver case. With a little work, I got the machine to boot, but the Power On Self Test (POST) gave me worrisome indications that it wasn't recognizing my hard drives. Crap. Crap crap crap. There go my files.

After something like two and a half hours of tinkering around with various combinations, I managed to get four drives up and running. My home directory on this file server was one large chunk of what I was terrified that I had lost. I'm still a little sick thinking about it. This home directory has ~150GB of assorted stuff. Some of that is MP3s I've ripped from CDs I own. Some of the 150GB is projects I've started and maybe not finished. The MP3s I can replace. The projects I'd have to start over. I probably will start over on the projects anyway, but that's - again - not important right now.

The clever, or "too clever by half" thing that I had come up with for this machine was to create a Logical Volume Manager (LVM) volume array so that I wouldn't be running out of room all the time. I think I started with 120GB of hard drive space and I grew to have something like 700GB at my disposal (which is nothing when you can go down to Staples and pick up a portable hard drive that will store a full terabyte, or maybe only 1000GB, but that's plenty, too).

The problem with my LVM plan was that I left the most important information on the oldest drives (I'm sure I meant to migrate all of the information onto a drive or two, perhaps with redundancy). So, I've got 4 drives of varying age, which is to say varying reliability. One of the gotchas of disk arrays is that by spreading your data across two disks, you're actually increasing the possibility of losing some of your data - one drive out of two is more likely to fail than one drive out of one. (If you want to experiment with this, roll a die. Roll two dice. You're more likely to roll at least one 6 when you roll two dice. Replace 6 with "catastrophic failure" and replace dice by disks, and you're done.)
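To put numbers on the dice analogy, here is a minimal Python sketch. It assumes each die (or drive) fails independently with the same probability, which is a simplification for real drives of varying age. The function name `p_at_least_one` and the 5% per-drive failure figure are made up for illustration and are not measurements from this machine.

```python
from fractions import Fraction


def p_at_least_one(p_single, n):
    """Chance that at least one of n independent trials 'hits',
    given each trial hits with probability p_single."""
    return 1 - (1 - p_single) ** n


# Dice: chance of rolling at least one 6.
one_die = p_at_least_one(Fraction(1, 6), 1)
two_dice = p_at_least_one(Fraction(1, 6), 2)

# Drives: hypothetical 5% chance that any single drive fails
# (an assumed figure, purely for illustration).
one_drive = p_at_least_one(Fraction(1, 20), 1)
four_drives = p_at_least_one(Fraction(1, 20), 4)

print("one die:    ", one_die, "=", float(one_die))
print("two dice:   ", two_dice, "=", float(two_dice))
print("one drive:  ", one_drive, "=", float(one_drive))
print("four drives:", four_drives, "=", float(four_drives))
```

With two dice the chance of at least one 6 goes from 1/6 to 11/36, a bit under double. Under the assumed 5% figure, four drives come out to roughly an 18.5% chance that at least one fails, and when the data is spread across all of them with no redundancy, one failure is enough to put the whole array at risk.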

So, as I was trying one configuration after another and getting inconsistent behavior from my frankensteined machine, I realized that I had four physical volumes, all four of which needed to be connected and visible at the same time for LVM to be willing to recognize my drive array. Fortunately, I was able to bring up all four drives at once, but I wasn't sure that it was going to be possible.

I'm going to bed soon here. Right now, the grafted-together machine is pulling the important bits (one one one) off the four hard drives. When I wake up, I should be able to bring the machine down and remove those four old hard drives to simplify the cable nightmare I've been struggling with for the past several hours.

And then I can go about finding the website and other information that had been on the fileserver. I think that's on a fifth drive that maybe I'll be able to use the same techniques to recover.

And then I will have a full backup of all the stuff that I was terrified of losing. I'll probably strip the reconstituted file server down to one or two drives and restore from my backup. There will be a fair amount of reconfiguring as part of the recovery, but that's acceptable at this point.

And then... sometime this week I really hope I follow through on my intention to set up an automatic backup to "the cloud" using Amazon's storage space (S3), possibly using JungleDisk, or some such. I saw an ad for Zmanda, which seems like a productized version of the Amanda backup, perhaps with an easy means to back up to S3.

So, again. Back up your data.