General :: No Automatic Rebuild Of RAID 5 After Replacing Bad Disk
Dec 12, 2009
I have a 5 disk raid 5 array that is composed of SATA A:0,1; SATA B: 0,1, and SATA C:0, and one of the disks (SATA A:0) recently went bad on me. I have an ICP raid controller that is about 5 years old. I replaced SATA A:0. After rebooting, I went into the controller and verified that it saw the disk in the hard-disk info section...there I noticed that in the "status" section, that the SATA C:0, SATA B:1 disks were listed as being "in array", the SATA A disks were blank, and the SATA B:0 disk was listed as "fragment". When I go into the "repair array" section, the controller tells me that there are no arrays that are in failure, error, or need to be rebuilt.
This puzzles me, as I thought the controller would know that the array needs to be rebuilt after replacing the disk and I don't see a way to initiate a rebuild. If I just let the server boot after replacing the disk, then I get back that there are the correct number of disks in the raid 5 and that it is ready, however, the screen then goes blank and I get a blinking cursor and the system seems to hang. There are no activity lights on any of the drives associated with the raid 5, which makes me think that the system is not rebuilding the array at this point.
I went to set up my Linux box and found that the OS drive had finally died. It was an extremely old WD Raptor drive in a hot box full of drives, so it was really only a matter of time before it just quit on me. Normally this wouldn't be such a big deal; however, I had just recently constructed an md RAID5 array of 3 1TB disks to act as an NFS mount for basically all of my important files. Maybe 2-3 weeks before the failure I had finished moving all of my most important stuff onto that array. Now I know that the array is intact. All the required data is sitting on those disks. Since only the OS-level disk failed on me, I should be able to get a new disk in there, reinstall Ubuntu and then rebuild that array. How exactly do I go about doing that with mdadm? Do I create the array from the /dev character devices like when I initially built the array?
I've got a raid5 array of 4 disks with Ubuntu 8.04 running on it that is currently still working:
/dev/sda /dev/sdb /dev/sdc /dev/sdd
Smartmontools for /dev/sdc tell that there are 9 sectors pending for reallocation:
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 9
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 9
And /dev/sdd has an increasing number of reallocated sectors (about 1 every couple of minutes):
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 1735
/dev/sdc has failed a couple of times this week (but I have always successfully re-added it to the raid5). But the increasing number of reallocated sectors on /dev/sdd concerns me even more.
I'm afraid that during the removal of /dev/sdd and the adding of a new /dev/sdd disk, the raid might fall apart. That's why I would try to do it from an Ubuntu Live CD: if the raid falls apart (/dev/sdc fails) while the new /dev/sdd disk is being added, I might still remove the new /dev/sdd, return the previous one and assemble the raid with:
/dev/sda /dev/sdb /dev/sdd (old one that was previously removed)
Does assembling Raid in Ubuntu Live and adding new disk for /dev/sdd write anything on /dev/sda, /dev/sdb and /dev/sdc in the process of adding /dev/sdd into raid5?
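The SMART counters quoted in this post can be watched over time by pulling out just the raw-value column. The following is only a minimal sketch: it assumes the attribute table layout shown in the post (attribute name in column 2, raw value in column 10), and `smart_raw` is a made-up helper name, not part of smartmontools.

```shell
# Hypothetical helper: print the raw value of one SMART attribute.
# $1 = attribute name; the attribute table is read from stdin.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Sample rows in the usual attribute table layout, using the values from the post.
sample='  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1735
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       9'

printf '%s\n' "$sample" | smart_raw Reallocated_Sector_Ct
```

On a live system the same function could be fed from `smartctl -A /dev/sdd` (run as root), for example once a minute, to see how fast the count is growing.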
I've got a server running a software Raid for SATA disks on a P5E motherboard.
I had to add a lot of memory to this server, and then I had to flash the BIOS. This reset the software RAID on the disks, so when I boot I get a kernel panic because it doesn't find anything ...
How do I rebuild the RAID? I can boot from a live CD, or anything else, but I don't know how to do it without losing my data.
My home backup server, with 8*2TB disks, won't boot anymore. Two disks failed at the same time and I rebuilt the RAID 6 array without any problem, but now I can't boot the OS. I'm using Ubuntu Server 10.10. I've taken screenshots of the displays so that I don't have to copy everything here. The problem at boot:
And the Grub config: It's not a production server, but I would like to have it online. I've tried for the last 2 days (just a couple of hours a day) but without success. It was suggested that I do "mount -o remount,rw /" and then edit /etc/fstab, but I get a "file doesn't exist" error.
I've cloned a machine by removing a HDD from a Raid mirror set putting it into another machine and powering on and fsck'ing a few times and everything is great apart from the ethernet port numbering has gone a bit wonky.
The cloned machine must have a specific configuration using the eth0,1,2,3 naming convention; however, when I boot the freshly cloned machine, eth4,5,6,7 show up and eth0,1,2,3 are missing (as it detects the new machine's ethernet ports).
Is there some configuration file I can delete, so that after a reboot Fedora rebuilds the file using eth0,1,2,3 and populates it with the correct hardware addresses? I need to clone this machine lots and lots and lots of times and the manual way I've figured out is a little long-winded.
I have never performed a rebuild of a RAID array. I am collecting resources that detail how to rebuild a RAID 5 array when one drive has failed. Does the BIOS on the RAID controller card start to rebuild the data on the new drive once it is installed?
It's from a Synology box with 3 disks, one of which is damaged. But this disk wasn't in use. (Take a look at the raid size of 493 GB and the two available disks with 250 GB each.) On the others there was a linear raid. Because of this damaged disk, the Synology device tells me that the volume has crashed. But it looks like this disk was not part of this volume. Quote:
DiskStation> mdadm --detail /dev/md2
/dev/md2:
Version : 00.90
OS: Ubuntu 7.10 i386 Server (2.6.22-14-server) upgraded to Ubuntu 8.04 i386 Server (2.6.24-19-server)
I have 8 SATA drives connected and the drives are organized into three md RAID arrays as follows:
/dev/md1: ext3 partition mounted as /boot, composed of 8 members (RAID 1) (sda1/b1/c1/d1/e1/f1/g1/h1)
/dev/md2: ext3 partition mounted as /root, composed of 8 members (RAID 1) (sda2/b2/c2/d2/e2/f2/g2/h2)
/dev/md3: ext3 partition mounted as /mnt/raid-md3, composed of 8 members (RAID 6) (sda3/b3/c3/d3/e3/f3/g3/h3), this is the main data partition holding 2.7TiBs worth of data
All the raid member partitions are set to type "fd" (Linux RAID Autodetect).
Important Note: 6 of the drives are connected to two Sil3114 SATA controller cards whilst 2 of the drives are connected to the on-board SATA controller (I don't know which model it is).
After upgrading my Ubuntu installation to 8.04, upon system restart there was an error message saying that my RAID arrays were degraded and thus the system was unable to boot from them. At the time, not knowing the cause of the sudden RAID failure, I attempted to force mdadm to start the arrays anyway (the RAID 1 arrays with 8 members each were no cause for concern, of course, but I wanted to back up my data on the degraded md3 array as soon as possible). Then it hit me: why would it recognize only 6 drives? Apparently the kernel has some compatibility problems with certain SATA controllers and my on-board controller chip was one of them. Sure enough, after moving all 8 drives to the Silicon Image controllers, the drives were all recognized without any problems.
If the missing drives had been recognized again before the array was ever brought up again, everything would've been fine. But unfortunately I forced mdadm (--run switch) to bring it online with 2 missing members. This is when the problem began. I know that as soon as I re-add the two missing drives to the md3 (RAID 6) array, the system will attempt to rebuild the array, using the data from the 6 drives. Given the size of the array and the type of the disk drives being used (off-the-shelf SATA drives with a bit error rate of 1 out of 10^14 bits), I think it is highly likely that the system will encounter one or more bit errors during the rebuild. At this stage what I'm wondering is:
1. If mdadm encounters a bit error during a RAID 6 rebuild, will it just give up on that particular file and move on to recover other data on the array? Or will it trash the entire array?
2. Is it possible to cheat mdadm by somehow replacing the new "raid metadata" on the 6 drives with the old data on the 2 drives? Will it make mdadm think the array is clean, consistent and nothing ever happened? Please do note that I did not write ANY new data onto the RAID 6 array from the time it was degraded until the time I brought it down with (--stop).
Sorry for the long post and thank you for your time in advance. I really hope to get this RAID array back up without data corruption because I don't have a working backup of the array (I know, very stupid of me).
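The worry about hitting a bit error during the rebuild can be put into a rough number. The sketch below uses a deliberately simple model that is an assumption, not something stated in the post: every bit read fails independently at the quoted rate, so the expected number of errors is bits read times the rate, and the chance of at least one error is 1 - exp(-expected). `ure_probability` is a made-up helper name.

```shell
# Hypothetical helper: chance of at least one unrecoverable read error.
# $1 = bytes read during the rebuild, $2 = bit error rate (e.g. 1e-14).
# Assumes independent bit errors, which real drives only approximate.
ure_probability() {
    awk -v bytes="$1" -v ber="$2" 'BEGIN {
        expected = bytes * 8 * ber          # expected number of bad bits
        printf "%.3f\n", 1 - exp(-expected)
    }'
}

# Assumption: the rebuild reads about 2.7 TiB in total from the six
# surviving members (2.7 * 2^40 bytes, rounded down).
ure_probability 2968681394995 1e-14
```

Under these assumptions the result is roughly one in five, so an error is a real possibility but not a certainty. The model says nothing about how mdadm reacts to such an error, which is the actual question in the post.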
I have a RAID 6 built on 6x 250GB HDDs w/EXT4. I will be upgrading the RAID to 4 2TB HDDs.
How would one go about this? What commands would need to be run? I'm thinking about replacing the drives 1 at a time and letting it do the rebuild, but I know that would take a lot of time (which is fine). I don't have enough SATA ports to set up the new RAID and copy things over.
I know that this topic has been posted, responded to, and maybe even resolved, many times here, but I am stuck here with partially dead fileserver and need some pointers.
Problem: one disk drive that was part of a logical volume died. I have a replacement, but I can't get it into the LV and get the LV back up again.
pvcreate --uuid <uuid of dead drive> /dev/sdX1 (where /dev/sdX1 is the newly created drive and its partition)
vgcfgrestore VolGroup
vgscan VolGroup
vgchange -ay VolGroup
e2fsck /dev/mapper/VolGroup-LogVol
But e2fsck can't find a superblock. Apparently this drive is the first in the LV sequence, and it is not formatted as part of the LV.
So how do I get this new disk formatted into the LV without reformatting the entire LV and losing what data I still have?
I have recently installed Ubuntu 10.04.1 LTS server on my Intel "fakeraid" (software raid) (2x250 SATA). To test my RAID 1 I turned off one HD and started the system. The first screen (Intel software screen) showed Status = Degraded, but the system started normally with just one HD. Then I turned off the system and turned on the HD again, so the first screen (Intel software screen) shows Status = Rebuild. If I enter the software raid panel, the following message is shown: "Volumes with "Rebuild" status will be rebuilt within the operating system". The system starts normally... but this status message stays permanently, even if I restart the system again.
I wonder how to attach a new SATA hard disk to a software array that has two disks, one of which has crashed (this is mirroring mode, RAID 1). The situation is this: I unplugged the crashed disk, bought a similar one and plugged it in. What should I do next?
I recently installed a server with Software RAID. I tested by powering it down, unplugging one drive and powering it up. Magically, it worked! I found out later that I have to manually add individual devices, such as sda2 to md1 and sda4 to md2. I got all of them added and rebuilt, but my question is: Is there a way to make it so that if I "remove" a drive and put it back, the system senses the new drive and rebuilds based on some internal table?
I have a hypothetical situation in which I installed my operating system using a RAID1 mirror. At some point I decided that this setup was overkill, my machine isn't system critical, I value doubling my storage space more than speedy recovery, I'm doing routine backups, etc...
Short of backing up my system volume and repartitioning, or otherwise starting over, is there a way I can reconfigure my RAID1 array to only expect one disk so that mdadm no longer reports a Degraded state?
As you can see they now show up as inactive. And for some reason sdi1 and sdh1 are not even listed. What can I do to get them back? To make matters worse I placed some important data on them, and even if I was clever enough to keep an extra copy on another drive, guess which drive that was? So, I need to get them activated as is (at least so I can get the data off them) before I can rebuild them from scratch. I'm running Mandriva 2010.1 and created them using the built-in disk partitioner.
I want to build a 6xSATA RAID 5 system with one of the disks as a spare disk. I think this gives me a chance of 2 of 6 disks failing without losing data. Am I right? Hardware: Intel ICH10R. First I will create a 3xSATA RAID 5, then I will add the spare disk and after that I will add the other disks. This is what I think I should do.
Step 1: Create RAID Device
Code:
mdadm --create --verbose /dev/md0 --metadata 1.2 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
I read that "--metadata 1.2" is the best option. Is it true?
Create filesystem on the RAID device
Using this method of calculation:
* chunk size = 128kB (for RAID 5)
* block size = 4kB (recommended for large files, and most of the time)
* stride = chunk / block = 128kB / 4kB = 32
* stripe-width = stride * ((n disks in raid5) - 1) = 32 * ((5) - 1) = 32 * 4 = 128
Then:
Code:
mkfs.ext3 -v -m .1 -b 4096 -E stride=32,stripe-width=128 /dev/md0
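The stride arithmetic in this step can be double-checked with a few lines of shell. This is only a sketch: `raid5_ext_params` is a made-up helper name, sizes are taken in KiB, and the results are counts of filesystem blocks rather than kB. Note that the calculation in the post uses 5 disks, while Step 1 creates the array with 3 members, so both cases are shown.

```shell
# Hypothetical helper: prints "stride stripe-width" in filesystem blocks.
# $1 = chunk size in KiB, $2 = block size in KiB, $3 = RAID 5 member disks.
raid5_ext_params() {
    chunk_kib=$1
    block_kib=$2
    members=$3
    stride=$(( chunk_kib / block_kib ))
    # RAID 5 spends one disk's worth of every stripe on parity.
    stripe_width=$(( stride * (members - 1) ))
    echo "$stride $stripe_width"
}

raid5_ext_params 128 4 5   # the 5-disk figures used in the post
raid5_ext_params 128 4 3   # the 3-disk array that Step 1 creates
```

A hot spare holds no data until it is pulled into the array, so it would not be counted among the members here.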
Step 2: Add spare-disk
Code:
mdadm --add /dev/md0 /dev/sdd1
Is this enough?
I have installed a Fedora Core 12 Linux system onto a RAID 1 file system. I now need a way of getting a notification if the disk fails. Is there an SNMP MIB that covers Intel RAID? I have done the searching but the answer still eludes me.
I newly installed Debian Squeeze with software raid. The way I did it was as follows (as also given in this thread):
- I have 2 HDDs with 500 GB each. For each of them, I created 3 partitions (/boot, / and swap)
- I selected the hard drive and created a new partition table
- I created a new partition that was 1GB. I then specified to use the partition as a Physical Volume for RAID, used it for /boot and enabled bootable.
- Created another partition, which is 480 GB, then specified to use the partition as a Physical Volume for RAID, and used it for /.
- Created another partition and used it for swap
Then RAID configuration: Through Configure RAID menu -> create MD device -> (2 for the number of drives, 0 for spare devices)
Next select the partitions you want to be members of /dev/MD0. I selected /dev/sda1 and /dev/sdb1 (for /boot)
Next select the partitions you want to be members of /dev/MD1. I selected /dev/sda6 and /dev/sdb6 (for /)
And no RAID for swap partitions
'Finish Partitioning and write changes to disk' --> Finish the rest of the install like normal. Everything is ok now, except I am not sure how to test my raid config. When I pull the power of the HDD, it only boots from one disk. I read in some forum that I may have to install GRUB manually on the other. In Debian Squeeze, there is no grub command. Not sure how to make my software raid bootable from both disk. I configured /boot partitions of both disks to be boot=yes. Not sure whether that is ok.
My Fedora 11 system is not starting any longer. It stops with the message:
VFS: Can't find ext4 filesystem on dev dm-0
The system has been telling me for a while that a lot of the sectors of one disk of the (software) RAID set have already failed. So I tried to disconnect each of the disks and start from them separately. Unfortunately this is not working (with one it does not work at all; with the other it gets as far as with both). When I tried to recover the system with the Fedora DVD, it said no distribution found. I am quite new and do not know much about Linux systems, so I do not know what further information you might need. Maybe it is important that both disks are encrypted (the system gets far enough that I can type in the password).
I have an SiI hardware SATA RAID card, with two 500GB disks in mirrored RAID configuration. When I first plugged them in and set it up, things seemed to work ok, but on boot the raid controller told me that the RAID needed rebuilding, and it would happen automatically after POST. So I didn't worry about it, and the drive mounted fine, and it's been that way for years. I just went in and manually on-line rebuilt the RAID in the controller's BIOS, and now when I boot into Ubuntu, both disks show up in fdisk, but neither show up in /dev/disk/by-uuid. Am I missing something?
I had done a new Lucid install to a 1 TB RAID 1 array using the alternate CD a few weeks back. I messed up that system trying to get some hardware working that Lucid doesn't have drivers for yet, so I gave up on it and reinstalled to a single 80 GB disk that I now want to move over to the RAID array.
I moved all of the existing files on the array to a single folder, then copied all of the folders from the 80 GB disk over to the array with permissions and symlinks (minus the contents of /proc and /sys, which I created empty).
These are the commands I used:
cp -a -d -R -v -t /media/raid_array /b*
cp -a -d -R -v -t /media/raid_array /d*
cp -a -d -R -v -t /media/raid_array /e*
cp -a -d -R -v -t /media/raid_array /h*
I tried to change fstab to use the 689a... for root, but when I try to boot, it's still trying to open /dev/disk/by-uuid/412d...
So then I booted from the single disk again and chrooted into the array, then ran update-initramfs -u. I got 3 "grep: /proc/modules: No such file or directory" errors, and "cat: /proc/cmdline: No such file or directory"- so I created directory /proc/modules, created an empty file /proc/cmdline, and ran the initramfs update again. Then I tried to shut down, which hung (probably because I was doing all of this from a terminal window in Gnome), so I killed the power after a couple of minutes.
It's still trying to use /dev/disk/by-uuid/412d... to boot.
What am I missing? I assume I just have to change the UUID to mount as root, but I don't know how.
So I have a system that is about 6 years old running Redhat 7.2 that is supporting a very old app that cannot be replaced at the moment. The jbod has 7 Raid1 arrays in it, 6 of which are for database storage and another for the OS storage. We've recently run into some bad slowdowns and drive failures causing nearly a week in downtime. Apparently none of the people involved, including the so-called hardware experts could really shed any light on the matter. Out of curiosity I ran iostat one day for a while and saw numbers similar to below:
Some of these kinda weird me out, especially the disk utilization and the corresponding low data transfer. I'm not a disk IO expert, so if there are any gurus out there willing to help explain what it is I'm seeing here, I'd appreciate it. As a side note, the system is back up and running; it just runs sluggish, and neither the database folks nor the hardware guys can make heads or tails of it. I've sent them the same graphs from iostat but so far no response.
Ubuntu has this built-in check for errors which starts every 30 startups (if I remember well), but mine has gone missing... Strange. How can I turn it back on? I found some information in the forum about Bonager, but is this the original automatic disk check software shipped with Ubuntu or another piece of software?
I want to create several virtual machines based on a minimal (no GUI) Ubuntu installation. I'm using VirtualBox (on Windows 7), the VMs are being created with 256MB RAM and using the Ubuntu Minimal CD Image [URL]. Because I want 4-5 of these virtual machines I want to use minimal disk space for storage too, which means restricting the virtual hard disk size for each. My first attempt was to limit it to 300MB, but when I got to the partitioning section of the installer it would not allow me to do automatic partitioning and forced me to do manual partitioning, it did moan about the size of the disk.
So I started again with a 1GB virtual hard disk, this time the installer was quite happy to do the automatic partitioning. My question is how small can I make my virtual hard disk without having to do manual partitioning? I don't have a problem with doing the partitioning manually but for easiness I just want to do it automatically and find it strange the acceptable size isn't mentioned anywhere (that I could find).
1. Make a disk image of my 9.10 system (formatted ext3, btw) on my Synology CS407 NAS so I can do a bare metal restore. Why is this a couple of clicks on my Mac and Windows boxes, but so far not easy on Jaunty? Did I miss something?
2. Drivers. Why can't I just have an automatic wrapper for Windows drivers so I can use any printer or scanner, or a simple point and click driver install for native drivers? I have my ethernet connected Brother MFC-7820N, and the Samsung CLP-315 that runs off my CS407 installed and working on my Jaunty, but it was way more work than expected. What is the easy, automatic or point and click way to install drivers?
3. Graphics drivers. I have decent cards in my big boxes, Nvidia GTX 200 series. But when I get kernel updates, I have to uninstall and reinstall the graphics driver. Is there an easy way to keep this working?
4. Is there one flavor of linux distro that has a really consistent standard for user interface? I like to be able move things around, but do like my menus to be consistent (and do I ever hate the MS ribbon!). I've really only tried Ubuntu.
Linux installs have come a long, long way from the old days, and are such a point and click operation that I just wonder what I'm doing wrong. Someone is bound to have sorted these things.
I have BackTrack 4 on VMware Player... I have an Intel (R) WiFi 5100 AGN wireless card, and when I type airmon-ng there is nothing shown under interface... it's empty... I downloaded a driver from here [URL] and I have been told I need a kernel rebuild... I have kernel 184.108.40.206, so how can I rebuild it?
I am trying to install some patches and drivers needed for a wifi card, but im getting an error that says: "build your kernel with CONFIG_LIBIPW=m." How can I recompile the kernel to add that? And can I do it without having to download a new kernel package? (i mean recompiling the existing kernels)
Has anybody ever used Disk Utility to set up software RAID? Here I am running terminal commands (I'm a terminal junkie) and I just happen to stumble across instructions that indicate "Or you can just set it up through Disk Utility."
Sure enough in disk utility, it looks like all of the configurable options are there. It makes me wonder, though... is this kind of GUI functionality something that isn't really solid? Or does it operate predictably and effectively?
Can anyone point me in the direction of a step-by-step guide to setting up a Raid 5 using the Disk Utility and 3 spare drives? I have the main OS files on an 80gig drive and I would like to mount the 3 drives as Raid 5. Just shooting in the dark now... Screenshot is attached. [URL]...
I have a Red Hat Enterprise (AS) 4.8 system and I need to know how to totally rebuild the system from dump tape. I have been making some full level 0 dumps of the system to the attached DAT72 tape drive... In the case the boot disk goes south, I need to reload from tape, onto a new disk drive. I know how to do this in Solaris. I assume you boot from CD to like a mini-root, then configure and mount the drive on temp mount points, restore the sys data, then load the "boot blocks" (like installboot on solaris).
I am working on: SUSE Linux Enterprise Server 10 (ia64) VERSION = 10 PATCHLEVEL = 2
Example 1: If RAID is enabled on the disk, then the information in cat /proc/partitions is shown as:
104 0 35532720 cciss/c0d0
104 1 2104483 cciss/c0d0p1
104 2 33423232 cciss/c0d0p2
104 16 35532720 cciss/c0d1
104 32 35532720 cciss/c0d2
Example 2: If a normal disk is used, the output is as follows:
8 0 35566480 sda
8 1 102383 sda1
8 2 409600 sda2
8 3 210925 sda3
8 4 2104515 sda4
8 5 32739023 sda5
For my application I need to find out the disk information. How can I get disk information when my system is RAID-enabled (cat /proc/partitions shows the controller entries, as in Example 1)? Can we get the partition information of the disks when the disks are connected to the system through a controller?
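One possible approach is to read the name column of /proc/partitions and tell whole disks from partitions by pattern. The sketch below assumes only the two naming schemes shown in the examples (cciss/cXdY with partitions as cXdYpZ, and sdX with partitions as sdXN); `list_whole_disks` is a made-up helper name and other drivers would need their own patterns.

```shell
# Hypothetical helper: reads /proc/partitions-style lines on stdin
# (major minor #blocks name) and prints the names of whole disks only.
list_whole_disks() {
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ {
        name = $4
        if (name ~ /^cciss\/c[0-9]+d[0-9]+$/) print name   # controller logical drive
        else if (name ~ /^sd[a-z]+$/)         print name   # plain SCSI/SATA disk
    }'
}

# Sample input taken from the two examples in the post.
sample='104 0 35532720 cciss/c0d0
104 1 2104483 cciss/c0d0p1
104 2 33423232 cciss/c0d0p2
104 16 35532720 cciss/c0d1
104 32 35532720 cciss/c0d2
8 0 35566480 sda
8 1 102383 sda1'

printf '%s\n' "$sample" | list_whole_disks
```

On a real system the function would be run as `list_whole_disks < /proc/partitions`. The header line is skipped because its first field is not numeric.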
I have a system with data stored in multiple disk arrays. I have to come up with a solution that will maintain the disk order of the arrays whenever a stripe fails, is removed and then put back in. One solution I came up with was to stamp every stripe with the disk array it belongs to along with its stripe id. I plan to put this stamp in the last 512 KB of each disk. And I maintain all this information in a sqlite database, that is disk array, stripe id, the software diskname, etc. So that whenever a disk is replaced, its stamp could be read and the corresponding entries in the database are updated.
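The location of such a stamp follows directly from the disk size. The sketch below only does the offset arithmetic for a stamp region in the last 512 KB of a disk; `stamp_offset` and `STAMP_BYTES` are made-up names, and the disk size is assumed to be known in bytes (for example from `blockdev --getsize64`).

```shell
# Size of the stamp region described in the post: the last 512 KB.
STAMP_BYTES=$(( 512 * 1024 ))

# Hypothetical helper: $1 = disk size in bytes.
# Prints the byte offset at which the stamp region starts.
stamp_offset() {
    disk_bytes=$1
    if [ "$disk_bytes" -lt "$STAMP_BYTES" ]; then
        echo "disk is smaller than the stamp region" >&2
        return 1
    fi
    echo $(( disk_bytes - STAMP_BYTES ))
}

stamp_offset 1000204886016   # size of a nominal 1 TB disk, in bytes
```

One thing to check with this layout is that nothing else claims the end of the disk, since some partition table and RAID metadata formats also store data there.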