Ubuntu :: Server Was Used To Boot (after MDADM Raid Rebuild)
Feb 1, 2011
My home-backup server, with 8*2TB disks won't boot anymore. Two disks failed at the same time and i rebuilt the raid 6 array without any problem, but now i can't boot the os. I'm using ubuntu server, 10.10. I've made screens of the displays to don't copy everything here. The problem at the boot:
And the Grub config: It's not a production server, but i would like to have it online. I've tried for the lasts 2 days (just a couple hours a day) but without success. I was suggested to do "mount -o remount,rw /" and than edit /etc/fstab, but it get the file don't exist error.
Its from a Synology Box with 3 disks, which one is damaged. But this disk wasnt in use.Take a look on the raid-size of 493 GB - and the both available disks with 250GB..) On the others there were a linear raid. during this damaged disk the synology-device tells me, that the volume was crashed.But it look like, that this disk was not mounted into this volume.Quote:
DiskStation> mdadm --detail /dev/md2 /dev/md2: Version : 00.90
I wonder how to attach new sata hard disk to software array where are two disk and one is crashed (this is a mirroring mode=Raid 1).Situation like this:I unpluged crashed disk and I buy the similar one and plug in What Next should I do?
My only goal is to have a raid-5 that auto-assembles and auto-mounts. Hardware: 4*2TB sata (raid disks), 1*500GB IDE (OS disk), 1*DVD IDE all plugged direct into the motherboard (nForce 750i SLI).
Starting partitions on the raid disks: gpt ext4 The problem occurs when I restart my comp after building it for the first time. I am able to see it assemble, I am able to partition it, I even mounted it Once.This is the second time I've built it so I have watched everything that happened. I don't know if this has anything to do with my problem, but when I created the raid my drive designations were: sda - 500GB(OS), sd[bcde] - 2TB(raid). When I restarted: sd[abcd] - 2TB(raid), sde - 500GB(OS).
1/4 of my drives died after about 3 years of usage. I replaced it with an identical drive and did a mdadm -add to re-add it to the array. I expected this to take quite a long time, but not more than 1 million minutes to complete!
I am running a 14 disk RAID 6 on mdadm behind 2 LSI SAS2008's in JBOD mode (no HW raid) on Debian 7 in BIOS legacy mode.
Grub2 is dropping to a rescue shell complaining that "no such device" exists for "mduuid/b1c40379914e5d18dddb893b4dc5a28f".
Output from mdadm: Code: Select all # mdadm -D /dev/md0 /dev/md0: Version : 1.2 Creation Time : Wed Nov 7 17:06:02 2012 Raid Level : raid6 Array Size : 35160446976 (33531.62 GiB 36004.30 GB) Used Dev Size : 2930037248 (2794.30 GiB 3000.36 GB) Raid Devices : 14
Output from blkid: Code: Select all # blkid /dev/md0: UUID="2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb" TYPE="xfs" /dev/md/0: UUID="2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb" TYPE="xfs" /dev/sdd2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="09a00673-c9c1-dc15-b792-f0226016a8a6" LABEL="media:0" TYPE="linux_raid_member"
The UUID for md0 is `2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb` so I do not understand why grub insists on looking for `b1c40379914e5d18dddb893b4dc5a28f`.
**Here is the output from `bootinfoscript` 0.61. This contains alot of detailed information, and I couldn't find anything wrong with any of it: [URL] .....
During the grub rescue an `ls` shows the member disks and also shows `(md/0)` but if I try an `ls (md/0)` I get an unknown disk error. Trying an `ls` on any member device results in unknown filesystem. The filesystem on the md0 is XFS, and I assume the unknown filesystem is normal if its trying to read an individual disk instead of md0.
I have come close to losing my mind over this, I've tried uninstalling and reinstalling grub numerous times, `update-initramfs -u -k all` numerous times, `update-grub` numerous times, `grub-install` numerous times to all member disks without error, etc.
I even tried manually editing `grub.cfg` to replace all instances of `mduuid/b1c40379914e5d18dddb893b4dc5a28f` with `(md/0)` and then re-install grub, but the exact same error of no such device mduuid/b1c40379914e5d18dddb893b4dc5a28f still happened.
One thing I noticed is it is only showing half the disks. I am not sure if this matters or is important or not, but one theory would be because there are two LSI cards physically in the machine.
This last screenshot was shown after I specifically altered grub.cfg to replace all instances of `mduuid/b1c40379914e5d18dddb893b4dc5a28f` with `mduuid/2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb` and then re-ran grub-install on all member drives. Where it is getting this old b1c* address I have no clue.
I even tried installing a SATA drive on /dev/sda, outside of the array, and installing grub on it and booting from it. Still, same identical error.
I have been having this problem for the past couple days and have done my best to solve it, but to no avail. I am using mdadm, which I'm not the most experienced in, to make a raid5 array using three separate disks (dev/sda, dev/sdc, dev/sdd). For some reason not all three drives are being assembled at boot, but I can add the missing array without any problems later, its just that this takes hours to sync. Here is some information:
I have been having some odd issues over the last day or so while trying to get a raid 5 array running in software under Kubuntu. I installed 3 1TB drives and started up, my sd* order got all messed up( sda was now sdc and so on). This wasn't entirely unexpected, so I fixed up fstab and booted again. I found all three of the drives I installed, set them to raid auto-detect and used mdadm to create /dev/md0. I then created mdadm.conf by piping the output of mdadm --detail --scan --verbose into /etc/mdadm.conf.At this point, everything was still going swimmingly. I copied over a few hundred GB of data from another failing drive and everything seemed ok. I went to reboot once the copy was done and everything just went weird. All of the sd* drives went back to the original. Of course, this meant that the mdadm.conf was wrong. I tried to just change the device list, but that didn't work. I then deleted mdadm.conf and rebooted. The drive list stayed in the original order this time, so I just tried manually starting the array.
By erasing the partition table of the 3rd drive, I've been able to get it to the status of spare, but it says it is busy when I try to add it to the array. A grep through dmesg makes me think that md has a lock on it. I'm not sure where to go with it now. If anyone has any pointers, I would like to hear them.
I have an Linksys NSLU2 with four usb harddrives attached to it. One is for the os, the other three are setup as a RAID5 array. Yes, I know the raid will be slow, but the server is only for storage and will realistically get accessed once or twice a week at the most. I want the drives to spin down but mdadm is doing something in the background to access them. An lsof on the raid device returns nothing at all. The drive are blinking non-stop and never spin down until I stop the raid. Then they all spin down nicely after the appropriate time.
They are Western Digital My Book Essentials and will spin down by themselves if there is no access. What can I shutdown in mdadm to get it to stop continually accessing the drives? Is it the sync mechanism in the software raid that is doing this? I tried setting the monitor to --scan -1 to get to check the device just once, but to no avail. I even went back and formatted the raid with ext2 thinking maybe the journaling had something to do with it. There are no files on the raid device, it's empty.
mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdb1and I getmd1: raid array is not clean -- starting background reconstructionWhy is it not clean?Should I be worried?The HD is not new it has been used in before in a raid array but has beenrepartitionated.
I have 4 WD10EARS drives running in a RAID 5 array using MDADM.Yesterday my OS Drive failed. I have replaced this and installed a fresh copy of Ubuntu 11.04 on it. then installed MDADM, and rebooted the machine, hoping that it would automatically rebuild the array.It hasnt, when i look at the array using Disk Utility, it says that the array is not running. If i try to start the array it says :Error assembling array: mdadm exited with exit code 1: mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
mdadm: Not enough devices to start the array.I have tried MDADM --assemble --scan and it gives this output:mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.I know that there are 4 drives present as they are all showing, but it is only using 2 of them.I also ran MDADM -- detail /dev.md0 which gave:
root@warren-P5K-E:~# mdadm --detail /dev/md0 /dev/md0: Version : 0.90
When we assemble a raid array, from where does it load configuration information for that array? I thought it refers to /etc/mdadm.conf file, but in my system, mdadm.conf file doesn't even contain all information. Still it is able to successfully assemble previously created device.
I have a raid 5 array that appears to have died. I was just routinely looking at /var/log/messages and noticed that a drive in the array was complaining (via SMARTD).
This is the /home directory, and so is backed up, so it's not critical, but I'd like to get some things that changed after the last backup (the week before I noticed the failure)
Let me start by outlining what I know :
It's a 2TB array spread over three disks (mdadm software RAID5), here are the drives:
MDADM gives the drives :
Now, the array *WAS* up ok, but I umounted it. (in which /dev/md0 was mounted to /home) Yes,I know - I didn't want any changes being made to the array by anything - at least that was my thinking at the time. In hindsight... I would have killed any processes, locked out the server, backed up again and 'then' unmounted it.
But we are where we are, I'm sure there'll be time for recriminations later.
When I try to remount it, I get :
Ok - looks like it's lost the type - it's normally worked, maybe we'll give it a little hint - it's ext3 with a journal.
When I tell it it's an ext3, I get :
Now, before I go charging off specifying superblocks further along the disk, but I can't remember where they're stored.
Neither can I recall what the blocksize I originally created the array as (I have a feeling I specifed 4K, but I could be wrong).
debugfs is only telling me :
I should also point out that this server is hosted, so it's 150 miles away from me at the moment, so I can't just whip them out and dd a copy.
I have a strange issue with my RAID5 array - it worked fine for a month, a couple days ago it didn't start on boot with mdadm reporting "Input/Output error" - I didn't panic restarted my computer, same error. Then opened a Disk utility and it reported State: Not running, partially assembled - don't know why, I've pressed Stop RAID Array and started it again, voila - it reported State: Running - I've checked components list and there was nothing wrong with it. So I run Check Array utility, waited almost 3 hours to finish it and it worked since than, till today's morning - I've started my computer, and here we go, same error.
This is an initial state just after computer startup:
This is after I stop and start RAID5:
This is a components list:
I can see nothing wrong there yet not sure why mdadm fails on boot. I do not really like the windows solution I guess, when I check my array again, it will work fine again, but it then can fail in the same way without known reason.
We have some servers that run in very harsh environments (research vessel) that need to have high-availability.We have software RAID 1 for some measure of resiliency, along with proper data backups (tapes etc), however we would like to be able to break out a new server and re-image it (including RAID setup) from a known good copy if the hardware completely fails on the production box. Simplicity of the process is a big plus.I am interested in any advice on the best way to approach this. My current approach (relatively new to Linux administration, totally new to MDADM) is to use DD to take a complete gzipped copy of one of the RAID'ed devices (from a live CD): ode: dd if=/dev/sda bs=4096 | gzip -c > /mnt/external/image/test.img then reverse the process on the new PC, finally using Code:mdadm --assemble to re-create and re-build the array.
I am learning software raid 1 with centos 5.5. I created the raid with out any problems and removed the first drive to check there was no problems and it booted. I have installed the old drive back in the system as hdc and need to resync the drives (used old drive as partitions correct) I thought I could use raidhotadd but id does not seem to exist anymore. how I resync the drives in the array hda primary and hdc secondary using mdadm
This is message I get when I try and start itmdadm: /dev/md0 assembled from 2 drives - not enough to start the arrayBelow is the information I've collected about any help on how I can get the raid back up and going to I can get the data off of it would be awesome
I have an old Athlon XP 3000 machine that I keep around as a file server.It's currently got three 1TB drives which I had setup as mdadm raid 5 on FC10. The machine's original drive held the superblock for the raid array and it just had a massive heart attack. I've searched, my biggest source being URL...I can't tell if I can reassemble the superblock info lost with the original hard drive or if I've lost it all...
I have a raid5 on 10 disk, 750gb and it have worked fine with grub for a long time with ubuntu 10.04 lts. A couple of days ago I added a disk to the raid, growd it and then resized it.. BUT, I started the resize-process on a terminal on another computer, and after some time my girlfriend powered down that computer! So the resize process cancelled in the middle and i couldn't acess any of the HDDs so I rebooted the server.
Now the problem, the system is not booting up, simple black with a blinking line. Used a rescue CD to boot it up, finised the resize-process and the raid seems to be working fine so I tried to boot normal again. Same problem. Rescue cd, updated grub, got several errors: error: unsupported RAID version: 0.91. I have tried to purge grub, grub-pc, grub commmon, removed /boot/grub and installed grub again. Same problem.
I have tried to erased mbr (# dd if=/dev/null of=/dev/sdX bs=446 count=1) on sda (ide disk, system), sdb (sata, new raid disk). Same problem. Removed and reinstalled ubuntu 11.04 and is now getting error: no such device: (hdd id). Again tried to reinstall grub on both sda and sdb, no luck. update-grub is still generating error about raid id 0.91 and is back on a blinking line on normal boot. When you'r resizeing a raid MDADM changed the ID from 0.90 to 0.91 to prevent something that happend happened. But since I have completed the resize-process MDADM have indeed changed the ID back to 0.90 on all disks.
I have also tried to follow a howto on a similar problem with a patch on [URL] But I cant compile, various error about dpkg. So my problem is, I cant get grub to work. It just gives me a blinking line and unsupported RAID version: 0.91.
I've got a server running a software Raid for SATA disks on a P5E motherboard.
I had to had a lot of memory on thi sserver, then I have had to flash the bios. This has resetted the software raid on the disks, then when I boot, i got a Kernel Panic cause it doesn't find anything ...
how to rebuild the raid ? I can boot on a live cd, or anything else, but don't know how to do it without loosing my data.
I went to setup my linux box and found that the OS drive had finally died. It was an extremely old WD raptor drive in a hot box full of drives so it was really only a matter of time before it just quit on me. Normally this wouldn't be such a big deal however I had just recently constructed an md RAID5 array of 3 1TB disks to act as an NFS mount for basically all of my important files. Maybe 2-3 weeks before the failure I had finished moving all of my most important stuff onto that array. Now I know that the array is intact. All the required data is sitting on those disks. Since only the OS level disk failed on me I should be able to get a new disk in there, reinstall ubuntu and then rebuild that array. how exactly do I go about doing that with mdadm? Do I create the array from the /dev character devices like when I initially built the array?
I have never preformed a rebuild of an RAID array. I am collecting resources, which details how to build an RAID 5 array when one drive has failed. Does the BIOS on the RAID controller card start to rebuild the data on the new drive once it is installed?
I've cloned a machine by removing a HDD from a Raid mirror set putting it into another machine and powering on and fsck'ing a few times and everything is great apart from the ethernet port numbering has gone a bit wonky.
The cloned machine which must have a specific configuration using eth0,1,2,3 naming convention however when I boot the freshly cloned machine eth4,5,6,7 show up and eth0,1,2,3, are missing (as it detects the new machines ethernet ports)
Is there some configuration file I can delete, reboot the machine and then Fedora rebuilds the file using eth0,1,2,3, and populating with the correct hardware address.I need to clone this machine lots and lots and lots of times and the manual why I've figured out is a little long winded.
I have a 5 disk raid 5 array that is composed of SATA A:0,1; SATA B: 0,1, and SATA C:0, and one of the disks (SATA A:0) recently went bad on me. I have an ICP raid controller that is about 5 years old. I replaced SATA A:0. After rebooting, I went into the controller and verified that it saw the disk in the hard-disk info section...there I noticed that in the "status" section, that the SATA C:0, SATA B:1 disks were listed as being "in array", the SATA A disks were blank, and the SATA B:0 disk was listed as "fragment". When I go into the "repair array" section, the controller tells me that there are no arrays that are in failure, error, or need to be rebuilt.
This puzzles me, as I thought the controller would know that the array needs to be rebuilt after replacing the disk and I don't see a way to initiate a rebuild. If I just let the server boot after replacing the disk, then I get back that there are the correct number of disks in the raid 5 and that it is ready, however, the screen then goes blank and I get a blinking cursor and the system seems to hang. There are no activity lights on any of the drives associated with the raid 5, which makes me think that the system is not rebuilding the array at this point.
I have 4 WD10EARS drives running in a RAID 5 array using MDADM.Yesterday my OS Drive failed. I have replaced this and installed a fresh copy of Ubuntu 11.04 on it.I then installed MDADM, and rebooted the machine, hoping that it would automatically rebuild the array.It hasnt, when i look at the array using Disk Utility, it says that the array is not running. If i try to start the array it says : Error assembling array: mdadm exited with exit code 1: mdadm: failed to RUN_ARRAY /dev/md0: Input/output errormdadm: Notenough devices to start the array.I have tried MDADM --assemble --scan and it gives this output:mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.I know that there are 4 drives present as they are all showing, but it is only using 2 of them.I also ran MDADM -- detail /dev.md0 which gave:
root@warren-P5K-E:~# mdadm --detail /dev/md0 /dev/md0: Version : 0.90
I have done lots of searching and I haven't been able to find anyone else with the same problem. Whenever I create a RAID with 'mdadm', regardless of level (I've done linear, 0, and 5) the command I use is: