Ubuntu Servers :: 10.04 Server RAID5 LVM Failing After Reboot Due To It Using The /dev/sd* Designations
Dec 5, 2010
I've been a light Linux user for the last couple of years and I decided to build an HTPC/NAS device.
40gb ide -> usb boot drive
3x2tb sata (4k Sector) drives
I've got another identical 2TB drive, but it's holding data that will be copied to the RAID once it's up and running; the drive will then be 'grown' into the array to yield a final 5+TB array. I tried building the array with Disk Utility, and it failed after a reboot because it used the /dev/sd* designations and they swapped. I have no idea how to do the UUID version; my Google-fu and my practical guide to Ubuntu haven't helped. So I decided to do it manually, also to fix the sector issue, as Disk Utility wasn't formatting the drives correctly and, once they were formatted, wouldn't let me create a RAID array from the disks.
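On the "UUID version" mentioned above: mdadm identifies an array by the UUID stored in its superblocks, and `mdadm --detail --scan` prints one ARRAY line per array that can go into mdadm.conf, so that assembly no longer depends on which /dev/sd* name each disk receives. Below is a minimal Python sketch that pulls the UUID out of such output. The sample line, host name and UUID are invented for illustration, and the helper names are hypothetical.

```python
# Hypothetical sample of `mdadm --detail --scan` output; the UUID is invented.
SAMPLE_SCAN = (
    "ARRAY /dev/md0 metadata=1.2 name=htpc:0 "
    "UUID=0a1b2c3d:4e5f6a7b:8c9d0e1f:2a3b4c5d\n"
)


def parse_array_lines(scan_output):
    """Return {array_device: uuid} for every ARRAY line that carries a UUID."""
    arrays = {}
    for line in scan_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "ARRAY":
            continue
        for field in fields[2:]:
            if field.startswith("UUID="):
                arrays[fields[1]] = field[len("UUID="):]
    return arrays


def conf_line(device, uuid):
    """Build an mdadm.conf ARRAY line that identifies the array by UUID only."""
    return f"ARRAY {device} UUID={uuid}"


if __name__ == "__main__":
    for dev, uuid in parse_array_lines(SAMPLE_SCAN).items():
        print(conf_line(dev, uuid))
```

On Ubuntu the file is normally /etc/mdadm/mdadm.conf, and the initramfs usually has to be regenerated (update-initramfs -u) before a change there takes effect at boot.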
I am trying to connect one of our RHEL 5.4 servers to IBM iSCSI storage. The server is equipped with two single-port QLogic iSCSI HBAs (TOE). RHEL detected the HBAs and installed the driver itself (qla3XXX). I have configured the HBA IP addresses in the range of the storage's iSCSI host ports. The two HBAs connect to the two different controllers of the storage. I discovered the storage using the iscsiadm -m discovery command for both controllers, and it went through fine. The problem is that whenever the server restarts with both HBAs connected to the storage, it does not detect the volumes mapped to it, and I need to run the "mppBusRescan" and "vgscan" commands each time to detect them. If only one path is connected, it is fine.
I've got a new Ubuntu 10.04 server install with a new 3 disk RAID 5. The boot disk is separate, not part of the RAID. I was trying to practice what I would do to recover the RAID if a disk died, so I unplugged one of the three disks. The machine now just hangs on startup. It shows fsck at the top of the screen but doesn't go anywhere from there. If you press a key it shows the Ubuntu splash screen. If I plug the disk back in, everything boots up normally. So, my question is, how do I get the machine to boot with one of the RAID members missing? I know I can recover it using the Live CD, but it would be nice to be able to get back into the machine without the CD.
I'm running openSUSE 11.4 and have set up 2 x RAID5 configs: RAID created, disks formatted, everything working fine. I've just rebooted and the RAID5 fails to initialize.
Getting these errors:
Code:
May 29 16:58:30 suse kernel: [ 1788.170692] md: md0 stopped.
May 29 16:58:30 suse kernel: [ 1788.197864] md: invalid superblock checksum on sdb1
May 29 16:58:30 suse kernel: [ 1788.197876] md: sdb1 does not have a valid v0.90 superblock, not importing!
I am getting really frustrated with trying to get my RAID5 working again. I had a RAID5 array built with 4 of the Western Digital 1.5TB "Advanced Format" drives, WD15EARS. However, when copying 1.5GB DVD-encoded files to the array, I was getting speeds of ~2MB/s. When researching how to make this faster, I came across all the posts about the Advanced Format drives and how they were causing a lot of issues for a lot of people. It looked like the solution was simple enough: partition starting at sector 64 or 2048 or whatever and then recreate the RAID. However, this is not working for me.
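The reasoning behind "start at sector 64 or 2048" is plain arithmetic: Advanced Format drives use 4096-byte physical sectors while reporting 512-byte logical sectors, so a partition is aligned only if its starting byte offset is a multiple of 4096. The old default start of sector 63 is not. A small Python sketch of that check follows; the function name is hypothetical and the sector sizes are the usual ones for these drives, assumed rather than read from the disk.

```python
def is_aligned(start_sector, logical_bytes=512, physical_bytes=4096):
    """True if a partition starting at start_sector begins on a physical-sector boundary."""
    return (start_sector * logical_bytes) % physical_bytes == 0


if __name__ == "__main__":
    # 63 is the traditional DOS-style start; 64 and 2048 are the commonly suggested fixes.
    for start in (63, 64, 2048):
        print(start, "aligned" if is_aligned(start) else "misaligned")
```

The start sector actually in use can be read from the partition table (for example with fdisk -lu, which lists sectors) and passed to the function.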
Here are my computer specs:
Motherboard: Gigabyte GA-EP43-DS3L LGA 775 Intel P43 ATX
CPU: Intel Core 2 Duo E8400 Wolfdale 3.0GHz 6MB L2 Cache LGA 775 65W
RAM: 4gb DDR2 1066 (PC2 8500)
Video card: ASUS GeForce 9600GT 512MB 256-bit
Linux: 10.04
I have no drive failures but just need to recreate a RAID5 set as the next free MD device number. Originally I built a temporary Debian OS on a single drive and had 4x2TB drives in a software RAID5 array (MD0). This worked fine and allowed me to move all the data to it and remove our old fileserver. I have now pulled out the 4x2TB RAID5 drives and created a new OS on two new 80GB drives, partitioned as follows:
MD0 is now 250MB RAID1 as /boot
MD1 is 4GB RAID1 swap
MD2 is 76GB RAID1 as /
If I power off and push the 4x2TB drives back in, I cannot see an MD3. I presume I would need to create an MD3 from these 4 drives, but I don't want to mess things up as it's live data. So I'm here asking for help, or a bit of hand-holding, to get it done right.
PS - It's a Debian Lenny 5.0.3 RAID1 fresh install replacing a Debian Lenny 5.0.3 on a single disk.
Suddenly I noticed that all my file systems had gone into read-only mode. My first thought was that the SATA data cable had come loose on one of the drives, but that wasn't it. All cables were connected correctly. So I booted up again, but I only got as far as a rescue mode terminal.
I have four software MD raid volumes:
Running mdadm -D on the volumes told me that the sdc drive had been kicked out from both md0 and md1. However, md3 had kicked out two drives, so I couldn't get any information from mdadm -D on that. For md0 and md1 I could just add the kicked-out partitions back into the volume, but for md3 I don't even know which partitions got kicked out...
Here are some outputs:
Before I rebooted the first time I saved the 200 last rows of dmesg to a memory stick. Here they are:
Trying to restart the md3 volume in the rescue mode terminal:
The "Array State" row seems interesting. I guess that AAAA means all four drives are OK. But then why does the array state differ between the members?
Does anyone know how to figure out which two members got kicked out? And how do I get them back in (assuming that they're OK)?
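On why the "Array State" row can differ between members: each member carries its own copy of the superblock, and a member that has been kicked out stops receiving updates, so it keeps an older event count and an older view of the array. Comparing the "Events" values reported by mdadm --examine is therefore one way to spot which members fell behind. Below is a rough Python sketch of that comparison. The sample output is invented and heavily trimmed, the device names are hypothetical, and integer event counts (as printed for v1.x superblocks) are assumed.

```python
import re

# Hypothetical, heavily trimmed `mdadm --examine` output for four members.
SAMPLE_EXAMINE = """\
/dev/sda4:
         Events : 1804
    Array State : A..A ('A' == active, '.' == missing)
/dev/sdb4:
         Events : 1790
    Array State : AA.A ('A' == active, '.' == missing)
/dev/sdc4:
         Events : 1788
    Array State : AAAA ('A' == active, '.' == missing)
/dev/sdd4:
         Events : 1804
    Array State : A..A ('A' == active, '.' == missing)
"""


def parse_examine(text):
    """Map each member device to its event count and recorded array state."""
    members = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            current = header.group(1)
            members[current] = {}
            continue
        if current is None or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Events":
            members[current]["events"] = int(value)
        elif key == "Array State":
            members[current]["state"] = value.split()[0]
    return members


def stale_members(members):
    """Return devices whose event count is below the highest one seen."""
    counted = {dev: info["events"] for dev, info in members.items() if "events" in info}
    newest = max(counted.values())
    return sorted(dev for dev, events in counted.items() if events < newest)


if __name__ == "__main__":
    print(stale_members(parse_examine(SAMPLE_EXAMINE)))
```

In this made-up sample the two members with the lower event counts are the ones that dropped out, and the members with the highest count hold the most recent view of the array.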
I've got a RAID5 array that doesn't want to automount after rebooting. I'm pretty familiar with Linux, RAID, and mdadm, and up until now, I've had the RAID5 array working just fine. However, whenever I reboot, the array drops off and won't remount until I manually assemble and then mount the thing. I find this odd because I had everything automounting just fine back in 10.3, and even in 11.0 (I think - not sure on that). Currently, things are working, but I'd really like to not have to type
Code:
mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
followed by
Code:
mount /dev/md0 /mnt/data
every time I reboot. Even including this in some sort of start-up script seems kludgey... Surely there must be a more elegant way of automatically bringing up a RAID5 array after booting? I'm not sure what information you'll need, so I'm going to go ahead and include as much as I can anticipate...
Well, this new issue when installing from the new Lucid release ISO seems to be closely related to my previous bug report about suspend and hibernate failing to power off, freezing at the end of suspend/hibernate preparation. Here is what happens: installing Ubuntu Lucid 64-bit, all goes perfectly. Finally I get the window asking me to reboot. I click reboot. The window finishes its work and disappears. That's it. The screen is frozen with its purple background. Neither Ctrl+Alt+Del nor anything else works, only the reset button. So it failed to reboot, just as it fails to power off for hibernate or suspend. On the same machine this does not happen with Karmic; it is solely an Ubuntu Lucid problem. Any suggestions on how to track this down? P.S. In my original thread and bug report I provided a pm-suspend.log that went fine until the last statement, "powering off".
I installed 11.04 server and had Samba share /tmp (as advised by the server PDF doc) to my Windows 7 laptop, which was all well and good, so I copied some files to it and rebooted the server, and they had been removed. I guess I shouldn't have put anything in /tmp, as I presume it is cleared on reboot, so why did the documentation advise creating a /tmp share?
OK, so I followed the instructions here [url] and this works great for the install; however, if the machine is rebooted, VMware Server refuses to start back up, stating that it knows that it's installed but that it was not installed with the right installer.
This is on a Dell server; I can't remember the model right now, but it's got dual PIIIs in it. I'm running Ubuntu Server 32-bit 10.10 on the box as well. Thank you in advance for your assistance with this. Once I get this first server figured out I'll get my other one fired back up.
I'm writing to ask for some help with administering a server remotely. I have a machine I use remotely when I have to travel, sometimes for quite long periods, from one to three months. Last time, after upgrading, I sent the reboot command and the machine didn't shut down, so I could no longer access it. My question is: how can I avoid such situations? Is there any best practice to follow?
After a couple of days, some commands related to the disk (df), files (ls), or killing processes (kill -9) don't respond. I can't even reboot or shut down my server. After a hard reboot, some files are no longer there, or the log files are no longer written to until I restart. My disks are behind the RAID controller i6 and are configured in RAID 1. The disks are two HP SCSI 72.8GB 10k RPM drives. Maybe I am totally wrong to look at the disk access side, so I am open to other explanations. I can also add that my CPU is running under 1% and my RAM under 10%.
3 new 1.5TB HD. 1 used 1.5TB hd with 980MB of data. I want to set up a raid 5 with a hot spare. I have music, pictures, videos, and movies (About 2.8TB worth). I have had a mismatch of drives previously, 250GB, 2 320GB, 500GB, 2 1TB and now a 1.5TB all with data. I have removed the one 250 and 2 320s and put the data on the 1.5TB that is currently installed.
What I would like to do is create a RAID5 with the three new 1.5TB HDs, copy the data over from the currently installed 1.5TB drive, and then either grow the array onto that drive or add it as a hot spare. Or just add it and then add another 1.5TB drive down the road as a hot spare; I don't know for sure.
In addition, since I have two 1TB drives, I could add two more (good deals on 1TB drives right now) and have a total of four 1TB drives. Could I have two RAID5s (4x 1TB and 4x 1.5TB) in two separate arrays? I really do not know if that makes sense or not, but here comes LVM. I am tired of managing my HD space, since I have multiple folders (movies, music, pictures, videos), and within the movies folder I have R, G & PG folders for the movie ratings (with the R folder password-protected so the kids can't get to it). So with LVM installed with the RAID5 I should be able to create my folders and just keep adding data and not worry about moving folders around when I grow the storage by adding new drives. Is that correct? Maybe someone could point me to a how-to.
Also, if I create 2 arrays (And I need to know so I can order the 2 additional 1TB drives), then I could put all the music, G and PG content on the one array and all the R and spicy stuff on the other and password protect it.
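For sizing the arrays discussed above, the RAID5 rule of thumb is that one drive's worth of space goes to parity, so usable capacity is (active drives - 1) x drive size, and a hot spare adds nothing until it is pulled into the array. A small Python sketch of that arithmetic follows; the function name is hypothetical, sizes are nominal TB, and filesystem overhead is ignored.

```python
def raid5_usable(drives, drive_size_tb, spares=0):
    """Usable capacity in TB: one active drive's worth is parity; spares add nothing."""
    active = drives - spares
    if active < 3:
        raise ValueError("RAID5 needs at least three active drives")
    return (active - 1) * drive_size_tb


if __name__ == "__main__":
    print(raid5_usable(3, 1.5))             # three new 1.5TB drives
    print(raid5_usable(4, 1.5, spares=1))   # fourth drive kept as a hot spare
    print(raid5_usable(4, 1.5))             # fourth drive grown into the array
    print(raid5_usable(4, 1.0))             # separate array of four 1TB drives
```

By this estimate, three 1.5TB drives give about 3TB usable, which leaves very little headroom for roughly 2.8TB of data; growing the array onto the fourth drive instead of keeping it as a spare gives about 4.5TB.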
I recently installed a new home backup server with Ubuntu 9.10 x86_64 using the alternate CD. I used the CD's installer to partition my disk and created a software RAID 5 array on 4 disks with no spares. The root file system is located outside the raid array.
At first the array performed nicely but as it started to fill up, the io performance dropped significantly to the point where I get a transfer rate of 1-2MB/s when writing!
I created my own file server/NAS but ran into a problem after a couple of months. I have a server with 4x 1.5TB disks, all connected to SATA ports, and one 40GB ATA133 disk running Ubuntu 9.10 x64 (AMD). I created a RAID5 array using mdadm. It all worked great for a couple of months, but lately the RAID5 array keeps becoming degraded: disk sdd1 is marked faulty every few days. I have checked the drive and it is fine. If I re-add the disk and wait 6 hours, my RAID5 array is fine again, but after a few shutdowns it is degraded again.
my mdadm detail:
root@ubuntu: sudo mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Mon Dec 14 13:00:43 2009
Raid Level : raid5
I have ubuntu server 10.04 on a server with 2.8ghz 1gb ddr2 with the os on a 2gb cf card attached to the IDE channel and a software raid5 with 4 x 750gb drives. On a samba share using these drives I am only getting around 5 MB/s connected via wireless N at 216mbps and my router and server both having gigabit ports. Is a raid 5 supposed to be that slow? I was seeing speeds of anywhere from 20-50MB/s from other people and am just wondering what i am doing wrong to be so far below that.
I've built a server with (intentionally) very low-power components. The motherboard uses a Via C3 CPU running at 700MHz. The server has 512MB of RAM and I'm running 8.04 Server Edition (no GUI). This is purely a file server - not a lot of daemons started (except the defaults) -- no web server, etc. Just NFS, Samba and Open SSH (for remote administration). I'm not sure how much free RAM it has (it's down at the moment).
Is the RAM/CPU going to be inadequate for running software RAID5? I've done some big rsyncs and even without RAID, this thing is pretty slow. I'm not terribly concerned about the write speed, but if the read performance is going to be inadequate for playing (not streaming - just playing) a 720p MKV movie over my LAN, then I need to rethink this.
As the title says, I have a failed RAID5 hard drive. What's the easiest way to go about replacing it? I've seen many ways to do this, but I would like to know what other people are saying about this, and see how you would do it.
My fileserver initially had 3 1TB drives in RAID 5 configured with mdadm as /dev/md1. (System root is a mirrored RAID on /dev/md0.) I went to add a 4th 1TB drive to /dev/md1 and grow the RAID 5 accordingly. I was initially following this guide: [URL] but ran into issues on the 3rd and 4th commands. I've been trying a few things to remedy the issue since, but no luck. The drive seems to have been added to /dev/md1 properly, but I can't get the filesystem to resize to 3TB. I am also not entirely sure how /dev/md1p1 got created, but it appears to be the primary partition on the logical device /dev/md1. Relevant information:
The filesystem originated as ext3; I believe it's showing up as ext2 in some of these results because I disabled the journal when doing some initial troubleshooting. Not sure what the issue is, but I didn't want to blindly perform operations on the filesystem and risk losing my data.
I also get sent to a Busybox (initramfs) shell with no text editor and don't know how to copy all the error messages and post them here. If there is a way, let me know. I've typed it out in the meantime:
Code:
md0 : inactive sdxxxx
Attempting to start the RAID in degraded mode...
mdadm: CREATE user root not found
mdadm: CREATE group disk not found
This is with a 3 disk RAID5 array. I turned off the system, pulled out a drive, and started it back up. Fresh install, all I've done so far is apt-get update and upgrade.
I have a nice little Ubuntu server with 6x 1TB drives assembled into a RAID5 array. Recently the SATA cable of one of the drives failed. So I ordered a new cable and ran the server in degraded mode for a few days. Like this:
Code:
/dev/md0:
Version : 00.90
Creation Time : Sat Sep 19 10:39:11 2009
Raid Level : raid5
Array Size : 4883812480 (4657.57 GiB 5001.02 GB)
code....
I'd like the 6th drive to be active, not spare, like before. Should I just wait for rebuild to be finished (it can easily take over 1 day)? Or should I add it somehow differently to be active immediately?
I'm not sure, but I think that when I simulated failures by unplugging one of the disks, the "failed" drive was active again after I plugged it back in, and rebuilding was started as well, of course. But it was 2 years ago, so...
The array works just fine for now - I can access files, etc. But I suspect that in this state, if another cable or drive fails, it won't survive. And the same goes after the rebuild is finished, if the 6th drive is still marked as "spare". Right?
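As far as I know, a drive added to a degraded array is listed as a spare only while recovery runs and becomes an active member once the rebuild completes; until then the array has no redundancy. /proc/mdstat shows this directly: "[6/5]" means six members expected and five working, and "[UUUUU_]" marks the missing slot. The Python sketch below reads those fields. The sample text is invented (only the block count matches the Array Size quoted above), and the function name is hypothetical.

```python
import re

# Hypothetical /proc/mdstat excerpt for a 6-drive RAID5 that is rebuilding.
SAMPLE_MDSTAT = """\
md0 : active raid5 sdf1[6] sda1[0] sdb1[1] sdc1[2] sdd1[3] sde1[4]
      4883812480 blocks level 5, 64k chunk, algorithm 2 [6/5] [UUUUU_]
      [>....................]  recovery =  1.2% (12345678/976762496) finish=1234.5min speed=12345K/sec

unused devices: <none>
"""


def mdstat_status(text, array="md0"):
    """Summarise member counts and recovery progress for one array, or None if absent."""
    block = []
    capture = False
    for line in text.splitlines():
        if line.startswith(array + " :"):
            capture = True
        elif capture and not line.strip():
            break  # a blank line ends this array's block
        if capture:
            block.append(line)
    joined = "\n".join(block)
    slots = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", joined)
    if slots is None:
        return None
    recovery = re.search(r"recovery\s*=\s*([\d.]+)%", joined)
    return {
        "expected": int(slots.group(1)),
        "working": int(slots.group(2)),
        "degraded": "_" in slots.group(3),
        "recovery_percent": float(recovery.group(1)) if recovery else None,
    }


if __name__ == "__main__":
    print(mdstat_status(SAMPLE_MDSTAT))
```

Once the rebuild has finished, the same check should report six working members and no "_" slot; if it still shows a degraded array at that point, something else is wrong.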
I am trying to build a file server with RAID 5 over a couple of 1TB HDDs, to serve about 10 client machines using Ubuntu Server. I already own a 22-port switch: HP ProCurve v1810G-24 Switch (J9450A), which I am assuming will do the job. And for the actual server I am thinking of buying: HP ProLiant DL120 1U. Will this hardware suffice, or am I missing something important to get the whole thing running?
Everyone who deals with Linux knows that partitions on hard drives are designated as "sdx#", i.e., sda1 sdb2, etc. I know through experimentation that the number portion of the designation is assigned not according to order on the disk, but chronologically in the order they are created.
Further, if you have several partitions on the disk-say, sda1 through sda3-and you delete sda2, the designation of sda1 will remain the same, but sda3 will become the new sda2. The creation of any further partitions on the drive will start with designation sda3 and increment from that point.
At times this creates a conundrum, especially concerning bootable partitions. Some time back I rendered a partition containing OpenSUSE unbootable because of this, even though Ubuntu owned the GRUB bootloader in the MBR. Ubuntu's GRUB could find and point to the partition using the command "sudo update-grub", but when OpenSUSE took over the boot-up process, its GRUB was pointed to the wrong partition and would freeze up.
My question is this:
Under Windows, one is able to make a Drive letter persistent. Windows will keep the drive letter for that partition and assign around it. Is there a way to change a drive designation number, or at least make it persistent, under Linux? It would be a handy method to forestall these types of booting problems, among other things.
Presently, when a person has installed Linux side-by-side with Windows and wants to delete the Windows partition and expand the Linux partition into the free space, I tell them to format the partition, then shrink it to next to nothing instead of deleting it. This preserves the partition ID scheme while giving them the space to expand their Linux partition into... especially helpful with a seasoned Linux installation that would be a PITA to reinstall and set back up.
Oh, and I already know about UUID. This article explains it, but if you look down through the comments, you will see reasons that it is problematic for desktop application and usage. I want to make it as simple as possible for new Linux users (and myself!).
Something weird happened last night and my RAID5 failed. I am trying to reactivate it and see whether my data is dead or not. When I run mdadm -Asv /dev/md0 I get
Code:
mdadm: looking for devices for /dev/md0
mdadm: cannot open device /dev/dm-1: Device or resource busy
mdadm: /dev/dm-1 has wrong uuid.
mdadm: cannot open device /dev/dm-0: Device or resource busy
mdadm: /dev/dm-0 has wrong uuid.
mdadm: cannot open device /dev/sde2: Device or resource busy
mdadm: /dev/sde2 has wrong uuid.
mdadm: cannot open device /dev/sde1: Device or resource busy
mdadm: /dev/sde1 has wrong uuid.
mdadm: cannot open device /dev/sde: Device or resource busy
mdadm: /dev/sde has wrong uuid.
mdadm: cannot open device /dev/sdd: Device or resource busy
mdadm: /dev/sdd has wrong uuid.
mdadm: cannot open device /dev/sdc: Device or resource busy
mdadm: /dev/sdc has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda has wrong uuid.
I opened GParted to create a new partition on a new drive. It wanted me to create a partition table first, which I did, and the table was created immediately, without any of the prompts I'm used to seeing when creating a partition. So I realized too late that I had actually created an MBR on one of my six 1TB RAID5 drives. Not being sure whether the new MBR had really been written, I opened Ubuntu's Disk Utility and clicked the check RAID button. It immediately started a resync. After the resync, mdadm --detail /dev/md0 told me everything was OK and synced. Then I wanted to mount it with:
mount /dev/md0 /mnt
Then I get the following error:
"mount: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so"
I think I just killed my RAID5 ;(
I shouldn't work on my server when I'm tired and actually have no time ;( My last hope is the fact that "Disk Utility" shows that there is a .0 TB ext4 volume on my RAID (see screen below) [URL]