#1
I normally solve all my problems with Google, so this is my first post here... but I'm a long-time reader.

My system: Ubuntu 9.10 (Karmic) on an i5-750 quad core. This is not a fresh install: it is an Ubuntu 9.04 system, upgraded to 9.10 (Karmic), then copied from my old system to the new one and running lilo (not grub2) to cope with a raid boot.

Over 4 drives, I wanted a 200MB ext3 raid1 /boot partition (/dev/md10), a 20GB ext3 raid6 partition (/dev/md6) for root, and a 2.9TB xfs raid6 partition (/dev/md8) for "storage", e.g. recordings, videos, music & photos.

All partitions were initially installed as raid1 over 2 drives (as I only had 2 new drives to begin with). I then acquired a further 2 drives and did an mdadm --grow to raid6 over 4 drives, using the latest mdadm 3.1 from "Converting RAID5 to RAID6 and other shape changing in md/raid" - this version 3.1 of mdadm allows growth of a raid1 to raid5/6 and growth of raid5/6 to more disks.

(In case you're wondering why raid6: raid5 doesn't offer enough protection in a system I never want to go down - I've lost one Myth backend before due to HDD failure and it took many weeks to get it back to how I had it! And raid10 is still not expandable, even under mdadm 3.1. Raid6 over 6 drives (my goal) is 66% space-efficient and can survive two drive failures. I could even go to more drives in future.)

Anyway, it all seemed to work perfectly - after reshaping (which took a couple of days), I was successfully running raid6 on 4 drives on both the root and storage partitions until ... the system failed to shut down cleanly... I had to hard-reset... on reboot the xfs partition (/dev/md8) failed to mount, and when I looked at the syslog it was reporting (prior to the shutdown event, hundreds of times):

Code:
"Filesystem "md8": xfs_log_force: error 5 returned."
Code:
Filesystem "md8": Disabling barriers, trial barrier write failed
XFS mounting filesystem md8
Ending clean XFS mount for filesystem: md8
Starting XFS recovery on filesystem: md8 (logdev: internal)
Failed to recover EFIs on filesystem: md8

This time xfs_repair can't even get through stage 1 - it now reports:

Code:
$ sudo xfs_repair -n /dev/md8
Phase 1 - find and verify superblock...
superblock read failed, offset 0, size 524288, ag 0, rval 0
fatal error -- Invalid argument

Code:
$ cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md8 : inactive sdc8[4](S) sdd8[0](S) sde8[1](S)
4356152767 blocks super 1.2
md6 : active raid6 sdc6[4] sdd6[0] sdf6[3] sde6[1]
19534592 blocks super 1.2 level 6, 128k chunk, algorithm 18 [4/4] [UUUU]
md10 : active raid1 sdd1[0] sdc1[3] sde1[1] sdf1[2]
192640 blocks [4/4] [UUUU]

I don't know what's going on with the mdstat for md8 - I thought this was only an xfs issue, but it appears that only 3 of the 4 drives have been recognised by mdadm, they are all listed as spares, and the array is listed as inactive.

Now, it's fair to report that the computer was rebooted a few times during the raid1-to-raid6 reshaping, and during this time about 900GB was copied from my old backup drive to the new array - maybe I overstressed what is a beta version of mdadm 3.1. I used exactly the same process to create md6 (root) and it hasn't missed a beat - but it also wasn't stressed with reboots and huge file activity during reshaping.

So I don't know if this is an mdadm 3.1 reshaping problem (the array seemed to be clean and working perfectly once reshaping had finished) or an xfs problem. I suspected an xfs problem, but the mdstat reporting only 3 of the 4 drives has me suspicious now.

I'm tempted to try re-creating the array as raid6 over 4 drives using only ext3 (forget xfs) and hopefully all problems will go away - but I would love any clues as to the source of my problems. Is it purely an xfs issue, or is it an mdadm/raid issue?

Cheers,
max
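
For anyone wanting to spot the same symptom automatically, here is a minimal Python sketch that reads /proc/mdstat-style text and lists arrays that are not active, along with which members carry the (S) spare flag. The function names and the SAMPLE constant are made up for illustration, and the regular expressions are only assumed to cover the simple line layout shown in the post above, not every format the kernel can print.

```python
import re

# Sample text copied from the mdstat output in the post above.
SAMPLE = """Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md8 : inactive sdc8[4](S) sdd8[0](S) sde8[1](S)
4356152767 blocks super 1.2
md6 : active raid6 sdc6[4] sdd6[0] sdf6[3] sde6[1]
19534592 blocks super 1.2 level 6, 128k chunk, algorithm 18 [4/4] [UUUU]
md10 : active raid1 sdd1[0] sdc1[3] sde1[1] sdf1[2]
192640 blocks [4/4] [UUUU]
"""


def parse_mdstat(text):
    """Return {array: {"state": str, "members": [(device, is_spare), ...]}}."""
    arrays = {}
    for line in text.splitlines():
        # Array lines look like "md8 : inactive sdc8[4](S) ..."
        m = re.match(r"^(md\d+)\s*:\s*(\S+)\s*(.*)$", line.strip())
        if not m:
            continue
        name, state, rest = m.groups()
        members = []
        # Members look like "sdc8[4]" with optional flags such as "(S)".
        for dev, flags in re.findall(r"(\w+)\[\d+\]((?:\([A-Z]\))*)", rest):
            members.append((dev, "(S)" in flags))
        arrays[name] = {"state": state, "members": members}
    return arrays


def inactive_arrays(text):
    """Names of arrays whose state word is anything other than 'active'."""
    parsed = parse_mdstat(text)
    return sorted(n for n, a in parsed.items() if a["state"] != "active")


if __name__ == "__main__":
    for name in inactive_arrays(SAMPLE):
        print(name, parse_mdstat(SAMPLE)[name]["members"])
```

On a live system the same functions could be fed the contents of /proc/mdstat instead of SAMPLE. Note that a healthy array can legitimately have spares, so the sketch only treats the inactive state as a problem and reports the spare flags for information.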
#2
|
Further attempts to re-create the array:

Code:
$ sudo mdadm --examine /dev/md8
mdadm: No md superblock detected on /dev/md8.
$ sudo mdadm --detail /dev/md8
mdadm: md device /dev/md8 does not appear to be active.
$ sudo mdadm --assemble /dev/md8 /dev/sdd8 /dev/sde8 /dev/sdf8 /dev/sdc8
mdadm: cannot open device /dev/sdd8: Device or resource busy
mdadm: /dev/sdd8 has no superblock - assembly aborted
$ sudo mdadm --create /dev/md8 --level=6 --raid-devices=4 --chunk=8192 /dev/sdd8 /dev/sde8 /dev/sdf8 /dev/sdc8
mdadm: device /dev/sdd8 not suitable for any style of array
$ sudo mdadm --examine --scan /dev/sdd8
ARRAY /dev/md/8 metadata=1.2 UUID=8aab3137:53541c75:f77e666c:9731b8eb name=onigiri:8
$ sudo mdadm --examine --scan /dev/sdc8
ARRAY /dev/md/8 metadata=1.2 UUID=8aab3137:53541c75:f77e666c:9731b8eb name=onigiri:8
$ sudo mdadm --examine --scan /dev/sde8
ARRAY /dev/md/8 metadata=1.2 UUID=8aab3137:53541c75:f77e666c:9731b8eb name=onigiri:8
$ sudo mdadm --examine --scan /dev/sdf8
$ sudo mdadm /dev/md8 -f /dev/sdf8
mdadm: cannot get array info for /dev/md8

I still can't help feeling that xfs is involved here, but it also seems to be a raid problem.
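
The per-device scan results above can be compared mechanically. The following is a rough Python sketch, assuming the output of each `mdadm --examine --scan <device>` call has already been captured as a string; the function names and the SCANS dictionary are invented for this illustration. It reports which devices printed no ARRAY line and whether the remaining devices agree on one UUID. Treating empty output as "no md superblock found on that device" is an interpretation of the transcript, not something the thread confirms.

```python
import re

# Captured output per device, copied from the transcript above.
# /dev/sdf8 printed nothing at all.
SCANS = {
    "/dev/sdd8": "ARRAY /dev/md/8 metadata=1.2 UUID=8aab3137:53541c75:f77e666c:9731b8eb name=onigiri:8",
    "/dev/sdc8": "ARRAY /dev/md/8 metadata=1.2 UUID=8aab3137:53541c75:f77e666c:9731b8eb name=onigiri:8",
    "/dev/sde8": "ARRAY /dev/md/8 metadata=1.2 UUID=8aab3137:53541c75:f77e666c:9731b8eb name=onigiri:8",
    "/dev/sdf8": "",
}


def superblock_uuids(scan_output):
    """Map each device to the UUID in its ARRAY line, or None if none was printed."""
    result = {}
    for dev, out in scan_output.items():
        m = re.search(r"UUID=([0-9a-f:]+)", out)
        result[dev] = m.group(1) if m else None
    return result


def missing_superblock(scan_output):
    """Devices for which the scan printed no UUID."""
    uuids = superblock_uuids(scan_output)
    return sorted(dev for dev, uuid in uuids.items() if uuid is None)


def distinct_uuids(scan_output):
    """Set of UUIDs seen across the devices that did report one."""
    return {u for u in superblock_uuids(scan_output).values() if u is not None}


if __name__ == "__main__":
    print("no superblock reported:", missing_superblock(SCANS))
    print("distinct array UUIDs:", distinct_uuids(SCANS))
```

With the transcript's data this singles out /dev/sdf8, which matches the mdstat listing where only sdc8, sdd8 and sde8 appear under md8.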
#3
|
I've always used whole drives for my raid and never split them up like that. I'm sure that had something to do with the error. I'm surprised you can't fail out sdf8. Since it's not mounted, have you tried doing an fdisk /dev/sdf8, removing the partition and superblock, re-adding them and then formatting it? I would think that after a reboot it would allow you to remove it and then re-add it to the array.
|
|
#4
|
blackoper, thanks very much for the response.

I have finally managed to re-create the array after executing an mdadm --stop command (which I hadn't done before the above post) - but I still don't understand why I couldn't fail sdf8. I got the array re-created, waited many hours for it to re-build, and was ever hopeful that the problems were over.

Unfortunately, after waiting for the re-build of the array, the xfs filesystem was not recoverable. xfs_repair could do nothing with it (failed at phase 1). So I have now re-formatted the array as ext3 (going to try again without xfs) and restored data from my backup (only a couple of weeks of tv-recordings lost, no big deal). The new array is running fine as ext3 under raid6. So I have removed xfs from the equation of what is upsetting my current myth backend.

I'm still having issues with the myth backend crashing the entire system after a few hours. This has been happening since the upgrade from Ubuntu 9.04 to 9.10 (with a concurrent upgrade from Myth 0.21 to 0.22) - it's these crashes that have probably been causing my filesystem/raid problems. Anyway, that's the subject of another thread which I will post soon.

mdadm 3.1 has introduced a new funny thing: my array now refuses to name itself md8 (even though I have done an mdadm --examine --scan /dev/sdc8 >> /etc/mdadm/mdadm.conf, which has put the new UUID of the array in mdadm.conf) - it insists on naming itself md127. After hours of trying to fix that I have given up and just changed my fstab to mount md127 instead of md8. Mounting the UUID of the array in fstab doesn't work at all. Lots of stuff going on that I don't understand. The price of trying to use the development version of mdadm just so I have the opportunity to expand the array, I guess.

You said you use whole drives for your array - does that mean you use LVM on top of raid to create pseudo-partitions, or that you don't create partitions at all?
LVM wouldn't help me because I'm using raid6 and I need to boot from the array. You can't boot from raid6, so I needed either a non-raid /boot partition or a raid1 /boot partition. (I chose raid1 - it works with lilo and is supposed to work with grub2 (so I've read), but I couldn't make it work with grub2.) I guess I could install an additional disk and use that for /boot (or a usb stick - but how to boot from usb to a whole-disk raid6 system is beyond my knowledge at this time). If I did install an additional disk for /boot then I could make the remaining drives whole-drive raid.

It's important to note that I want my system's ability to boot to be resilient to hard-disk (or usb) failure - I'm trying to build this system to be an "appliance" that will continue to work after a hard disk failure. My wife needs to be able to reboot it after a disk failure when I'm on the other side of the world (primary objective!)

I've heard it said that raid is designed for disk failure, not power failure. Because my myth backend has been constantly crashing the system since my upgrade, that is possibly akin to power failure. Hence the raid problems.

I have recently learned to reboot the system after a total freeze via the ALT-PRTSCRN-REISUB sysrq keyboard commands rather than pressing the reset button (something new to me - and I'm training my wife to do the same!) but that doesn't always work - when myth crashes my system it really crashes it. It is probably a hardware issue related to the new kernel talking to my old analog LeadTek WinFast 2000XP capture card, but again, that is the subject of another thread, which I will post when I have enough time to provide enough information about what's happening.

Thanks again for your response.

Cheers,
max

Last edited by maxw; 03-10-2010 at 12:57 PM.
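
On the mdadm.conf step mentioned in this post: because `>>` appends rather than replaces, any ARRAY line written for the array before it was re-created would still be in the file next to the new one. Whether that happened here, and whether it has anything to do with the md127 naming, is not established in the thread. The Python sketch below only shows how such leftover entries could be detected. The function names are invented, and the sample file contents and both UUIDs are placeholders, not values from the poster's system.

```python
# Hypothetical mdadm.conf contents, for illustration only.
# Both UUIDs below are placeholders, not real values from the thread.
SAMPLE_CONF = """# mdadm.conf (illustrative)
ARRAY /dev/md/8 metadata=1.2 UUID=11111111:11111111:11111111:11111111 name=onigiri:8
ARRAY /dev/md/8 metadata=1.2 UUID=22222222:22222222:22222222:22222222 name=onigiri:8
"""


def array_entries(conf_text):
    """Return (device, uuid) pairs for each ARRAY line; uuid is None if absent."""
    entries = []
    for line in conf_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments and any non-ARRAY keywords.
        if len(fields) < 2 or fields[0].upper() != "ARRAY":
            continue
        uuid = None
        for field in fields[2:]:
            if field.upper().startswith("UUID="):
                uuid = field.split("=", 1)[1]
        entries.append((fields[1], uuid))
    return entries


def conflicting_devices(conf_text):
    """Device names that appear on ARRAY lines with more than one UUID."""
    seen = {}
    for dev, uuid in array_entries(conf_text):
        seen.setdefault(dev, set()).add(uuid)
    return sorted(dev for dev, uuids in seen.items() if len(uuids) > 1)


if __name__ == "__main__":
    print(conflicting_devices(SAMPLE_CONF))
```

The sketch compares device names literally, so /dev/md8 and /dev/md/8 would count as different entries; a real check would need to decide how to treat those two spellings.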