Troubleshooting Linux Software RAID (MDADM)

Recently I had the pleasure of rebooting my NAS server for some standard “maintenance” activities, i.e. kernel updates, etc. Naturally, when it came back up, my primary large-file-storage RAID 6 array did not come up automatically after the reboot.

This is always my worst fear when it comes to rebooting that box… what if the RAID doesn’t come back up and some hard drives were limping along and I didn’t know it?!

After reading syslog I found a number of errors which clearly indicated a READ error on /dev/sdc.

Sep 22 22:25:37 blackbox kernel: [264777.620821] ata3.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x0
Sep 22 22:25:37 blackbox kernel: [264777.624084] ata3.00: irq_stat 0x40000008
Sep 22 22:25:37 blackbox kernel: [264777.627338] ata3.00: failed command: READ FPDMA QUEUED
Sep 22 22:25:37 blackbox kernel: [264777.630585] ata3.00: cmd 60/08:90:08:08:00/00:00:00:00:00/40 tag 18 ncq 4096 in
Sep 22 22:25:37 blackbox kernel: [264777.630585] res 41/40:00:09:08:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Sep 22 22:25:37 blackbox kernel: [264777.637060] ata3.00: status: { DRDY ERR }
Sep 22 22:25:37 blackbox kernel: [264777.640253] ata3.00: error: { UNC }
Sep 22 22:25:37 blackbox kernel: [264777.644587] ata3.00: configured for UDMA/133
Sep 22 22:25:37 blackbox kernel: [264777.644609] sd 2:0:0:0: [sdc] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 22 22:25:37 blackbox kernel: [264777.644616] sd 2:0:0:0: [sdc] tag#18 Sense Key : Medium Error [current] [descriptor] 
Sep 22 22:25:37 blackbox kernel: [264777.644622] sd 2:0:0:0: [sdc] tag#18 Add. Sense: Unrecovered read error - auto reallocate failed
Sep 22 22:25:37 blackbox kernel: [264777.644628] sd 2:0:0:0: [sdc] tag#18 CDB: Read(16) 88 00 00 00 00 00 00 00 08 08 00 00 00 08 00 00
Sep 22 22:25:37 blackbox kernel: [264777.644632] blk_update_request: I/O error, dev sdc, sector 2057

A couple of causes are common for these kinds of errors, and I recommend ruling all of them out first.

  1. Bad SATA Cables — Sometimes the old ones are of low quality, sometimes they get finicky or have dust covering the pins. Whatever the reason, SATA cables are cheap; buy a few more and try those (see the sketch after this list for a quick way to tell whether the errors follow the drive or the cable/port).
  2. Bad SATA Port — The specific port on your controller could be failing. If this is a port on your motherboard and you have another one available, try that one. If not, consider buying a SATA controller card.
  3. Bad SATA Controller Card — These cards can be quite inexpensive in some cases. I’ve seen several of them fail over the years, often with spurious read errors like this one being the first symptom of a larger failure in the card.
  4. Bad Hard Drive — This one is the most obvious of course, but the items above should really be investigated first. While I’ve had more hard drive failures than any of the issues above, I have also been stung by receiving a new RMA’d hard drive and having it fail to function as well, because one of the issues above was the real culprit.
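Before blaming the drive itself, I like to confirm whether the errors follow the disk or the cable/port. Here is a minimal sketch of what I watch after swapping a cable or moving the drive to another port; ata3 and sdc are just the example names from my own logs, so substitute your own:

# Follow the kernel log live after swapping the cable or port. If the same
# disk keeps logging media/read errors regardless of which cable and port
# it is on, the drive itself is the likely culprit.
sudo dmesg --follow | grep -iE 'ata3|sdc'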

Another way to investigate item #4 is with the SMART utilities available in Linux via `sudo apt-get install smartmontools`. smartctl provides a TON of useful data from the SMART controller on the disk, and it will let you know whether the drive has already logged other concerning errors.
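As a sketch, this is the sort of invocation I use to pull the full SMART report for the suspect drive (substitute your own device name):

sudo smartctl -a /dev/sdc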

One of the most important sections of that output is this one:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 115
 3 Spin_Up_Time 0x0027 181 176 021 Pre-fail Always - 5941
 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 951
 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
 9 Power_On_Hours 0x0032 049 049 000 Old_age Always - 37920
 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 93
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 57
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 893
194 Temperature_Celsius 0x0022 120 109 000 Old_age Always - 30
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

My takeaway is that after 37,920 hours (37920 / 24 / 365 ≈ 4.33 years) in operation, and having already checked items 1-3, perhaps it was finally time to let this drive go.

Making that decision is half the battle, now it’s time to recover the array.

A Reconstructive RAID Cheat Sheet

It is for moments like this that I’m writing this guide to my future self.

Since I run RAID 6 with (5) 3TB drives, I can lose 2 out of the 5 drives and still be OK. Since it looks like I’ve only lost 1, the RAID should have been able to function, but for whatever reason it showed as inactive at boot time.

eric@blackbox:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdb1[1](S) sda1[7](S) sdd1[6](S) sde1[5](S)
 11720536064 blocks super 1.2
 
unused devices: <none>

Identify Your Hard Disks

root@blackbox:/home/eric# sudo fdisk -l | grep "2.7 TiB"
Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdd: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sde: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdc: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
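When several identically sized disks are involved, I also like to map device names to physical drives by serial number so I pull the correct one out of the case. A quick sketch, assuming the lsblk that ships with util-linux:

sudo lsblk -o NAME,SIZE,MODEL,SERIAL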

/dev/sdc is my bad drive, so I’m going to remove it from the array.

root@blackbox:/home/eric# mdadm --manage /dev/md0 --remove /dev/sdc1
mdadm: hot remove failed for /dev/sdc1: No such device or address

In this case, we can see from the mdstat output above that /dev/sdc1 was not inserted into the array at boot time, so the remove operation failed. Instead, the array needs to be re-assembled from the remaining drives, and for that I first need to stop it.

root@blackbox:/home/eric# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@blackbox:/home/eric#  cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
unused devices: <none>

Alright, let’s rebuild this puppy. I had a clean power-down event beforehand, so everything should still be mostly in order.
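Before forcing the assembly, a quick sanity check is to compare the event counters in each remaining member’s superblock; after a clean shutdown they should all be identical (or very close). A minimal sketch using the same member partitions as above:

mdadm --examine /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1 | grep -iE '/dev/sd|events'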

root@blackbox:/home/eric# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1
mdadm: /dev/md0 has been started with 4 drives (out of 5).
root@blackbox:/home/eric# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid6 sda1[7] sdd1[6] sde1[5] sdb1[1]
 8790400512 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/4] [UU_UU]
 
unused devices: <none>
root@blackbox:/home/eric# sudo mount /dev/md0 /media/Manta_Array/

And now, with the array remounted and a new drive on order, life is back to normal.
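For completeness, here is roughly the sequence I expect to run once the replacement disk arrives. The device names come from my setup, the partition-table copy assumes GPT (these are 3TB drives), and the direction matters: with sgdisk the destination goes after -R= and the source is the final argument, so double-check before running anything like this.

# Copy the partition layout from a healthy member (/dev/sda) onto the
# new disk (/dev/sdc), then give the copy fresh GUIDs.
sgdisk -R=/dev/sdc /dev/sda
sgdisk -G /dev/sdc

# Add the new partition back into the array and watch the rebuild progress.
mdadm --manage /dev/md0 --add /dev/sdc1
watch cat /proc/mdstat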

Some Additional Commands I Find Useful:

Detect Present State and write it to your RAID configuration file

mdadm --verbose --detail --scan > /etc/mdadm.conf
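Note that on Debian/Ubuntu-style systems the file mdadm actually reads lives at /etc/mdadm/mdadm.conf, and the initramfs keeps its own copy of it, so the pattern I reach for there looks more like the following (append rather than overwrite, then regenerate the initramfs so the array assembles at boot):

mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u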

Assemble Existing Arrays by UUID if possible (Optional)

mdadm --assemble --scan --uuid=f6ff12cd:86a8e3fb:89bc0f58:ad15e3e2
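If you don’t have the array UUID handy, it can be read out of any member’s superblock; the grep is just a convenience:

mdadm --examine /dev/sda1 | grep -i uuid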

Learn about RAID info stored on each individual drive (Superblock)

mdadm --examine /dev/sda1

 

Use Less Power on Your NAS

Inspired by reading this article http://sandeen.net/wordpress/computers/how-to-build-a-10w-6-5-terabyte-commodity-x86-linux-server/, I wanted to do some experimentation on my own NAS server.

The biggest power users (aside from the CPU) are the hard drives. In my NAS I have (5) 3TB WD Red drives and (1) 2TB WD Red drive, as well as a pair of Samsung 850-based SSDs. Of course the spinning disks consume the most power. I’m able to determine this from the UPS connected to the NAS, which has a nice little power meter built in; I’m sure it isn’t extremely accurate, but it’s good enough for my needs.

After much experimentation with the options provided in the article above, I found the only thing that made a noticeable difference on my NAS was spinning down the mechanical hard drives.

Using the following script I was able to cut my idle power draw from ~72 watts to ~54 watts, an 18-watt (25%) savings, which isn’t too bad!

#!/bin/bash
# Spin down the NAS data drives to save idle power.
# hdparm -S sets the standby (spindown) timeout; a value of 12 means
# 12 * 5 seconds = 60 seconds of idle before the drive spins down.

echo "Setting Power down to 60 seconds."
hdparm -S 12 /dev/sda
hdparm -S 12 /dev/sdb
hdparm -S 12 /dev/sdc
hdparm -S 12 /dev/sde
hdparm -S 12 /dev/sdf

hdparm -S 12 /dev/sdd

# hdparm -y puts each drive into standby (spins it down) immediately,
# rather than waiting for the timeout above to expire.
echo "Powering Down Hard drives immediately."
hdparm -y /dev/sda
hdparm -y /dev/sdb
hdparm -y /dev/sdc
hdparm -y /dev/sde
hdparm -y /dev/sdf

hdparm -y /dev/sdd
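
To confirm the drives actually spun down (and to see whether something keeps waking them back up), hdparm can report the current power state without spinning the drive up. A quick check using the same device names as the script:

hdparm -C /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf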