Starting up MDADM RAID Arrays

At various points in time I’ve run multiple RAID arrays on a single system. One hard-learned lesson is to never mount these arrays using the traditional fstab method. At boot, every mountpoint in fstab will be attempted; if some have errors (as RAID arrays might when a drive has dropped out), the result can be a system that is effectively halted, waiting for user input on what to do.

It is much simpler to create a systemd unit as a separate service to start these arrays at boot time.

Here is an example of such a Unit file placed at /etc/systemd/system/raid-startup.service:

eric@bluebox:~$ sudo systemctl cat raid-startup.service 
# /etc/systemd/system/raid-startup.service
[Unit]
Description=RAID Scanning and Startup Script
Before=nfs-kernel-server.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/sbin/e2fsck -pf /dev/md0
ExecStart=/bin/mount /dev/md0 /media/Manta_Array
ExecStop=/bin/umount -l /media/Manta_Array

[Install]
WantedBy=multi-user.target


Once the file is in place, run sudo systemctl daemon-reload to read in any newly created or modified units, followed by sudo systemctl enable raid-startup to have the service start at the next boot.

There are a couple of customizations here. Notice I’ve added some filesystem checking and repair to the startup routine: since my NASes are restarted so infrequently, I want to make sure everything with my RAID arrays is crisp before I bring them back into service. I’ve also added a Before= line, which ensures this unit runs before NFS starts up. That eliminates a race condition where NFS could start before the array and miss any directories being served off the array.

Another pro tip:
Consider installing the smartmontools package, as the logging it produces in /var/log/syslog can be instrumental in understanding why a drive might have dropped out of your array.

smartctl -a /dev/sda will provide an incredible amount of information about your spinning disk drives.
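smartmontools also ships a background daemon, smartd, that can log and e-mail you when a drive starts to degrade. Here is a minimal /etc/smartd.conf sketch; the device name and mail recipient are assumptions for illustration (see man smartd.conf for the full directive syntax):

```shell
# /etc/smartd.conf (sketch) -- monitor one drive.
# -a: monitor all attributes; -o on: enable automatic offline testing;
# -S on: enable attribute autosave; -s (S/../.././02): run a short
# self-test daily at 2am; -m: mail failures (requires a local mailer).
/dev/sda -a -o on -S on -s (S/../.././02) -m root
```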


My Two Favorite BASH scripting Commands

I write a fair number of BASH scripts in the course of my daily work, and even more now that I’ve been doing lots of work with CI pipelines; BASH tends to be a really natural choice for those simple scripts.

Most of the time these scripts run in a totally automated manner, so robust debugging and logging needs to be written into the script directly. In the olden days I would have written a METRIC TON of echo statements to act as print statements. There are still places where I might do that, but much less so today, because I leverage these handy little built-in shell options in BASH.

There are a whole bunch of shell options, but I’m really referring to just the two that I use most frequently:

set -x

set -e

These two little beauties do a lot for very little effort.

What does ‘set -e’ do?

If you put the ‘set -e’ command at the top of your script, then if ANY of your commands fail or exit with a bad return code, the script will stop in its tracks. Using ‘set -e’ will save you a significant amount of time by letting your scripts fail fast instead of blundering onward.

Here’s an example:


set -e

echo "This command works."
echo "This command doesn't" > /root/output.txt
echo "This other command works but you won't see it."

You won’t see the third piece of output because the second command will fail for lack of permissions (an unprivileged user can’t write to /root). That failure stops the script before the third command runs, which is unlike how the script would normally run, which is to say that it would push on blindly unless you wrote in error checking.

Or said a little bit differently, ‘set -e’ will:

Exit immediately if a pipeline (see Pipelines), which may consist of a single simple command (see Simple Commands), a list (see Lists), or a compound command (see Compound Commands) returns a non-zero status. The shell does not exit if the command that fails is part of the command list immediately following a while or until keyword, part of the test in an if statement, part of any command executed in a && or || list except the command following the final && or ||, any command in a pipeline but the last, or if the command’s return status is being inverted with !. If a compound command other than a subshell returns a non-zero status because a command failed while -e was being ignored, the shell does not exit. A trap on ERR, if set, is executed before the shell exits.

This option applies to the shell environment and each subshell environment separately (see Command Execution Environment), and may cause subshells to exit before executing all the commands in the subshell.

If a compound command or shell function executes in a context where -e is being ignored, none of the commands executed within the compound command or function body will be affected by the -e setting, even if -e is set and a command returns a failure status. If a compound command or shell function sets -e while executing in a context where -e is ignored, that setting will not have any effect until the compound command or the command containing the function call completes.


The benefit of ‘set -e’ is relatively easy to see: scripts that are much shorter and to the point, with less catching of return codes throughout the course of execution.
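One practical wrinkle: sometimes a single command is allowed to fail (grep finding nothing, a best-effort cleanup step). A minimal sketch of opting out per command without giving up ‘set -e’ for the rest of the script:

```shell
#!/bin/bash
set -e

# grep exits non-zero when nothing matches; '|| true' tells set -e
# this particular failure is expected and fine.
grep -q "needle" /etc/hostname 2>/dev/null || true

# Commands tested inside an 'if' are also exempt from set -e.
if ! grep -q "needle" /etc/hostname 2>/dev/null; then
    echo "no needle found, carrying on"
fi

echo "script reached the end"
```

Without the ‘|| true’ guard, the first grep alone would have terminated the script.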

What does ‘set -x’ do?

‘set -x’ is another gem. I found out about ‘set -x’ much later, but it’s also quite useful. ‘set -x’ will essentially print each command to standard error as it is executed. This is exceedingly useful for debugging CI pipelines and all manner of scripts where you need verbose logging at a moment’s notice.

Try running this script to see what it does:


set -x

echo "some command"
echo "The time is $(date)" > /dev/null

You’ll now see the power of this option. It actually displays every command it runs right in front of you, even ones you wouldn’t normally see because their output is being redirected. This is super handy when commands are built from variables and being looped over; you may not otherwise see what command number 10 in your loop of 20 items looked like.

With our Powers Combined…

My last tidbit on these commands is that they can be used together. Often I’ll add ‘set -e’ along with ‘set -x’ to maximize the value I get out of quick scripts. You can also disable each one with the opposite sign, i.e. disable ‘set -e’ with ‘set +e’.
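A quick sketch of the combination, including toggling strict mode off around a command that is expected to fail:

```shell
#!/bin/bash
set -ex    # same as 'set -e' followed by 'set -x'

echo "tracing and strict mode are both on"

set +e     # temporarily tolerate failures
false      # would have killed the script under set -e
set -e     # back to strict mode

echo "still alive"
```

Remember the ‘-x’ trace goes to standard error, so redirecting stdout alone won’t capture it.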

Happy Scripting!

Preparing a new RAID Drive for Insertion

Today I had the pleasure of fixing my deficient RAID array. With the new drive slotted and identified in position /dev/sdc it was time to create a GPT partition table on the drive and a single primary partition.

blackbox# parted /dev/sdc
GNU Parted 2.3
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel GPT
Warning: The existing disk label on /dev/sdc will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
(parted) mkpart primary 2048s 100%
(parted) q
Information: You may need to update /etc/fstab.
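The same session can be scripted non-interactively with parted -s. Here’s a sketch, with /dev/sdX as a deliberate placeholder, since this wipes whatever disk it is pointed at:

```shell
#!/bin/bash
set -e
DISK="/dev/sdX"   # placeholder -- set this to the real replacement disk

# Refuse to run against a non-existent device (this also catches the
# unedited placeholder).
if [ ! -b "$DISK" ]; then
    echo "no such block device: $DISK -- edit DISK first"
    exit 0
fi

parted -s "$DISK" mklabel gpt                 # destroys the existing table
parted -s "$DISK" mkpart primary 2048s 100%   # one partition, 2048-sector aligned
parted -s "$DISK" print
```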

Unmount the filesystem, as a safety precaution.

root@blackbox:/home/eric# sudo umount /media/Manta_Array/
root@blackbox:/home/eric# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md0 : active raid6 sdh1[1] sdg1[7] sdb1[6] sda1[5]
 8790400512 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/4] [UU_UU]
unused devices: <none>

At this point I’ve already confirmed that the partition table on all the RAID devices is consistent/identical by using the following commands.

parted /dev/sda print

parted /dev/sdb print

parted /dev/sdg print

parted /dev/sdh print
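Those four checks collapse nicely into a loop; the drive letters here match my array members and would differ on another system:

```shell
#!/bin/bash
# Print the partition table of each array member for a side-by-side
# consistency check (drive letters are specific to my box).
for d in sda sdb sdg sdh; do
    echo "=== /dev/$d ==="
    if [ -b "/dev/$d" ]; then
        parted "/dev/$d" print
    fi
done
```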

I’ve also already removed the errant drive before the last boot via

mdadm --manage /dev/md0 --remove /dev/sdc1

Now it’s time to add in the new drive.

root@blackbox:/home/eric# mdadm --manage /dev/md0 --add /dev/sdc1
mdadm: added /dev/sdc1
root@blackbox:/home/eric# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md0 : active raid6 sdc1[8] sdg1[7] sdb1[6] sda1[5] sdh1[1]
 8790400512 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/4] [UU_UU]
 [>....................] recovery = 0.0% (401324/2930133504) finish=730.0min speed=66887K/sec
unused devices: <none>

As you can see, there is a good bit of synchronization left to do here before the new drive can be put to use.
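While waiting, there are a couple of ways to keep an eye on the resync. A sketch (md0 is my array name and may differ on your system):

```shell
#!/bin/bash
# One-shot snapshot of the rebuild progress line.
if [ -e /proc/mdstat ]; then
    cat /proc/mdstat
else
    echo "no md arrays on this system"
fi

# For a self-refreshing view, either of these works:
#   watch -n 30 cat /proc/mdstat
#   mdadm --detail /dev/md0
```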

Meet Charles Clos: The Father of Clos Networks

Around Cumulus Networks we’re constantly talking about this thing called a Clos network. If you don’t know much about what those are, check out the descriptions from Network World and Wikipedia.

The concept was initially envisioned in the 1950s by a gentleman working at Bell Labs named Charles Clos. Not a ton is known about Mr. Clos aside from the fact that his seminal paper, “A Study of Non-blocking Switching Networks” (The Bell System Technical Journal, Volume 32, Issue 2, March 1953), went on to make waves in the data center networking space many years later; it turns out non-blocking telephone networks have a lot to do with modern data center design pillars.

It bothered me that there were no images of Mr. Clos to be found anywhere on the internet. So one day I decided I would get this mystery solved. I reached out to Bell Labs with an e-mail explaining my academic interest in Mr. Clos and asking for a picture. Before the end of that very day I received a response containing this image.

Internet, I would like to introduce you to Mr. Charles Clos…

Description: Three-dimensional chart showing the relationship of dial tone service marker occupancy and register occupancy in the #5 crossbar system. L to R: Mr. R. I. Wilkinson, Mrs. R. D. Leonard, and Mr. C. Clos on the right.
Date: May 19, 1949

So there he is on the right, Charles Clos. I can sleep a little easier tonight putting a face to this much revered gentleman in the Data Center networking space.

Making Realistic Tombstones

Halloween is my favorite holiday of the year. To celebrate the occasion I try something a bit different each year. This year I had seen some interesting tutorials on making tombstones out of 4×8-foot, 2-inch-thick foam board. These days my biggest issue with all projects is finding the time to complete them; knowing this, I started this one in July.


The first step, which I do not have a picture of, was to cut out the profiles for the tombstones. In our case we found that a single 4×8 sheet had enough material to construct about two whole tombstones. We did multiple layers: my spire tombstone was three layers thick, while my wife’s traditional tombstone was two layers thick. Each of the main pieces was also surrounded by a base that was two layers thick as well.

At this point the time had come to put them together. For this I carved a C-channel in between the layers and adhered a piece of PVC pipe in the channel with construction adhesive. The PVC pipe would serve as a guide for the rebar that would hold these stones firmly in the ground.

Notice I left about 1/2″ of PVC pipe exposed. This was to be made flush by a bottom base layer made of wood. I used construction adhesive again to glue all of the layers together.


Here you can see I’ve cut a bottom sheet of MDF and used a paddle bit to make allowances for the extra 1/2 inch of PVC pipe to sit flush with the bottom edge. I used a heavy dose of construction adhesive again in between where the PVC conduit pokes through the bottom plate.

In the background of this photo you can also see the latex-based Drylock paint that was used as the primer coat for all the tombstones. I lathered this stuff on thick. You can get the white color from Home Depot for about $23/gallon and they’ll even throw some gray pigment in there too if you ask.

You can also see the 2 foot epoxy-coated rebar pieces I picked up from Home Depot as well. These should work well as they’ll provide a little extra protection from rust.


At this point I used the Stanley Sur-form shaper to even out all the edges and make the layers look as one. I also used a Dremel tool with the 565-02 attachment to make some nice even cuts to carve out the epitaphs for the tombstones. You can use a program like rasterizer to make images or words large enough to cover your tombstone.

With epitaphs carved and everything looking smooth, it was time to put on the first 2 layers of Drylock. You can see the entire family enjoyed this.


Here is a glamour shot of the first two coats of Drylock complete on the stones. I also slathered Drylock on the underside as well.


From here I just bought a few darker colors to fill-in the insets and reliefs in the tombstone to make it look a little more realistic. You can get by with a sample can from Home Depot and that should provide enough color for 1-2 tombstones.

Last step is to dig these bad boys into the ground and enjoy!


Installing the Newest Ifupdown2 on an OrangePi Nano

I’ve been pretty enamored with my OrangePi Nanos since I first got one. So enamored, in fact, that I’m now up to owning five of them, doing all manner of tasks. Being a network person, I wanted to make sure I had some of the best interface configuration software available installed, so naturally I wanted ifupdown2.

The OrangePi Nano runs a Debian-based distribution called Armbian, which is truly awesome software. It has been stripped down and customized for specific devices to the point that it is a work of art. Since it is Debian-based, it has access to ifupdown2 natively, right in the repos. The only problem is that the repo version is outdated, dating from November 2015. So I want to install the latest and greatest ifupdown2…

From my armbian device:

# Install the version from the standard repo (pulls in dependencies)
sudo apt-get update -y
sudo apt-get install ifupdown2 -qy

# Now install the newest version of ifupdown2 directly from the debian repos
# (the original .deb link is missing here; substitute the current URL
# for the ifupdown2 package from packages.debian.org)
wget <URL-to-ifupdown2.deb> -O /root/ifupdown2.deb && \
dpkg -i /root/ifupdown2.deb && \
rm -rfv /root/ifupdown2.deb

sudo apt-cache policy ifupdown2 | grep Installed
echo "Output above should say: \"Installed: 1.0~git20170314-1\""

# Tell NetworkManager to stop managing eth0 so ifupdown2 can control it
# (the section header and key must be on separate lines in the INI file)
printf '[keyfile]\nunmanaged-devices=interface-name:eth0\n' | sudo tee -a /etc/NetworkManager/NetworkManager.conf

echo "### Before Change ###"
sudo nmcli dev status
sudo systemctl stop NetworkManager; sudo systemctl start NetworkManager

echo "### After Change ###"
sudo nmcli dev status
echo "Eth0 should now show as \"unmanaged\" according to the output above."
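With eth0 released by NetworkManager, ifupdown2 needs a stanza for it in /etc/network/interfaces. A minimal DHCP sketch (swap in a static config if that fits your network better):

```shell
# /etc/network/interfaces (sketch) -- eth0 now managed by ifupdown2.
auto eth0
iface eth0 inet dhcp
```

After editing, sudo ifreload -a applies the change without a reboot.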


Troubleshooting Linux Software RAID (MDADM)

Recently I had the pleasure of rebooting my NAS server for some standard “maintenance” activities, i.e. kernel updates and the like. Naturally, when it came back up, my primary large-file-storage RAID 6 array did not come up automatically after the reboot.

This is always my worst fear when it comes to rebooting that box… what if the RAID doesn’t come back up and some hard drives were limping along and I didn’t know it?!

After reading syslog I found a number of errors which clearly indicated a READ error on /dev/sdc.

Sep 22 22:25:37 blackbox kernel: [264777.620821] ata3.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x0
Sep 22 22:25:37 blackbox kernel: [264777.624084] ata3.00: irq_stat 0x40000008
Sep 22 22:25:37 blackbox kernel: [264777.627338] ata3.00: failed command: READ FPDMA QUEUED
Sep 22 22:25:37 blackbox kernel: [264777.630585] ata3.00: cmd 60/08:90:08:08:00/00:00:00:00:00/40 tag 18 ncq 4096 in
Sep 22 22:25:37 blackbox kernel: [264777.630585] res 41/40:00:09:08:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Sep 22 22:25:37 blackbox kernel: [264777.637060] ata3.00: status: { DRDY ERR }
Sep 22 22:25:37 blackbox kernel: [264777.640253] ata3.00: error: { UNC }
Sep 22 22:25:37 blackbox kernel: [264777.644587] ata3.00: configured for UDMA/133
Sep 22 22:25:37 blackbox kernel: [264777.644609] sd 2:0:0:0: [sdc] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 22 22:25:37 blackbox kernel: [264777.644616] sd 2:0:0:0: [sdc] tag#18 Sense Key : Medium Error [current] [descriptor] 
Sep 22 22:25:37 blackbox kernel: [264777.644622] sd 2:0:0:0: [sdc] tag#18 Add. Sense: Unrecovered read error - auto reallocate failed
Sep 22 22:25:37 blackbox kernel: [264777.644628] sd 2:0:0:0: [sdc] tag#18 CDB: Read(16) 88 00 00 00 00 00 00 00 08 08 00 00 00 08 00 00
Sep 22 22:25:37 blackbox kernel: [264777.644632] blk_update_request: I/O error, dev sdc, sector 2057

A few sources are common for these styles of errors, and I recommend troubleshooting all of them first.

  1. New SATA Cables — Sometimes the old ones are of low quality; sometimes they get finicky or have dust covering the pins. Whatever the reason, SATA cables are cheap: buy some more and try those.
  2. Bad SATA Port — The specific port on your controller could be failing. If this is a port on your motherboard and you have another one available, try that one. If not, consider buying a SATA controller card.
  3. Bad SATA Controller Card — These cards can be quite inexpensive in some cases. I’ve seen several of them fail in my lifetime, often with spurious read errors like this one being the first symptom of a larger failure in the card.
  4. Bad Hard Drive — This one is the most obvious, of course, but the other things above should really be investigated first. While I’ve had more hard drive failures than any of the issues above, I have also been stricken with getting a newly RMA’d hard drive and having that fail to function as well, because one of the issues above was the real culprit.

One other way to investigate item #4 is with the SMART utilities in Linux, available via `sudo apt-get install smartmontools`. smartctl provides a TON of useful data from the SMART controller on the disk, which will let you know if your drive has already logged other concerning errors.

One of the most important sections is this one:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 115
 3 Spin_Up_Time 0x0027 181 176 021 Pre-fail Always - 5941
 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 951
 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
 9 Power_On_Hours 0x0032 049 049 000 Old_age Always - 37920
 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 93
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 57
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 893
194 Temperature_Celsius 0x0022 120 109 000 Old_age Always - 30
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

My takeaway is that after 37920 hours (37920 / 24 / 365 = 4.33 years) in operation, and after already checking items 1-3, perhaps it was finally time to let this drive go.

Making that decision is half the battle, now it’s time to recover the array.
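Before wading through the full attribute dump, there is a quicker triage path. A sketch (the device name is whatever drive you suspect; /dev/sdc here matches my failing disk):

```shell
#!/bin/bash
DISK="/dev/sdc"   # the suspect drive on my system; adjust to taste

if [ -b "$DISK" ]; then
    smartctl -H "$DISK"         # one-line overall health verdict
    smartctl -l error "$DISK"   # the drive's own internal error log
    smartctl -a "$DISK"         # everything, including the attribute table
else
    echo "no such block device: $DISK"
fi
```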

A Reconstructive RAID Cheat Sheet

It is for moments like this that I’m writing myself this guide for later.

Since I run RAID 6 with five 3TB drives, I can lose 2 of the 5 drives and still be OK. Since it looks like I’ve only lost one, the RAID should have been able to function, but for whatever reason it showed as inactive at boot time.

eric@blackbox:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdb1[1](S) sda1[7](S) sdd1[6](S) sde1[5](S)
 11720536064 blocks super 1.2
unused devices: <none>

Identify your Hard disks

root@blackbox:/home/eric# sudo fdisk -l | grep "2.7 TiB"
Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdd: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sde: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdc: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors

/dev/sdc is my bad drive, so I’m going to remove it from the array.

root@blackbox:/home/eric# mdadm --manage /dev/md0 --remove /dev/sdc1
mdadm: hot remove failed for /dev/sdc1: No such device or address

In this case, we can see from the mdstat output above that /dev/sdc1 was not inserted into the array at boot time, so the remove operation has failed. Instead, the RAID array must be reassembled. For this I need to stop the array.

root@blackbox:/home/eric# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@blackbox:/home/eric#  cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
unused devices: <none>

Alright, let’s rebuild this puppy. I had a clean power-down event before so everything should still be mostly in order.

root@blackbox:/home/eric# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1
mdadm: /dev/md0 has been started with 4 drives (out of 5).
root@blackbox:/home/eric# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid6 sda1[7] sdd1[6] sde1[5] sdb1[1]
 8790400512 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/4] [UU_UU]
unused devices: <none>
root@blackbox:/home/eric# sudo mount /dev/md0 /media/Manta_Array/

And now, with the array remounted and a new drive on order, life is back to normal.

Some Additional Commands I Find Useful:

Detect the present state of your arrays and write it to your RAID configuration file

mdadm --verbose --detail --scan > /etc/mdadm.conf

Assemble Existing Arrays by UUID (Optional) if possible

mdadm --assemble --scan --uuid=f6ff12cd:86a8e3fb:89bc0f58:ad15e3e2

Learn about RAID info stored on each individual drive (Superblock)

mdadm --examine /dev/sda1
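One closing note for Debian-derived systems: the scanned configuration usually belongs in /etc/mdadm/mdadm.conf rather than /etc/mdadm.conf, and the initramfs should be refreshed afterward so the array assembles early in boot. A sketch (run as root; the path is the Debian default):

```shell
#!/bin/bash
set -e
CONF="/etc/mdadm/mdadm.conf"   # Debian/Ubuntu location

if [ ! -w "$CONF" ]; then
    echo "cannot write $CONF (need root?); skipping"
    exit 0
fi

# Append the detected arrays rather than clobbering existing lines.
mdadm --detail --scan >> "$CONF"
update-initramfs -u
```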