My Two Favorite BASH scripting Commands

I write a fair number of BASH scripts in the course of my daily work, and even more now that I’ve been doing lots of work with CI pipelines; BASH tends to be a really natural choice for those simple scripts.

Most of the time these scripts run in a totally automated manner, so robust debugging and logging need to be written into the script directly. In the olden days I would have written a METRIC TON of echo statements to act as print statements… there are still places where I might do that, but much less so today, because I leverage a couple of handy shell options that BASH exposes through its builtin ‘set’ command.

There are a whole bunch of shell options, but I’m really referring to just the two that I use most frequently:

set -x

set -e

These two little beauties do a lot for very little effort.

What does ‘set -e’ do?

If you put the ‘set -e’ command at the top of your script, the script will stop in its tracks as soon as any command exits with a non-zero return code (with a few exceptions, listed further down). Using ‘set -e’ will save you a significant amount of time by making your scripts fail early instead of carrying on after an error.

Here’s an example:


set -e

echo "This command works."
echo "This command doesn't" > /root/output.txt
echo "This other command works but you won't see it."

You won’t see the third piece of output because the second command fails due to a lack of permissions (assuming the script is not run as root). That failure stops the script before the third command executes. Without ‘set -e’ the script would push on blindly unless you wrote in your own error checking.

Or, in the words of the Bash manual, ‘set -e’ will:

Exit immediately if a pipeline (see Pipelines), which may consist of a single simple command (see Simple Commands), a list (see Lists), or a compound command (see Compound Commands) returns a non-zero status. The shell does not exit if the command that fails is part of the command list immediately following a while or until keyword, part of the test in an if statement, part of any command executed in a && or || list except the command following the final && or ||, any command in a pipeline but the last, or if the command’s return status is being inverted with !. If a compound command other than a subshell returns a non-zero status because a command failed while -e was being ignored, the shell does not exit. A trap on ERR, if set, is executed before the shell exits.

This option applies to the shell environment and each subshell environment separately (see Command Execution Environment), and may cause subshells to exit before executing all the commands in the subshell.

If a compound command or shell function executes in a context where -e is being ignored, none of the commands executed within the compound command or function body will be affected by the -e setting, even if -e is set and a command returns a failure status. If a compound command or shell function sets -e while executing in a context where -e is ignored, that setting will not have any effect until the compound command or the command containing the function call completes.


The benefit of ‘set -e’ is relatively easy to see: scripts are much shorter and more to the point, and they require less catching of return codes throughout the course of execution.
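The exceptions in the manual text above are worth seeing in action. Here is a minimal sketch (the variable names are just for illustration) of two failures that ‘set -e’ deliberately tolerates: a command used as the test of an if statement, and the left-hand side of a || list.

```shell
#!/usr/bin/env bash
set -e

# A failing command used as the test of an `if` does not stop the script.
if grep -q "needle" /dev/null; then
    result="found"
else
    result="missing"
fi
echo "grep result: $result"

# A failure on the left side of || is tolerated too; the fallback runs.
value=$(false) || value="fallback"
echo "value: $value"
```

Both echo statements run even though grep and false each return a non-zero status.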

What does ‘set -x’ do?

‘set -x’ is another gem. I found out about it much later, but it’s also quite useful. ‘set -x’ prints each command (after expansion) to standard error just before it is executed. This is exceedingly useful for debugging CI pipelines and all manner of scripts where you need verbose logging at a moment’s notice.

Try running this script to see what it does:


set -x

echo "some command"
echo "The time is $(date)" > /dev/null

You’ll now see the power of this option. It displays every command it runs right in front of you, even ones whose output you wouldn’t normally see because it is being redirected. This is super handy when commands are built from variables and run in a loop; you may not otherwise see what command number 10 in your loop of 20 items looked like.
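The trace is easy to capture or redirect separately from the script’s normal output. A small sketch, assuming nothing beyond BASH itself (the host names are made up): it customizes the trace prefix through the PS4 variable and collects the trace of a loop into a variable by redirecting standard error.

```shell
#!/usr/bin/env bash

# PS4 is the prefix printed before each traced command; add the line number.
PS4='+ line ${LINENO}: '

# Run the loop in a subshell with tracing on and capture stderr (the trace).
trace=$(
    {
        set -x
        for host in alpha beta; do
            echo "checking $host" > /dev/null
        done
    } 2>&1
)

printf '%s\n' "$trace"
```

Because ‘set -x’ is switched on inside the command substitution, tracing stays confined to that subshell and the rest of the script runs quietly.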

With our Powers Combined…

My last tidbit on these commands is that they can be used together. Often I’ll add ‘set -e’ along with ‘set -x’ to maximize the value I get out of quick scripts. You can also turn them off again with the opposite option, e.g. ‘set +e’ disables ‘set -e’ and ‘set +x’ disables ‘set -x’.
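As a rough sketch of toggling the option mid-script (the path is deliberately one that should not exist), ‘set +e’ can wrap a single command whose failure you want to inspect rather than die on:

```shell
#!/usr/bin/env bash
set -e

# Temporarily allow failures so the exit code can be inspected.
set +e
ls /nonexistent/path > /dev/null 2>&1
status=$?
set -e

echo "ls exited with status $status, and the script kept going."
```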

Happy Scripting!


Preparing a new RAID Drive for Insertion

Today I had the pleasure of fixing my deficient RAID array. With the new drive slotted and identified in position /dev/sdc it was time to create a GPT partition table on the drive and a single primary partition.

blackbox# parted /dev/sdc
GNU Parted 2.3
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel GPT
Warning: The existing disk label on /dev/sdc will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
(parted) mkpart primary 2048s 100%
(parted) q
Information: You may need to update /etc/fstab.
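The same two parted steps can also be run non-interactively. The sketch below is an assumption on my part rather than what I ran above: it uses parted’s script mode (-s) and a hypothetical DRY_RUN variable so that, by default, it only prints the command instead of touching the disk.

```shell
#!/usr/bin/env bash

# Build the parted command for a fresh GPT label and one full-size partition.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
prepare_raid_disk() {
    local disk=$1
    local cmd=(parted -s "$disk" mklabel gpt mkpart primary 2048s 100%)
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

prepare_raid_disk /dev/sdc
```

Running it with `DRY_RUN=0` as root would really relabel the disk, so double-check the device name first.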

Unmount the filesystem, as a safety precaution.

root@blackbox:/home/eric# sudo umount /media/Manta_Array/
root@blackbox:/home/eric# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md0 : active raid6 sdh1[1] sdg1[7] sdb1[6] sda1[5]
 8790400512 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/4] [UU_UU]
unused devices: <none>

At this point I’ve already confirmed that the partition table on all the RAID devices is consistent/identical by using the following commands.

parted /dev/sda print

parted /dev/sdb print

parted /dev/sdg print

parted /dev/sdh print

I’ve also already removed the errant drive before the last boot via

mdadm --manage /dev/md0 --remove /dev/sdc1

Now it’s time to add in the new drive.

root@blackbox:/home/eric# mdadm --manage /dev/md0 --add /dev/sdc1
mdadm: added /dev/sdc1
root@blackbox:/home/eric# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md0 : active raid6 sdc1[8] sdg1[7] sdb1[6] sda1[5] sdh1[1]
 8790400512 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/4] [UU_UU]
 [>....................] recovery = 0.0% (401324/2930133504) finish=730.0min speed=66887K/sec
unused devices: <none>

As you can see there is a good bit of synchronization left to do here before the new drive can be put to use.
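To keep an eye on the rebuild without reading the whole file each time, the recovery line can be picked apart with sed. This is just a sketch: the function name is made up, and the sample line is copied from the output above.

```shell
#!/usr/bin/env bash

# Print "<percent done> <minutes remaining>" for each recovery line on stdin.
mdstat_recovery() {
    sed -n 's/.*recovery = *\([0-9.]*\)%.*finish=\([0-9.]*\)min.*/\1 \2/p'
}

sample=' [>....................] recovery = 0.0% (401324/2930133504) finish=730.0min speed=66887K/sec'
printf '%s\n' "$sample" | mdstat_recovery
```

On the live system this would be `mdstat_recovery < /proc/mdstat`.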

Installing the Newest Ifupdown2 on an OrangePi Nano

I’ve been pretty enamored with my OrangePi nanos since I first got one. So enamored, in fact, that I’m up to owning 5 of them now, doing all manner of tasks. Being a network person, I wanted to make sure I had some of the best interface configuration software available installed, so naturally I wanted Ifupdown2.

The OrangePi nano runs Armbian, a Debian-based distribution which is truly awesome software. It has been stripped down and customized for specific devices to the point that it is a work of art. Since it is Debian-based, it has access to ifupdown2 natively right in the repos. The only problem with that version is that it is outdated, dating from around November 2015. So I want to install the latest and greatest Ifupdown2…

From my armbian device:

# Install the newest version in the standard repo
sudo apt-get update -y
sudo apt-get install ifupdown2 -qy

# Now install the newest version of ifupdown2 directly from the debian repos
wget -O /root/ifupdown2.deb && \
dpkg -i /root/ifupdown2.deb && \
rm -rfv /root/ifupdown2.deb

sudo apt-cache policy ifupdown2 | grep Installed
echo "Output above should say: \"Installed: 1.0~git20170314-1\""

# Stop NetworkManager from managing eth0 so that Ifupdown2 can control it.
# The section header and the key must be on separate lines.
printf '[keyfile]\nunmanaged-devices=interface-name:eth0\n' | sudo tee -a /etc/NetworkManager/NetworkManager.conf

echo "### Before Change ###"
sudo nmcli dev status
sudo systemctl stop NetworkManager; sudo systemctl start NetworkManager

echo "### After Change ###"
sudo nmcli dev status
echo "Eth0 should now show as \"unmanaged\" according to the output above."
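Rather than eyeballing the version, the check can be scripted. A sketch, assuming the apt-cache policy output format shown in the sample text (the function name is mine, and the Candidate line in the sample is illustrative):

```shell
#!/usr/bin/env bash

# Extract the value of the "Installed:" line from `apt-cache policy` output.
installed_version() {
    awk '$1 == "Installed:" { print $2 }'
}

expected="1.0~git20170314-1"
sample='ifupdown2:
  Installed: 1.0~git20170314-1
  Candidate: 1.0~git20170314-1'

if [ "$(printf '%s\n' "$sample" | installed_version)" = "$expected" ]; then
    echo "ifupdown2 is at the expected version"
else
    echo "unexpected ifupdown2 version"
fi
```

On the device itself, pipe in the real output instead of the sample: `apt-cache policy ifupdown2 | installed_version`.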


Troubleshooting Linux Software RAID (MDADM)

Recently I had the pleasure of rebooting my NAS server for some standard “maintenance” activities i.e. kernel updates etc. Naturally when it came back up my primary large file storage RAID 6 Array did not come up automatically after the reboot.

This is always my worst fear when it comes to rebooting that box… what if the RAID doesn’t come back up and some hard drives were limping along and I didn’t know it?!

After reading syslog I found a number of errors which clearly indicated a READ error on /dev/sdc.

Sep 22 22:25:37 blackbox kernel: [264777.620821] ata3.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x0
Sep 22 22:25:37 blackbox kernel: [264777.624084] ata3.00: irq_stat 0x40000008
Sep 22 22:25:37 blackbox kernel: [264777.627338] ata3.00: failed command: READ FPDMA QUEUED
Sep 22 22:25:37 blackbox kernel: [264777.630585] ata3.00: cmd 60/08:90:08:08:00/00:00:00:00:00/40 tag 18 ncq 4096 in
Sep 22 22:25:37 blackbox kernel: [264777.630585] res 41/40:00:09:08:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
Sep 22 22:25:37 blackbox kernel: [264777.637060] ata3.00: status: { DRDY ERR }
Sep 22 22:25:37 blackbox kernel: [264777.640253] ata3.00: error: { UNC }
Sep 22 22:25:37 blackbox kernel: [264777.644587] ata3.00: configured for UDMA/133
Sep 22 22:25:37 blackbox kernel: [264777.644609] sd 2:0:0:0: [sdc] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 22 22:25:37 blackbox kernel: [264777.644616] sd 2:0:0:0: [sdc] tag#18 Sense Key : Medium Error [current] [descriptor] 
Sep 22 22:25:37 blackbox kernel: [264777.644622] sd 2:0:0:0: [sdc] tag#18 Add. Sense: Unrecovered read error - auto reallocate failed
Sep 22 22:25:37 blackbox kernel: [264777.644628] sd 2:0:0:0: [sdc] tag#18 CDB: Read(16) 88 00 00 00 00 00 00 00 08 08 00 00 00 08 00 00
Sep 22 22:25:37 blackbox kernel: [264777.644632] blk_update_request: I/O error, dev sdc, sector 2057
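
When the log is long, it helps to boil it down to which device and sector are failing. A sketch using sed on the blk_update_request lines (the function name is hypothetical; the sample line is the last one from the log above):

```shell
#!/usr/bin/env bash

# Print "<device> <sector>" for every block-layer I/O error line on stdin.
io_error_sectors() {
    sed -n 's/.*I\/O error, dev \([a-z0-9]*\), sector \([0-9]*\).*/\1 \2/p'
}

sample='Sep 22 22:25:37 blackbox kernel: [264777.644632] blk_update_request: I/O error, dev sdc, sector 2057'
printf '%s\n' "$sample" | io_error_sectors
```

Feeding it the real log and piping the result through `sort | uniq -c` shows whether one drive or several are affected.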

There are a few common causes for this style of error, and I recommend troubleshooting all of them before buying a replacement drive.

  1. Bad SATA Cables: sometimes the old ones are of low quality, and sometimes they get finicky or have dust covering the pins. Whatever the reason, SATA cables are cheap, so buy some more and try those.
  2. Bad SATA Port: the specific port on your controller could be failing. If this is a port on your motherboard and you have another one available, try that one. If not, consider buying a SATA controller card.
  3. Bad SATA Controller Card: these cards can be quite inexpensive in some cases. I’ve seen several of them fail in my lifetime, often with spurious read errors like this one being the first symptom of a larger failure in the card.
  4. Bad Hard Drive: this one is the most obvious, of course, but the items above should really be investigated first. While I’ve had more hard drive failures than any of the issues above, I have also been stricken with getting a new RMA’d hard drive and having that fail to function as well because one of the issues above was the real culprit.

One other way to investigate item #4 is with smartctl, part of the smartmontools package (available via `sudo apt-get install smartmontools`). smartctl provides a TON of useful data from the SMART controller on the disk, and it will let you know if your drive has already logged other concerning errors.

One of the most important sections is the attribute table:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 115
 3 Spin_Up_Time 0x0027 181 176 021 Pre-fail Always - 5941
 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 951
 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
 9 Power_On_Hours 0x0032 049 049 000 Old_age Always - 37920
 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 93
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 57
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 893
194 Temperature_Celsius 0x0022 120 109 000 Old_age Always - 30
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

My takeaway is that after 37920 hours (37920 / 24 / 365 = 4.33 years) in operation, and after already checking items 1-3, perhaps it was finally time to let this drive go.
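That arithmetic is easy to script against the attribute table. A sketch, assuming the column layout shown above where the raw value is the last field (the function name is mine):

```shell
#!/usr/bin/env bash

# Convert the raw Power_On_Hours value from a SMART attribute table to years.
power_on_years() {
    awk '$2 == "Power_On_Hours" { printf "%.2f\n", $NF / 24 / 365 }'
}

sample='  9 Power_On_Hours 0x0032 049 049 000 Old_age Always - 37920'
printf '%s\n' "$sample" | power_on_years
```

With smartmontools installed, `sudo smartctl -A /dev/sdc | power_on_years` would do the same for the live drive.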

Making that decision is half the battle, now it’s time to recover the array.

A Reconstructive RAID Cheat Sheet

It is for moments like this that I’m writing myself this guide for later.

Since I run RAID 6 with (5) 3TB drives, I can lose 2 out of the 5 drives and still be OK. Since it looks like I’ve only lost 1, the RAID should have been able to function, but for whatever reason it showed as inactive at boot time.

eric@blackbox:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdb1[1](S) sda1[7](S) sdd1[6](S) sde1[5](S)
 11720536064 blocks super 1.2
unused devices: <none>
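
A boot-time check for this situation is simple to sketch. It assumes the mdstat layout above, where the third field of an array line is its state (the function name is made up):

```shell
#!/usr/bin/env bash

# List md arrays that /proc/mdstat-style text on stdin reports as inactive.
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }'
}

sample='Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb1[1](S) sda1[7](S) sdd1[6](S) sde1[5](S)
 11720536064 blocks super 1.2
unused devices: <none>'
printf '%s\n' "$sample" | inactive_arrays
```

Run against the real file with `inactive_arrays < /proc/mdstat`; any output means an array needs attention.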

Identify your Hard disks

root@blackbox:/home/eric# sudo fdisk -l | grep "2.7 TiB"
Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdd: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sde: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Disk /dev/sdc: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors

/dev/sdc is my bad drive, so I’m going to remove it from the array.

root@blackbox:/home/eric# mdadm --manage /dev/md0 --remove /dev/sdc1
mdadm: hot remove failed for /dev/sdc1: No such device or address

In this case, we can see from the mdstat output above that /dev/sdc1 was not inserted into the array at boot time, so the remove operation has failed. Instead, the RAID array needs to be re-assembled from its remaining members. For this I first need to stop the array.

root@blackbox:/home/eric# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@blackbox:/home/eric#  cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
unused devices: <none>

Alright, let’s rebuild this puppy. I had a clean power-down event before so everything should still be mostly in order.

root@blackbox:/home/eric# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1
mdadm: /dev/md0 has been started with 4 drives (out of 5).
root@blackbox:/home/eric# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid6 sda1[7] sdd1[6] sde1[5] sdb1[1]
 8790400512 blocks super 1.2 level 6, 512k chunk, algorithm 2 [5/4] [UU_UU]
unused devices: <none>
root@blackbox:/home/eric# sudo mount /dev/md0 /media/Manta_Array/

And now, with the array remounted and a new drive on order, life is back to normal.
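
As a sanity check, the block count mdstat reports lines up with how RAID 6 works: two members’ worth of space goes to parity, so usable capacity is (members - 2) times the member size. A quick worked example in shell arithmetic, taking the per-member size of 2930133504 blocks from the recovery line in the drive-replacement post above:

```shell
#!/usr/bin/env bash

members=5                 # drives in the array
parity=2                  # RAID 6 spends two members' worth on parity
member_blocks=2930133504  # per-member size in 1 KiB blocks, from mdstat

usable_blocks=$(( (members - parity) * member_blocks ))
echo "usable: $usable_blocks blocks"
```

The result, 8790400512, matches the block count shown for md0 once it is active.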

Some Additional Commands I Find Useful:

Detect Present State and write it to your RAID configuration file

mdadm --verbose --detail --scan > /etc/mdadm.conf

Assemble Existing Arrays by UUID (Optional) if possible

mdadm --assemble --scan --uuid=f6ff12cd:86a8e3fb:89bc0f58:ad15e3e2

Learn about RAID info stored on each individual drive (Superblock)

mdadm --examine /dev/sda1


Adding New Fonts in Bulk to Ubuntu 16.04

This process should also work for 12.04 and 14.04.

Create a new directory under /usr/share/fonts

sudo mkdir -p /usr/share/fonts/opentype/newfonts

Place all OTF or TTF files in that directory.
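
If the fonts are scattered across a download folder, gathering them can be scripted. A sketch with a hypothetical helper that copies every OTF and TTF file (matched case-insensitively) from a source tree into one directory:

```shell
#!/usr/bin/env bash

# Copy all .otf/.ttf files found under $1 into the directory $2.
collect_fonts() {
    local src=$1 dest=$2
    mkdir -p "$dest"
    find "$src" -type f \( -iname '*.otf' -o -iname '*.ttf' \) \
        -exec cp {} "$dest"/ \;
}
```

Called from a root shell as `collect_fonts ~/Downloads/fonts /usr/share/fonts/opentype/newfonts`, it fills the directory created above (the source path is only an example).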

Fix the permissions on the new fonts, then run the font-caching utility so they are available to applications immediately.

sudo chmod -R u=rwX,go=rX /usr/share/fonts/opentype/newfonts
sudo fc-cache -fv

Other Methods

There are other methods available on modern Ubuntu as well. For individual fonts you can just double-click on them and click the “install” option in the upper right. Or use a purpose-built program like font-manager.

sudo apt-get install font-manager

Using Active/Backup Bonding (mode 1) with Ifupdown2

Ifupdown2 is a very useful interface configuration utility with tons of enhancements over the stock utility ifupdown. It was built with a specific initial use case in mind: network operating systems (NOS) like Cumulus Linux. Cumulus requires LACP support as the primary bonding method, so other modes like active-backup (mode 1) were not initially fully implemented in ifupdown2. This is changing, however; CM-14985 brings support for the bond-primary keyword and will be present in the next release of Cumulus Linux and the next version of Ifupdown2.

To hold you over until then here’s a workaround I’ve been using on my server at home running Ifupdown2 for performing active/backup bonding. Writing the sys file directly can provide the same behavior.

auto lo
iface lo inet loopback

auto enp4s0
iface enp4s0
 alias Motherboard Ethernet 
 mtu 9194

auto enxf01e341f95
iface enxf01e341f95
 alias USB3 Ethernet
 mtu 9194

auto bond0
iface bond0
 alias ActiveBackup Uplink
 bond-mode active-backup
 bond-slaves enxf01e341f95 enp4s0
 mtu 9194
 pre-up echo enp4s0 > /sys/class/net/bond0/bonding/primary
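
To confirm which interface is actually carrying traffic, the kernel’s bonding status file can be parsed. A sketch, assuming the usual "Currently Active Slave:" line in /proc/net/bonding/bond0 (the sample text is abbreviated and illustrative, and the function name is mine):

```shell
#!/usr/bin/env bash

# Print the active member from /proc/net/bonding/<bond> style text on stdin.
active_slave() {
    awk -F': ' '/^Currently Active Slave/ { print $2 }'
}

sample='Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enp4s0 (primary_reselect always)
Currently Active Slave: enp4s0
MII Status: up'
printf '%s\n' "$sample" | active_slave
```

On the server itself, `active_slave < /proc/net/bonding/bond0` should print enp4s0 while the motherboard port has link.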

Building FRRouting for PowerPC on Debian Wheezy

I did this to modernize the routing software running on an older whitebox switch built on the PowerPC architecture.

One of the challenges on these platforms aside from the PPC arch is the limited space. I found my switch did not have enough hard disk space to complete the build. My answer was to use a USB stick to provide additional disk space to complete the build. At the completion of the build my build directory consumed ~214 MB so plan accordingly if your switch does not have sufficient on-board space.

Assume ROOT for all commands unless otherwise stated.

I mounted my USB stick at /mnt/USB.

mkdir /mnt/USB
# Use Fdisk to confirm USB device.
fdisk -l 
mount /dev/sda1 /mnt/USB
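
Before kicking off the build it is worth checking that the mount really has room. A sketch with a hypothetical helper built on df -Pk (POSIX output format, sizes in KiB); the 214 MB figure is the build directory size mentioned above:

```shell
#!/usr/bin/env bash

# Succeed if the filesystem holding $1 has at least $2 MiB available.
has_free_mb() {
    local dir=$1 need_mb=$2 avail_kb
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( need_mb * 1024 )) ]
}
```

For example: `has_free_mb /mnt/USB 214 || echo "not enough space for the FRR build"`.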

Add the sources

cat << EOT >> /etc/apt/sources.list
deb wheezy main contrib non-free
deb-src wheezy main contrib non-free

deb wheezy/updates main contrib non-free
deb-src wheezy/updates main contrib non-free

deb wheezy-updates main contrib non-free
deb-src wheezy-updates main contrib non-free

deb wheezy-backports main non-free contrib
EOT

Add the Prereq packages

apt-get install git autoconf automake libtool make gawk libreadline-dev texinfo dejagnu pkg-config libpam0g-dev bison flex python-pytest libc-ares-dev python3-dev libjson-c-dev build-essential fakeroot devscripts


Install some out of Repo Prereqs from Source as shown in the Ubuntu 12.04 LTS build guide

Install newer bison from Ubuntu 14.04 package source:

mkdir builddir
cd builddir
tar -jxvf bison_3.0.2.dfsg.orig.tar.bz2 
cd bison-3.0.2.dfsg/
tar xzf ../bison_3.0.2.dfsg-2.debian.tar.gz 
sudo apt-get build-dep bison
debuild -b -uc -us
cd ..
# The package file names carry the build architecture (powerpc here, not amd64), so match it with a glob.
sudo dpkg -i ./libbison-dev_3.0.2.dfsg-2_*.deb ./bison_3.0.2.dfsg-2_*.deb
cd ..
rm -rf builddir

Install newer version of autoconf and automake:

tar xvf autoconf-2.69.tar.gz
cd autoconf-2.69
./configure --prefix=/usr
sudo make install
cd ..

tar xvf automake-1.15.tar.gz
cd automake-1.15
./configure --prefix=/usr
sudo make install
cd ..

Add frr groups and user

sudo groupadd -g 92 frr
sudo groupadd -r -g 85 frrvty
sudo adduser --system --ingroup frr --home /var/run/frr/ \
   --gecos "FRR suite" --shell /sbin/nologin frr
sudo usermod -a -G frrvty frr

Download Source, configure and compile it

git clone frr
cd frr
# A fresh git checkout has no configure script yet; bootstrap.sh generates it.
./bootstrap.sh
./configure \
    --prefix=/usr \
    --enable-exampledir=/usr/share/doc/frr/examples/ \
    --localstatedir=/var/run/frr \
    --sbindir=/usr/lib/frr \
    --sysconfdir=/etc/frr \
    --enable-pimd \
    --enable-watchfrr \
    --enable-ospfclient=yes \
    --enable-ospfapi=yes \
    --enable-multipath=64 \
    --enable-user=frr \
    --enable-group=frr \
    --enable-vty-group=frrvty \
    --enable-configfile-mask=0640 \
    --enable-logfile-mask=0640 \
    --enable-rtadv \
    --enable-fpm \
    --with-pkg-git-version
make
make install

Most guides would end here but there’s a bit more required to get FRR functioning.

Create empty FRR configuration files

sudo install -m 755 -o frr -g frr -d /var/log/frr
sudo install -m 775 -o frr -g frrvty -d /etc/frr
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/zebra.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/bgpd.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/ospfd.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/ospf6d.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/isisd.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/ripd.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/ripngd.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/pimd.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/ldpd.conf
sudo install -m 640 -o frr -g frr /dev/null /etc/frr/nhrpd.conf
sudo install -m 640 -o frr -g frrvty /dev/null /etc/frr/vtysh.conf
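
Those eleven near-identical install lines can be collapsed into a loop. A sketch with a made-up helper; to keep it runnable without root it leaves out the -o and -g ownership flags, which you would add back on the switch:

```shell
#!/usr/bin/env bash

# Create an empty, mode 640 <daemon>.conf in directory $1 for each name given.
create_empty_configs() {
    local dir=$1 daemon
    shift
    for daemon in "$@"; do
        install -m 640 /dev/null "$dir/$daemon.conf"
    done
}
```

Usage would look like `create_empty_configs /etc/frr zebra bgpd ospfd ospf6d isisd ripd ripngd pimd ldpd nhrpd`. vtysh.conf is best left as its own line because it uses the frrvty group.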

Install the init.d service

sudo install -m 755 tools/frr /etc/init.d/frr
sudo install -m 644 tools/etc/frr/daemons /etc/frr/daemons
sudo install -m 644 tools/etc/frr/daemons.conf /etc/frr/daemons.conf
sudo install -m 644 -o frr -g frr tools/etc/frr/vtysh.conf /etc/frr/vtysh.conf

Enable your Routing Daemons

cat << EOT > /etc/frr/daemons

Start FRR

service frr start
service frr status

Enable FRR At boot time for subsequent reboots

sudo update-rc.d frr defaults

Fix Exit Scripts

 sed -i 's/ip route flush proto ripng/ip route flush proto 190 \# ripng/' /usr/lib/frr/frr
 sed -i 's/ip route flush proto bgp/ip route flush proto 186 \# bgp/' /usr/lib/frr/frr
 sed -i 's/ip route flush proto isis/ip route flush proto 187 \# isis/' /usr/lib/frr/frr
 sed -i 's/ip route flush proto ospf/ip route flush proto 188 \# ospf/' /usr/lib/frr/frr
 sed -i 's/ip route flush proto rip/ip route flush proto 189 \# rip/' /usr/lib/frr/frr
 sed -i 's/ip route flush proto static/ip route flush proto 191 \# static/' /usr/lib/frr/frr
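
Before editing the real init script in place, the substitutions can be tried on sample input. The sketch below (the function name is hypothetical) applies three of the rules to stdin. Note that the ripng rule has to come before the rip rule, otherwise ripng lines would be caught by the shorter pattern and given the wrong number.

```shell
#!/usr/bin/env bash

# Apply the protocol-name-to-number rewrites to text on stdin.
rewrite_flush_protos() {
    sed -e 's/ip route flush proto ripng/ip route flush proto 190 # ripng/' \
        -e 's/ip route flush proto bgp/ip route flush proto 186 # bgp/' \
        -e 's/ip route flush proto rip/ip route flush proto 189 # rip/'
}

printf '%s\n' 'ip route flush proto ripng' 'ip route flush proto rip' \
    | rewrite_flush_protos
```

Once the output looks right, the same expressions can be handed to `sed -i` against /usr/lib/frr/frr as shown above.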

Hopefully that should do it for you. Now the next step is figuring out how to build a proper deb from the source. I’ll leave that process for next time 🙂