Installing Arch Linux with System Encryption Enabled

I still use Arch Linux because it is pretty much the only rolling-release distro that offers binary (pre-compiled) packages by default *and* only installs bare minimum packages required to get a working Linux system (no window manager, no desktop environment). I was re-installing Arch Linux on my desktop last week to set up system-wide encryption and decided to record all of the commands I used to get the bare minimum working. I am not aware of any source out there that actually tells you all of the console commands you need to run when you first install Arch Linux in plaintext format so I decided to write this post.

The system I am setting up is simple: it has only 1 hard drive. This drive has 1 boot partition 100MB big, and the rest is a single LVM partition that will eventually contain the / (aka “root”), /home, and swap partitions. This single LVM partition will be encrypted, so that when the system boots, it would require a password to unlock the /, /home, and swap partitions to actually load up the Kernel and userland.

If you are wondering why the /home partition is mounted on /mnt/home in some of the commands, this is because the Arch Linux live CD environment will load up a temporary Linux system onto your RAM, and we use the /mnt directory (again all files/folders are temporarily in your RAM) to mount the hard drive partitions. You can always use a live CD anywhere to load up a barebone Linux system to your RAM and mount the hard drives on your computer for inspection, and in such a case you will again use the /mnt directory to mount your hard drives here.

If you don’t know how to use vi (tsk, tsk), then just substitute that command with “nano” or something. Here are the commands, with comments, in plaintext format:

    # Put in the Arch Linux live CD and run the following commands...

    # Set up 2 partitions; the first one is the boot 100MB partition (be sure
    # to toggle the BOOT flag on it), and the second will be the LVM partition.
    # Be sure to select 8E for the LVM partition! You can use
    # /dev/disk/by-uuid/XXXX for more fine-grained control; use "blkid -o list"
    # to find out which drives are given which UUIDs.

    cfdisk /dev/sda

    # Now set up encryption on the LVM partition (the 2nd partition, so it's
    # /dev/sda2). It is recommended to use aes-xts-plain64 if the partition is
    # larger than 2 TB, I think.

    cryptsetup -c aes-xts-plain -y -s 512 luksFormat /dev/sda2

    # Decrypt the encrypted partition with LUKS. We need to decrypt it first
    # before setting up LVM volumes and filesystems on it. This will create
    # an entry under /dev/mapper/luks.

    cryptsetup luksOpen /dev/sda2 luks

    # Set up LVM. We call our volume group "vg0". To be honest I am not sure if
    # the "--contiguous y" option for swap matters because we are encrypting it
    # (doesn't block encryption encrypt data randomly on the disk?), but that's
    # what I found online and that's what I'll use here. The "+100%FREE" option
    # is a very handy option that tells LVM to fill up all remaining free space,
    # in this case, for the "home" logical volume.
    # Read more about LVM on your favorite online wiki.

    pvcreate /dev/mapper/luks
    vgcreate vg0 /dev/mapper/luks
    lvcreate --size 40G vg0 --name root
    lvcreate --size 2G --contiguous y vg0 --name swap
    lvcreate -l +100%FREE vg0 --name home

    # Initialize filesystems. Btrfs is still experimental, so I choose ext4
    # here. The vg0-root, vg0-home, and vg0-swap volumes should
    # appear under /dev/mapper as you create them with the lvcreate command.

    mkfs.ext4 /dev/mapper/vg0-root
    mkfs.ext4 /dev/mapper/vg0-home
    mkswap /dev/mapper/vg0-swap

    # Mount the logical volumes. Why? Because we need to install Arch Linux onto
    # it! You probably don't need to mount the "home" volume but it couldn't
    # hurt. Besides, it's nice to test that all volumes can be mounted OK anyway.

    mount /dev/mapper/vg0-root /mnt # /mnt is our system's "/" root   directory
    mkdir /mnt/home
    mount /dev/mapper/vg0-home /mnt/home
    swapon /dev/mapper/vg0-swap

    # Set up package download mirrors. The kernel mirror is generally a good one
    # if you live in the US. Google "arch linux mirror status" to find the
    # fastest mirrors.

    vi /etc/pacman.d/mirrorlist

    # Download and install the core minimum packages to get a working Linux
    # terminal (console) environment. Ah, simplicity!

    pacstrap -i /mnt base base-devel

    # Generate and inspect the filesystem tables needed when the system first
    # starts. The file should contain four "partitions" (I put them in quotes
    # because remember, we are using only 2 real partitions; 1st one is /boot and
    # the 2nd one is our LVM containing /, /home, and swap).

    genfstab -U -p /mnt >> /mnt/etc/fstab
    vi /mnt/etc/fstab

    # Make the system pretend that /mnt is /. This is called "chrooting". We need
    # to do this because some of the commands like locale-gen, systemctl, etc.
    # read configuration files from /etc, for example, and right now our /etc is
    # the /etc used by the Arch Linux live CD, not our newly-installed system's
    # root directory, /mnt.

    arch-chroot /mnt /bin/bash

    # Choose and generate locale; programs supporting internationalization (aka
    # "i18n") will use the language you pick to generate the proper text used in
    # menus, error messages, etc. Just choose "en_US.UTF-8 UTF-8" if you live in
    # the US.

    vi /etc/locale.gen
    locale-gen
    echo LANG=en_US.UTF-8 > /etc/locale.conf

    # Choose default font used by the virtual console (the terminal screen you
    # will see when you first boot up your new Arch Linux system).

    setfont Lat2-Terminus16
    echo FONT=Lat2-Terminus16 > /etc/vconsole.conf

    # Choose timezone. Here, we tell hwclock that we want the hardware clock on
    # the motherboard to be set to UTC, which will then be interpreted as UTC
    # time by the kernel and re-adjusted to the timezone you chose when you
    # actually use the system. This is nice because UTC is the "master" time the
    # world uses anyway, so it makes sense to tell your motherboard to use it.

    ln -s /usr/share/zoneinfo/America/Los_Angeles /etc/localtime
    hwclock --systohc --utc

    # Choose your computer's name. Keep it simple, without spaces. I use "k0".

    echo MYHOSTNAME > /etc/hostname

    # Enable DHCPCD... aka "acquire dynamic ip address from your router/switch".
    # Be sure to choose the right eth0 or eth1 (eth2, if you have 3 ethernet
    # ports) port. If you choose the wrong "eth" number, just go to
    # /etc/systemd/system/multi-user.target.wants/ and rename the symlink; e.g.,
    # rename /etc/systemd/system/multi-user.target.wants/dhcpcd@eth0.service to
    # /etc/systemd/system/multi-user.target.wants/dhcpcd@eth1.service.

    systemctl enable dhcpcd@eth0.service

    # Disable unused package repos. You will want to disable the [testing] repo
    # unless you want to test unstable packages and engage in the package
    # debugging/development process. For Haskellers, be sure to add the
    # [haskell-core] repo (see
    # https://wiki.archlinux.org/index.php/Haskell_Package_Guidelines).

    vi /etc/pacman.conf

    # Set up the root user ("administrator" for you Windows people) password.

    passwd

    # Install zsh, because you will like it better any other shell out there.

    pacman -S zsh

    # Add your regular user, and set up your password. This is your normal
    # username. We use the "-s /bin/zsh" option to tell useradd that we want to
    # log in with zsh.

    # useradd -m -g users -G wheel,storage,power -s /bin/zsh MYUSERNAME
    # passwd MYUSERNAME

    # VERY IMPORTANT: Add in the filesystem type that /root partition (LVM) is
    # using (for this tutorial, it is "ext4") into the MODULES variable and also
    # add 'encrypt' and 'lvm2' into HOOKS.

    vi /etc/mkinitcpio.conf

    # Re-generate the linux image to take into account the LVM and encrypt
    # flags we added into /etc/mkinitcpio.conf. I say "re-generate" because this
    # is our second time doing this (the first time was when we installed the
    # linux package with the pacstrap command above).

    mkinitcpio -p linux

    # Install boot loader. I like SYSLINUX because of the saner/easier-looking
    # boot configuration file, compared to GRUB 2. Be sure to add in
	#     "APPEND cryptdevice=/dev/disk/by-uuid/xxxxxxxxxx:luks root=/dev/mapper/vg0-root resume=/dev/mapper/vg0-swap ro"
    # under the "LABEL arch" entries, so that SYSLINUX tells the kernel to look
    # for the encrypted LVM partition. Otherwise, your system won't boot! Don't
    # forget to actually look up the UUID of your LVM partition (/dev/sda2 in
    # this tutorial). You don't need to tell the kernel about your vg0-home
    # volume because that will get mounted by /etc/fstab when it is read later in
    # the boot process.

    pacman -S syslinux
    syslinux-install_update -i -a -m
    vi /boot/syslinux/syslinux.cfg

    # Exit chroot environment.

    exit

    # Unmount volumes, and reboot. Remove your Arch Linux live CD when your
    # computer powers on again.

    umount /mnt/{boot,home}
    swapoff
    reboot

PROTIP: Use the ALT+F2, ALT+F3, etc. shortcuts to log into and switch between different TTY screens; this is very useful when you are unsure about some of the commands used above and want to look at their manpages for more detailed explanations.

The process of installing Arch Linux is long, but arguably extremely clean and clutter-free. *EVERY* command counts, and matters. And you also get system-level encryption (everything except the /boot partition is encrypted). I would go ahead and install gvim, emacs, xmonad, and all the other little programs I use daily, but that can wait until the next reboot. And because it’s Linux, you don’t have to worry about restarting the whole system when you install new programs, so you can actually install 99% of everything you need within one session.

How to Encrypt Your USB Flash Drives

Did you know that your modern Linux kernel comes with a built-in encryption framework? I am talking about dm-crypt (device-mapper crypt) and the user-friendly layer on top of it, called LUKS (Linux Unified Key Setup). I just encrypted all of my USB flash drives two weeks ago using the dm-crypt + LUKS method and I am very happy with the results.

The process itself is very simple.

# 1. Find the correct device.

lsblk

# 2. Wipe the device with random data. I prefer to target the disk by its UUID
# because using the /dev/sdX convention is not very reliable (the letters can
# change between boots/hotmounts). NOTE: You might be interested in
# http://frandom.sourceforge.net/ if your device is over 16 GiB or so, because
# using /dev/urandom can be very slow. If using Arch Linux, you can get it from
# the AUR: https://aur.archlinux.org/packages/frandom/.

dd if=/dev/urandom of=/dev/disk/by-uuid/XXX bs=4096

# 3. Create the partition on the device.

cfdisk /dev/disk/by-uuid/XXX

# 4. Encrypt the partition and make it LUKS-compatible. See the manpage for
# cryptsetup(8).
#   -c: cipher type to use
#   -y: LUKS will ask you to input the passphrase; using -y will ask you twice
#       and complain if the two do not match.
#   -s: Key size in bits; the larget the merrier, but limited by the cipher/mode used.

cryptsetup -c aes-xts-plain -y -s 512 luksFormat /dev/disk/by-uuid/XXX

# 5. Open the partition with LUKS.

cryptsetup luksOpen /dev/disk/by-uuid/XXX mycrypteddev

# The partition is now available from /dev/mapper/mycrypteddev as a "regular"
# partition, since LUKS is now handling all block device encryption between the
# user and the device.

# 6. Set up a filesystem on the partition.

mkfs.ext4 /dev/mapper/mycrypteddev

# 7. Close the partition with LUKS.

cryptsetup luksClose /dev/mapper/mycrypteddev

# Encryption setup complete! Now every time you want to access the partition,
# you must first open it with LUKS and then mount it. Then when you're done, do
# the reverse: unmount and close it with LUKS.

# To mount and open with LUKS:

cryptsetup luksOpen /dev/disk/by-uuid/XXX mycrypteddev
mount -t ext4 /dev/mapper/mycrypteddev /mnt/mount_point

# To unmount and close with LUKS:

umount /mnt/mount_point
cryptsetup luksClose mycrypteddev

The mount/open and unmount/close steps necessary for using the device is laborious. That’s why you should write a bash script to run them. I’ve written the following bash script called cmount.sh to access my 3 USB drives this way:

#!/bin/zsh
# LICENSE: PUBLIC DOMAIN
# mount/unmount encrypted flash drives

mp=$3
uuid=""

case $2 in
    "0")
        uuid="11e102cd-dea1-46a8-ae9b-b3f74b536e64" # my red USB drive
        ;;
    "1")
        uuid="cf169437-b937-4a39-86cb-7ca82bd9fe94" # my green one
        ;;
    "2")
        uuid="57a0b7d5-d2a6-47e0-a0e3-adf69501d0cd" # my blue one
        ;;
    *)
        ;;
esac

if [[ $uuid == "" ]]; then
    echo "No predefined device specified."
    exit 0
fi

case $1 in
    "m")
        echo "Authorizing encrypted partition /dev/mapper/$mp..."
        sudo cryptsetup luksOpen /dev/disk/by-uuid/$uuid $mp
        echo -n "Mounting partition on /mnt/$mp..."
        sudo mount -t ext4 /dev/mapper/$mp /mnt/$mp && echo "done."
        ;;
    "u")
        echo -n "Unmounting /mnt/$mp..."
        sudo umount /mnt/$mp && echo "done."
        echo -n "Closing encrypted partition /dev/mapper/$mp..."
        sudo cryptsetup luksClose $mp && echo "done."
        ;;
    *)
        ;;
esac

To mount the green USB to /mnt/ef0 (“ef0” is just an arbitrary folder name):

./cmount.sh m 1 ef0

Then to unmount:

./cmount.sh u 1 ef0

Simple, eh? Go forth and encrypt all of your USB drives, so that when they get lost, they can’t be read by curious strangers. You can use the above steps to create and encrypt multiple partitions in the same device, or to only encrypt one partition while leaving other partitions unencrypted (i.e., steps 4 through 7 are partition-specific). The choice is yours. I prefer partition-level (aka “block device”) encryption over file/folder encryption because I don’t have to mentally think every time “hey, do I want to encrypt this?” for every file/folder I create.

If you want to look into encrypting your hard drives and swap partitions, read through this disk encryption page, and particularly, this section. There are many “levels” of encryption, and you should consider the many options available to you.

Skype on Linux: Startup Crash Fix

I’ve been using Skype recently and it kept crashing randomly a few seconds after startup. I managed to salvage the error message, which states as follows:

skype: pcm.c:2811: snd_pcm_areas_copy: Assertion `src_areas' failed.

Apparently, I’m not the first to notice this problem. I learned from that link that Microsoft is in no hurry to fix this longstanding, horrible bug that is a simple 1-liner fix. The fix is as follows:

mkdir ~/.Skype/Logs

Shame on the current Skype developers.

Update May 31, 2012: Apparently, Skype automatically creates huge logfiles in the ~/.Skype/Logs directory, and there is no way to turn off this useless feature (the logfiles are written in a proprietary Skype-only format, and is only designed to help Skype fix bugs if anything goes wrong; no personal data (chat history, etc.) is written in those logs according to online sources). I was bitten by this problem earlier this evening when I realized that my home partition (a meager 9GB) filled up with a 2GB Skype logfile, which prevented me from downloading a file online. The solution is simple: make the logfile a symlink into /dev/null — this way, you don’t have to periodically delete the logfiles — they are automatically “deleted” as they are being written! I knew I could take advantage of /dev/null someday and this is the perfect time for it! EDIT June 1, 2012: Actually, Skype did crash once mysteriously because of this, so just make a symlink to some partition with more space available, and then run a script upon boot up/power off to delete all files in that directory. Sorry about the bad advice!

There you go — happy Skyping!

Update June 3, 2012: OK, so apparently there is a way to create a “blackhole” directory, much in the same spirit of the special /dev/null file. This stackexchange page addresses the same concern, and led me to nullfs, which is a FUSE program that allows you to mount /dev/null-like filesystems from userspace (as regular programs, without root privileges). The code for these programs are just a few hundred lines long and compile very quickly. There are three flavors (nul1fs, nullfs, and nulnfs) and I’ve tested out nullfs to great success. You can create a “blackhole” like directory where files can be created into, but which does not take up any disk space. I just ran Skype again right now and the output of “lsof | grep skype” looks very normal (the log files in ~/.Skype/Logs are indeed created in Logs). So here’s how to do it: compile nullfs with

g++ -o nullfs -lfuse nullfs

and then create a regular directory anywhere with

mkdir blackhole

and then mount the nullfs filesystem on that directory with

path-to-nullfs-binary path-to-blackhole-directory 

and you’re good to go. For our use case, the blackhole directory would just be ~/.Skype/Logs. The nullfs binary backgrounds itself so you don’t need to worry about backgrounding/disowning the process. You can put the one-liner nullfs invocation into your startup script of choice, and you’re all set! I haven’t tried out the nul1fs or nulnfs flavors, because I didn’t need to. Hooray for FUSE!

Simple Password Generation with Haskell

So, I’ve been using a custom password generator for a while. It’s great because it has some useful settings like “only generate alphanumeric passwords” or “only generate letters” to deal with stupid, legacy-code websites that refuse to modernize their password handling code. I use this generator to create 50-character passwords, because hey, if it’s randomly generated, then you might as well generate really long passwords! I recently upgraded the pseudorandom number generator (PRNG) to use a cryptographically secure PRNG (CPRNG), just for fun.

Anyway, here is the program:

-- LICENSE: PUBLIC DOMAIN
module Main where

import Control.Monad.State
import Crypto.Random.AESCtr
import Data.Binary (decode)
import qualified Data.ByteString.Lazy as B
import Data.List (nub)
import IO
import System (getArgs)
import System.IO (hSetEcho)

keysChar, keysNum, keysPunc, keysCharNum, keysAll, keysHex :: String
keysChar = ['a'..'z'] ++ ['A'..'Z']
keysHex = ['a'..'f']
keysNum = ['0'..'9']
keysPunc = "`~!@#$%^&*()-_=+[{]}\\|;:'\",<.>/? "
keysCharNum = keysChar ++ keysNum
keysAll = keysChar ++ keysNum ++ keysPunc

giveKey ::  String -> Char -> Int -> Char
giveKey keysCustom c n = extractChar $ case c of
    'i'  -> (keysNum ++ keysHex)
    'j'  -> keysNum
    'k'  -> keysChar
    'l'  -> keysCharNum
    ';'  -> keysPunc
    'h'  -> (keysCharNum ++ keysCustom)
    '\n' -> ['\n']
    _    -> keysAll
    where
    extractChar xs = xs!!mod n (length xs)

showRandomKey :: String -> StateT AESRNG IO ()
showRandomKey keysCustom = handleKey =<< liftIO getChar
    where
    handleKey key = case key of
        '\n' -> liftIO (putChar '\n') >> showRandomKey keysCustom
        'q' -> (liftIO $ putStrLn "\nBye!") >> return ()
        _ -> mapM_ f [0..(49)::Int] >> (liftIO $ putStrLn []) >> showRandomKey keysCustom
        where
        f _ = liftIO
            . putChar
            . giveKey keysCustom key
            . (\n -> mod n (length (keysAll ++ keysCustom) - 1))
            =<< aesRandomInt

aesRandomInt :: StateT AESRNG IO Int
aesRandomInt = do
    aesState <- get
    let (bs, aesState') = genRandomBytes aesState 16
    put aesState'
    return (decode $ B.fromChunks [bs])

main :: IO ()
main = do
    hSetBuffering stdin NoBuffering -- disable buffering from STDIN
    hSetBuffering stdout NoBuffering -- disable buffering from STDOUT
    hSetEcho stdin False -- disable terminal echo
    as <- getArgs
    let as' = filter (\c -> elem c keysAll) . nub $ unwords as
    mapM_ putStrLn
        [ []
        , "poke: 'q'     quit"
        , "      'j'     number"
        , "      'k'     letter"
        , "      'l'     alphanumeric"
        , "      ';'     punctuation"
        , "      'h'     alphanumeric" ++ (if null as' then [] else " + " ++ as')
        , "      'i'     hexadecimal"
        , "      'ENTER' newline"
        , "      else    any"
        , []
        ]
    aesState <- makeSystem -- gather entropy from the system to use as the initial seed
    _ <- runStateT (showRandomKey as') aesState -- enter loop
    return ()

Yes, the CPRNG used is the Advanced Encryption Standard (AES) algorithm in counter mode. It sure beats using the default System.Random module in terms of… coolness (for the purposes of this application, the use of a CPRNG over a regular PRNG really gives no benefit). Anyway, here is a sample run:

$ poke \!@#$%
poke: 'q'     quit
      'j'     number
      'k'     letter
      'l'     alphanumeric
      ';'     punctuation
      'h'     alphanumeric + !@#$%
      'i'     hexadecimal
      'ENTER' newline
      else    any

%g@c@geisBaqK81ihYkEC5NyUrWXU2ndCKr3wHkklpjvCWF9I0
zaJ,!z{db.|~vR7,MvOnPU-5v7N;cCy'],bl/d;^s[hI}RM?j>
KfdJKVygPSOufgllMAZlEaLWHSHpDrHIcgmryETcEsx5uUWlQb
49388c0958e9c51f2f4dc06a8097eb8169f715b4fcfb3ca555

I named it “poke” because I couldn’t think of any other short, UNIX-y name.

The program takes a string as input and treats it as a special class of characters to consider when generating a password with the ‘h’ key. This way, you can fit into the requirements of legacy website code that say something like, “You may use alphanumeric characters and also ‘!’ ‘@’ ‘#’ ‘$’ and ‘%’ symbols”.

The output above is from pressing ‘h’, then SPACE, then ‘l’ and finally ‘i’. You could easily extend it to take more fine-tuned options, such as a shorter password length instead of the default 50. (Hint: use this post.)

The interesting side effect of using 50-character long, randomly generated passwords is that I myself do not know these passwords! The only thing I remember is the single password used for my GnuPG private key, used to decrypt the master password file.

In case you are curious, my complete setup regarding passwords is as follows: I store all of my passwords in plaintext in a “master” file, then encrypt it with GnuPG. I use a git repo to track this encrypted file, so that I can sanely remove/delete old passwords without worrying about how to get it back if I need it. Once in a while, I use a simple shell command, gpg2 -d mypasswords.gpg | less, to view the passwords for entry into some website (hooray for copy and paste!). If I need to update/add/delete passwords, I just decrypt the master file, then edit it, and re-encrypt it and commit the change into git (btw, I decrypt it into a RAM partition to avoid leaving any traces of the plaintext file).

The GnuPG private key used to encrypt the master file is itself encrypted with CAST5 (again using GnuPG) and tracked in git. The CAST5 encryption (with the “-c” flag) is a way to encrypt a file with symmetric encryption — i.e., you supply the same password for encryption and decryption, and plus you don’t need a GnuPG key to do it! (You wouldn’t want to encrypt it using another GnuPG key, because then you’d need to encrypt the key for that GnuPG key as well — recursion!) I constantly sync the git repo holding these important, encrypted files across all of my personal machines, thumb drives, etc., for reliability and convenience.

I could learn to use KeyPass or some other software to manage my passwords, but, I’m too lazy. I’m also too paranoid to trust anyone other than myself to handle my passwords. I also take care not to store anything *too* valuable into the passwords file, or anything I type into any file anywhere, just to play it safe.

EDIT: Hey, this is my 100th post! Congratulations to me!

Parsec Example Revisited (Again): Parsing Lazy ByteStrings

So in this post, I said that I would parse ByteStrings next time just to show how easy it is. Well, I’ve done just that. Since most of the code is identical, I’m going to post a diff after the full code.

-- LICENSE: PUBLIC DOMAIN
{-# LANGUAGE DeriveDataTypeable, RecordWildCards #-}
module Main where

import qualified Data.ByteString.Lazy as BL
import System.Console.CmdArgs.Implicit
import System.IO
import System.Environment
import System.Exit
import System.Process
import Text.Parsec.Char hiding (upper)
import Text.Parsec.Combinator
import Text.Parsec.Prim
import Text.Parsec.String
import qualified Text.Parsec.ByteString.Lazy as PB
import qualified Text.Parsec.Token as PT
import Text.Parsec.Language (emptyDef)
import Control.Monad.Identity

data Opts = Opts
    { all_devices :: Bool
    , unmount :: Bool
    , unmount_all :: Bool
    , discover :: Bool
    , no_color :: Bool
    } deriving (Data, Typeable, Show, Eq)

progOpts :: Opts
progOpts = Opts
    { all_devices = def &= help "mount all USB devices"
    , unmount = def &= help "choose a USB device to unmount"
    , unmount_all = def &= name "U" &= help "unmount all USB devices"
    , discover = def &= help "list all mounted/unmounted USB devices"
    , no_color = def &= help "disable colors"
    }
    &= details
        [ "Notes:"
        , ""
        , "The default behavior without any options is to try to mount a USB device."
            ++ " Here, `device' means a device under the /dev directory, and in our context, is actually a file system partition."
            ++ " Many USB drives have only a single partition, in which case the term `device' means both the USB drive and the single partition it has."
        , ""
        , "Also, allowing the $USER to execute the mount and umount commands with sudo privileges (sudo visudo) will make things less clunky."
        ]

getOpts :: IO Opts
getOpts = cmdArgs $ progOpts
    &= summary (_PROGRAM_INFO ++ ", " ++ _COPYRIGHT)
    &= program _PROGRAM_NAME
    &= help _PROGRAM_DESC
    &= helpArg [explicit, name "help", name "h"]
    &= versionArg [explicit, name "version", name "v", summary _PROGRAM_INFO]

_PROGRAM_NAME, _PROGRAM_VERSION, _PROGRAM_INFO, _PROGRAM_DESC, _COPYRIGHT :: String
_PROGRAM_NAME = "usbmnt"
_PROGRAM_VERSION = "0.1.0"
_PROGRAM_INFO = _PROGRAM_NAME ++ " version " ++ _PROGRAM_VERSION
_PROGRAM_DESC = "mount/unmount USB device(s)"
_COPYRIGHT = "(C) Linus Arver 2011"

data BlockDevice = BlockDevice
    { shortname :: String
    , uuid :: UUID
    , fsys :: String
    , mountPoint :: MountPoint
    } deriving (Eq)

data MountPoint
    = MPath { path :: FilePath }
    | Swap
    | Unmounted
    | UnknownBlkidVal
    deriving (Eq)

instance Show BlockDevice where
    show BlockDevice{..} = unwords
        [ shortname
        , fsys
        , uuid
        , show mountPoint
        ]

instance Show MountPoint where
    show (MPath path) = path
    show Swap = "Swap"
    show Unmounted = "Unmounted"
    show UnknownBlkidVal = "UnknownBlkidVal"

blockdeviceDefault :: BlockDevice
blockdeviceDefault = BlockDevice
    { shortname = ""
    , uuid = ""
    , fsys = ""
    , mountPoint = MPath {path = ""}
    }

data Config = Config
    { fsyss :: [(String, String)]
    , uuids :: [(UUID, String)]
    } deriving (Eq, Show)

_ALPHANUM :: String
_ALPHANUM = ['a'..'z'] ++ ['A'..'Z'] ++ ['0'..'9']

data Color
    = Red
    | Green
    | Yellow
    | Blue
    | CNone
    deriving (Show, Eq)

colorize :: Color -> String -> String
colorize c s = case c of
    Blue -> "\x1b[1;34m" ++ s ++ "\x1b[0m"
    Green -> "\x1b[1;32m" ++ s ++ "\x1b[0m"
    Red -> "\x1b[1;31m" ++ s ++ "\x1b[0m"
    Yellow -> "\x1b[1;33m" ++ s ++ "\x1b[0m"
    _ -> s

main :: IO ()
main = do
    hSetBuffering stdout NoBuffering
    hSetBuffering stderr NoBuffering
    opts <- getOpts
    homeDir <- getEnv "HOME"
    errNo <- argsCheck opts homeDir
    when (errNo > 0) $ exitWith $ ExitFailure errNo
    (devs, takenPaths) <- getDevices opts
    let configLoc = homeDir ++ "/.usbmnt"
    configSrc <- BL.readFile configLoc
    (confErrNo, config) <- parseConfig configSrc configLoc
    when (confErrNo > 0) $ exitWith $ ExitFailure confErrNo
    let mountablePaths = filter (\p -> not $ elem p takenPaths) $ map (\p -> "/mnt/u" ++ show p) [(0::Int)..]
        devsKV = zip (map show [(1::Int)..]) . zip devs $ mountablePaths
    prog opts config devsKV

argsCheck :: Opts -> String -> IO Int
argsCheck Opts{..} homeDir
    | null homeDir = e "could not get environment variable $HOME" 1
    | otherwise = return 0
    where
        e :: String -> Int -> IO Int
        e str num = errMsg str >> return num

prog :: Opts -> Config -> [(String, (BlockDevice, FilePath))] -> IO ()
prog opts@Opts{..} config devsKV
    | discover = do
        putStrLn "all devices:"
        mapM_ (\(_, (d, _)) -> putStrLn $ cshow d) devsKV
    | otherwise = do
        putStrLn (if (unmount || unmount_all)
            then "USB device(s) to unmount:"
            else "USB device(s) to mount:")
        mapM_ (\(n, (d, _)) -> putStrLn $ "    " ++ n ++ ") " ++ show' d) devsKV
        putStrLn ""
        mountMenu opts config devsKV
    where
        cshow :: BlockDevice -> String
        cshow b@BlockDevice{..}
            | no_color = show b
            | otherwise = case mountPoint of
                Unmounted -> colorize Green $ show b
                MPath _ -> if not $ null $ getUSBMountPath b
                    then colorize Blue $ show b
                    else show b
                _ -> show b
        show' :: BlockDevice -> String
        show' = if not (unmount || unmount_all)
            then show
            else unwords . init . words . show

mountMenu :: Opts -> Config -> [(String, (BlockDevice, FilePath))] -> IO ()
mountMenu Opts{..} config devsKV
    | unmount = if length devsKV == 1
        then do
            putStrLn "only 1 USB device to unmount"
            tryMount False config (snd . head $ devsKV) >>= exitWith
        else chooseDev prompt devsKV (tryMount False config)
    | unmount_all = do
        putStrLn "unmounting all USB devices..."
        mapM_ (tryMount False config) (map snd devsKV)
        return ()
    | all_devices = do
        putStrLn "mounting all USB devices..."
        mapM_ (tryMount True config) (map snd devsKV)
        return ()
    | length devsKV == 1 = do
        putStrLn "only 1 USB device to mount"
        tryMount True config (snd . head $ devsKV) >>= exitWith
    | otherwise = chooseDev prompt devsKV (tryMount True config)
    where
        prompt :: String
        prompt = if (unmount || unmount_all)
            then "choose USB device to unmount (q to exit)"
            else "choose USB device to mount (q to exit)"

chooseDev :: String -> [(String, (BlockDevice, FilePath))] -> ((BlockDevice, FilePath) -> IO ExitCode) -> IO ()
chooseDev prompt devsKV func = do
    putStrLn prompt
    key <- getLine
    case lookup key devsKV of
        Just dev -> func dev >>= exitWith
        _ -> case key of
            "q" -> return ()
            _ -> chooseDev prompt devsKV func

tryMount :: Bool -> Config -> (BlockDevice, FilePath) -> IO ExitCode
tryMount mount config@Config{..} (bd@BlockDevice{..}, mp)
    | (null margs) = do
        errMsg $ "UUID " ++ squote uuid ++ " was not found in config file"
        errMsg $ "filesystem " ++ squote fsys ++ " was also not found in config file"
        errMsg $ "supported file systems: " ++ (unwords $ map fst fsyss)
        exitWith (ExitFailure 1)
    | otherwise = do
    when mount $ do
        if (null $ mountArgsUUID config uuid)
            then putStrLn $ "filesystem " ++ squote fsys ++ " found in config file"
            else putStrLn $ "UUID " ++ squote uuid ++ " found in config file"
        putStrLn $ "using these arguments: " ++ squote margs
    putStr $ (if mount then "" else "un")
        ++ "mounting "
        ++ shortname
        ++ " (" ++ fsys ++ ") "
        ++ (if mount then "to " ++ mp else "from " ++ show mountPoint)
        ++ ".."
    (_, _, _, p) <- createProcess $ cmd margs shortname
    exitStatus <- waitForProcess p
    if (exitStatus == ExitSuccess)
        then do
            putStrLn "OK"
            return ExitSuccess
        else do
            putStr "FAILED\n"
            errMsg $ (if mount
                then "mount error (perhaps " ++ squote mp ++ " does not exist)"
                else "unmount error")
            return (ExitFailure 1)
    where
        margs = mountArgs config bd
        cmd arguments devPath = CreateProcess
            { cmdspec = ShellCommand (if mount
                then "sudo mount -t " ++ arguments ++ " " ++ devPath ++ " " ++ mp ++ " &>/dev/null"
                else "sudo umount " ++ show mountPoint)
            , cwd = Nothing
            , env = Nothing
            , std_in = CreatePipe
            , std_out = CreatePipe
            , std_err = Inherit
            , close_fds = False
            }

mountArgs :: Config -> BlockDevice -> String
mountArgs Config{..} BlockDevice{..} = case lookup uuid uuids of
    Just a -> a
    _ -> case lookup fsys fsyss of
        Just a -> a
        _ -> []

mountArgsUUID :: Config -> UUID -> String
mountArgsUUID Config{..} uuid' = case lookup uuid' uuids of
    Just a -> a
    _ -> []

getDevices :: Opts -> IO ([BlockDevice], [String])
getDevices Opts{..} = do
    (_, sout, _, p) <- createProcess cmdBlkid
    devs <- case sout of
        Just h -> hGetContents h
        Nothing -> return []
    _ <- waitForProcess p
    let devs' = (map (unwords . words)) . drop 2 . lines $ devs
    devs'' <- mapM parseBlkid devs'
    let toMount = filter (\BlockDevice{..} -> mountPoint == Unmounted) devs''
        toUnmount = filter (\dev -> not $ null $ getUSBMountPath dev) devs''
        takenPaths = filter (not . null) . map getUSBMountPath $ devs''
    when (not discover && null toMount && (not (unmount || unmount_all))) $ do
        errMsg $ "cannot find USB devices to mount"
        exitWith (ExitFailure 1)
    when (not discover && null toUnmount && (unmount || unmount_all)) $ do
        errMsg $ "cannot find USB devices to unmount"
        exitWith (ExitFailure 1)
    return $ formatDevs devs'' toMount toUnmount takenPaths
    where
        formatDevs :: [BlockDevice] -> [BlockDevice] -> [BlockDevice] -> [String] -> ([BlockDevice], [String])
        formatDevs ds m um takenPaths
            | discover = (ds, takenPaths)
            | unmount || unmount_all = (um, takenPaths)
            | otherwise = (m, takenPaths)
        cmdBlkid = CreateProcess
            { cmdspec = ShellCommand ("sudo blkid -o list")
            , cwd = Nothing
            , env = Nothing
            , std_in = CreatePipe
            , std_out = CreatePipe
            , std_err = Inherit
            , close_fds = False
            }

getUSBMountPath :: BlockDevice -> String
getUSBMountPath BlockDevice{..} = case mountPoint of
    MPath str -> if take 6 str == "/mnt/u" && (all (\c -> elem c ['0'..'9']) (drop 6 str))
        then str
        else ""
    _ -> ""

errMsg :: String -> IO ()
errMsg msg = hPutStrLn stderr $ "error: " ++ msg

squote :: String -> String
squote s = "`" ++ s ++ "'"

-- Parsing

-- for parsing the computer-generated output of `sudo blkid -o list'
parserIdentifier :: Parser String
parserIdentifier = many1 $ oneOf $ _ALPHANUM ++ "/-_"

parserWhitespace :: Parser String
parserWhitespace = many1 $ oneOf " \t\n\r"

parserMP :: Parser MountPoint
parserMP =
    try ( do
        a <- oneOf "<("
        b <- manyTill anyChar (lookAhead $ (oneOf ">)"))
        _ <- oneOf ">)"
        let mp = case a of
                '<' -> Swap
                '(' -> case b of
                    "not mounted" -> Unmounted
                    _ -> UnknownBlkidVal
                _ -> UnknownBlkidVal
        return mp
        )
    <|> (parserIdentifier >>= (\s -> return MPath {path = s}))
    <?> "blkid's mount point description"

blkidParser :: Parser BlockDevice
blkidParser =
    try ( do
        sname <- parserIdentifier
        _ <- parserWhitespace
        fs <- parserIdentifier
        _ <- parserWhitespace
        _ <- parserIdentifier -- leave out the "label" column, even if it exists
        _ <- parserWhitespace
        mp <- parserMP
        _ <- parserWhitespace
        uid <- parserIdentifier
        eof
        return BlockDevice
           { shortname = sname
           , uuid = uid
           , fsys = fs
           , mountPoint = mp
           }
        )
    <|>
    do  sname <- parserIdentifier
        _ <- parserWhitespace
        fs <- parserIdentifier
        _ <- parserWhitespace
        mp <- parserMP
        _ <- parserWhitespace
        uid <- parserIdentifier
        eof
        return BlockDevice
            { shortname = sname
            , uuid = uid
            , fsys = fs
            , mountPoint = mp
            }
    <?> "5 or 4 fields to parse"

parseBlkid :: String -> IO BlockDevice
parseBlkid src =
    case parse blkidParser "output of `sudo blkid -o list'" src of
        Left parseError -> errMsg (show parseError) >> return blockdeviceDefault
        Right result -> return result

-- we use a LanguageDef so that we can get whitespace/newline parsing for FREE
-- in our .usbmnt file
configDef :: PT.GenLanguageDef BL.ByteString () Identity
configDef = emptyDef
    { PT.commentStart   = ""
    , PT.commentEnd     = ""
    , PT.commentLine    = "#"
    , PT.nestedComments = False
    -- the identStart/identLetter define what a UUID will look like (a
    -- dash-separated hex number)
    , PT.identStart     = oneOf $ ['0'..'9'] ++ ['a'..'f'] ++ ['A'..'F']
    , PT.identLetter    = oneOf $ ['0'..'9'] ++ ['a'..'f'] ++ ['A'..'F'] ++ "-"
    , PT.opStart        = char '.'
    , PT.opLetter       = char '.'
    , PT.reservedOpNames= []
    , PT.reservedNames  = []
    , PT.caseSensitive  = True
    }

-- we call makeTokenParser def and pick out just those we need
lexer :: PT.GenTokenParser BL.ByteString () Identity
lexer = PT.makeTokenParser configDef

p_identifier :: ParsecT BL.ByteString () Identity String
p_identifier = PT.identifier lexer
p_stringLiteral :: ParsecT BL.ByteString () Identity String
p_stringLiteral = PT.stringLiteral lexer
p_whiteSpace :: ParsecT BL.ByteString () Identity ()
p_whiteSpace = PT.whiteSpace lexer
p_braces :: ParsecT BL.ByteString () Identity a -> ParsecT BL.ByteString () Identity a
p_braces = PT.braces lexer
p_commaSep :: ParsecT BL.ByteString () Identity a -> ParsecT BL.ByteString () Identity [a]
p_commaSep = PT.commaSep lexer
p_symbol :: String -> ParsecT BL.ByteString () Identity String
p_symbol = PT.symbol lexer

type UUID = String

assocParser :: PB.Parser String -> PB.Parser (UUID, String)
assocParser keyParser = do
    key <- keyParser
    _ <- many $ oneOf " \t"
    _ <- string "="
    _ <- many $ oneOf " \t"
    mountOpts <- p_stringLiteral
    return (key, mountOpts)
    <?> "a key-value association"

hashParser :: String -> PB.Parser String -> PB.Parser [(String, String)]
hashParser hashName keyParser = do
    _ <- p_symbol hashName
    _ <- p_symbol "="
    a <- p_braces (p_commaSep $ assocParser keyParser)
    return a
    <?> "a " ++ hashName ++ " curly brace block"

configParser :: PB.Parser Config
configParser = do
    p_whiteSpace -- take care of leading whitespace/comments as defined by configDef
    -- parse FSYS_HASH first
    fsyss' <- hashParser "FSYS_HASH" (many1 alphaNum)
    p_whiteSpace
    -- now parse UUID_HASH
    uuids' <- hashParser "UUID_HASH" (p_identifier)
    eof
    return $ Config {fsyss = fsyss', uuids = uuids'}
    <?> "config with FSYS_HASH and UUID_HASH blocks"

parseConfig :: BL.ByteString -> String -> IO (Int, Config)
parseConfig src loc =
    case parse configParser ("config file at " ++ squote loc) src of
        Left parseError -> errMsg (show parseError) >> return (1, Config [] [])
        Right result -> return (0, result)

And here is the diff:

--- usbmnt.hsold	2011-11-09 11:43:02.871554967 -0800
+++ usbmnt.hsnew	2011-11-09 11:43:09.761453413 -0800
@@ -2,6 +2,7 @@
 {-# LANGUAGE DeriveDataTypeable, RecordWildCards #-}
 module Main where
 
+import qualified Data.ByteString.Lazy as BL
 import System.Console.CmdArgs.Implicit
 import System.IO
 import System.Environment
@@ -11,6 +12,7 @@
 import Text.Parsec.Combinator
 import Text.Parsec.Prim
 import Text.Parsec.String
+import qualified Text.Parsec.ByteString.Lazy as PB
 import qualified Text.Parsec.Token as PT
 import Text.Parsec.Language (emptyDef)
 import Control.Monad.Identity
@@ -126,7 +128,7 @@
     when (errNo > 0) $ exitWith $ ExitFailure errNo
     (devs, takenPaths) <- getDevices opts
     let configLoc = homeDir ++ "/.usbmnt"
-    configSrc <- readFile configLoc
+    configSrc <- BL.readFile configLoc
     (confErrNo, config) <- parseConfig configSrc configLoc
     when (confErrNo > 0) $ exitWith $ ExitFailure confErrNo
     let mountablePaths = filter (\p -> not $ elem p takenPaths) $ map (\p -> "/mnt/u" ++ show p) [(0::Int)..]
@@ -379,7 +381,7 @@
 
 -- we use a LanguageDef so that we can get whitespace/newline parsing for FREE
 -- in our .usbmnt file
-configDef :: PT.LanguageDef st
+configDef :: PT.GenLanguageDef BL.ByteString () Identity
 configDef = emptyDef
     { PT.commentStart   = ""
     , PT.commentEnd     = ""
@@ -397,25 +399,25 @@
     }
 
 -- we call makeTokenParser def and pick out just those we need
-lexer :: PT.TokenParser ()
+lexer :: PT.GenTokenParser BL.ByteString () Identity
 lexer = PT.makeTokenParser configDef
 
-p_identifier :: ParsecT String () Identity String
+p_identifier :: ParsecT BL.ByteString () Identity String
 p_identifier = PT.identifier lexer
-p_stringLiteral :: ParsecT String () Identity String
+p_stringLiteral :: ParsecT BL.ByteString () Identity String
 p_stringLiteral = PT.stringLiteral lexer
-p_whiteSpace :: ParsecT String () Identity ()
+p_whiteSpace :: ParsecT BL.ByteString () Identity ()
 p_whiteSpace = PT.whiteSpace lexer
-p_braces :: ParsecT String () Identity a -> ParsecT String () Identity a
+p_braces :: ParsecT BL.ByteString () Identity a -> ParsecT BL.ByteString () Identity a
 p_braces = PT.braces lexer
-p_commaSep :: ParsecT String () Identity a -> ParsecT String () Identity [a]
+p_commaSep :: ParsecT BL.ByteString () Identity a -> ParsecT BL.ByteString () Identity [a]
 p_commaSep = PT.commaSep lexer
-p_symbol :: String -> ParsecT String () Identity String
+p_symbol :: String -> ParsecT BL.ByteString () Identity String
 p_symbol = PT.symbol lexer
 
 type UUID = String
 
-assocParser :: Parser String -> Parser (UUID, String)
+assocParser :: PB.Parser String -> PB.Parser (UUID, String)
 assocParser keyParser = do
     key <- keyParser
     _ <- many $ oneOf " \t"
@@ -425,7 +427,7 @@
     return (key, mountOpts)
     <?> "a key-value association"
 
-hashParser :: String -> Parser String -> Parser [(String, String)]
+hashParser :: String -> PB.Parser String -> PB.Parser [(String, String)]
 hashParser hashName keyParser = do
     _ <- p_symbol hashName
     _ <- p_symbol "="
@@ -433,7 +435,7 @@
     return a
     <?> "a " ++ hashName ++ " curly brace block"
 
-configParser :: Parser Config
+configParser :: PB.Parser Config
 configParser = do
     p_whiteSpace -- take care of leading whitespace/comments as defined by configDef
     -- parse FSYS_HASH first
@@ -445,7 +447,7 @@
     return $ Config {fsyss = fsyss', uuids = uuids'}
     <?> "config with FSYS_HASH and UUID_HASH blocks"
 
-parseConfig :: String -> String -> IO (Int, Config)
+parseConfig :: BL.ByteString -> String -> IO (Int, Config)
 parseConfig src loc =
     case parse configParser ("config file at " ++ squote loc) src of
         Left parseError -> errMsg (show parseError) >> return (1, Config [] [])

Like I said, pretty straightforward, right? The code actually only uses lazy bytestrings for parsing the configuration file. The output of the blkid command is still parsed with the old way (native Haskell strings). The only tricky part is to import the bytestring stuff (Data.ByteString.Lazy and Text.Parsec.ByteString.Lazy) as qualified imports to avoid namespace clashes. The configDef function’s type signature had to change a bit because Parsec 3.1.1 does not have a convenience type alias for LanguageDef that uses bytestrings. To be honest, I couldn’t figure out the type signatures for p_identifier, p_symbol, etc., so I had GHC do it for me. But I mean, it all took like 5-10 minutes. Easy.

So now you can parse bytestrings — lazily!

Parsec Example Revisited: Custom Configuration File Format Meets the Token Module

As promised, here is a solution to one of the exercises I posed here — namely, to read and parse a configuration file to determine mount options to pass along to the mount command. (No one’s reading this stuff right now, but whatever. Haskell is still that much fun for me!) That is, the previous solution used a rather crude, hard-coded function, fileSystemArgs, as follows:

fileSystemArgs :: String -> [(String, String)]
fileSystemArgs user =
    [ ("ext2", "ext2 -o rw,relatime")
    , ("vfat", "vfat -o rw,uid=" ++ user ++ ",gid=" ++ user)
    ]

I’ve removed this code in favor of reading such options from a configuration file at runtime, as per the exercise. The code has increased to 452 lines (only about 100 more lines than before, with my verbose style), and we make use of the Text.Parsec.Token module to define a super-simple mini-language that we use in the configuration file (two curly brace blocks that define mount options). We only use bits and pieces of it, but it’s still handy because we get some super-convenient parser combinators for free — namely, braces, stringLiteral, and whiteSpace (which are actually aliased to p_braces, p_stringLiteral, and p_whiteSpace in the code). What’s more, these functions intelligently evade trailing whitespace and comment lines automatically. This is why I love Parsec!!!

So without further ado, here’s the code (PUBLIC DOMAIN):

-- LICENSE: PUBLIC DOMAIN
{-# LANGUAGE DeriveDataTypeable, RecordWildCards #-}
module Main where

import System.Console.CmdArgs.Implicit
import System.IO
import System.Environment
import System.Exit
import System.Process
import Text.Parsec.Char hiding (upper)
import Text.Parsec.Combinator
import Text.Parsec.Prim
import Text.Parsec.String
import qualified Text.Parsec.Token as PT
import Text.Parsec.Language (emptyDef)
import Control.Monad.Identity

data Opts = Opts
    { all_devices :: Bool
    , unmount :: Bool
    , unmount_all :: Bool
    , discover :: Bool
    , no_color :: Bool
    } deriving (Data, Typeable, Show, Eq)

progOpts :: Opts
progOpts = Opts
    { all_devices = def &= help "mount all USB devices"
    , unmount = def &= help "choose a USB device to unmount"
    , unmount_all = def &= name "U" &= help "unmount all USB devices"
    , discover = def &= help "list all mounted/unmounted USB devices"
    , no_color = def &= help "disable colors"
    }
    &= details
        [ "Notes:"
        , ""
        , "The default behavior without any options is to try to mount a USB device."
            ++ " Here, `device' means a device under the /dev directory, and in our context, is actually a file system partition."
            ++ " Many USB drives have only a single partition, in which case the term `device' means both the USB drive and the single partition it has."
        , ""
        , "Also, allowing the $USER to execute the mount and umount commands with sudo privileges (sudo visudo) will make things less clunky."
        ]

getOpts :: IO Opts
getOpts = cmdArgs $ progOpts
    &= summary (_PROGRAM_INFO ++ ", " ++ _COPYRIGHT)
    &= program _PROGRAM_NAME
    &= help _PROGRAM_DESC
    &= helpArg [explicit, name "help", name "h"]
    &= versionArg [explicit, name "version", name "v", summary _PROGRAM_INFO]

_PROGRAM_NAME, _PROGRAM_VERSION, _PROGRAM_INFO, _PROGRAM_DESC, _COPYRIGHT :: String
_PROGRAM_NAME = "usbmnt"
_PROGRAM_VERSION = "0.1.0"
_PROGRAM_INFO = _PROGRAM_NAME ++ " version " ++ _PROGRAM_VERSION
_PROGRAM_DESC = "mount/unmount USB device(s)"
_COPYRIGHT = "PUBLIC DOMAIN"

data BlockDevice = BlockDevice
    { shortname :: String
    , uuid :: UUID
    , fsys :: String
    , mountPoint :: MountPoint
    } deriving (Eq)

data MountPoint
    = MPath { path :: FilePath }
    | Swap
    | Unmounted
    | UnknownBlkidVal
    deriving (Eq)

instance Show BlockDevice where
    show BlockDevice{..} = unwords
        [ shortname
        , fsys
        , uuid
        , show mountPoint
        ]

instance Show MountPoint where
    show (MPath path) = path
    show Swap = "Swap"
    show Unmounted = "Unmounted"
    show UnknownBlkidVal = "UnknownBlkidVal"

blockdeviceDefault :: BlockDevice
blockdeviceDefault = BlockDevice
    { shortname = ""
    , uuid = ""
    , fsys = ""
    , mountPoint = MPath {path = ""}
    }

data Config = Config
    { fsyss :: [(String, String)]
    , uuids :: [(UUID, String)]
    } deriving (Eq, Show)

_ALPHANUM :: String
_ALPHANUM = ['a'..'z'] ++ ['A'..'Z'] ++ ['0'..'9']

data Color
    = Red
    | Green
    | Yellow
    | Blue
    | CNone
    deriving (Show, Eq)

colorize :: Color -> String -> String
colorize c s = case c of
    Blue -> "\x1b[1;34m" ++ s ++ "\x1b[0m"
    Green -> "\x1b[1;32m" ++ s ++ "\x1b[0m"
    Red -> "\x1b[1;31m" ++ s ++ "\x1b[0m"
    Yellow -> "\x1b[1;33m" ++ s ++ "\x1b[0m"
    _ -> s

main :: IO ()
main = do
    hSetBuffering stdout NoBuffering
    hSetBuffering stderr NoBuffering
    opts <- getOpts
    homeDir <- getEnv "HOME"
    errNo <- argsCheck opts homeDir
    when (errNo > 0) $ exitWith $ ExitFailure errNo
    (devs, takenPaths) <- getDevices opts
    let configLoc = homeDir ++ "/.usbmnt"
    configSrc <- readFile configLoc
    (confErrNo, config) <- parseConfig configSrc configLoc
    when (confErrNo > 0) $ exitWith $ ExitFailure confErrNo
    let mountablePaths = filter (\p -> not $ elem p takenPaths) $ map (\p -> "/mnt/u" ++ show p) [(0::Int)..]
        devsKV = zip (map show [(1::Int)..]) . zip devs $ mountablePaths
    prog opts config devsKV

argsCheck :: Opts -> String -> IO Int
argsCheck Opts{..} homeDir
    | null homeDir = e "could not get environment variable $HOME" 1
    | otherwise = return 0
    where
        e :: String -> Int -> IO Int
        e str num = errMsg str >> return num

prog :: Opts -> Config -> [(String, (BlockDevice, FilePath))] -> IO ()
prog opts@Opts{..} config devsKV
    | discover = do
        putStrLn "all devices:"
        mapM_ (\(_, (d, _)) -> putStrLn $ cshow d) devsKV
    | otherwise = do
        putStrLn (if (unmount || unmount_all)
            then "USB device(s) to unmount:"
            else "USB device(s) to mount:")
        mapM_ (\(n, (d, _)) -> putStrLn $ "    " ++ n ++ ") " ++ show' d) devsKV
        putStrLn ""
        mountMenu opts config devsKV
    where
        cshow :: BlockDevice -> String
        cshow b@BlockDevice{..}
            | no_color = show b
            | otherwise = case mountPoint of
                Unmounted -> colorize Green $ show b
                MPath _ -> if not $ null $ getUSBMountPath b
                    then colorize Blue $ show b
                    else show b
                _ -> show b
        show' :: BlockDevice -> String
        show' = if not (unmount || unmount_all)
            then show
            else unwords . init . words . show

mountMenu :: Opts -> Config -> [(String, (BlockDevice, FilePath))] -> IO ()
mountMenu Opts{..} config devsKV
    | unmount = if length devsKV == 1
        then do
            putStrLn "only 1 USB device to unmount"
            tryMount False config (snd . head $ devsKV) >>= exitWith
        else chooseDev prompt devsKV (tryMount False config)
    | unmount_all = do
        putStrLn "unmounting all USB devices..."
        mapM_ (tryMount False config) (map snd devsKV)
        return ()
    | all_devices = do
        putStrLn "mounting all USB devices..."
        mapM_ (tryMount True config) (map snd devsKV)
        return ()
    | length devsKV == 1 = do
        putStrLn "only 1 USB device to mount"
        tryMount True config (snd . head $ devsKV) >>= exitWith
    | otherwise = chooseDev prompt devsKV (tryMount True config)
    where
        prompt :: String
        prompt = if (unmount || unmount_all)
            then "choose USB device to unmount (q to exit)"
            else "choose USB device to mount (q to exit)"

chooseDev :: String -> [(String, (BlockDevice, FilePath))] -> ((BlockDevice, FilePath) -> IO ExitCode) -> IO ()
chooseDev prompt devsKV func = do
    putStrLn prompt
    key <- getLine
    case lookup key devsKV of
        Just dev -> func dev >>= exitWith
        _ -> case key of
            "q" -> return ()
            _ -> chooseDev prompt devsKV func

tryMount :: Bool -> Config -> (BlockDevice, FilePath) -> IO ExitCode
tryMount mount config@Config{..} (bd@BlockDevice{..}, mp)
    | (null margs) = do
        errMsg $ "UUID " ++ squote uuid ++ " was not found in config file"
        errMsg $ "filesystem " ++ squote fsys ++ " was also not found in config file"
        errMsg $ "supported file systems: " ++ (unwords $ map fst fsyss)
        exitWith (ExitFailure 1)
    | otherwise = do
    when mount $ do
        if (null $ mountArgsUUID config uuid)
            then putStrLn $ "filesystem " ++ squote fsys ++ " found in config file"
            else putStrLn $ "UUID " ++ squote uuid ++ " found in config file"
        putStrLn $ "using these arguments: " ++ squote margs
    putStr $ (if mount then "" else "un")
        ++ "mounting "
        ++ shortname
        ++ " (" ++ fsys ++ ") "
        ++ (if mount then "to " ++ mp else "from " ++ show mountPoint)
        ++ ".."
    (_, _, _, p) <- createProcess $ cmd margs shortname
    exitStatus <- waitForProcess p
    if (exitStatus == ExitSuccess)
        then do
            putStrLn "OK"
            return ExitSuccess
        else do
            putStr "FAILED\n"
            errMsg $ (if mount
                then "mount error (perhaps " ++ squote mp ++ " does not exist)"
                else "unmount error")
            return (ExitFailure 1)
    where
        margs = mountArgs config bd
        cmd arguments devPath = CreateProcess
            { cmdspec = ShellCommand (if mount
                then "sudo mount -t " ++ arguments ++ " " ++ devPath ++ " " ++ mp ++ " &>/dev/null"
                else "sudo umount " ++ show mountPoint)
            , cwd = Nothing
            , env = Nothing
            , std_in = CreatePipe
            , std_out = CreatePipe
            , std_err = Inherit
            , close_fds = False
            }

mountArgs :: Config -> BlockDevice -> String
mountArgs Config{..} BlockDevice{..} = case lookup uuid uuids of
    Just a -> a
    _ -> case lookup fsys fsyss of
        Just a -> a
        _ -> []

mountArgsUUID :: Config -> UUID -> String
mountArgsUUID Config{..} uuid' = case lookup uuid' uuids of
    Just a -> a
    _ -> []

getDevices :: Opts -> IO ([BlockDevice], [String])
getDevices Opts{..} = do
    (_, sout, _, p) <- createProcess cmdBlkid
    devs <- case sout of
        Just h -> hGetContents h
        Nothing -> return []
    _ <- waitForProcess p
    let devs' = (map (unwords . words)) . drop 2 . lines $ devs
    devs'' <- mapM parseBlkid devs'
    let toMount = filter (\BlockDevice{..} -> mountPoint == Unmounted) devs''
        toUnmount = filter (\dev -> not $ null $ getUSBMountPath dev) devs''
        takenPaths = filter (not . null) . map getUSBMountPath $ devs''
    when (not discover && null toMount && (not (unmount || unmount_all))) $ do
        errMsg $ "cannot find USB devices to mount"
        exitWith (ExitFailure 1)
    when (not discover && null toUnmount && (unmount || unmount_all)) $ do
        errMsg $ "cannot find USB devices to unmount"
        exitWith (ExitFailure 1)
    return $ formatDevs devs'' toMount toUnmount takenPaths
    where
        formatDevs :: [BlockDevice] -> [BlockDevice] -> [BlockDevice] -> [String] -> ([BlockDevice], [String])
        formatDevs ds m um takenPaths
            | discover = (ds, takenPaths)
            | unmount || unmount_all = (um, takenPaths)
            | otherwise = (m, takenPaths)
        cmdBlkid = CreateProcess
            { cmdspec = ShellCommand ("sudo blkid -o list")
            , cwd = Nothing
            , env = Nothing
            , std_in = CreatePipe
            , std_out = CreatePipe
            , std_err = Inherit
            , close_fds = False
            }

getUSBMountPath :: BlockDevice -> String
getUSBMountPath BlockDevice{..} = case mountPoint of
    MPath str -> if take 6 str == "/mnt/u" && (all (\c -> elem c ['0'..'9']) (drop 6 str))
        then str
        else ""
    _ -> ""

errMsg :: String -> IO ()
errMsg msg = hPutStrLn stderr $ "error: " ++ msg

squote :: String -> String
squote s = "`" ++ s ++ "'"

-- Parsing

-- for parsing the computer-generated output of `sudo blkid -o list'
parserIdentifier :: Parser String
parserIdentifier = many1 $ oneOf $ _ALPHANUM ++ "/-_"

parserWhitespace :: Parser String
parserWhitespace = many1 $ oneOf " \t\n\r"

parserMP :: Parser MountPoint
parserMP =
    try ( do
        a <- oneOf "<("
        b <- manyTill anyChar (lookAhead $ (oneOf ">)"))
        _ <- oneOf ">)"
        let mp = case a of
                '<' -> Swap
                '(' -> case b of
                    "not mounted" -> Unmounted
                    _ -> UnknownBlkidVal
                _ -> UnknownBlkidVal
        return mp
        )
    <|> (parserIdentifier >>= (\s -> return MPath {path = s}))
    <?> "blkid's mount point description"

blkidParser :: Parser BlockDevice
blkidParser =
    try ( do
        sname <- parserIdentifier
        _ <- parserWhitespace
        fs <- parserIdentifier
        _ <- parserWhitespace
        _ <- parserIdentifier -- leave out the "label" column, even if it exists
        _ <- parserWhitespace
        mp <- parserMP
        _ <- parserWhitespace
        uid <- parserIdentifier
        eof
        return BlockDevice
           { shortname = sname
           , uuid = uid
           , fsys = fs
           , mountPoint = mp
           }
        )
    <|>
    do  sname <- parserIdentifier
        _ <- parserWhitespace
        fs <- parserIdentifier
        _ <- parserWhitespace
        mp <- parserMP
        _ <- parserWhitespace
        uid <- parserIdentifier
        eof
        return BlockDevice
            { shortname = sname
            , uuid = uid
            , fsys = fs
            , mountPoint = mp
            }
    <?> "5 or 4 fields to parse"

parseBlkid :: String -> IO BlockDevice
parseBlkid src =
    case parse blkidParser "output of `sudo blkid -o list'" src of
        Left parseError -> errMsg (show parseError) >> return blockdeviceDefault
        Right result -> return result

-- we use a LanguageDef so that we can get whitespace/newline parsing for FREE
-- in our .usbmnt file
configDef :: PT.LanguageDef st
configDef = emptyDef
    { PT.commentStart   = ""
    , PT.commentEnd     = ""
    , PT.commentLine    = "#"
    , PT.nestedComments = False
    -- the identStart/identLetter define what a UUID will look like (a
    -- dash-separated hex number)
    , PT.identStart     = oneOf $ ['0'..'9'] ++ ['a'..'f'] ++ ['A'..'F']
    , PT.identLetter    = oneOf $ ['0'..'9'] ++ ['a'..'f'] ++ ['A'..'F'] ++ "-"
    , PT.opStart        = char '.'
    , PT.opLetter       = char '.'
    , PT.reservedOpNames= []
    , PT.reservedNames  = []
    , PT.caseSensitive  = True
    }

-- we call makeTokenParser def and pick out just those we need
lexer :: PT.TokenParser ()
lexer = PT.makeTokenParser configDef

p_identifier :: ParsecT String () Identity String
p_identifier = PT.identifier lexer
p_stringLiteral :: ParsecT String () Identity String
p_stringLiteral = PT.stringLiteral lexer
p_whiteSpace :: ParsecT String () Identity ()
p_whiteSpace = PT.whiteSpace lexer
p_braces :: ParsecT String () Identity a -> ParsecT String () Identity a
p_braces = PT.braces lexer
p_commaSep :: ParsecT String () Identity a -> ParsecT String () Identity [a]
p_commaSep = PT.commaSep lexer
p_symbol :: String -> ParsecT String () Identity String
p_symbol = PT.symbol lexer

type UUID = String

assocParser :: Parser String -> Parser (UUID, String)
assocParser keyParser = do
    key <- keyParser
    _ <- many $ oneOf " \t"
    _ <- string "="
    _ <- many $ oneOf " \t"
    mountOpts <- p_stringLiteral
    return (key, mountOpts)
    <?> "a key-value association"

hashParser :: String -> Parser String -> Parser [(String, String)]
hashParser hashName keyParser = do
    _ <- p_symbol hashName
    _ <- p_symbol "="
    a <- p_braces (p_commaSep $ assocParser keyParser)
    return a
    <?> "a " ++ hashName ++ " curly brace block"

configParser :: Parser Config
configParser = do
    p_whiteSpace -- take care of leading whitespace/comments as defined by configDef
    -- parse FSYS_HASH first
    fsyss' <- hashParser "FSYS_HASH" (many1 alphaNum)
    p_whiteSpace
    -- now parse UUID_HASH
    uuids' <- hashParser "UUID_HASH" (p_identifier)
    eof
    return $ Config {fsyss = fsyss', uuids = uuids'}
    <?> "config with FSYS_HASH and UUID_HASH blocks"

parseConfig :: String -> String -> IO (Int, Config)
parseConfig src loc =
    case parse configParser ("config file at " ++ squote loc) src of
        Left parseError -> errMsg (show parseError) >> return (1, Config [] [])
        Right result -> return (0, result)

And here is the new ~/.usbmnt configuration file that it parses:

# UUID/filesystem mountoptions

# the FSYS_HASH is used as a fallback, if the detected device does not match one of the UUIDs listed above
FSYS_HASH =
    { ext2 = "ext2 -o rw,relatime"
    , vfat = "vfat -o rw,uid=$USER,gid=$USER"
    }

# UUID is pretty straightforward; use blkid to figure it out for a particular device
# as for mount options, these are passed directly into the shell (/bin/sh; see createProcess in System.Process), so you can use things like $USER

# UUID_HASH = {} # if you have no particular device-specific settings

UUID_HASH =
    { 7CBF-B36F =                             "vfat -o rw,uid=$USER,gid=$USER"    # (256 MiB) SD card
    , 9c359666-a4e6-a894-3475-e6cd53660de8 =  "ext2 -o rw,relatime"               # (2 GiB) USB 2.0 thumbdrive
    }

See how there are a bunch of comment lines starting with ‘#‘? With our configDef function which defines a language with “#” as the commentLine, our convenience functions p_whiteSpace, p_symbol, etc. all avoid comment lines automatically! Sure, it’s a trivial automation here, but you can also define multiline comments and more (which I didn’t bother to do since the configuration file format I designed is so simple). I really like how I can robustly parse comment-laden text strings with such ease!

Can you imagine doing the same thing with regular expressions? I used to use regexes for parsing things like this back when I was still a very naive programmer, and then Parsec opened up the wonderful world of parsers for me (especially all the cool convenience functions you get for FREE!). The new rule I have is: use regular expressions to strictly do search-and-replace for unchanging/uniform input (using the Text.Regex.PCRE module), and to use Parsec for everything else. There’s no excuse to put off learning how to use Parsec!

Anyway, the code I posted is a working example and I really do use it in my machines! And all you need is a POSIX system with GHC, Parsec, and CmdArgs to use it.

Next time, I’ll do the second exercise and read in and parse ByteStrings instead of native Haskell Strings — just to show you how trivial the changes are!

Parsec and CmdArgs in Action: A Small Example

In this post, I mentioned that I wrote a ~300 line Haskell program to take care of mounting/unmounting USB drives (especially useful for window-manager-only users like myself). Well, I’ve been using my program (creatively named usbmnt) very happily so far and would like to release it for public consumption. It’s released into the PUBLIC DOMAIN, because that’s how I roll, baby!

-- LICENSE: PUBLIC DOMAIN
{-# LANGUAGE DeriveDataTypeable, RecordWildCards #-}
module Main where

import Control.Monad (when)
import System.Console.CmdArgs.Implicit
import System.IO
import System.Environment
import System.Exit
import System.Process
import Text.Parsec.Char hiding (upper)
import Text.Parsec.Combinator
import Text.Parsec.Prim
import Text.Parsec.String

data Opts = Opts
    { all_devices :: Bool
    , unmount :: Bool
    , unmount_all :: Bool
    , discover :: Bool
    , no_color :: Bool
    } deriving (Data, Typeable, Show, Eq)

progOpts :: Opts
progOpts = Opts
    { all_devices = def &= help "mount all USB devices"
    , unmount = def &= help "choose a USB device to unmount"
    , unmount_all = def &= name "U" &= help "unmount all USB devices"
    , discover = def &= help "list all mounted/unmounted USB devices"
    , no_color = def &= help "disable colors"
    }
    &= details
        [ "Notes:"
        , ""
        , "The default behavior without any options is to try to mount a USB device."
            ++ " Here, `device' means a device under the /dev directory, and in our context, is actually a file system partition."
            ++ " Many USB drives have only a single partition, in which case the term `device' means both the USB drive and the single partition it has."
        , ""
        , "Also, allowing the $USER to execute the mount and umount commands with sudo privileges (sudo visudo) will make things less clunky."
        ]

getOpts :: IO Opts
getOpts = cmdArgs $ progOpts
    &= summary (_PROGRAM_INFO ++ ", " ++ _COPYRIGHT)
    &= program _PROGRAM_NAME
    &= help _PROGRAM_DESC
    &= helpArg [explicit, name "help", name "h"]
    &= versionArg [explicit, name "version", name "v", summary _PROGRAM_INFO]

_PROGRAM_NAME, _PROGRAM_VERSION, _PROGRAM_INFO, _PROGRAM_DESC, _COPYRIGHT :: String
_PROGRAM_NAME = "usbmnt"
_PROGRAM_VERSION = "0.0.1"
_PROGRAM_INFO = _PROGRAM_NAME ++ " version " ++ _PROGRAM_VERSION
_PROGRAM_DESC = "mount/unmount USB device(s)"
_COPYRIGHT = "PUBLIC DOMAIN"

data BlockDevice = BlockDevice
    { shortname :: String
    , uuid :: String
    , fsys :: String
    , mountPoint :: MountPoint
    } deriving (Eq)

data MountPoint
    = MPath { path :: FilePath }
    | Swap
    | Unmounted
    | UnknownBlkidVal
    deriving (Eq)

instance Show BlockDevice where
    show BlockDevice{..} = unwords
        [ shortname
        , fsys
        , uuid
        , show mountPoint
        ]

instance Show MountPoint where
    show (MPath path) = path
    show Swap = "Swap"
    show Unmounted = "Unmounted"
    show UnknownBlkidVal = "UnknownBlkidVal"

blockdeviceDefault :: BlockDevice
blockdeviceDefault = BlockDevice
    { shortname = ""
    , uuid = ""
    , fsys = ""
    , mountPoint = MPath {path = ""}
    }

_ALPHANUM :: String
_ALPHANUM = ['a'..'z'] ++ ['A'..'Z'] ++ ['0'..'9']

data Color
    = Red
    | Green
    | Yellow
    | Blue
    | CNone
    deriving (Show, Eq)

colorize :: Color -> String -> String
colorize c s = case c of
    Blue -> "\x1b[1;34m" ++ s ++ "\x1b[0m"
    Green -> "\x1b[1;32m" ++ s ++ "\x1b[0m"
    Red -> "\x1b[1;31m" ++ s ++ "\x1b[0m"
    Yellow -> "\x1b[1;33m" ++ s ++ "\x1b[0m"
    _ -> s

main :: IO ()
main = do
    hSetBuffering stdout NoBuffering
    hSetBuffering stderr NoBuffering
    opts <- getOpts
    user <- getEnv "USER"
    errNo <- argsCheck opts user
    when (errNo > 0) $ exitWith $ ExitFailure errNo
    (devs, takenPaths) <- getDevices opts
    let mountablePaths = filter (\p -> not $ elem p takenPaths) $ map (\p -> "/mnt/u" ++ show p) [(0::Int)..]
        devsKV = zip (map show [(1::Int)..]) . zip devs $ mountablePaths
    prog opts user devsKV

argsCheck :: Opts -> String -> IO Int
argsCheck Opts{..} user
    | null user = e "could not get environment variable $USER" 1
    | otherwise = return 0
    where
        e :: String -> Int -> IO Int
        e str num = errMsg str >> return num

prog :: Opts -> String -> [(String, (BlockDevice, FilePath))] -> IO ()
prog opts@Opts{..} user devsKV
    | discover = do
        putStrLn "all devices:"
        mapM_ (\(_, (d, _)) -> putStrLn $ cshow d) devsKV
    | otherwise = do
        putStrLn (if (unmount || unmount_all)
            then "USB device(s) to unmount:"
            else "USB device(s) to mount:")
        mapM_ (\(n, (d, _)) -> putStrLn $ "    " ++ n ++ ") " ++ show' d) devsKV
        putStrLn ""
        mountMenu opts user devsKV
    where
        cshow :: BlockDevice -> String
        cshow b@BlockDevice{..}
            | no_color = show b
            | otherwise = case mountPoint of
                Unmounted -> colorize Green $ show b
                MPath _ -> if not $ null $ getUSBMountPath b
                    then colorize Blue $ show b
                    else show b
                _ -> show b
        show' :: BlockDevice -> String
        show' = if not (unmount || unmount_all)
            then show
            else unwords . init . words . show

mountMenu :: Opts -> String -> [(String, (BlockDevice, FilePath))] -> IO ()
mountMenu Opts{..} user devsKV
    | unmount = if length devsKV == 1
        then do
            putStrLn "only 1 USB device to unmount"
            tryMount False user (snd . head $ devsKV) >>= exitWith
        else chooseDev prompt user devsKV (tryMount False)
    | unmount_all = do
        putStrLn "unmounting all USB devices..."
        mapM_ (tryMount False user) (map snd devsKV)
        return ()
    | all_devices = do
        putStrLn "mounting all USB devices..."
        mapM_ (tryMount True user) (map snd devsKV)
        return ()
    | length devsKV == 1 = do
        putStrLn "only 1 USB device to mount"
        tryMount True user (snd . head $ devsKV) >>= exitWith
    | otherwise = chooseDev prompt  user devsKV (tryMount True)
    where
        prompt :: String
        prompt = if (unmount || unmount_all)
            then "choose USB device to unmount (q to exit)"
            else "choose USB device to mount (q to exit)"

chooseDev :: String -> String -> [(String, (BlockDevice, FilePath))] -> (String -> (BlockDevice, FilePath) -> IO ExitCode) -> IO ()
chooseDev prompt user devsKV func = do
    putStrLn prompt
    key <- getLine
    case lookup key devsKV of
        Just dev -> func user dev >>= exitWith
        _ -> case key of
            "q" -> return ()
            _ -> chooseDev prompt user devsKV func

tryMount :: Bool -> String -> (BlockDevice, FilePath) -> IO ExitCode
tryMount mount user (BlockDevice{..}, mp) = do
    when (null $ mountArgs fsys user) $ do
        errMsg $ "unsupported file system " ++ squote fsys ++ "\nsupported file systems: " ++ (unwords $ map fst (fileSystemArgs user))
        exitWith (ExitFailure 1)
    putStr $ (if mount then "" else "un")
        ++ "mounting "
        ++ shortname
        ++ " (" ++ fsys ++ ") "
        ++ (if mount then "to " ++ mp else "from " ++ show mountPoint)
        ++ ".."
    (_, _, _, p) <- createProcess $ cmd (mountArgs fsys user) shortname
    exitStatus <- waitForProcess p
    if (exitStatus == ExitSuccess)
        then do
            putStrLn "OK"
            return ExitSuccess
        else do
            putStr "FAILED\n"
            errMsg $ (if mount
                then "mount error (perhaps " ++ squote mp ++ " does not exist)"
                else "unmount error")
            return (ExitFailure 1)
    where
        cmd arguments devPath = CreateProcess
            { cmdspec = ShellCommand (if mount
                then "sudo mount -t " ++ arguments ++ " " ++ devPath ++ " " ++ mp ++ " &>/dev/null"
                else "sudo umount " ++ show mountPoint)
            , cwd = Nothing
            , env = Nothing
            , std_in = CreatePipe
            , std_out = CreatePipe
            , std_err = Inherit
            , close_fds = False
            }

fileSystemArgs :: String -> [(String, String)]
fileSystemArgs user =
    [ ("ext2", "ext2 -o rw,relatime")
    , ("vfat", "vfat -o rw,uid=" ++ user ++ ",gid=" ++ user)
    ]

mountArgs :: String -> String -> String
mountArgs fsys user = case lookup fsys (fileSystemArgs user) of
    Just a -> a
    _ -> []

getDevices :: Opts -> IO ([BlockDevice], [String])
getDevices Opts{..} = do
    (_, sout, _, p) <- createProcess cmdBlkid
    devs <- case sout of
        Just h -> hGetContents h
        Nothing -> return []
    _ <- waitForProcess p
    let devs' = (map (unwords . words)) . drop 2 . lines $ devs
    devs'' <- mapM parseBlkid devs'
    let toMount = filter (\BlockDevice{..} -> mountPoint == Unmounted) devs''
        toUnmount = filter (\dev -> not $ null $ getUSBMountPath dev) devs''
        takenPaths = filter (not . null) . map getUSBMountPath $ devs''
    when (not discover && null toMount && (not (unmount || unmount_all))) $ do
        errMsg $ "cannot find USB devices to mount"
        exitWith (ExitFailure 1)
    when (not discover && null toUnmount && (unmount || unmount_all)) $ do
        errMsg $ "cannot find USB devices to unmount"
        exitWith (ExitFailure 1)
    return $ formatDevs devs'' toMount toUnmount takenPaths
    where
        formatDevs :: [BlockDevice] -> [BlockDevice] -> [BlockDevice] -> [String] -> ([BlockDevice], [String])
        formatDevs ds m um takenPaths
            | discover = (ds, takenPaths)
            | unmount || unmount_all = (um, takenPaths)
            | otherwise = (m, takenPaths)
        cmdBlkid = CreateProcess
            { cmdspec = ShellCommand ("sudo blkid -o list")
            , cwd = Nothing
            , env = Nothing
            , std_in = CreatePipe
            , std_out = CreatePipe
            , std_err = Inherit
            , close_fds = False
            }

getUSBMountPath :: BlockDevice -> String
getUSBMountPath BlockDevice{..} = case mountPoint of
    MPath str -> if take 6 str == "/mnt/u" && (all (\c -> elem c ['0'..'9']) (drop 6 str))
        then str
        else ""
    _ -> ""

errMsg :: String -> IO ()
errMsg msg = hPutStrLn stderr $ "error: " ++ msg

squote :: String -> String
squote s = "`" ++ s ++ "'"

-- Parsing
parserIdentifier :: Parser String
parserIdentifier = many1 $ oneOf $ _ALPHANUM ++ "/-_"

parserWhitespace :: Parser String
parserWhitespace = many1 $ oneOf " \t\n\r"

parserMP :: Parser MountPoint
parserMP =
    try ( do
        a <- oneOf "<(" -- "(not mounted)" or "<swap>"
        b <- manyTill anyChar (lookAhead $ (oneOf ">)"))
        _ <- oneOf ">)"
        let mp = case a of
                '<' -> Swap
                '(' -> case b of
                    "not mounted" -> Unmounted
                    _ -> UnknownBlkidVal
                _ -> UnknownBlkidVal
        return mp
        )
    <|> (parserIdentifier >>= (\s -> return MPath {path = s})) -- e.g., "/mnt/blah"
    <?> "blkid's mount point description"

blkidParser :: Parser BlockDevice
blkidParser =
    try ( do
        sname <- parserIdentifier
        _ <- parserWhitespace
        fs <- parserIdentifier
        _ <- parserWhitespace
        _ <- parserIdentifier -- leave out the "label" column, even if it exists
        _ <- parserWhitespace
        mp <- parserMP
        _ <- parserWhitespace
        uid <- parserIdentifier
        eof
        return BlockDevice
           { shortname = sname
           , uuid = uid
           , fsys = fs
           , mountPoint = mp
           }
        )
    <|>
    do  sname <- parserIdentifier
        _ <- parserWhitespace
        fs <- parserIdentifier
        _ <- parserWhitespace
        mp <- parserMP
        _ <- parserWhitespace
        uid <- parserIdentifier
        eof
        return BlockDevice
            { shortname = sname
            , uuid = uid
            , fsys = fs
            , mountPoint = mp
            }
    <?> "5 or 4 fields to parse"

parseBlkid :: String -> IO BlockDevice
parseBlkid src =
    case parse blkidParser "output of `sudo blkid -o list'" src of
        Left parseError -> errMsg (show parseError) >> return blockdeviceDefault
        Right result -> return result

This example compiles with ghc –make -O2 -Wall -Werror. The only 2 dependencies are Parsec, and CmdArgs.

Looking back at it, I think it’s a great example of a real-world Haskell program out in the wild. It uses Parsec to correctly parse the output from blkid, CmdArgs for sane argument handling, and also does some system calls with the createProcess command. Coding-style-wise, it uses the simple where expression wherever possible, for maximum readability and tries to keep leading whitespace to a minimum. The code is written very verbosely, with explicit type signatures for every function, because, well, it helps me keep my sanity.

I think the code is pretty straightforward. It only supports 2 file systems: the ext2 type (recommended for Linux-only USB drives), and vfat (the FAT file system used for legacy Windows support), but this could be easily extended to support ext3, ext4, or any other file system, since it just wraps around the standard mount command.

The only tricky part is blkidParser, which tries to parse 5 fields or 4 fields. The reason behind this is because blkid -o’s output looks like this:

$ sudo blkid -o list
device       fs_type label    mount point      UUID
-----------------------------------------------------------------------------------
/dev/sda1    ntfs             /mnt/windows-xp  XXXXXXXXXXXXXXXX
/dev/sda2    ext4             /                XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
/dev/sda3    ext4             /home            XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
/dev/sda6    swap             <swap>           XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
/dev/sda5    ext4             /mnt/data        XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
/dev/sdb1    ext2             (not mounted)    XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
/dev/sdc1    vfat             (not mounted)    XXXX-XXXX

Sometimes, the label field is empty, as in the example above. So, we will end up with just 4 fields instead of 5.

I hope this sample, working program will help newbies out (esp. you who are frustrated by Real World Haskell’s clunky Parsec examples…). And for those looking to actually use it themselves for easy USB mounting/unmounting, here are some additional notes:

  • Create mount points (i.e., directories) /mnt/u0, /mnt/u1, /mnt/u2, etc. so that the mount command won’t choke.
  • Use shell aliases or keyboard hotkeys to avoid typing out “usbmnt -U” every single time.
  • Tweak the options passed to the mount command to suit your needs (see the fileSystemArgs function).

(1) A simple exercise would be to extend usbmnt so that it reads a very simple configuration file to detect the right user-defined options to pass to mount depending on the partition’s UUID. The only additional thing you would have to do is read a file from disk and store it as a String.

(2) An eye-opening exercises for Parsec newcomers would be to pass to the parsing functions a ByteString instead of String. (The transition to a ByteString is extremely straight-forward, and requires minimal changes.)

I will do the exercises myself later on and post them in a couple weeks at the latest.

UPDATE: December 1, 2011: Here are some convenience links for those coming in from Google for the solutions to the 2 exercises: Solution for #1 Solution for #2

Happy hacking!

EDIT: December 1, 2011: Fix typo.