Intro to QuickCheck and Hpc

And you thought my Haskell honeymoon was over.

So the other day, I finally got around to using the legendary QuickCheck testing library for a semi-serious project of mine. And I must say that I am very impressed with QuickCheck — it caught two subtle (but serious) errors in my core functions — in day one! I’m going to explain some basic concepts to help you get started with QuickCheck immediately; afterwards I’ll throw in a brief discussion about the Haskell Program Coverage (“Hpc”) tool, too, since it goes well with QuickCheck.

QuickCheck

The point of QuickCheck is to generate tons of random, junk values (the nastier-looking, the better) to feed into your functions. Well, they’re not really junk values, from a type system perspective, but the point is to generate lots of edge-case values that we forget to consider when writing the code. QuickCheck supports testing of pure functions as well as monadic ones (ST monad), but I’ll just focus on the pure functions because that’s what I have experience with.

So the next question is, how does QuickCheck generate random values? Well, if your function uses any of the basic types like Int, [Integer], Double, Word64, etc., QuickCheck knows how to generate the random values already. But if you use a custom type, or a unique combination of ordinary types (e.g., (Maybe Int, Maybe Int)), you have to instruct QuickCheck how to generate it. You do this by writing an instance for QuickCheck’s Arbitrary typeclass.

So let’s say my custom type is (Maybe Int, Maybe Int), and that it has a special requirement: if it is (Just a, Just b), then b must be greater than a. No problem.

newtype MPair = MPair (Maybe Int, Maybe Int)
    deriving (Show)

instance Arbitrary MPair where
    arbitrary = do
        -- First, generate a random Int.
        a <- arbitrary :: Gen Int
        -- Now generate another random Int, but such that it is greater than *a*.
        b <- suchThat (arbitrary :: Gen Int) (>a)
        -- Now choose between either (Just a), or Nothing.
        a' <- elements [Just a, Nothing]
        -- Repeat the same coin flip for *b*.
        b' <- elements [Just b, Nothing]
        -- Return the random result.
        return $ MPair (a', b')

There is a method to the madness! There are other useful combinators besides suchThat and elements, of course.

So now you’ve told QuickCheck how to generate the random values to stress-test your functions. The next step is to define the stress-tests. In the QuickCheck world, such testing functions are used to test very specific properties of your functions that must hold true no matter what you throw at it. By convention, these functions are named “prop_[function name]”.

So here’s an example, prop_isTouching, which, appropriately enough, tests the isTouching function..

prop_isTouching :: MPair -> MPair -> Bool
prop_isTouching x y =
    isTouching x' y' == isTouching y' x'
    where
    x' = fromMPair x
    y' = fromMPair y

isTouching :: (Ord a) => (Maybe a, Maybe a) -> (Maybe a, Maybe a) -> Bool
isTouching p@(pStart, pEnd) q@(qStart, qEnd)
    -- if no pair has both of its values defined, then just return False;
    -- technically this notion is incorrect, but the greater codebase behind this
    -- interprets it this way, so it's OK
    | not (isBoth p) || not (isBoth q) = False
    -- if one or both pairs are empty, then return False
    | isBlank p || isBlank q = False
    | isLeft q = inP qStart'
    | isRight q = inP qEnd'
    | isLeft p = inQ pStart'
    | isRight p = inQ pEnd'
    | otherwise = inP qStart'
        || inP qEnd'
        || inQ pStart'
        || inQ pEnd'
    where
    pStart' = fromJust pStart
    pEnd' = fromJust pEnd
    qStart' = fromJust qStart
    qEnd' = fromJust qEnd
    inP = isInside (pStart', pEnd')
    inQ = isInside (qStart', qEnd')

isLeft
    , isRight
    , isBoth
    , isBlank :: (Maybe a, Maybe a) -> Bool
isLeft (a, b) = isJust a && isNothing b
isRight (a, b) = isNothing a && isJust b
isBoth (a, b) = isJust a && isJust b
isBlank (a, b) = isNothing a && isNothing b

fromMPair :: MPair -> (Maybe Int, Maybe Int)
fromMPair (MPair a) = a

isInside :: (Ord a) => (a, a) -> a -> Bool
isInside (a, b) c = a <= c && c <= b

You might be wondering, “Hey, I thought you were going to test a function that takes (Maybe Int, Maybe Int), not (Maybe a, Maybe a)!” Well, believe me, it pays off a lot to write higher-order functions like this that work on multiple types. The fact that we can test it using Maybe Ints (a very simple type) is just one benefit. Notice how I’ve made sure to restrict isTouching‘s arguments to the Ord typeclass, since we expect the right hand value in the pair to be greater than the one on the left (if it exists). The fancy Arbitrary instance up above was not in vain.

Anyway, the isTouching function merely checks to see if one given pair “falls in” or “touches” the other pair of values. It’s a pretty mundane function, but such functions often form the backbone of the rest of your code, so it’s really important to get these 100% right. The various helper functions like isLeft, isRight, fromMPair, etc. may seem annoying, as if they get in the way of the example to test isTouching itself. But think about it: all of these auxiliary functions will, by virtue of their necessity, be tested by calling prop_isTouching! And if prop_isTouching keeps failing (and your custom type or function is too complex for a 10-minute debugging session), you can always add more prop_ functions to test these auxiliary functions in isolation. Haskell embraces small, cute functions, and so should you!

The prop_isTouching function itself is straightforward enough: it tests the “commutative” property of isTouching — that the order of the arguments does not matter.

So far so good. Now we just need to run prop_isTouching hundreds (or thousands) of times to see if it holds. QuickCheck defines some basic test running functions, for use in the IO monad. The simplest one is aptly named quickCheck. So, we can run the function above like so:

import Test.QuickCheck

import [Your testing module, where prop_isTouching resides]

main :: IO ()
main = quickCheck prop_isTouching

This will run prop_isTouching 100 times with the default testing settings. But default settings are usually not desirable (for me, I only caught errors when I “upped” the number of runs to 1000, among other things). Oh, and quickCheck will not tell you the name of the function it is testing. So here is a more useful version:

{-# LANGUAGE RecordWildCards #-}
import System.Exit
import Test.QuickCheck

import [Your testing module, where prop_isTouching resides]

-- Rigorous test arguments.
rigorous :: Args
rigorous = Args
    { replay = Nothing
    , maxSuccess = 1000 -- tests to run
    , maxDiscard = 1000 -- the number of tests that are thrown out and ignored b/c of "==>" conditions, before "giving up" and failing due to too many discarded tests
    , maxSize = 1000 -- if a prop_ function uses a list ([]) type, maxSize is the max length of the randomly generated list
    , chatty = True
    }

-- Quick test arguments.
quick :: Args
quick = Args
    { replay = Nothing
    , maxSuccess = 100
    , maxDiscard = 100
    , maxSize = 100
    , chatty = True
    }

runTests :: [String] -> IO ()
runTests as = case as of
    [] -> runTests' quick
    a -> case head a of
        "1" -> runTests' quick
        "2" -> runTests' rigorous
        _ -> runTests' quick
    where
    runTests' :: Args -> IO ()
    runTests' testArgs = do
        -- if all of your prop_ functions are of the same type, you can put
        -- them in a list and use mapM_ instead
        f prop_isTouching "prop_isTouching"
        f prop_someOtherFunc1 "someOtherFunc1"
        f prop_someOtherFunc2 "someOtherFunc2"
        f prop_someOtherFunc3 "someOtherFunc3"
        f prop_someOtherFunc4 "someOtherFunc4"
        where
        f prop str = do
            putStrLn str
            quitOnFail =<< quickCheckWithResult testArgs prop
        quitOnFail r = case r of
            -- pattern match with just two dots with RecordWildCards because I'm lazy
            Success{..} -> return ()
            _ -> exitFailure

main :: IO ()
main = getArgs >>= runTests

If you compile the above as “test”, then running “./test 2” will use the “rigorous” test settings. The key difference is that instead of quickCheck, we use quickCheckWithResult. With it, we can provide our choice of test settings (the Args type), and also get some feedback on what the test results were. For simplicity’s sake, we only check if the test was a complete success; if it’s anything else, we abort immediately.

Here’s a quick note about the term “shrink” that you might encounter: if QuickCheck spots a failure, it will first try to shrink the size of the random input repeatedly while maintaining the “failure” result (and QuickCheck will tell you about how many shrinks were performed). This is to help you work with reasonably small values (esp. useful if you have list arguments with hundreds of items each, like in prop_foo above).)

The careful reader would have wondered what the “==>” was about in the comments. Well, the “==>” function is called the “implication” function, and is used to throw out invalid values before running the prop_ function. It’s another way to customize the random value, sort of like how we defined a custom Arbitrary instance up above for the MPair type synonym. For example,

prop_foo :: [Int] -> [Int] -> Prop
prop_foo xs ys =
    not (null xs) ==> -- xs must not be null
    (length ys > 3) ==> -- ys must have at least 4 elements
    foo xs ys == foo ys xs

and it only gets to the “foo xs ys == bar xs ys” part if the two statements above are true. The only difference is that we have to end up with a Prop type instead of Bool, as was the case in prop_isTouching. No other part of your testing code needs to change. The advantage in using (==>) is its ease of use — you can trivially write any rare edge-case condition that you know does not conform to the spec, without bothering to write an Arbitrary instance. However, the disadvantage is that QuickCheck will waste time generating invalid inputs before it gets to the test the function in question. Because of this, you should first try to create an instance of Arbitrary before going with (==>).

Here is a version without using (==>) just for illustrative purposes:

import Data.List
import Test.QuickCheck

main = verboseCheck prop_foo

newtype Xs = Xs [Int]
    deriving (Show)
newtype Ys = Ys [Int]
    deriving (Show)

prop_foo :: Xs -> Ys -> Bool
prop_foo (Xs xs) (Ys ys) =
    foo xs ys == foo ys xs

foo :: [Int] -> [Int] -> [Int]
foo xs ys = sort $ xs ++ ys

instance Arbitrary Xs where
    arbitrary = do
    -- xs must not be null
    xs <- suchThat (arbitrary :: Gen [Int]) (not . null)
    return $ Xs xs

instance Arbitrary Ys where
    arbitrary = do
    -- ys must have at least 4 elements
    ys <- suchThat (arbitrary :: Gen [Int]) ((>3) . length)
    return $ Ys ys

That’s it! You now know enough to get started with QuickCheck today. No excuses! Make it a New Year’s resolution for 2012, if you have to!

QuickCheck has taught me to be skeptical of the “if it compiles, it is OK” Haskell attitude — it has really opened my eyes. I now want QuickCheck as a bare minimum testing setup for all of my Haskell code. No tests = poor design (Medieval II: Total War, I’m looking at you…).

QuickCheck can even be used to test C code with the Foreign Function Interface (FFI), so I’m forever tempted to go back to my other C projects and test them with QuickCheck as well (can you imagine the millions of bugs that lie hidden in C code everywhere — that could be exposed with just a little bit of Haskell knowledge and QuickCheck by the everyday developer? Suddenly, I am overcome with optimism… yet I fear that it is really pessimism underneath… hmm.)

Hpc

I said I would mention Hpc at the beginning. So, what does it do?

Simply put, Hpc tells you which codepaths were used (or not used) after running a binary executable. This is a perfect match for the binary you used for running QuickCheck tests, because during the course of its run QuickCheck will have fed tens of thousands of random values into your functions and have visited most, if not all, parts of your code. Thankfully, Hpc is included with GHC, so you already have it! Just compile your binary with the -fhpc flag. Here is the process:

  1. Compile your program with -fhpc. (ghc -fhpc –make foo.hs)
  2. Run your program. (./foo)
  3. Your program should have generated a *.tix metadata file, as well as a .hpc folder with *.mix files for hpc to analyze.
  4. Run hpc report foo for a quick plaintext report on your codepaths.
  5. Run hpc markup foo to generate prettified HTML files for a more comprehensive overview.
  6. Rinse and repeat. Remove the old *.tix files if you don’t want combined results.

One big caveat: make sure to delete any existing object files (*.o, *.hi) that you had earlier which were produced without the -fhpc flag! Otherwise, those portions of code will not generate hpc metadata! Alternatively, instead of deleting every *.o or *.hi file you have down your project’s folder tree, you could also pass the -fforce-recompile flag to ghc (this is the new name for the deprecated -no-recomp flag, which was used (sans explanation!) in Real World Haskell, Chapter 11).

Go forth, fellow Haskell newbies — embrace testing and code coverage analysis today!