Introduction to property-based testing in Haskell

Emilie Balland

29/01/15

Property-based testing

Haskell Libraries

Outline

QuickCheck Library

  1. An embedded DSL for defining properties
  2. Random data generation
  3. An embedded DSL for defining random data generators

Properties

Simple Bool functions

prop_RevUnit:: [Int] -> Bool
prop_RevUnit x =
    reverse [x] == [x]

prop_RevApp:: [Int] -> [Int] -> Bool
prop_RevApp xs ys =
    reverse (xs ++ ys) == reverse ys ++ reverse xs

prop_RevRev:: [Int] -> Bool
prop_RevRev xs =
    reverse (reverse xs) == xs

Testing prop_RevApp with QuickCheck

main> quickCheck prop_RevApp
+++ OK, passed 100 tests

QuickChecking QuickSort

qsort []     = []
qsort (x:xs) = qsort lhs ++ [x] ++ qsort rhs
    where lhs  = [y | y <- xs, y < x]
          rhs  = [z | z <- xs, z > x]

Verify that the resulting list is ordered.

isOrdered (x1:x2:xs) = x1 <= x2 && isOrdered (x2:xs)
isOrdered _          = True

prop_qsort_isOrdered :: [Int] -> Bool
prop_qsort_isOrdered = isOrdered . qsort

Model-based Testing

Instead of having multiple properties, we can give a reference implementation (observational equivalence).

prop_qsort_sort    :: [Int] -> Bool
prop_qsort_sort xs =  qsort xs == sort xs

The model may be inefficient but its correctness can be easily checked by hand.

Does this property hold?

Duplicates are erased...

Conditional Properties

alldiff (x:xs) = not (x `elem` xs) && alldiff xs
alldiff []     = True

prop_qsort_alldiff_sort :: [Int] -> Property
prop_qsort_alldiff_sort xs = 
   (alldiff xs) ==> qsort xs == sort xs

Instead of Bool, the output of the function is a Property, a special type built into the QuickCheck library. . . .

The implies combinator ==> is one of many QuickCheck combinators (e.g., .&&., .||., forall).

Quantified Properties

insert x []                 = [x]
insert x (y:ys) | x < y     = x : y : ys
                | otherwise = y : insert x ys

We can now define the following property:

prop_insert_ordered :: Int -> [Int] -> Property 
prop_insert_ordered x xs = 
   isOrdered xs ==> isOrdered (insert x xs)

What happens?

main> quickCheck prop_insert_ordered 
*** Gave up! Passed only 78 tests.

Most generated values are discarded by the ==> combinator.

Solution: write customized generators and quantified properties.

Quantified Properties

Properties may take the form:

forAll :: (Testable prop, Show a) => Gen a -> (a -> prop) -> Property

Instead of discarding test cases that are not ordered, we provide a user-defined generator of ordered lists:

prop_insert_ordered2 x = forAll orderedList $ \xs -> ordered (insert x xs)
  where types = x::Int 

Observing Test Case Distribution

Two special QuickCheck property combinators:

collect  :: Show a => a -> Property -> Property
classify :: Bool -> String -> Property -> Property

We can now redefine the prop_insert_ordered property:

prop_insert_ordered3 :: Int -> [Int] -> Property 
prop_insert_ordered3 x xs = 
   collect ((length xs `div` 10) * 10) $
   classify (isOrdered xs) "ord" $
   classify (not (isOrdered xs)) "not-ord" $
   not (isOrdered xs) || isOrdered (insert x xs)

We get details on the generated data:

main> quickCheck prop_insert_ordered
+++ OK, passed 100 tests:
 18% 0, not-ord
 17% 10, not-ord
 14% 40, not-ord
 ...

Generator Monad

A Haskell function that generates random values:

newtype Gen a = MkGen { unGen :: StdGen -> Int -> a }

unGen g takes a random number generator, a size and returns a a value.

Gen looks like a Reader monad that passes a random number generator and the current size around:

instance Monad Gen where
  return x =
    MkGen (\_ _ -> x)

  MkGen m >>= k =
    MkGen (\r n ->
      let (r1, r2)  = split r
          MkGen m' = k (m r1 n)
       in m' r2 n
    )

Data Generation

The class of types for which random values can be generated:

class Arbitrary a where
  arbitrary :: Gen a
  shrink :: a -> [a]

Main> sample (arbitrary:: Gen Bool)

The QuickCheck library provides:

choose

choose :: (System.Random.Random a) => (a, a) -> Gen a

choose takes an interval and returns a random value from that interval.

main> sample $ choose (0, 3)
2
1
0
2
1
0
2
3
0
0

elements

elements :: [a] -> Gen a

elements generates one of the given values.

Example:

data Suit = Spades | Hearts | Diamonds | Clubs
  deriving (Show,Eq)

instance Arbitrary Suit where
  arbitrary = elements [Spades , Hearts , Diamonds , Clubs]

oneof

oneof :: [Gen a] -> Gen a

oneof randomly uses one of the given generators.

Example:

data Rank = Numeric Integer | Jack | Queen | King | Ace
  deriving (Show,Eq)

rank :: Gen Rank
rank = oneof
    [ return Jack
    , return Queen
    , return King
    , return Ace
    , do r <- choose (2,10)
         return (Numeric r)
    ]
main> sample rank

What is wrong with this definition?

frequency

We can fix it with the frequency combinator:

rank2 :: Gen Rank
rank2 = frequency
    [ (1, return Jack)
    , (1, return Queen)
    , (1, return King)
    , (1, return Ace)
    , (9, do r <- choose (2,10)
             return (Numeric r))
    ]

We now get a uniform distribution for the ranks.

main> sample rank2

Recursive datatypes

Generating lists:

genList1 ::  (Arbitrary a) => Gen [a]
genList1 = liftM2 (:) arbitrary genList1
genList2 ::  (Arbitrary a) => Gen [a]
genList2 = oneof [ return []
                 , liftM2 (:) arbitrary genList2]
genList3 ::  (Arbitrary a) => Gen [a]
genList3 = frequency [ (1, return [])
                     , (4, liftM2 (:) arbitrary genList3) ]

Generating ordered lists:

genOrdList = fmap sort genList3

prop_insert :: Int -> Property 
prop_insert x = forAll genOrdList $ \xs -> isOrdered (insert x xs)

How to control the size of the generated data ?

sized

The sized combinator makes the size bound argument of the MkGen function visible to the user.

list :: Arbitrary a => Int -> Gen [a]
list 0 = return []
list n = liftM2 (:) arbitrary (list (n-1))
genList4 :: Arbitrary a => Gen [a]
genList4 = sized list

It still is not a uniform distribution of lists (see the more complex implementation provided by QuickCheck)

As shown in the Feat paper, it is really hard to get a uniform distribution for tree-like structures...

Shrinking

The shrink function reduces the size of the generated counter-example (introduced in QuickCheck2).

class Arbitrary a where
  arbitrary :: Gen a
  shrink :: a -> [a]

shrink produces a (possibly) empty list of all the possible immediate shrinks of the given value.

The default implementation returns the empty list.

Shrinking Binary Trees

data Tree a = Nil | Branch a (Tree a) (Tree a)

Most implementations of shrink should try at least three things:

shrink Nil = []
shrink (Branch x l r) =
  -- shrink Branch to Nil
  [Nil] ++
  -- shrink to subterms
  [l, r] ++
  -- recursively shrink subterms
  [Branch x' l' r' | (x', l', r') <- shrink (x, l, r)]

QuickCheck provides a genericShrink function (based on the Generic and Typeable type classes) that help users defining basic shrink functions.

Advanced Usage

Limits of QuickCheck

SmallCheck

Let us revisit the insert function example:

insert x []                 = [x]
insert x (y:ys) | x < y     = x : y : ys
                | otherwise = y : insert x ys

prop_insert_ordered :: Int -> [Int] -> Property 
prop_insert_ordered x xs = 
   isOrdered xs ==> isOrdered (insert x xs)

Remember that we had to give up with QuickCheck:

main> quickCheck prop_insert_ordered 
*** Gave up! Passed only 78 tests.

With SmallCheck, we can verify this property on all inputs of depth <= d:

main> smallCheck 7 prop_insert_ordered

This will try to falsify the property with all the lists of size<=7

Main Contributions of SmallCheck

Data Generation

Instead of Arbitrary, SmallCheck defines the Serial type class that provides a series function.

data Tree a = Nil | Branch a (Tree a) (Tree a)

instance Serial m a => Serial m (Tree a) where
  series = cons0 Null \/ cons3 Branch

series n is the finite list of all values of depth n.

type Series a = Int -> [a]

Existential Quantifier

Suppose we define a isPrefix function. How to test its correctness ?

prop_isPrefix xs xs’ = isPrefix xs (xs ++ xs’)

This property is a necessary condition but it is not sufficient. For example, it holds for the following bogus definition:

isPrefix xs ys = True

What we would like to check is:

prop_isPrefixSound xs ys =
    isPrefix xs ys ==>
    exists $ \xs’ -> xs ++ xs’ == ys

SmallCheck Function Example

Let us define a bogus property that every boolean binary operator is associative.

prop_assoc op = \x y z ->
    (x ‘op‘ y) ‘op‘ z == x ‘op‘ (y ‘op‘ z)
    where typeInfo = op :: Bool -> Bool -> Bool
smallCheck prop_assoc
Depth 0:
Failed test no. 22. Test values follow.
{True->{True->True;False->True};
False->{True->False;False->True}}
False
True
False

The depth of a function is defined by the number of its nested cases.

Lazy SmallCheck

main> smallCheck 7 prop_insert_ordered

Finished in less than 1 second (very very long in SmallCheck)

How does it work ?

New version of LazySmallCheck with functions and existentials (2012)!

SmartCheck

Generalize counter-examples with universal quantifiers:

prop_RevApp_false:: ([Int],[Int]) -> Bool
prop_RevApp_false (xs,ys) =
    reverse (xs ++ ys) == reverse xs ++ reverse ys

main = smartCheck scStdArgs prop_RevApp_false

How does it work?

  1. Shrinks the counter-example (no need to define shrink)

  2. Generalizes values in the counter-example (extrapolating values)

  3. Generalizes constructors when the datatype has a finite number of constructors (extrapolating constructors)

Feat

Feat (Functional Enumeration of Algebraic Types) provides enumerations as functions from natural numbers to values.

These enumerations can be automatically generated for any algebraic datatype using template Haskell.

-- use template haskell to generate an instance of Enumerable  
deriveEnumerable ''Tree

SmallCheck and QuickCheck generators can be derived from these enumerations.

-- create a uniform distribution from an Enumerable
instance (Enumerable a, Arbitrary a) => Arbitrary (Tree a) where
  arbitrary = sized uniform

What are the benefits?

Feat helped find bugs in template-haskell and haskell-src-exts.

Some Methodology Comments

  1. Start by checking with SmallCheck if there are some small inputs that already falsify the property.

  2. Then try to run it with QuickCheck to find middle-sized counter-examples.

  3. Be careful about the distribution of the values. Use Feat if possible (e.g., data does not need to satisfy a condition).

  4. If some specific corner-cases are suspected (e.g., large integers), define customized generators.

  5. When testing complex data-structures (e.g., ASTs), use SmartCheck to generalize the counter-examples given by QuickCheck

Conclusion

"QuickCheck forced me to think about the properties of the library more closely - I changed some design decisions after it turned out that the properties disagreed with me. The one thing QuickCheck helped with more than anything though was refactoring despite a massive number of the functions all depending on each other, QuickCheck allows me to change the behaviour of one function in some obscure case and check that no other function was relying on that."

Testing Functions in QuickCheck

We can define Tree as an instance of the Functor class:

instance Functor Tree where
  fmap f (Fork t u) = Fork (fmap f t) (fmap f u)
  fmap f (Leaf x) = Leaf (f x)

And now we check the functor laws with QuickCheck:

law1 :: Tree Int -> Bool
law1 t = fmap id t == t

law2 :: (Int -> Int) -> (Int -> Int) -> Tree Int -> Bool
law2 f g t = fmap (f . g) t == fmap f (fmap g t)

main = do
  quickCheck law1
  quickCheck law2