Simon Meier, Erudify AG
HaskellerZ meetup - March 21st, 2013
Simplify the understanding of the runtime behaviour of Haskell programs compiled with GHC. For example,
What are the memory requirements of values of the following two types?
data List a = Nil | One a (List a)
data ListInt = NilI | OneI {-# UNPACK #-} !Int ListInt
Why is the following function prone to a stack-overflow?
sumN :: Int -> Int
sumN 0 = 0
sumN n = n + sumN (n - 1)
Essentially, this talk provides an introduction to GHC's RTS commentary, which guides through GHC's RTS sources.
GHC uses the same basic layout for every object in the RTS.
Basic closure layout | Basic info table layout |
---|---|
Structure of a header.
typedef struct {
const struct _StgInfoTable* info;
#ifdef PROFILING
StgProfHeader prof;
#endif
#ifdef GRAN
StgGranHeader gran;
#endif
} StgHeader;
For more information, see the GHC commentary.
data List a = Nil | One a (List a)
ex1 :: List Int
ex1 = One 1 (One 2 (One 3 Nil))
data ListInt = NilI | OneI {-# UNPACK #-} !Int ListInt
ex1 :: ListInt
ex1 = OneI 1 (OneI 2 (OneI 3 NilI))
An expression in normal form is fully evaluated, and no sub-expression could be evaluated any further (i.e. it contains no un-evaluated thunks).
An expression in weak head normal form has been evaluated to the outermost data constructor or lambda abstraction (the head).
-- in NF
42
(2, "hello")
\x -> (x + 1)
-- not in NF
1 + 2 -- we could evaluate this to 3
(\x -> x + 1) 2 -- we could apply the function
"he" ++ "llo" -- we could apply the (++)
(1 + 1, 2 + 2) -- we could evaluate 1 + 1 and 2 + 2
-- in WHNF
(1 + 1, 2 + 2) -- the outermost part is the data constructor (,)
\x -> 2 + 2 -- the outermost part is a lambda abstraction
'h' : ("e" ++ "llo") -- the outermost part is the data constructor (:)
-- not in WHNF
1 + 2 -- the outermost part here is an application of (+)
(\x -> x + 1) 2 -- the outermost part is an application of (\x -> x + 1)
"he" ++ "llo" -- the outermost part is an application of (++)
For more information, see this StackOverflow question.
Case distinctions are the fuel of the execution of Haskell code.
A case distinction
case x of _ -> e
forces the evaluation of the value denoted by the variable x
to WHNF when the expression e
is evaluated to WHNF.
Ultimately, Haskell programs are run by forcing the value of (runIO main) theWorld#
to WHNF, where
newtype IO a = IO { runIO :: World# -> (# a, World# #) }
theWorld# :: World# -- magical reference to the world
For more information on the implementation of the IO
monad, see GHC.Types
.
sumList :: List Int -> Int
sumList Nil = 0
sumList (One x xs) = x + sumList xs
test :: Int
test = sumList (One 1 (One 2 (One 3 Nil)))
High-level outline of evaluating test
to WHNF:
test
~> sumList (One 1 (One 2 (One 3 Nil)))
~> 1 + sumList (One 2 (One 3 Nil))
~> 1 + (2 + sumList (One 3 Nil))
~> 1 + (2 + (3 + (sumList Nil)))
~> 1 + (2 + (3 + 0))
~> 1 + (2 + 3)
~> 1 + 5
~> 6
sumList :: List Int -> Int
sumList xs = case xs of
Nil -> 0
One x xs -> x + sumList xs
test :: Int
test = sumList ex1
ex1 = One 1 ex1_1
ex1_1 = One 2 ex1_2
ex1_2 = One 3 ex1_3
ex1_3 = Nil
Evaluating test
to WHNF with explicit stack management.
test <| [] -- tail call
~> sumList ex_1 <| [] -- tail call
~> case ex1 of Nil -> 0
One x xs -> x + sumList xs <| [] -- push return-closure onto stack
~> ex1 <| [case ?? of ...] -- tail call
~> One 1 ex1_1 <| [case ?? of ...] -- WHNF reached: return control to caller, i.e.,
-- jump to the top-most stack frame
~> case One 1 ex1_1 of ... <| [] -- select appropriate case
~> (+) 1 (sumList ex1_1) <| [] -- force 'sumList ex1_1' to WHNF
-- => push return-closure onto stack
~> sumList ex1_1 <| [(+) 1 ??] -- tail call
~> case ex1_1 of ... <| [(+) 1 ??] -- push return-closure onto stack
~> ex1_1 <| [case ?? of ..., (+) 1 ??] -- ...
~> One 2 ex1_2 <| [case ?? of ..., (+) 1 ??] -- ...
~> case One 2 ex1_2 of ... <| [(+) 1 ??]
~> (+) 2 (sumList ex1_2) <| [(+) 1 ??]
~> sumList ex1_2 <| [(+) 2 ??, (+) 1 ??]
~> .... ~>
~> 0 <| [(+) 3 ??, (+) 2 ??, (+) 1 ??]
~> (+) 0 3 <| [(+) 2 ??, (+) 1 ??]
~> 3 <| [(+) 2 ??, (+) 1 ??]
~> (+) 2 3 <| [(+) 1 ??]
~> 5 <| [(+) 1 ??]
~> (+) 1 5 <| []
~> 6 <| []
When executing a Haskell program, time is spent on
The Spineless Tagless G-machine (STG machine) makes these costs explicit.
-- not in ANF --
----------------
\x -> f (g x) [x] -- use of compound arguments 'g x' and '[x]'
\x -> (x:) -- unsaturated use of ':' constructor
case x of -- nested pattern in case distinction
[y] -> e1
_ -> e_def
-- in ANF --
------------
\x -> let y = g x
z = [x]
in f y z
\x y -> x:y
let zz = e_def
in case x of
[] -> zz
x1:xs1 -> case xs1 of
[] -> e1
_ -> zz
sumList
data List a = Nil | One a (List a)
sumList :: List Int -> Int
sumList Nil = 0
sumList (One x xs) = x + sumList xs
ex1 :: List Int
ex1 = One 1 (One 2 (One 3 Nil))
test = sumList ex1
Compiled with ghc -O2 -dsuppress-all -ddump-prep List.hs
yields
==================== CorePrep ====================
ex6 = I# 1
ex5 = I# 2
ex4 = I# 3
ex3 = One ex4 (Nil)
ex2 = One ex5 ex3
ex1 = One ex6 ex2
Rec {
$wsumList =
\ w_sca ->
case w_sca of _ {
Nil -> 0;
One x_sce xs_sch ->
case x_sce of _ { I# x1_scj ->
case $wsumList xs_sch of ww_sck { __DEFAULT -> +# x1_scj ww_sck }
}
}
end Rec }
sumList =
\ w_scm ->
case $wsumList w_scm of ww_sco { __DEFAULT -> I# ww_sco }
test = case $wsumList ex1 of ww_scq { __DEFAULT -> I# ww_scq }
Nil = Nil
One = \ @ a_aaf eta_B2 eta_B1 -> One eta_B2 eta_B1
mapList :: (a -> b) -> List a -> List b
mapList f Nil = Nil
mapList f (One x xs) = One (f x) (mapList f xs)
foo :: Int -> Int
foo x = x + 0xDeadBeef
test :: List Int
test = mapList foo (One 1 (One 2 (One 3 Nil)))
Compiled with ghc -O2 -dsuppress-all -ddump-prep List.hs
yields
==================== CorePrep ====================
Rec {
mapList =
\ @ a_abc @ b_abd f_sct ds_sco ->
case ds_sco of _ {
Nil -> Nil;
One x_scs xs_scv ->
let {
sat_scE = mapList f_sct xs_scv } in
let {
sat_scF = f_sct x_scs } in
One sat_scF sat_scE
}
end Rec }
foo =
\ x_scy ->
case x_scy of _ { I# x1_scB ->
case +# x1_scB 3735928559 of sat_scG { __DEFAULT -> I# sat_scG }
}
test6 = I# 1
test5 = I# 2
test4 = I# 3
test3 = One test4 (Nil)
test2 = One test5 test3
test1 = One test6 test2
test = mapList foo test1
Nil = Nil
One = \ @ a_aab eta_B2 eta_B1 -> One eta_B2 eta_B1
What implementation is faster?
sumN :: Int -> Int
sumN 0 = 0
sumN n = n + sumN (n - 1)
sumNAcc :: Int -> Int
sumNAcc =
go 0
where
go acc 0 = acc
go acc n = go (n + acc) (n -1)
It's time for some interactive experimentation; i.e., let's have some fun with
ghc -O2 -dsuppress-all -ddump-prep
and all the other debugging flags provided by GHC.
Note that some of the examples are included in the sources of this talk at Github.
..and further references for the curious:
mapList f Nil = Nil
mapList f (One x xs) = One (f x) (mapList f xs)
foo :: Int -> Int
foo x = x + 0xDeadBeef
Memory layout before reducing mapList foo (One 1 (One 2 Nil))
to WHNF.
TODO: Image
mapList f Nil = Nil
mapList f (One x xs) = One (f x) (mapList f xs)
foo :: Int -> Int
foo x = x + 0xDeadBeef
Memory layout after reducing mapList foo (One 1 (One 2 Nil))
to WHNF.
TODO: Image
CONSTR
, CONSTR_p_n
, CONSTR_STATIC
, ...FUN
, FUN_p_n
, FUN_STATIC
, ...THUNK
, THUNK_p_n
, THUNK_SELECTOR
, ...PAP
, AP
IND
, IND_OLD_GEN
, ...For more information, see the GHC commentary.
Space, Right Arrow or swipe left to move to next slide, click help below for more details