Heist, Anansi, MVars, oh my! Henry Laxen October 18, 2011

I've been using Haskell for several years now, and I think I'm starting to get the hang of it. I'm writing this to help my fellow Haskell strugglers understand in a few hours what took me several days to wade through. I've been using the Snap Framework to run this website, and in particular I wanted to get a better understanding of the Heist templating system, so I decided to run a few experiments.

One thing my thirty years of programming experience has taught me is that you generally write code between one and five times, but you read it many hundreds of times. Thus the time you invest to make your code easier to understand will pay off many times over. So even though Haskell abhors side effects, a side effect of this tutorial will be to introduce you to a wonderful literate programming system called anansi, written by John Millikin. Don't worry about it now, we'll get to it later. I suggest you find a spot in your file system where you want to have a directory named heistTutorial, then download and unpack the archive there. If you want to rebuild this project from scratch, typing make should download all of the dependencies and recreate the existing src directory. This is not necessary to proceed with this tutorial. Now cd into the heistTutorial/src directory, and fire up a ghci session in one window while following along on this page to play along and explore.

tar -xzf heistTutorial.tgz
cd heistTutorial
# optional make
cd src
Let's start by looking at the test input files. That way, when we see the code we will understand what it is working on. The test files are all Heist templates, so if you don't know anything about Heist, this would be a good place to start. There are four test files:

The test data

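«notTest.tpl»
before
<notTest>
inside
</notTest>
after
«test1.tpl»
before 1
<test>
inside 1
</test>
after 1
<test/>
end 1
«test2.tpl»
before 2
<test file="data.txt">
inside 2
</test>
after 2
<test/>
end 2
«data.txt»
This is the contents of data.txt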

Now in your ghci session let's load the first example.

:load onLoad.hs
Now let's look at our first program. I wasn't clear on what the Heist hook functions did or how they worked, so I wrote a little program to check them out.

Understanding Heist hook functions

A hook function must take a Template to an IO Template. So testHook does nothing more than print a message when it is entered, then print and return the template it was passed.
«onLoad testHook»
testHook :: Template -> IO Template
testHook t = do            
  B8.putStrLn "inside testHook"
  print t
  return t
Here we add testHook, defined above, to the list of onLoad hooks, and bind the function testImpl to the xml tag <test>
«onLoad bindTestTag»
bindTestTag :: HeistState IO -> IO (HeistState IO)
bindTestTag =
  return . addOnLoadHook testHook . bindSplice "test" testImpl
This is the code that is run whenever we encounter a <test> tag in a template. It just prints out a message and returns the empty list, effectively removing the <test> tag and its descendants from the output.
«onLoad testImpl»
testImpl :: Splice IO
testImpl = do           
  liftIO $ B8.putStrLn "in testImpl"
  return []
runIt prints out the name of the template we are about to render, then renders it and displays the output. One thing to remember is that renderTemplate returns an IO (Maybe (Builder, MIMEType)), so the line that reads B8.putStrLn $ toByteString . fst $ r has to do some unpacking in order to print out the result we want.
«onLoad runIt»
runIt  :: HeistState IO -> B8.ByteString -> IO ()
runIt ts name = do
  B8.putStrLn $ "Running template " `B8.append` name
  B8.putStrLn "----------------------------------------------"
  result <- renderTemplate ts name
  let r = maybe (error "render error") id result
  B8.putStrLn $ toByteString . fst $ r
Finally, main sets up the original Template State, which adds an onLoad hook function and our <test> tag, loads all the templates in the templates directory and renders the two templates, test1 and notTest.
«onLoad main»
main :: IO ()
main = do
  originalTS <- bindTestTag defaultHeistState
  ets <- loadTemplates "templates" originalTS
  let ts = either error id ets
  runIt ts "test1"
  runIt ts "notTest"
Running main in our ghci session results in the following output (line numbers added):
«testHook output»
1  inside testHook
2  [TextNode "before\n",Element {elementTag = "notTest", elementAttrs = [], elementChildren = [TextNode "\ninside\n"]},TextNode "\nafter\n"]
3  inside testHook
4  [TextNode "before 1\n",Element {elementTag = "test", elementAttrs = [], elementChildren = [TextNode "\ninside 1\n"]},TextNode "\nafter 1\n",Element {elementTag = "test", elementAttrs = [], elementChildren = []},TextNode "\nend 1\n"]
5  inside testHook
6  [TextNode "before 2\n",Element {elementTag = "test", elementAttrs = [("file","data.txt")], elementChildren = [TextNode "\ninside 2\n"]},TextNode "\nafter 2\n",Element {elementTag = "test", elementAttrs = [], elementChildren = []},TextNode "\nend 2\n"]
7  Running template test1
8  ----------------------------------------------
9  in testImpl
10 in testImpl
11 before 1
12
13 after 1
14
15 end 1
16
17 Running template notTest
18 ----------------------------------------------
19 before
20 <notTest>
21 inside
22 </notTest>
23 after

Let's see if we can understand what is going on here. The line in main that has loadTemplates causes the three files with a .tpl extension in the templates directory to be loaded. Lines 1-6 indicate that they are loaded in the order notTest.tpl, test1.tpl, and test2.tpl. Our function testHook was called three times, once for each of those templates, and its input parameter is the parsed contents of the file. This gives us the opportunity to modify any template file as we see fit, after it has been loaded.

Next we render test1, which produces the output in lines 7-16. There are two <test> tags in the test1.tpl file, one which has a text node among its children, and the other which is just a plain <test/> element. Thus lines 9 and 10 show us that testImpl is called twice while rendering this template. The result of the rendering is lines 11 to 16, in which all traces of the <test> tag and its children have been removed. You will notice that in line 4 test1.tpl was parsed, and the TextNodes containing the "before", "after", and "end" words all have linefeeds in them, hence the double spacing.

Finally we render notTest, which produces the output in lines 17 to 23. Nothing too surprising here, there is no <test> tag so the output is the same as the input.

At this point I hope you have a pretty good understanding of what Heist does, and how to set up hooks, bindings, and splices. You'll notice from the type signatures, that hooks and splices can run in the IO monad, so the world is your oyster. Next we'll take a look at some of the things this lets us do.
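To make the idea of a hook that modifies a template concrete, here is a minimal sketch. It does not use Heist at all: Node, Template, Hook, logHook, dropBlankText and runHooks are stand-in names I made up, loosely modelled on the parsed output shown above, so treat it as an illustration of the shape of the mechanism rather than the library's implementation.

```haskell
import Control.Monad (foldM)
import Data.Char (isSpace)

-- Stand-in types, loosely modelled on the parsed output shown above.
-- These are NOT the real Heist or xmlhtml definitions.
data Node
  = TextNode String
  | Element String [(String, String)] [Node]
  deriving (Eq, Show)

type Template = [Node]
type Hook = Template -> IO Template

-- A hook that only observes the template, in the spirit of testHook.
logHook :: Hook
logHook t = do
  putStrLn ("loaded template with " ++ show (length t) ++ " top-level nodes")
  return t

-- A hook that rewrites the template: drop whitespace-only top-level text nodes.
dropBlankText :: Hook
dropBlankText = return . filter (not . isBlank)
  where
    isBlank (TextNode s) = all isSpace s
    isBlank _            = False

-- Run every hook in order, feeding each one the previous result.
runHooks :: [Hook] -> Template -> IO Template
runHooks hooks t = foldM (\acc h -> h acc) t hooks

main :: IO ()
main = do
  let tpl = [TextNode "before\n", Element "test" [] [], TextNode "\n", TextNode "after\n"]
  out <- runHooks [logHook, dropBlankText] tpl
  print out
```

Because every hook has the same type, adding another one is just a matter of appending it to the list.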

Playing with MVars

Even though I've been using Haskell for a few years now, I've never had the occasion to use MVars. I have to admit, I was a little scared of them. While browsing through the Heist code for the splice Static.hs, I noticed MVars and IORefs all over the place. I realized it was time to figure them out. This would be a good time to run :load mvarFile1.hs in your ghci session. Here is the new code that you loaded. The runIt code is the same as before.

Print out the status of the supplied MVar.
«mvarFile mvStatus»
mvStatus :: MVar a -> IO ()
mvStatus mv = do
  empty <- liftIO $ isEmptyMVar mv
  print $ "isEmpty is " ++ (show empty)           
The thing to notice here is that we create an empty MVar which we then pass to testImpl as part of the splice. This means our splices can suddenly have access to their own private data.
«mvarFile bindTestTag»
bindTestTag :: HeistState IO -> IO (HeistState IO)
bindTestTag ts = do
  mv <- liftIO $ newEmptyMVar 
  return . bindSplice "test" (testImpl mv) $ ts
Here is where we take advantage of the MVar created above. node contains the entire parsed input of the <test> tag. path contains the full file path of the template currently being processed. Next we check to see if the <test> tag has a "file" attribute (assumed to be relative). If it does, we read the file and stick it in the MVar. We return [] so the <test> tag goes away in the output. The other case is that the <test> tag does not contain a "file" attribute. In that case we read the value of the MVar, and return it as a TextNode.
«mvarFile testImpl»
testImpl ::  MVar Text -> Splice IO
testImpl mv = do           
  liftIO $ B8.putStrLn "in testImpl"
  lift $ mvStatus mv
  node <- getParamNode
  path <- fmap (maybe "" id) getTemplateFilePath
  case getAttribute "file" node of
    Just f -> do
      liftIO $ print $ "Got Just " ++ (show f)
      let fileName = (FP.directory . FP.decodeString $ path) </> (FP.fromText f)
      contents <- liftIO $ readTextFile fileName
      liftIO $ putMVar mv contents 
      return []
    Nothing ->  do
      liftIO $ print "Got Nothing"
      value <- liftIO $ readMVar mv
      return [X.TextNode value]
main just renders the templates.
«mvarFile main noThreads»
main :: IO ()
main = do
  hSetBuffering stdout NoBuffering
  originalTS <- bindTestTag defaultHeistState
  ets <- loadTemplates "templates" originalTS
  let ts1 = either error id ets
  runIt ts1 "test1"
  runIt ts1 "notTest"
  runIt ts1 "test2"
  return ()
Now run main in your ghci session, and have a look at the output. It should look like this:
«mvarFile1 output»
Running template test1
in testImpl
"isEmpty is True"
"Got Nothing"
At this point the program is hung, and you need to Control-C out. Let's see what happened. I'll reprint the "test1.tpl" data here to make it easy.
«test1.tpl again»
before 1
<test>
inside 1
</test>
after 1
<test/>
end 1

What happened is that we used the <test> tag without a file attribute before coming across a <test> tag with a file attribute. Looking at testImpl we see that it dutifully printed out that we entered it, then told us the MVar was empty, then looked for a file attribute and found Nothing. At this point the "readMVar mv" hangs, waiting for the MVar to become non-empty.
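The blocking behaviour described above can be reproduced with nothing but the base library. The following is a small sketch, not part of the tutorial's source; demo is a name I made up. It shows that a fresh MVar is empty, that readMVar waits until some other thread has filled it and leaves the value in place, and that takeMVar removes the value again.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Returns: empty at start?, the value read,
--          empty after readMVar?, empty after takeMVar?
demo :: IO (Bool, String, Bool, Bool)
demo = do
  mv <- newEmptyMVar
  e0 <- isEmptyMVar mv                  -- a fresh MVar starts out empty
  -- Fill the MVar from another thread, much as rendering test2 does.
  _ <- forkIO (putMVar mv "This is the contents of data.txt")
  value <- readMVar mv                  -- blocks until the putMVar has run
  e1 <- isEmptyMVar mv                  -- readMVar leaves the value in place
  _ <- takeMVar mv                      -- takeMVar removes it
  e2 <- isEmptyMVar mv
  return (e0, value, e1, e2)

main :: IO ()
main = demo >>= print
```

If you delete the forkIO line, nothing ever fills the MVar and the readMVar never returns, which is the situation test1 ran into.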

We can easily fix this by letting the three runIt calls run in parallel, with a forkIO. Go ahead and type :load mvarFile2.hs in your ghci session, which will replace main above with the following:

«mvarFile main threads»
main :: IO ()
main = do
  hSetBuffering stdout NoBuffering
  originalTS <- bindTestTag defaultHeistState
  ets <- loadTemplates "templates" originalTS
  let ts1 = either error id ets
  forkIO $ runIt ts1 "test1"
  forkIO $ runIt ts1 "notTest"
  forkIO $ runIt ts1 "test2"
  return ()
Now running main results in (a probably jumbled version of) the following (line numbers added):
«mvarFile2 output»
1  Running template test1
2  ----------------------------------------------
3  in testImpl
4  "isEmpty is True"
5  Running template notTest
6  ----------------------------------------------
7  before
8  <notTest>
9  inside
10 </notTest>
11 after
13 "Got Nothing"
14 Running template test2
15 ----------------------------------------------
16 in testImpl
17 "isEmpty is True"
18 "Got Just \"data.txt\""
19 in testImpl
20 in testImpl
21 "isEmpty is False"
22 "Got Nothing"
23 "before 1
26 This is the contents of data.txt
28 after 1
31 This is the contents of data.txt
33 end 1
35 "isEmpty is True"
36 "Got Nothing"
37 before 2
39 after 2
42 This is the contents of data.txt
44 end 2

Let's see if we can wade through this output. Template test1 is run first, and produces lines 1 to 4. Since the MVar is empty, it hangs, just as before. Next, notTest runs in parallel with test1. It doesn't contain any <test> tags, so it produces the output seen in lines 5 to 12. Now test2 also runs in parallel with test1 and notTest. Just before it starts running, line 13 is printed out by test1. test2 continues with lines 14-18. Since the <test> tag in test2 contains a file attribute, line 18 is printed out instead of "Got Nothing". From this point on, your mileage (and output) may vary. Here it looks like test1 continues to run and writes out line 19, while test2 renders its second <test> tag and writes out line 20. Next test1 runs again, and prints out lines 21 and 22, before displaying its output in lines 23 to 34. Lines 35 and 36 are now output by test2, along with the rendering in lines 37 to 44. The mystery is: why does line 35 say the MVar is empty? I think it is because readMVar is not atomic, and test2 ran while test1 was taking the MVar and before it had a chance to put it back.
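The guess about readMVar can be made concrete. As far as I know, before base 4.7 readMVar was simply a takeMVar followed by a putMVar, which leaves a short window in which another thread sees the MVar as empty; from base 4.7 on it is documented as atomic, so a modern GHC may not show line 35 as empty at all. The sketch below is not the library source. oldReadMVar and observeWindow are names I made up, and the window is spelled out step by step in a single thread so that it can be observed reliably.

```haskell
import Control.Concurrent.MVar
import Control.Exception (mask_)

-- Roughly how readMVar behaved before base 4.7: a take followed by a put.
oldReadMVar :: MVar a -> IO a
oldReadMVar m = mask_ $ do
  a <- takeMVar m
  putMVar m a
  return a

-- Perform the two steps by hand so the intermediate state can be inspected.
observeWindow :: IO (Bool, Bool)
observeWindow = do
  mv <- newMVar "quote"
  a <- takeMVar mv
  during <- isEmptyMVar mv   -- what another thread would see inside the window
  putMVar mv a
  after <- isEmptyMVar mv    -- the value is back once the put has happened
  return (during, after)

main :: IO ()
main = do
  mv <- newMVar (42 :: Int)
  oldReadMVar mv >>= print
  observeWindow >>= print
```

The first component of the pair is True: between the take and the put, isEmptyMVar reports an empty MVar, which is exactly what mvStatus would print if it ran in that gap.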

So, what is the moral of the story? You can use MVars with Heist to keep private data inside your splices. Also, look at how trivial it is to run things in parallel. While the debugging info is all jumbled up, all of the output generated by runIt is in the right order. By the way, if you remove the hSetBuffering stdout NoBuffering line from main in mvarFile2, the output will be so horribly jumbled that it is almost impossible to make sense of it. Go ahead and give it a try.
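One caveat about the forkIO version of main: it works in ghci because the session stays alive after main returns, but in a compiled program the forked threads are killed as soon as the main thread finishes. A common remedy is to give each thread its own MVar and have main wait on all of them. Here is a minimal sketch of that pattern; render is a hypothetical stand-in for runIt, and renderAll is a name I made up.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM)

-- Hypothetical stand-in for runIt: "render" a template by name.
render :: String -> IO String
render name = return ("rendered " ++ name)

-- Fork one thread per template, then wait for every result in order.
renderAll :: [String] -> IO [String]
renderAll names = do
  boxes <- forM names $ \name -> do
    box <- newEmptyMVar
    _ <- forkIO (render name >>= putMVar box)
    return box
  mapM takeMVar boxes      -- blocks until each thread has delivered its result

main :: IO ()
main = renderAll ["test1", "notTest", "test2"] >>= mapM_ putStrLn
```

Since each result travels through its own MVar and they are collected in list order, the final output is no longer jumbled, even though the threads themselves run in whatever order the scheduler picks.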

Doing Something Useful

So can we do something interesting with all this machinery? One of the things I like to have at the bottom of each web page I serve is a random pithy quote. Here is the file I would like to process:
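«wiseQuote Template»
<xml>
<quote file="quotes.xml"/>
before
<quote>Author: <wiseQuoteAuthor/>
<br /><wiseQuoteSaying/>
</quote>
after
Now an indexed quote
before
<quote index="1">
Author: <wiseQuoteAuthor/>
<br /><wiseQuoteSaying/>
</quote>
after
</xml>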
What I would like to happen is that when this file is rendered, the "quotes.xml" file is read and parsed. Then whenever we are in a <quote> tag, we replace the tag <wiseQuoteAuthor/> with the author of the quote, and the tag <wiseQuoteSaying/> with the actual saying. If the <quote> tag has an index attribute, we use that specific quote from our list of quotes, otherwise we use a random index. In case you are wondering, here is a sampling of what the quotes.xml looks like.
  <quote author="Anonymous">The reason a dog has so many friends is
  that he wags his tail instead of his tongue.</quote>
  <quote author="Ann Landers">Don't accept your dog's admiration as
  conclusive evidence that you are wonderful.</quote>
  <quote author="Will Rogers">If there are no dogs in Heaven, then
  when I die I want to go where they went.</quote>
  <quote author="Ben Williams">There is no psychiatrist in the
  world like a puppy licking your face.</quote>
  <quote author="Josh Billings">A dog is the only thing on earth
  that loves you more than he loves himself.</quote>
  <quote author="Andy Rooney">The average dog is a nicer person
  than the average person.</quote>
The author is an attribute of the <quote> tag, and the children of the <quote> tag are the saying. Okay then, let's proceed. First we set up a data type that will be our container for our quotes.
«wiseQuotes data»
data WiseQuote = WiseQuote
  { wiseQuoteAuthor :: Template,
    wiseQuoteSaying :: Template }
  deriving (Eq, Show)
Next, xmlToQuote takes a Node and converts it into a WiseQuote. It checks to see that we are processing a quote element, and then grabs the author attribute of the tag, and wraps it in a [TextNode]. Similarly, it grabs the children of the quote element, and puts them inside a WiseQuote. It checks for errors along the way.
«wiseQuotes xmlToQuote»
xmlToQuote :: Node -> WiseQuote
xmlToQuote el =
  case elementTag el of
    "quote" -> case getAttribute "author" el of
        Just t -> WiseQuote [(X.TextNode t)] (childNodes el)
        _ -> error $ "Quote " ++ show el ++ " is missing an author"
    _ -> error $ "Element " ++ show el ++ " is not a WiseQuote"
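xmlToQuote reports problems by calling error, which is fine for an experiment but stops the whole program. As a sketch of an alternative, the same checks can return an Either so the caller decides what to do. This example uses made-up stand-in types (Node, WiseQuote with String fields, toQuote) rather than the xmlhtml ones, so it runs with nothing but base.

```haskell
-- Stand-in node type; the tutorial's code uses xmlhtml's Node instead.
data Node
  = TextNode String
  | Element String [(String, String)] [Node]
  deriving (Eq, Show)

data WiseQuote = WiseQuote
  { quoteAuthor :: String
  , quoteSaying :: [Node]
  } deriving (Eq, Show)

-- Like xmlToQuote, but problems become Left values instead of calls to error.
toQuote :: Node -> Either String WiseQuote
toQuote (Element "quote" attrs children) =
  case lookup "author" attrs of
    Just a  -> Right (WiseQuote a children)
    Nothing -> Left "quote is missing an author"
toQuote (Element tag _ _) = Left ("element <" ++ tag ++ "> is not a quote")
toQuote (TextNode _)      = Left "text node is not a quote"

main :: IO ()
main = do
  let good = Element "quote" [("author", "Andy Rooney")] [TextNode "The average dog..."]
      bad  = Element "quote" [] []
  print (fmap quoteAuthor (toQuote good))
  print (fmap quoteAuthor (toQuote bad))
  -- traverse stops at the first failure, so one bad entry is reported cleanly.
  print (fmap length (traverse toQuote [good, bad, good]))
```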
getWiseQuotes reads the xml file that contains the quotes, and filters out just the <quote> tags and their children. It then calls xmlToQuote for each <quote> tag, returning a list of WiseQuotes.
«wiseQuotes getWiseQuotes»
getWiseQuotes :: MonadIO m => FP.FilePath -> m [WiseQuote]
getWiseQuotes fileName = do
  contents <- liftIO $ Filesystem.readFile fileName
  let doc =  either error justQuotes $ parseXML "quotes" contents
      quotes = map xmlToQuote doc
  return quotes
  where
    justQuotes s = concat
                    [ descendantElementsTag "quote" x | x <- docContent s ]
wiseQuoteImpl is very similar to testImpl in mvarFile above. If the <quote> tag has a file attribute, it reads the file and puts the resulting list of WiseQuotes into an MVar, returning the empty list, which removes the <quote> tag from the output. If the file attribute is not present, it checks to see if there is an index attribute. If so, it reads the value of the "index" attribute as an Int, indexes into the list of WiseQuotes which should be present in the MVar, and runs the children of this <quote> node with "wiseQuoteAuthor" bound to the author's name, and "wiseQuoteSaying" bound to the actual saying. If the "index" attribute is missing, a random index between 0 and one less than the number of available quotes is generated and used instead.
«wiseQuotes wiseQuoteImpl»
wiseQuoteImpl :: MVar [WiseQuote] -> Splice IO
wiseQuoteImpl mv = do
  pnode <- getParamNode
  path <- fmap (maybe "" id) getTemplateFilePath
  case getAttribute "file" pnode of
    Just f -> do
      let fileName = (FP.directory . FP.decodeString $ path) </> (FP.fromText f)
      quotes <- liftIO $ getWiseQuotes fileName
      liftIO $ putMVar mv quotes
      return []
    Nothing ->  do
      quotes <- liftIO $ readMVar mv
      quote  <- case getAttribute "index" pnode of
           Just x -> return (quotes !! (read . unpack $ x :: Int))
           Nothing -> do
              i <- liftIO $ getStdRandom (randomR (0, length quotes - 1))
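              return (quotes !! i)
      -- Assumed call (not confirmed by the text): runChildrenWithTemplates,
      -- which runs the children with each tag bound to a Template.
      runChildrenWithTemplates
        [("wiseQuoteAuthor", wiseQuoteAuthor quote),
         ("wiseQuoteSaying", wiseQuoteSaying quote)]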
Here we create a new empty MVar so that we can pass it to wiseQuoteImpl when a <quote> tag is encountered.
«wiseQuotes bindWiseQuotes»
bindWiseQuotes :: HeistState IO -> IO (HeistState IO)
bindWiseQuotes ts = do
  mv <- liftIO $ newEmptyMVar 
  return . bindSplice "quote" (wiseQuoteImpl mv) $ ts
main just runs the template "testQuotes" shown above.
«wiseQuotes main»
main :: IO ()
main = do
  originalTS <- bindWiseQuotes defaultHeistState
  ets <- loadTemplates "quotes" originalTS
  let ts1 = either error id ets
  runIt ts1 "testQuotes"
At this point you should type :load wiseQuotes.hs in your ghci session, and then run the main function. You should see something like the following as output:
«wiseQuotes output»
1  Running template testQuotes
2  ----------------------------------------------
3  <xml>
5  before
6  Author: Michelangelo
7  <br />The greatest danger for most of us
8    is not that our aim is too high and we miss it, but that it is
9    too low and we reach it.
11 after
12 Now an indexed quote
13 before
15 Author: Ann Landers
16 <br />Don't accept your dog's admiration as
17   conclusive evidence that you are wonderful.
19 after
20 </xml>
The quote in lines 6 to 10 might be different, but the one in lines 15 to 17 should always be the same.
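One fragile spot in wiseQuoteImpl is the index lookup: read fails on an attribute that is not a number, and !! fails on an index that is out of range or on an empty list. Here is a sketch of a more forgiving selection function. chooseQuote is a name I made up, and the fallback argument plays the role of the random index, passed in as a parameter so that the function stays pure and needs nothing outside base.

```haskell
import Text.Read (readMaybe)

-- Pick a quote without the partial functions read and (!!).
-- attr is the value of the index attribute, if there is one;
-- fallback is the index to use when the attribute is absent.
chooseQuote :: Maybe String -> Int -> [a] -> Either String a
chooseQuote _ _ [] = Left "no quotes loaded"
chooseQuote attr fallback quotes =
  case attr of
    Nothing -> at fallback
    Just s  -> maybe (Left ("index is not a number: " ++ show s)) at (readMaybe s)
  where
    at i
      | i >= 0 && i < length quotes = Right (quotes !! i)
      | otherwise                   = Left ("index out of range: " ++ show i)

main :: IO ()
main = do
  -- Hypothetical data standing in for the parsed quotes.
  let quotes = ["Anonymous", "Ann Landers", "Will Rogers"]
  print (chooseQuote (Just "1") 0 quotes)
  print (chooseQuote Nothing 2 quotes)
  print (chooseQuote (Just "7") 0 quotes)
  print (chooseQuote (Just "x") 0 quotes)
```

In a splice, a Left result could be rendered as a TextNode with the message, or simply as the empty list, instead of bringing down the page.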

About the tool used to write this

At this point I'd like to say a few words about how this tutorial was written. All of the source code used in the examples, as well as the test data and this html page, are the result of using a literate programming tool called anansi. It allows you to write your code as though you were telling a story, and the code generated magically appears in all the right places. Perhaps you noticed, but probably not, that no import declarations appear anywhere above. Yet in the examples you loaded with ghci, they were there. I left them out of the story because I felt they distracted from the tale I was trying to tell. Take a look at the file heistAnansiMvars.anansi in your favorite editor, and you'll see the top level of how this tutorial was generated. Notice that near the bottom of the file, enclosed in html comments, are a couple of include statements. I've segregated the imports and the actual code layout to the imports.anansi and codeLayout.anansi files, which are incidental to the story.

Have a look at wiseQuotes.anansi to see what the actual source to a typical story like this looks like. You'll notice every once in a while a line starting with a :d followed by some text. The stuff between the :d and the line containing just a : is called a macro. You can define as many macros as you want, and in whatever order you want, paying attention only to the flow of your story. Further down in the wiseQuotes.anansi document, you'll see some actual Haskell code enclosed in these macros. When this file is processed by anansi, it passes through the text outside of the macros, and then displays the text inside of the macros in your favorite output format, say html or latex. Recently syntax highlighting has been added, making the code even easier to read. In anansi terms, your files are woven together into a coherent whole, suitable for reading and understanding.

Now take a look at the file codeLayout.anansi. This file describes how to put together the macros you defined while telling the story into actual Haskell code and data. The :f in column 1 tells anansi into which file the data that follows it is supposed to go. For example, you'll see that the file onLoad.hs is composed of the following macros:

macros for onLoad.hs
|onLoad imports|
|onLoad testHook|
|onLoad bindTestTag|
|onLoad testImpl|
|onLoad runIt|
|onLoad main|

Thus no matter in which order you tell your story, you can break out and reorder the code portions to make compilable Haskell code. Anansi calls this process the tangle. Your story is untangled, broken into compilable pieces with the :f command, and re-tangled to generate your program. Furthermore, this allows you to replicate code that happens to be identical between different modules. In my case, I reused the |onLoad runIt| macro in each of the other Haskell programs.
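To give a feel for what tangling amounts to, here is a toy version in a few lines of Haskell. It is only an illustration of the idea of expanding |macro name| references in a layout, and certainly not how anansi itself is implemented; Macros, expandLine and tangle are names I made up, and the macro bodies are invented.

```haskell
import Data.Maybe (fromMaybe)

-- Macro name paired with the lines of code defined under that name.
type Macros = [(String, [String])]

-- Expand a layout line of the form |macro name| into the macro's lines;
-- any other line is copied through unchanged.
expandLine :: Macros -> String -> [String]
expandLine macros line =
  case line of
    ('|' : rest) | not (null rest), last rest == '|' ->
      fromMaybe ["-- undefined macro: " ++ init rest] (lookup (init rest) macros)
    _ -> [line]

-- Tangle a whole layout by expanding each of its lines in order.
tangle :: Macros -> [String] -> [String]
tangle macros = concatMap (expandLine macros)

main :: IO ()
main = do
  -- The story may define its macros in any order...
  let macros =
        [ ("onLoad main", ["main :: IO ()", "main = putStrLn \"hello\""])
        , ("onLoad imports", ["import Data.Char (toUpper)"])
        ]
      -- ...while the layout decides the order in the generated file.
      layout = ["|onLoad imports|", "", "|onLoad main|"]
  mapM_ putStrLn (tangle macros layout)
```

Reusing a macro in several files is then nothing more than mentioning its name in several layouts.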

So please, consider using anansi for your next programming project, and we can start to turn Haskell from one of the worst documented platforms into one of the best documented platforms. Thank you for your attention.

Best wishes,
Henry Laxen

Quote of the day:
What to do in case of an emergency: 1. Pick up your hat. 2. Grab your coat. 3. Leave your worries on the doorstep. 4. Direct your feet to the sunny side of the street.
