View and edit this tutorial on Github | Discuss on Reddit | Tweet this

Efficient Packed-Memory Data Representations - vector library

The de facto standard package in the Haskell ecosystem for integer-indexed array data is the vector package. This corresponds at a high level to arrays in C, or the vector class in C++'s STL. However, the vector package offers quite a bit of functionality not familiar to those used to the options in imperative and mutable languages.

While the interface for vector is relatively straightforward, the abundance of different modules can be daunting. This article will start off with an overview of terminology to guide you, and then step through a number of concrete examples of using the package.

Example

Since we're about to jump into a few sections of descriptive text, let's kick this off with a concrete example to whet your appetite. We're going to count the frequency of different bytes that appear on standard input, and then display this content.

Note that this example is purposely written in a very generic form. We'll build up to handling this form throughout this article.

{-# LANGUAGE FlexibleContexts #-}
import           Control.Monad.Primitive     (PrimMonad, PrimState)
import qualified Data.ByteString.Lazy        as L
import qualified Data.Vector.Generic.Mutable as M
import qualified Data.Vector.Unboxed         as U
import           Data.Word                   (Word8)

main :: IO ()
main = do
    -- Get all of the contents from stdin
    lbs <- L.getContents

    -- Create a new 256-size mutable vector
    -- Fill the vector with zeros
    mutable <- M.replicate 256 0

    -- Add all of the bytes from stdin
    addBytes mutable lbs

    -- Freeze to get an immutable version
    vector <- U.unsafeFreeze mutable

    -- Print the frequency of each byte
    -- In newer vectors: we can use imapM_
    U.zipWithM_ printFreq (U.enumFromTo 0 255) vector

addBytes :: (PrimMonad m, M.MVector v Int)
         => v (PrimState m) Int
         -> L.ByteString
         -> m ()
addBytes v lbs = mapM_ (addByte v) (L.unpack lbs)

addByte :: (PrimMonad m, M.MVector v Int)
        => v (PrimState m) Int
        -> Word8
        -> m ()
addByte v w = do
    -- Read out the old count value
    oldCount <- M.read v index
    -- Write back the updated count value
    M.write v index (oldCount + 1)
  where
    -- Indices in vectors are always Ints. Our bytes come in as Word8, so we
    -- need to convert them.
    index :: Int
    index = fromIntegral w

printFreq :: Int -> Int -> IO ()
printFreq index count = putStrLn $ concat
    [ "Frequency of byte "
    , show index
    , ": "
    , show count
    ]

Terminology

There are two different varieties of vectors: immutable and mutable. Immutable vectors (such as provided by the Data.Vector module) are essentially swappable with normal lists in Haskell, though with drastically different performance characteristics (discussed below). The high-level API is similar to lists, it implements common typeclasses like Functor and Foldable, and plays quite nicely with parallel code.

By contrast, mutable vectors are much closer to C-style arrays. Operations working on these values must live in the IO or ST monads (see PrimMonad below for more details). Concurrent access from multiple threads has all of the normal concerns of shared mutable state. And perhaps most importantly for usage: mutable vectors can be much more efficient for certain use cases.

However, that's not the only dimension of choice you get in the vector package. vector itself defines three flavors: boxed (Data.Vector/Data.Vector.Mutable), storable (Data.Vector.Storable and Data.Vector.Storable.Mutable), and unboxed (Data.Vector.Unboxed and Data.Vector.Unboxed.Mutable). (There's also technically primitive vectors, but in practice you should always prefer unboxed vectors; see the module documentation for more information on the distinction here.)

And our final point: in addition to having these three flavors, the vector package provides a typeclass-based interface which allows you to write code that works in any of these three (plus other vector types that may be defined in other packages, like hybrid-vectors). These interfaces are in Data.Vector.Generic and Data.Vector.Generic.Mutable. When using these interfaces, you must still eventually choose a concrete representation, but your helper code can be agnostic to what it is.

What's nice is that - with small differences - all four mutable modules have the same interface, and all four immutable modules have the same interface. This means you can focus on learning one type of vector, and almost for free have that knowledge apply to other types as well. It then just becomes a question of choosing the representation that best fits your use case, which we'll get to shortly.

Efficiency

Standard lists in Haskell are immutable, singly-linked lists. Every time you add another value to the front of the list, it has to allocate another heap object for that cell, create a pointer to the head of the original list, and create a pointer to the value in the current cell. This takes up a lot of memory for holding pointers, and makes it inefficient to index or traverse the list (indexing to position N requires N pointer dereferences).

In contrast, vectors are stored in a contiguous set of memory locations, meaning random access is a constant time operation, and the memory overhead per additional item in the vector is much smaller (depending on the type of vector, which we'll cover in a moment). However, compared to lists, prepending an item to a vector is relatively expensive: it requires creating a new buffer in memory, copying the old values, and then adding the new value.

There are other data structures that can be considered for list-like data, such as Seq from containers, or in some cases a Set, IntMap, or Map. Figuring out the best choice for each use case can only be reliably determined via profiling and benchmarking. As a general rule though, a densely populated collection requiring integral or random access to the values will be best served by a vector.

Now let's talk about some of the other things that make vector so efficient.

Boxed, storable and unboxed

Boxed vectors hold normal Haskell values. These can be any values at all, and are stored on the heap with pointers kept in the vector. The advantage is that this works for all datatypes, but the extra memory overhead for the pointers and the indirection of needing to dereference those pointers makes them (relative to the next two types) inefficient.

Storable and unboxed vectors both store their data in a byte array, avoiding pointer indirection. This is more memory efficient and allows better usage of caches. The distinction between storable and unboxed vectors is subtle:

  • Storable vectors require data which is an instance of the Storable type class. This data is stored in malloced memory, which is pinned (the garbage collector can't move it around). This can lead to memory fragmentation, but allows the data to be shared over the C FFI.
  • Unboxed vectors require data which is an instance of the Prim type class. This data is stored in GC-managed unpinned memory, which helps avoid memory fragmentation. However, this data cannot be shared over the C FFI.

Both the Storable and Prim typeclasses provide a way to store a value as bytes, and to load bytes into a value. The distinction is what type of bytearray is used.

As usual, the only true measure of performance will be benchmarking. However, as a general guideline:

  • If you don't need to pass values to a C FFI, and you have a Prim instance, use unboxed vectors.
  • If you have a Storable instance, use a storable vector.
  • Otherwise, use a boxed vector.

There are also other issues to consider, such as the fact that boxed vectors are instances of Functor while storable and unboxed vectors are not.

Stream fusion

Take a guess how much memory the following program will take to run:

import qualified Data.Vector.Unboxed as V

main :: IO ()
main = print $ V.sum $ V.enumFromTo 1 (10^9 :: Int)

A valid guess may be 10^9 * sizeof int bytes. However, when compiled with optimizations (-O2) on my system, it allocates a total of only 52kb! How it is possible to create a one billion integer array without using up 4-8GB of memory?

The vector package has a powerful technique: stream fusion. Using GHC rewrite rules, it's able to find many cases where creating a vector is unnecessary, and instead create a tight inner loop. In our case, GHC will end up generating code that can avoid touching system memory, and instead work on just the registers, yielding not only a tiny memory footprint, but performance close to a for-loop in C. This is one of the beauties of this library: you can write high-level code, and optimizations can churn out something much more CPU-friendly.

Slicing

Above we discussed the problem of appending values to the front of a vector. However, one place where vector shines is with slicing, or taking a subset of the vector. When dealing with immutable vectors, slicing is a safe operation, with slices being sharable with multiple threads. Slicing also works with mutable vectors, but as usual you need to be a bit more careful.

Replacing lists

Enough talk! Let's start using vector. Assuming you're familiar with the list API, this should look rather boring.

import qualified Data.Vector as V

main :: IO ()
main = do
    let list = [1..10] :: [Int]
        vector = V.fromList list :: V.Vector Int
        vector2 = V.enumFromTo 1 10 :: V.Vector Int
    print $ vector == vector2 -- True
    print $ list == V.toList vector -- also True
    print $ V.filter odd vector -- 1,3,5,7,9
    print $ V.map (* 2) vector -- 2,4,6,...,20
    print $ V.zip vector vector -- (1,1),(2,2),...(10,10)
    print $ V.zipWith (*) vector vector -- (1,4,9,16,...,100)
    print $ V.reverse vector -- 10,9,...,1
    print $ V.takeWhile (< 6) vector -- 1,2,3,4,5
    print $ V.takeWhile odd vector -- 1
    print $ V.takeWhile even vector -- []
    print $ V.dropWhile (< 6) vector -- 6,7,8,9,10
    print $ V.head vector -- 1
    print $ V.tail vector -- 2,3,4,...,10
    print $ V.head $ V.takeWhile even vector -- exception!

Hopefully there's nothing too surprising about this. Most Prelude functions that apply to lists have a corresponding vector function. If you know what a function does in Prelude, you probably know what it does in Data.Vector. This is the simplest usage of the vector package: import Data.Vector qualified, convert to/from lists with V.fromList and V.toList, and then prefix your function calls with V..

  • Exercise 1: Try out some other functions available in the Data.Vector module. In particular, try some of the fold functions, which we haven't covered here.

  • Exercise 2: Try using the Functor, Foldable, and Traversable versions of functions with a vector

  • Exercise 3: Use an unboxed (or storable) vector instead of the boxed vectors we were using above. What code did you have to change from the original example? Do all of your examples from exercise 2 still work?

There are also a number of functions in the Data.Vector module with no corresponding function in Prelude. Many of these are related to mutable vectors (which we'll cover shortly). Others are present to provide more efficient means of manipulating a vector, based on their special in-memory representation.

Mutable vectors

I want to test how fair the System.Random number generator is at generating numbers between 0 and 9, inclusive. I want to generate 1,000,000 random values, count the frequency of each result, and then print how often each value appeared. Let's first implement this using immutable vectors:

import           Data.Vector.Unboxed ((!), (//))
import qualified Data.Vector.Unboxed as V
import           System.Random       (randomRIO)

main :: IO ()
main = do
    let v0 = V.replicate 10 (0 :: Int)

        loop v 0 = return v
        loop v rest = do
            i <- randomRIO (0, 9)
            let oldCount = v ! i
                v' = v // [(i, oldCount + 1)]
            loop v' (rest - 1)

    vector <- loop v0 (10^6)
    print vector

We've introduced the ! operator for indexing, and the // operator for updating. Other than that, this is fairly straightforward code. When I ran this on my system, it had 48MB maximum memory residency, and took 1.968s to complete. Surely we can do better.

This problem is inherently better as a mutable state one: instead of generating a new immutable Vector for each random number generated, we'd like to simply increment a piece of memory. Let's rewrite this to use a mutable, unboxed vector:

import           Control.Monad               (replicateM_)
import           Data.Vector.Unboxed         (freeze)
import qualified Data.Vector.Unboxed.Mutable as V
import           System.Random               (randomRIO)

main :: IO ()
main = do
    vector <- V.replicate 10 (0 :: Int)

    replicateM_ (10^6) $ do
        i <- randomRIO (0, 9)
        oldCount <- V.read vector i
        V.write vector i (oldCount + 1)

    ivector <- freeze vector
    print ivector

Once again, we use replicate to create a size-10 vector filled with 0. But now we've created a mutable vector (note the change in import). We then use replicateM_ to perform the inner action 1,000,000 times, namely: generate a random index, read the old value at that index, increment it, and write it back.

After we're finished, we freeze the vector (more on that in the next section) and print it. The resulting distribution of values is the same (or close - we are dealing with random numbers here) as the previous calculation using an immutable vector. But instead of 48MB and 1.968s, this program has a maximum residency of 44KB and runs in 0.247s! That's a significant improvement!

If we feel like being even more adventurous, we can replace our read and write calls with unsafeRead and unsafeWrite. That will disable some bounds checks before reading and writing. This can be a nice performance boost in very tight loops, but has the potential to segfault your program, so caveat emptor! For example, try replacing replicate 10 with replicate 9, change the read for an unsafeRead, and run your program. You'll see something like:

internal error: evacuate: strange closure type -1944718914
    (GHC version 7.10.2 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted (core dumped)

The same logic applies to the other unsafe functions in vector. The nomenclature means: unsafe may segfault your whole process, while not-marked-unsafe may just throw an impure exception (also not great, but certainly better than a segfault).

And if you were curious: on my system using unsafeRead and unsafeWrite speeds the program up marginally, from 0.247s to 0.233s. In our example, most of our time is spent on generating the random numbers, so taking off the safety checks does not have a significant impact.

Freezing and thawing

We used the freeze function above. The behavior of this may not be immediately obvious. When you freeze a mutable vector, what happens is:

  1. A new mutable vector of the same size is created
  2. Each value in the original mutable vector is copied to the new mutable vector
  3. A new immutable vector is created out of the memory space used by the new mutable vector

Why not just freeze it in place? Two reasons, actually:

  1. It has the potential to break referential transparency. Consider this code:

    import           Data.Vector.Unboxed         (freeze)
    import qualified Data.Vector.Unboxed.Mutable as V
    
    main :: IO ()
    main = do
        vector <- V.replicate 1 (0 :: Int)
        V.write vector 0 1
        ivector <- freeze vector
        print ivector
        V.write vector 0 2
        print ivector

    If we froze the vector in-place in the call to freeze, then the second write call would modify our ivector value, meaning that the first and second call to print ivector would have different results!

  2. When you freeze a mutable vector, its memory is marked differently for garbage collection purposes. Later attempts to write to that same memory can lead to a segfault.

However, if you really want to avoid that extra buffer copy, and are certain it's safe, you can use unsafeFreeze. And in fact, our random number example above is a case where freeze can be safely replaced by unsafeFreeze, since after the freeze, the original mutable vector is never used again.

  • Exercise 1: Go ahead and make that swap and confirm that your program works as expected.
  • Exercise 2: In the program just above (with V.replicate 1 (0 :: Int)), replace freeze with unsafeFreeze. What result do you see?

The opposite of freeze is thaw. Similar to freeze, thaw will copy to a new mutable vector instead of exposing the current memory buffer. And also, like freeze, there's an unsafeThaw that turns off the safety measures. Like everything unsafe: caveat emptor!

(We'll cover some functions like create that provide safe wrappers around unsafeFreeze and unsafeThaw later.)

PrimMonad

If you look at the mutation functions we used above like read and write, you can tell that they were looking in the IO monad. However, vector is more generic than that, and will allow your mutations to live in any primitive monad, meaning: IO, strict ST s, and transformers sitting on top of those two. The type class controlling this is PrimMonad.

You can get more information on PrimMonad in the Primitive Haskell article. Without diving into details: every primitive monad also has an associated primitive state token type, which is captured with PrimState. As a result, the type signatures for read and write (for boxed vectors) look like:

read :: PrimMonad m => MVector (PrimState m) a -> Int -> m a
write :: PrimMonad m => MVector (PrimState m) a -> Int -> a -> m ()

Every mutable vector takes two type parameters: the state token of the monad it lives in, and the type of value it holds. These gymnastics may seem overkill now, but are necessary for making mutable vectors both versatile in multiple monads, and type safe.

modify and the ST monad

Let's check out a particularly complicated type signature (for unboxed vectors):

modify :: Unbox a => (forall s. MVector s a -> ST s ()) -> Vector a -> Vector a

What this function does is:

  1. Creates a new mutable buffer the same length as the original vector
  2. Copies the values from the original vector into the new mutable vector
  3. Runs the provided ST action on the provided mutable vector
  4. Unsafely freezes the mutable vector and returns it.

What's great about this function is that it does the minimal amount of buffer copying to be safe, and that it can be used from pure code (since all side-effects are captured inside the ST action you provide).

  • Exercise 1: Steps 1 and 2 should look pretty similar to a function we discussed above. Can you figure out which one it is?
  • Exercise 2: Implement modify yourself using functions we've discussed and runST from Control.Monad.ST.

Let's use our new function to implement a Fisher-Yates shuffle. If we start with a vector of size 20, we'll generate a random number between 0 and 19. Then we'll swap position 19 with that generated random number. Then we'll loop, but this time with a random number between 0 and 18 and swapping with position 18. We continue until we get down to 0.

import           Control.Monad.Primitive     (PrimMonad, PrimState)
import qualified Data.Vector.Unboxed         as V
import qualified Data.Vector.Unboxed.Mutable as M
import           System.Random               (StdGen, getStdGen, randomR)

shuffleM :: (PrimMonad m, V.Unbox a)
         => StdGen
         -> Int -- ^ count to shuffle
         -> M.MVector (PrimState m) a
         -> m ()
shuffleM _ i _ | i <= 1 = return ()
shuffleM gen i v = do
    M.swap v i' index
    shuffleM gen' i' v
  where
    (index, gen') = randomR (0, i') gen
    i' = i - 1

shuffle :: V.Unbox a
        => StdGen
        -> V.Vector a
        -> V.Vector a
shuffle gen vector = V.modify (shuffleM gen (V.length vector)) vector

main :: IO ()
main = do
    gen <- getStdGen
    print $ shuffle gen $ V.enumFromTo 1 (20 :: Int)

Notice how shuffleM is a mutable, side-effecting function. However, shuffle itself is pure.

Generic

After everything else we've dealt with, Generic is a relatively easy addition. We introduce two new typeclasses:

class MVector v a
class MVector (Mutable v) a => Vector v a

Said in English: an instance MVector v a is a mutable vector of type v that can hold values of type a. The Vector v a is the immutable counterpart to some mutable vector. You can find the mutable version with Mutable v.

One important thing to keep in mind is kinds. The kind of the v is MVector v a is * -> * -> *, since it takes parameters for both the state token and the value it holds. With the immutable Vector v a, the v is of kind * -> *. Was that a little abstract? No problem, some type signatures should help:

length :: MVector v a => v s a -> Int
length :: Vector v a => v a -> Int

read :: (PrimMonad m, MVector v a) => v (PrimState m) a -> Int -> m a

It takes a bit of time to get used to these generic classes, but once you do it's fairly easy to use them. The best advice is to practice! And as such:

  • Exercise: modify the shuffle program above to work on a generic vector instead of specifically on an unboxed vector.

The final trick when working with generic vectors is that, ultimately, you will need to provide a concrete type. If you forget to do so, you'll end up with error messages that look like the following:

stream.hs:28:13:
    No instance for (V.Vector v0 Int) arising from a use of shuffle
    In the expression: shuffle gen
    In the second argument of ($), namely
      shuffle gen $ V.enumFromTo 1 (20 :: Int)
    In a stmt of a 'do' block:
      print $ shuffle gen $ V.enumFromTo 1 (20 :: Int)

vector-algorithms

A package of note is vector-algorithms, which provides some algorithms (mostly sort) on mutable vectors. For example, let's generate 100 random numbers and then sort them.

import           Data.Vector.Algorithms.Merge (sort)
import qualified Data.Vector.Generic.Mutable  as M
import qualified Data.Vector.Unboxed          as V
import           System.Random                (randomRIO)

main :: IO ()
main = do
    vector <- M.replicateM 100 $ randomRIO (0, 999 :: Int)
    sort vector
    V.unsafeFreeze vector >>= print
  • Exercise 1: write a helper function sortImmutable that uses modify and sort from vector-algorithms to sort an immutable vector safely
  • Exercise 2: rewrite the main function above to use sortImmutable and only the immutable vector API
  • Exercise 3: is your new version more efficient, less efficient, or the same? Explain.

mwc-random

One final library to mention now is mwc-random, a random number generation library built on top of vector and primitive. Its API can be a bit daunting initially, but given your newfound understanding of the vector package, the API might make a lot more sense now. It provides a Gen s type, where s is some state token. You can then use uniform and uniformR to get random numbers out of that generator.

As a final example, here's how we can shuffle the numbers 1-20 using mwc-random.

import           Control.Monad.ST            (ST)
import qualified Data.Vector.Unboxed         as V
import qualified Data.Vector.Unboxed.Mutable as M
import           System.Random.MWC           (Gen, uniformR, withSystemRandom)

shuffleM :: V.Unbox a
         => Gen s
         -> Int -- ^ count to shuffle
         -> M.MVector s a
         -> ST s ()
shuffleM _ i _ | i <= 1 = return ()
shuffleM gen i v = do
    index <- uniformR (0, i') gen
    M.swap v i' index
    shuffleM gen i' v
  where
    i' = i - 1

main :: IO ()
main = do
    vector <- withSystemRandom $ \gen -> do
        vector <- V.unsafeThaw $ V.enumFromTo 1 (20 :: Int)
        shuffleM gen (M.length vector) vector
        V.unsafeFreeze vector
    print vector