In general, fusion refers to transformations whose purpose is to get rid of intermediate data structures. You *fuse* function calls that result in wasteful memory allocations into something more efficient. This is actually IMO one of the biggest applications of Haskell being pure. And you pretty much don’t need to do anything to get it, it comes for free through the GHC compiler.

### Haskell is pure

Because Haskell is pure, we get this thing called referential transparency, which (from the link), means that “expression always evaluates to the same result in any context”^{1}. That means that I can do very general program level manipulations without changing what the program will actually output. For example, even without knowing what `x`

, `y`

, `z`

and `w`

are I always know that

```
((x ++ y) ++ z) ++ w
```

will evaluate to the same thing as

```
x ++ (y ++ (z ++ w))
```

yet the second one will in practice involve less memory allocations (since `x ++ y`

requires reallocating whole prefix of the output list).

### Rewrite rules

In fact, there are a whole lot of this sort of optimization we can do, and, because Haskell is pure, we can basically just move whole expressions around (replacing `x`

, `y`

, `z`

, or `w`

for actual lists or expressions that evaluate to lists in the example above changes nothing). This becomes a pretty mechanical process.

Furthermore, it turns out that you can come up with a lot of equivalences for higher order functions (Theorems for free!). For example,

```
map f (map g xs) = map (f . g) xs
```

no matter what `f`

, `g`

, and `xs`

are (the two sides are semantically equal). Yet while the two sides of this equation produce the same value output, the left hand side is always worse in efficiency: it ends up allocating space for an intermediate list `map g xs`

, that is immediately thrown away. We’d like to tell the compiler to, whenever it encounters something like `map f (map g xs)`

, replace it with `map (f . g) xs`

. And, for GHC, that is through rewrite rules:

```
{-# RULES "map/map" forall f g xs. map f (map g xs) = map (f.g) xs #-}
```

The `f`

, `g`

, and `xs`

can be matched against any expressions, not just variables (so something like `map (+1) (map (*2) ([1,2] ++ [3,4]))`

gets transformed into `map ((+1) . (*2)) ([1,2] ++ [3,4])`

. (There doesn’t appear to be a good way to search for rewrite rules, so I compiled a list). This paper explains the motivation and workings of GHC rewrite rules.

### So that’s how GHC optimizes `map`

?

Actually, not quite. The thing above is short-cut fusion. The name sort of implies the drawback: it doesn’t scale too well and is annoying to debug. You end up having to write a ton of ad-hoc rules for all arrangements of the same common functions. Then, you hope that repeated application of rewrite rules will simplify your expressions nicely.

It turns out that we can do even better in some cases by organizing our re-write rules so that we build up some intermediate normal form and then have rules targeting that intermediate form. This way, we start getting “hot” paths of rewrite rules.

Probably the most advanced of these systems is stream fusion targeting coinductive sequences (basically lazy sequences like lists). Check out this thesis and this paper (which is actually pretty much how the `vector`

package is implemented). For example, in `vector`

, your code gets first transformed into an intermediate form involving `Stream`

s and `Bundle`

s, is optimized in that form, then gets transformed back into vectors.

### And… `Data.Text`

?

`Data.Text`

uses stream fusion to minimize the number of memory allocations that occur (I think this is especially important for the strict variant). If you check out the source, you’ll see that the functions “subject to fusion” actually manipulate `Stream`

s for the most part (they are of the general form `unstream . (stuff manipulating stream) . stream`

) and there are a bunch of `RULES`

pragmas for transforming `Stream`

s. In the end, any combination of these functions is supposed to get fused so that only one allocation needs to occur.

### So, what do I need to take away for my everyday coding?

The only real way to know when your code is subject to fusion is to have a good understanding of the rewrite rules involved and understand well how GHC works. That said, there is one thing that you *should* do: try to use non-recursive higher order functions when possible, since these can be (at least for now, but in general will always be more) easily fused.

### Complications

Because fusion in Haskell occurs through repeated application of rewrite rules, it suffices to convince yourself of each rewrite rule’s correctness to know that the whole “fused” program does the same thing as your original program does. Except there are edge cases relating to programs terminating. For example, one might think that

```
reverse (reverse xs) = xs
```

yet that is clearly not true, since `head $ reverse (reverse [1..])`

will not terminate yet `head [1..]`

will. More information from the Haskell Wiki.

^{1} This is actually true only provided that in these contexts the expression maintains the same type.