I thought it might be useful to publish our internal *Introducing functional programming* manual. I am lucky that everyone who wants to work for Cake Solutions is keen to learn new things, and–to my surprise & immense pleasure–most people are keen to get started even before their first day in the office! Their questions usually start with something like:

“My question is whether this is the most logical/efficient approach as per the reading list and updated topics you’ve given. My revised approach is to approach a little of everything: Haskell, Scala, Akka, AMQP, virtualization, et cetera. I wondered if you had any recommendations…”

Indeed I have.

## What is functional programming

Functional programming languages help us reason about the outcome of a computation. As you might guess, the core element of functional language is a *function*; a function in a mathematical sense. A function that takes some input and maps it to some output; we might not be able to see inside a function, but we know that for the same *x a function will always give the same **y*. Notice that this is very different from the concept of functions in C, Pascal, Java and others.

### Functions

Let’s take what might seem like a detour through mathematics. In mathematics, when I write

*f*(*x*) = *x* * *x*,

I know that *f*(5) is 25 every time I “execute” the function. This gives us a convenient computational model: substitution. Imagine I add another function, *g*(*x*) = *x* + 1. If I now ask you to compute (*f* o *g*)(5). You will probably go about this by performing substitution:

- Compute
*g*(5) by taking the “body” of*g*, substituting every x for 5, giving 5 + 1 = 6, - Substitute
*g*(5) in*f*(*g*(5)) with 6, giving*f*(6), - Compute
*f*(6) using the same substitution process, giving 36 ∎

But there’s another substitution: when computing *f*(*g*(*x*)), swap every *x* for *g*(*x*) in the body of *f*, giving (*f* o *g*)(*x*) = *g*(*x*) * *g*(*x*). I can even name the new function and arrive at *h* = (*f* o *g*), which–following the substitution steps–is *h*(x) = (*x* + 1) * (*x* + 1). Computing *h*(5) is (5 + 1) * (5 + 1), which is indeed 36 ∎ Good!

### Non-functions

Recall that a function returns the same value for the same input, *g*(5) is 6, *g*(6) = 7; *g*(5) is never 7 or 5 (or any other value). Let’s think of a simple non-function that you are all familiar with: *random*. If I include *random* in my function *g*(*x*) = *x* + random, then *g* is no longer a function. Computing *g*(5) could return 44 now and 5534345 a moment later. This makes our model of computation much more difficult. Namely, how to compute *f*(*g*(5)) with the *random* element? The two approaches now give different results! I can either compute *g*(5), remember the value and then supply it to *f* or I can substitute *g* in *f*(*x*), giving *h*(*x*) = (*x* + random) * (*x* + random).

Let’s do the calculations: we can either compute *g*(*x*)first, giving

- Compute
*g*(5) by taking the “body” of*g*, substituting every x for 5, giving 5 +*random*= 8, - Substitute
*g*(5) in*f*(*g*(5)) with 8, giving*f*(8), - Compute
*f*(8) using the same substitution process, giving 64 ∎

Or, I can compute a new function, *h*(*x*) = (*x* + random) * (*x* + random) and then compute *h*(5): the first term is (*x* + random), giving on this occasion 8. The second term, (*x* + random), gives 14. The result of *h*(5) is now 8 * 14, giving 112 ∎ Oops!

## Back to code

After this brief excursion to mathematics, let’s turn our attention to programming, implementing the functions `f`

and `g`

in Java:

int f(int x) { return x * x; } int g(int x) { return x + 1; }

Then the compiler cannot reason about the outcome of calling `f(5)`

. All that the Java (and C, Pascal, C#, …) compiler does is to transform the `x * x`

into the appropriate machine instructions. The machine that executes these instructions is even dumber, it simply executes one instruction at a time, without remembering the instruction that came before it and without knowing the instruction that will follow.

The Java compiler does that mainly because it treats every method as a non-function. So, if I ask Java to compute `f(g(5))`

, it will first compute `g(5)`

, and use the result in as the value of `x`

in `f(x)`

. Partially because of this, Java can never allow me to write *h* = (*f* o *g*). First of all, the Java compiler cannot decide how to substitute `g(x)`

and then Java does not have *function* as a data type. In other words, I cannot have:

int f ... int g ... int main(...) { Function h = f . g; h(5); }

### Functions as first-class values

So, this is the first thing functional languages bring: a function can now be a value; therefore I can assign a function to a variable and then take the variable that represents a function and apply it to some arguments. The concrete syntax differs language by language, but the concept is the same. *A function is a value.*

### Non-pure functions

There are differences between the functional languages in the way that they enforce the function-ness. Scala, for example, does not enforce that a function is a function in the mathematical sense. Therefore, in Scala, I can write `val f = (x: Int) => x * x`

and Scala will create a variable `f`

and set it to be the function that takes an `Int`

and returns a `Int`

. The function `f`

just so happens to be a pure function–a function in the mathematical sense, whose result depends only on its input. In Scala, I can also create function `g`

that includes the randomness: `val g = (x: Int) => x + (math.random * 1000).asInstanceOf[Int]`

. The function `g`

takes an `Int`

and returns an `Int`

. Recall though from the previous paragraph that `g`

is not a function in the mathematical sense! This drives how `f(g(5))`

will be computed: Scala will first compute `g(5)`

and then use the computed value as the parameter of the function `f`

. Most of the programmers don’t find this confusing or counter-intuitive, but it is not “right”. It restricts what I can do with the function variables and it fixes the flow of the program.

### Pure functions and pure functional languages

Now, a pure function is our familiar `f`

, whose result depends only on the value of its argument. Function `g`

is not pure, because its result depends on *something other than* its argument. Now, there are functional programming languages that enforce that the functions are pure functions. Take Haskell, where the result of a function must depend only on its arguments. In Haskell, I cannot have the function `g`

. So let’s change the function `g`

back to its trivial form *g*(*x*) = *x* + 1 and write:

f :: Int -> Int f x = x * x g :: Int -> Int g x = x + 1

I can now ask Haskell to compute `f(g 5)`

. It will say 36. But it will compute 36 completely differently. It will start by computing `f x`

, where `x`

acts like a promise to give the value when needed. When evaluating `f`

, Haskell will delay computing the value of `x`

until it needs it. (As it happens, in our function `f`

, we need it immediately, but the pattern remains.) So, in the body of `f`

, when Haskell needs `x`

, it will evaluate `g 5`

. This means Haskell actually computes `(g 5) * (g 5)`

. Obviously, `g 5`

gives 6, so the result of computing `f(g(5))`

is 36. Just like the Java or Scala code, but the difference is the way in which Haskell arrived at the result.

Now, because in Haskell (and other pure functional languages) functions are functions in the mathematical sense, I can easily create compound functions. The particular syntax in Haskell is `f . g`

. This returns a new function whose body we’ve already seen. I can therefore have code:

h = f . g

Computing `h(5)`

indeed returns 36.

### Higher-order functions

Because functions (whether pure or not) are first-class values, there’s nothing stopping me from accepting functions as parameters of other functions. I can also create functions that return other functions. When a function takes another function as its parameter or when it returns a function, we call it higher-order function. Most of you will have heard of the map-and-reduce algorithm. Let’s focus on the map part for now. We take some sequence of things and we wish to do something to each thing in the sequence. We could write specialised map-like functions that perform operations such as “double every element”, “add 5 to every element”, and many more. We could have

double :: [Int] -> [Int] double [] = [] double (x:xs) = x * x : double xs addFive :: [Int] -> [Int] addFive [] = [] addFive (x:xs) = x + 5 : addFive xs

In this case, we have two functions: the `double`

and `addFive`

which take list of numbers and return a list of numbers that have been doubled or have 5 added to them, respectively. But look! Both functions are nearly the same, except that the `double`

function contains code `x * x`

and the `addFive`

function contains `x + 5`

. It would be better if we could somehow inject code to be applied to every element. Remember that functions are first-class values, which we can use as arguments of functions! Let’s get rid of the `double`

and `addFive`

functions and replace them with a higher-order function `map`

:

def map(as: List[Int])(f: Int => Int): List[Int] = as match { case Nil => Nil case h::t => f(h) :: map(t)(f) }

This code is in Scala (thanks to Tony Morris for the improvement!); I can write similar code in Haskell:

map' :: [Int] -> (Int -> Int) -> [Int] map' [] _ = [] map' (a:as) f = f a : map' as f

In both cases, I have function `map`

that takes a function that takes a sequence of values of `Int`

s and a function that takes individual `Int`

and returns a new `Int`

. Now I can compute the value of `map(List(1, 2, 3))(_ + 1)`

in Scala (or `map' [1, 2, 3] (+1)`

in Haskell) and get the same result, namely `List(2, 3, 4)`

.

Let’s complete the introduction by looking at functions that return functions. Such a function is a kind of factory of functions, say depending on the value of its argument. If the argument is 1 then the function should return a function that adds one; if the argument is 2 then the function should return a function that adds two, and so on. For example:

def mkf(x: Int) = x match { case 1 => {x: Int => x + 1} case 2 => {x: Int => x + 2} case 3 => {x: Int => x * x} }

This is the code in Scala, we can have the same thing in Haskell’s syntax:

mkf :: Int -> (Int -> Int) mkf x = case x of 1 -> (\x -> x + 1) 2 -> (\x -> x + 2) 3 -> (\x -> x * x)

Now, let’s combine everything together. Let’s use the `map`

function together with the `mkf`

function to create a truly Frankensteinian monster:

map' [1, 2, 3] (mkf 1)

This Haskell gem will poo out `[2,3,4]`

. It applies the function returned by computing `mkf 1`

to every element in the source sequence `[1,2,3]`

.

## What’s next?

In the next instalment, we will explore type systems. Type systems are not specific to functional systems, but our favourite functional languages, Scala and Haskell are both functional and type safe languages.

As a motivation, doesn’t it bother you that the `map`

function can apply arbitrary function to every element, but we require the elements to be `Int`

s. What if I needed to map `String`

s? Or `Customer`

s? In the next post, you will learn about types; you will understand what we mean when we write:

map' :: [a] -> (a -> b) -> [b]

or

def map[A, B](as: List[A])(f: A => B): List[B]

## Summary

In this introduction to functional programming, we found out what the mathematicians call a function. We then looked at how programming languages model the mathematical concept. We discovered that all languages allow us to compute results. Some languages simply spit out instructions that just so happen to arrive at the result. In those non-functional languages, a function is simply a name for a block of code that does something. Functional languages understand the notion of a function, therefore a function is no longer just a block of code, it can be a value. Because a function is a value, it can be passed around and even returned from other functions. Some functional languages go a step further and enforce the pureness of a function. Those language do a lot more reasoning of the control flow in the program.

Hi there, Just a minor correction on your Scala map function.

The implementation calls size() which is O(n), thus making the implementation inefficient. Further, since you are comparing it to Haskell, it is poor form for pedagogical purposes, since if you changed the Haskell version to reflect the Scala version, you’d have a very different function as a result of laziness. That is to say, they should not be compared since they are quite different.

It is also general poor form to call size/length on a cons list and so I advise that it should not feature in teaching material. Note that Haskell’s length function is a partial function and Scala’s size function on Stream is also partial.

Here is the Scala version you are looking for:

http://paste.pocoo.org/show/512279/

Hope that helps.

Thanks for the comment and you make a really good point about the Scala code.