Erlang for Python programmers: Part III

by Ruslan Spivak on September 24, 2007

Previous parts: Intro, Part I and Part II

Today we are going to take a look at functions.
In Erlang there is no direct correspondence to what we have in Python:

1. default values, keyword arguments and formal parameter of form **name

def foo(key, name='ruslan', type='unknown', **kw):
    print 'key = %s, name = %s, type = %s' % (key, name, type)
    for key, val in kw.items():
        print '%s = %s' % (key, val)
>>> foo(1)
key = 1, name = ruslan, type = unknown
>>> foo(1, type='known', x=1, y=10)
key = 1, name = ruslan, type = known
y = 10
x = 1

2. formal parameter of form *name

def double_sum(*args):
    return sum(x * 2 for x in args)
>>> double_sum(2, 4, 5)
22

Let’s take a look at Erlang’s function declaration syntax (a bit modified from Erlang Reference Manual):

  1. Function declaration consists of function clauses separated by semicolons (;) and terminated by period (.)
  2. Function clause consists of head and body separated by ->
  3. A clause head consists of function name (which is atom), argument list(each argument is pattern) and optional guard sequence beginning with keyword when.
  4. A clause body consists of sequence of expressions separated by comma (,)

Name(Pattern11,…,Pattern1N) [when GuardSeq1] ->
Body1;
…;
Name(PatternK1,…,PatternKN) [when GuardSeqK] ->
BodyK.

where Body is in form:
Expr1,
Expr2,

ExprN

Number of arguments in function is called arity. Function is uniquely identified by combination of module name, function name and function’s arity and this form is often written as m:f/N. Two functions in the module with the same name, but different arity
are completely different functions, remember it.

Enough dry theory, let’s define our Erlang factorial function in module mymath.erl :

-module(mymath).
-export([factorial/1]).

factorial(0) ->
    1;
factorial(N) ->
    N * factorial(N-1).

What we have here:

1) function declaration of one function called factorial
2) there are two clauses
- first clause separated by semicolon (;)

factorial(0) ->
    1;

- second and final clause terminated with period (.)

factorial(N) ->
    N * factorial(N-1).

3) Every clause (first and second) has correspondingly head and body. Head consists of function’s name, in our case it’s factorial and after follows arguments list which contains patterns. In first clause pattern is 0, in second – N.

How this works:

When function is entered, the function clauses are pattern matched against passed arguments. Pattern matching happens in order functions are defined in module. So when you enter

1> mymath:factorial(5).
120

first 5 is matched against clause factorial(0) and fails as 5 != 0, then 5 is matched against N in second clause and pattern matching succeeds. As a result unbound variable N in pattern becomes bound and gets value 5. This leads to execution of body of second clause, namely N * factorial(N-1). This happens recursively until N becomes 0 and first clause factorial(0) is executed terminating recursion.
The same factorial function will look in Python without pattern matching like:

def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n-1)

In Python’s version we “hardcoded” branching in function’s body.

The benefits of pattern matching are that when conditions inside function become complex you may do your job better and write less code using pattern matching and also if you have big function this may mean that you do not have to modify body of your function but to add another clause with corresponding pattern.

Earlier I mentioned that function clauses are pattern matched in order they are defined in module. For our function this is important. If we change order of clauses this will lead to problems, as the condition to terminate recursion never appear:

-module(mymath).
-export([factorial/1]).

factorial(N) ->
    N * factorial(N-1);
factorial(0) ->
    1.

When compiling we will get warning (use C-c C-k in Emacs or c(mymath). in Eshell to compile)

2> c("/home/alienoid/dev/erlang/mymath", [{outdir, "/home/alienoid/dev/erlang/"}]).
/home/alienoid/dev/erlang/mymath.erl:6: Warning: this clause cannot match because a previous clause at line 4 always matches
{ok,mymath}

indicating that second clause will never match, but still you can call the function (it will most likely to crash after eating all your available memory).

You can easily avoid that problem introducing guard that starts with keyword when in function clause’s head:

-module(mymath).
-export([factorial/1]).

factorial(N) when N > 0 ->
    N * factorial(N-1);
factorial(0) ->
    1.

Now you can put those two clauses in any order you like. I should also add that if during pattern matching no matched function clause is found a function_clause run time error will occur:

3> mymath:factorial(-4).

=ERROR REPORT==== 24-Sep-2007::01:53:25 ===
Error in process <0.29.0> with exit value: {function_clause,[{mymath,factorial,[-4]},{erl_eval,do_apply,5},{shell,exprs,6},{shell,eval_loop,3}]}

** exited: {function_clause,[{mymath,factorial,[-4]},
                             {erl_eval,do_apply,5},
                             {shell,exprs,6},
                             {shell,eval_loop,3}]} **

What is guard anyway ? Guards make pattern matching more powerful, they can be used to perform different tests and comparison operations on variables in a pattern as you could see. You can have several guard expressions in clause separated by commans (,). The guard containing several guard expressions evaluates to true only if every guard expression evaluates to true. Example of guard containing several guard expressions:

-module(mymath).
-export([factorial/1]).

factorial(N) when is_integer(N), N > 0 ->
    N * factorial(N-1);
factorial(0) ->
    1.

For above definition calling mymath:factorial(5) reads as: choose clause factorial(N) when (N is integer AND N > 0)

There are also guard sequences which have form guard1; guard2; …; guardN and guard sequence evaluates to true if at least one guard is evaluated to true:

-module(mymath).
-export([factorial/1]).

factorial(N) when is_integer(N), N > 0; is_list(N) ->
    N * factorial(N-1);
factorial(0) ->
    1.

For above definition calling mymath:factorial(5) reads as: choose clause factorial(N) when (N is integer AND N > 0) OR N is list.

You have probably noted that the way we defined our factorial function means that it will be calling itself recursively and this will use stack and stack is not infinite resource. We need to do something about our function and this means to make it tail recursive. When last expression in function body is function call this means that call is tail recursive and this is how infinite loop may be done without consuming stack resources. So here is our new tail-recursive function factorial that uses accumulator to pass value of previous calculation:

-module(mymath).
-export([factorial/1]).

%% this one is exported, arity is 1
factorial(N) ->
    factorial(N, 1).

%% helper function, not exported, arity is 2
%% tail-recursive, uses accumulator to store
%% calculation between invocations
factorial(N, Acc) when N > 0 ->
    factorial(N-1, N*Acc);
factorial(0, Acc) ->
    Acc.

In foregoing example you can see guards, tail-recursion, usage of accumulators, functions with different arities. Experiment with it.

If you enjoyed this post why not subscribe via email or my RSS feed and get the latest updates immediately. You can also follow me on GitHub or Twitter.

Speak your mind

Previous post:

Next post: