Autovivication and Y Combinator in Python

April 14, 2008

Sometimes you may need to have nested dictionaries initialized without much hassle.

“Autovivication” comes in handy. What you want to achieve is:

>>> d = dict()
>>> d['q']['w']['r'] = 777

Of course what you’ll get will be:

>>> d['q']['w']['r'] = 777
Traceback (most recent call last):
  File “<stdin>”, line 1, in <module>
KeyError: ‘q’

If you don’t need deeply nested dictionaries defaultdict with dict default factory will do the job:

>>> from collections import defaultdict
>>> d = defaultdict(dict)
>>> d
defaultdict(<type ‘dict’>, {})
>>> d['a']['b'] = 33
>>> d
defaultdict(<type ‘dict’>, {’a': {’b': 33}})
>>> d['a']['b']
33

Here are another ways to have unlimited level of nested dictionaries.
1) Define custom class inheriting from built-in dict:

class default_dict(dict):
    def __missing__(self, key):
        self[key] = value = self.__class__()
        return value
>>> d = default_dict()
>>> d
{}
>>> d['q']['w']['r'] = 777
>>> d
{’q': {’w': {’r': 777}}}
>>> d['q']['w']['r']
777

2) Recursive function + defaultdict:

def make_dict():
    return defaultdict(make_dict)
>>> from collections import defaultdict
>>> d = defaultdict(make_dict)
>>> d
defaultdict(<function make_dict at 0xb7c5295c>, {})
>>> d['q']['w']['r'] = 777
>>> d
defaultdict(<function make_dict at 0xb7c5295c>, {’q': defaultdict(<function make_dict at 0xb7c5295c>, {’w': defaultdict(<function make_dict at 0xb7c5295c>, {’r': 777})})})
>>> d['q']['w']['r']
777

3) And for fun you can use Y Combinator (in short - making anonymous lambda recursive) which is actually as (2) but you don’t have to have recursive function defined:

def Y(g):
    return ((lambda f: f(f))
            (lambda f:
                 g(lambda x: f(f)
                   (x))))

defdict = Y(lambda mk_dict:
                lambda x=None: defaultdict(lambda x=None: mk_dict(x)))
>>> d = defdict()
>>> d
defaultdict(<function <lambda> at 0xb7c52a74>, {})
>>> d['q']['w']['r'] = 777
>>> d
defaultdict(<function <lambda> at 0xb7c52a74>, {’q': defaultdict(<function <lambda> at 0xb7c56c6c>, {’w': defaultdict(<function <lambda> at 0xb7c56a74>, {’r': 777})})})
>>> d['q']['w']['r']
777

Erlang for Python programmers: Part V

November 16, 2007

Previous parts: Intro, Part I, Part II, Part III and Part IV

List comprehensions.

List comprehensions is very familiar topic for Python programmer. It’s a syntactic sugar which provides a succinct notation for producing elements in list.

Erlang:

[Expr || Qualifier1,...,QualifierN]

Expr - is an arbitrary expression
Qualifier is either a generator or a filter.
A generator is of form Pattern <- ListExpr where ListExpr evaluates to a list of terms.
A filter expression should evaluate to true or false.

1> [X || X <- [1,2,3,4,5,6], X > 3].
[4,5,6]

This is read as: list of X such that X is taken from list [1,2,3,4,5,6] and X is greater than 3.

In foregoing example: X <- [1,2,3,4,5,6] is a generator and X > 3 is a filter.

Several filters can be combined:

2> [X || X <- [1,a,2,3,b,4,5,6], X > 3].
[a,b,4,5,6]
3> [X || X <- [1,a,2,3,b,4,5,6], integer(X), X > 3].
[4,5,6]

Generator part may act as a filter in list comprehension:

4> [X || {X, a} <- [{1, a}, {2, b}, {3, a}, erlang]].
[1,3]

Generators can be combined too in list comprehensions. This is the Cartesian product of two lists:

5> [{X, Y} || X <- [1, 2, 3], Y <- [a, b]].
[{1,a},{1,b},{2,a},{2,b},{3,a},{3,b}]

This is how elegantly quicksort algorithm is solved with the help of list comprehensions in Erlang:

-module(sort).
-export([qsort/1]).

qsort([]) -> [];
qsort([Pivot|Tail]) ->
    qsort([X || X <- Tail, X < Pivot])
        ++ [Pivot] ++
        qsort([X || X <- Tail, X >= Pivot]).

Compile and run:

6> c(”/home/alienoid/dev/erlang/sort”, [{outdir, "/home/alienoid/dev/erlang/"}]).
{ok,sort}
7> sort:qsort([7, 5, 9, 3, 6]).
[3,5,6,7,9]

Usage of list comprehensions instead of some higher-order functions:

8> lists:map(fun(X) -> X*2 end, [1, 2, 3]).
[2,4,6]
9> [X*2 || X <- [1, 2, 3]].
[2,4,6]
10> lists:filter(fun(X) -> X rem 2 == 0 end, [1, 2, 3, 4]).
[2,4]
11> [X || X <- [1, 2, 3, 4], X rem 2 == 0].
[2,4]

Python:

[expression for expr1 in seq1 if cond1
            for expr2 in seq2 if cond2
            for exprN in seqN if condN]

this actually transforms to:

for expr1 in seq1:
    if not cond1:
        continue
    for expr2 in seq2:
        if not cond2:
            continue
        for exprN in seqN:
            if not condN:
                continue

So in Python Cartesian product of two lists [1, 2, 3 ] and ['a', 'b'] will look like:

>>> [(x, y) for x in [1, 2, 3] for y in ['a', 'b']]
[(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]

Variable bindings and scope rules in List Comprehensions:

Current Python 2.x “leaks” loop variable into surrounding scope. This should be solved in Python 3.0

>>> x
Traceback (most recent call last):
  File “<stdin>”, line 1, in <module>
NameError: name ‘x’ is not defined
>>> vlist = [x for x in 'abc']
>>> vlist
['a', 'b', 'c']
>>> x
‘c’

To avoid leaking loop variable into surrounding scope we can use generator expression which doesn’t “leak”:

>>> vlist = list(x for x in ‘abc’)
>>> vlist
['a', 'b', 'c']
>>> x
Traceback (most recent call last):
  File “<stdin>”, line 1, in <module>
NameError: name ‘x’ is not defined

Erlang has following scope rules for variables in list comprehensions(from Erlang manual):

  • all variables which occur in a generator pattern are assumed to be “fresh” variables
  • any variables which are defined before the list comprehension and which are used in filters have the values they had before the list comprehension
  • no variables may be exported from a list comprehension

So, Erlang doesn’t “leak” variables:

1> [X || X <- [1, 2, 3]].
[1,2,3]
2> X.
** 1: variable ‘X’ is unbound **

Be careful, sometimes you need to move some pattern matching operations into filter:

-module(listcomp).
-export([select/2]).

%% This select produces following warning during compilation:
%% Warning: variable ‘X’ shadowed in generate
select(X, List) ->
    [Y || {X, Y} <- List].

The function doesn’t produce desired result:

1> listcomp:select(b, [{b, 1}, {c, 2}, {b, 3}]).
[1,2,3]

So we modify generator and move pattern matching operation into filter to achieve what we need:

-module(listcomp).
-export([select/2]).

%% Moving X from pattern matching into filter
select(X, List) ->
    [Y || {X1, Y} <- List, X == X1].
2> listcomp:select(b, [{b, 1}, {c, 2}, {b, 3}]).
[1,3]

As you saw, while different in syntax, both languages provide valuable language construct which allows to write shorter and more elegant programs.


Erlang for Python programmers: Part IV

October 2, 2007

Previous parts: Intro, Part I, Part II and Part III

Funs - sequel

Continuing previous post about functions let’s remember about Fun. I already mentioned it in Part I and promised to show its advanced forms. So, here we go.

Fun with one clause:

1> Double = fun(X) -> X * 2 end.
#Fun<erl_eval.6.56006484>
2> Double(2).
4

But Fun actually has the same function declaration syntax as regular function, except that it has no name in declaration, so we can have this:

3> Big = fun (X) when X > 100 -> true;
3>           (X) -> false
3>       end.
#Fun<erl_eval.6.56006484>
4> lists:filter(Big, [20, 30, 120, 130]).
[120,130]

As you see our Fun has two clauses separated by semicolon(;) and first clause has also guard, ie syntax is the same as in regular function declaration.

There are also following fun expressions that can be used:

  1. fun FunctionName/Arity (FunctionName should point to local function defined in the same module with our fun)
  2. fun Module:FunctionName/Arity (FunctionName should be exported from module Module)

fun FunctionName/Arity is just syntactic sugar for fun (X1, ..Xn) -> FunctionName(X1, .., Xn) end.

Let’s take a look on foregoing in action and create mymath.erl

-module(mymath).
-export([test1/0,
         test2/0,
         test3/0,
         double/1]).

test1() -> lists:map(fun(X) -> X * 2 end, [1, 2, 3]).

test2() -> lists:map(fun double_local/1, [1, 2, 3]).

test3() -> lists:map(fun mymath:double/1, [1, 2, 3]).

%% helper function, not exported
double_local(X) ->
    X * 2.

%% helper function, exported
double(X) ->
    X * 2.

Compile and test it (I’m compiling as usual in Emacs with C-c C-k):

1> c(”/home/alienoid/dev/erlang/mymath”, [{outdir, "/home/alienoid/dev/erlang/"}]).
{ok,mymath}
2> mymath:test1().
[2,4,6]
3> mymath:test2().
[2,4,6]
4> mymath:test3().
[2,4,6]

Also tuple of type {Module, FunctionName} is interpreted as a fun Module:FunctionName/Arity which you already saw. This usage is deprecated, but to be complete here is an example:

5> Append = {lists, append}.
{lists,append}
6> Append([1, 2], [3, 4, 5]).
[1,2,3,4,5]

As you already noted Fun in Erlang is “richer” than lambda in Python.

Built-in functions, BIFs

Both Erlang and Python provide built-in functions that are always available and do not require explicit usage of module name to use them.

Python:

Read more about builtins in library reference.

In Python shell you can inspect in addition contents of __builtins__ module if you need:

>>> import pprint
>>> pprint.pprint(dir(__builtins__))
['ArithmeticError',
 'AssertionError',
 ...
 'type',
 'unichr',
 'unicode',
 'vars',
 'xrange',
 'zip']

Erlang:
Most of the built-in functions belong to module erlang and they are auto-imported.
To get list of all BIFs with description read man erlang.
In Eshell you can get list of functions that belong to erlang module with m(Module). command:

1> m(erlang).
Module erlang compiled: No compile time info available
Compiler options:  []
Object file: preloaded
Exports:
‘!’/2                         list_to_existing_atom/1
‘$erase’/1                    list_to_float/1
‘$erase’/0                    list_to_integer/2
‘$get’/1                      list_to_integer/1
‘$get’/0                      list_to_pid/1
‘$get_keys’/1                 list_to_tuple/1
‘$put’/2                      load_module/2
‘*’/2                         loaded/0
‘+’/1                         localtime/0
‘+’/2                         localtime_to_universaltime/1
‘++’/2                        localtime_to_universaltime/2
…
ok

Take into account though that not all functions in above list that you’ll see are auto-imported and may require usage of module: prefix. Again, for more information read man erlang (either in command line with erl -man erlang or if you use Emacs take a look at Erlang man pages in Emacs)

Macros

Python does not have them, Erlang does. In Erlang they are not as powerful as in Common Lisp, but more like in C.

They are defined in following form:

-define(Const, Replacement).
-define(Func(Var1,…,VarN), Replacement).

Whenever erlang preprocessor encounters expression of form ?MacroName it expands corresponding macro.
Define utils.erl:

-module(utils).
-export([foo/0]).
-define(TIMEOUT, 100).
-define(FUNCMACRO(X, Y), {X, x, Y, y}).

foo() ->
    io:format(“TIMEOUT = ~p~n”, [?TIMEOUT]),
    ?FUNCMACRO(3, 7).

Compile and run:

1> utils:foo().
TIMEOUT = 100
{3,x,7,y}

There are also some predefined macros:

?MODULE
The name of the current module.
?MODULE_STRING.
The name of the current module as a string.
?FILE.
The file name of the current module.
?LINE.
The current line number.
?MACHINE.
The machine name, 'BEAM'.

Having ?MODULE we can rewrite earlier example with Fun as:

-module(mymath).
-export([test1/0,
         test2/0,
         test3/0,
         double/1]).

test1() -> lists:map(fun(X) -> X * 2 end, [1, 2, 3]).

test2() -> lists:map(fun double_local/1, [1, 2, 3]).

%% test3() -> lists:map(fun mymath:double/1, [1, 2, 3]).
test3() -> lists:map(fun ?MODULE:double/1, [1, 2, 3]).

%% helper function, not exported
double_local(X) ->
    X * 2.

%% helper function, exported
double(X) ->
    X * 2.

And as usual compile and run to see that result is the same:

1> c(”/home/alienoid/dev/erlang/mymath”, [{outdir, "/home/alienoid/dev/erlang/"}]).
{ok,mymath}
2>  mymath:test1().
[2,4,6]
3>  mymath:test2().
[2,4,6]
4>  mymath:test3().
[2,4,6]

In addition there is also flow control in macros. Some of corresponding macro directives are:

  • ifdef(Macro). - Evaluate the following lines only if Macro is defined.
  • else. - It’s used after ifdef or ifndef
  • endif. - Marks the end of ifdef or ifndef directive

Example in utils.erl:

-module(utils).
-export([foo/0]).

-ifdef(debug).
-define(LOG(Msg), io:format(“{~p:~p}: ~p~n”, [?MODULE, ?LINE, Msg])).
-else.
-define(LOG(Msg), true).
-endif.

foo() ->
    ?LOG(“Debug is enabled”).

To turn LOG macro on debug should be defined. This can be achieved from command line with erlc -Ddebug utils.erl or from Eshell using c/2 function:

1> c(utils, {d, debug}).
{ok,utils}
2> utils:foo().
{utils:11}: “Debug is enabled”
ok

To turn LOG macro off debug should be omitted:

3> c(utils).
{ok,utils}
4> utils:foo().
true

To be continued.


Erlang for Python programmers: Part III

September 24, 2007

Previous parts: Intro, Part I and Part II

Today we are going to take a look at functions.
In Erlang there is no direct correspondence to what we have in Python:

1. default values, keyword arguments and formal parameter of form **name

def foo(key, name=‘ruslan’, type=‘unknown’, **kw):
    print ‘key = %s, name = %s, type = %s’ % (key, name, type)
    for key, val in kw.items():
        print ‘%s = %s’ % (key, val)
>>> foo(1)
key = 1, name = ruslan, type = unknown
>>> foo(1, type=’known’, x=1, y=10)
key = 1, name = ruslan, type = known
y = 10
x = 1

2. formal parameter of form *name

def double_sum(*args):
    return sum(x * 2 for x in args)
>>> double_sum(2, 4, 5)
22

Let’s take a look at Erlang’s function declaration syntax (a bit modified from Erlang Reference Manual):

  1. Function declaration consists of function clauses separated by semicolons (;) and terminated by period (.)
  2. Function clause consists of head and body separated by ->
  3. A clause head consists of function name (which is atom), argument list(each argument is pattern) and optional guard sequence beginning with keyword when.
  4. A clause body consists of sequence of expressions separated by comma (,)

Name(Pattern11,…,Pattern1N) [when GuardSeq1] ->
Body1;
…;
Name(PatternK1,…,PatternKN) [when GuardSeqK] ->
BodyK.

where Body is in form:
Expr1,
Expr2,

ExprN

Number of arguments in function is called arity. Function is uniquely identified by combination of module name, function name and function’s arity and this form is often written as m:f/N. Two functions in the module with the same name, but different arity
are completely different functions, remember it.

Enough dry theory, let’s define our Erlang factorial function in module mymath.erl :

-module(mymath).
-export([factorial/1]).

factorial(0) ->
    1;
factorial(N) ->
    N * factorial(N-1).

What we have here:

1) function declaration of one function called factorial
2) there are two clauses
- first clause separated by semicolon (;)

factorial(0) ->
    1;

- second and final clause terminated with period (.)

factorial(N) ->
    N * factorial(N-1).

3) Every clause (first and second) has correspondingly head and body. Head consists of function’s name, in our case it’s factorial and after follows arguments list which contains patterns. In first clause pattern is 0, in second - N.

How this works:

When function is entered, the function clauses are pattern matched against passed arguments. Pattern matching happens in order functions are defined in module. So when you enter

1> mymath:factorial(5).
120

first 5 is matched against clause factorial(0) and fails as 5 != 0, then 5 is matched against N in second clause and pattern matching succeeds. As a result unbound variable N in pattern becomes bound and gets value 5. This leads to execution of body of second clause, namely N * factorial(N-1). This happens recursively until N becomes 0 and first clause factorial(0) is executed terminating recursion.
The same factorial function will look in Python without pattern matching like:

def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n-1)

In Python’s version we “hardcoded” branching in function’s body.

The benefits of pattern matching are that when conditions inside function become complex you may do your job better and write less code using pattern matching and also if you have big function this may mean that you do not have to modify body of your function but to add another clause with corresponding pattern.

Earlier I mentioned that function clauses are pattern matched in order they are defined in module. For our function this is important. If we change order of clauses this will lead to problems, as the condition to terminate recursion never appear:

-module(mymath).
-export([factorial/1]).

factorial(N) ->
    N * factorial(N-1);
factorial(0) ->
    1.

When compiling we will get warning (use C-c C-k in Emacs or c(mymath). in Eshell to compile)

2> c(”/home/alienoid/dev/erlang/mymath”, [{outdir, "/home/alienoid/dev/erlang/"}]).
/home/alienoid/dev/erlang/mymath.erl:6: Warning: this clause cannot match because a previous clause at line 4 always matches
{ok,mymath}

indicating that second clause will never match, but still you can call the function (it will most likely to crash after eating all your available memory).

You can easily avoid that problem introducing guard that starts with keyword when in function clause’s head:

-module(mymath).
-export([factorial/1]).

factorial(N) when N > 0 ->
    N * factorial(N-1);
factorial(0) ->
    1.

Now you can put those two clauses in any order you like. I should also add that if during pattern matching no matched function clause is found a function_clause run time error will occur:

3> mymath:factorial(-4).

=ERROR REPORT==== 24-Sep-2007::01:53:25 ===
Error in process <0.29.0> with exit value: {function_clause,[{mymath,factorial,[-4]},{erl_eval,do_apply,5},{shell,exprs,6},{shell,eval_loop,3}]}

** exited: {function_clause,[{mymath,factorial,[-4]},
                             {erl_eval,do_apply,5},
                             {shell,exprs,6},
                             {shell,eval_loop,3}]} **

What is guard anyway ? Guards make pattern matching more powerful, they can be used to perform different tests and comparison operations on variables in a pattern as you could see. You can have several guard expressions in clause separated by commans (,). The guard containing several guard expressions evaluates to true only if every guard expression evaluates to true. Example of guard containing several guard expressions:

-module(mymath).
-export([factorial/1]).

factorial(N) when is_integer(N), N > 0 ->
    N * factorial(N-1);
factorial(0) ->
    1.

For above definition calling mymath:factorial(5) reads as: choose clause factorial(N) when (N is integer AND N > 0)

There are also guard sequences which have form guard1; guard2; …; guardN and guard sequence evaluates to true if at least one guard is evaluated to true:

-module(mymath).
-export([factorial/1]).

factorial(N) when is_integer(N), N > 0; is_list(N) ->
    N * factorial(N-1);
factorial(0) ->
    1.

For above definition calling mymath:factorial(5) reads as: choose clause factorial(N) when (N is integer AND N > 0) OR N is list.

You have probably noted that the way we defined our factorial function means that it will be calling itself recursively and this will use stack and stack is not infinite resource. We need to do something about our function and this means to make it tail recursive. When last expression in function body is function call this means that call is tail recursive and this is how infinite loop may be done without consuming stack resources. So here is our new tail-recursive function factorial that uses accumulator to pass value of previous calculation:

-module(mymath).
-export([factorial/1]).

%% this one is exported, arity is 1
factorial(N) ->
    factorial(N, 1).

%% helper function, not exported, arity is 2
%% tail-recursive, uses accumulator to store
%% calculation between invocations
factorial(N, Acc) when N > 0 ->
    factorial(N-1, N*Acc);
factorial(0, Acc) ->
    Acc.

In foregoing example you can see guards, tail-recursion, usage of accumulators, functions with different arities. Experiment with it.


comment-dwim and comment-style

September 20, 2007

In my previous post I mentioned that if you use Emacs and you’re editing Erlang source code you can easily comment whole marked region by invoking M-; to which by default bound interactive Lisp function comment-dwim. This works in different modes and it’s very handy especially if your language of choice has no multiline comments.

So far so good, but I must admit sometimes I get the blues when I comment some nested code blocks with M-; keystrokes. Well, maybe I’m blowing things out of proportion about my sadness, but in any case recently I came across really nice small blog post called comment-style and since then I’m happy with comments produced in nested blocks. Let’s see why.

When you comment marked regions in Emacs buffer with M-; the comment-style variable defines the way regions are commented. Default value is plain.

Buffer with Python code before commenting:

def binsearch(seq, key, start, end):
    if end < start:
        return -1
    mid = (start + end) / 2
    if key == seq[mid]:
        return mid
    if key < seq[mid]:
        return binsearch(seq, key, start, mid-1)
    if key > seq[mid]:
        return binsearch(seq, key, start+1, end)

Buffer with Python code after commenting:

def binsearch(seq, key, start, end):
    if end < start:
        return -1
    mid = (start + end) / 2
    # if key == seq[mid]:
#         return mid
    if key < seq[mid]:
        return binsearch(seq, key, start, mid-1)
    if key > seq[mid]:
        return binsearch(seq, key, start+1, end)

As you see visually elements of commented block are not on the same line.

Example with Erlang code before commenting (somewhat far-fetched example though our main concern are comments, not code here):

all(Pred, [Hd|Tail]) ->
    case Pred(Hd) of
        true -> all(Pred, Tail);
        false ->
            io:format(“Hd = ~p~n”, [Hd]),
            true,
            false
    end;
all(Pred, []) when is_function(Pred, 1) -> true.

After commenting:

all(Pred, [Hd|Tail]) ->
    case Pred(Hd) of
        true -> all(Pred, Tail);
        false ->
            %% io:format(”Hd = ~p~n”, [Hd]),
%%             true,
            false
    end;
all(Pred, []) when is_function(Pred, 1) -> true.

So the visual issue with comments is the same.

After reading foregoing blog post about comment-style I’ve added this line to my .emacs:

(setq comment-style 'indent)

Easy as pie and which means (taken from Emacs help): ” …`comment-start’ markers should not be put at the left margin but at the current indentation of the region to comment.”

After that addition foregoing commented block in Python mode will look like:

def binsearch(seq, key, start, end):
    if end < start:
        return -1
    mid = (start + end) / 2
    # if key == seq[mid]:
    #     return mid
    if key < seq[mid]:
        return binsearch(seq, key, start, mid-1)
    if key > seq[mid]:
        return binsearch(seq, key, start+1, end)

in Erlang mode:

all(Pred, [Hd|Tail]) ->
    case Pred(Hd) of
        true -> all(Pred, Tail);
        false ->
            %% io:format(”Hd = ~p~n”, [Hd]),
            %% true,
            false
    end;
all(Pred, []) when is_function(Pred, 1) -> true.

Now I am happy. Thank you, Edward for the tip.


Erlang for Python programmers: Part II

September 16, 2007

In this short tutorial we will take a look at comparison operations, some arithmetic operations and modules. You may also want to skim over Inro and Part I.

Comparison operations

 Python   Erlang   Description             Erlang Example
 --------+--------+-----------------------+----------------
  <        <        strictly less than
 --------+--------+-----------------------+----------------
  <=       =<       less than or equal
 --------+--------+-----------------------+----------------
  >        >        strictly greater than
 --------+--------+-----------------------+----------------
  >=       >=       greater than or equal
 --------+--------+-----------------------+----------------
  !=       /=       not equal
 --------+--------+-----------------------+----------------
  ==       ==       equal                   1> 1 == 1.
                                            true
                                            2> 1 == 1.0.
                                            true
 --------+--------+-----------------------+----------------
           =:=      exactly equal to        1> 1 =:= 1.
                                            true
                                            2> 1 =:= 1.0.
                                            false
 --------+--------+-----------------------+----------------
           =/=      exactly not equal to
 --------+--------+-----------------------+----------------
  is                object identity
 --------+--------+-----------------------+----------------
  is not            negated obj identity

Python notes

Above comparison operations are supported by all objects. In Python you can chain comparisons:

>>> x, y, z = 1, 3, 7
>>> x < y <= z
True
>>> x < y and y <= z
True

Foregoing examples are identical, except that in second case y is evaluated twice.

Erlang notes

Following order is defined:

number < atom < reference < fun < port < pid < tuple < list < binary

1> 5 < erlang.
true
2> erlang < make_ref().
true

Both Erlang comparison operators (<, =<, >, >=, /=, ==) and Python comparison operators(<, <=, >, >=, !=, ==) make type coerce, “narrower” type is widened to that of another, ie when comparing integer and float first integer is converted to float, etc.
Erlang has special operators without type coerce though: =:= and =/=

Arithmetic operations

I’ll provide only several operations, for more take a look at corresponding language references.

 Python   Python Desc.         Erlang    Erlang Desc.   Example
 --------+--------------------+---------+--------------+--------------------
  x % y    remainder            X rem Y   integer        >>> 7 % 3
           of x / y                       remainder      1
                                          of X / Y       >>> 7.0 % 3
                                                         1.0

                                                         1> 7 rem 3.
                                                         1
                                                         1> 7.0 rem 3.
                                                         =ERROR REP..
 --------+--------------------+---------+--------------+--------------------
  x / y    quotient             X / Y     floating       >>> 7 / 3
           of x and y                     point          2
                                          division       >>> 7.0 / 3
                                                         2.3333333333333335

                                                         1> 7 / 3.
                                                         2.33333
 --------+--------------------+---------+--------------+--------------------
  x // y   (floored) quotient   X div Y   integer        >>> 7 // 3
           of x and y,                    division       2
           integer division                              >>> 7.5 // 3
           (result type is                               2.0
           not forced to be
           int)                                          1> 7 div 3.
                                                         2

Modules

Code in Erlang is organized into units called modules, which is familiar word for Python programmer.

Let’s define sample module with function declaration stored in file mymath.erl :

-module(mymath).
-export([fact/1]).

fact(0) -> 1;
fact(N) -> N * fact(N-1).

What we see here:

  1. module’s source code is stored in file with .erl extension
  2. module consists of attributes and function declarations which are terminated by period (.)
  3. we should provide module declaration defining name of the module
  • module name should be atom
  • module name is the same as file name minus .erl extension
  • module declaration is mandatory
  • module declaration attribute should be defined first.

To make functions defined in module accessible outside the module we need to export them, for this -export module attribute exists. We write exported functions inside square brackets in form of func/N where N is the number of arguments of function, called arity. I’ll repeat that functions not pointed in -export will not be accessible outside module.

Before code can be run, module must be compiled. Resulting compiled file will contain extension .beam

If you use Emacs you can compile module with C-c C-k and see results in erlang shell:

1> c(”/home/alienoid/dev/erlang/mymath”, [{outdir, "/home/alienoid/dev/erlang/"}]).
{ok,mymath}

Or you can compile it directly in shell. Make sure your shell’s current directory is where your mymath.erl lives, if it’s not the case use cd command in erlang shell, cd(”/path/to/dir/with/mymath.erl”) :

1> c(mymath).
{ok,mymath}

To invoke our function fact we use syntax mod:func :

2> mymath:fact(4).
24

Erlang has also -import attribute which allows to import functions into modules, so that you don’t need to use fully-qualified name mod:func to invoke function, again familiar behaviour and naming to Python programmer.

Erlang allows to insert code from file as-is with -include attribute at point where -import is defined, this is used to include records and macro definitions, for example.

Comments in Erlang module begin with character “%“, continue up to end-of-line and may be placed anywhere except inside string and quoted atoms.
Like in Python Erlang has no multiline comments.
If you use Emacs you can easily comment whole region with M-; command after you marked it.

-module(mymath).
-export([fact/1, print_double/1]).
-import(lists, [foreach/2]).
-include(“my_records.hrl”).

%% sum(L) ->
%%     sum(L, 0).

%% sum([H|T], Acc) ->
%%     sum(T, H+Acc);
%% sum([], Acc) -> Acc.

fact(0) -> 1;
fact(N) -> N * fact(N-1).
print_double(L) ->
    foreach(fun(X) -> io:format(“Double of ~p = ~p~n”, [X, X*2]) end, L).

Fin.

Next tutorial will be devoted to thorough exploration of functions in Erlang.


How to say big numbers in English: Common Lisp

September 12, 2007

When i was describing notations and representation of numbers in Erlang(and Python) i forgot to mention very cool feature of format function in Common Lisp, with which i got acquainted exploring highly-recommended Practical Common Lisp book.

This feature is ~R directive which knows how to say really big numbers in English words.

Here is the example (I use SBCL + SLIME):

CL-USER> (format nil “~R” 1604872898756737459538)
“one sextillion six hundred four quintillion eight hundred seventy-two
quadrillion eight hundred ninety-eight trillion seven hundred fifty-six
billion seven hundred thirty-seven million four hundred fifty-nine
thousand five hundred thirty-eight”

Erlang for Python programmers: Part I

September 9, 2007

Let’s skim over data types in Erlang today. Check previous tutorial for introduction.

Numbers

In Erlang there are two types of numeric literals: integers and floats.
In Python there are four of them: plain integers(usually called just integers), long integers, floating point numbers, and imaginary numbers.

In addition to conventional notation Erlang has its own specific notations:
1) $char
Gives ASCII value of char
2) base#value
Produces integer. Base is in range [2,36]

1> $A.
65
2> 2#111.
7
3> 16#1A.
26
4> 5.
5
5> 2.55.
2.55000

In Python we can define octinteger and hexinteger correspondingly with:

>>> 017, 0×1A
(15, 26)

To produce integer from different bases in Python we may use built-in int([x[, radix]]), where radix is in range [2,36]:

>>> int(”111″, 2), int(”1A”, 16)
(7, 26)

Atom

An atom is a literal or a constant with name. Atoms should start with small letter or it should be enclosed in single quote(’) if it contains other characters than alphanumeric characters, _, or @.
Atoms can not have values like variables, they are simply names (constant with name).

1> erlang.
erlang
2> ‘hello world’.
‘hello world’
3> ‘Erlang’.
‘Erlang’

In Python there are also atoms in form of identifiers or literals, but no correspondence with Erlang’s constant with name type of atom.

More examples with atoms to come when we get acquainted with other data types
and parts of Erlang.

Binary

From a reference manual:
“A binary is used to store an area of untyped memory.
Binaries are expressed using the bit syntax.”

This data type allows to store large raw chunks of memory in an efficient way, much more space-efficient than in lists or tuples.

Binaries are written in double less-than and greater-than brackets and printed in that form as well.
They can be constructed from set of constants or string literal:

1> <<1,2,3>>.
<<1,2,3>>
2> <<”erlang”>>.
<<”erlang”>>

Or from bound variables:

3> A = 1, B = 2, C = 3.
3
4> Bin = <<A, B, C>>.
<<1,2,3>>

Remember pattern matching ?
Binary (Bin for short) can also be used for matching:

5> <<X, Y, Z>> = Bin.
<<1,2,3>>
6> X.
1
7> Y.
2
8> Z.
3

There is more than that, namely bit syntax which allows to pack/unpack sequences of bits in Bin. Actually bit syntax is an extension to pattern matching.
Let’s pack three variables into 16 bit memory area in a variable Mem, then unpack it to another three variables using bit syntax:

1> X = 2.
2
2> Y = 40.
40
3> Z = 30.
30
4> Mem = <<X:3, Y:7, Z:6>>.
<<74,30>>
5> <<X1:3, Y1:7, Z1:6>> = Mem.
<<74,30>>
6> X1.
2
7> Y1.
40
8> Z1.
30

In foregoing output you see that after packing shell prints packed three variables as <<74,30>> which is 16 bits.

In Python 2.5 there is no built-in data type to support binary data, but in Python 3000 binary data is represented by a separate mutable “bytes” data type.

Fun

Fun is an anonymous function, ie without name.

In Python this is infamous lambda, which had many debates whether it should remain in language or not. Personally i like lambda and glad it survived and will continue its life in Python 3000.

But Ok, back to Erlang’s Fun. Here how we can define it in shell:

1> Double = fun(X) -> X * 2 end.
#Fun<erl_eval.6.56006484>
2> Double(4).
8

In contrast to Python Erlang’s Fun can contain statements and is “more advanced” in general:

1> Print_val = fun(X) -> io:format(”X = ~p~n”, [X]), X*2 end.
#Fun<erl_eval.6.56006484>
2> Print_val(3).
X = 3
6
>>> print_val = lambda x: print ‘x = %d’ % x; x*2
  File “<stdin>”, line 1
    print_val = lambda x: print ‘x = %d’ % x; x*2
                              ^
SyntaxError: invalid syntax
>>> print_val = lambda x: x*2
>>> print_val(3)
6

More advanced examples of Fun like Funs with several different clauses i’ll show when discussing in details functions in Erlang in upcoming tutorials.

Tuple

It’s a compound data type with fixed number of terms (Term is a piece of data of any data type). Number of elements in tuple is called size of tuple.
Syntax of tuples is:

1> Tuple = {ruslan, 28}.
{ruslan,28}
2> Info = {data, Tuple}.
{data,{ruslan,28}}

In Python tuples are defined with or without enclosing parentheses:

>>> tup = “ruslan”, 28
>>> tup
(’ruslan’, 28)
>>> info = (’data’, tup)
>>> info
(‘data’, (’ruslan’, 28))

To get number of elements in tuple.

Erlang:

3> size(Info).
2

Python:

>>> len(info)
2

List

List is a compound data type with variable number of terms. Number of elements in a list is called length of the list.

List is declared in square brackets like in Python.

1> List = [1, 2, 3, "erlang"].
[1,2,3,"erlang"]

Now Erlang specific part, which may be familiar to you if you know a bit Lisp and its CAR/CDR stuff:
List is either an empty list [] or may be represented as head (first element - CAR in Lisp) and tail (remainder of list - CDR in Lisp) which is also a list and often you can see it in form like [H|T].
This representation is recursive:
[]
[c | []]
[b | [c | []]]
[a | [b | [c | []]]]
are all lists actually.

Let’s define list in shell:

1> List = [a, 1, {b, 2}].
[a,1,{b,2}]

and unpack it with pattern matching to head and tail:

2> [H|T] = List.
[a,1,{b,2}]
3> H.
a
4> T.
[1,{b,2}]

As you see T(tail) above is also a list.

We can also construct list using head and tail syntax:

5> List2 = [c|T].
[c,1,{b,2}]

Working with Erlang you’ll see/use this syntax quite often.

To get number of elements in list.

Erlang:

6> length(List2).
3

Python:

>>> len([1, 2, 3, "python"])
4

String

In Python string type is immutable sequence of characters. In Python there is no separate character type, a character is represented by a string of one item. String literals are written in single or double quotes:

>>> ‘xyz’, “xyz”
(‘xyz’, ‘xyz’)

In Erlang strings are enclosed in double quotes, but unlike Python string is not separate data type actually. String “erlang” is just shorthand for list [$e, $r, $l, $a, $n, $g].

Adjacent string literals are concatenated at compile time:

1> “hello” “, world” ” in ” “erlang”.
“hello, world in erlang”

In Python behaviour with adjacent strings is the same:

>>> ‘hello’ ‘, world’ ‘ in ‘ ‘python’
‘hello, world in python’

Boolean

Again unlike Python Erlang has no built-in Boolean type, but it uses atoms true and false to denote Boolean values:

1> 1 < 2.
true
2> true or false.
true

In Python Boolean is separate data type:

>>> type(True), type(False)
(<type ‘bool’>, <type ‘bool’>)
>>> 1 < 2
True
>>> True or False
True

Record

Record is a data structure to hold fixed number of elements and it provides method to associate a name with particular element in tuple. It resembles structure in C. Record is not true data type, it is “tuple in disguise”(TM).
It’s syntax is:
-record(Name, {key1=Default1, key2, …}).

But in shell we can not use -record syntax and we do not yet know about modules where it can be defined, so we’ll use ‘rd’ shell command to define record:

1> rd(info, {name=”ruslan”, age=28, height, weight}).
info

Now let’s create instance of record:

2> X = #info{}.
#info{name = “ruslan”,age = 28,height = undefined,weight = undefined}
3> X.
#info{name = “ruslan”,age = 28,height = undefined,weight = undefined}

As you see height and weight got “undefined” value.

To extract fields of record we use pattern matching:

4> #info{name=Name, age=Age} = X.
#info{name = “ruslan”,age = 28,height = undefined,weight = undefined}
5> Name.
“ruslan”
6> Age.
28

We can also access field of record with “dot syntax”:

7> X#info.name.
“ruslan”

Erlang has more data types like Pid, Port identifier, Reference which will be described later in more advanced topics.There is no direct correspondence to Python’s built-in data types like dictionaries and sets, but Erlang has modules which provide the same semantics.

That’s it for today.


Erlang for Python programmers: Intro

August 27, 2007

I assume you have already installed Erlang on your system and are able to run erlang shell either from command line or from within Emacs (C-c C-z).

This is a view of Eshell from my Emacs:

Erlang (BEAM) emulator version 5.5.2 [source] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.5.2  (abort with ^G)
1> 

Let’s do a simple thing and try to assign a value to a variable in the Eshell. First you should know that variables in Erlang start with uppercase letter or underscore (_), for example:

  • X
  • Name
  • Address
  • _
  • _NoWarn

Ok, let’s finally assign some value to a variable X:

1> X = 4.
4
2>

So far, so good. Make sure to type final “.” at the end of expression when you are done entering code.

Let’s assign new value to X:

2> X = 5.

=ERROR REPORT==== 27-Aug-2007::00:23:02 ===
Error in process <0.29.0> with exit value: {{badmatch,5},[{erl_eval,expr,3}]}

** exited: {{badmatch,5},[{erl_eval,expr,3}]} **
3> 

Here you need to understand and remember couple of things:

  1. Erlang uses single assignment, a variable can only be bound once, bar none.
  2. In expression X = 5 operator = is not an assignment, but match operator and variables are bound to values using pattern matching mechanism.

I think it won’t be an exaggeration if I say that Pattern Matching permeates Erlang. To understand why we got ERROR REPORT in above example with {badmatch, 5} let’s dive into this pattern matching a bit.

When you use pattern matching’s = operator left side called pattern is matched against right side which is a term(piece of data of any Erlang’s data type: integer, float, list, tuple, etc). If your matching fails you get run-time error, if it succeeds then any unbound variable in lefthand pattern becomes bound.

Returning to our example - first we had an unbound variable:

1> X.
** 1: variable ‘X’ is unbound **
2> 

Then we bound value to variable X with pattern matching:

2> X = 4.
4
3> X.
4
4> 

Now as the variable X is bound we can’t change its value any more (remember single assignment?) otherwise we’ll get badmatch error like in foregoing X = 5 (4 does not match 5), but we may use = to match our variable:

4> X = 4.
4
5> 

Let’s add a bit of tuples to pattern matching. In Erlang tuple is defined inside {…}. ie:

5> {2, 3}.
{2,3}
6> 

and now pattern matching:

6> {X, Y} = {2, 3}.

=ERROR REPORT==== 27-Aug-2007::01:10:45 ===
Error in process <0.29.0> with exit value: {{badmatch,{2,3}},[{erl_eval,expr,3}]}

** exited: {{badmatch,{2,3}},[{erl_eval,expr,3}]} **
7> {X, Y} = {4, 3}.
{4,3}
8> X.
4
9> Y.
3
10> 

In first attempt we tried to match X with 2 and the matching failed as the value of X is 4, in a second attempt the matching succeeded and unbound variable Y also got its value (became bound).

This may look to you like tuple unpacking in Python:

>>> x, y = (4, 3)
>>> x, y
(4, 3)
>>> x, y = (5, 3)
>>> x, y
(5, 3)

but it’s only visual similarity, though i must admit, useful one.

That’s it for today, later you’ll see how pattern matching is uber-useful and how it is used in different parts of Erlang’s constructs.


Htmlize your Erlang code buffer

August 18, 2007

Recently I’ve begun to make blog posts and now want to have highlighting of Erlang’s code(actually any code , be it Python, Lisp, etc) in my posts. I’m using The One True Editor and it goes without saying that i want to convert my erlang code into html from within emacs to have corresponding syntax coloring. Htmlize package to the rescue!

After you downloaded package and put

(add-to-list 'load-path “path_to_your_htmlize.el”)
(requirehtmlize)

into your dot emacs - you are ready to use it.

M-x htmlize-buffer or M-x htmlize-region will convert whole buffer or selected region with associated decorations into HTML( See C-h a htmlize RET for more functions). By default htmlize will make HTML output in css mode which is not very suitable for inserting into blog post as output includes also style sheets. Excerpt from htmlize documentation:

;; htmlize supports three types of HTML output, selected by setting
;; `htmlize-output-type‘: `css‘, `inline-css‘, and `font‘.  In `css;; mode, htmlize uses cascading style sheets to specify colors; it
;; generates classes that correspond to Emacs faces and uses <span
;; class=FACE>…</span> to color parts of text.  In this mode, the
;; produced HTML is valid under the 4.01 strict DTD, as confirmed by
;; the W3C validator.  `inline-css‘ is like `css‘, except the CSS is
;; put directly in the STYLE attribute of the SPAN element, making it
;; possible to paste the generated HTML to other documents.  In `font;; mode, htmlize uses <font color=”…”>…</font> to colorize HTML,
;; which is not standard-compliant, but works better in older
;; browsers.  `css‘ mode is the default.

inline-css is the type of output i want. This can be achieved by customizing htmlize-output-type variable with M-x customize htmlize RET or directly modifying dot emacs. Converting:

(add-to-list 'load-path “path_to_your_htmlize.el”)
(requirehtmlize)

with htmlize-region and then applying nxml-mode to produced result gives:

<!DOCTYPE html PUBLIC -//W3C//DTD HTML 4.01//EN>
<!– Created by htmlize-1.34 in inline-css mode. –>
<html>
  <head>
    <title>.emacs</title>
  </head>
  <body style=color: #ffffff; background-color: #000000;>
    <pre>
(add-to-list ‘load-path <span style=color: #ffa07a;>“path_to_your_htmlize.el”</span>)
(<span style=color: #00ffff;>require</span><span style=color: #7fffd4;>htmlize</span>)
</pre>
  </body>
</html>

As you see I need to cut contents between <pre> </pre> including
<pre> tags itself to insert that then directly into blog post. After that i need manually copy/paste styles from <body> tag into <pre> tag and also add font-size: 8pt to meet my taste. Being lazy i’ve implemented function my-htmlize-region in elisp to make that job for me every time i press F5 key:

(defun my-htmlize-region (beg end)
  “Htmlize region and put into <pre> tag style that is left in <body> tag
plus add font-size: 8pt”
  (interactive “r”)
  (let* ((buffer-faces (htmlize-faces-in-buffer))
         (face-map (htmlize-make-face-map (adjoin ‘default buffer-faces)))
         (pre-tag (format
                   “<pre style=\”%s font-size: 8pt\”>”
                   (mapconcat #’identity (htmlize-css-specs
                                          (gethash ‘default face-map)) ” “)))
         (htmlized-reg (htmlize-region-for-paste beg end)))
    (switch-to-buffer-other-window “*htmlized output*”)
    ; clear buffer
    (kill-region (point-min) (point-max))
    ; set mode to have syntax highlighting
    (nxml-mode)
    (save-excursion
      (insert htmlized-reg))
    (while (re-search-forward “<pre>” nil t)
      (replace-match pre-tag nil nil))
    (goto-char (point-min))))

(global-set-key [(f5)] (lambda (beg end)
                         (interactive “r”) (my-htmlize-region beg end)))

So now selecting:

(add-to-list 'load-path “path_to_your_htmlize.el”)
(requirehtmlize)

and pressing F5 i get this in another buffer:

<pre style=color: #ffffff; background-color: #000000; font-size: 8pt>
(add-to-list ‘load-path <span style=color: #ffa07a;>“path_to_your_htmlize.el”</span>)
(<span style=color: #00ffff;>require</span><span style=color: #7fffd4;>htmlize</span>)
</pre>

As you see it’s just what i want:
<pre> tags with content and <pre> tag contains style from above mentioned <body> tag plus font-size is added too.

Factorial in Erlang with syntax coloring as i have it in my Emacs:

-module(factorial).
-export([factorial/1, factorial_rec/1]).

%% tail recursion
factorial(Num) ->
    factorial(Num, 1).

factorial(1, Acc) -> Acc;
factorial(Num, Acc) ->
    factorial(Num-1, Num * Acc).

%% recursion
factorial_rec(1) -> 1;
factorial_rec(Num) ->
    Num * factorial_rec(Num-1).

This is a joy of using Emacs.