## Let’s Build A Simple Interpreter. Part 4.

Date

Have you been passively learning the material in these articles or have you been actively practicing it? I hope you’ve been actively practicing it. I really do :)

Remember what Confucius said?

I hear and I forget.”

I see and I remember.”

I do and I understand.”

In the previous article you learned how to parse (recognize) and interpret arithmetic expressions with any number of plus or minus operators in them, for example “7 - 3 + 2 - 1”. You also learned about syntax diagrams and how they can be used to specify the syntax of a programming language.

Today you’re going to learn how to parse and interpret arithmetic expressions with any number of multiplication and division operators in them, for example “7 * 4 / 2 * 3”. The division in this article will be an integer division, so if the expression is “9 / 4”, then the answer will be an integer: 2.

I will also talk quite a bit today about another widely used notation for specifying the syntax of a programming language. It’s called context-free grammars (grammars, for short) or BNF (Backus-Naur Form). For the purpose of this article I will not use pure BNF notation but more like a modified EBNF notation.

Here are a couple of reasons to use grammars:

1. A grammar specifies the syntax of a programming language in a concise manner. Unlike syntax diagrams, grammars are very compact. You will see me using grammars more and more in future articles.
2. A grammar can serve as great documentation.
3. A grammar is a good starting point even if you manually write your parser from scratch. Quite often you can just convert the grammar to code by following a set of simple rules.
4. There is a set of tools, called parser generators, which accept a grammar as an input and automatically generate a parser for you based on that grammar. I will talk about those tools later on in the series.

Now, let’s talk about the mechanical aspects of grammars, shall we?

Here is a grammar that describes arithmetic expressions like “7 * 4 / 2 * 3” (it’s just one of the many expressions that can be generated by the grammar):

A grammar consists of a sequence of rules, also known as productions. There are two rules in our grammar:

A rule consists of a non-terminal, called the head or left-hand side of the production, a colon, and a sequence of terminals and/or non-terminals, called the body or right-hand side of the production:

In the grammar I showed above, tokens like MUL, DIV, and INTEGER are called terminals and variables like expr and factor are called non-terminals. Non-terminals usually consist of a sequence of terminals and/or non-terminals:

The non-terminal symbol on the left side of the first rule is called the start symbol. In the case of our grammar, the start symbol is expr:

You can read the rule expr as “An expr can be a factor optionally followed by a multiplication or division operator followed by another factor, which in turn is optionally followed by a multiplication or division operator followed by another factor and so on and so forth.”

What is a factor? For the purpose of this article a factor is just an integer.

Let’s quickly go over the symbols used in the grammar and their meaning.

• | - Alternatives. A bar means “or”. So (MUL | DIV) means either MUL or DIV.
• ( … ) - An open and closing parentheses mean grouping of terminals and/or non-terminals as in (MUL | DIV).
• ( … )* - Match contents within the group zero or more times.

If you worked with regular expressions in the past, then the symbols |, (), and (…)* should be pretty familiar to you.

A grammar defines a language by explaining what sentences it can form. This is how you can derive an arithmetic expression using the grammar: first you begin with the start symbol expr and then repeatedly replace a non-terminal by the body of a rule for that non-terminal until you have generated a sentence consisting solely of terminals. Those sentences form a language defined by the grammar.

If the grammar cannot derive a certain arithmetic expression, then it doesn’t support that expression and the parser will generate a syntax error when it tries to recognize the expression.

I think a couple of examples are in order. This is how the grammar derives the expression 3:

This is how the grammar derives the expression 3 * 7:

And this is how the grammar derives the expression 3 * 7 / 2:

Whoa, quite a bit of theory right there!

I think when I first read about grammars, the related terminology, and all that jazz, I felt something like this:

I can assure you that I definitely was not like this:

It took me some time to get comfortable with the notation, how it works, and its relationship with parsers and lexers, but I have to tell you that it pays to learn it in the long run because it’s so widely used in practice and compiler literature that you’re bound to run into it at some point. So, why not sooner rather than later? :)

Now, let’s map that grammar to code, okay?

Here are the guidelines that we will use to convert the grammar to source code. By following them, you can literally translate the grammar to a working parser:

1. Each rule, R, defined in the grammar, becomes a method with the same name, and references to that rule become a method call: R(). The body of the method follows the flow of the body of the rule using the very same guidelines.
2. Alternatives (a1 | a2 | aN) become an if-elif-else statement
3. An optional grouping (…)* becomes a while statement that can loop over zero or more times
4. Each token reference T becomes a call to the method eat: eat(T). The way the eat method works is that it consumes the token T if it matches the current lookahead token, then it gets a new token from the lexer and assigns that token to the current_token internal variable.

Visually the guidelines look like this:

Let’s get moving and convert our grammar to code following the above guidelines.

There are two rules in our grammar: one expr rule and one factor rule. Let’s start with the factor rule (production). According to the guidelines, you need to create a method called factor (guideline 1) that has a single call to the eat method to consume the INTEGER token (guideline 4):

```def factor(self):
self.eat(INTEGER)
```

That was easy, wasn’t it?

Onward!

The rule expr becomes the expr method (again according to the guideline 1). The body of the rule starts with a reference to factor that becomes a factor() method call. The optional grouping (…)* becomes a while loop and (MUL | DIV) alternatives become an if-elif-else statement. By combining those pieces together we get the following expr method:

```def expr(self):
self.factor()

while self.current_token.type in (MUL, DIV):
token = self.current_token
if token.type == MUL:
self.eat(MUL)
self.factor()
elif token.type == DIV:
self.eat(DIV)
self.factor()
```

Please spend some time and study how I mapped the grammar to the source code. Make sure you understand that part because it’ll come in handy later on.

For your convenience I put the above code into the parser.py file that contains a lexer and a parser without an interpreter. You can download the file directly from GitHub and play with it. It has an interactive prompt where you can enter expressions and see if they are valid: that is, if the parser built according to the grammar can recognize the expressions.

Here is a sample session that I ran on my computer:

```\$ python parser.py
calc> 3
calc> 3 * 7
calc> 3 * 7 / 2
calc> 3 *
Traceback (most recent call last):
File "parser.py", line 155, in <module>
main()
File "parser.py", line 151, in main
parser.parse()
File "parser.py", line 136, in parse
self.expr()
File "parser.py", line 130, in expr
self.factor()
File "parser.py", line 114, in factor
self.eat(INTEGER)
File "parser.py", line 107, in eat
self.error()
File "parser.py", line 97, in error
raise Exception('Invalid syntax')
Exception: Invalid syntax
```

Try it out!

I couldn’t help but mention syntax diagrams again. This is how a syntax diagram for the same expr rule will look:

It’s about time we dug into the source code of our new arithmetic expression interpreter. Below is the code of a calculator that can handle valid arithmetic expressions containing integers and any number of multiplication and division (integer division) operators. You can also see that I refactored the lexical analyzer into a separate class Lexer and updated the Interpreter class to take the Lexer instance as a parameter:

```# Token types
#
# EOF (end-of-file) token is used to indicate that
# there is no more input left for lexical analysis
INTEGER, MUL, DIV, EOF = 'INTEGER', 'MUL', 'DIV', 'EOF'

class Token(object):
def __init__(self, type, value):
# token type: INTEGER, MUL, DIV, or EOF
self.type = type
# token value: non-negative integer value, '*', '/', or None
self.value = value

def __str__(self):
"""String representation of the class instance.

Examples:
Token(INTEGER, 3)
Token(MUL, '*')
"""
return 'Token({type}, {value})'.format(
type=self.type,
value=repr(self.value)
)

def __repr__(self):
return self.__str__()

class Lexer(object):
def __init__(self, text):
# client string input, e.g. "3 * 5", "12 / 3 * 4", etc
self.text = text
# self.pos is an index into self.text
self.pos = 0
self.current_char = self.text[self.pos]

def error(self):
raise Exception('Invalid character')

"""Advance the `pos` pointer and set the `current_char` variable."""
self.pos += 1
if self.pos > len(self.text) - 1:
self.current_char = None  # Indicates end of input
else:
self.current_char = self.text[self.pos]

def skip_whitespace(self):
while self.current_char is not None and self.current_char.isspace():

def integer(self):
"""Return a (multidigit) integer consumed from the input."""
result = ''
while self.current_char is not None and self.current_char.isdigit():
result += self.current_char
return int(result)

def get_next_token(self):
"""Lexical analyzer (also known as scanner or tokenizer)

This method is responsible for breaking a sentence
apart into tokens. One token at a time.
"""
while self.current_char is not None:

if self.current_char.isspace():
self.skip_whitespace()
continue

if self.current_char.isdigit():

if self.current_char == '*':

if self.current_char == '/':

self.error()

class Interpreter(object):
def __init__(self, lexer):
self.lexer = lexer
# set current token to the first token taken from the input
self.current_token = self.lexer.get_next_token()

def error(self):
raise Exception('Invalid syntax')

def eat(self, token_type):
# compare the current token type with the passed token
# type and if they match then "eat" the current token
# and assign the next token to the self.current_token,
# otherwise raise an exception.
if self.current_token.type == token_type:
self.current_token = self.lexer.get_next_token()
else:
self.error()

def factor(self):
"""Return an INTEGER token value.

factor : INTEGER
"""
token = self.current_token
self.eat(INTEGER)

def expr(self):
"""Arithmetic expression parser / interpreter.

expr   : factor ((MUL | DIV) factor)*
factor : INTEGER
"""
result = self.factor()

while self.current_token.type in (MUL, DIV):
token = self.current_token
if token.type == MUL:
self.eat(MUL)
result = result * self.factor()
elif token.type == DIV:
self.eat(DIV)
result = result / self.factor()

return result

def main():
while True:
try:
# To run under Python3 replace 'raw_input' call
# with 'input'
text = raw_input('calc> ')
except EOFError:
break
if not text:
continue
lexer = Lexer(text)
interpreter = Interpreter(lexer)
result = interpreter.expr()
print(result)

if __name__ == '__main__':
main()
```

Save the above code into the calc4.py file or download it directly from GitHub. As usual, try it out and see for yourself that it works.

This is a sample session that I ran on my laptop:

```\$ python calc4.py
calc> 7 * 4 / 2
14
calc> 7 * 4 / 2 * 3
42
calc> 10 * 4  * 2 * 3 / 8
30
```

I know you couldn’t wait for this part :) Here are new exercises for today:

• Write a grammar that describes arithmetic expressions containing any number of +, -, *, or / operators. With the grammar you should be able to derive expressions like “2 + 7 * 4”, “7 - 8 / 4”, “14 + 2 * 3 - 6 / 2”, and so on.
• Using the grammar, write an interpreter that can evaluate arithmetic expressions containing any number of +, -, *, or / operators. Your interpreter should be able to handle expressions like “2 + 7 * 4”, “7 - 8 / 4”, “14 + 2 * 3 - 6 / 2”, and so on.
• If you’ve finished the above exercises, relax and enjoy :)

Check your understanding.

Keeping in mind the grammar from today’s article, answer the following questions, referring to the picture below as needed:

1. What is a context-free grammar (grammar)?
2. How many rules / productions does the grammar have?
3. What is a terminal? (Identify all terminals on the picture)
4. What is a non-terminal? (Identify all non-terminals on the picture)
5. What is a head of a rule? (Identify all heads / left-hand sides on the picture)
6. What is a body of the rule? (Identify all bodies / right-hand sides on the picture)
7. What is the start symbol of a grammar?

Hey, you read all the way to the end! This post contained quite a bit of theory, so I’m really proud of you that you finished it.

I’ll be back next time with a new article - stay tuned and don’t forget to do the exercises, they will do you good.

Here is a list of books I recommend that will help you in your study of interpreters and compilers:

By the way, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch. You can get a feel for the book here, here, and here. Subscribe to the mailing list to get the latest updates about the book and the release date.

All articles in this series:

## Let’s Build A Simple Interpreter. Part 3.

Date

I woke up this morning and I thought to myself: “Why do we find it so difficult to learn a new skill?”

I don’t think it’s just because of the hard work. I think that one of the reasons might be that we spend a lot of time and hard work acquiring knowledge by reading and watching and not enough time translating that knowledge into a skill by practicing it. Take swimming, for example. You can spend a lot of time reading hundreds of books about swimming, talk for hours with experienced swimmers and coaches, watch all the training videos available, and you still will sink like a rock the first time you jump in the pool.

The bottom line is: it doesn’t matter how well you think you know the subject - you have to put that knowledge into practice to turn it into a skill. To help you with the practice part I put exercises into Part 1 and Part 2 of the series. And yes, you will see more exercises in today’s article and in future articles, I promise :)

Okay, let’s get started with today’s material, shall we?

So far, you’ve learned how to interpret arithmetic expressions that add or subtract two integers like “7 + 3” or “12 - 9”. Today I’m going to talk about how to parse (recognize) and interpret arithmetic expressions that have any number of plus or minus operators in it, for example “7 - 3 + 2 - 1”.

Graphically, the arithmetic expressions in this article can be represented with the following syntax diagram:

What is a syntax diagram? A syntax diagram is a graphical representation of a programming language’s syntax rules. Basically, a syntax diagram visually shows you which statements are allowed in your programming language and which are not.

Syntax diagrams are pretty easy to read: just follow the paths indicated by the arrows. Some paths indicate choices. And some paths indicate loops.

You can read the above syntax diagram as following: a term optionally followed by a plus or minus sign, followed by another term, which in turn is optionally followed by a plus or minus sign followed by another term and so on. You get the picture, literally. You might wonder what a “term” is. For the purpose of this article a “term” is just an integer.

Syntax diagrams serve two main purposes:

• They graphically represent the specification (grammar) of a programming language.
• They can be used to help you write your parser - you can map a diagram to code by following simple rules.

You’ve learned that the process of recognizing a phrase in the stream of tokens is called parsing. And the part of an interpreter or compiler that performs that job is called a parser. Parsing is also called syntax analysis, and the parser is also aptly called, you guessed it right, a syntax analyzer.

According to the syntax diagram above, all of the following arithmetic expressions are valid:

• 3
• 3 + 4
• 7 - 3 + 2 - 1

Because syntax rules for arithmetic expressions in different programming languages are very similar we can use a Python shell to “test” our syntax diagram. Launch your Python shell and see for yourself:

```>>> 3
3
>>> 3 + 4
7
>>> 7 - 3 + 2 - 1
5
```

No surprises here.

The expression “3 + ” is not a valid arithmetic expression though because according to the syntax diagram the plus sign must be followed by a term (integer), otherwise it’s a syntax error. Again, try it with a Python shell and see for yourself:

```>>> 3 +
File "<stdin>", line 1
3 +
^
SyntaxError: invalid syntax
```

It’s great to be able to use a Python shell to do some testing but let’s map the above syntax diagram to code and use our own interpreter for testing, all right?

You know from the previous articles (Part 1 and Part 2) that the expr method is where both our parser and interpreter live. Again, the parser just recognizes the structure making sure that it corresponds to some specifications and the interpreter actually evaluates the expression once the parser has successfully recognized (parsed) it.

The following code snippet shows the parser code corresponding to the diagram. The rectangular box from the syntax diagram (term) becomes a term method that parses an integer and the expr method just follows the syntax diagram flow:

```def term(self):
self.eat(INTEGER)

def expr(self):
# set current token to the first token taken from the input
self.current_token = self.get_next_token()

self.term()
while self.current_token.type in (PLUS, MINUS):
token = self.current_token
if token.type == PLUS:
self.eat(PLUS)
self.term()
elif token.type == MINUS:
self.eat(MINUS)
self.term()
```

You can see that expr first calls the term method. Then the expr method has a while loop which can execute zero or more times. And inside the loop the parser makes a choice based on the token (whether it’s a plus or minus sign). Spend some time proving to yourself that the code above does indeed follow the syntax diagram flow for arithmetic expressions.

The parser itself does not interpret anything though: if it recognizes an expression it’s silent and if it doesn’t, it throws out a syntax error. Let’s modify the expr method and add the interpreter code:

```def term(self):
"""Return an INTEGER token value"""
token = self.current_token
self.eat(INTEGER)

def expr(self):
"""Parser / Interpreter """
# set current token to the first token taken from the input
self.current_token = self.get_next_token()

result = self.term()
while self.current_token.type in (PLUS, MINUS):
token = self.current_token
if token.type == PLUS:
self.eat(PLUS)
result = result + self.term()
elif token.type == MINUS:
self.eat(MINUS)
result = result - self.term()

return result
```

Because the interpreter needs to evaluate an expression the term method was modified to return an integer value and the expr method was modified to perform addition and subtraction at the appropriate places and return the result of interpretation. Even though the code is pretty straightforward I recommend spending some time studying it.

Le’s get moving and see the complete code of the interpreter now, okay?

Here is the source code for your new version of the calculator that can handle valid arithmetic expressions containing integers and any number of addition and subtraction operators:

```# Token types
#
# EOF (end-of-file) token is used to indicate that
# there is no more input left for lexical analysis
INTEGER, PLUS, MINUS, EOF = 'INTEGER', 'PLUS', 'MINUS', 'EOF'

class Token(object):
def __init__(self, type, value):
# token type: INTEGER, PLUS, MINUS, or EOF
self.type = type
# token value: non-negative integer value, '+', '-', or None
self.value = value

def __str__(self):
"""String representation of the class instance.

Examples:
Token(INTEGER, 3)
Token(PLUS, '+')
"""
return 'Token({type}, {value})'.format(
type=self.type,
value=repr(self.value)
)

def __repr__(self):
return self.__str__()

class Interpreter(object):
def __init__(self, text):
# client string input, e.g. "3 + 5", "12 - 5 + 3", etc
self.text = text
# self.pos is an index into self.text
self.pos = 0
# current token instance
self.current_token = None
self.current_char = self.text[self.pos]

##########################################################
# Lexer code                                             #
##########################################################
def error(self):
raise Exception('Invalid syntax')

"""Advance the `pos` pointer and set the `current_char` variable."""
self.pos += 1
if self.pos > len(self.text) - 1:
self.current_char = None  # Indicates end of input
else:
self.current_char = self.text[self.pos]

def skip_whitespace(self):
while self.current_char is not None and self.current_char.isspace():

def integer(self):
"""Return a (multidigit) integer consumed from the input."""
result = ''
while self.current_char is not None and self.current_char.isdigit():
result += self.current_char
return int(result)

def get_next_token(self):
"""Lexical analyzer (also known as scanner or tokenizer)

This method is responsible for breaking a sentence
apart into tokens. One token at a time.
"""
while self.current_char is not None:

if self.current_char.isspace():
self.skip_whitespace()
continue

if self.current_char.isdigit():

if self.current_char == '+':

if self.current_char == '-':

self.error()

##########################################################
# Parser / Interpreter code                              #
##########################################################
def eat(self, token_type):
# compare the current token type with the passed token
# type and if they match then "eat" the current token
# and assign the next token to the self.current_token,
# otherwise raise an exception.
if self.current_token.type == token_type:
self.current_token = self.get_next_token()
else:
self.error()

def term(self):
"""Return an INTEGER token value."""
token = self.current_token
self.eat(INTEGER)

def expr(self):
"""Arithmetic expression parser / interpreter."""
# set current token to the first token taken from the input
self.current_token = self.get_next_token()

result = self.term()
while self.current_token.type in (PLUS, MINUS):
token = self.current_token
if token.type == PLUS:
self.eat(PLUS)
result = result + self.term()
elif token.type == MINUS:
self.eat(MINUS)
result = result - self.term()

return result

def main():
while True:
try:
# To run under Python3 replace 'raw_input' call
# with 'input'
text = raw_input('calc> ')
except EOFError:
break
if not text:
continue
interpreter = Interpreter(text)
result = interpreter.expr()
print(result)

if __name__ == '__main__':
main()
```

Save the above code into the calc3.py file or download it directly from GitHub. Try it out. See for yourself that it can handle arithmetic expressions that you can derive from the syntax diagram I showed you earlier.

Here is a sample session that I ran on my laptop:

```\$ python calc3.py
calc> 3
3
calc> 7 - 4
3
calc> 10 + 5
15
calc> 7 - 3 + 2 - 1
5
calc> 10 + 1 + 2 - 3 + 4 + 6 - 15
5
calc> 3 +
Traceback (most recent call last):
File "calc3.py", line 147, in <module>
main()
File "calc3.py", line 142, in main
result = interpreter.expr()
File "calc3.py", line 123, in expr
result = result + self.term()
File "calc3.py", line 110, in term
self.eat(INTEGER)
File "calc3.py", line 105, in eat
self.error()
File "calc3.py", line 45, in error
raise Exception('Invalid syntax')
Exception: Invalid syntax
```

Remember those exercises I mentioned at the beginning of the article: here they are, as promised :)

• Draw a syntax diagram for arithmetic expressions that contain only multiplication and division, for example “7 * 4 / 2 * 3”. Seriously, just grab a pen or a pencil and try to draw one.
• Modify the source code of the calculator to interpret arithmetic expressions that contain only multiplication and division, for example “7 * 4 / 2 * 3”.
• Write an interpreter that handles arithmetic expressions like “7 - 3 + 2 - 1” from scratch. Use any programming language you’re comfortable with and write it off the top of your head without looking at the examples. When you do that, think about components involved: a lexer that takes an input and converts it into a stream of tokens, a parser that feeds off the stream of the tokens provided by the lexer and tries to recognize a structure in that stream, and an interpreter that generates results after the parser has successfully parsed (recognized) a valid arithmetic expression. String those pieces together. Spend some time translating the knowledge you’ve acquired into a working interpreter for arithmetic expressions.

Check your understanding.

1. What is a syntax diagram?
2. What is syntax analysis?
3. What is a syntax analyzer?

Hey, look! You read all the way to the end. Thanks for hanging out here today and don’t forget to do the exercises. :) I’ll be back next time with a new article - stay tuned.

Here is a list of books I recommend that will help you in your study of interpreters and compilers:

By the way, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch. You can get a feel for the book here, here, and here. Subscribe to the mailing list to get the latest updates about the book and the release date.

All articles in this series:

## Let’s Build A Simple Interpreter. Part 2.

Date

In their amazing book “The 5 Elements of Effective Thinking” the authors Burger and Starbird share a story about how they observed Tony Plog, an internationally acclaimed trumpet virtuoso, conduct a master class for accomplished trumpet players. The students first played complex music phrases, which they played perfectly well. But then they were asked to play very basic, simple notes. When they played the notes, the notes sounded childish compared to the previously played complex phrases. After they finished playing, the master teacher also played the same notes, but when he played them, they did not sound childish. The difference was stunning. Tony explained that mastering the performance of simple notes allows one to play complex pieces with greater control. The lesson was clear - to build true virtuosity one must focus on mastering simple, basic ideas.1

The lesson in the story clearly applies not only to music but also to software development. The story is a good reminder to all of us to not lose sight of the importance of deep work on simple, basic ideas even if it sometimes feels like a step back. While it is important to be proficient with a tool or framework you use, it is also extremely important to know the principles behind them. As Ralph Waldo Emerson said:

If you learn only methods, you’ll be tied to your methods. But if you learn principles, you can devise your own methods.”

On that note, let’s dive into interpreters and compilers again.

Today I will show you a new version of the calculator from Part 1 that will be able to:

1. Handle whitespace characters anywhere in the input string
2. Consume multi-digit integers from the input
3. Subtract two integers (currently it can only add integers)

Here is the source code for your new version of the calculator that can do all of the above:

```# Token types
# EOF (end-of-file) token is used to indicate that
# there is no more input left for lexical analysis
INTEGER, PLUS, MINUS, EOF = 'INTEGER', 'PLUS', 'MINUS', 'EOF'

class Token(object):
def __init__(self, type, value):
# token type: INTEGER, PLUS, MINUS, or EOF
self.type = type
# token value: non-negative integer value, '+', '-', or None
self.value = value

def __str__(self):
"""String representation of the class instance.

Examples:
Token(INTEGER, 3)
Token(PLUS '+')
"""
return 'Token({type}, {value})'.format(
type=self.type,
value=repr(self.value)
)

def __repr__(self):
return self.__str__()

class Interpreter(object):
def __init__(self, text):
# client string input, e.g. "3 + 5", "12 - 5", etc
self.text = text
# self.pos is an index into self.text
self.pos = 0
# current token instance
self.current_token = None
self.current_char = self.text[self.pos]

def error(self):
raise Exception('Error parsing input')

"""Advance the 'pos' pointer and set the 'current_char' variable."""
self.pos += 1
if self.pos > len(self.text) - 1:
self.current_char = None  # Indicates end of input
else:
self.current_char = self.text[self.pos]

def skip_whitespace(self):
while self.current_char is not None and self.current_char.isspace():

def integer(self):
"""Return a (multidigit) integer consumed from the input."""
result = ''
while self.current_char is not None and self.current_char.isdigit():
result += self.current_char
return int(result)

def get_next_token(self):
"""Lexical analyzer (also known as scanner or tokenizer)

This method is responsible for breaking a sentence
apart into tokens.
"""
while self.current_char is not None:

if self.current_char.isspace():
self.skip_whitespace()
continue

if self.current_char.isdigit():

if self.current_char == '+':

if self.current_char == '-':

self.error()

def eat(self, token_type):
# compare the current token type with the passed token
# type and if they match then "eat" the current token
# and assign the next token to the self.current_token,
# otherwise raise an exception.
if self.current_token.type == token_type:
self.current_token = self.get_next_token()
else:
self.error()

def expr(self):
"""Parser / Interpreter

expr -> INTEGER PLUS INTEGER
expr -> INTEGER MINUS INTEGER
"""
# set current token to the first token taken from the input
self.current_token = self.get_next_token()

# we expect the current token to be an integer
left = self.current_token
self.eat(INTEGER)

# we expect the current token to be either a '+' or '-'
op = self.current_token
if op.type == PLUS:
self.eat(PLUS)
else:
self.eat(MINUS)

# we expect the current token to be an integer
right = self.current_token
self.eat(INTEGER)
# after the above call the self.current_token is set to
# EOF token

# at this point either the INTEGER PLUS INTEGER or
# the INTEGER MINUS INTEGER sequence of tokens
# has been successfully found and the method can just
# return the result of adding or subtracting two integers,
# thus effectively interpreting client input
if op.type == PLUS:
result = left.value + right.value
else:
result = left.value - right.value
return result

def main():
while True:
try:
# To run under Python3 replace 'raw_input' call
# with 'input'
text = raw_input('calc> ')
except EOFError:
break
if not text:
continue
interpreter = Interpreter(text)
result = interpreter.expr()
print(result)

if __name__ == '__main__':
main()
```

Save the above code into the calc2.py file or download it directly from GitHub. Try it out. See for yourself that it works as expected: it can handle whitespace characters anywhere in the input; it can accept multi-digit integers, and it can also subtract two integers as well as add two integers.

Here is a sample session that I ran on my laptop:

```\$ python calc2.py
calc> 27 + 3
30
calc> 27 - 7
20
calc>
```

The major code changes compared with the version from Part 1 are:

1. The get_next_token method was refactored a bit. The logic to increment the pos pointer was factored into a separate method advance.
2. Two more methods were added: skip_whitespace to ignore whitespace characters and integer to handle multi-digit integers in the input.
3. The expr method was modified to recognize INTEGER -> MINUS -> INTEGER phrase in addition to INTEGER -> PLUS -> INTEGER phrase. The method now also interprets both addition and subtraction after having successfully recognized the corresponding phrase.

In Part 1 you learned two important concepts, namely that of a token and a lexical analyzer. Today I would like to talk a little bit about lexemes, parsing, and parsers.

You already know about tokens. But in order for me to round out the discussion of tokens I need to mention lexemes. What is a lexeme? A lexeme is a sequence of characters that form a token. In the following picture you can see some examples of tokens and sample lexemes and hopefully it will make the relationship between them clear:

Now, remember our friend, the expr method? I said before that that’s where the interpretation of an arithmetic expression actually happens. But before you can interpret an expression you first need to recognize what kind of phrase it is, whether it is addition or subtraction, for example. That’s what the expr method essentially does: it finds the structure in the stream of tokens it gets from the get_next_token method and then it interprets the phrase that is has recognized, generating the result of the arithmetic expression.

The process of finding the structure in the stream of tokens, or put differently, the process of recognizing a phrase in the stream of tokens is called parsing. The part of an interpreter or compiler that performs that job is called a parser.

So now you know that the expr method is the part of your interpreter where both parsing and interpreting happens - the expr method first tries to recognize (parse) the INTEGER -> PLUS -> INTEGER or the INTEGER -> MINUS -> INTEGER phrase in the stream of tokens and after it has successfully recognized (parsed) one of those phrases, the method interprets it and returns the result of either addition or subtraction of two integers to the caller.

And now it’s time for exercises again.

1. Extend the calculator to handle multiplication of two integers
2. Extend the calculator to handle division of two integers
3. Modify the code to interpret expressions containing an arbitrary number of additions and subtractions, for example “9 - 5 + 3 + 11”

Check your understanding.

1. What is a lexeme?
2. What is the name of the process that finds the structure in the stream of tokens, or put differently, what is the name of the process that recognizes a certain phrase in that stream of tokens?
3. What is the name of the part of the interpreter (compiler) that does parsing?

I hope you liked today’s material. In the next article of the series you will extend your calculator to handle more complex arithmetic expressions. Stay tuned.

And here is a list of books I recommend that will help you in your study of interpreters and compilers:

BTW, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch. You can get a feel for the book here, here, and here. Subscribe to the mailing list to get the latest updates about the book and the release date.

All articles in this series:

## Let’s Build A Simple Interpreter. Part 1.

Date

If you don’t know how compilers work, then you don’t know how computers work. If you’re not 100% sure whether you know how compilers work, then you don’t know how they work.” — Steve Yegge

There you have it. Think about it. It doesn’t really matter whether you’re a newbie or a seasoned software developer: if you don’t know how compilers and interpreters work, then you don’t know how computers work. It’s that simple.

So, do you know how compilers and interpreters work? And I mean, are you 100% sure that you know how they work? If you don’t.

Or if you don’t and you’re really agitated about it.

Do not worry. If you stick around and work through the series and build an interpreter and a compiler with me you will know how they work in the end. And you will become a confident happy camper too. At least I hope so.

Why would you study interpreters and compilers? I will give you three reasons.

1. To write an interpreter or a compiler you have to have a lot of technical skills that you need to use together. Writing an interpreter or a compiler will help you improve those skills and become a better software developer. As well, the skills you will learn are useful in writing any software, not just interpreters or compilers.
2. You really want to know how computers work. Often interpreters and compilers look like magic. And you shouldn’t be comfortable with that magic. You want to demystify the process of building an interpreter and a compiler, understand how they work, and get in control of things.
3. You want to create your own programming language or domain specific language. If you create one, you will also need to create either an interpreter or a compiler for it. Recently, there has been a resurgence of interest in new programming languages. And you can see a new programming language pop up almost every day: Elixir, Go, Rust just to name a few.

Okay, but what are interpreters and compilers?

The goal of an interpreter or a compiler is to translate a source program in some high-level language into some other form. Pretty vague, isn’t it? Just bear with me, later in the series you will learn exactly what the source program is translated into.

At this point you may also wonder what the difference is between an interpreter and a compiler. For the purpose of this series, let’s agree that if a translator translates a source program into machine language, it is a compiler. If a translator processes and executes the source program without translating it into machine language first, it is an interpreter. Visually it looks something like this:

I hope that by now you’re convinced that you really want to study and build an interpreter and a compiler. What can you expect from this series on interpreters?

Here is the deal. You and I are going to create a simple interpreter for a large subset of Pascal language. At the end of this series you will have a working Pascal interpreter and a source-level debugger like Python’s pdb.

You might ask, why Pascal? For one thing, it’s not a made-up language that I came up with just for this series: it’s a real programming language that has many important language constructs. And some old, but useful, CS books use Pascal programming language in their examples (I understand that that’s not a particularly compelling reason to choose a language to build an interpreter for, but I thought it would be nice for a change to learn a non-mainstream language :)

Here is an example of a factorial function in Pascal that you will be able to interpret with your own interpreter and debug with the interactive source-level debugger that you will create along the way:

```program factorial;

function factorial(n: integer): longint;
begin
if n = 0 then
factorial := 1
else
factorial := n * factorial(n - 1);
end;

var
n: integer;

begin
for n := 0 to 16 do
writeln(n, '! = ', factorial(n));
end.
```

The implementation language of the Pascal interpreter will be Python, but you can use any language you want because the ideas presented don’t depend on any particular implementation language. Okay, let’s get down to business. Ready, set, go!

You will start your first foray into interpreters and compilers by writing a simple interpreter of arithmetic expressions, also known as a calculator. Today the goal is pretty minimalistic: to make your calculator handle the addition of two single digit integers like 3+5. Here is the source code for your calculator, sorry, interpreter:

```# Token types
#
# EOF (end-of-file) token is used to indicate that
# there is no more input left for lexical analysis
INTEGER, PLUS, EOF = 'INTEGER', 'PLUS', 'EOF'

class Token(object):
def __init__(self, type, value):
# token type: INTEGER, PLUS, or EOF
self.type = type
# token value: 0, 1, 2. 3, 4, 5, 6, 7, 8, 9, '+', or None
self.value = value

def __str__(self):
"""String representation of the class instance.

Examples:
Token(INTEGER, 3)
Token(PLUS '+')
"""
return 'Token({type}, {value})'.format(
type=self.type,
value=repr(self.value)
)

def __repr__(self):
return self.__str__()

class Interpreter(object):
def __init__(self, text):
# client string input, e.g. "3+5"
self.text = text
# self.pos is an index into self.text
self.pos = 0
# current token instance
self.current_token = None

def error(self):
raise Exception('Error parsing input')

def get_next_token(self):
"""Lexical analyzer (also known as scanner or tokenizer)

This method is responsible for breaking a sentence
apart into tokens. One token at a time.
"""
text = self.text

# is self.pos index past the end of the self.text ?
# if so, then return EOF token because there is no more
# input left to convert into tokens
if self.pos > len(text) - 1:

# get a character at the position self.pos and decide
# what token to create based on the single character
current_char = text[self.pos]

# if the character is a digit then convert it to
# integer, create an INTEGER token, increment self.pos
# index to point to the next character after the digit,
# and return the INTEGER token
if current_char.isdigit():
token = Token(INTEGER, int(current_char))
self.pos += 1

if current_char == '+':
token = Token(PLUS, current_char)
self.pos += 1

self.error()

def eat(self, token_type):
# compare the current token type with the passed token
# type and if they match then "eat" the current token
# and assign the next token to the self.current_token,
# otherwise raise an exception.
if self.current_token.type == token_type:
self.current_token = self.get_next_token()
else:
self.error()

def expr(self):
"""expr -> INTEGER PLUS INTEGER"""
# set current token to the first token taken from the input
self.current_token = self.get_next_token()

# we expect the current token to be a single-digit integer
left = self.current_token
self.eat(INTEGER)

# we expect the current token to be a '+' token
op = self.current_token
self.eat(PLUS)

# we expect the current token to be a single-digit integer
right = self.current_token
self.eat(INTEGER)
# after the above call the self.current_token is set to
# EOF token

# at this point INTEGER PLUS INTEGER sequence of tokens
# has been successfully found and the method can just
# return the result of adding two integers, thus
# effectively interpreting client input
result = left.value + right.value
return result

def main():
while True:
try:
# To run under Python3 replace 'raw_input' call
# with 'input'
text = raw_input('calc> ')
except EOFError:
break
if not text:
continue
interpreter = Interpreter(text)
result = interpreter.expr()
print(result)

if __name__ == '__main__':
main()
```

Save the above code into calc1.py file or download it directly from GitHub. Before you start digging deeper into the code, run the calculator on the command line and see it in action. Play with it! Here is a sample session on my laptop (if you want to run the calculator under Python3 you will need to replace raw_input with input):

```\$ python calc1.py
calc> 3+4
7
calc> 3+5
8
calc> 3+9
12
calc>
```

For your simple calculator to work properly without throwing an exception, your input needs to follow certain rules:

• Only single digit integers are allowed in the input
• The only arithmetic operation supported at the moment is addition
• No whitespace characters are allowed anywhere in the input

Those restrictions are necessary to make the calculator simple. Don’t worry, you’ll make it pretty complex pretty soon.

Okay, now let’s dive in and see how your interpreter works and how it evaluates arithmetic expressions.

When you enter an expression 3+5 on the command line your interpreter gets a string “3+5”. In order for the interpreter to actually understand what to do with that string it first needs to break the input “3+5” into components called tokens. A token is an object that has a type and a value. For example, for the string “3” the type of the token will be INTEGER and the corresponding value will be integer 3.

The process of breaking the input string into tokens is called lexical analysis. So, the first step your interpreter needs to do is read the input of characters and convert it into a stream of tokens. The part of the interpreter that does it is called a lexical analyzer, or lexer for short. You might also encounter other names for the same component, like scanner or tokenizer. They all mean the same: the part of your interpreter or compiler that turns the input of characters into a stream of tokens.

The method get_next_token of the Interpreter class is your lexical analyzer. Every time you call it, you get the next token created from the input of characters passed to the interpreter. Let’s take a closer look at the method itself and see how it actually does its job of converting characters into tokens. The input is stored in the variable text that holds the input string and pos is an index into that string (think of the string as an array of characters). pos is initially set to 0 and points to the character ‘3’. The method first checks whether the character is a digit and if so, it increments pos and returns a token instance with the type INTEGER and the value set to the integer value of the string ‘3’, which is an integer 3:

The pos now points to the ‘+’ character in the text. The next time you call the method, it tests if a character at the position pos is a digit and then it tests if the character is a plus sign, which it is. As a result the method increments pos and returns a newly created token with the type PLUS and value ‘+’:

The pos now points to character ‘5’. When you call the get_next_token method again the method checks if it’s a digit, which it is, so it increments pos and returns a new INTEGER token with the value of the token set to integer 5:

Because the pos index is now past the end of the string “3+5” the get_next_token method returns the EOF token every time you call it:

Try it out and see for yourself how the lexer component of your calculator works:

```>>> from calc1 import Interpreter
>>>
>>> interpreter = Interpreter('3+5')
>>> interpreter.get_next_token()
Token(INTEGER, 3)
>>>
>>> interpreter.get_next_token()
Token(PLUS, '+')
>>>
>>> interpreter.get_next_token()
Token(INTEGER, 5)
>>>
>>> interpreter.get_next_token()
Token(EOF, None)
>>>
```

So now that your interpreter has access to the stream of tokens made from the input characters, the interpreter needs to do something with it: it needs to find the structure in the flat stream of tokens it gets from the lexer get_next_token. Your interpreter expects to find the following structure in that stream: INTEGER -> PLUS -> INTEGER. That is, it tries to find a sequence of tokens: integer followed by a plus sign followed by an integer.

The method responsible for finding and interpreting that structure is expr. This method verifies that the sequence of tokens does indeed correspond to the expected sequence of tokens, i.e INTEGER -> PLUS -> INTEGER. After it’s successfully confirmed the structure, it generates the result by adding the value of the token on the left side of the PLUS and the right side of the PLUS, thus successfully interpreting the arithmetic expression you passed to the interpreter.

The expr method itself uses the helper method eat to verify that the token type passed to the eat method matches the current token type. After matching the passed token type the eat method gets the next token and assigns it to the current_token variable, thus effectively “eating” the currently matched token and advancing the imaginary pointer in the stream of tokens. If the structure in the stream of tokens doesn’t correspond to the expected INTEGER PLUS INTEGER sequence of tokens the eat method throws an exception.

Let’s recap what your interpreter does to evaluate an arithmetic expression:

• The interpreter accepts an input string, let’s say “3+5”
• The interpreter calls the expr method to find a structure in the stream of tokens returned by the lexical analyzer get_next_token. The structure it tries to find is of the form INTEGER PLUS INTEGER. After it’s confirmed the structure, it interprets the input by adding the values of two INTEGER tokens because it’s clear to the interpreter at that point that what it needs to do is add two integers, 3 and 5.

Congratulate yourself. You’ve just learned how to build your very first interpreter!

Now it’s time for exercises.

You didn’t think you would just read this article and that would be enough, did you? Okay, get your hands dirty and do the following exercises:

1. Modify the code to allow multiple-digit integers in the input, for example “12+3”
2. Add a method that skips whitespace characters so that your calculator can handle inputs with whitespace characters like ” 12 + 3”
3. Modify the code and instead of ‘+’ handle ‘-‘ to evaluate subtractions like “7-5”

Check your understanding

1. What is an interpreter?
2. What is a compiler?
3. What’s the difference between an interpreter and a compiler?
4. What is a token?
5. What is the name of the process that breaks input apart into tokens?
6. What is the part of the interpreter that does lexical analysis called?
7. What are the other common names for that part of an interpreter or a compiler?

Before I finish this article, I really want you to commit to studying interpreters and compilers. And I want you to do it right now. Don’t put it on the back burner. Don’t wait. If you’ve skimmed the article, start over. If you’ve read it carefully but haven’t done exercises - do them now. If you’ve done only some of them, finish the rest. You get the idea. And you know what? Sign the commitment pledge to start learning about interpreters and compilers today!

I, ________, of being sound mind and body, do hereby pledge to commit to studying interpreters and compilers starting today and get to a point where I know 100% how they work!

Signature:

Date:

Sign it, date it, and put it somewhere where you can see it every day to make sure that you stick to your commitment. And keep in mind the definition of commitment:

Commitment is doing the thing you said you were going to do long after the mood you said it in has left you.” — Darren Hardy

Okay, that’s it for today. In the next article of the mini series you will extend your calculator to handle more arithmetic expressions. Stay tuned.

If you can’t wait for the second article and are chomping at the bit to start digging deeper into interpreters and compilers, here is a list of books I recommend that will help you along the way:

BTW, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch. You can get a feel for the book here, here, and here. Subscribe to the mailing list to get the latest updates about the book and the release date.

All articles in this series:

## Let’s Build A Web Server. Part 3.

Date

We learn most when we have to invent” —Piaget

In Part 2 you created a minimalistic WSGI server that could handle basic HTTP GET requests. And I asked you a question, “How can you make your server handle more than one request at a time?” In this article you will find the answer. So, buckle up and shift into high gear. You’re about to have a really fast ride. Have your Linux, Mac OS X (or any *nix system) and Python ready. All source code from the article is available on GitHub.

First let’s remember what a very basic Web server looks like and what the server needs to do to service client requests. The server you created in Part 1 and Part 2 is an iterative server that handles one client request at a time. It cannot accept a new connection until after it has finished processing a current client request. Some clients might be unhappy with it because they will have to wait in line, and for busy servers the line might be too long.

Here is the code of the iterative server webserver3a.py:

```#####################################################################
# Iterative server - webserver3a.py                                 #
#                                                                   #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X  #
#####################################################################
import socket

SERVER_ADDRESS = (HOST, PORT) = '', 8888
REQUEST_QUEUE_SIZE = 5

def handle_request(client_connection):
request = client_connection.recv(1024)
print(request.decode())
http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)

def serve_forever():
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.listen(REQUEST_QUEUE_SIZE)
print('Serving HTTP on port {port} ...'.format(port=PORT))

while True:
handle_request(client_connection)
client_connection.close()

if __name__ == '__main__':
serve_forever()
```

To observe your server handling only one client request at a time, modify the server a little bit and add a 60 second delay after sending a response to a client. The change is only one line to tell the server process to sleep for 60 seconds.

And here is the code of the sleeping server webserver3b.py:

```#########################################################################
# Iterative server - webserver3b.py                                     #
#                                                                       #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X      #
#                                                                       #
# - Server sleeps for 60 seconds after sending a response to a client   #
#########################################################################
import socket
import time

SERVER_ADDRESS = (HOST, PORT) = '', 8888
REQUEST_QUEUE_SIZE = 5

def handle_request(client_connection):
request = client_connection.recv(1024)
print(request.decode())
http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)
time.sleep(60)  # sleep and block the process for 60 seconds

def serve_forever():
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.listen(REQUEST_QUEUE_SIZE)
print('Serving HTTP on port {port} ...'.format(port=PORT))

while True:
handle_request(client_connection)
client_connection.close()

if __name__ == '__main__':
serve_forever()
```

Start the server with:

```\$ python webserver3b.py
```

Now open up a new terminal window and run the curl command. You should instantly see the “Hello, World!” string printed on the screen:

```\$ curl http://localhost:8888/hello
Hello, World!
```

And without delay open up a second terminal window and run the same curl command:

```\$ curl http://localhost:8888/hello
```

If you’ve done that within 60 seconds then the second curl should not produce any output right away and should just hang there. The server shouldn’t print a new request body on its standard output either. Here is how it looks like on my Mac (the window at the bottom right corner highlighted in yellow shows the second curl command hanging, waiting for the connection to be accepted by the server):

After you’ve waited long enough (more than 60 seconds) you should see the first curl terminate and the second curl print “Hello, World!” on the screen, then hang for 60 seconds, and then terminate:

The way it works is that the server finishes servicing the first curl client request and then it starts handling the second request only after it sleeps for 60 seconds. It all happens sequentially, or iteratively, one step, or in our case one client request, at a time.

Let’s talk about the communication between clients and servers for a bit. In order for two programs to communicate with each other over a network, they have to use sockets. And you saw sockets both in Part 1 and Part 2. But what is a socket?

A socket is an abstraction of a communication endpoint and it allows your program to communicate with another program using file descriptors. In this article I’ll be talking specifically about TCP/IP sockets on Linux/Mac OS X. An important notion to understand is the TCP socket pair.

The socket pair for a TCP connection is a 4-tuple that identifies two endpoints of the TCP connection: the local IP address, local port, foreign IP address, and foreign port. A socket pair uniquely identifies every TCP connection on a network. The two values that identify each endpoint, an IP address and a port number, are often called a socket.1

So, the tuple {10.10.10.2:49152, 12.12.12.3:8888} is a socket pair that uniquely identifies two endpoints of the TCP connection on the client and the tuple {12.12.12.3:8888, 10.10.10.2:49152} is a socket pair that uniquely identifies the same two endpoints of the TCP connection on the server. The two values that identify the server endpoint of the TCP connection, the IP address 12.12.12.3 and the port 8888, are referred to as a socket in this case (the same applies to the client endpoint).

The standard sequence a server usually goes through to create a socket and start accepting client connections is the following:

1. The server creates a TCP/IP socket. This is done with the following statement in Python:

```listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
```
2. The server might set some socket options (this is optional, but you can see that the server code above does just that to be able to re-use the same address over and over again if you decide to kill and re-start the server right away).

```listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
```
3. Then, the server binds the address. The bind function assigns a local protocol address to the socket. With TCP, calling bind lets you specify a port number, an IP address, both, or neither.1

```listen_socket.bind(SERVER_ADDRESS)
```
4. Then, the server makes the socket a listening socket

```listen_socket.listen(REQUEST_QUEUE_SIZE)
```

The listen method is only called by servers. It tells the kernel that it should accept incoming connection requests for this socket.

After that’s done, the server starts accepting client connections one connection at a time in a loop. When there is a connection available the accept call returns the connected client socket. Then, the server reads the request data from the connected client socket, prints the data on its standard output and sends a message back to the client. Then, the server closes the client connection and it is ready again to accept a new client connection.

Here is what a client needs to do to communicate with the server over TCP/IP:

Here is the sample code for a client to connect to your server, send a request and print the response:

``` import socket

# create a socket and connect to a server
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('localhost', 8888))

# send and receive some data
sock.sendall(b'test')
data = sock.recv(1024)
print(data.decode())
```

After creating the socket, the client needs to connect to the server. This is done with the connect call:

```sock.connect(('localhost', 8888))
```

The client only needs to provide the remote IP address or host name and the remote port number of a server to connect to.

You’ve probably noticed that the client doesn’t call bind and accept. The client doesn’t need to call bind because the client doesn’t care about the local IP address and the local port number. The TCP/IP stack within the kernel automatically assigns the local IP address and the local port when the client calls connect. The local port is called an ephemeral port, i.e. a short-lived port.

A port on a server that identifies a well-known service that a client connects to is called a well-known port (for example, 80 for HTTP and 22 for SSH). Fire up your Python shell and make a client connection to the server you run on localhost and see what ephemeral port the kernel assigns to the socket you’ve created (start the server webserver3a.py or webserver3b.py before trying the following example):

```>>> import socket
>>> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> sock.connect(('localhost', 8888))
>>> host, port = sock.getsockname()[:2]
>>> host, port
('127.0.0.1', 60589)
```

In the case above the kernel assigned the ephemeral port 60589 to the socket.

There are some other important concepts that I need to cover quickly before I get to answer the question from Part 2. You will see shortly why this is important. The two concepts are that of a process and a file descriptor.

What is a process? A process is just an instance of an executing program. When the server code is executed, for example, it’s loaded into memory and an instance of that executing program is called a process. The kernel records a bunch of information about the process - its process ID would be one example - to keep track of it. When you run your iterative server webserver3a.py or webserver3b.py you run just one process.

Start the server webserver3b.py in a terminal window:

```\$ python webserver3b.py
```

And in a different terminal window use the ps command to get the information about that process:

```\$ ps | grep webserver3b | grep -v grep
7182 ttys003    0:00.04 python webserver3b.py
```

The ps command shows you that you have indeed run just one Python process webserver3b. When a process gets created the kernel assigns a process ID to it, PID. In UNIX, every user process also has a parent that, in turn, has its own process ID called parent process ID, or PPID for short. I assume that you run a BASH shell by default and when you start the server, a new process gets created with a PID and its parent PID is set to the PID of the BASH shell.

Try it out and see for yourself how it all works. Fire up your Python shell again, which will create a new process, and then get the PID of the Python shell process and the parent PID (the PID of your BASH shell) using os.getpid() and os.getppid() system calls. Then, in another terminal window run ps command and grep for the PPID (parent process ID, which in my case is 3148). In the screenshot below you can see an example of a parent-child relationship between my child Python shell process and the parent BASH shell process on my Mac OS X:

Another important concept to know is that of a file descriptor. So what is a file descriptor? A file descriptor is a non-negative integer that the kernel returns to a process when it opens an existing file, creates a new file or when it creates a new socket. You’ve probably heard that in UNIX everything is a file. The kernel refers to the open files of a process by a file descriptor. When you need to read or write a file you identify it with the file descriptor. Python gives you high-level objects to deal with files (and sockets) and you don’t have to use file descriptors directly to identify a file but, under the hood, that’s how files and sockets are identified in UNIX: by their integer file descriptors.

By default, UNIX shells assign file descriptor 0 to the standard input of a process, file descriptor 1 to the standard output of the process and file descriptor 2 to the standard error.

As I mentioned before, even though Python gives you a high-level file or file-like object to work with, you can always use the fileno() method on the object to get the file descriptor associated with the file. Back to your Python shell to see how you can do that:

```>>> import sys
>>> sys.stdin
<open file '<stdin>', mode 'r' at 0x102beb0c0>
>>> sys.stdin.fileno()
0
>>> sys.stdout.fileno()
1
>>> sys.stderr.fileno()
2
```

And while working with files and sockets in Python, you’ll usually be using a high-level file/socket object, but there may be times where you need to use a file descriptor directly. Here is an example of how you can write a string to the standard output using a write system call that takes a file descriptor integer as a parameter:

```>>> import sys
>>> import os
>>> res = os.write(sys.stdout.fileno(), 'hello\n')
hello
```

And here is an interesting part - which should not be surprising to you anymore because you already know that everything is a file in Unix - your socket also has a file descriptor associated with it. Again, when you create a socket in Python you get back an object and not a non-negative integer, but you can always get direct access to the integer file descriptor of the socket with the fileno() method that I mentioned earlier.

```>>> import socket
>>> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> sock.fileno()
3
```

One more thing I wanted to mention: have you noticed that in the second example of the iterative server webserver3b.py, when the server process was sleeping for 60 seconds you could still connect to the server with the second curl command? Sure, the curl didn’t output anything right away and it was just hanging out there but how come the server was not accept ing a connection at the time and the client was not rejected right away, but instead was able to connect to the server? The answer to that is the listen method of a socket object and its BACKLOG argument, which I called REQUEST_QUEUE_SIZE in the code. The BACKLOG argument determines the size of a queue within the kernel for incoming connection requests. When the server webserver3b.py was sleeping, the second curl command that you ran was able to connect to the server because the kernel had enough space available in the incoming connection request queue for the server socket.

While increasing the BACKLOG argument does not magically turn your server into a server that can handle multiple client requests at a time, it is important to have a fairly large backlog parameter for busy servers so that the accept call would not have to wait for a new connection to be established but could grab the new connection off the queue right away and start processing a client request without delay.

Whoo-hoo! You’ve covered a lot of ground. Let’s quickly recap what you’ve learned (or refreshed if it’s all basics to you) so far.

• Iterative server
• Server socket creation sequence (socket, bind, listen, accept)
• Client connection creation sequence (socket, connect)
• Socket pair
• Socket
• Ephemeral port and well-known port
• Process
• Process ID (PID), parent process ID (PPID), and the parent-child relationship.
• File descriptors
• The meaning of the BACKLOG argument of the listen socket method

Now I am ready to answer the question from Part 2: “How can you make your server handle more than one request at a time?” Or put another way, “How do you write a concurrent server?”

The simplest way to write a concurrent server under Unix is to use a fork() system call.

Here is the code of your new shiny concurrent server webserver3c.py that can handle multiple client requests at the same time (as in our iterative server example webserver3b.py, every child process sleeps for 60 secs):

```###########################################################################
# Concurrent server - webserver3c.py                                      #
#                                                                         #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X        #
#                                                                         #
# - Child process sleeps for 60 seconds after handling a client's request #
# - Parent and child processes close duplicate descriptors                #
#                                                                         #
###########################################################################
import os
import socket
import time

SERVER_ADDRESS = (HOST, PORT) = '', 8888
REQUEST_QUEUE_SIZE = 5

def handle_request(client_connection):
request = client_connection.recv(1024)
print(
'Child PID: {pid}. Parent PID {ppid}'.format(
pid=os.getpid(),
ppid=os.getppid(),
)
)
print(request.decode())
http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)
time.sleep(60)

def serve_forever():
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.listen(REQUEST_QUEUE_SIZE)
print('Serving HTTP on port {port} ...'.format(port=PORT))
print('Parent PID (PPID): {pid}\n'.format(pid=os.getpid()))

while True:
pid = os.fork()
if pid == 0:  # child
listen_socket.close()  # close child copy
handle_request(client_connection)
client_connection.close()
os._exit(0)  # child exits here
else:  # parent
client_connection.close()  # close parent copy and loop over

if __name__ == '__main__':
serve_forever()
```

Before diving in and discussing how fork works, try it, and see for yourself that the server can indeed handle multiple client requests at the same time, unlike its iterative counterparts webserver3a.py and webserver3b.py. Start the server on the command line with:

```\$ python webserver3c.py
```

And try the same two curl commands you’ve tried before with the iterative server and see for yourself that, now, even though the server child process sleeps for 60 seconds after serving a client request, it doesn’t affect other clients because they are served by different and completely independent processes. You should see your curl commands output “Hello, World!” instantly and then hang for 60 secs. You can keep on running as many curl commands as you want (well, almost as many as you want :) and all of them will output the server’s response “Hello, World” immediately and without any noticeable delay. Try it.

The most important point to understand about fork() is that you call fork once but it returns twice: once in the parent process and once in the child process. When you fork a new process the process ID returned to the child process is 0. When the fork returns in the parent process it returns the child’s PID.

I still remember how fascinated I was by fork when I first read about it and tried it. It looked like magic to me. Here I was reading a sequential code and then “boom!”: the code cloned itself and now there were two instances of the same code running concurrently. I thought it was nothing short of magic, seriously.

When a parent forks a new child, the child process gets a copy of the parent’s file descriptors:

You’ve probably noticed that the parent process in the code above closed the client connection:

```else:  # parent
client_connection.close()  # close parent copy and loop over
```

So how come a child process is still able to read the data from a client socket if its parent closed the very same socket? The answer is in the picture above. The kernel uses descriptor reference counts to decide whether to close a socket or not. It closes the socket only when its descriptor reference count becomes 0. When your server creates a child process, the child gets the copy of the parent’s file descriptors and the kernel increments the reference counts for those descriptors. In the case of one parent and one child, the descriptor reference count would be 2 for the client socket and when the parent process in the code above closes the client connection socket, it merely decrements its reference count which becomes 1, not small enough to cause the kernel to close the socket. The child process also closes the duplicate copy of the parent’s listen_socket because the child doesn’t care about accepting new client connections, it cares only about processing requests from the established client connection:

```listen_socket.close()  # close child copy
```

I’ll talk about what happens if you do not close duplicate descriptors later in the article.

As you can see from the source code of your concurrent server, the sole role of the server parent process now is to accept a new client connection, fork a new child process to handle that client request, and loop over to accept another client connection, and nothing more. The server parent process does not process client requests - its children do.

A little aside. What does it mean when we say that two events are concurrent?

When we say that two events are concurrent we usually mean that they happen at the same time. As a shorthand that definition is fine, but you should remember the strict definition:

Two events are concurrent if you cannot tell by looking at the program which will happen first.2

Again, it’s time to recap the main ideas and concepts you’ve covered so far.

• The simplest way to write a concurrent server in Unix is to use the fork() system call
• When a process forks a new process it becomes a parent process to that newly forked child process.
• Parent and child share the same file descriptors after the call to fork.
• The kernel uses descriptor reference counts to decide whether to close the file/socket or not
• The role of a server parent process: all it does now is accept a new connection from a client, fork a child to handle the client request, and loop over to accept a new client connection.

Let’s see what is going to happen if you don’t close duplicate socket descriptors in the parent and child processes. Here is a modified version of the concurrent server where the server does not close duplicate descriptors, webserver3d.py:

```###########################################################################
# Concurrent server - webserver3d.py                                      #
#                                                                         #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X        #
###########################################################################
import os
import socket

SERVER_ADDRESS = (HOST, PORT) = '', 8888
REQUEST_QUEUE_SIZE = 5

def handle_request(client_connection):
request = client_connection.recv(1024)
http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)

def serve_forever():
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.listen(REQUEST_QUEUE_SIZE)
print('Serving HTTP on port {port} ...'.format(port=PORT))

clients = []
while True:
# store the reference otherwise it's garbage collected
# on the next loop run
clients.append(client_connection)
pid = os.fork()
if pid == 0:  # child
listen_socket.close()  # close child copy
handle_request(client_connection)
client_connection.close()
os._exit(0)  # child exits here
else:  # parent
# client_connection.close()
print(len(clients))

if __name__ == '__main__':
serve_forever()
```

Start the server with:

```\$ python webserver3d.py
```

Use curl to connect to the server:

```\$ curl http://localhost:8888/hello
Hello, World!
```

Okay, the curl printed the response from the concurrent server but it did not terminate and kept hanging. What is happening here? The server no longer sleeps for 60 seconds: its child process actively handles a client request, closes the client connection and exits, but the client curl still does not terminate.

So why does the curl not terminate? The reason is the duplicate file descriptors. When the child process closed the client connection, the kernel decremented the reference count of that client socket and the count became 1. The server child process exited, but the client socket was not closed by the kernel because the reference count for that socket descriptor was not 0, and, as a result, the termination packet (called FIN in TCP/IP parlance) was not sent to the client and the client stayed on the line, so to speak. There is also another problem. If your long-running server doesn’t close duplicate file descriptors, it will eventually run out of available file descriptors:

Stop your server webserver3d.py with Control-C and check out the default resources available to your server process set up by your shell with the shell built-in command ulimit:

```\$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 3842
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 3842
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
```

As you can see above, the maximum number of open file descriptors (open files) available to the server process on my Ubuntu box is 1024.

Now let’s see how your server can run out of available file descriptors if it doesn’t close duplicate descriptors. In an existing or new terminal window, set the maximum number of open file descriptors for your server to be 256:

```\$ ulimit -n 256
```

Start the server webserver3d.py in the same terminal where you’ve just run the \$ ulimit -n 256 command:

```\$ python webserver3d.py
```

and use the following client client3.py to test the server.

```#####################################################################
# Test client - client3.py                                          #
#                                                                   #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X  #
#####################################################################
import argparse
import errno
import os
import socket

REQUEST = b"""\
GET /hello HTTP/1.1
Host: localhost:8888

"""

def main(max_clients, max_conns):
socks = []
for client_num in range(max_clients):
pid = os.fork()
if pid == 0:
for connection_num in range(max_conns):
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.sendall(REQUEST)
socks.append(sock)
print(connection_num)
os._exit(0)

if __name__ == '__main__':
parser = argparse.ArgumentParser(
description='Test client for LSBAWS.',
formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
'--max-conns',
type=int,
default=1024,
help='Maximum number of connections per client.'
)
'--max-clients',
type=int,
default=1,
help='Maximum number of clients.'
)
args = parser.parse_args()
main(args.max_clients, args.max_conns)
```

In a new terminal window, start the client3.py and tell it to create 300 simultaneous connections to the server:

```\$ python client3.py --max-clients=300
```

Soon enough your server will explode. Here is a screenshot of the exception on my box:

The lesson is clear - your server should close duplicate descriptors. But even if you close duplicate descriptors, you are not out of the woods yet because there is another problem with your server, and that problem is zombies!

Yes, your server code actually creates zombies. Let’s see how. Start up your server again:

```\$ python webserver3d.py
```

Run the following curl command in another terminal window:

```\$ curl http://localhost:8888/hello
```

And now run the ps command to show running Python processes. This the example of ps output on my Ubuntu box:

```\$ ps auxw | grep -i python | grep -v grep
vagrant   9099  0.0  1.2  31804  6256 pts/0    S+   16:33   0:00 python webserver3d.py
vagrant   9102  0.0  0.0      0     0 pts/0    Z+   16:33   0:00 [python] <defunct>
```

Do you see the second line above where it says the status of the process with PID 9102 is Z+ and the name of the process is <defunct>? That’s our zombie there. The problem with zombies is that you can’t kill them.

Even if you try to kill zombies with \$ kill -9 , they will survive. Try it and see for yourself.

What is a zombie anyway and why does our server create them? A zombie is a process that has terminated, but its parent has not waited for it and has not received its termination status yet. When a child process exits before its parent, the kernel turns the child process into a zombie and stores some information about the process for its parent process to retrieve later. The information stored is usually the process ID, the process termination status, and the resource usage by the process. Okay, so zombies serve a purpose, but if your server doesn’t take care of these zombies your system will get clogged up. Let’s see how that happens. First stop your running server and, in a new terminal window, use the ulimit command to set the max user processess to 400(make sure to set open files to a high number, let’s say 500 too):

```\$ ulimit -u 400
\$ ulimit -n 500
```

Start the server webserver3d.py in the same terminal where you’ve just run the \$ ulimit -u 400 command:

```\$ python webserver3d.py
```

In a new terminal window, start the client3.py and tell it to create 500 simultaneous connections to the server:

```\$ python client3.py --max-clients=500
```

And, again, soon enough your server will blow up with an OSError: Resource temporarily unavailable exception when it tries to create a new child process, but it can’t because it has reached the limit for the maximum number of child processes it’s allowed to create. Here is a screenshot of the exception on my box:

As you can see, zombies create problems for your long-running server if it doesn’t take care of them. I will discuss shortly how the server should deal with that zombie problem.

Let’s recap the main points you’ve covered so far:

• If you don’t close duplicate descriptors, the clients won’t terminate because the client connections won’t get closed.
• If you don’t close duplicate descriptors, your long-running server will eventually run out of available file descriptors (max open files).
• When you fork a child process and it exits and the parent process doesn’t wait for it and doesn’t collect its termination status, it becomes a zombie.
• Zombies need to eat something and, in our case, it’s memory. Your server will eventually run out of available processes (max user processes) if it doesn’t take care of zombies.
• You can’t kill a zombie, you need to wait for it.

So what do you need to do to take care of zombies? You need to modify your server code to wait for zombies to get their termination status. You can do that by modifying your server to call a wait system call. Unfortunately, that’s far from ideal because if you call wait and there is no terminated child process the call to wait will block your server, effectively preventing your server from handling new client connection requests. Are there any other options? Yes, there are, and one of them is the combination of a signal handler with the wait system call.

Here is how it works. When a child process exits, the kernel sends a SIGCHLD signal. The parent process can set up a signal handler to be asynchronously notified of that SIGCHLD event and then it can wait for the child to collect its termination status, thus preventing the zombie process from being left around.

By the way, an asynchronous event means that the parent process doesn’t know ahead of time that the event is going to happen.

Modify your server code to set up a SIGCHLD event handler and wait for a terminated child in the event handler. The code is available in webserver3e.py file:

```###########################################################################
# Concurrent server - webserver3e.py                                      #
#                                                                         #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X        #
###########################################################################
import os
import signal
import socket
import time

SERVER_ADDRESS = (HOST, PORT) = '', 8888
REQUEST_QUEUE_SIZE = 5

def grim_reaper(signum, frame):
pid, status = os.wait()
print(
'Child {pid} terminated with status {status}'
'\n'.format(pid=pid, status=status)
)

def handle_request(client_connection):
request = client_connection.recv(1024)
print(request.decode())
http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)
# sleep to allow the parent to loop over to 'accept' and block there
time.sleep(3)

def serve_forever():
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.listen(REQUEST_QUEUE_SIZE)
print('Serving HTTP on port {port} ...'.format(port=PORT))

signal.signal(signal.SIGCHLD, grim_reaper)

while True:
pid = os.fork()
if pid == 0:  # child
listen_socket.close()  # close child copy
handle_request(client_connection)
client_connection.close()
os._exit(0)
else:  # parent
client_connection.close()

if __name__ == '__main__':
serve_forever()
```

Start the server:

```\$ python webserver3e.py
```

Use your old friend curl to send a request to the modified concurrent server:

```\$ curl http://localhost:8888/hello
```

Look at the server:

What just happened? The call to accept failed with the error EINTR.

The parent process was blocked in accept call when the child process exited which caused SIGCHLD event, which in turn activated the signal handler and when the signal handler finished the accept system call got interrupted:

Don’t worry, it’s a pretty simple problem to solve, though. All you need to do is to re-start the accept system call. Here is the modified version of the server webserver3f.py that handles that problem:

```###########################################################################
# Concurrent server - webserver3f.py                                      #
#                                                                         #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X        #
###########################################################################
import errno
import os
import signal
import socket

SERVER_ADDRESS = (HOST, PORT) = '', 8888
REQUEST_QUEUE_SIZE = 1024

def grim_reaper(signum, frame):
pid, status = os.wait()

def handle_request(client_connection):
request = client_connection.recv(1024)
print(request.decode())
http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)

def serve_forever():
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.listen(REQUEST_QUEUE_SIZE)
print('Serving HTTP on port {port} ...'.format(port=PORT))

signal.signal(signal.SIGCHLD, grim_reaper)

while True:
try:
except IOError as e:
code, msg = e.args
# restart 'accept' if it was interrupted
if code == errno.EINTR:
continue
else:
raise

pid = os.fork()
if pid == 0:  # child
listen_socket.close()  # close child copy
handle_request(client_connection)
client_connection.close()
os._exit(0)
else:  # parent
client_connection.close()  # close parent copy and loop over

if __name__ == '__main__':
serve_forever()
```

Start the updated server webserver3f.py:

```\$ python webserver3f.py
```

Use curl to send a request to the modified concurrent server:

```\$ curl http://localhost:8888/hello
```

See? No EINTR exceptions any more. Now, verify that there are no more zombies either and that your SIGCHLD event handler with wait call took care of terminated children. To do that, just run the ps command and see for yourself that there are no more Python processes with Z+ status (no more <defunct> processes). Great! It feels safe without zombies running around.

• If you fork a child and don’t wait for it, it becomes a zombie.
• Use the SIGCHLD event handler to asynchronously wait for a terminated child to get its termination status
• When using an event handler you need to keep in mind that system calls might get interrupted and you need to be prepared for that scenario

Okay, so far so good. No problems, right? Well, almost. Try your webserver3f.py again, but instead of making one request with curl use client3.py to create 128 simultaneous connections:

```\$ python client3.py --max-clients 128
```

Now run the ps command again

```\$ ps auxw | grep -i python | grep -v grep
```

and see that, oh boy, zombies are back again!

What went wrong this time? When you ran 128 simultaneous clients and established 128 connections, the child processes on the server handled the requests and exited almost at the same time causing a flood of SIGCHLD signals being sent to the parent process. The problem is that the signals are not queued and your server process missed several signals, which left several zombies running around unattended:

The solution to the problem is to set up a SIGCHLD event handler but instead of wait use a waitpid system call with a WNOHANG option in a loop to make sure that all terminated child processes are taken care of. Here is the modified server code, webserver3g.py:

```###########################################################################
# Concurrent server - webserver3g.py                                      #
#                                                                         #
# Tested with Python 2.7.9 & Python 3.4 on Ubuntu 14.04 & Mac OS X        #
###########################################################################
import errno
import os
import signal
import socket

SERVER_ADDRESS = (HOST, PORT) = '', 8888
REQUEST_QUEUE_SIZE = 1024

def grim_reaper(signum, frame):
while True:
try:
pid, status = os.waitpid(
-1,          # Wait for any child process
os.WNOHANG  # Do not block and return EWOULDBLOCK error
)
except OSError:
return

if pid == 0:  # no more zombies
return

def handle_request(client_connection):
request = client_connection.recv(1024)
print(request.decode())
http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)

def serve_forever():
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.listen(REQUEST_QUEUE_SIZE)
print('Serving HTTP on port {port} ...'.format(port=PORT))

signal.signal(signal.SIGCHLD, grim_reaper)

while True:
try:
except IOError as e:
code, msg = e.args
# restart 'accept' if it was interrupted
if code == errno.EINTR:
continue
else:
raise

pid = os.fork()
if pid == 0:  # child
listen_socket.close()  # close child copy
handle_request(client_connection)
client_connection.close()
os._exit(0)
else:  # parent
client_connection.close()  # close parent copy and loop over

if __name__ == '__main__':
serve_forever()
```

Start the server:

```\$ python webserver3g.py
```

Use the test client client3.py:

```\$ python client3.py --max-clients 128
```

And now verify that there are no more zombies. Yay! Life is good without zombies :)

Congratulations! It’s been a pretty long journey but I hope you liked it. Now you have your own simple concurrent server and the code can serve as a foundation for your further work towards a production grade Web server.

I’ll leave it as an exercise for you to update the WSGI server from Part 2 and make it concurrent. You can find the modified version here. But look at my code only after you’ve implemented your own version. You have all the necessary information to do that. So go and just do it :)

What’s next? As Josh Billings said,

Be like a postage stamp — stick to one thing until you get there.”

Start mastering the basics. Question what you already know. And always dig deeper.

If you learn only methods, you’ll be tied to your methods. But if you learn principles, you can devise your own methods.” —Ralph Waldo Emerson

Below is a list of books that I’ve drawn on for most of the material in this article. They will help you broaden and deepen your knowledge about the topics I’ve covered. I highly recommend you to get those books somehow: borrow them from your friends, check them out from your local library, or just buy them on Amazon. They are the keepers:

BTW, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch and goes into more detail on the topics I just covered. Subscribe to the mailing list to get the latest updates about the book and the release date.

All articles in this series:

## Let’s Build A Web Server. Part 2.

Date

Remember, in Part 1 I asked you a question: “How do you run a Django application, Flask application, and Pyramid application under your freshly minted Web server without making a single change to the server to accommodate all those different Web frameworks?” Read on to find out the answer.

In the past, your choice of a Python Web framework would limit your choice of usable Web servers, and vice versa. If the framework and the server were designed to work together, then you were okay:

But you could have been faced (and maybe you were) with the following problem when trying to combine a server and a framework that weren’t designed to work together:

Basically you had to use what worked together and not what you might have wanted to use.

So, how do you then make sure that you can run your Web server with multiple Web frameworks without making code changes either to the Web server or to the Web frameworks? And the answer to that problem became the Python Web Server Gateway Interface (or WSGI for short, pronounced “wizgy”).

WSGI allowed developers to separate choice of a Web framework from choice of a Web server. Now you can actually mix and match Web servers and Web frameworks and choose a pairing that suits your needs. You can run Django, Flask, or Pyramid, for example, with Gunicorn or Nginx/uWSGI or Waitress. Real mix and match, thanks to the WSGI support in both servers and frameworks:

So, WSGI is the answer to the question I asked you in Part 1 and repeated at the beginning of this article. Your Web server must implement the server portion of a WSGI interface and all modern Python Web Frameworks already implement the framework side of the WSGI interface, which allows you to use them with your Web server without ever modifying your server’s code to accommodate a particular Web framework.

Now you know that WSGI support by Web servers and Web frameworks allows you to choose a pairing that suits you, but it is also beneficial to server and framework developers because they can focus on their preferred area of specialization and not step on each other’s toes. Other languages have similar interfaces too: Java, for example, has Servlet API and Ruby has Rack.

It’s all good, but I bet you are saying: “Show me the code!” Okay, take a look at this pretty minimalistic WSGI server implementation:

```# Tested with Python 2.7.9, Linux & Mac OS X
import socket
import StringIO
import sys

class WSGIServer(object):

socket_type = socket.SOCK_STREAM
request_queue_size = 1

# Create a listening socket
self.listen_socket = listen_socket = socket.socket(
self.socket_type
)
# Allow to reuse the same address
# Bind
# Activate
listen_socket.listen(self.request_queue_size)
# Get server host name and port
host, port = self.listen_socket.getsockname()[:2]
self.server_name = socket.getfqdn(host)
self.server_port = port
# Return headers set by Web framework/Web application

def set_app(self, application):
self.application = application

def serve_forever(self):
listen_socket = self.listen_socket
while True:
# New client connection
# Handle one request and close the client connection. Then
# loop over to wait for another client connection
self.handle_one_request()

def handle_one_request(self):
self.request_data = request_data = self.client_connection.recv(1024)
# Print formatted request data a la 'curl -v'
print(''.join(
'< {line}\n'.format(line=line)
for line in request_data.splitlines()
))

self.parse_request(request_data)

# Construct environment dictionary using request data
env = self.get_environ()

# It's time to call our application callable and get
# back a result that will become HTTP response body
result = self.application(env, self.start_response)

# Construct a response and send it back to the client
self.finish_response(result)

def parse_request(self, text):
request_line = text.splitlines()[0]
request_line = request_line.rstrip('\r\n')
# Break down the request line into components
(self.request_method,  # GET
self.path,            # /hello
self.request_version  # HTTP/1.1
) = request_line.split()

def get_environ(self):
env = {}
# The following code snippet does not follow PEP8 conventions
# but it's formatted the way it is for demonstration purposes
# to emphasize the required variables and their values
#
# Required WSGI variables
env['wsgi.version']      = (1, 0)
env['wsgi.url_scheme']   = 'http'
env['wsgi.input']        = StringIO.StringIO(self.request_data)
env['wsgi.errors']       = sys.stderr
env['wsgi.multiprocess'] = False
env['wsgi.run_once']     = False
# Required CGI variables
env['REQUEST_METHOD']    = self.request_method    # GET
env['PATH_INFO']         = self.path              # /hello
env['SERVER_NAME']       = self.server_name       # localhost
env['SERVER_PORT']       = str(self.server_port)  # 8888
return env

('Date', 'Tue, 31 Mar 2015 12:54:48 GMT'),
('Server', 'WSGIServer 0.2'),
]
# To adhere to WSGI specification the start_response must return
# a 'write' callable. We simplicity's sake we'll ignore that detail
# for now.
# return self.finish_response

def finish_response(self, result):
try:
response = 'HTTP/1.1 {status}\r\n'.format(status=status)
response += '\r\n'
for data in result:
response += data
# Print formatted response data a la 'curl -v'
print(''.join(
'> {line}\n'.format(line=line)
for line in response.splitlines()
))
self.client_connection.sendall(response)
finally:
self.client_connection.close()

SERVER_ADDRESS = (HOST, PORT) = '', 8888

server.set_app(application)
return server

if __name__ == '__main__':
if len(sys.argv) < 2:
sys.exit('Provide a WSGI application object as module:callable')
app_path = sys.argv[1]
module, application = app_path.split(':')
module = __import__(module)
application = getattr(module, application)
print('WSGIServer: Serving HTTP on port {port} ...\n'.format(port=PORT))
httpd.serve_forever()
```

It’s definitely bigger than the server code in Part 1, but it’s also small enough (just under 150 lines) for you to understand without getting bogged down in details. The above server also does more - it can run your basic Web application written with your beloved Web framework, be it Pyramid, Flask, Django, or some other Python WSGI framework.

Don’t believe me? Try it and see for yourself. Save the above code as webserver2.py or download it directly from GitHub. If you try to run it without any parameters it’s going to complain and exit.

```\$ python webserver2.py
Provide a WSGI application object as module:callable
```

It really wants to serve your Web application and that’s where the fun begins. To run the server the only thing you need installed is Python. But to run applications written with Pyramid, Flask, and Django you need to install those frameworks first. Let’s install all three of them. My preferred method is by using virtualenv. Just follow the steps below to create and activate a virtual environment and then install all three Web frameworks.

```\$ [sudo] pip install virtualenv
\$ mkdir ~/envs
\$ virtualenv ~/envs/lsbaws/
\$ cd ~/envs/lsbaws/
\$ ls
bin  include  lib
\$ source bin/activate
(lsbaws) \$ pip install pyramid
(lsbaws) \$ pip install django
```

At this point you need to create a Web application. Let’s start with Pyramid first. Save the following code as pyramidapp.py to the same directory where you saved webserver2.py or download the file directly from GitHub:

```from pyramid.config import Configurator
from pyramid.response import Response

def hello_world(request):
return Response(
'Hello world from Pyramid!\n',
content_type='text/plain',
)

config = Configurator()
app = config.make_wsgi_app()
```

```(lsbaws) \$ python webserver2.py pyramidapp:app
WSGIServer: Serving HTTP on port 8888 ...
```

You just told your server to load the ‘app’ callable from the python module ‘pyramidapp’ Your server is now ready to take requests and forward them to your Pyramid application. The application only handles one route now: the /hello route. Type http://localhost:8888/hello address into your browser, press Enter, and observe the result:

You can also test the server on the command line using the ‘curl’ utility:

```\$ curl -v http://localhost:8888/hello
...
```

Check what the server and curl prints to standard output.

```from flask import Flask

def hello_world():
return Response(
mimetype='text/plain'
)

```

```(lsbaws) \$ python webserver2.py flaskapp:app
WSGIServer: Serving HTTP on port 8888 ...
```

Now type in the http://localhost:8888/hello into your browser and press Enter:

Again, try ‘curl’ and see for yourself that the server returns a message generated by the Flask application:

```\$ curl -v http://localhost:8888/hello
...
```

Can the server also handle a Django application? Try it out! It’s a little bit more involved, though, and I would recommend cloning the whole repo and use djangoapp.py, which is part of the GitHub repository. Here is the source code which basically adds the Django ‘helloworld’ project (pre-created using Django’s django-admin.py startproject command) to the current Python path and then imports the project’s WSGI application.

```import sys
sys.path.insert(0, './helloworld')
from helloworld import wsgi

app = wsgi.application
```

Save the above code as djangoapp.py and run the Django application with your Web server:

```(lsbaws) \$ python webserver2.py djangoapp:app
WSGIServer: Serving HTTP on port 8888 ...
```

Type in the following address and press Enter:

And as you’ve already done a couple of times before, you can test it on the command line, too, and confirm that it’s the Django application that handles your requests this time around:

```\$ curl -v http://localhost:8888/hello
...
```

Did you try it? Did you make sure the server works with those three frameworks? If not, then please do so. Reading is important, but this series is about rebuilding and that means you need to get your hands dirty. Go and try it. I will wait for you, don’t worry. No seriously, you must try it and, better yet, retype everything yourself and make sure that it works as expected.

Okay, you’ve experienced the power of WSGI: it allows you to mix and match your Web servers and Web frameworks. WSGI provides a minimal interface between Python Web servers and Python Web Frameworks. It’s very simple and it’s easy to implement on both the server and the framework side. The following code snippet shows the server and the framework side of the interface:

```def run_application(application):
"""Server code."""
# This is where an application/framework stores
# an HTTP status and HTTP response headers for the server
# to transmit to the client
# Environment dictionary with WSGI/CGI variables
environ = {}

# Server invokes the ‘application' callable and gets back the
# response body
result = application(environ, start_response)
# Server builds an HTTP response and transmits it to the client
…

def app(environ, start_response):
"""A barebones WSGI app."""
start_response('200 OK', [('Content-Type', 'text/plain')])
return ['Hello world!']

run_application(app)
```

Here is how it works:

1. The framework provides an ‘application’ callable (The WSGI specification doesn’t prescribe how that should be implemented)
2. The server invokes the ‘application’ callable for each request it receives from an HTTP client. It passes a dictionary ‘environ’ containing WSGI/CGI variables and a ‘start_response’ callable as arguments to the ‘application’ callable.
3. The framework/application generates an HTTP status and HTTP response headers and passes them to the ‘start_response’ callable for the server to store them. The framework/application also returns a response body.
4. The server combines the status, the response headers, and the response body into an HTTP response and transmits it to the client (This step is not part of the specification but it’s the next logical step in the flow and I added it for clarity)

And here is a visual representation of the interface:

So far, you’ve seen the Pyramid, Flask, and Django Web applications and you’ve seen the server code that implements the server side of the WSGI specification. You’ve even seen the barebones WSGI application code snippet that doesn’t use any framework.

The thing is that when you write a Web application using one of those frameworks you work at a higher level and don’t work with WSGI directly, but I know you’re curious about the framework side of the WSGI interface, too because you’re reading this article. So, let’s create a minimalistic WSGI Web application/Web framework without using Pyramid, Flask, or Django and run it with your server:

```def app(environ, start_response):
"""A barebones WSGI application.

This is a starting point for your own Web framework :)
"""
status = '200 OK'
return ['Hello world from a simple WSGI application!\n']
```

Again, save the above code in wsgiapp.py file or download it from GitHub directly and run the application under your Web server as:

```(lsbaws) \$ python webserver2.py wsgiapp:app
WSGIServer: Serving HTTP on port 8888 ...
```

Type in the following address and press Enter. This is the result you should see:

You just wrote your very own minimalistic WSGI Web framework while learning about how to create a Web server! Outrageous.

Now, let’s get back to what the server transmits to the client. Here is the HTTP response the server generates when you call your Pyramid application using an HTTP client:

The response has some familiar parts that you saw in Part 1 but it also has something new. It has, for example, four HTTP headers that you haven’t seen before: Content-Type, Content-Length, Date, and Server. Those are the headers that a response from a Web server generally should have. None of them are strictly required, though. The purpose of the headers is to transmit additional information about the HTTP request/response.

Now that you know more about the WSGI interface, here is the same HTTP response with some more information about what parts produced it:

I haven’t said anything about the ‘environ’ dictionary yet, but basically it’s a Python dictionary that must contain certain WSGI and CGI variables prescribed by the WSGI specification. The server takes the values for the dictionary from the HTTP request after parsing the request. This is what the contents of the dictionary look like:

A Web framework uses the information from that dictionary to decide which view to use based on the specified route, request method etc., where to read the request body from and where to write errors, if any.

By now you’ve created your own WSGI Web server and you’ve made Web applications written with different Web frameworks. And, you’ve also created your barebones Web application/Web framework along the way. It’s been a heck of a journey. Let’s recap what your WSGI Web server has to do to serve requests aimed at a WSGI application:

• First, the server starts and loads an ‘application’ callable provided by your Web framework/application
• Then, the server reads a request
• Then, the server parses it
• Then, it builds an ‘environ’ dictionary using the request data
• Then, it calls the ‘application’ callable with the ‘environ’ dictionary and a ‘start_response’ callable as parameters and gets back a response body.
• Then, the server constructs an HTTP response using the data returned by the call to the ‘application’ object and the status and response headers set by the ‘start_response’ callable.
• And finally, the server transmits the HTTP response back to the client

That’s about all there is to it. You now have a working WSGI server that can serve basic Web applications written with WSGI compliant Web frameworks like Django, Flask, Pyramid, or your very own WSGI framework. The best part is that the server can be used with multiple Web frameworks without any changes to the server code base. Not bad at all.

Before you go, here is another question for you to think about, “How do you make your server handle more than one request at a time?”

Stay tuned and I will show you a way to do that in Part 3. Cheers!

BTW, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch and goes into more detail on topics I just covered. Subscribe to the mailing list to get the latest updates about the book and the release date.

All articles in this series:

## Let’s Build A Web Server. Part 1.

Date

Out for a walk one day, a woman came across a construction site and saw three men working. She asked the first man, “What are you doing?” Annoyed by the question, the first man barked, “Can’t you see that I’m laying bricks?” Not satisfied with the answer, she asked the second man what he was doing. The second man answered, “I’m building a brick wall.” Then, turning his attention to the first man, he said, “Hey, you just passed the end of the wall. You need to take off that last brick.” Again not satisfied with the answer, she asked the third man what he was doing. And the man said to her while looking up in the sky, “I am building the biggest cathedral this world has ever known.” While he was standing there and looking up in the sky the other two men started arguing about the errant brick. The man turned to the first two men and said, “Hey guys, don’t worry about that brick. It’s an inside wall, it will get plastered over and no one will ever see that brick. Just move on to another layer.”1

The moral of the story is that when you know the whole system and understand how different pieces fit together (bricks, walls, cathedral), you can identify and fix problems faster (errant brick).

What does it have to do with creating your own Web server from scratch?

I believe to become a better developer you MUST get a better understanding of the underlying software systems you use on a daily basis and that includes programming languages, compilers and interpreters, databases and operating systems, web servers and web frameworks. And, to get a better and deeper understanding of those systems you MUST re-build them from scratch, brick by brick, wall by wall.

Confucius put it this way:

I hear and I forget.”

I see and I remember.”

I do and I understand.”

I hope at this point you’re convinced that it’s a good idea to start re-building different software systems to learn how they work.

In this three-part series I will show you how to build your own basic Web server. Let’s get started.

First things first, what is a Web server?

In a nutshell it’s a networking server that sits on a physical server (oops, a server on a server) and waits for a client to send a request. When it receives a request, it generates a response and sends it back to the client. The communication between a client and a server happens using HTTP protocol. A client can be your browser or any other software that speaks HTTP.

What would a very simple implementation of a Web server look like? Here is my take on it. The example is in Python but even if you don’t know Python (it’s a very easy language to pick up, try it!) you still should be able to understand concepts from the code and explanations below:

```import socket

HOST, PORT = '', 8888

listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
print 'Serving HTTP on port %s ...' % PORT
while True:
request = client_connection.recv(1024)
print request

http_response = """\
HTTP/1.1 200 OK

Hello, World!
"""
client_connection.sendall(http_response)
client_connection.close()
```

Save the above code as webserver1.py or download it directly from GitHub and run it on the command line like this

```\$ python webserver1.py
Serving HTTP on port 8888 …
```

Just do it, seriously. I will wait for you while you’re testing it.

Done? Great. Now let’s discuss how it all actually works.

First let’s start with the Web address you’ve entered. It’s called an URL and here is its basic structure:

This is how you tell your browser the address of the Web server it needs to find and connect to and the page (path) on the server to fetch for you. Before your browser can send a HTTP request though, it first needs to establish a TCP connection with the Web server. Then it sends an HTTP request over the TCP connection to the server and waits for the server to send an HTTP response back. And when your browser receives the response it displays it, in this case it displays “Hello, World!”

Let’s explore in more detail how the client and the server establish a TCP connection before sending HTTP requests and responses. To do that they both use so-called sockets. Instead of using a browser directly you are going to simulate your browser manually by using telnet on the command line.

On the same computer you’re running the Web server fire up a telnet session on the command line specifying a host to connect to localhost and the port to connect to 8888 and then press Enter:

```\$ telnet localhost 8888
Trying 127.0.0.1 …
Connected to localhost.
```

At this point you’ve established a TCP connection with the server running on your local host and ready to send and receive HTTP messages. In the picture below you can see a standard procedure a server has to go through to be able to accept new TCP connections.

In the same telnet session type GET /hello HTTP/1.1 and hit Enter:

```\$ telnet localhost 8888
Trying 127.0.0.1 …
Connected to localhost.
GET /hello HTTP/1.1

HTTP/1.1 200 OK
Hello, World!
```

You’ve just manually simulated your browser! You sent an HTTP request and got an HTTP response back. This is the basic structure of an HTTP request:

The HTTP request consists of the line indicating the HTTP method (GET, because we are asking our server to return us something), the path /hello that indicates a “page” on the server we want and the protocol version.

For simplicity’s sake our Web server at this point completely ignores the above request line. You could just as well type in any garbage instead of GET /hello HTTP/1.1” and you would still get back a “Hello, World!” response.

Once you’ve typed the request line and hit Enter the client sends the request to the server, the server reads the request line, prints it and returns the proper HTTP response.

Here is the HTTP response that the server sends back to your client (telnet in this case):

Let’s dissect it. The response consists of a status line HTTP/1.1 200 OK, followed by a required empty line, and then the HTTP response body.

The response status line HTTP/1.1 200 OK consists of the HTTP Version, the HTTP status code and the HTTP status code reason phrase OK. When the browser gets the response, it displays the body of the response and that’s why you see “Hello, World!” in your browser.

And that’s the basic model of how a Web server works. To sum it up: The Web server creates a listening socket and starts accepting new connections in a loop. The client initiates a TCP connection and, after successfully establishing it, the client sends an HTTP request to the server and the server responds with an HTTP response that gets displayed to the user. To establish a TCP connection both clients and servers use sockets.

Now you have a very basic working Web server that you can test with your browser or some other HTTP client. As you’ve seen and hopefully tried, you can also be a human HTTP client too, by using telnet and typing HTTP requests manually.

Here’s a question for you: “How do you run a Django application, Flask application, and Pyramid application under your freshly minted Web server without making a single change to the server to accommodate all those different Web frameworks?”

I will show you exactly how in Part 2 of the series. Stay tuned.

BTW, I’m writing a book “Let’s Build A Web Server: First Steps” that explains how to write a basic web server from scratch and goes into more detail on topics I just covered. Subscribe to the mailing list to get the latest updates about the book and the release date.

All articles in this series: