Yet More Coding#

Introduction#

This chapter covers some of the most advanced programming concepts you’re likely to run into when coding in Python. It’s not strictly necessary to master most of the content of the book, but it’s here in case you want a deeper understanding or in case you find that you eventually need to draw on more sophisticated programming tools and concepts.

This chapter has benefitted from the online book Research Software Engineering with Python, the official Python documentation, the excellent 30 days of Python, and the Hitchhiker’s Guide to Python.

Classes and objects#

Python is an object-oriented programming language. Everything is an object (and every object has a type). A class is an object constructor: a blueprint for creating objects. An object is a ‘live’ instance of a class. Objects are to classes what a yellow VW Beetle is to cars. The class defines the attributes that objects carry and the methods that they can perform.

Classes and instances of them are useful in certain situations, the most common being when you need something that has ‘state’, i.e. it can remember things that have happened to it, carry information with it, and change form.

While you’re quite unlikely to need to build classes in economics (unless you’re doing something really fancy), some of the biggest Python packages are based around classes so it’s useful to understand a bit about how they work, and especially how they have state.

The syntax to create a class is

class ClassName:
  ...code...

But it’s easiest to show with an example:

# Define a class called Person


class Person:
    def __init__(self, name):
        self.name = name


# Create an instance of the class
p = Person("Adam")

When we check the type, that’s when it gets really interesting:

type(p)
__main__.Person

Woah! We created a whole new data type based on the class name. The class has a constructor method, __init__, that, in this case, takes an input variable name and assigns it to an internal attribute, also called name. The self variable that you can also see refers to the instance itself: it’s how an object’s methods get at the object’s own attributes. We can access any internal variables like this:

p.name
'Adam'

Okay, but what’s the point of all this? Well, we can now create as many objects as we like of class ‘Person’ and they will have the same structure, but not the same state, as other objects of class ‘Person’.

m = Person("Ada")
m.name
'Ada'

This is a very boring class! Let’s add a method, which will allow us to change the state of objects. Here, we add a method increment_age, which is also indented under the class Person header. Note that it takes self as an input, just like the constructor, but it only acts on objects of type Person that have already been created.

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def increment_age(self):
        self.age = self.age + 1


# Create an instance of the class
p = Person("Adam", 231)

print(p.age)
# Call the method increment_age
p.increment_age()
print(p.age)
231
232

This very simple method changes the internal state. Just like class constructors and regular functions, class methods can take arguments. Here’s an example:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def increment_age(self):
        self.age = self.age + 1

    def change_age(self, new_age):
        self.age = new_age


# Create an instance of the class
p = Person("Adam", 231)

print(p.age)
# Call the method increment_age
p.change_age(67)
print(p.age)
231
67

It can be tedious to have to initialise a class with a whole load of parameters every time. Just like with functions, we can define default parameters for classes:

class Person:
    def __init__(self, name="default_name", age=20):
        self.name = name
        self.age = age


p = Person()
p.name
'default_name'

That covers a lot of the basics of classes but if you’re using classes in anger then you might also want to look up inheritance and composition.
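To give a flavour of what inheritance looks like, here is a minimal sketch in which a hypothetical Student class inherits from the Person class defined above (the names Student, "Joan", and "LSE" are made up for illustration):

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


# Student inherits everything from Person; super() runs Person's
# constructor before adding a new attribute of its own
class Student(Person):
    def __init__(self, name, age, university):
        super().__init__(name, age)
        self.university = university


s = Student("Joan", 22, "LSE")
print(s.name, s.university)  # Joan LSE
```

A Student object has access to all of Person’s attributes and methods, plus any it defines itself.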

Dataclasses#

The basic classes we created above come with a lot of ‘boilerplate’: code we need but which is not very surprising. Dataclasses were introduced in Python 3.7 as a way to remove this boilerplate when the classes being created are quite simple. Think of dataclasses as classes with sensible defaults, intended for light object-oriented programming.

A simple example, with a Circle class, demonstrates why they are effective. First, the full class way of doing things:

import numpy as np


class Circle1:
    def __init__(self, colour: str, radius: float) -> None:
        self.colour = colour
        self.radius = radius

    def area(self) -> float:
        return np.pi * self.radius ** 2


circle1 = Circle1("red", 2)
circle1
<__main__.Circle1 at 0x7fe94ad17760>

We don’t get a very informative message when we call circle1, as you can see. At least we can compute its area:

circle1.area()
12.566370614359172

Now we’ll create the same object with dataclasses

from dataclasses import dataclass


@dataclass
class Circle2:
    colour: str
    radius: float

    def area(self) -> float:
        return np.pi * self.radius ** 2


circle2 = Circle2("blue", 2)
circle2
Circle2(colour='blue', radius=2)

Right away we get a much more informative message when we call the object, and the class definition is a whole lot simpler. Everything else is just the same (just try calling circle2.area()).
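Not quite everything, in fact: dataclasses also generate an __eq__ method for free, so two instances with the same field values compare equal, whereas instances of a plain class compare by identity. A quick sketch:

```python
from dataclasses import dataclass


@dataclass
class Circle2:
    colour: str
    radius: float


# Dataclasses auto-generate __eq__, so equality compares field values
print(Circle2("blue", 2) == Circle2("blue", 2))  # True
print(Circle2("blue", 2) == Circle2("red", 2))  # False
```

With a plain class like Circle1, the first comparison would be False because Python falls back to comparing object identities.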

Type annotations and type checkers#

Type annotations were introduced in Python 3.5 (these notes are written in 3.8). If you’ve seen more low-level languages, typing will be familiar to you. Python uses ‘duck typing’ (“if it walks and quacks like a duck, it is a duck”) which means that if a variable walks like an integer, and talks like an integer, then it gets treated as if it is an integer. Ditto for other variable types. Duck typing is useful if you just want to code quickly and aren’t writing production code.

But… there are times when you do know what variable types you’re going to be dealing with ahead of time and you want to prevent the propagation of the wrong kinds of variable types. In these situations, you can clearly say what variable types are supposed to be. And, when used with some other packages, typing can make code easier to understand, debug, and maintain.

Note that it doesn’t have to be all or nothing on type checking, you can just add it in gradually or where you think it’s most important.

Now it’s important to be really clear on one point, namely that Python does not enforce type annotations. But we can use static type checking to ensure all types are as they should be in advance of running. Before we do that, let’s see how we add type annotations.

This is the simplest example of a type annotation:

answer: int = 42

This explicitly says that answer is an integer. Type annotations can be used in functions too:

def increment(number: int) -> int:
    return number + 1
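It’s worth seeing for yourself that these annotations are hints, not checks: Python happily runs code that violates them. For instance, passing a float to the function above works fine at runtime, even though a static type checker would flag it:

```python
def increment(number: int) -> int:
    return number + 1


# The annotation says int, but Python does not enforce it at runtime;
# a float passes straight through (a static checker would complain)
print(increment(2.5))  # 3.5
```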

A static type checker uses these type annotations to verify the type correctness of a programme without executing it. mypy is the most widely used static type checker. After installing mypy, to run type checking on a file code_script.py use

mypy code_script.py

on the command line.

What do you see when you run it? Let’s say the content of your script is:

# Contents of code_script.py
def greeting(name: str) -> str:
    return 'Hello ' + name


greeting(3)

This would return:

Argument 1 to "greeting" has incompatible type "int"; expected "str"

Here are more of the type annotations that you might need or come across, courtesy of the mypy documentation:

from typing import List, Set, Dict, Tuple, Optional

# For simple built-in types, just use the name of the type
x: int = 1
x: float = 1.0
x: bool = True
x: str = "test"
x: bytes = b"test"

# For collections, the type of the collection item is in brackets
# (Python 3.9+ only)
x: list[int] = [1]
x: set[int] = {6, 7}

# In Python 3.8 and earlier, the name of the collection type is
# capitalized, and the type is imported from 'typing'
x: List[int] = [1]
x: Set[int] = {6, 7}

# Same as above, but with type comment syntax (Python 3.5 and earlier)
x = [1]  # type: List[int]

# For mappings, we need the types of both keys and values
x: dict[str, float] = {'field': 2.0}  # Python 3.9+
x: Dict[str, float] = {'field': 2.0}

# For tuples of fixed size, we specify the types of all the elements
x: tuple[int, str, float] = (3, "yes", 7.5)  # Python 3.9+
x: Tuple[int, str, float] = (3, "yes", 7.5)

# For tuples of variable size, we use one type and ellipsis
x: tuple[int, ...] = (1, 2, 3)  # Python 3.9+
x: Tuple[int, ...] = (1, 2, 3)

# Use Optional[] for values that could be None
x: Optional[str] = some_function()
# Mypy understands a value can't be None in an if-statement
if x is not None:
    print(x.upper())
# If a value can never be None due to some invariants, use an assert
assert x is not None
print(x.upper())

I am the Walrus#

The Walrus operator, :=, was introduced in Python 3.8 and, well, it’s fairly complicated but it does have its uses. The main use case for the Walrus operator is when you want to both evaluate an expression and assign a variable in one fell swoop.

Take this (trivial) example, which involves calling len(a) twice: once in the expression len(a) > 3, which returns a boolean, and once to assign the length to a variable n:

a = [1, 2, 3, 4]
if len(a) > 3:
    n = len(a)
    print(f"List is too long ({n} elements, expected <= 3)")
List is too long (4 elements, expected <= 3)

The Walrus operator allows us to skip the clumsy use of len(a) twice and do both steps in one go. As noted, that’s trivial here, but if evaluation were very computationally expensive, then this might save us some trouble. Here’s the version with the Walrus operator:

a = [1, 2, 3, 4]
if (n := len(a)) > 3:
    print(f"List is too long ({n} elements, expected <= 3)")
List is too long (4 elements, expected <= 3)
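Another common pattern is using the Walrus operator in a while loop, where it lets you grab the next value and test it in a single condition. Here’s a small sketch (the data and the stop-at-zero rule are made up for illustration):

```python
# Sum values from an iterator until a zero appears, assigning and
# testing each value in one go with the walrus operator
data = iter([3, 1, 4, 0, 5])
total = 0
while (value := next(data)) != 0:
    total += value
print(total)  # 8
```

Without :=, you would need to call next(data) both before the loop and again at the bottom of its body.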

Map, filter, and reduce#

Map, filter, and reduce are built-in higher order functions. Lambda functions, featured in the basics of coding chapter, can be passed into each of these as an argument, and some of the best use cases of lambda functions are in conjunction with map, filter, and reduce.

Map#

map takes a function and an iterable as arguments, ie the syntax is map(function, iterable). An iterable is a type that is composed of elements that can be iterated over. map essentially applies the function to each entry in the iterable. Here’s an example where a list of strings is cast to integers via map:

numbers_str = ["1", "2", "3", "4", "5"]
mapped_result = map(int, numbers_str)
list(mapped_result)
[1, 2, 3, 4, 5]

Here’s an example with a lambda function. The benefit of using a lambda in this map operation is that otherwise we would have to write a whole function that simply returned the input with .title() at the end:

names = ["robinson", "fawcett", "ostrom"]
names_titled = map(lambda name: name.title(), names)
list(names_titled)
['Robinson', 'Fawcett', 'Ostrom']

Filter#

filter applies a function that returns a boolean to each item of an iterable, keeping only the items for which the function returns True. It uses the filter(function, iterable) syntax. In the example below, we take all the numbers from zero to five and filter them according to whether they are divisible by 2:

numbers = list(range(6))
fil_result = filter(lambda x: x % 2 == 0, numbers)
list(fil_result)
[0, 2, 4]
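It’s worth knowing that a list comprehension with an if clause achieves the same result as filter; which you use is largely a matter of style:

```python
numbers = list(range(6))
# Equivalent to filter(lambda x: x % 2 == 0, numbers), but as a
# list comprehension with an if clause
evens = [x for x in numbers if x % 2 == 0]
print(evens)  # [0, 2, 4]
```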

Reduce#

reduce is defined in the built-in functools module. Like map and filter, reduce takes two parameters, a function and an iterable. However, it returns a single value rather than another iterable. The way reduce works is to apply operations successively so that the example below effectively first sums 2 and 3 to make 5, then 5 and 5 to make 10, then 10 and 15 to make 25, and, finally, 25 and 20 to make the final result of 45.

from functools import reduce

numbers = [2, 3, 5, 15, 20]

reduce(lambda x, y: x + y, numbers)
45
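reduce also accepts an optional third argument, an initial value that is placed before the items of the iterable in the calculation:

```python
from functools import reduce

numbers = [2, 3, 5, 15, 20]

# The third argument, 100, seeds the running total before the first
# element, so the result is 100 + 45 = 145
total = reduce(lambda x, y: x + y, numbers, 100)
print(total)  # 145
```

An initial value is also useful as a safe default when the iterable might be empty, in which case reduce would otherwise raise a TypeError.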

Non-local variables#

Non-local variables are used in nested functions as a means to say ‘hey, this variable is not just local to this nested function, it exists outside it too’. Here’s an example that prints “world” because we tell the inner function to use the same x as the outer function:

def outer_function():
    x = "hello"

    def nested_function():
        nonlocal x
        x = "world"

    nested_function()
    return x

print(outer_function())
world

Exercise

Re-write the above function without the nonlocal keyword. What does it print?

Multiple dispatch#

One can use object-oriented methods and inheritance to get different code objects to behave in different ways depending on the type of input. For example, a different behaviour might occur if you send a string into a function versus an integer. An alternative to the object-oriented approach is to use multiple dispatch. fastcore is a library that provides “goodies to make your coding faster, easier, and more maintainable” and has many neat features but amongst the goodies is multiple dispatch, with the typedispatch decorator. The example below doesn’t execute but shows you how the library can be used to define different behaviours for inputs of different types.

# fastcore is designed to be imported as *
from fastcore.dispatch import *


@typedispatch
def func_example(x: int, y: float):
    return x + y


@typedispatch
def func_example(x: int, y: int):
    return x * y


# Int and float
print(func_example(5, 5.0))

# Int and int
print(func_example(5, 5))

What we can see here is that we have the same function, func_example, used twice with very similar inputs. But the inputs are not the same: in the first instance it’s an integer and a float, while in the second it’s two integers. The different inputs get routed to the different versions of the @typedispatch function. This decorator-based approach is not the only way to use fastcore to do typed dispatch, but it’s one of the most convenient.
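If you don’t want an extra dependency, the standard library’s functools offers a more limited relative, singledispatch, which dispatches on the type of the first argument only. A minimal sketch (the function describe and its return strings are made up for illustration):

```python
from functools import singledispatch


# The undecorated function is the fallback for unregistered types
@singledispatch
def describe(x):
    return f"something else: {x}"


# Register implementations for specific types of the first argument
@describe.register
def _(x: int):
    return f"an integer: {x}"


@describe.register
def _(x: str):
    return f"a string: {x}"


print(describe(10))  # an integer: 10
print(describe("hi"))  # a string: hi
print(describe(3.5))  # something else: 3.5
```

Because singledispatch looks only at the first argument, it can’t distinguish the (int, float) versus (int, int) cases above; for that you need a multiple dispatch tool like fastcore’s.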