# More Coding

## Introduction

This chapter covers more programming, building on Coding Basics. Some of it will come in useful as you do more in code.

This chapter has benefitted from the online book *Research Software Engineering with Python*, the official Python documentation, the excellent 30 days of Python, and the Hitchhiker’s Guide to Python.

## Sets

A set in coding is a collection of unordered and unindexed distinct elements (in analogy to the mathematical definition of a set). To define an empty set, use `set()`. Note that empty curly braces, `{}`, create an empty dictionary rather than a set:

```
st = set()
# careful: st = {} would create an empty dictionary, not a set
```

An empty set isn’t very interesting though! Here’s a set with some values in:

```
people_set = {"Robinson", "Fawcett", "Ostrom"}
```

What can we do with it? We can check its length using `len(people_set)`, and we can ask whether a particular entry is contained within it:

```
"Ostrom" in people_set
```

```
True
```

We can add multiple items or another set in place using `.update`, create a new combined set using `.union`, or add a single item using:

```
people_set.add("Martineau")
people_set
```

```
{'Fawcett', 'Martineau', 'Ostrom', 'Robinson'}
```

We can remove entries with `.remove(entry_name)` or, to remove and return an arbitrary entry (sets are unordered, so there is no ‘last’ entry), `.pop()`. You can easily convert between lists and sets:

```
list(people_set)
```

```
['Robinson', 'Fawcett', 'Ostrom', 'Martineau']
```
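One common use of this conversion is removing duplicates from a list. Here is a minimal sketch; the names `surnames` and `unique_surnames` are illustrative, and because sets are unordered the result is sorted to make the order predictable:

```python
surnames = ["Robinson", "Fawcett", "Ostrom", "Fawcett", "Robinson"]

# Converting to a set drops the duplicates; sorted() returns a list in a fixed order
unique_surnames = sorted(set(surnames))
print(unique_surnames)
```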

The real benefit of sets, though, is that they support set operations. The most important are `intersection`,

```
st1 = {"item1", "item2", "item3", "item4"}
st2 = {"item3", "item2"}
st1.intersection(st2)
```

```
{'item2', 'item3'}
```

`difference`,

```
st1 = {"item1", "item2", "item3", "item4"}
st2 = {"item2", "item3"}
st1.difference(st2)
```

```
{'item1', 'item4'}
```

and symmetric difference,

```
st1 = {"item1", "item2", "item3", "item4"}
st2 = {"item2", "item3"}
st2.symmetric_difference(st1)
```

```
{'item1', 'item4'}
```
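These methods also have operator shorthands. The sketch below adds an extra element, `"item5"`, to `st2` (an assumption made purely for illustration) so that the difference and the symmetric difference no longer coincide:

```python
st1 = {"item1", "item2", "item3", "item4"}
st2 = {"item2", "item3", "item5"}

common = st1 & st2       # same as st1.intersection(st2)
only_in_st1 = st1 - st2  # same as st1.difference(st2)
in_one_only = st1 ^ st2  # same as st1.symmetric_difference(st2)
combined = st1 | st2     # same as st1.union(st2)

print(sorted(only_in_st1))
print(sorted(in_one_only))
```

The difference keeps only what is in `st1` but not `st2`, whereas the symmetric difference also picks up `"item5"`, which appears only in `st2`.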

## Truthy and falsy values

Python objects can be used in expressions that will return a boolean value, such as when a list, `listy`, is used with `if listy`. Built-in Python objects that are empty are usually evaluated as `False`, and are said to be ‘falsy’. In contrast, when these built-in objects are not empty, they evaluate as `True` and are said to be ‘truthy’.

(If you are building your own classes, you can define this behaviour for them through the `__bool__` dunder method.)

Let’s see some examples:

```
def bool_check_var(input_variable):
    if not (input_variable):
        print("Falsy")
    else:
        print("Truthy")


listy = []
other_listy = [1, 2, 3]
bool_check_var(listy)
```

```
Falsy
```

```
bool_check_var(other_listy)
```

```
Truthy
```

The function we defined doesn’t just operate on lists; it’ll work for many other truthy and falsy objects:

```
bool_check_var(0)
```

```
Falsy
```

```
bool_check_var([0, 0, 0])
```

```
Truthy
```

Note that zero was falsy, since it is the ‘nothing’ of numbers, but a list of three zeros is not an empty list, so it evaluates as truthy.

```
bool_check_var({})
```

```
Falsy
```

```
bool_check_var(None)
```

```
Falsy
```
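For your own classes, the `__bool__` dunder method decides which way an object goes. Below is a minimal sketch using a hypothetical `Basket` class (not part of any library) that is truthy only when it contains something:

```python
class Basket:
    """A hypothetical container used only to illustrate __bool__."""

    def __init__(self, items=None):
        self.items = list(items) if items is not None else []

    def __bool__(self):
        # Truthy only when the basket holds at least one item
        return len(self.items) > 0


print(bool(Basket()))
print(bool(Basket(["apple"])))
```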

Knowing what is truthy or falsy is useful in practice; imagine you’d like to default to a specific behaviour if a list called `list_vals` doesn’t have any values in. You now know you can do it simply with `if not list_vals`.
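As a sketch of that pattern, the hypothetical helper `describe` below falls back to a default message when it is given an empty list (or `None`, which is also falsy):

```python
def describe(list_vals):
    """Hypothetical helper: fall back to a default when there are no values."""
    if not list_vals:
        # Empty lists and None are both falsy, so both end up here
        return "no values supplied"
    return f"{len(list_vals)} values supplied"


print(describe([]))
print(describe([4, 5]))
```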

## Lambda functions

Lambda functions are a very old idea in programming, and are part of the functional programming paradigm. Coding languages tend to be more imperative (a family that includes object-oriented languages) or functional, with the imperative approach tracing back to Alan Turing’s “Turing machines” and the functional approach to Alonzo Church’s “lambda calculus”. These two approaches are mathematically equivalent and, on a more practical note, high-level programming languages often mix both. As examples, Haskell is strongly a functional language, statistics language R leans toward being more functional, Python is slightly more object-oriented, and powerhouse languages like Fortran and C are predominantly imperative. However, despite being less functional than some languages, Python does have lambda functions, for example:

```
plus_one = lambda x: x + 1
plus_one(3)
```

```
4
```

For a one-liner function that has a name, it’s actually better practice here to use `def plus_one(x): return x + 1`, so you shouldn’t see this form of lambda function too much in the wild. However, you are likely to see lambda functions being used with dataframes and other objects. For example, if you had a dataframe with a column of strings called ‘strings’ that you want to change to “Title Case” and replace one phrase with another, you could use lambda functions to do that (there are better ways of doing this but this is useful as a simple example):

```
import pandas as pd

df = pd.DataFrame(
    data=[["hello my blah is Ada"], ["hElLo mY blah IS Adam"]],
    columns=["strings"],
    dtype="string",
)
df["strings"].apply(lambda x: x.title().replace("Blah", "Name"))
```

```
0 Hello My Name Is Ada
1 Hello My Name Is Adam
Name: strings, dtype: object
```

More complex lambda functions can be constructed, e.g. `lambda x, y, z: x + y + z`. One of the best use cases of lambdas is when you *don’t* want to go to the trouble of declaring a function. For example, let’s say you want to compose a series of functions and you want to specify those functions in a list, one after the other. Using functions alone, you’d have to define a new function for each operation. With lambdas, it would look like this (again, there are easier ways to do this operation, but we’ll use simple functions to demonstrate the principle):

```
number = 1
for func in [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]:
    number = func(number)
    print(number)
```

```
2
4
16
```

Note that people often use `x` by convention, but there’s nothing to stop you writing `lambda horses: horses**2` (apart from the looks your co-authors will give you).
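Another place lambdas tend to appear is as a throwaway `key` function when sorting. A minimal sketch, using made-up pairs of a label and a number:

```python
pairs = [("b", 3), ("a", 1), ("c", 2)]

# key= expects a function of one element; the lambda picks out the number
by_number = sorted(pairs, key=lambda pair: pair[1])
print(by_number)
```

Here the lambda saves declaring a named function that would only ever be used once.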

Exercise

Write a lambda function that takes the square root of an input number.

If you want to learn more about lambda functions, check out these short video tutorials.

## Splat and splatty-splat

You read those right, yes. These are also known as “unpacking operators” for iterables that are fed into functions as arguments (in the form of a tuple) and keyword arguments (in the form of a dictionary) respectively. Splat is `*` and splatty-splat is `**`. Because they unpack, they allow us to efficiently send packages of arguments or keyword arguments into functions without laboriously writing out every single argument.

Because a function receives its extra positional arguments as a tuple, `*` is most often used with a tuple, although any iterable (such as a list) will work. Because a function receives its keyword arguments as a dictionary of key, value pairs, `**` must be accompanied by a dictionary (or another mapping) whose keys are strings.

Let’s take a look at splat, which unpacks tuples into function arguments. If we have a function that takes two arguments we can send variables to it in different ways:

```
def add(a, b):
    return a + b


print(add(5, 10))
func_args = (6, 11)
print(add(*func_args))
```

```
15
17
```

The splat operator, `*`, unpacks the variable `func_args` into two different function arguments.

Perhaps surprisingly, we can use the splat operator *in the definition of a function*. For example, `sum_elements` below accepts any number of positional arguments and collects them into a tuple called `elements`:

```
def sum_elements(*elements):
    return sum(elements)


nums = (1, 2, 3)
print(sum_elements(*nums))
more_nums = (1, 2, 3, 4, 5)
print(sum_elements(*more_nums))
```

```
6
15
```
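To see what a starred parameter actually receives, the sketch below uses a hypothetical function, `describe_args`, that reports the type and length of whatever was collected:

```python
def describe_args(*elements):
    # Inside the function, the collected positional arguments form a tuple
    return type(elements).__name__, len(elements)


print(describe_args(1, 2, 3))
print(describe_args())
```

Calling the function with no arguments at all is allowed; the tuple is simply empty.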

Exercise

Write a function `multiply` that multiplies two input numbers, `a` and `b`, together and returns the answer. Send the argument `(10, 12)` to it using the splat operator.

Splatty-splat, `**`, unpacks dictionaries into keyword arguments (aka kwargs):

```
def function_with_kwargs(a, x=0, y=0, z=0):
    return a + x + y + z


print(function_with_kwargs(5))
kwargs = {"x": 3, "y": 4, "z": 5}
print(function_with_kwargs(5, **kwargs))
```

```
5
17
```
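Like `*`, the `**` operator can also appear in a function definition, where it collects any keyword arguments into a dictionary. A minimal sketch with a hypothetical function called `report`:

```python
def report(**settings):
    # settings is an ordinary dictionary of the keyword arguments passed in
    return sorted(settings.items())


print(report(x=3, y=4))
```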

Exercise

Using a dictionary and splatty-splat with the `function_with_kwargs` function, find the sum of 9, 6, 13, and 2.

## Higher order functions

Functions are like any other variable in Python, which means you can do some interesting things with them and, well, it can get a bit *meta*. For example, a function can take one or more functions as parameters, a function can be returned as a result of another function, functions can be defined within functions, a function can be assigned to a variable, and you can iterate over functions (for example, if they are in a list).

Here’s an example that shows how to use a higher order function: it accepts a function, `f`, as an argument and then, using the splat operator `*`, it accepts all arguments of that function.

```
def join_a_string(str_list):
    return " ".join(str_list)


def higher_order_function(f, *args):
    """Lowers case of result"""
    out_string = f(*args)
    return out_string.lower()


result = higher_order_function(join_a_string, ["Hello", "World!"])
print(result)
```

```
hello world!
```

In the next example, we show how to return a function from a function (assigning a function, `result`, to a variable in the process):

```
def square(x):
    return x ** 2


def cube(x):
    return x ** 3


def higher_order_function(type):  # a higher order function returning a function
    if type == "square":
        return square
    elif type == "cube":
        return cube


result = higher_order_function("square")
print(f"Using higher_order_function('square'), result(3) yields {result(3)}")
result = higher_order_function("cube")
print(f"Using higher_order_function('cube'), result(3) yields {result(3)}")
```

```
Using higher_order_function('square'), result(3) yields 9
Using higher_order_function('cube'), result(3) yields 27
```

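Functions within functions are allowed. They are known as nested (or inner) functions and, when an inner function makes use of variables from the enclosing function, as *closures*. Here’s a simple (if contrived) example of a nested function: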

```
from datetime import datetime


def print_time_now():
    def get_curr_time():
        return datetime.now().strftime("%H:%M")

    now = get_curr_time()
    print(now)


print_time_now()
```

```
18:04
```
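Inner functions become especially useful when they remember variables from the enclosing function, even after that function has returned. A minimal sketch, with hypothetical names `make_multiplier` and `multiply_by`:

```python
def make_multiplier(factor):
    def multiply_by(x):
        # factor is not defined here; it is remembered from the enclosing call
        return x * factor

    return multiply_by


double = make_multiplier(2)
triple = make_multiplier(3)
print(double(5), triple(5))
```

Each call to `make_multiplier` produces a separate function with its own remembered value of `factor`.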

Finally, let’s see how to iterate over functions:

```
def square_root(x):
    return x ** (0.5)


functions_list = [square_root, square, cube]
for func in functions_list:
    print(f"{func.__name__} applied to 4 is {func(4)}")
```

```
square_root applied to 4 is 2.0
square applied to 4 is 16
cube applied to 4 is 64
```

## Iterators

An iterator is an object that contains a countable number of values that a single command, `next`, iterates through. Before that’s possible though, we need to take an iterable of some kind (such as a list) and use the built-in `iter` function on it to turn it into an iterator. Let’s see an example with some text:

```
text_lst = ["Mumbai", "Delhi", "Bangalore"]
myiterator = iter(text_lst)
```

Okay, nothing has happened yet, but that’s because we haven’t asked for any values yet. To get the next value, whatever it is, use `next`:

```
next(myiterator)
```

```
'Mumbai'
```

```
next(myiterator)
```

```
'Delhi'
```

```
next(myiterator)
```

```
'Bangalore'
```

Alright, we’ve been through all of the values so… what’s going to happen `next`!?

```
next(myiterator)
```

```
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-27-29fb3b4dbbec> in <module>
----> 1 next(myiterator)
StopIteration:
```

Iterating beyond the end raises a `StopIteration` error because we reached the end. To keep going indefinitely, use `cycle` from the built-in **itertools** package in place of `iter`. Note that you can build your own iterators (here we used a built-in object type, the `list`, to create an iterator of type `list_iterator`).
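To build your own iterator, a class needs an `__iter__` method that returns the iterator and a `__next__` method that returns the next value or raises `StopIteration`. The `Countdown` class below is a hypothetical example written for illustration:

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= 0:
            # Signals to next() and to for loops that there are no more values
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


for number in Countdown(3):
    print(number)
```

A `for` loop calls `next` behind the scenes and stops quietly when `StopIteration` is raised, which is why no error appears here.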

## Generators

Generator functions return ‘lazy’ iterators. They are lazy because they do not store their contents in memory. This has *big* advantages for some operations in specific situations: datasets larger than can fit into your computer’s memory, or a complex function that needs to maintain an internal state every time it’s called.

To give an idea of how and when they work, imagine that (exogenously) integers are really costly, taking as much as 10 MB of space to store (the real figure for a small integer in CPython is more like 28 bytes). We will write a function that returns the first \(n\) non-negative integers, where \(n\) is large. The most naive possible way of doing this would be to build the full list in memory like so:

```
def first_n_naive(n):
    """Build and return a list"""
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums


sum_of_first_n = sum(first_n_naive(1000000))
sum_of_first_n
```

```
499999500000
```

Note that `nums` stores *every* number before returning all of them. In our imagined case, this is completely infeasible because we don’t have enough computer space to keep all \(n\) 10 MB integers in memory.

Now we’ll rewrite the list-based function as a generator-based function:

```
def first_n_generator(n):
    """A generator that yields items instead of returning a list"""
    num = 0
    while num < n:
        yield num
        num += 1


sum_of_first_n = sum(first_n_generator(1000000))
sum_of_first_n
```

```
499999500000
```

Now, instead of creating an enormous list that has to be stored in memory, we `yield` up each number as it is ‘generated’. The cleverness that’s going on here is that the ‘state’ of the function is remembered from one call to the next. This means that when `next` is called on a generator object (either explicitly or implicitly, as in this example), the previously yielded variable `num` is incremented, and then yielded again.
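Calling `next` by hand makes the remembered state visible. The sketch below repeats the generator definition so that it runs on its own; the variable names `gen`, `first`, `second`, and `remaining` are illustrative:

```python
def first_n_generator(n):
    """A generator that yields items instead of returning a list"""
    num = 0
    while num < n:
        yield num
        num += 1


gen = first_n_generator(3)
first = next(gen)      # runs the function body up to the first yield
second = next(gen)     # resumes after the yield, increments num, yields again
remaining = list(gen)  # consumes whatever is left
print(first, second, remaining)
```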

That was a fairly contrived example but there are plenty of practical ones. Working with pipelines that process very large datasets is a classic use case. For example, imagine you have a csv file that’s far too big to fit in memory, i.e. open all at once, but you’d like to check the contents of each row and perhaps process them. The code below would `yield` each row in turn.

```
def csv_reader(file_name):
    # the with block closes the file once all rows have been read
    with open(file_name, "r") as file:
        for row in file:
            yield row
```

An even more concise way of defining this is via a *generator expression*, which syntactically looks a lot like a *list comprehension* but is a generator rather than a list. The example we just saw would be written as:

```
csv_gen = (row for row in open(file_name))
```

It’s easier to see the difference in the example below, which clearly shows the analogy between *list comprehensions* and *generator expressions*.

```
sq_nums_lc = [num ** 2 for num in range(2, 6)]
sq_nums_lc
```

```
[4, 9, 16, 25]
```

```
sq_nums_gc = (num ** 2 for num in range(2, 6))
sq_nums_gc
```

```
<generator object <genexpr> at 0x7f875f7d44a0>
```

The latter is a generator object; its values are produced one at a time, either by calling `next` on it or by looping over it.

```
next(sq_nums_gc)
```

```
4
```

Note that for small numbers of entries, lists may actually be faster and more efficient than generators, but for large numbers of entries, generators will almost always win out on memory use.
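One rough way to see the memory difference is with `sys.getsizeof` from the standard library. This is only a sketch: `getsizeof` reports the size of the container itself (for a list, its array of references) and not the objects inside it, so it understates the true cost of the list. The value of `n` is an arbitrary choice.

```python
import sys

n = 100_000
as_list = [num ** 2 for num in range(n)]
as_gen = (num ** 2 for num in range(n))

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n
print(list_bytes > gen_bytes)
```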

## Decorators

Decorators ‘decorate’ functions: they adorn them, modifying their behaviour when they are called. Let’s say we want to run some numerical functions but we’d like to add ten on to whatever results we get. We could do it like this:

```
def multiply(num_one, num_two):
    return num_one * num_two


def add_ten(in_num):
    return in_num + 10


answer = add_ten(multiply(3, 4))
answer
```

```
22
```

This is fine for a one-off but a bit tedious if we’re going to be using `add_ten` a lot, and on many functions. Decorators allow for a more general solution that can be applied, in this case, to any `inner` function that has two arguments and returns a numeric value.

```
def add_ten(func):
    def inner(a, b):
        return func(a, b) + 10

    return inner


@add_ten
def multiply(num_one, num_two):
    return num_one * num_two


multiply(3, 4)
```

```
22
```
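The `@add_ten` line is shorthand for passing a function to the decorator and using what comes back. The sketch below spells that out by hand, storing the result under the illustrative name `decorated_multiply`:

```python
def add_ten(func):
    def inner(a, b):
        return func(a, b) + 10

    return inner


def multiply(num_one, num_two):
    return num_one * num_two


# Equivalent to writing @add_ten above the definition of multiply,
# except that the original multiply is still available under its own name
decorated_multiply = add_ten(multiply)
print(decorated_multiply(3, 4))
```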

We can use the same decorator for a different function (albeit one of the same form) now.

```
@add_ten
def divide(num_one, num_two):
    return num_one / num_two


divide(10, 5)
```

```
12.0
```

But the magic of decorators is such that we can define them for much more general cases, regardless of the number of arguments or even keyword arguments:

```
def add_ten(func):
    def inner(*args, **kwargs):
        print("Function has been decorated!")
        print("Adding ten...")
        return func(*args, **kwargs) + 10

    return inner


@add_ten
def combine_three_nums(a, b, c):
    return a * b - c


@add_ten
def combine_four_nums(a, b, c, d=0):
    return a * b - c - d


combine_three_nums(1, 2, 2)
```

```
Function has been decorated!
Adding ten...
```

```
10
```

Let’s now see it applied to a function with a different number of (keyword) arguments:

```
combine_four_nums(3, 4, 2, d=2)
```

```
Function has been decorated!
Adding ten...
```

```
18
```

Decorators can be chained too (and order matters):

```
def dividing_line(func):
    def inner(*args, **kwargs):
        print("".join(["-"] * 30))
        out = func(*args, **kwargs)
        return out

    return inner


@dividing_line
@add_ten
def multiply(num_one, num_two):
    return num_one * num_two


multiply(3, 5)
```

```
------------------------------
Function has been decorated!
Adding ten...
```

```
25
```
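One side effect of wrapping is that a decorated function reports the name of `inner` rather than its own. The standard library’s `functools.wraps` is commonly used to address this. A minimal sketch, reusing the `add_ten` idea without the print statements:

```python
from functools import wraps


def add_ten(func):
    @wraps(func)  # copies func's name and docstring onto inner
    def inner(*args, **kwargs):
        return func(*args, **kwargs) + 10

    return inner


@add_ten
def multiply(num_one, num_two):
    """Multiply two numbers."""
    return num_one * num_two


print(multiply.__name__)
print(multiply(3, 5))
```

Without the `@wraps(func)` line, `multiply.__name__` would be `'inner'`.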

## Time

Let’s do a quick dive into how to deal with dates and times. This is only going to scratch the surface, but should give a sense of what’s possible. For more, see the Introduction to Time chapter.

The built-in library that deals with datetimes is called `datetime`. Let’s import it and ask it to give us a very precise account of the datetime (when the code is executed):

```
from datetime import datetime
now = datetime.now()
print(now)
```

```
2022-10-28 18:04:21.023795
```

You can pick out bits of the datetime that you need:

```
day = now.day
month = now.month
year = now.year
hour = now.hour
minute = now.minute
print(f"{year}/{month}/{day}, {hour}:{minute}")
```

```
2022/10/28, 18:4
```
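Notice that the minute printed as `4` rather than `04`. The `strftime` method, or a format specification inside an f-string, gives zero-padded output. The sketch below uses a fixed, made-up datetime called `example` so that the result is reproducible:

```python
from datetime import datetime

example = datetime(2022, 10, 28, 18, 4, 21)

# %m, %d, %H and %M are all zero-padded to two digits
via_strftime = example.strftime("%Y/%m/%d, %H:%M")
via_fstring = f"{example:%Y/%m/%d, %H:%M}"
print(via_strftime)
print(via_fstring)
```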

Exercise

Using an f-string, add seconds to the date and time string above.

To add or subtract time to a datetime, use `timedelta`:

```
from datetime import timedelta
new_time = now + timedelta(days=365, hours=5)
print(new_time)
```

```
2023-10-28 23:04:21.023795
```

To take the difference of two dates:

```
from datetime import date

today = date.today()
new_year = date(year=today.year + 1, month=1, day=1)
time_till_ny = new_year - today
print(f"{time_till_ny.days} days until New Year")
```

```
65 days until New Year
```

Note that date and datetime are two different types of objects: a datetime includes information on the date and time, whereas a date holds only the date.

## Miscellaneous Fun

Here are some other bits of basic coding that might be useful. They really show why Python is such a delightful language.

You can use unicode characters for variables:

```
α = 15
β = 30
print(α / β)
```

```
0.5
```

You can swap variables in a single assignment:

```
a = 10
b = "This is a string"
a, b = b, a
print(a)
```

```
This is a string
```

**itertools** offers counting, repeating, cycling, chaining, and slicing. Here’s a cycling example that uses the built-in `next` function to get the next iteration:

```
from itertools import cycle
lorrys = ["red lorry", "yellow lorry"]
lorry_iter = cycle(lorrys)
print(next(lorry_iter))
print(next(lorry_iter))
print(next(lorry_iter))
```

```
red lorry
yellow lorry
red lorry
```

**itertools** also offers products, combinations, combinations with replacement, and permutations. Here are the combinations of ‘abc’ of length 2:

```
from itertools import combinations
print(list(combinations("abc", 2)))
```

```
[('a', 'b'), ('a', 'c'), ('b', 'c')]
```
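Permutations and products work in the same way. A short sketch, with inputs chosen only for illustration:

```python
from itertools import permutations, product

# Unlike combinations, permutations treat ('a', 'b') and ('b', 'a') as different
perms = list(permutations("abc", 2))
print(len(perms))

# product pairs every element of the first input with every element of the second
pairs = list(product("ab", [1, 2]))
print(pairs)
```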

Find out what the date is! (For a timezone-aware answer, `datetime.now` accepts a timezone as an argument.)

```
from datetime import date
print(date.today())
```

```
2022-10-28
```

Because functions are just objects, you can iterate over them just like any other object:

```
functions = [str.isdigit, str.islower, str.isupper]
raw_str = "asdfaa3fa"
for str_func in functions:
    print(f"Function name: {str_func.__name__}, value is:")
    print(str_func(raw_str))
```

```
Function name: isdigit, value is:
False
Function name: islower, value is:
True
Function name: isupper, value is:
False
```

Functions can be defined recursively. For instance, the Fibonacci sequence is defined such that \( a_n = a_{n-1} + a_{n-2} \) for \( n>1 \).

```
def fibonacci(n):
    if n < 0:
        print("Please enter n>=0")
        return 0
    elif n <= 1:
        return n
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)


[fibonacci(i) for i in range(10)]
```

```
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```
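The plain recursive version recomputes the same values many times over, which becomes slow as \( n \) grows. One common remedy, sketched here with the standard library’s `functools.lru_cache` decorator and an illustrative function name, `fibonacci_cached`, is to remember results that have already been computed:

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # remember every result computed so far
def fibonacci_cached(n):
    if n <= 1:
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)


print(fibonacci_cached(50))
```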