Writing Code#
In this chapter, we’ll look at the different ways you can both write and run code. This is something that can be very confusing if you’re just getting into programming.
There are different ways to write and run code that suit different needs. For example, for creating a reproducible pipeline of tasks or writing production-grade software, you might opt for a script, which is a file that is mostly code. You might even bundle that up in an installable package. But for sending instructions to a colleague or exploring a narrative, you might choose to write your code in a notebook because it can present text and code together more naturally than a script can.
We already met some ways to write and run code in previous chapters. Here, we’ll be a bit more systematic so that, by the end of the chapter, you’ll be comfortable writing code in both scripts (the most popular way) and notebooks. We’ll also look ahead to writing executable code chunks in markdown documents, which has some real strengths for communication to people who don’t need to see the code.
Let’s start with some definitions.
IDE, or integrated development environment: this is the application that you write code of all different kinds in (scripts, notebooks, markdown). This book recommends Visual Studio Code as an IDE. It’s got tons of helpful features, including support for many languages. Making an analogy with writing documents, VS Code is to programming languages what a word processor is to actual languages. JupyterLab is another IDE, but one which is geared towards the use of notebooks.
the interpreter: this is the programming language (eg Python) that has to be installed separately onto your computer. It is what takes your written commands and turns them into actions. VS Code and other IDEs will look for whatever interpreters are installed on your computer and use them to execute the code you’ve written.
scripts: these are files that almost exclusively contain code. They can be edited and run in an IDE or in the terminal. Python scripts always have the file extension .py.
notebooks: aka Jupyter Notebooks, these are files that can contain code and text in different blocks called “cells”. The code appears in code cells, while the text appears in markdown cells. You can have any number, order, or type of cells you want. The code parts can be run in an IDE either all at once or however you like. Jupyter Notebooks always have the file extension .ipynb. Notebooks can be exported to other formats, like Word documents, HTML pages, PDFs, and even slides!
the terminal: this is the text interface that you use to send instructions to your computer’s operating system. It’s typically what you use to install new packages, for example with commands like pip install packagename. Although your computer will come with a separate terminal application too, you can open a terminal in VS Code by clicking on Terminal > New Terminal at the top of your VS Code window (there’s a keyboard shortcut too, but it varies across systems).
markdown: this is a lightweight language that turns simple text commands into professional looking documents. It’s widely used by people who code. It’s also what’s used for the text cells in Jupyter Notebooks. When not in a notebook, files containing markdown always have the extension .md. With the Visual Studio Code markdown extensions installed, you can right-click within a Markdown file in VS Code and then select Markdown Preview Enhanced to see how the rendered document will look. The difference between HTML code and the same website viewed in a browser like Chrome is a good analogy for the difference between what you see in a .md file and what you see in the preview of that markdown file.
quarto markdown: this is a special variant of markdown, with file extension .qmd, that can be used to combine text and code that gets executed so that code outputs are inserted into final outputs such as HTML or PDF documents.
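To make the idea of an interpreter a little more concrete, the short sketch below asks Python which interpreter is actually running the code. It uses only the standard library; the function name describe_interpreter is made up for this illustration.

```python
import sys


def describe_interpreter():
    """Return the path and version of the Python interpreter running this code."""
    # sys.version_info holds (major, minor, micro, ...); keep the first three parts.
    version = ".".join(str(part) for part in sys.version_info[:3])
    return {"executable": sys.executable, "version": version}


info = describe_interpreter()
print(f"Interpreter: {info['executable']}")
print(f"Python version: {info['version']}")
```

If you have more than one Python installation, running this from different places (the terminal, the interactive window, a notebook) is a quick way to check which one each of them is using.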
Let’s now turn to all of the different ways you can write code in a fully-featured integrated development environment like Visual Studio Code. They each have pros and cons, and you’re likely to want to use them at different times. The table below sets out all of the different ways you can write, and execute, code.
If you’re looking for a typical workflow, this book recommends working with scripts (files that end in .py) and the VS Code interactive window. Remember, if you’re working with a .py file, you can always open the Visual Studio Code interactive window by right-clicking somewhere within the script and selecting ‘Run in interactive window’.
| What | How to use | Prerequisites | Pros | Cons |
|---|---|---|---|---|
| Script (.py file) | ‘Run in interactive window’ in an integrated development environment (IDE) | Python installation + an IDE with Python support, eg Visual Studio Code. | Can be run all-in-one or step-by-step as needed. Very powerful tools available to aid coding in scripts. De facto standard for production-quality code. Can be imported by other scripts. Version control friendly. | Not very good if you want to have lots of text alongside code. |
| Jupyter Notebook (.ipynb file) | Open the file with Visual Studio Code. | Use Visual Studio Code and the VS Code Jupyter extension. | Code and text can alternate in the same document. Rich outputs of code can be integrated into document. Can export to PDF, HTML, and more, with control over whether code inputs/outputs are shown, and either exported directly or via Quarto. Can be run all-in-one or step-by-step as needed. | Fussy to use with version control. Code and text cannot be mixed in same ‘cell’. Not easy to import in other code files. |
| Markdown with executable code chunks using Quarto (.qmd file) | To produce output, write in a mix of markdown and code blocks and then export using Quarto on the command line. | Installations of Python and Quarto, plus their dependencies. | Allows for true mixing of text and code. Can export to wide variety of other formats, such as PDF and HTML, with control over whether code inputs/outputs are shown. Version control friendly. | Cannot be imported by other code files. |
Some of the options above make use of the command line, a way to issue text-based instructions to your computer. Remember, the command line (aka the terminal) can be accessed via the Terminal app on Mac, the Command Prompt app on Windows, or ctrl + alt + t on Linux. To open up the command line within Visual Studio Code, you can use the keyboard shortcut ⌃ + ` (on Mac) or ctrl + ` (Windows/Linux), or click “View > Terminal”.
Now let’s look at each of these ways to run code in more detail using a common example: Hello World!
Scripts#
Most code is written in scripts and they should be your go-to.
We already met scripts, but let’s have a recap. Create a new file in Visual Studio Code called hello_world.py. In the Visual Studio Code editor, add a single line to the file:
print('Hello World!')
Save the file. To run the script, right-click and choose ‘Run current file in interactive window’, ‘Run current file in terminal’, or ‘Run selection/line in interactive window’. These options cover two different methods of running the script: in the IDE’s interactive window (VS Code in this case) or in the command line.
A typical workflow would be selecting some lines within a script, and then hitting ‘Run selection/line in interactive window’ or using the keyboard shortcut of shift + enter.
As an alternative to ‘Run current file in terminal’, you can open up the command line yourself, navigate to the folder containing the script, and run
python hello_world.py
which will execute the script.
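As scripts grow, a common pattern is to put code inside functions and to guard the part that should only run when the script is executed directly. The sketch below is one possible way to extend hello_world.py; the function name make_greeting is invented for this example. The comment lines starting with # %% are cell markers that VS Code's Python tooling recognises, which lets you run one chunk at a time in the interactive window.

```python
# %%
# A cell that defines a reusable function.
def make_greeting(name="World"):
    """Build the greeting text without printing it."""
    return f"Hello {name}!"


# %%
# This part runs when the file is executed directly (for example with
# `python hello_world.py`), but not when another script imports the file.
if __name__ == "__main__":
    print(make_greeting())
```

Another script in the same folder could then reuse the function with from hello_world import make_greeting, which is what the table above means by scripts being importable.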
Jupyter Notebooks#
Jupyter Notebooks are another popular way to write code, in addition to scripts (.py files). Notebooks mix code and text by having a series of “cells” that are either code or text. Jupyter Notebooks are for experimentation, tinkering, and keeping text and code together. They are the lab books of the coding world. This book is mostly written in Jupyter Notebooks, including this chapter! You can download the notebooks that make up most chapters of this book and run them on your own computer: look for the download symbol at the top of each page; “.ipynb” stands for IPython notebook.
The name, ‘Jupyter’, is a reference to the three original languages supported by Jupyter, which are Julia, Python, and R, and to Galileo’s notebooks recording the discovery of the moons of Jupiter. Jupyter notebooks now support a vast number of languages beyond the original three, including Ruby, Haskell, Go, Scala, Octave, Java, and more.
Writing Your First Notebook#
(As an alternative to the instructions here, you can use Google Colab to try a kind of notebook with no setup at all.)
To get started with Jupyter Notebooks, you’ll need to have a Python installation and to have run pip install jupyterlab on the command line (to install the packages needed for Jupyter Notebooks). Then, in Visual Studio Code, creating a new notebook is as easy as File -> New File -> Jupyter Notebook. Save your new notebook file as hello_world.ipynb. (You can just open any new file and name it with a .ipynb extension, but you’ll need to close the file and re-open it for VS Code to recognise that it’s a notebook.)
The notebook interface should automatically load and you’ll see options to create cells with plus signs labelled ‘Code’ and ‘Markdown’. A cell is an independent chunk of either code or text. Text cells use markdown, a lightweight language for creating text outputs that you will find out more about in Markdown.
Try adding print("hello world!") to the first (code) cell and hitting the play symbol on the left-hand side of the cell. You will be prompted to select a “kernel”, a version of Python on your system. For this, it doesn’t matter which kernel (Python interpreter) you use. In future, you may want to set the kernel using the “Select Kernel” option at the top right-hand side of the screen to tell Visual Studio Code what specific version of Python you want to use to execute any code.
Now add a markdown cell (“+ Markdown”) and enter:
# This is a title
## This is a subtitle
This notebook demonstrates printing 'hello world!' to screen.
Click the tick that appears at the top of this cell.
Now, for the next cell, choose code and write:
print('another code cell')
To run the notebook, you can choose to run all cells (usually a double play button at the top of the notebook page) or just one cell at a time (a play button beside a cell). ‘Running’ a markdown cell will render the markdown in display mode; running a code cell will execute it and insert the output below. When you play the first code cell, you should see the ‘hello world!’ message appear.
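Under the hood, an .ipynb file is a JSON document that stores the list of cells. The sketch below builds a dictionary resembling the three-cell notebook from this section and writes it to a temporary folder using only the standard library. The field names follow version 4 of the notebook format as far as we're aware, but treat the exact layout as an assumption and let VS Code or Jupyter create real notebooks for you. The helper names code_cell and markdown_cell are made up for this illustration.

```python
import json
import tempfile
from pathlib import Path


def code_cell(source):
    """A code cell that has not been run yet, so it has no outputs."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def markdown_cell(source):
    """A text cell written in markdown."""
    return {"cell_type": "markdown", "metadata": {}, "source": source}


notebook = {
    "cells": [
        code_cell('print("hello world!")'),
        markdown_cell(
            "# This is a title\n"
            "## This is a subtitle\n"
            "This notebook demonstrates printing 'hello world!' to screen."
        ),
        code_cell("print('another code cell')"),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Write the notebook to a temporary folder so nothing in your project is touched.
path = Path(tempfile.mkdtemp()) / "hello_world.ipynb"
path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")

# Read it back and list the cell types in order.
loaded = json.loads(path.read_text(encoding="utf-8"))
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)  # ['code', 'markdown', 'code']
```

Seeing the file as plain JSON also helps explain why notebooks are fussier under version control than scripts: outputs and metadata live in the same file as the code.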
Note that you can use the keyboard shortcut Shift+Enter to execute cells one-by-one instead of hitting the play button.
Jupyter Notebooks are versatile and popular for early exploration of ideas, especially in fields like data science. Jupyter Notebooks can easily be run in the cloud using a browser too (via Binder or Google Colab) without any prior installation. Although it’s not got any executable code in, the page you’re reading now can be loaded into Google Colab as a Jupyter Notebook by clicking ‘Colab’ under the rocket icon at the top of the page.
Exercise
What happens when you press “clear outputs”? What about “run all”?
One really nice feature of Jupyter Notebooks is that you can use them as the input files for Quarto instead of using .qmd files, and this opens up many export options and possibilities (like hiding some code inputs). You can find more information here (look for the guidance on Jupyter Notebooks aka .ipynb files) or look ahead to the chapters on Markdown and Combining Code and Text in Quarto Markdown.
You can try a Jupyter Notebook without installing anything online at https://jupyter.org/try. Click on Try Classic Notebook for a tutorial. If you get stuck with getting started with notebooks, there’s a more in-depth VS Code and Jupyter tutorial available here.
Tips when using Jupyter Notebooks#
Version control: if you are using version control, be wary of saving the outputs of Jupyter Notebooks when you only want to save code. Most IDEs that support Jupyter Notebooks have a clear outputs option. You can also automate this as a pre-commit git hook (if you don’t know what that is, don’t worry). You could also pair your notebook to a script or markdown file (covered in the next section). Outputs or not, Jupyter Notebooks will render on GitHub, the popular remote repository for source code.
Terminal commands: these can be run from inside a Jupyter Notebook by placing a ! in front of the command and executing the cell. For example, !ls lists the contents of the directory the notebook is in. You can also !pip install and !conda install in this way.
Magic commands: statements that begin with % are magic commands. %whos displays information about defined variables. %run script.py runs a script called script.py. %timeit times how long a single line of code takes to execute (%%timeit, with two percent signs, times a whole cell). Finally, you can see many more magic commands using %quickref.
Notebook cells can be executed in any sequence you choose. But if you’re planning to share your notebook or use it again for yourself, it’s good practice to check that its cells do what you want when run in sequence, from top to bottom.
There are tons of extensions to Jupyter Notebooks; you can find a list here. Of particular note is ipywidgets, which adds interactivity.
Get help info on a command by running it but with ? appended to the end.
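The version control tip above can also be scripted. The function below is a minimal sketch, assuming the notebook has been loaded as a dictionary in the version 4 layout, where code cells carry outputs and execution_count fields. The name strip_outputs and the tiny example notebook are made up for this illustration, and the ‘clear outputs’ option in your IDE does the same job without any code.

```python
import copy


def strip_outputs(notebook):
    """Return a copy of a notebook dictionary with code cell outputs removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny in-memory notebook with one executed code cell, for illustration only.
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": "hello world!\n"}
            ],
            "source": 'print("hello world!")',
        },
        {"cell_type": "markdown", "metadata": {}, "source": "# This is a title"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

cleaned = strip_outputs(example)
print(cleaned["cells"][0]["outputs"])  # the outputs list is now empty
print(len(example["cells"][0]["outputs"]))  # the original notebook is left untouched
```

Because the function works on a copy, you can compare the cleaned and original versions before deciding which one to save.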
Markdown with Executable Code Chunks#
This is by far the least common way of coding, though it has gained popularity in recent years and it’s great if you’re going to ultimately export to other formats like slides, documents, or even a website!
When you have much more text than code, even using Jupyter Notebooks can feel a bit onerous and, historically, editing the text in a notebook was a bit tedious, especially if you wanted to move cells around a lot. Markdown makes for a much more pleasant writing experience, but markdown on its own cannot execute code. If you want to combine reproducibility, text, code, and code outputs, there is a tool called Quarto that allows you to do this by adding executable code chunks to markdown.
As this is a bit more of an advanced topic, and as much about communication as it is about writing code, we’ll come back to how to do it in Combining Code and Text in Quarto Markdown.