Control Flow
Now that we have the ability to store and work with data as variables, we can start to write code that reacts to the data it is working with.
All the code that we’ve written up to now has been executed in top-down order. But what if you only want to run some lines of code under certain conditions? For example, you might have a dataset of subjects from a medical study and want to treat patients over the age of 50 differently from the others. This is where we can use control flow to create branching programs that change and adapt their behaviour based on the data that they encounter.
The if Statement
The simplest form of control flow is the if statement. The if statement allows us to provide some condition and a chunk of code that executes if that condition is determined to be true.
age = 51
print(age)
if age > 50:
    print("Patient is over 50")

age = 20
print(age)
if age > 50:
    print("Patient is over 50")
This produces:
51
Patient is over 50
20
Here our condition is that age is greater than 50, but there are several other comparison operators as well: <, <=, >, >=, ==, and !=. Most of these should be fairly straightforward, with the possible exceptions of ==, which tests whether two values are equal (a single = assigns a value instead), and !=, which tests whether two values are not equal. Comparison operators always evaluate to a boolean value: True or False. Any Python expression that results in a boolean value can be used as a condition in an if statement.
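As a quick illustration of the point that comparisons evaluate to boolean values, the sketch below stores the result of a few comparisons in variables and prints them. The variable names (is_over_50 and so on) are made up for this example.

```python
age = 51

# Each comparison produces a boolean that can be stored like any other value
is_over_50 = age > 50        # True, because 51 is greater than 50
is_exactly_50 = age == 50    # False, because 51 is not equal to 50
is_not_50 = age != 50        # True, because 51 is not equal to 50

print(is_over_50, is_exactly_50, is_not_50)
print(type(is_over_50))

# A stored boolean can be used directly as the condition of an if statement
if is_over_50:
    print("Patient is over 50")
```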
Indentation in Python
One important thing to notice here is the indentation. In a lot of places, the so-called white space in our code is ignored when the code is evaluated. The following three statements all produce exactly the same behaviour:
print("hello")
print ("hello")
print ( "hello" )
This is not true when we talk about indentation (leading white space) in Python. When we define blocks of code, Python uses the indentation to determine where those blocks begin and end. In general, it’s good practice to indent your code with either tabs or spaces, but never to mix the two. Mixing them can cause all kinds of formatting issues when you move the code between different editors.
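To make the role of indentation concrete, here is a small sketch in which one line sits inside the if block and another sits outside it. The variables flagged and checked are hypothetical names used only to record which lines ran.

```python
age = 20
flagged = False
checked = False

if age > 50:
    print("Patient is over 50")
    flagged = True      # indented: part of the if block, skipped when age is 20
checked = True          # not indented: outside the block, so it always runs

print(flagged)
print(checked)
```

Moving the last assignment one level to the right would make it part of the if block, and it would then be skipped for this patient as well.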
If we’ve got more complex conditions to test for, we can combine conditions using logical operators:
age = 51
if age >= 50 and age < 60:
    print("50 is the new 40")
An expression using an and operator evaluates to True if both of the expressions evaluate to True.
There’s also an or operator, which evaluates to True if either one or both of the expressions evaluate to True:
age = 5
if age <= 10 or age >= 90:
    print("Patient is either very young or very old.")
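Like comparisons, expressions built with and and or evaluate to a boolean. The sketch below, using made-up variable names, evaluates both of the conditions shown above for a single age so the two results can be compared side by side.

```python
age = 95

# and needs both sides to be True: 95 >= 50 is True, but 95 < 60 is False
in_fifties = age >= 50 and age < 60

# or needs at least one side to be True: 95 <= 10 is False, but 95 >= 90 is True
very_young_or_old = age <= 10 or age >= 90

print(in_fifties)
print(very_young_or_old)
```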
It’s very common to have one set of code to run if our conditions are true and another to run if our conditions are false. This is so common that there’s a shorthand syntax for it, the else keyword:
age = 51
if age >= 50:
    print("Patient is 50 or over")
else:
    print("Patient is under 50")
It’s also very common to chain if statements together to evaluate complex conditions using another keyword: elif. elif is just a combination of else and if, which means its condition is only evaluated if all of the conditions before it evaluate to False.
age = 5
if age < 13:
    print("Child")
elif age < 19:
    print("Teen")
else:
    print("Adult")
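One way to see why the chaining matters is to compare an if/elif chain with a series of separate if statements. In the sketch below, the names group and matches are assumptions made for illustration. For an age of 5, both age < 13 and age < 19 are true, but the chain stops at the first match.

```python
age = 5

# Chained with elif: only the first branch whose condition is true runs
if age < 13:
    group = "Child"
elif age < 19:
    group = "Teen"
else:
    group = "Adult"
print(group)

# Separate if statements: every condition is tested independently
matches = 0
if age < 13:
    matches = matches + 1
if age < 19:
    matches = matches + 1
print(matches)
```

Because the separate if statements each run their own test, a 5 year old matches both of them, which is why the order of conditions in an elif chain should go from the most specific to the least specific.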
Building on our temperature conversion example, we might have a dataset where the temperatures aren’t all recorded in the same units. This sort of thing can be quite common when the data is generated by a team of researchers or collected from a number of different sources.
Consider a list of temperatures and a list of the units that those temperatures are in:
temps = [72, 20, 68, 100]
units = ['F', 'C', 'F', 'C']
Recall that we can access any individual element in a list using the square bracket notation and its numerical index:
print(temps[0])
print(units[0])
Say that we want to standardize all of our temperatures to Celsius. Let’s start by writing the code just to check and convert the first value:
temps = [72, 20, 68, 100]
units = ['F', 'C', 'F', 'C']

if units[0] == 'F':
    temp_in_c = (temps[0] - 32) * (5/9)
    temps[0] = temp_in_c
    units[0] = 'C'

print(temps)
Now that we’ve got it working for the first value, we can repeat the process for the remaining values:
temps = [72, 20, 68, 100]
units = ['F', 'C', 'F', 'C']

if units[0] == 'F':
    temp_in_c = (temps[0] - 32) * (5/9)
    temps[0] = temp_in_c
    units[0] = 'C'

if units[1] == 'F':
    temp_in_c = (temps[1] - 32) * (5/9)
    temps[1] = temp_in_c
    units[1] = 'C'

if units[2] == 'F':
    temp_in_c = (temps[2] - 32) * (5/9)
    temps[2] = temp_in_c
    units[2] = 'C'

if units[3] == 'F':
    temp_in_c = (temps[3] - 32) * (5/9)
    temps[3] = temp_in_c
    units[3] = 'C'

print(temps)
That works great, but does anyone see any problems here?
We’re writing very similar code over and over again. It’s repetitive and fragile. What happens if we want to run this same code again on a different set of inputs, but this time we have 3 values in our dataset? What about 5 values? Or 500?
Looping
These sorts of repetitive tasks are where another kind of control flow, looping, becomes really useful. Loops let us write code once and then repeat it over and over:
names = ["Mike", "Leo", "Don", "Raph"]
for name in names:
    print("Hello " + name)
This is a simple example, but it demonstrates the power of loops. We’ve written one line of generic code, and using the loop we’ve managed to run that code multiple times without having to write it out each time.
Lists in Python are what we call iterable, which means that they are designed to be used in a loop that processes one element at a time. The Python for loop can work with anything that’s iterable, such as a list or a string. The for loop will repeat the body of the loop once for each value in the iterable. As it does, it creates a new variable, which we have called name, and gives that variable the value of the current element from the iterable. We can then use that variable inside the body of the loop.
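Since a string is also iterable, the same for syntax walks through it one character at a time. The following is a minimal sketch. The variable names element, letter and letter_count are chosen only for this example.

```python
element = "zinc"
letter_count = 0

# The loop variable letter takes the value of each character in turn
for letter in element:
    print(letter)
    letter_count = letter_count + 1

print(letter_count)
```

After the loop finishes, the loop variable still exists and holds the last value it was given, here the final character of the string.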
Check Your Understanding
How might we use a for loop to convert a list of temperatures in Fahrenheit to Celsius?
# Start With:
temps = [72, 62, 68, 100]
# Store the converted temperatures in this list
temps_in_c = []
# Note that you can add a new element to a list by calling its append method:
# temps_in_c.append(22)
# Your code here:
Solution:
temps_in_f = [72, 62, 68, 100]
temps_in_c = []

for temp in temps_in_f:
    temp_c = (temp - 32) * (5/9)
    temps_in_c.append(temp_c)

print(temps_in_c)
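The same pattern can be applied to the mixed-unit lists from earlier by combining a for loop with an if statement. The sketch below is one possible approach, not the only one. It relies on the built-in functions range and len, which have not been covered in this lesson, to loop over index positions so that both lists can be updated in place. The loop variable i is just a name chosen for this example.

```python
temps = [72, 20, 68, 100]
units = ['F', 'C', 'F', 'C']

# range(len(temps)) yields the indexes 0, 1, 2, 3, one per loop iteration
for i in range(len(temps)):
    # Only convert the readings that are still recorded in Fahrenheit
    if units[i] == 'F':
        temps[i] = (temps[i] - 32) * (5/9)
        units[i] = 'C'

print(temps)
print(units)
```

Because the loop adapts to the length of the list, the same code should work unchanged for 3 values, 5 values, or 500.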
Another kind of loop is the while loop, which is like a loop combined with an if statement. The while loop lets us define some condition the same way we would in an if statement, and then the loop will continue to operate as long as that condition remains true:
count = 0
while count < 4:
    print(count)
    count = count + 1
As a more complex example using temperatures, say we want to count how many days pass before the running average temperature reaches some threshold value:
temps = [62, 72, 70, 74, 77, 71, 72]
average_temp = 0
running_sum = 0
index = 0

while average_temp < 70:
    running_sum = running_sum + temps[index]
    average_temp = running_sum / (index + 1)
    index = index + 1

print(f"Threshold passed after {index} days.")
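One thing to watch for with a loop like this: if the running average never reaches the threshold, index eventually moves past the end of the list and Python raises an IndexError. A possible safeguard, sketched below with hypothetical data and a threshold variable introduced for this example, is to add a second condition that stops the loop when the readings run out. It uses the built-in len function to get the length of the list.

```python
temps = [62, 64, 63]   # hypothetical readings whose average never reaches 70
threshold = 70
average_temp = 0
running_sum = 0
index = 0

# Stop when the threshold is reached OR when we run out of readings
while index < len(temps) and average_temp < threshold:
    running_sum = running_sum + temps[index]
    average_temp = running_sum / (index + 1)
    index = index + 1

# Check which of the two conditions ended the loop
if average_temp >= threshold:
    print(f"Threshold passed after {index} days.")
else:
    print(f"Threshold not reached in {index} days.")
```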
Check Your Understanding
Write a while loop that counts the number of characters in a string before the first letter "z".
# Start With:
elements = "nickel copper zinc gallium"
# Your code here:
Solution:
elements = "nickel copper zinc gallium"
count = 0
letter = elements[0]

while letter != 'z':
    count = count + 1
    letter = elements[count]

print(count, "characters before z")
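The solution above counts every character, including the spaces between words. As a variation that combines a while loop with an if statement, the sketch below counts only the non-space characters before the first "z". The names index and letter_count are assumptions for this example, and the len check is an added safeguard so the loop also ends cleanly for a string that contains no "z".

```python
elements = "nickel copper zinc gallium"
index = 0
letter_count = 0

# The length check comes first, so elements[index] is never read past the end
while index < len(elements) and elements[index] != 'z':
    # Count the character only if it is not a space
    if elements[index] != ' ':
        letter_count = letter_count + 1
    index = index + 1

print(letter_count, "letters before z")
```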