A list in Python stores many values in a single structure.
We have seen how a variable can store a single value. If we have a dataset, e.g. the pressure in our experiment sampled every minute, we could use one variable for each value: pressure_000, pressure_001, pressure_002, ...
But writing these out by hand would be slow and not much fun, especially if we have tens, thousands, or even millions of values in our dataset.
A list is the name of Python's solution to this problem (other languages have arrays or vectors to do the same sort of thing).
Typically lists will be used to store related values as in the example of our experimental pressures.
Structure of a list: the values are separated by commas (,) and enclosed in square brackets ([ and ]).
Let's create the list of pressures we described above:
pressures = [0.273, 0.275, 0.277, 0.275, 0.276]
print('pressures:', pressures)
As with the variables containing a single value that we have used until now, we assign the list to a variable and give it a meaningful name, pressures.
When we print the variable that has been used to store the list, Python displays the entire list, including all of the values that have been assigned and the square brackets which we used to create the list.
We can confirm the type of what is in our variable as we did in the previous episode:
print(type(pressures))
The output confirms that the variable holds a list.
We can also use the built-in function len() to find out how many values are in the list:
print(len(pressures))
What makes lists useful, however, is being able to retrieve individual values. We do this by indexing particular values in the list, as follows:
print('zeroth item of pressures:', pressures[0])
print('fourth item of pressures:', pressures[4])
Notice that in Python the 'first item' in the list has the index 0.
There are reasons for this which we will not go into today, so for now we just need to remember it.
If we forget and try to reference the value at index 5 of our list of pressures, we will get an "index out of range" error message:
print('fifth item of pressures:', pressures[5])
This code causes an error. If we examine the error message, starting from the bottom, we find that Python has reported IndexError: list index out of range.
This is quite informative, and if we look at the lines above it we find an indication of the line where the error happened, helping us to identify what the problem is likely to be.
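As a minimal sketch of how this error relates to the length of the list, the example below works out the last valid index from len() and then catches the IndexError rather than letting the program stop. The variable names last_valid_index, value, and message are chosen for this illustration only.

```python
pressures = [0.273, 0.275, 0.277, 0.275, 0.276]

# Valid indexes run from 0 up to len(pressures) - 1.
last_valid_index = len(pressures) - 1
print('last valid index:', last_valid_index)

# Asking for index 5 fails, because the list only has indexes 0 to 4.
try:
    value = pressures[5]
except IndexError as error:
    # Keep the program running and record what went wrong.
    value = None
    message = str(error)
    print('could not read index 5:', message)
```

Catching the error is only one option; comparing the index with len(pressures) before using it would work just as well.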
Values stored in a list can be changed by assigning a new value to that index:
pressures[0] = 0.265
print('pressures is now:', pressures)
We can see that the first, or rather zeroth, item in the list has been changed, but all other values are unaffected.
Often, rather than changing one value stored in a list, we may want to add new items. For instance, let's start by creating a new list containing the names of the patients on a medical trial:
patients = ["Alan", "Ben", "Charlie", "Derick"]
print("patients:", patients)
To add, or append, items to the list we use the following syntax:
list_name.append(new_value)
In the case of our example of the list of patients we would write:
patients.append("Edward")
We can experiment with this below:
print("Initial patients:", patients)
print("Initial number of patients:", len(patients))
patients.append("Edward")
patients.append("Frank")
print("Current patients:", patients)
print("Current number of patients:", len(patients))
As well as adding patients to the list, we might want to combine two lists.
For instance, my colleague might have a second list of patients that they have been treating that I want to add to my list.
Python allows us to combine one list with another using the extend method:
list_1.extend(list_2)
which adds the whole of list_2 to the end of list_1.
In our example we first need to create the new list, then add it to our existing list of patients:
patients_2 = ["Greg", "Henry", "Ian", "Jeff"]
print("First list of patients:", patients)
print("Second list of patients:", patients_2)
patients.extend(patients_2)
print("Combined list of patients:", patients)
So far each list we have created has contained values of a single type, and typically when working with datasets this will be the case.
However, Python lists do not have to contain values of a single type, and an item in a list can even be another list.
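As a small illustration of mixed types, the sketch below builds a list holding a float, a string, an integer, and another list. The name mixed and the values in it are made up for this example.

```python
# A list whose items have different types, including a nested list.
mixed = [0.273, "Alan", 42, ["x", "y"]]

print(type(mixed[0]))  # a float
print(type(mixed[1]))  # a string
print(type(mixed[3]))  # a list stored as a single item

# The nested list counts as one item of the outer list.
print('number of items:', len(mixed))

# A second index reaches inside the nested list.
print('zeroth item of the nested list:', mixed[3][0])
```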
We have a third colleague with their own set of patients in the trial, which we will now create.
This time, instead of using extend to combine the lists, we will see what happens if we append the new list by mistake:
patients_3 = ["Keiran", "Luke", "Mike", "Nigel"]
patients.append(patients_3)
print("Combined list of patients:", patients)
If you examine this carefully you will see that instead of adding the new patients separately to the end of the list, we have added one extra item to the end of the list patients, and that last item is itself a list!
To see this more clearly, let's examine the last item in the list patients.
We could find out how long the list is now with len(patients); however, Python provides a convenient way of indexing from the end of the list, beginning with list_name[-1].
Let's print this last item and the one before it to see what has been stored in patients:
print("The last item in patients:", patients[-1])
print("The penultimate item in patients:", patients[-2])
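To make the link between negative indexes and the length of the list explicit, here is a short sketch on a made-up list called letters. It shows that index -1 refers to the same item as index len(letters) - 1.

```python
letters = ["a", "b", "c", "d"]

# Two ways of reaching the last item.
last_by_length = letters[len(letters) - 1]
last_by_negative = letters[-1]
print('last item:', last_by_length, last_by_negative)

# Negative indexes count backwards from the end of the list.
penultimate = letters[-2]
print('penultimate item:', penultimate)

# The most negative valid index refers to the zeroth item.
zeroth = letters[-len(letters)]
print('zeroth item:', zeroth)
```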
We can see now that rather than combining the lists as we'd intended, we have appended patients_3 as a single item in patients. To correct this we need to delete the final item in the list and combine the new list correctly:
print("The last item in patients:", patients[-1])
del patients[-1]
print("The last item in patients:", patients[-1])
patients.extend(patients_3)
print("The last item in patients:", patients[-1])
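The difference between the two methods can be seen side by side in the following sketch, which starts from two identical short lists. The names appended and extended are chosen for this illustration.

```python
new_patients = ["Charlie", "Derick"]

# append adds its argument as ONE new item, even if it is a list.
appended = ["Alan", "Ben"]
appended.append(new_patients)
print('after append:', appended, 'length:', len(appended))

# extend adds each item of its argument separately.
extended = ["Alan", "Ben"]
extended.extend(new_patients)
print('after extend:', extended, 'length:', len(extended))
```

Comparing the lengths is a quick way to check which of the two happened: here append gives 3 items and extend gives 4.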
Having finally combined our lists of patients correctly, we would like to split them up between the two doctors, Alice and Beth, who will be treating them next. So far we have accessed a single item in a list by specifying a single index. We can produce slices of lists by giving a range of indexes as follows:
patients_alice = patients[0:6]
patients_beth = patients[6:14]
print("Alice will be treating:", patients_alice)
print("Beth will be treating:", patients_beth)
When we create a slice such as list_slice = list_name[low:high], low is the index of the item in list_name that will be the first item in list_slice, while high is the index of the first item in list_name that will not be in list_slice.
If we are slicing from the beginning or to the end of the list, we can omit low or high respectively from our slice and they will be inferred by Python:
patients_alice = patients[:7]
patients_beth = patients[7:]
print("Alice will be treating:", patients_alice)
print("Beth will be treating:", patients_beth)
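The rule for low and high can be checked on a small made-up list. The sketch below, which uses the hypothetical names values, middle, start, and rest, shows that a slice contains high - low items and that slicing leaves the original list unchanged.

```python
values = [10, 20, 30, 40, 50, 60]
low = 1
high = 4

# Items at indexes 1, 2 and 3 are included; index 4 is not.
middle = values[low:high]
print('middle:', middle)
print('number of items in slice:', len(middle))

# Omitting low starts at the beginning; omitting high runs to the end.
start = values[:low]
rest = values[low:]
print('start:', start, 'rest:', rest)

# The original list is not modified by slicing.
print('values is still:', values)
```

Because high is excluded, the two slices values[:low] and values[low:] never overlap and together contain every item exactly once.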
Sometimes we want to create an empty list, ready to have items or other lists added to it. To do this we use square brackets with nothing inside them: [].
empty_list = []
print("An empty list:", empty_list, "contains", len(empty_list), "items.")
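A typical use of an empty list is to fill it as values become available. The following sketch uses a made-up list called readings and adds to it with append and extend.

```python
# Start with nothing, then add values one at a time.
readings = []
readings.append(0.273)
readings.append(0.275)
print('after two appends:', readings)

# Several values can be added at once with extend.
readings.extend([0.277, 0.275])
print('after extend:', readings, 'contains', len(readings), 'items.')
```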
In the previous episodes we have seen that strings are sequences of characters. These can be indexed in a similar way to lists, where each index refers to a different character in the string.
element = 'carbon'
print('zeroth character:', element[0])
print('third character:', element[3])
There are subtle differences, however, between the way that Python treats strings and lists: once a string has been created it cannot be changed. Practically this means that you cannot reassign an element of a string, only access it, and you cannot append or extend one string with another.
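The sketch below shows what this looks like in practice. Trying to reassign a character raises a TypeError, so the usual approach is to build a new string instead. The name capitalised is chosen for this illustration.

```python
element = 'carbon'

# Reassigning one character of a string is not allowed.
try:
    element[0] = 'C'
except TypeError as error:
    print('cannot change a string in place:', error)

# The original string is unchanged.
print('element is still:', element)

# Instead, build a NEW string from a new character and a slice of the old one.
capitalised = 'C' + element[1:]
print('new string:', capitalised)
```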