Introduction

In this post we’ll look at lists. A list is an ordered collection of items to which you can add items and from which you can remove them. It is one of the most important data structures, and you will use it a lot in your programs. Items in a list can be of any data type, e.g. numbers, strings, custom objects, dictionaries or even other lists. Take a look at the examples below:

brands = ["samsung", "sony", "lg"]
ages = [22, 35, 55, 34, 63]
empty_list = []

In Python, you create a list with a pair of square brackets [], separating the items with commas. In line 1 we’ve defined a list called brands which contains strings. Similarly, ages is a list of integers and empty_list is an empty list.

In the brands and ages lists, all items have the same data type. But a single list can also store items of multiple types. In statically typed programming languages like C++, Java and C#, a list is normally declared with one element type, i.e. it is either a list of strings or a list of integers, but in Python there is no such restriction. For example,

my_list = ["samsung", "sony", 11, 12.004, ["python", "c++"]]

my_list now contains two strings, one integer, one float and another list which itself contains two strings. While this is a perfectly valid Python list, it is generally not advisable to put items of different data types in the same list.

Indexing

So, we have a list, but how do we access the elements in it? There are two main ways to extract items from a list: indexing, which gets a single element, and slicing, which gets a range of elements.

Let’s look at some examples of getting a single element from the list.

brands = ["samsung", "sony", "lg"]
print(brands[0])
print(brands[1])
print(brands[2])

The program will print the following

samsung
sony
lg

The syntax for selecting an element from a list is list[index] where index is the position of the element in the list. Note that in Python and many other programming languages, the position starts from 0. This is also called “zero based indexing”. So the first element has index 0, second has 1 and so on.

In lines 2, 3 and 4 we extracted the first, second and third items in the list respectively.

:warning: It is quite easy to forget about 0 based indexing especially when you are a beginner. With some practice, you will get used to this idea.
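One way to keep zero based indexing straight is to remember that the last valid index is always one less than the number of items. The short sketch below illustrates this with the built-in len() function; the names last_index and message are only for illustration, and the try/except part simply shows what happens when an index is too large.

```python
brands = ["samsung", "sony", "lg"]

# len() returns the number of items, so valid indexes run from 0 to len(brands) - 1
last_index = len(brands) - 1
print(last_index)          # 2
print(brands[last_index])  # lg

# Asking for an index past the end of the list raises an IndexError
try:
    brands[3]
except IndexError as error:
    message = str(error)
    print(message)         # list index out of range
```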

Python also has a cool feature that allows you to index from the end of the list. To get the last item in the list, you can use -1 as the index, to get the second-to-last item use -2 and so on. Take a look at the example below:

brands = ["samsung", "sony", "lg"]
print(brands[-1])
print(brands[-2])

The program will print the following

lg
sony

In the examples, I’ve printed the values directly after indexing. But remember that you can also assign them to another variable. In the code below, I select the first brand, which is “samsung”, assign it to a variable called samsung_brand and then print it:

samsung_brand = brands[0]
print(samsung_brand)

Slicing

So we’ve looked at how to extract a single item from a list, but we can also extract several items at once. This is called slicing and the syntax is list[start_index:end_index]. If you remember, this is exactly the same as with strings. You can think of a slice as a range of index values. When you slice a list you get a new list that contains only the items that fall within the range given in the slice.

The table below summarizes different variations of slicing. Assume that we have a list like below

brands = ["samsung", "sony", "lg", "nokia"]

| code | result | remarks |
| --- | --- | --- |
| brands[0:2] | ["samsung", "sony"] | select the first two brands, i.e. items from index 0 up to but not including index 2 |
| brands[:2] | ["samsung", "sony"] | same as above; since we start from index 0, we can omit the 0 and Python will assume you want to select from the beginning |
| brands[1:4] | ["sony", "lg", "nokia"] | select items from index 1 (the second item) up to but not including index 4 |
| brands[1:] | ["sony", "lg", "nokia"] | same as above; since we want to select until the end, we can omit the end_index and Python will assume you want to select till the end |
| brands[-2:] | ["lg", "nokia"] | negative indexes work in slices as well; this selects from the second-to-last item till the end |
| brands[-3:-1] | ["sony", "lg"] | select from the third-to-last item up to but not including the last one |

:warning: As you might have noticed, the end_index is not inclusive. It is a common mistake to assume that it’s inclusive.
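The point that a slice is a new list can be checked with a small experiment. In the sketch below, the name first_two and the value "apple" are only for illustration: changing the slice leaves the original list untouched. It also shows that an end_index larger than the list does not cause an error in a slice.

```python
brands = ["samsung", "sony", "lg", "nokia"]

first_two = brands[:2]
print(first_two)      # ['samsung', 'sony']

# Changing the slice does not change the original list
first_two[0] = "apple"
print(first_two)      # ['apple', 'sony']
print(brands)         # ['samsung', 'sony', 'lg', 'nokia']

# In a slice, an end index past the end of the list is fine:
# Python simply stops at the last item
rest = brands[1:100]
print(rest)           # ['sony', 'lg', 'nokia']
```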

Adding items

Adding items to a list is a pretty common operation and is quite simple in Python. There are two list methods that you need: append and insert.

  • append(item) adds the item to the end of the list
  • insert(index, item) inserts the item at the given index (remember that indexes start from 0)

brands = []
print(brands)

brands.append("samsung")
brands.append("sony")

print(brands)

# insert "lg" in second position of list
brands.insert(1, "lg")
print(brands)

The code above prints the following:

[]
['samsung', 'sony']
['samsung', 'lg', 'sony']

At line 1, we have a list called brands which is currently empty. Just to show you it’s empty, we print it. Next we call append twice to add “samsung” and “sony” one after another. When we print the brands list now, you should see ['samsung', 'sony']. Since we appended “sony” after “samsung”, it is the second item in our list.

We don’t always want to add items to the end of the list. To put an item somewhere else we use the insert method, like brands.insert(1, "lg"). This line inserts “lg” in the second position and moves “sony”, which was in the second position before, to the end of the list. Again, insert places the item at the given index, and in Python we use 0 based indexing. So if you want to insert an item in the first place you should use insert(0, "my item").
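As a quick check of the insert(0, ...) case, here is a minimal sketch using the same brand names. The last call is an extra illustration: inserting at index len(brands) has the same effect as append.

```python
brands = ["samsung", "sony"]

# Index 0 is the first position, so "lg" goes in front of everything else
brands.insert(0, "lg")
print(brands)  # ['lg', 'samsung', 'sony']

# Inserting at len(brands) puts the item at the end, just like append
brands.insert(len(brands), "nokia")
print(brands)  # ['lg', 'samsung', 'sony', 'nokia']
```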

Deleting items

Deleting from a list is also pretty simple: use the list’s remove method. You have to be careful that the item you want to delete exists in the list, otherwise Python will raise an error.

brands = ["samsung", "lg", "sony", "nokia"]
brands.remove("lg")

print(brands)
brands.remove("no brand")

The code above prints the following

['samsung', 'sony', 'nokia']
Traceback (most recent call last):
  File "main.py", line 4, in <module>
    brands.remove("no brand")
ValueError: list.remove(x): x not in list

Initially our brands list had 4 items, and after we delete “lg” we can see in the output that it was in fact removed from the list. To show you what happens when you try to delete a non-existing item, I wrote brands.remove("no brand"). Python complained and the error says: ValueError: list.remove(x): x not in list. So basically, Python is telling you that you wanted to remove some x from the list but x does not exist in the list.
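One more detail worth knowing: if the same value appears more than once, remove deletes only the first occurrence. The list with a repeated “lg” below is made up for illustration.

```python
brands = ["lg", "sony", "lg"]

# Only the first "lg" is removed; the second one stays in the list
brands.remove("lg")
print(brands)  # ['sony', 'lg']
```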

Contains

If you want to know whether an item exists in a list then Python has got you covered with a really nice syntax to do so. Let’s look at an example:

brands = ["sony", "lg"]
if "lg" in brands:
	print("exists")
else:
	print("does not exist")

The program will print exists. Try changing “lg” to something else that is not in the list and it should print does not exist. So to check if a list contains an item, you can use the following form: something in list. This expression evaluates to either True or False.
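The in check also gives a simple way to avoid the ValueError from the previous section: test for the item first and only call remove when it is there. This is just one possible approach, and the variable name to_remove is an assumption for the example.

```python
brands = ["samsung", "lg", "sony", "nokia"]
to_remove = "no brand"

# Only call remove when the item is actually in the list
if to_remove in brands:
    brands.remove(to_remove)
else:
    print(to_remove + " is not in the list, nothing to remove")

print(brands)  # ['samsung', 'lg', 'sony', 'nokia']

# "not in" is the opposite check
is_missing = "lg" not in brands
print(is_missing)  # False
```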

Iterating

When you have a list, you most probably want to process each of its items. To do that you will use a for loop. You can also use a while loop to iterate, but a for loop should be preferred whenever possible. Although I haven’t discussed for loops and while loops yet, I will try to keep things simple. There will be a dedicated post about for loops in Python, but for now let’s stick to simple cases.

Let’s say that we have a list of strings and want to convert each string to upper case and print it.

brands = ["samsung", "lg", "sony", "nokia"]
for brand in brands:
	upper_cased = brand.upper()
	print(upper_cased)

In the code above, we have a list of strings called brands. Using a for loop we can iterate over each item in the list and do some operation on it. The code outputs the following:

SAMSUNG
LG
SONY
NOKIA

:warning: Note how the code lines are aligned. In lines 1 and 2 there is no indentation, but lines 3 and 4 are shifted to the right. Lines 3 and 4 therefore form the body of the for loop, and that code is executed once for every item in the list. Python does not mandate how many spaces you use to indent, but you have to be consistent within a block! The Python style guide (PEP 8) recommends 4 spaces per level. Again, details about for loops will be covered in future posts so don’t worry if you didn’t understand everything.
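For comparison, here is a rough sketch of the same iteration written with a while loop. The names index and upper_brands are only for illustration. Notice how much more bookkeeping it needs (a counter, a length check and a manual increment), which is why the for loop is usually the better choice.

```python
brands = ["samsung", "lg", "sony", "nokia"]

upper_brands = []
index = 0
# Keep going while index still points at a valid position in the list
while index < len(brands):
    upper_cased = brands[index].upper()
    print(upper_cased)
    upper_brands.append(upper_cased)
    index += 1  # move on to the next item; forgetting this loops forever
```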

Sorting

If you want to sort the items in your list, you can use the sort method like this: my_list.sort(). It sorts the list in place, in ascending order by default. You can also set a parameter called reverse to True to sort in descending order. Take a look at the example below.

brands = ["samsung", "lg", "sony", "nokia"]
brands.sort()
print(brands)

brands.sort(reverse=True)
print(brands)

ages = [10, 20, 5, 49, 33]
ages.sort()
print(ages)

The code prints the following

['lg', 'nokia', 'samsung', 'sony']
['sony', 'samsung', 'nokia', 'lg']
[5, 10, 20, 33, 49]

How does Python know how to sort? In order to sort, you generally need to be able to determine whether one item is greater than, equal to or smaller than another item. For numbers it’s obvious: 1 is greater than 0, 2.999 is greater than 2.77 and so on. Strings are compared character by character, which for lower-case words like the ones above gives alphabetical order (note that upper-case letters sort before lower-case ones). For example “a” is smaller than “b”, “z” is greater than “e” and so on. Python already knows how to compare two numbers or two strings and figure out which one is “greater” and which one is “smaller”, and hence the sorting works.

But what happens if a list contains both strings and numbers?

values = ["a", "b", 1]
values.sort()
print(values)

The code prints the following:

Traceback (most recent call last):
  File "main.py", line 2, in <module>
    values.sort()
TypeError: '<' not supported between instances of 'int' and 'str'

It gives you an error, and the error basically means that Python cannot compare a number and a string to determine which one is smaller. If someone asked you whether the letter “a” is less than 20, what would you answer? Normally this is an invalid comparison, but if there really is a requirement to compare different types of data or other complex data types, there are ways to let Python know how to compare. If you want to read more about sorting, see the sorting guide in the official Python documentation.
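One of those ways is the key parameter of sort, which takes a function that is applied to each item before comparing. The sketch below is only one possible choice: it uses str as the key so that every item is compared as text, which may or may not be the ordering you actually want. It also shows the built-in sorted() function, which returns a new sorted list instead of changing the original; the name by_length is an assumption for the example.

```python
values = ["a", "b", 1]

# Compare every item by its string form, so 1 is treated as "1"
values.sort(key=str)
print(values)     # [1, 'a', 'b']

brands = ["samsung", "lg", "sony", "nokia"]

# sorted() leaves brands untouched and returns a new list;
# key=len orders the brands from shortest to longest name
by_length = sorted(brands, key=len)
print(by_length)  # ['lg', 'sony', 'nokia', 'samsung']
print(brands)     # ['samsung', 'lg', 'sony', 'nokia']
```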

List comprehension

List comprehension is a really nice feature in Python that allows you to write short and neat code when dealing with lists. Let me explain it with an example. Suppose we have a list of brands that are all in lower case, but we want them to be title cased, i.e. from “samsung” to “Samsung”.

brands = ["samsung", "lg", "sony", "nokia"]

new_brands = []
for brand in brands:
    new_brands.append(brand.title())
    
print(new_brands)


# using list comprehension
new_brands_2 = [brand.title() for brand in brands]
print(new_brands_2)

The code above outputs:

['Samsung', 'Lg', 'Sony', 'Nokia']
['Samsung', 'Lg', 'Sony', 'Nokia']

So, to convert each brand name to title case, we first defined an empty list new_brands where we plan to save the converted brand names. Then, using a for loop, we tell Python to convert each brand name to title case using the title() method and append the result to the new_brands list. Once that is done, we print the new_brands list. So far so good. But this can be written in a very nice way using list comprehension.

This line: new_brands_2 = [brand.title() for brand in brands] does all the things we did before, but in a single line! If you look at the output, they are exactly the same.

What’s happening here? Python creates a new list using the expression inside the square brackets. It might be a bit confusing to grasp since we haven’t even covered for loops yet, but bear with me. If you are still confused, read about for loops in a later post and come back here again. It may be easier to write a “normal for loop” first and then convert it to a list comprehension. Here is how to translate a for loop into a list comprehension (assuming that you already have the for loop code).

  • First write two square brackets []
  • Then copy the for .. in .. part of the loop line (without the trailing colon) and paste it inside the brackets. Now your code should look like [for brand in brands] (this is not valid Python yet)
  • Now, we want to do something with each brand, which in this case is converting it to title case by calling brand.title(). This part goes in front of “for” like: [brand.title() for brand in brands]
  • Now the list comprehension is complete. You can assign the result to some variable
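The steps above can be tried on a different list to see that the recipe is general. The sketch below reuses the ages list from the introduction; the names ages_next_year and ages_next_year_2 and the “add one year” operation are made up for illustration.

```python
ages = [22, 35, 55, 34, 63]

# The ordinary for loop we start from
ages_next_year = []
for age in ages:
    ages_next_year.append(age + 1)
print(ages_next_year)    # [23, 36, 56, 35, 64]

# Brackets, then the "for age in ages" part, then the expression in front of it
ages_next_year_2 = [age + 1 for age in ages]
print(ages_next_year_2)  # [23, 36, 56, 35, 64]
```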

There is a lot more about lists that I haven’t covered here. For more information, refer to the official documentation.

Exercise

The program below asks how many family members the user has and then asks for the name of each of them. Complete the program so that it greets each member. Take a look at the output of a sample run: for 3 members it asked for a name 3 times and then printed “Hello …” for each of those members. You need to use a for loop to print, as well as string concatenation. Read the post about strings if you are not sure.

n_members = int(input("How many family members do you have? "))

family_members = []

for i in range(n_members):
    name = input(f"Enter name of member {i+1} ")
    # your code to add name to the list
    
    
greeting = "Hello"
# your code

Sample Run:

How many family members do you have? 3
Enter name of member 1 John
Enter name of member 2 Jack
Enter name of member 3 Kelly
Hello John
Hello Jack
Hello Kelly
