Python's list comprehensions (and generator expressions) are an awesome feature that can greatly simplify your code. Most of the time, however, we only use them to write a single for loop, maybe with the addition of one if conditional, and that's it. If you start poking around a bit, though, you will find that Python's comprehensions have many more features that you might not know about, but can learn a lot from...
Multiple Conditionals
We know that we can use an if conditional to filter the results of a list comprehension, and for simple comprehensions a single if is usually sufficient. What if you wanted a nested conditional, though?
values = [True, False, True, None, True]
print(['yes' if v is True else 'no' if v is False else 'unknown' for v in values])
# ['yes', 'no', 'yes', 'unknown', 'yes']
# Above is equivalent to:
result = []
for v in values:
    if v is True:
        result.append('yes')
    else:
        if v is False:
            result.append('no')
        else:
            result.append('unknown')
print(result)
# ['yes', 'no', 'yes', 'unknown', 'yes']
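It's possible to build a nested conditional using "conditional expressions", more commonly called the ternary operator. Note that these sit in the output expression of the comprehension and pick a value for every element, rather than filtering elements out. It's not exactly a pretty solution, so you will have to decide whether the few saved lines are worth the nasty one-liner.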
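If the one-liner feels too dense, one option is to move the branching into a small named function and keep the comprehension itself trivial. The sketch below assumes a hypothetical helper called describe, which is not part of the original example:

```python
def describe(v):
    """Map True/False/anything else to a label (hypothetical helper)."""
    if v is True:
        return 'yes'
    if v is False:
        return 'no'
    return 'unknown'


values = [True, False, True, None, True]
labels = [describe(v) for v in values]
print(labels)
# ['yes', 'no', 'yes', 'unknown', 'yes']
```

This costs a few extra lines, but the comprehension stays readable and the branching logic can be tested on its own.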
Apart from using complex conditionals, it's also possible to stack multiple ifs in a comprehension:
print([i for i in range(100) if i > 10 if i < 20 if i % 2])
# [11, 13, 15, 17, 19]
# Above is equivalent to:
result = []
for i in range(100):
    if i > 10:
        if i < 20:
            if i % 2:
                result.append(i)
print(result)
# [11, 13, 15, 17, 19]
Looking at the expanded code above, it doesn't really make much sense to write it this way, but the syntax allows it.
One reason why you might want to use it anyway is readability:
print([i for i in range(100)
       if i > 10
       if i < 20
       if i % 2])
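For comparison, stacked if clauses behave the same as a single if whose conditions are joined with and, so the choice between them is purely stylistic. A quick sketch that checks the three spellings against each other (the variable names are only for illustration):

```python
# Three spellings of the same filter
stacked = [i for i in range(100) if i > 10 if i < 20 if i % 2]
joined = [i for i in range(100) if i > 10 and i < 20 and i % 2]
chained = [i for i in range(100) if 10 < i < 20 and i % 2]

print(stacked)
# [11, 13, 15, 17, 19]
print(stacked == joined == chained)
# True
```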
Avoid Repeated Evaluation
Let's say you have a comprehension which calls an expensive function both in the conditional and in the loop body:
def func(val):
    # Expensive computation...
    return val > 4
values = [1, 4, 3, 5, 12, 9, 0]
print([func(x) for x in values if func(x)]) # Inefficient
# [True, True, True]
This is inefficient as it doubles the computation time, but what can we do about it? Nested comprehensions to the rescue!
print([y for y in (func(x) for x in values) if y]) # Efficient
# [True, True, True]
I want to highlight that the above is not a double loop. In this example we build a generator inside the list comprehension, and the outer loop consumes it one item at a time. If you find this hard to read, then an alternative would be to use the walrus operator (Python 3.8+):
print([y for x in values if (y := func(x))])
Here func is called only once per element, and its result is bound to the variable y, which can then be reused in other parts of the expression.
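To make the "called only once" claim visible, here is a small sketch with a stand-in function that records every call. The names slow_double and call_log are hypothetical and exist only for this illustration. It also shows that a name bound by the walrus operator inside a comprehension lives in the enclosing scope, so it is still there after the comprehension finishes:

```python
call_log = []


def slow_double(val):
    """Stand-in for an expensive function (hypothetical); records each call."""
    call_log.append(val)
    return val * 2


values = [1, 4, 3, 5, 12, 9, 0]

# y is computed once per element and reused in the output expression
pairs = [(x, y) for x in values if (y := slow_double(x)) > 8]
print(pairs)
# [(5, 10), (12, 24), (9, 18)]
print(len(call_log))
# 7
print(y)  # Still bound to the last computed value
# 0
```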
Handling Exceptions
Even though list comprehensions are usually used for simple tasks - such as calling a function on each element of a list - there are situations where an exception might be thrown inside the comprehension. There is, however, no native way of handling an exception inside a list comprehension, so what can we do about it?
def catch(f, *args, handle=lambda e: e, **kwargs):
    try:
        return f(*args, **kwargs)
    except Exception as e:
        return handle(e)
values = [1, "text", 2, 5, 1, "also-text"]
print([catch(int, value) for value in values])
print([catch(lambda: int(value)) for value in values]) # Alternative syntax
# [
# 1,
# ValueError("invalid literal for int() with base 10: 'text'"),
# 2,
# 5,
# 1,
# ValueError("invalid literal for int() with base 10: 'also-text'")
# ]
We need a handler function to catch an exception inside a comprehension. Here we create a function catch which takes a function and its arguments. If an exception is thrown while calling that function, catch passes it to handle, which by default simply returns the exception object.
This is not an ideal solution, considering that we need a helper function, but it's the best we can do as the proposal (PEP 463), which tried to introduce a syntax for this, got rejected.
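The handle parameter is what makes the helper flexible. As a sketch, passing a handler that returns None turns failures into placeholders, which can then be filtered out in the same comprehension (catch is repeated here only so that the snippet is self-contained):

```python
def catch(f, *args, handle=lambda e: e, **kwargs):
    try:
        return f(*args, **kwargs)
    except Exception as e:
        return handle(e)


values = [1, "text", 2, 5, 1, "also-text"]

# Replace every failure with None instead of the exception object
parsed = [catch(int, value, handle=lambda e: None) for value in values]
print(parsed)
# [1, None, 2, 5, 1, None]

# Or drop the failures altogether with the help of the walrus operator
clean = [n for value in values
         if (n := catch(int, value, handle=lambda e: None)) is not None]
print(clean)
# [1, 2, 5, 1]
```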
Breaking the Loop
Another limitation of list comprehensions is the inability to break out of the loop. While that's not possible natively, we can implement a little hack that solves the problem:
print([i for i in iter(iter(range(10)).__next__, 4)])
# [0, 1, 2, 3]
from itertools import takewhile
print([n for n in takewhile(lambda x: x != 4, range(10))])
# [0, 1, 2, 3]
The first example above uses a little-known behavior of the iter function. Called as iter(callable, sentinel), it returns an iterator that keeps calling callable and "breaks" the iteration once the returned value is equal to the sentinel. Here the callable is the __next__ method of the inner iterator, so when that inner iterator yields the sentinel (4 in the example), the loop automatically stops.
This is not very readable, so you can instead take advantage of the great itertools module and the takewhile function as shown in the second example.
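The two-argument form of iter is easier to follow with a callable that is not itself an iterator method. The sketch below is an assumed example, not taken from the code above: it reads an in-memory text stream in chunks of three characters until read returns an empty string, which serves as the sentinel:

```python
import io
from functools import partial

stream = io.StringIO("abcdefgh")

# Call stream.read(3) repeatedly and stop as soon as it returns ""
chunks = [chunk for chunk in iter(partial(stream.read, 3), "")]
print(chunks)
# ['abc', 'def', 'gh']
```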
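As a side note - if you thought that breaking out of the loop by raising an exception was possible, you'd be partly correct. In older Python versions you could call a helper function that raises StopIteration inside a generator expression passed to list(), and the iteration would end silently. That was changed with PEP 479: the new behavior was opt-in from Python 3.5 and has been the default since Python 3.7, where a StopIteration raised inside a generator is converted into a RuntimeError.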
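A short sketch of what the old trick does on a current interpreter (Python 3.7 or newer is assumed; the helper name stop is hypothetical). Instead of ending the iteration quietly, the StopIteration raised inside the generator expression surfaces as a RuntimeError:

```python
def stop():
    """Hypothetical helper that tries to end the iteration early."""
    raise StopIteration


message = None
try:
    result = list(i if i < 4 else stop() for i in range(10))
except RuntimeError as error:
    # PEP 479: StopIteration raised inside a generator becomes RuntimeError
    result = None
    message = str(error)

print(result)
# None
print(message)
# generator raised StopIteration
```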
Tricks (and Hacks)
In the previous sections, we've seen some obscure features of list comprehensions that might or might not be very useful in day-to-day coding. So, let's now take a look at some tricks (and little hacks) that you can put to use right away.
While plain, vanilla list comprehensions are very powerful, they become even better when paired with libraries such as itertools (see the previous section) or its extension more-itertools.
Let's say you need to find runs of consecutive numbers, dates, letters, booleans or any other orderable objects. You can solve this elegantly by pairing consecutive_groups from more-itertools with a list comprehension:
import datetime
# pip install more-itertools
import more_itertools
dates = [
    datetime.datetime(2020, 1, 15),
    datetime.datetime(2020, 1, 16),
    datetime.datetime(2020, 1, 17),
    datetime.datetime(2020, 2, 1),
    datetime.datetime(2020, 2, 2),
    datetime.datetime(2020, 2, 4)
]
groups = [list(group) for group in more_itertools.consecutive_groups(dates, ordering=lambda d: d.toordinal())]
# [
# [datetime.datetime(2020, 1, 15, 0, 0), datetime.datetime(2020, 1, 16, 0, 0), datetime.datetime(2020, 1, 17, 0, 0)],
# [datetime.datetime(2020, 2, 1, 0, 0), datetime.datetime(2020, 2, 2, 0, 0)],
# [datetime.datetime(2020, 2, 4, 0, 0)]
# ]
Here we have a list of dates, some of which are consecutive. We pass the dates to the consecutive_groups function, using the ordinal values of the dates for ordering. Each returned group is an iterator, so we turn every group into a list and collect them all with a comprehension.
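If you would rather not install an extra package, the same idea can be sketched with only the standard library. The snippet below uses the well-known groupby trick of grouping by the difference between each value and its position. It is an illustration of the technique on plain integers, not necessarily how more-itertools implements it:

```python
from itertools import groupby

numbers = [1, 2, 3, 7, 8, 10]

# Within a consecutive run, value - index stays constant, so it works as a group key
groups = [
    [number for _, number in run]
    for _, run in groupby(enumerate(numbers), key=lambda pair: pair[1] - pair[0])
]
print(groups)
# [[1, 2, 3], [7, 8], [10]]
```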
Computing cumulative sums of numbers is very easy in Python - you can just pass a list to itertools.accumulate and you get back the running totals. What if we wanted to undo the accumulation, though?
from itertools import accumulate
import more_itertools
data = [4, 5, 12, 8, 1, 10, 21]
cumulative = list(accumulate(data, initial=100))
print(cumulative)
# [100, 104, 109, 121, 129, 130, 140, 161]
print([y - x for x, y in more_itertools.pairwise(cumulative)])
# [4, 5, 12, 8, 1, 10, 21]
With the help of more_itertools.pairwise it's pretty simple!
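The same pairing can also be done without any third-party package by zipping the list with a copy of itself shifted by one. A minimal sketch of that variant, assuming the data fits in memory as a list:

```python
from itertools import accumulate

data = [4, 5, 12, 8, 1, 10, 21]
cumulative = list(accumulate(data, initial=100))  # initial= needs Python 3.8+

# Pair each running total with the one that follows it
restored = [y - x for x, y in zip(cumulative, cumulative[1:])]
print(restored)
# [4, 5, 12, 8, 1, 10, 21]
```

On Python 3.10 and newer, itertools.pairwise offers the same pairing in the standard library.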
As was mentioned earlier, the new-ish walrus operator can be used with comprehensions to bind a variable. That can be useful in many situations. One such situation is with the any() and all() functions.
Python's any() and all() functions can verify whether any or all values in some iterable satisfy a certain condition. What if you also want to capture the value that caused any() to return True (the so-called "witness") or the value that caused all() to fail (the so-called "counterexample")?
numbers = [1, 4, 6, 2, 12, 4, 15]
# Only returns boolean, not the values
print(any(number > 10 for number in numbers)) # True
print(all(number < 10 for number in numbers)) # False
# ---------------------
any((value := number) > 10 for number in numbers) # True
print(value) # 12
all((counter_example := number) < 10 for number in numbers) # False
print(counter_example) # 12
Both any() and all() use short-circuiting to evaluate the given expression. This means that they stop the evaluation as soon as they find the first "witness" or "counterexample" respectively. Therefore, with this trick the variable created by the walrus operator will always give us the first "witness"/"counterexample". Keep in mind that if no such value exists, the variable simply holds the last element that was checked.
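When only the witness or counterexample matters and not the boolean, a common alternative is next() with a default value. This sketch assumes None is an acceptable marker for "nothing found", which would not work if None could appear in the data:

```python
numbers = [1, 4, 6, 2, 12, 4, 15]

# First value greater than 10, or None if there is no such value
witness = next((n for n in numbers if n > 10), None)
print(witness)
# 12

# First value that breaks the "less than 10" rule
counterexample = next((n for n in numbers if not n < 10), None)
print(counterexample)
# 12

# No match: the default is returned instead of a stale value
missing = next((n for n in numbers if n > 100), None)
print(missing)
# None
```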
Closing Thoughts
Many of the features and tricks here are meant to demonstrate the possibilities and limits of list comprehensions. Learning these intricacies is - in my opinion - a good way of gaining a better understanding of a particular language feature, even if it's not really useful in daily coding. On top of that, it's fun.
With that said, I hope you learned something here. Be aware, though, that if you decide to use things like complex conditionals or loop breaks in your list comprehensions, your coworkers might end up hating you.