Python's list comprehensions (and generators) are an awesome feature that can greatly simplify your code. Most of the time, however, we only use them to write a single for loop, maybe with the addition of one if conditional, and that's it. If you start poking around a bit, though, you will find that Python's comprehensions have many more features that you may not know about but can learn a lot from...
We know that we can use an if conditional to filter the results of a list comprehension, and with simple comprehensions a single if is usually sufficient. What if you wanted a nested conditional, though?
    values = [True, False, True, None, True]
    print(['yes' if v is True else 'no' if v is False else 'unknown' for v in values])
    # ['yes', 'no', 'yes', 'unknown', 'yes']

    # Above is equivalent to:
    result = []
    for v in values:
        if v is True:
            result.append('yes')
        else:
            if v is False:
                result.append('no')
            else:
                result.append('unknown')

    print(result)
    # ['yes', 'no', 'yes', 'unknown', 'yes']
It's possible to build a nested conditional by chaining "conditional expressions", more commonly known as the ternary operator. It's not exactly a pretty solution, so you will have to decide whether the few saved lines are worth the nasty one-liner.
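If the one-liner gets too dense, one option is to move the branching into a small named function and keep the comprehension flat. The sketch below assumes a hypothetical helper called describe, which is not part of the original example:

```python
def describe(value):
    # Hypothetical helper: same branching as the nested conditional expression
    if value is True:
        return 'yes'
    if value is False:
        return 'no'
    return 'unknown'


values = [True, False, True, None, True]
labels = [describe(v) for v in values]
print(labels)
# ['yes', 'no', 'yes', 'unknown', 'yes']
```

This costs a few extra lines, but the comprehension itself stays readable and the helper can be tested separately.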
Apart from using complex conditionals, it's also possible to stack multiple ifs in a comprehension:
    print([i for i in range(100) if i > 10 if i < 20 if i % 2])
    # [11, 13, 15, 17, 19]

    # Above is equivalent to:
    result = []
    for i in range(100):
        if i > 10:
            if i < 20:
                if i % 2:
                    result.append(i)

    print(result)
    # [11, 13, 15, 17, 19]
Looking at the expanded code above, it doesn't really make much sense to write it this way, but the syntax allows it.
One reason you might want to use it is readability, since each condition can sit on its own line:
    print([
        i for i in range(100)
        if i > 10
        if i < 20
        if i % 2
    ])
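Stacked if clauses behave like conditions joined with and, so the same filter can also be written with a single if. A quick sketch to check the equivalence (the variable names are only illustrative):

```python
stacked = [i for i in range(100) if i > 10 if i < 20 if i % 2]
joined = [i for i in range(100) if i > 10 and i < 20 and i % 2]
chained = [i for i in range(100) if 10 < i < 20 and i % 2]

print(stacked == joined == chained)
# True
```

Which form reads best is mostly a matter of taste; the chained comparison is probably the most compact of the three.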
Avoid Repeated Evaluation
Let's say you have a comprehension which calls an expensive function both in the conditional and in the loop body:
    def func(val):
        # Expensive computation...
        return val > 4

    values = [1, 4, 3, 5, 12, 9, 0]
    print([func(x) for x in values if func(x)])  # Inefficient
    # [True, True, True]
This is inefficient because func runs a second time for every element that passes the filter, which in the worst case doubles the computation time. What can we do about it? Nested comprehensions to the rescue!
    print([y for y in (func(x) for x in values) if y])  # Efficient
    # [True, True, True]
I want to highlight that the above is not a double loop. In this example we build a generator inside the list comprehension, which is consumed by the outer loop. If you find this hard to read, then an alternative would be to use the walrus operator (Python 3.8+):
    print([y for x in values if (y := func(x))])
Here func is called only once per element, and its result is bound to the variable y, which can be used in other parts of the expression. Note that y is bound in the enclosing scope, so it stays available after the comprehension finishes.
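To convince yourself that the number of calls really changes, you can count them. The sketch below uses a hypothetical counted_func that records each call in a list; it stands in for the expensive function and is only meant as an illustration:

```python
call_log = []


def counted_func(val):
    # Hypothetical stand-in for the expensive function; records every call
    call_log.append(val)
    return val > 4


values = [1, 4, 3, 5, 12, 9, 0]

naive = [counted_func(x) for x in values if counted_func(x)]
naive_calls = len(call_log)   # every element once, plus each kept element again
call_log.clear()

nested = [y for y in (counted_func(x) for x in values) if y]
nested_calls = len(call_log)  # every element once
call_log.clear()

walrus = [y for x in values if (y := counted_func(x))]
walrus_calls = len(call_log)  # every element once

print(naive_calls, nested_calls, walrus_calls)
# 10 7 7
```

All three variants produce the same list; only the number of calls differs.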
Even though list comprehensions are usually used for simple tasks, such as calling a function on each element of a list, there are situations where an exception might be thrown inside the comprehension. There is, however, no native way of handling an exception inside a list comprehension, so what can we do about it?
    def catch(f, *args, handle=lambda e: e, **kwargs):
        try:
            return f(*args, **kwargs)
        except Exception as e:
            return handle(e)

    values = [1, "text", 2, 5, 1, "also-text"]
    print([catch(int, value) for value in values])
    print([catch(lambda: int(value)) for value in values])  # Alternative syntax
    # [
    #     1,
    #     ValueError("invalid literal for int() with base 10: 'text'"),
    #     2,
    #     5,
    #     1,
    #     ValueError("invalid literal for int() with base 10: 'also-text'")
    # ]
We need a handler function to catch an exception inside a comprehension. Here we create a function catch which takes a function and its arguments. If an exception is thrown inside catch, it is passed to handle, which by default simply returns the exception object.
This is not an ideal solution, considering that we need a helper function, but it's the best we can do as the proposal (PEP 463), which tried to introduce a syntax for this, got rejected.
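The handle parameter is what makes the helper flexible: instead of returning the exception object, it can return a placeholder that is easy to filter out later. A small sketch, repeating the catch helper so that it runs on its own and using None as an assumed placeholder value:

```python
def catch(f, *args, handle=lambda e: e, **kwargs):
    try:
        return f(*args, **kwargs)
    except Exception as e:
        return handle(e)


values = [1, "text", 2, 5, 1, "also-text"]

# Replace every failure with None instead of the exception object
parsed = [catch(int, value, handle=lambda e: None) for value in values]
print(parsed)
# [1, None, 2, 5, 1, None]

# Keep only the values that were converted successfully
clean = [p for p in parsed if p is not None]
print(clean)
# [1, 2, 5, 1]
```

None only works as a placeholder when it cannot be a legitimate result; otherwise a dedicated sentinel object would be the safer choice.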
Breaking the Loop
Another limitation of list comprehensions is the inability to break the loop. While that's not possible natively, we can implement a little hack that solves the problem:
    print([i for i in iter(iter(range(10)).__next__, 4)])
    # [0, 1, 2, 3]

    from itertools import takewhile

    print([n for n in takewhile(lambda x: x != 4, range(10))])
    # [0, 1, 2, 3]
The first example above uses a little-known feature of the iter function. The two-argument form iter(callable, sentinel) returns an iterator that "breaks" iteration once the value returned by callable is equal to the sentinel value. When the __next__ method of the inner iterator returns the sentinel (4 in the example), the loop automatically stops.
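The two-argument form of iter is more commonly seen when reading data in fixed-size chunks. As an illustration (using an in-memory io.BytesIO buffer in place of a real file, which is an assumption of this sketch), the comprehension below stops as soon as read returns the empty-bytes sentinel:

```python
import io
from functools import partial

buffer = io.BytesIO(b"abcdefghij")

# buffer.read(4) is called repeatedly until it returns the sentinel b""
chunks = [chunk for chunk in iter(partial(buffer.read, 4), b"")]
print(chunks)
# [b'abcd', b'efgh', b'ij']
```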
This is not very readable, so you can instead take advantage of the great itertools module and its takewhile function, as shown in the second example.
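Keep in mind that takewhile stops at the first element that fails the predicate, whereas an if clause only skips that element and keeps going. A short comparison on made-up data:

```python
from itertools import takewhile

data = [1, 2, 3, 4, 1, 2]

# Stops for good at the first 4
stopped = [n for n in takewhile(lambda x: x != 4, data)]
# Skips the 4 but continues with the rest
filtered = [n for n in data if n != 4]

print(stopped)   # [1, 2, 3]
print(filtered)  # [1, 2, 3, 1, 2]
```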
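As a side note: if you thought that breaking out of a comprehension used to be possible, then you'd be correct. In older Python versions, you could use a helper function that raised StopIteration to end a generator expression early (for example, one wrapped in list()). That was changed by PEP 479, which turns such a StopIteration into a RuntimeError; the new behavior was opt-in starting with Python 3.5 and became the default in Python 3.7.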
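On current Python versions (3.7 and newer, where the PEP 479 behavior is always on), you can see the effect with a hypothetical stop helper. This sketch only illustrates why the old trick no longer works; it is not something to rely on:

```python
def stop():
    # Hypothetical helper used by the old trick
    raise StopIteration


# Inside a generator expression, the StopIteration is converted to RuntimeError
try:
    result = list(i for i in range(10) if i < 4 or stop())
except RuntimeError as error:
    result = f"RuntimeError: {error}"

print(result)
# RuntimeError: generator raised StopIteration

# Inside a list comprehension, the StopIteration propagates to the caller
try:
    [i for i in range(10) if i < 4 or stop()]
    outcome = "finished"
except StopIteration:
    outcome = "StopIteration escaped"

print(outcome)
# StopIteration escaped
```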
Tricks (and Hacks)
In the previous sections, we've seen some obscure features of list comprehensions that might or might not be very useful in day-to-day coding. So, let's now take a look at some tricks (and little hacks) that you can put to use right away.
While plain, vanilla list comprehensions are very powerful, they become even better when paired with libraries such as itertools (see previous section) or its extension more-itertools.
Let's say you need to find runs of consecutive numbers, dates, letters, booleans or any other orderable objects. You can solve this elegantly by pairing more-itertools with a list comprehension:
    import datetime

    # pip install more-itertools
    import more_itertools

    dates = [
        datetime.datetime(2020, 1, 15),
        datetime.datetime(2020, 1, 16),
        datetime.datetime(2020, 1, 17),
        datetime.datetime(2020, 2, 1),
        datetime.datetime(2020, 2, 2),
        datetime.datetime(2020, 2, 4)
    ]

    groups = [
        list(group)
        for group in more_itertools.consecutive_groups(dates, ordering=lambda d: d.toordinal())
    ]
    # [
    #     [datetime.datetime(2020, 1, 15, 0, 0), datetime.datetime(2020, 1, 16, 0, 0), datetime.datetime(2020, 1, 17, 0, 0)],
    #     [datetime.datetime(2020, 2, 1, 0, 0), datetime.datetime(2020, 2, 2, 0, 0)],
    #     [datetime.datetime(2020, 2, 4, 0, 0)]
    # ]
Here we have a list of dates, some of which are consecutive. We pass the dates to the consecutive_groups function, using the ordinal values of the dates for ordering. We then collect the returned groups into a list using a comprehension.
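If you would rather not add a dependency, the same grouping can be approximated with the standard library. The sketch below follows the classic groupby recipe of keying each item by the difference between its position and its ordinal value; the function name consecutive_runs is made up for this example:

```python
import datetime
from itertools import groupby


def consecutive_runs(items, ordering=lambda x: x):
    # Items in the same run share the same (index - ordinal) difference
    keyed = groupby(enumerate(items), key=lambda pair: pair[0] - ordering(pair[1]))
    return [[item for _, item in group] for _, group in keyed]


dates = [
    datetime.date(2020, 1, 15),
    datetime.date(2020, 1, 16),
    datetime.date(2020, 2, 1),
]
runs = consecutive_runs(dates, ordering=lambda d: d.toordinal())
print([[d.isoformat() for d in run] for run in runs])
# [['2020-01-15', '2020-01-16'], ['2020-02-01']]

print(consecutive_runs([1, 2, 3, 7, 8, 10]))
# [[1, 2, 3], [7, 8], [10]]
```

Like the library function, this assumes the input is already sorted and that the ordering callable maps each item to an integer.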
Computing cumulative sums of numbers is very easy in Python: you can just pass a list to itertools.accumulate and you get back the sums. What if we wanted to undo the accumulation, though?
    from itertools import accumulate

    # pip install more-itertools
    import more_itertools

    data = [4, 5, 12, 8, 1, 10, 21]
    cumulative = list(accumulate(data, initial=100))

    print(cumulative)
    # [100, 104, 109, 121, 129, 130, 140, 161]

    print([y - x for x, y in more_itertools.pairwise(cumulative)])
    # [4, 5, 12, 8, 1, 10, 21]
With the help of more_itertools.pairwise it's pretty simple!
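The same differences can be computed without more_itertools, for instance by zipping the list with a shifted copy of itself (on Python 3.10 and newer, itertools.pairwise is another option). A minimal sketch:

```python
from itertools import accumulate

data = [4, 5, 12, 8, 1, 10, 21]
cumulative = list(accumulate(data, initial=100))  # initial= needs Python 3.8+

# Pair each value with its successor and subtract
restored = [y - x for x, y in zip(cumulative, cumulative[1:])]
print(restored)
# [4, 5, 12, 8, 1, 10, 21]
```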
As was mentioned earlier, the new-ish walrus operator can be used with list comprehensions to create a local variable. That can be useful in many situations. One such situation is with any() and all(). Python's any() and all() functions can verify whether any or all values in some iterable satisfy a certain condition. What if, however, you also want to capture the value that caused any() to return True (the so-called "witness") or the value that caused all() to fail (the so-called "counterexample")?
    numbers = [1, 4, 6, 2, 12, 4, 15]

    # Only returns boolean, not the values
    print(any(number > 10 for number in numbers))  # True
    print(all(number < 10 for number in numbers))  # False

    # ---------------------

    any((value := number) > 10 for number in numbers)  # True
    print(value)  # 12

    all((counter_example := number) < 10 for number in numbers)  # False
    print(counter_example)  # 12
Both any() and all() use short-circuiting to evaluate the given expression. This means that they stop the evaluation as soon as they find the first "witness" or "counterexample" respectively. Therefore, with this trick the variable created by the walrus operator will always give us the first "witness"/"counterexample".
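One caveat worth checking: when any() finds no witness, the captured variable simply holds the last value that was tested, and it is never assigned at all if the iterable is empty. If you only need the first match, next() with a default is an alternative that makes the no-match case explicit. A short sketch:

```python
numbers = [1, 4, 6, 2, 12, 4, 15]

# No witness: any() returns False and `value` is just the last element tested
found = any((value := number) > 100 for number in numbers)
print(found, value)
# False 15

# next() returns the first match, or the default when nothing matches
first_big = next((number for number in numbers if number > 10), None)
no_match = next((number for number in numbers if number > 100), None)
print(first_big, no_match)
# 12 None
```

So it is safest to read the captured variable only after checking what any() or all() actually returned.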
Many of the features and tricks here are meant to demonstrate the possibilities and limits of list comprehensions. Learning these intricacies is, in my opinion, a good way of gaining a better understanding of a particular language feature, even if it's not really useful in daily coding. On top of that, it's fun.
With that said, I hope you learned something here. Be aware that if you decide to use things like complex conditionals or loop breaks in your list comprehensions, your coworkers might end up hating you.