Python List Comprehensions Are More Powerful Than You Might Think

Python's list comprehensions (and generator expressions) are an awesome feature that can greatly simplify your code. Most of the time, however, we only use them to write a single for loop, maybe with the addition of one if conditional, and that's it. If you start poking around a bit, though, you will find that Python's comprehensions have many more features that you might not know about, but can learn a lot from...

Multiple Conditionals

We know that we can use an if conditional to filter the results of a list comprehension, and for simple comprehensions a single if is usually sufficient. What if you wanted a nested conditional, though?


values = [True, False, True, None, True]
print(['yes' if v is True else 'no' if v is False else 'unknown' for v in values])
# ['yes', 'no', 'yes', 'unknown', 'yes']

# Above is equivalent to:
result = []
for v in values:
    if v is True:
        result.append('yes')
    else:
        if v is False:
            result.append('no')
        else:
            result.append('unknown')

print(result)
# ['yes', 'no', 'yes', 'unknown', 'yes']

It's possible to build a nested conditional by chaining conditional expressions, commonly known as the ternary operator. It's not exactly a pretty solution, so you will have to decide whether the few saved lines are worth the nasty one-liner.
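
If the one-liner gets too hard to read, one option is to move the branching into a small named function and keep the comprehension itself trivial. A minimal sketch, where the helper name label is made up for this illustration:

```python
def label(v):
    # Hypothetical helper: same branching as the nested conditional expression
    if v is True:
        return 'yes'
    if v is False:
        return 'no'
    return 'unknown'


values = [True, False, True, None, True]
labels = [label(v) for v in values]
print(labels)
# ['yes', 'no', 'yes', 'unknown', 'yes']
```

This costs a few extra lines, but the comprehension stays readable and the branching can be tested on its own.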

Apart from using complex conditionals, it's also possible to stack multiple ifs in a comprehension:


print([i for i in range(100) if i > 10 if i < 20 if i % 2])
# [11, 13, 15, 17, 19]

# Above is equivalent to:
result = []
for i in range(100):
    if i > 10:
        if i < 20:
            if i % 2:
                result.append(i)

print(result)
# [11, 13, 15, 17, 19]

Looking at the expanded code above, it doesn't really make much sense to write it this way, but the syntax allows it.

One reason you might want to use it is readability:


print([i for i in range(100)
       if i > 10
       if i < 20
       if i % 2])
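
For comparison, stacked ifs behave like conditions joined with and, so the same filter can also be written with and or with a chained comparison. A short sketch to check that all three spellings agree (the variable names are only for illustration):

```python
stacked = [i for i in range(100) if i > 10 if i < 20 if i % 2]
with_and = [i for i in range(100) if i > 10 and i < 20 and i % 2]
chained = [i for i in range(100) if 10 < i < 20 and i % 2]

print(stacked == with_and == chained)
# True
print(chained)
# [11, 13, 15, 17, 19]
```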

Avoid Repeated Evaluation

Let's say you have a comprehension that calls an expensive function both in the conditional and in the loop body:


def func(val):
    # Expensive computation...
    return val > 4

values = [1, 4, 3, 5, 12, 9, 0]
print([func(x) for x in values if func(x)])  # Inefficient
# [True, True, True]

This is inefficient because func is called a second time for every element that passes the filter, which can nearly double the computation time. What can we do about it? Nested comprehensions to the rescue!


print([y for y in (func(x) for x in values) if y])  # Efficient
# [True, True, True]

I want to highlight that the above is not a double loop. In this example we build a generator inside the list comprehension, and the outer loop consumes it lazily. If you find this hard to read, an alternative is to use the walrus operator:


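print([y for x in values if (y := func(x))])  # Requires Python 3.8+
# [True, True, True]

Here func is called only once per element. The walrus operator binds the result to y, which can then be used in other parts of the expression. Note that y is not local to the comprehension: it is bound in the enclosing scope and stays available after the comprehension finishes.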
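
To make the difference visible, here is a small sketch that counts how often func gets called in each variant. The calls counter is a hypothetical addition for this illustration only, and the walrus operator needs Python 3.8+:

```python
calls = {"count": 0}  # hypothetical call counter, only for this illustration


def func(val):
    calls["count"] += 1
    return val > 4


values = [1, 4, 3, 5, 12, 9, 0]

twice = [func(x) for x in values if func(x)]
calls_twice = calls["count"]  # 7 filter calls + 3 calls for the kept elements

calls["count"] = 0
once = [y for x in values if (y := func(x))]
calls_once = calls["count"]  # one call per element

print(calls_twice, calls_once)
# 10 7
print(y)
# False
```

The last print shows that y outlives the comprehension: it still holds the result for the final element, func(0).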

Handling Exceptions

Even though list comprehensions are usually used for simple tasks, such as calling a function on each element of a list, there are situations where an exception might be thrown inside the comprehension. There is, however, no native way of handling an exception inside a list comprehension, so what can we do about it?


def catch(f, *args, handle=lambda e: e, **kwargs):
    try:
        return f(*args, **kwargs)
    except Exception as e:
        return handle(e)


values = [1, "text", 2, 5, 1, "also-text"]
print([catch(int, value) for value in values])
print([catch(lambda: int(value)) for value in values])  # Alternative syntax
# [
#   1,
#   ValueError("invalid literal for int() with base 10: 'text'"),
#   2,
#   5,
#   1,
#   ValueError("invalid literal for int() with base 10: 'also-text'")
# ]

Because try/except is a statement, we need a helper function to catch an exception inside a comprehension. Here we create a function catch which takes a function and its arguments. If the call raises an exception, catch passes it to handle and returns the result; by default, handle simply returns the exception object.

This is not an ideal solution, considering that we need a helper function, but it's the best we can do, as the proposal that tried to introduce a syntax for this (PEP 463) was rejected.
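
The handle parameter also lets you substitute a default value instead of getting the exception object back. A minimal sketch that reuses the catch function from above (repeated here so the snippet runs on its own); the second comprehension uses the walrus operator and therefore needs Python 3.8+:

```python
def catch(f, *args, handle=lambda e: e, **kwargs):
    try:
        return f(*args, **kwargs)
    except Exception as e:
        return handle(e)


values = [1, "text", 2, 5, 1, "also-text"]

# Replace failures with None instead of the exception object
with_defaults = [catch(int, value, handle=lambda e: None) for value in values]
print(with_defaults)
# [1, None, 2, 5, 1, None]

# Keep only the successful conversions
numbers = [number for value in values
           if (number := catch(int, value, handle=lambda e: None)) is not None]
print(numbers)
# [1, 2, 5, 1]
```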

Breaking the Loop

Another limitation of list comprehensions is the inability to break out of the loop. While that's not possible natively, we can implement a little hack that solves the problem:


print([i for i in iter(iter(range(10)).__next__, 4)])
# [0, 1, 2, 3]

from itertools import takewhile
print([n for n in takewhile(lambda x: x != 4, range(10))])
# [0, 1, 2, 3]

The first example above uses a little-known feature of the iter function. iter(callable, sentinel) returns an iterator that calls callable repeatedly and stops as soon as the returned value equals the sentinel. Here the callable is the __next__ method of the inner iterator, so when it returns the sentinel (4 in the example), the loop automatically stops.
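
One detail worth knowing about iter(callable, sentinel) is that the sentinel value is consumed from the underlying iterator and then discarded. A small sketch using itertools.count as the source (the variable names are only for illustration):

```python
from itertools import count

counter = count()  # yields 0, 1, 2, ...

# iter() keeps calling counter.__next__ until it returns the sentinel 4
before_sentinel = list(iter(counter.__next__, 4))
print(before_sentinel)
# [0, 1, 2, 3]

# The sentinel itself was consumed and discarded, so the counter resumes at 5
resumed = next(counter)
print(resumed)
# 5
```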

This is not very readable, so you can instead take advantage of the great itertools module and the takewhile function as shown in the second example.

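As a side note: if you thought that breaking out of a comprehension used to be possible, you'd be partly correct. In older Python versions, you could call a helper function that raises StopIteration inside a generator expression passed to list(), which silently ended the iteration. That was changed by PEP 479: the new behavior became opt-in with Python 3.5 and the default with Python 3.7, where a StopIteration raised inside a generator is converted to a RuntimeError.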
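
You can check the current behavior yourself. The sketch below assumes Python 3.7 or newer, and the helper stop is made up for this illustration:

```python
def stop():
    # Hypothetical helper that tries to end the iteration early
    raise StopIteration


# Inside a generator expression, PEP 479 turns the StopIteration into a RuntimeError
try:
    list(i for i in range(10) if i < 4 or stop())
except RuntimeError as exc:
    genexp_outcome = type(exc).__name__

# Inside a list comprehension, the StopIteration propagates like any other exception
try:
    [i for i in range(10) if i < 4 or stop()]
except StopIteration as exc:
    listcomp_outcome = type(exc).__name__

print(genexp_outcome, listcomp_outcome)
# RuntimeError StopIteration
```

Either way, the loop does not end quietly anymore, so takewhile remains the cleaner option.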

Tricks (and Hacks)

In the previous sections, we've seen some obscure features of list comprehensions that might or might not be very useful in day-to-day coding. So, let's now take a look at some tricks (and little hacks) that you can put to use right away.

While plain, vanilla list comprehensions are very powerful, they become even better when paired with libraries such as itertools (see previous section) or its extension more-itertools.

Let's say you need to find runs of consecutive numbers, dates, letters, booleans or any other orderable objects. You can solve this elegantly by pairing consecutive_groups from more-itertools with a list comprehension:


import datetime
# pip install more-itertools
import more_itertools

dates = [
    datetime.datetime(2020, 1, 15),
    datetime.datetime(2020, 1, 16),
    datetime.datetime(2020, 1, 17),
    datetime.datetime(2020, 2, 1),
    datetime.datetime(2020, 2, 2),
    datetime.datetime(2020, 2, 4)
]

groups = [list(group) for group in more_itertools.consecutive_groups(dates, ordering=lambda d: d.toordinal())]
# [
# [datetime.datetime(2020, 1, 15, 0, 0), datetime.datetime(2020, 1, 16, 0, 0), datetime.datetime(2020, 1, 17, 0, 0)],
# [datetime.datetime(2020, 2, 1, 0, 0), datetime.datetime(2020, 2, 2, 0, 0)],
# [datetime.datetime(2020, 2, 4, 0, 0)]
# ]

Here we have a list of dates, some of which are consecutive. We pass the dates to the consecutive_groups function, using the ordinal values of the dates for ordering. We then collect the returned groups into a list using a comprehension.
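
If you are curious how such grouping can work, or can't install more-itertools, the idea can be sketched with itertools.groupby from the standard library. This is an illustration of the technique, not necessarily how more-itertools implements it, and the name consecutive_runs is made up:

```python
import datetime
from itertools import groupby


def consecutive_runs(items, ordering=lambda x: x):
    # Hypothetical stand-in for consecutive_groups: within a run,
    # (position - ordinal value) stays constant, so groupby can split on it
    keyed = groupby(enumerate(items), key=lambda pair: pair[0] - ordering(pair[1]))
    return [[item for _, item in run] for _, run in keyed]


print(consecutive_runs([1, 2, 3, 11, 12, 21]))
# [[1, 2, 3], [11, 12], [21]]

dates = [datetime.date(2020, 1, 15), datetime.date(2020, 1, 16), datetime.date(2020, 2, 1)]
date_runs = consecutive_runs(dates, ordering=lambda d: d.toordinal())
print([len(run) for run in date_runs])
# [2, 1]
```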

Computing running sums of numbers is very easy in Python: you can just pass a list to itertools.accumulate and you get back the sums. What if we wanted to undo the accumulation, though?


from itertools import accumulate
import more_itertools  # pip install more-itertools

data = [4, 5, 12, 8, 1, 10, 21]
cumulative = list(accumulate(data, initial=100))
print(cumulative)
# [100, 104, 109, 121, 129, 130, 140, 161]

print([y - x for x, y in more_itertools.pairwise(cumulative)])
# [4, 5, 12, 8, 1, 10, 21]

With the help of more_itertools.pairwise, it's pretty simple!
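
If you'd rather avoid the extra dependency, Python 3.10 added itertools.pairwise to the standard library, and on any version you can pair a list with its own tail using zip. A minimal sketch of the zip variant (the initial argument of accumulate needs Python 3.8+):

```python
from itertools import accumulate

data = [4, 5, 12, 8, 1, 10, 21]
cumulative = list(accumulate(data, initial=100))

# Pair every element with its successor, then take the differences
restored = [y - x for x, y in zip(cumulative, cumulative[1:])]
print(restored)
# [4, 5, 12, 8, 1, 10, 21]
```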

As mentioned earlier, the new-ish walrus operator can be used inside comprehensions and generator expressions to bind a variable that stays available afterwards. That can be useful in many situations. One such situation is with the any() and all() functions.

Python's any() and all() functions can verify whether any or all values in some iterable satisfy a certain condition. What if you also want to capture the value that caused any() to return True (the so-called "witness") or the value that caused all() to fail (the so-called "counterexample")?


numbers = [1, 4, 6, 2, 12, 4, 15]

# Only returns boolean, not the values
print(any(number > 10 for number in numbers))  # True
print(all(number < 10 for number in numbers))  # False

# ---------------------
any((value := number) > 10 for number in numbers)  # True
print(value)  # 12

all((counter_example := number) < 10 for number in numbers)  # False
print(counter_example)  # 12

Both any() and all() use short-circuiting to evaluate the given expression. This means that they stop the evaluation as soon as they find the first "witness" or "counterexample" respectively. Therefore, with this trick the variable created by the walrus operator will always give us the first "witness"/"counterexample", provided that one exists.
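
One caveat: when no witness exists, the walrus variable simply holds the last element that was examined (and it is never assigned at all for an empty iterable). If that matters, next() with a default value makes the "nothing found" case explicit. A short sketch, with variable names chosen only for illustration:

```python
numbers = [1, 4, 6, 2, 12, 4, 15]

# No witness exists: any() is False and last_seen is just the final element
found = any((last_seen := number) > 100 for number in numbers)
print(found, last_seen)
# False 15

# next() with a default makes the "nothing found" case explicit
witness = next((number for number in numbers if number > 10), None)
missing = next((number for number in numbers if number > 100), None)
print(witness, missing)
# 12 None
```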

Closing Thoughts

Many of the features and tricks here are meant to demonstrate the possibilities and limits of list comprehensions. Learning these intricacies is, in my opinion, a good way of gaining a better understanding of a particular language feature, even if it's not really useful in daily coding. On top of that, it's fun.

With that said, I hope you learned something here. Be aware, though, that if you decide to use things like complex conditionals or loop breaks in your list comprehensions, your coworkers might end up hating you.
