We are all familiar with Python's generators and all their benefits. But what if I told you that we can make them even better by combining them with recursion? So, let's see how we can use them to implement "lazy recursion" and supercharge what we already do with generators in Python!
Why Even Bother?
Before we get into the code, let's first ask ourselves: "Why even bother? Do we really need recursive generators?" And the answer is... it depends. Naturally, a recursive generator shares the pros and cons of both plain generators and ordinary recursive functions.
For generators, the number one reason to use them is "lazy" evaluation - that is - computing elements one at a time rather than all at once. As for recursion, it simply makes sense for certain algorithms or problems that it can solve elegantly and succinctly, such as tree traversal.
Therefore, the situations where recursive generators make sense are naturally recursive algorithms that might process a large amount of data or elements, and would therefore consume a lot of memory if run "eagerly".
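To make that trade-off concrete, here is a minimal sketch (with made-up function names) contrasting an eager recursive function with its lazy, generator-based counterpart:

```python
# Eager recursion: the entire result list is built in memory
# before the caller sees a single element.
def countdown_eager(n):
    if n == 0:
        return []
    return [n] + countdown_eager(n - 1)

# Lazy recursion: elements are produced one at a time, on demand.
def countdown_lazy(n):
    if n > 0:
        yield n
        yield from countdown_lazy(n - 1)

print(countdown_eager(3))            # [3, 2, 1]
print(next(countdown_lazy(10**9)))   # 1000000000 - instant, no giant list in memory
```

Both produce the same sequence, but the lazy version only does as much work as the caller actually consumes.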
The Basic Example
Now that we know why we would use a recursive generator, let's take a look at a "simple" example to understand how we can write one:
def binary_counter():
    yield "1"
    for prefix in binary_counter():
        yield prefix + "0"
        yield prefix + "1"
This short function - as the name suggests - yields successive numbers in binary. When called, it first simply yields "1", after which comes the recursive call. The recursive call also yields "1", but that value is given to the previous, non-recursive call as a prefix. With the prefix computed, the non-recursive call yields two values: "10" and "11". After that, the recursive call continues execution by making another recursive call, going a level deeper, and so the loop continues - prefixes bubble upwards, so the outermost frame is always the one yielding results, ending first with "0" and then "1".
Now, if we were to run it, we would get:
for value in binary_counter():
    print(f"value: {value}")
    if value == "1000":
        break
# value: 1
# value: 10
# value: 11
# value: 100
# value: 101
# value: 110
# value: 111
# value: 1000
When it comes to recursion, explaining the code isn't always sufficient to really understand what's happening. So, if you're not sure how binary_counter actually works, let's try working out the individual steps:
def _binary_counter(depth=0):
    print(f'First "yield" at depth {depth}.')
    yield "1", depth
    for prefix, d in _binary_counter(depth+1):
        print(f'First recursive "yield" at depth {depth}.')
        yield prefix + "0", d
        print(f'Second recursive "yield" at depth {depth}.')
        yield prefix + "1", d
for value, depth in _binary_counter():
    print(f"value: {value}, depth: {depth}")
    if value == "1000":
        break
The modified version above adds a depth parameter and a couple of print calls to help demonstrate what the code does. If we now run this code, we get the following:
# First "yield" at depth 0.
# value: 1, depth: 0
# First "yield" at depth 1.
# First recursive "yield" at depth 0.
# value: 10, depth: 1
# Second recursive "yield" at depth 0.
# value: 11, depth: 1
# First "yield" at depth 2.
# First recursive "yield" at depth 1.
# First recursive "yield" at depth 0.
# value: 100, depth: 2
# Second recursive "yield" at depth 0.
# value: 101, depth: 2
# Second recursive "yield" at depth 1.
# First recursive "yield" at depth 0.
# value: 110, depth: 2
# Second recursive "yield" at depth 0.
# value: 111, depth: 2
# First "yield" at depth 3.
# First recursive "yield" at depth 2.
# First recursive "yield" at depth 1.
# First recursive "yield" at depth 0.
# value: 1000, depth: 3
I hope this makes it a bit clearer. If not, consider manually working out the steps, or using the debugger in your IDE of choice, so that you can see the stack frames and variables in real time.
You might also be asking, "What's the point of computing binary numbers this way?" - and the answer is, well, there's no good reason. There are definitely better and more readable ways to do that, but I think it demonstrates the concept fairly well. With that said, let's now look at more useful examples of how we can use recursive generators...
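For comparison, one of those more readable alternatives - a sketch using itertools.count and Python's built-in binary formatting, which stays just as lazy without any recursion:

```python
from itertools import count, islice

# Each positive integer, formatted in base 2 - no recursion needed.
binary = (format(n, "b") for n in count(1))

print(list(islice(binary, 8)))
# ['1', '10', '11', '100', '101', '110', '111', '1000']
```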
Putting It To Good Use
When it comes to recursion, the obvious candidates for examples are various mathematical functions or - as shown here - combinatorics, more specifically the power set:
def powerset(sequence):
    if not sequence:  # base case; also handles empty input, which a len == 1 check would not
        yield []
    else:
        for item in powerset(sequence[1:]):
            yield [sequence[0]] + item
            yield item

for s in powerset(list(range(4))):
    print(s)
# [0, 1, 2, 3]
# [1, 2, 3]
# [0, 2, 3]
# [2, 3]
# [0, 1, 3]
# [1, 3]
# [0, 3]
# [3]
# ...
The function here uses a similar flow to the binary counter earlier. To better understand it, we can translate the recursive part as:
- For every result in a smaller power set (powerset(sequence[1:]))...
- ...yield the unused value ([sequence[0]]) plus the result (item)
- ...then yield the result (item) alone
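For comparison, a common itertools-based recipe computes the same power set just as lazily, though it yields tuples rather than lists and in a different order (grouped by subset size):

```python
from itertools import chain, combinations

def powerset_iter(seq):
    # All subsets of size 0, then size 1, ... up to len(seq).
    return chain.from_iterable(combinations(seq, r) for r in range(len(seq) + 1))

print(list(powerset_iter([0, 1, 2])))
# [(), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
```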
While mathematical functions can be nicely implemented using recursion, they aren't really something we use on a daily basis, so let's now take a look at something different:
def accumulate(values):
    yield values[0]
    if len(values) > 1:
        values[1] += values[0]  # note: this mutates the list passed in
        yield from accumulate(values[1:])

for val in accumulate([1, 2, 3, 4]):
    print(val)
The accumulate function above computes a running total (sum) of the elements of its list parameter. While the above code works, I don't recommend using it in practice, because you can and should use the following instead:
from itertools import accumulate
print(list(accumulate([1, 2, 3, 4])))
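As a side note, itertools.accumulate isn't limited to sums - it also accepts an optional binary function and (since Python 3.8) an initial value:

```python
from itertools import accumulate
import operator

print(list(accumulate([1, 2, 3, 4])))                 # [1, 3, 6, 10]
print(list(accumulate([1, 2, 3, 4], operator.mul)))   # [1, 2, 6, 24]
print(list(accumulate([1, 2, 3], initial=100)))       # [100, 101, 103, 106]
```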
While on the topic of itertools, let's also see how we can re-implement another common function:
def flatten(nested):
    try:
        for sublist in nested:
            yield from flatten(sublist)
    except TypeError:
        yield nested

print(list(flatten([[1, 2], 3])))
The flatten function can be used to un-nest a nested list (or other iterable). I show this one here because it uses a bit of a different flow than the earlier examples - it leverages try/except to separate the base (non-recursive) part from the recursive code. It can, however, be rewritten without try/except if desired:
from collections.abc import Iterable

def flatten(nested):
    if not isinstance(nested, Iterable):
        yield nested
    else:
        for sublist in nested:
            yield from flatten(sublist)

print(list(flatten([[1, 2], 3])))
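One caveat with both versions: strings are iterable too, and iterating over a one-character string yields that same string, so flatten would recurse until it hits a RecursionError on any string element. A sketch with an explicit guard for str/bytes:

```python
from collections.abc import Iterable

def flatten(nested):
    # Treat strings and bytes as atoms - iterating a 1-char string yields
    # the string itself, which would otherwise recurse endlessly.
    if isinstance(nested, (str, bytes)) or not isinstance(nested, Iterable):
        yield nested
    else:
        for sublist in nested:
            yield from flatten(sublist)

print(list(flatten([[1, ["a", 2]], 3])))
# [1, 'a', 2, 3]
```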
When talking about recursion, we obviously have to show an example of a recursive data structure, in this case a binary tree:
from __future__ import annotations  # lets "Node" reference itself in the annotations below

from typing import Optional
from dataclasses import dataclass

@dataclass
class Node:
    data: int
    left: Optional[Node] = None
    right: Optional[Node] = None

    def __iter__(self):
        if self.left:
            yield from self.left
        yield self.data
        if self.right:
            yield from self.right

def inorder(node):
    if node:
        yield from inorder(node.left)
        yield node.data
        yield from inorder(node.right)
The above code implements a binary tree, including a recursive generator in the form of the __iter__ method. The same traversal is also implemented in the inorder function, which makes the recursive calls a little clearer.
To demonstrate the usage of the above code, let's create a simple tree:
t3 = Node(data=5)
t5 = Node(data=32)
t4 = Node(data=10)
t1 = Node(data=8, left=t3, right=t4)
t2 = Node(data=35, right=t5)
tree = Node(left=t1, right=t2, data=20)
# tree
# / \
# t1 t2
# / \ \
# t3 t4 t5
for node in tree:
    print(node)
# 5
# 8
# 10
# 20
# 35
# 32
Similar to the traversal of (binary) trees, we can also use recursive generators when traversing JSON-like data:
def traverse_json(parent, data, depth=0):
    for key, value in data.items():
        if not isinstance(value, dict):
            print(f"{' ': <{depth*2}}{parent}: {key} -> {value}")
            yield key, value
        else:
            yield from traverse_json(key, value, depth + 1)
Traversing JSON this way might be practical if you're working with very large data that would consume a lot of memory if loaded all at once.
By now, you might be getting the hang of how these weird generators work, but let's look at what happens when we run the above code anyway:
nested_dict = {
    "A": {
        "A1": "data",
        "C": {"C1": "some value"},
    },
    "B": {
        "B1": "more data",
        "D": {
            "D1": "another value",
            "E": {"E1": "data", "E2": "even more data"},
        },
    },
}

print(list(traverse_json(None, nested_dict)))
#   A: A1 -> data
#     C: C1 -> some value
#   B: B1 -> more data
#     D: D1 -> another value
#       E: E1 -> data
#       E: E2 -> even more data
# [
# ('A1', 'data'), ('C1', 'some value'), ('B1', 'more data'),
# ('D1', 'another value'), ('E1', 'data'), ('E2', 'even more data')
# ]
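Here is the lazy payoff: because traverse_json is a generator, we can stop after the first few pairs and the rest of the structure is never visited. A self-contained sketch of the same traversal, minus the debug print:

```python
from itertools import islice

def traverse_json(parent, data):
    # Same recursive traversal as above, without the debug print.
    for key, value in data.items():
        if isinstance(value, dict):
            yield from traverse_json(key, value)
        else:
            yield key, value

nested = {"A": {"A1": "data", "C": {"C1": "some value"}}, "B": {"B1": "more data"}}

# Only the first two pairs are ever computed; the "B" subtree is untouched.
print(list(islice(traverse_json(None, nested), 2)))
# [('A1', 'data'), ('C1', 'some value')]
```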
And finally, another tree-like data structure that is commonly traversed recursively is a file tree:
# Pointless, but recursive:
from pathlib import Path

def get_paths(path):
    if path.is_file():
        yield path
    elif path.is_dir():
        for subpath in path.glob("*"):
            yield from get_paths(subpath)

for p in get_paths(Path("/some/path")):
    ...  # Do something with path

# Realistic solution:
path = Path("/some/path")
for p in path.rglob("*"):
    ...  # Do something with path
Here we implement a get_paths function that recursively yields all the files under the specified path. With that said, you're better off using the builtin path.rglob("*") for this task, as it also returns a generator.
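Besides rglob, the standard library's os.walk is itself a generator over directories; a sketch (with a hypothetical get_paths_walk name) yielding the same file paths as the recursive version:

```python
import os
from pathlib import Path

def get_paths_walk(root):
    # os.walk lazily yields (dirpath, dirnames, filenames) triples,
    # one directory at a time.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield Path(dirpath) / name
```

Like the recursive version, nothing is listed until you actually start iterating.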
Also, while not useful in this instance, it's good to note that the send() method can also be used with recursive generators. So, here's an alternative implementation of the above function:
def get_paths(path):
    if path.is_file():
        yield path
    elif path.is_dir():
        for subpath in path.glob("*"):
            walker = get_paths(subpath)
            while True:
                try:
                    x = walker.send(None)
                except StopIteration:
                    break
                yield x

for p in get_paths(Path("/some/path")):
    print(p)
This style of generator can be useful when you need to control the recursion, or when you need to communicate with the coroutine.
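To illustrate that two-way communication with a minimal, made-up example: when a recursive generator uses yield from, values passed to send() are forwarded all the way to the deepest suspended frame, so the caller can, for instance, tell the recursion to stop descending:

```python
def deep_count(limit, depth=0):
    # The caller can send True to cut the recursion short.
    stop = yield depth
    if not stop and depth < limit:
        # "yield from" forwards send() values to the inner frame.
        yield from deep_count(limit, depth + 1)

gen = deep_count(5)
print(next(gen))        # 0
print(gen.send(False))  # 1
print(gen.send(False))  # 2
# gen.send(True) would now raise StopIteration - the recursion stops early.
```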
Conclusion
The examples in this article show - in my opinion - elegant solutions to many problems that can be expressed recursively. However, elegant doesn't always mean better; oftentimes, a less "elegant" or succinct solution will produce much more readable and generally better code.
So, let's not try to "shoehorn" recursive generators into code wherever possible, and only use them where appropriate - that is - when implementing a recursive function that would benefit from lazy evaluation.