If you're a Python developer, you're probably aware that the match/case statement was introduced to the language in Python 3.10. Although it looks like the basic switch statement we all know from other languages, in Python it's much more than just an alternative if syntax.
In this article we will explore advanced features of the match/case syntax, or as it's properly called, structural pattern matching. We will also go over tips and tricks for using it effectively, including recipes that will help you use it to its full potential.
RegEx Matching
The match/case syntax provides a lot of matching patterns out of the box, but unfortunately there's currently no native way to match against regular expressions. We can, however, implement it pretty easily thanks to the fact that literal and value patterns use == (__eq__) to evaluate the match. Therefore, all we need is a class that implements a custom __eq__ method:
import re
from dataclasses import dataclass
@dataclass
class RegexEqual(str):
    string: str
    match: re.Match = None
    def __eq__(self, pattern):
        self.match = re.search(pattern, self.string)
        return self.match is not None
print(bool(RegexEqual("Something") == "^S.*ing$"))  # True
match RegexEqual("Something to match"):
    case "^...match":
        print("Nope...")
    case "^S.*ing$":
        print("Closer...")
    case "^S.*match$":
        print("Yep!")
The above could also be further extended to allow us to access RegEx capture groups, so that we can capture them into variables as part of the matching:
@dataclass
class RegexEqual(str):
    ...
    def __getitem__(self, group):
        return self.match[group]
match RegexEqual("Something to match"):
    case "^Some(.*ing).*$" as capture:
        print(f"Captured: '{capture[1]}'")  # Captured: 'thing'
JSON Processing
A common use case for match/case is efficient matching of JSON structures in the form of Python dictionaries. This can be done with the mapping pattern, which is triggered by case {...}: ... like so:
orders = [
    {"statusCode": 200, "id": 1345347, "price": 235.80, "items": ["HDD", "CPU", "Headphones", "Webcam"]},
    {"statusCode": 500, "id": 0, "price": 0, "items": []},
    {"statusCode": 202, "price": 30.80, "items": ["Thumb Drive"]},  # no "id" field
    {"statusCode": 404},
]
def process_json(response: dict):
    match response:
        case {"statusCode": 200, "id": _, "price": _, "items": [*products]}:  # Capture list
            print(f"Order contains following products: {products}")
        case {"statusCode": code, "id": _, "price": _, "items": _} if code >= 400:  # Capture and guard
            print(f"Failed with status code: {code}")
        case {"statusCode": _, "price": _, "items": _}:  # Reached only when "id" is absent or the guard above fails
            print("Missing required field: ID")
        case {"statusCode": code, **fields}:  # Destructure rest of the dictionary
            print(f"Code: {code}, data: {fields}")
for order in orders:
    process_json(order)
# Order contains following products: ['HDD', 'CPU', 'Headphones', 'Webcam']
# Failed with status code: 500
# Missing required field: ID
# Code: 404, data: {}
I think that the above code nicely demonstrates the versatility of the feature:
- In the first case we can see that you can use variable capture to capture a subpattern.
- In the second one we can see it paired with a guard, which can be handy when matching REST API response codes.
- The third one shows that you can use it to validate that all dict/JSON fields are present.
- Finally, the fourth case demonstrates that you can alternatively destructure part of the mapping so that you don't have to list out all individual fields.
Set Membership
Similarly to the RegEx matching shown earlier, there's no built-in pattern for testing membership in an existing set of values. We can, however, once again take advantage of __eq__ and implement our own set-matching class:
from types import SimpleNamespace
class InSet(set):
    def __eq__(self, elem):
        return elem in self
Produce = SimpleNamespace(
    fruit=InSet({"apple", "banana", "peach"}),
    vegetable=InSet({"cucumber", "lettuce", "onion"})
)
food = "cucumber"
match food:
    case Produce.fruit:
        print(f"{food} is a fruit.")
    case Produce.vegetable:
        print(f"{food} is a vegetable.")
# cucumber is a vegetable.
The above example uses SimpleNamespace to wrap the individual sets into a single data container. You might try to "simplify" this and use case fruit and case vegetable directly. That, however, won't work, because a bare name triggers the capture pattern, which means that it would assign the value of food to fruit or vegetable respectively. To avoid this, a dotted name such as X.some_var must be used, because dotted names always trigger the value pattern.
Matching Builtin Types
It's pretty common to write conditionals in Python that test what type a variable is. You can use structural pattern matching for this, but there's a gotcha:
some_var = "not a float"
match some_var:
    case float:  # Wrong! - matches any subject, because Python sees float as a variable
        print(f"'{some_var}' is float")
# Prints: 'not a float' is float
some_var = 3.14
match some_var:
    case float():  # Correct!
        print(f"{some_var} is float")
# Prints: 3.14 is float
As with the previous example about set membership, if you use something like case float or case int, you will trigger the capture pattern, which binds the subject to a variable named float and thereby shadows the builtin. Instead, you have to use case float(), which triggers a class pattern and tests whether the subject of match ... is an instance of the specified type.
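The empty-parentheses form performs an isinstance() check and works for any class. In addition, 11 builtin types, namely bool, bytearray, bytes, dict, float, frozenset, int, list, set, str and tuple, accept a single positional subpattern that is matched against the whole subject, e.g. case float(x): .... If you want to match on an interface rather than a concrete type, you can use an Abstract Base Class, such as the ones in collections.abc, e.g. case collections.abc.Iterable(): ....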
We now know how to test the type of a variable, but what if we want to match a raw type itself, rather than an instance of it?
import builtins
type_ = int
# Matches raw types, not their instances
match type_:
    case builtins.str:
        print(f"{type_} is a String.")
    case builtins.int:
        print(f"{type_} is an Integer.")
    case _:
        print("Invalid type.")
# Prints: <class 'int'> is an Integer.
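In that case we have to use Python's builtins module, which gives us direct access to all of Python's built-in identifiers, including types such as builtins.str, builtins.dict or builtins.complex. The dotted name makes each case a value pattern, so the type object is compared with == instead of being captured.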
Matching Positional Arguments
By default, when using the class pattern to match a class, such as case MyClass(key="value"): ..., you're required to use keyword arguments. That can, however, be a little verbose if your class has many attributes. To solve this we can use __match_args__:
class Location:
    __match_args__ = ('country', 'city')
    def __init__(self, country, city):
        self.country = country
        self.city = city
def test_positional_args(location):
    match location:
        case Location("Germany", "Berlin"):
            print("Hallo Berlin!")
        case Location(_, "London"):
            print("There's London in multiple countries...")
        case Location("Canada", _):
            print("Hello Canada!")
test_positional_args(Location("Canada", "Toronto"))
# Prints: Hello Canada!
# Without __match_args__: TypeError: Location() accepts 0 positional sub-patterns (2 given)
The __match_args__ class attribute allows us to specify a tuple of instance attribute names in the order in which they will be used as positional arguments. Not all instance attributes have to be listed in __match_args__; consider putting only the required ones there while leaving out the optional ones.
Additionally, if you use a dataclass, you get this feature out of the box, with the order of field definitions used as the order of positional arguments:
from dataclasses import dataclass
@dataclass
class Location:
    country: str
    city: str
# __match_args__ present without explicitly defining it:
print(Location.__match_args__)
# ('country', 'city')
Soft Keywords
If you have an old code base that happens to use case or match as variable names, then you might think that you'd have to refactor your code in order to upgrade to Python 3.10, which introduces the case and match keywords. That's however not the case, because both of these are "soft keywords", which means that they're treated as reserved words only in contexts where that makes sense.
Thanks to that, even the following (questionable/wild) code will work:
import re
support_ticket = "Support case no.: 152 is closed."
match = re.match(r"Support case no\.: (\d+) is (open|closed)\.", support_ticket)
match = match.groups() if match else None
match match:
    case case, "closed":
        print(f"Case {case} is done.")
    case case, "open":  # the regex captures "open", not "opened"
        print(f"Case {case} is still in progress.")
    case _:
        print("Case has unknown status")
Branch Reachability
While very powerful, structural pattern matching has limitations and quirks that you should be aware of.
One such limitation is branch reachability:
rows = [
    {"success": True, "value": 100},
    {"success": False, "value": 200},
    {"success": True, "value": 200},  # Should be matched by 3rd case
    {"success": False, "value": 200},
]
for row in rows:
    match row:
        case {"success": True, "value": _}:
            print("First")
        case {"success": _, "value": 200}:
            print("Second")
        # Unreachable. If we move it to the top, it will work correctly
        case {"success": True, "value": 200}:
            print("Third")
        case {"success": _, "value": _}:
            print("None matches")
# Prints:
# First
# Second
# First
# Second
In the example above, the third record in the rows variable should clearly be matched by the third case, but it isn't. Instead, it falls into the first one, because cases are tried from top to bottom and the first matching pattern wins. To fix this we need to move the third case to the top, and it will work as expected.
This just shows that the order of cases matters and that you should be careful when writing these kinds of match/case statements, because a misplaced case can create hard-to-debug issues.
Hopefully, future versions of Python will include some level of code analysis that might catch at least some of these issues.
Exhaustiveness
Another hard-to-debug issue you might encounter stems from a missing branch/case, that is, when match doesn't cover all the possible cases. This can be mitigated by adding safety asserts like so:
from enum import Enum
from typing import NoReturn
class Color(Enum):
    RED = "Red"
    GREEN = "Green"
    BLUE = "Blue"
def exhaustiveness_check(value: NoReturn) -> NoReturn:
    assert False, 'This code should never be reached, got: {0}'.format(value)
def some_func(color: Color) -> str:
    match color:
        case Color.RED:
            return "Color is red."
        case Color.GREEN:
            return "Color is green."
    exhaustiveness_check(color)
some_func(Color.RED)
This will make the code throw an error at runtime when an unhandled value such as Color.BLUE is passed in, but that might be too little too late. A better solution is to use a static type checker like mypy:
# python -m pip install -U mypy
def some_func(color: Color) -> str:
    match color:
        case Color.RED:
            return "Color is red."
        case Color.GREEN:
            return "Color is green."
    exhaustiveness_check(color)
some_func(Color.RED)
# `mypy examples.py` leads to:
# ...error: Argument 1 to "exhaustiveness_check" has incompatible type "Literal[Color.BLUE]"; expected "NoReturn"
After checking the above code with mypy examples.py we get an error telling us exactly which case is missing. See also the mypy docs for details on exhaustiveness checking.
A similar feature is also available in Pyright, which is Microsoft's static type checker for Python.
Conclusion
Structural pattern matching in Python is very close in syntax to the switch/case known from other languages, but it's much more than that: it's a control flow and destructuring tool.
To take full advantage of all its features, not just the advanced ones presented here, make sure to take a look at PEP 636 which provides an in-depth tutorial for the majority of its use cases.