There are certain bugs and issues that are very hard to troubleshoot. Just ask yourself, "How would I debug deadlock, segmentation fault, crashing application, or a hanging process?"
Now there's a tool to help you with all of that - PyStack is a powerful new debugger that, according to its docs, "uses forbidden magic to let you inspect the stack frames of a running Python process or a Python core dump."
So, let's put that statement to the test and explore how to use PyStack to track down and troubleshoot all of the above-mentioned issues and bugs...
Why?
There are many debugging tools out there, including interactive debuggers in your IDE, so you might be wondering, "Why is this even needed?"
While traditional debuggers or even simple print statements are often enough for debugging, there are bugs and issues that are much harder to track down and troubleshoot.
One such category is deadlocks, or more generally any application that gets stuck. When you have a hanging process, you might ask yourself, "Is it doing something? Or is it stuck/dead/deadlocked?" PyStack can help answer that.
Another category is hybrid applications - that is, applications that combine C/C++ and Python. That includes Python extension modules, but also libraries such as NumPy or TensorFlow, which many of us use every day. How would you debug NumPy crashing with a segfault?
And then there are weird issues that only happen in specific circumstances, e.g. under heavy load, or after an app has been running for X hours, etc.
But aren't there tools for this already? Well, yes, but actually no.
The only real alternative is GDB, which isn't exactly user-friendly, and it can't debug both Python and C/C++ at the same time.
There are also two Python tools, austin and py-spy; those, however, are profilers, not debuggers.
Now, what is so special about PyStack, and why is it superior to the above-mentioned tools? For a start, unlike GDB, PyStack only reads memory (and can optionally stop the process); it doesn't modify it. GDB can modify the state of your program, which can be a big no-no.
It can also inspect core dump files - which is a big statement, because core dumps can be wild. It's also able to print local variables as part of the stack trace. And it automatically downloads debugging information for the particular distro if it's not available locally. That's just a sample of the reasons why PyStack is very cool.
Setup
Before we start debugging, we need to do some setup:
sudo apt update
sudo apt install systemd-coredump python3-pip python3.10-venv
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
python3 -m venv ./venv
source ./venv/bin/activate
pip install pystack pytest-pystack
One of the features of PyStack is the ability to inspect core dumps, so we need to enable core dump generation on our system. The simplest way to achieve that is to install systemd-coredump; all core dumps will then be automatically stored in /var/lib/systemd/coredump/. Also, to be able to trace a process, PyStack needs to send ptrace syscalls to the traced process; we can temporarily enable that with echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope.
Finally, we need to install pystack itself, as well as its Pytest plugin.
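As a quick sanity check before moving on, the following sketch reads back the Yama setting we just changed (the path is the same sysctl we wrote to above):

```python
from pathlib import Path

# Yama's ptrace_scope controls which processes may attach to others;
# 0 means any process can trace its siblings, which PyStack needs.
scope = Path("/proc/sys/kernel/yama/ptrace_scope")
if scope.exists():
    print("ptrace_scope =", scope.read_text().strip())
else:
    print("Yama LSM not present; ptrace is likely unrestricted")
```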
Debugging
PyStack offers two ways of debugging a program: attaching to a running process, or analyzing the core dump of a crashed one.
Let's start with the former. We will use the following snippet that simply sleeps forever, simulating a hanging process:
# wait.py
from time import sleep


def wait():
    some_var = "data"
    other_var = [1, 2, 3]
    while True:
        sleep(5)
        print("Sleeping...")


wait()
We can run it in the background and then run pystack remote against its PID to see what's happening in the code:
nohup python wait.py &
# [1] 44000
pystack remote 44000 --locals --no-block
# Traceback for thread 44000 (python) [] (most recent call last):
# (Python) File "/home/.../wait.py", line 18, in <module>
# wait()
# (Python) File "/home/.../wait.py", line 15, in wait
# sleep(5)
# Locals:
# other_var: [1, 2, 3]
# some_var: "data"
We already know that the program hangs on the sleep(5) call, and the above output confirms it. It also shows us the local variables, which can provide important context.
Here's also a screenshot, so that you can see the nice color highlighting:
Unfortunately, the above example might not work on your Mac or Windows machine, because PyStack only supports Linux. It works just fine in Docker, though!
Let's demonstrate that with another code snippet, this time an actual deadlock of two threads:
# deadlock.py
import threading
import time

l1 = threading.Lock()
l2 = threading.Lock()


def f1(name):
    print('thread', name, 'about to lock l1')
    with l1:
        print('thread', name, 'has lock l1')
        time.sleep(0.3)
        print('thread', name, 'about to lock l2')
        with l2:
            print('thread', name, 'run into deadlock')


def f2(name):
    print('thread', name, 'about to lock l2')
    with l2:
        print('thread', name, 'has lock l2')
        time.sleep(0.3)
        print('thread', name, 'about to lock l1')
        with l1:
            print('thread', name, 'run into deadlock')


if __name__ == '__main__':
    t1 = threading.Thread(target=f1, args=['t1', ])
    t2 = threading.Thread(target=f2, args=['t2', ])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
To make this into a container image, you can use the following Dockerfile:
FROM python:3.10
COPY deadlock.py .
RUN pip install pystack pytest-pystack
CMD python3 -u deadlock.py
Then, to build, run, and debug it:
docker build -t python-pystack -f Dockerfile .
docker run --cap-add=SYS_PTRACE --name python-pystack --rm python-pystack
docker exec -it python-pystack /bin/bash
root@438137c1086c:/# ps aux
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# root 1 0.0 0.0 2484 576 ? Ss 15:20 0:00 /bin/sh -c python3 -u deadlock.py
# root 7 0.0 0.0 159348 7784 ? Sl 15:20 0:00 python3 -u deadlock.py
# root 10 0.0 0.0 6056 3592 pts/0 Ss 15:20 0:00 /bin/bash
# root 16 0.0 0.0 8652 3204 pts/0 R+ 15:20 0:00 ps aux
root@438137c1086c:/# pystack remote 7 --locals --no-block
When we run the container, we pass --cap-add=SYS_PTRACE to docker run - this gives the container the privileges to trace a process. Next, we exec into it and find the process we want to trace. After that, we run pystack remote and get the following output:
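As an aside, the standard library's faulthandler module can produce a similar all-thread dump - but only from inside the process, so the program has to be instrumented in advance, which is exactly the cooperation PyStack doesn't need. A minimal sketch:

```python
import faulthandler
import tempfile
import threading
import time

# Start a second thread so the dump shows more than one stack.
t = threading.Thread(target=time.sleep, args=(5,), daemon=True)
t.start()

# dump_traceback() writes directly to a file descriptor, so we use a
# real temporary file rather than an in-memory buffer.
with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    dump = f.read()

print(dump)  # one "Thread 0x..." block per thread, most recent call first
```

Unlike pystack remote, this requires changing the target program's source, and it can't help once the process is already stuck.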
Core Files
Sometimes we aren't lucky enough to have the process still running so that we can debug it live. Sometimes the application just crashes with nothing useful in the logs. In those cases, we might need to inspect the core dump of the process.
If you're unfamiliar with core files (core dumps), then here's a definition:
The core dump refers to the recording of the state of the program, i.e. its resources in memory and processor.
Essentially, it's a snapshot of the process at the moment it crashed - produced when the familiar message Segmentation fault (core dumped) occurs.
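Whether a core file actually gets written also depends on the process's core-size resource limit (the classic ulimit -c; pipe handlers like systemd-coredump generally sidestep it). You can check and raise it from Python with the stdlib resource module - a minimal sketch:

```python
import resource

# A core file is only written if the core-size limit allows it.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core size limit: soft =", soft, "hard =", hard)

# Raise the soft limit to the hard limit for this process and its children,
# equivalent to `ulimit -c unlimited` when the hard limit is unlimited.
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
```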
To analyze a core dump, we need to force Python to crash, for example by seg-faulting or aborting:
# crash.py
import os
import ctypes


def crash():
    some_var = "data"
    other_var = [1, 2, 3]
    print(some_var)
    ctypes.string_at(0)  # Segmentation Fault (core dumped)
    # os.abort()  # Aborted (core dumped)


crash()
If we then run this script, we will get the familiar message:
python crash.py
# Segmentation fault (core dumped)
cp /var/lib/systemd/coredump/core.python.1000.49...e5d.44204.1688657327000000.zst core.zst
unzstd core.zst
# core.zst : 4292608 bytes
pystack core ./core --locals
# ...
pystack core ./core --native
# ...
We can find the core file in /var/lib/systemd/coredump/. It's in .zst format, which is a compressed archive; to uncompress it, we use unzstd. We can then analyze the core file with pystack core ..., which gives the following output:
Notice that here we again see all the local variables, which is very cool, because we're working with a crashed program, so you can't just call __repr__ on the variables.
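As a quick aside, you can confirm which signal killed a program by re-running the crashing line in a child interpreter: on POSIX, a negative return code is the number of the fatal signal. A hypothetical sketch:

```python
import subprocess
import sys

# Run the crashing line in a child interpreter so this script survives.
# A negative return code encodes the fatal signal: -11 for SIGSEGV,
# -6 for SIGABRT.
proc = subprocess.run(
    [sys.executable, "-c", "import ctypes; ctypes.string_at(0)"],
)
print("child return code:", proc.returncode)
```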
While debugging a live process is often more convenient, you can also force the process to crash with kill -SIGABRT <PID>, which will generate a core file.
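A minimal sketch of that workflow, using a sleep process as a stand-in for a stuck program (note the 134 exit status, which is 128 + SIGABRT's signal number 6):

```shell
# Start a stand-in "stuck" process in the background and abort it.
sleep 60 &
pid=$!
kill -ABRT "$pid"
wait "$pid"
echo "exit status: $?"   # prints: exit status: 134
```

With systemd-coredump installed, the resulting core then shows up under /var/lib/systemd/coredump/ and can be fed to pystack core as shown above.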
As was mentioned in the beginning, you're most likely to find use for PyStack when using libraries like NumPy or PyTorch that have modules written in C/C++. So, let's look at a real NumPy bug that causes a segfault:
# pip install numpy
from multiprocessing import shared_memory

import numpy as np


def np_array_from_shm_name(name, shape=(2,)):
    shm = shared_memory.SharedMemory(name=name)
    np_array = np.ndarray(shape, buffer=shm.buf, dtype=np.int32)
    np_array[:] = (0, 0)
    return np_array


if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=8)
    np_array = np_array_from_shm_name(shm.name)
    np_array[:] = (0, 0)  # segfaults
    shm.unlink()
This is a code snippet adapted from this GitHub issue. If you run it, you will get a segfault, and analyzing the generated core file with PyStack gives much more information about the issue, which would be very hard to gather without it:
Pytest
During the initial setup we installed pytest-pystack along with pystack itself, so let's put it to use. This Pytest plugin automatically runs pystack remote on the process running our tests whenever a particular test exceeds a given timeout.
For demonstration, we will use the following two test cases:
# examples_test.py
import threading
import time

import deadlock


def test_sleeping_test():
    print("start")
    time.sleep(6)
    print("finished")


def test_deadlock():
    t1 = threading.Thread(target=deadlock.f1, args=['t1', ])
    t2 = threading.Thread(target=deadlock.f2, args=['t2', ])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
The first one simply sleeps for a couple of seconds, while the other reuses the deadlock example from earlier. To run the test suite with the PyStack plugin enabled, use:
pytest -s --pystack-threshold=2 --pystack-args='--locals' --pystack-output-file='./pystack.log'
This will execute the tests as usual, but if any of them exceeds 2 seconds, PyStack will inspect the process and write the trace to the specified file. I won't paste the generated file here, because it's quite long, but you can view it in this Gist.
Closing Thoughts
While PyStack is a tool with quite a narrow use case - you won't need it for debugging everyday issues - it can be invaluable when an issue like a segfault or deadlock actually arises.
Being able to figure out what a running program is up to, or what a program was doing when it crashed, is incredibly useful, and there really aren't any other tools that can do it so well. Also, if you're already using a profiler (e.g. py-spy or austin) or other tools that inspect stack information, give PyStack a shot - these tools complement each other nicely.