Debugging tasks with complex or asynchronous control flow can be difficult since it renders useless one of the most important debugging tools we have: the debugger stack trace. To overcome this, we can have our tasks capture their own stack traces in debug builds to aid debugging by reconstructing the path that long-running task objects take through the program.
This article will also share my own custom code for creating and symbolicating stack traces that’s a little faster and more flexible than the existing code for doing this in Swift.
One of the best ways to analyze a program for the source of an error is a debugger stack trace.
Imagine we create a task object in function (1), then we pass the task object into another function (2), which then passes the task object into a third function (3) which represents the end of the task. In this example, path the object takes through the program and the call stack are the same thing:
If, by the end of function (3), we don’t have the expected result, we can set a breakpoint at the end of function (3) and the stack trace in the debugger will show functions (2) and (1) ahead of function (3) on the stack. We’ll know the path our task took through functions between creation and completion. If the task is processed purely as a series of nested calls, then the debugger stack trace at the deepest point will capture the entire history of the task and reconstructing the control flow is as simple as reading this stack trace.
Unfortunately, many tasks are not so simple. Objects are not just passed into functions but also returned from functions, eliminating their source frames from the call stack in the process. Objects are also passed between threads, eliminating all prior steps in their lifetime from the call stack.
Imagine our simple three step path for the object instead looked like this:
In this diagram, function (2) calls function (1) where the task object is created. Function (1) then returns the task object to function (2) which then asynchronously passes the object to function (3).
Function (1) is lost from the stack trace as soon as it returns to function (2) then function (2) is also lost from the stack trace when function (3) is invoked asynchronously. When we find a problem at the end of function (3), we’ve lost the history of control flow for the object through our program.
How can we maintain enough information about our computation to perform useful reactive analysis at step (3) under these circumstances?
We could use a log file
Traditionally, we address this type of control flow tracking problem with a log file. We log the construction of the task object, we log when we hand the object from one thread to another and we log its conclusion. By reading through the log file, we can see what events occurred and when.
The biggest problem with a log file though is controlling the amount of information it contains. A log file is usually program-wide and records events from all tasks in a single location. If multiple tasks are occurring simultaneously, they will be interleaved in the log file. If you have many tasks in your program, the log can become unreasonably difficult to read.
Another problem with log files is that they’re only as helpful as the information you choose to write – they don’t implicitly gather information. Good logging requires a non-zero amount of time to write a good log message.
Finally, logging only captures information from the current scope. This makes logging information far less helpful in reusable code where you’re more likely to be interested in the call site than the reusable callee.
A journal of stack traces instead
Instead of using log files to track asynchronous tasks, I prefer to use a structure that I call a “task journal”. It’s not a complicated data structure. Here’s an example:
Its an array of stack traces, where a stack trace is simply an array of return addresses (
UInt) that we can use to reconstruct functions and offsets within those functions at a later time.
Relative to a log file for tracking control flow, this task journal has the following advantages:
- it stays out of the way until you need it (it’s not blasted into a common location)
- it’s unique for the task (remains readable, even when multiple tasks are occurring at once)
- it captures information about callers, making it useful in reusable classes
The initial stack trace (as captured by
callStackReturnAddresses) is added on initialization of the
taskJournal and when your task reaches an interesting sequence point (conditional branch, interface boundary or other handover point), you append a stack trace.
Coming from a log files, it’s natural to think you might need to include a message with the stack trace but the message is usually “I am here”, something that is already implicit in the stack trace.
This doesn’t mean you shouldn’t store additional metadata with tasks – information like “this task was created with parameters X, Y and Z” or “task received error message E” is very helpful – but it should usually be stored in a custom format for your task context. The key purpose of the task journal is to have metadata that doesn’t require any custom formatting, structures or messages – you just add sequence points when you want.
Wait… why are we doing this at all?
Since Xcode 6, Xcode can automatically record backtraces when queueing blocks asynchronously using dispatch queues so you can follow a call stack across multiple asynchronous calls. You can watch this in action in Debugging in Xcode 6 from WWDC2014.
Do we still need to do stack trace capturing for ourselves?
Xcode will only record backtraces when queuing on a dispatch queue. You’re not in control of these backtraces so you can’t request them in other situations. There are therefore lots of examples of important control flow behaviors that this doesn’t capture:
- Ongoing tasks that don’t use dispatch, including common tasks like network connections and user input
- Operations on a single thread like basic function calls or taking one branch instead of another
Even when the control flow you want to track is asynchronous invocation of blocks on dispatch queues, recorded backtraces in Xcode relies on implicit behavior. Sometimes it works and sometimes it doesn’t and you get no feedback when the backtrace simply doesn’t appear. For example: in Xcode 7.2.1, Xcode’s recorded backtraces don’t appear to work at all for me in xctest targets. I don’t know if it’s a “known limitation” or a bug but it leaves me needing an alternative.
Capturing the stack trace
Now, I’ve written
callStackReturnAddresses in the examples above to capture the current stack trace. There isn’t a free function with that name in Swift or Cocoa.
There is a class function with that name on
NSThread, so we could write:
NSThread, nor any other class in Foundation, exposes a way to easily convert addresses to symbols at a later time when needed.
There is an alternative function on
which immediately converts the return addresses to symbols but that’s really too time consuming to attach to our main task machinery (we only want to convert to symbols after an error occurs, not on our critical path where we may need to capture hundreds or thousands of stack traces per second).
For performance reasons then, we need to stick to basic return addresses and we’ll need to convert them to symbols ourselves with the C function
dladdr. At a minimum:
NSThread.callStackReturnAddresses isn’t going to be a complete solution.
But I have another issue with the
NSThread implementation: you can’t skip over the top couple of stack frames and you can’t limit the number of captured frames. With
NSThread.callStackReturnAddresses, you must always capture the whole stack and then crop the result down to the information you actually need – which forces reallocates and wastes time.
We’d have a bit more flexibility if we could use the C function
backtrace but in Swift, we can’t. The
backtrace function comes from the “execinfo.h” header and is implemented as part of libSystem on OS X and iOS (which all Swift programs link against) but for whatever reason, the “execinfo.h” contents are not exposed to Swift programs.
Rolling my own
While the functionality of
NSThread.callStackReturnAddresses is probably enough for most people, I was dissatisfied enough that I bothered to re-implement it in Swift with the features I wanted:
To support the function, I also wrote:
which can take the
[UInt] and produce call stack symbols in an identical format to
Thread.callStackSymbols() but in an on-demand fashion.
A simple example using these functions
I have a class that I sometimes use called
DeferredWork. This object wraps a closure that is deferred to run at a later date. The class has a strict usage requirement: the work must be run at some point before the
DeferredWork object is deleted but it is not automatically run; it must be done manually.
A simplified version of the class looks like this (NOTE: this code assumes
DEBUG is defined in debug builds):
To explain how this works: when the
DeferredWork class is created, it takes ownership of the
work closure and agrees to run this closure at a later time. The location of the construction is saved as the first element in the
DeferredWork is handed over to another owner, the
recordHandover function can be manually invoked, recording the change in ownership.
Ultimately, if there is a problem – for this class a “problem” is a failure to invoke
deinit – we know where the
work closure came from we know the chain of custody that brought it to the current location and we can rapidly track down the precise location where the failure to run occurred.
Let’s look at how this solves the problematic three-step control flow I showed above:
DeferredWork.init is function (1). The
recordHandover function should be called immediately before the asynchronous call at the end of function (2). The
deinit function is invoked at the end of function (3) when the
DeferredWork object falls out of scope. If the
run function is not called at any time during function (2) or function (3) then we can immediately see the path the object took through our program and decide where we’ve made our mistake.
Implementation of callStackReturnAddresses
As with all code, I’d prefer to implement everything in Swift but Swift lacks an equivalent of to the gcc/clang builtin function
__builtin_frame_address which is critical for locating the current stack frame. I’ve written a C function named
frame_address that returns
__builtin_frame_address(1) and exposed this function to Swift in the bridging header.
Using this function, the internals of
callStackReturnAddresses are very simple. I store the result from
frame_address in a struct named
StackFrame (which is nothing more than few functions wrapped around a
uintptr_t that stores the result from
Traversing through a stack is as simple as dereferencing the frame pointer (since the value at the frame pointer is the address of the previous frame):
Getting return addresses from each frame is also easy since it is stored immediately after the frame pointer in each frame:
So we just do this repeatedly through the stack until we exceed the stack bounds.
callStackReturnAddresses function is about twice as fast as
NSThread.callStackReturnAddresses on the basic tests I’ve tried (roughly 2 million invocations per second per core on my Mac versus 1 million invocations per second for
Thread.callStackReturnAddresses). It’s easily fast enough to gather lots of data in Debug builds – even for fairly intensive computational paths.
The project containing this code is available on github: mattgallagher/CwlUtils.
- In a subdirectory of your project’s directory, run
git clone https://github.com/mattgallagher/CwlUtils.git
- Drag the “CwlUtils.xcodeproj” file into your own project’s file tree in Xcode
- Click on your project in the file tree to access project settings and click on the target to which you want to add CwlUtils.
- Click on the “Build Phases” tab and under “Target Dependencies” click “+” and add the CwlUtils_OSX or CwlUtils_iOS target as appropriate for your target’s platform.
- Still on “Build Phases”, if you don’t already have a “Copy Files” build phase with a “Destination: Frameworks”, add one. In this build phase, add “CwlUtils.framework”. NOTE: there will be two frameworks in the list with the same name (one is OS X and the other is iOS). The “CwlUtils.framework” will appear above the corresponding CwlUtils OS X or iOS testing target.
Note about step (1): it is not required to create the checkout inside your project’s directory but if you check the code out in a shared location and then open it in multiple parent projects simultaneously, Xcode will complain – it’s usually easier to create a new copy inside each of your projects.
This and the next few code posts I write are going to be part of a single framework, CwlUtils. To be blunt, I’m not really happy with this monolithic framework approach. Ideally, much of the code I’m hoping to produce would be separate, isolated classes that would be pulled together, on-demand, by a dependency manager. However, the existing third-party solutions for dependency management in Swift don’t work elegantly in this scenario (not entirely their fault since static linking and transparent Xcode integration are not possible).
I’ll definitely revisit this when the official Swift Package Manager is shipped as part of Xcode but at the moment, a dependency-free, monolithic framework is it. If you’re only interested in a small piece of code that I’ve written, you’ll need to extract it for yourself.
I’ve talked about ease and benefits of recording stack traces for ongoing tasks to aid debugging. Tracking down why a result is incorrect after asynchronous invocations or other complex control flows can be completely impractical without this information so it’s a good technique to keep in mind.
Use it or lose it: why safe C is sometimes unsafe Swift
Gathering system information in Swift with sysctl