Testing actions over time

Testing that actions are scheduled correctly over time is a pain. It’s common to have tests that “mostly” work but don’t confirm accuracy, run slowly, fail periodically and exceed wait times whenever a breakpoint is hit during testing.

One of the key reasons is that thread scheduling is imprecise. Tenths of a second inaccuracy are common and occasional spikes of much higher inaccuracy can occur. If we wan to measure the timing of thread scheduled events, we end up needing events spaced over a long time to allow for significant windows of inaccuracy – this is not practical for large numbers of tests.

It is possible to mock the thread scheduler – so that we record the attempts to schedule a task – but on its own, an empty mock might not be enough. If we need to test how scheduled events interact then they still need to be performed, ideally as though they precisely met their timing target.

I’ll show a way to test events scheduled over time, threads and other execution contexts with a DebugContextCoordinator that functions as a basic task scheduler but operates over a simulated version of time so that testing of invocation times and task interaction can be precise and free from changes due to host activity.

A test case involving timing

Let’s imagine we want to test the following function:

class Service {
   func connect(timeout seconds: Double, handler: (Result<String>) -> ())
}

In general, we would want to test for three possible scenarios:

  1. The connection succeeds and the handler is invoked with a .success(value)
  2. The timeout elapses before the connection succeeds and the handler is invoked with a .failure(ServiceError.timeout).
  3. The Service instance is released before success or timeout and the handler is never invoked.

However, for this article, I want to focus exclusively on the second scenario: the timeout.

Basic testing with the XCTest framework might involve:

func testConnectionTimeout() {
   let expectation = expectation(description: "Waiting for timeout callback")
   let service = Service()
   service.connect(timeout: 0.001) { r in
      XCTAssert(r.value == nil)
      expectation.fulfill()
   }

   waitForExpectations(timeout: 10, handler: nil)
   withExtendedLifetime(service) {}
}

While this tests that a timeout is possible, the test lacks any rigor.

In order to avoid the common pattern where a timeout of 0 means “never timeout”, a small 0.001 second value is provided for the timeout. That means that there’s a very real window in which the service might legitimately complete faster than this timeout, resulting in testing failures. Even reducing this value to zero wouldn’t solve the problem since a zero second timer takes non-zero time and it’s technically possible for a success result to occur faster than a zero second timeout timer.

In addition, even if this test succeeds, we’re not actually testing that the timeout occurred at the correct time. The implemenation could confuse seconds for milliseconds and this test might still succeed.

Let’s fill in the Service implementation

Before we more forward and improve our testing, let’s step backward and fill in the definition of the Service so we can better define what we’re doing. Our Service is fairly simple. It takes an underlying ConnectionFunction an applies a timeout limit when waiting for a connection response.

The underlying ConnectionFunction can be swapped from the default NetworkService.init to an alternate implementation at construction time, allowing us to avoid network connections and get repeatable behavior when testing. Specifying alternate interfaces for any interally used external connections is also called dependency injection. I hope this isn’t news to anyone: it’s a fundamental aspect of isolating our code for testing.

NOTE: This code example is quite large given that it’s just applying a time limit to an underlying action. I didn’t want to skimp on necessary steps so there’s a lot of work associated with managing lifetimes and ensuring everything cancels correctly when lifetimes elapse. I’ve tried to comment the code to explain some of these decisions. There are more syntactically efficient ways to manage lifetimes for asynchronous tasks (same steps but hidden behind simpler interfaces) but that’s a topic for another article.

public protocol Cancellable: class { func cancel() }
enum ServiceError: Error { case timeout }

class Service {
   typealias ConnectionFunction =
      (DispatchQueue, @escaping (Result<String>) -> ()) -> Cancellable

   let underlyingConnect: ConnectionFunction
   var currentAction: Cancellable? = nil
   let queue = DispatchQueue(label: "\(Service.self)")
   
   // Construction of the Service lets us specify the underlying "connect" service
   init(connect: @escaping ConnectionFunction = NetworkService.init) {
      self.underlyingConnect = connect
   }

   // The connect function that we want to test
   func connect(timeout seconds: Double, handler: @escaping (Result<String>) -> ()) {
      var previousAction: Cancellable? = nil
      queue.sync {
         previousAction = self.currentAction
         
         // Tie the timer and underlying action together with a single lifetime object for
         // this `connect` action
         let timerAndAction = CancellableTimerAndAction()
         
         // Run the underlying connection
         let underlyingAction = self.underlyingConnect(queue) {
            [weak timerAndAction] result in
            // Cancel the action so no futher callbacks are invoked
            timerAndAction?.cancel()
            
            // Send the succes to the handler
            handler(result)
         }
         
         // Run the timeout timer
         let timer = DispatchSource.singleTimer(interval:
            DispatchTimeInterval.fromSeconds(seconds), queue: queue) {
            [weak timerAndAction] in
            // Cancel the action so no futher callbacks are invoked
            timerAndAction?.cancel()
            
            // Send the timeout to the handler
            handler(.failure(ServiceError.timeout))
         } as? DispatchSource
         
         // Store everything in the lifetime object for this action and then store that
         // in the parent
         timerAndAction.timer = timer
         timerAndAction.action = underlyingAction
         self.currentAction = timerAndAction
      }
      
      // A good rule of thumb: never release lifetime objects inside a mutex – you might
      // trigger a re-entrancy deadlock
      withExtendedLifetime(previousAction) {}
   }
}

where CancellableTimerAndAction is just a wrapper around two separate Cancellables so that they are tied together with the same lifetime:

class CancellableTimerAndAction: Cancellable {
   var timer: Cancellable? = nil
   var action: Cancellable? = nil

   func cancel() {
      timer?.cancel()
      action?.cancel()
   }
}

Assuming a valid implementation of NetworkService the previous tests would all likely succeed.

However, as previously stated: we are not correctly testing that the timeout occurs when it should and we’re still connecting to the network which (in a real-world example) would connect to an external service like a website making testing prone to failures for range of difficult to control reasons.

Let’s try testing using host time

Let’s start to test things better by replacing the NetworkService with a debug StringService where we can control the exact time the service takes and isn’t dependent on anything outside our program.

Additionally, we need to start testing the actual time taken. Lets look at what that involves:

let expectation = expectation(description: "Waiting for timeout callback")

// Use our debug `StringService`
let service = Service(connect: StringService.withDelay(2.0))

// Set up the time data we need
let startTime = mach_absolute_time()
let timeoutTime = 1.0
let targetTime = UInt64(timeoutTime * Double(NSEC_PER_SEC))
let leeway = UInt64(0.01 * Double(NSEC_PER_SEC))

service.connect(timeout: timeoutTime) { r in
   // Measure and test the time elapsed
   let endTime = mach_absolute_time()
   XCTAssert(endTime - startTime > targetTime - leeway)
   XCTAssert(endTime - startTime < targetTime + leeway)

   XCTAssert(r.value == nil)
   expectation.fulfill()
}

// Wait for all scheduled actions to occur
waitForExpectations(timeout: 10, handler: nil)

This mostly works but as you can see:

  • There’s a lot of boilerplate associated with getting and measuring host time and measuring both upper and lower bounds
  • Even in this trivial case, we’ve had to use 10 milliseconds of leeway (the actual timing is usually between 1 and 5 milliseconds slower than the targetTime on my computer)
  • In rare cases, this may still fail since heavy load or “stop the world” events on the host system (compressing memory, virtual memory thrashing, WindowServer crashes, sleep events) can cause many seconds delay.

Injecting a scheduler like any other dependency

When we wanted to make the NetworkService more testable, we provided an alternate StringService which fulfilled the same interface (and basic functionality) with a controller and deterministic behavior.

The approach to making time more testable is the same; we need to be able to switch between normal and testing implementations by supplying a different scheduling interface on construction. Unfortunately, libdispatch isn’t a single interface that we can swap out. In our code here, we’re using two separate interfaces, DispatchQueue and DispatchSource, so we’ll need to define a single new interface that unifies the functionality of both these interfaces.

We now get to why the CwlExec.swift file I presented in the previous article for specifying execution contexts also contains a range of timer and timestamp features – it is intended to be a unified interface for both these facets of libdispatch.

Lets rewrite the Service class to use Exec instead of libdispatch. The comments in this new version serve purely to document the differences compared to the previous libdispatch version.

class Service {
   // Added an `Exec` parameter to the `ConnectionFunction`
   typealias ConnectionFunction = (Exec, @escaping (Result<String>) -> ()) -> Cancellable
   let underlyingConnect: ConnectionFunction
   let handler: (Result<String>) -> ()

   // Changed from `DispatchSourceTimer` to the more general `Cancellable`
   var currentAction: Cancellable? = nil
   
   // Self-constructed serial `DispatchQueue` changed to user-provided `Exec`
   let context: Exec
   
   // The `context` is now provided on construction
   init(context: Exec = .default, connect: ConnectionFunction = NetworkService.init,
      handler: @escaping (Result<String>) -> ()) {
      self.underlyingConnect = connect
      
      // Since the provided context could be concurrent, we use `serialized` to obtain a
      // serialized version in those cases
      self.context = context.serialized()
   }

   func connect(timeout seconds: Double) {
      var previousAction: Cancellable? = nil
      context.invokeAndWait {
         previousAction = self.currentAction
         let timerAndAction = CancellableTimerAndAction()
         
         // `context` is passed instead of `queue`
         let underlyingAction = self.underlyingConnect(self.context) {
            [weak timerAndAction] result in
            // Cancel the timer if the success occurs first
            timerAndAction?.cancel()
            handler(result)
         }
         
         // Timer is constructed from the `context` instead of passing a `DispatchQueue`
         let timer = self.context.singleTimer(interval:
            DispatchTimeInterval.fromSeconds(seconds)) { [weak timerAndAction] in
            // Cancel the connection if the timer fires first
            timerAndAction?.cancel()
            handler(.failure(ServiceError.timeout))
         }
         
         timerAndAction.timer = timer
         timerAndAction.action = underlyingAction
         self.currentAction = timerAndAction
      }
      
      withExtendedLifetime(previousAction) {}
   }
}

You can see that there’s a lot of little minor changes but no structural changes. The Exec type is intended to act as a low-friction abstration of DispatchQueue-like behavior.

As a further advantage, the handler callback for the connect function is now invoked in the Exec of the user’s choice, so this change in interface offers all the same advantages discussed in the previous article.

Testing with DebugContextCoordinator

Now that libdispatch is replaced by the more flexible Exec, we can use the flexibility to schedule events with something other than libdispatch.

The new scheduler is named DebugContextCoordinator. This class runs tasks in an appropriate order but they will run according to a simulated, rather than real version of time, making everything testable and deterministic. With everything run in appropriate order, you should be able to simply substitute instances of Exec.custom created from a DebugContextCoordinator in place of the standard Exec cases and most common scheduling should continue to work, unchanged.

Let’s look at how our test for the timeout case would work:

// Construct the coordinator
let coordinator = DebugContextCoordinator()

// Construct an instance of `Exec` that behaves like `Exec.syncQueue` (a serial
// `DispatchQueue`) but actually schedules on the coordinator
let context = coordinator.syncQueue

// Construct the `Service` using our debug context
let service = Service(context: context, connect: StringService.withDelay(2.0))

// Run the `connect` function
let timeoutTime = 1.0
var result: Result<String>? = nil
service.connect(timeout: timeoutTime) { r in
   XCTAssert(result == nil)
   result = r
   XCTAssert(coordinator.currentTime == UInt64(timeoutTime * Double(NSEC_PER_SEC)))
   XCTAssert(coordinator.currentThread.matches(context))
}

// Perform all scheduled tasks immediately
coordinator.runScheduledTasks()

// Ensure we got the correct result
XCTAssert(result?.error as? ServiceError == ServiceError.timeout)

withExtendedLifetime(service) {}

Some clear points to note here:

  • We no longer need expectation and waitForExpectations since all actions (even those nominally scheduled over time) are performed immediately when runScheduledTasks is called
  • We’re testing the exact time (no more range required)
  • We can also test that the callback thread matches our specified context

Less visible:

  • it now takes less than a millisecond to run (versus 1.1 seconds, previously) since it doesn’t need to wait for any real-world time to elapse to simulate a second in the debug context.
  • pausing the debugger or taking other actions that delay execution will never cause a test to fail its timing test – host time is now unrelated to the virtual time that the DebugContextCoordinator uses.

Downsides:

  • if your code relies on probing its execution context directly (for example, with Thread.currentThread, DispatchSpecificKey or even just getting the current time), you won’t get the same resullts and in some cases, this will prevent your tests working. If possible, remove queries of the execution context or ask the Exec instance for information about the execution context.
  • since DebugContextCoordinator runs single threaded, it won’t help you find threading bugs – thread testing will need to be done a different way.

Usage

The project containing the DebugContextCoordinator, along with all dependencies like Exec is available on github: mattgallagher/CwlUtils.

The CwlDebugContext.swift file has a few dependences on other files in the same project, including CwlExec.swift, CwlDispatch.swift and the CwlMutex.swift file. It probably wouldn’t be difficult to grab these files separately and add them to your project (if you prefer static linkage) but I recommend following the instructions in the ReadMe.md file for the project and simply cloning the whole repository and adding the framework it produces to your own projects.

As for the tests shown in this article, the final version of the Service class and the tests involved are part of the CwlDebugContextTests.swift file so you can play around with how the DebugContextCoordinator behaves at runtime.

Conclusion

The Exec type presented in the previous article as a means of controlling callback execution context can also be used as a means of abstracting timing and scheduling too. The two ideas benefit from each other: when we added Exec to the Service class to abstract the timing, we automatically gained the ability to invoke the callback handler in an execution context of the user’s choosing.

The advantage with using a DebugContext and DebugContextCoordinator to handle timing is that you’re no longer dependent on the actual timing information from the host machine, leading to greater precision and greater reliablility in testing.

Looking forward…

While not directly related to timing, I included a “NOTE” before a large code sample in this article: properly handling lifetimes for asynchronous tasks can make a simple composition of actions (like combining an action with a timeout) quite complicated. In the next article, I’ll begin looking at abstractions used for composing asychronous tasks with the aim of reducing this complexity.