Advanced programming tips, tricks and hacks for Mac development in C/Objective-C and Cocoa.

Assign, retain, copy: pitfalls in Obj-C property accessors

In this post, I'll look at some very subtle problems that can occur if a getter or setter method chooses the wrong memory management pattern and, in the process, explain why NSDictionary copies its keys rather than simply retaining them.

Scope of this article: in this post I'll be discussing basic memory and mutability considerations in Objective-C accessor methods. If you're interested in atomicity and thread safety issues in accessor methods, please read my earlier post on Memory and thread-safe custom property methods. This post will only look at non-atomic accessors.

Why implement your own accessors?

If you know about Objective-C's properties, you may already be asking, "Why implement your own accessors at all?". For simple accessors, you can (and probably should) use the synthesized @property accessors and not worry about the implementation. The reason for this post is that you will often need to implement accessor methods yourself to attach additional behaviors to the get or set action, so you always need to know how the accessor should work.

The one you already know: the assign pattern

Since you're reading this blog, it's probably safe to assume that you already know how a basic assign accessor looks:

- (SomeVariable)someValue
{
    return someValue;
}

and a setter method looks like this:

- (void)setSomeValue:(SomeVariable)aSomeVariableValue
{
    someValue = aSomeVariableValue;
}

For most non-object data, that's all you need. This is the implementation you'd get with a non-atomic assign synthesized @property.

Retain: first opportunities for failure

If you're working with retain/released objects in Objective-C, the object assigned to an instance variable may be deallocated while you're still using it unless your setter retains it, so the setter must handle this.

The most thorough, all-purpose way of handling retain and release issues in a setter looks like this:

- (void)setSomeInstance:(SomeClass *)aSomeInstanceValue
{
    if (someInstance == aSomeInstanceValue)
    {
        return;
    }
    SomeClass *oldValue = someInstance;
    someInstance = [aSomeInstanceValue retain];
    [oldValue release];
}

This is obviously much more complicated than a simple assignment.

The biggest point that stands out is the weird retain/release dance occupying the last three lines of this code block. We do this to avoid problems when aSomeInstanceValue and someInstance point to the same object. Imagine the following:

- (void)setSomeInstance:(SomeClass *)aSomeInstanceValue
{
    [someInstance release]; // <-- original value is released
    someInstance = [aSomeInstanceValue retain];
}

If aSomeInstanceValue and someInstance are the same object, then the first line could release the underlying object before the second line retains it — meaning that the object is destroyed and invalid by the time it is retained again.

You'll notice that we use the C equality operator to compare the two objects instead of the isEqual: method. Normally, this is a bad idea (Objective-C objects may be "equal" even if they do not point to the same location in memory) but in this case, we are specifically interested in cases where the memory location (and hence the retain counts) are identical. The comparison is not strictly necessary to prevent premature release (since the retain precedes the release) but it's an optimization and safety measure.

In reality though, all this is a bit painful to write: the comparison, the annoying temporary value and the careful ordering of everything. In practice, we normally use a shorter version of the setter method that is basically just as good and is certainly far easier to write:

- (void)setSomeInstance:(SomeClass *)aSomeInstanceValue
{
    [someInstance autorelease];
    someInstance = [aSomeInstanceValue retain];
}

In this case, we don't care about the comparison optimisation and we use the autorelease pool to temporarily hold the original value instead of our own stack value.

This method has the same safety as the retain/release dance but requires an autorelease pool and is marginally slower if someInstance and aSomeInstanceValue are actually the same. In reality, there's almost always an autorelease pool in a Cocoa application and the performance difference is more theoretical than real (you'd have difficulty constructing a test program to ever show a difference) until you start creating copy setters (see below).

Not all Objective-C setters should retain: There are some situations where a setter method that takes an Objective-C object should not retain its parameter. Specifically: objects in a hierarchy should not normally retain their parents. Look at Rules to avoid retain cycles for more on this topic.

The other half of the retain access pattern

In reality, even if you forget to perform the if (newInstance == oldInstance) return; check in your accessor method, you are highly unlikely to see a problem even if you actually do pass the current value to a setter method.

The reason why you are normally safe, even if you make a mistake, is because the parameter you pass into a method normally has another retain count held elsewhere.

Specifically: most objects you access on the stack have an autoreleased retain count held by an autorelease pool on the stack or a longer lived object somewhere in the heap.

Theoretically, you could help this idea along by implementing your getter methods like this for retained objects:

- (SomeClass *)someInstance
{
    return [[someInstance retain] autorelease];
}

This will ensure that if your object is used like this:

SomeClass *stackInstance = [anObject someInstance];
[anObject release]; // anObject releases its someInstance ivar in its dealloc method
[stackInstance doSomething];

Then the stackInstance variable will still be valid when doSomething is invoked on it.

Generally though, I don't bother doing this in non-atomic getter methods.

Why not? Speed, redundancy and common sense.

While retaining and autoreleasing something is not a gigantic overhead, it is normally just not needed for a get accessor method. The programmer using the get accessor should understand that the lifetime is bound to the source object.

However, you may want to use this pattern in situations where it is not immediately clear that the method returning the value is a getter. For example:

@implementation ThisClassIsReallyJustAString

- (NSString *)description
{
    return [[internalValue retain] autorelease];
}

@end

The description method (which returns an NSString representation of an object) does not normally return an instance variable directly so it would not ordinarily be considered an accessor method. In this situation, it is implemented as an accessor but due to the expectation that the description is a generated value, we need to retain and autorelease the returned value.

Any instance variable returned from a method that is not obviously an accessor should always go through a retain/autorelease, since it is not immediately obvious that its lifetime is bound to the lifetime of the source object.

Copy accessors

The reason why retain accessors exist is obvious — you don't want the value to be deallocated while it is set on the object.

The reason for the final accessor pattern in Objective-C — copy accessors — is less broadly understood but nonetheless relevant.

You should use a copy accessor when the setter parameter may be mutable but you can't have the internal state of a property changing without warning. Consider the following example:

NSMutableString *mutableString = [NSMutableString stringWithString:@"initial value"];
[someObject setStringValue:mutableString];
[mutableString setString:@"different value"];

In this situation, if the setStringValue: method followed the retain pattern, then the setString: method invoked on the next line would change the internal value of this property without warning someObject.

Sometimes, you don't care if the internal state of a property changes without warning. However, if you do care about the internal state of a property changing, then you'll want to use a copy setter.

The implementation of a non-atomic copy setter is as simple as you'd imagine:

- (void)setStringValue:(NSString *)aString
{
    if (stringValue == aString)
    {
        return;
    }
    NSString *oldValue = stringValue;
    stringValue = [aString copy];
    [oldValue release];
}
Correction: I had previously stated that you don't need the equality comparison for a copy (since I wrongly claimed the copy would ensure you always have a different block of memory). I was wrong, of course. As was immediately pointed out in the comments: copy does not always return a copy; for immutable objects, copy can return the same object. I forgot about this, even though I've certainly written copy setters that work this way (see the SynthesizeSingleton code). The result is that we still need the comparison to ensure that setting the property to the same value doesn't cause the same potential release problems that the previous retain pattern had. Did I mention it is possible to screw up property accessors?

This finally explains why NSDictionary copies its keys.

NSDictionary stores all its value objects in locations according to the result of each corresponding key object's hash method.

If a key object were to change in such a way that the hash method result changed, then the NSDictionary would not be able to find the corresponding value object any more.

Key objects are not set by a property accessor on an NSDictionary but the effect is the same: the copy pattern is used to ensure that outside objects cannot change the internal state of the dictionary without warning.

Of course, the downside to properties that follow the copy pattern is speed. It's not normally an issue for objects up to a few kilobytes but beyond that point, you'll need to consider whether copy is the right pattern for the resources involved.

Conclusion

Getter and setter methods are frequently given as the simplest kind of method to implement. But you can see that even within this very simple kind of method, there are subtle ways that they can go wrong.

Overall, this post is at a much simpler level than my typical posts but that doesn't mean it's only for novice Objective-C developers. Experienced programmers are far from immune to mistakes in this area. Some of the potential quirks with getter and setter methods are so rare you may never see a problem (even if your code is doing things in an unsafe manner) so learning from experience can take time in this area.

Of course, if you don't need any customization in your accessor methods, you should simply use a synthesized @property implementation. This will avoid any possibility of introducing mistakes.

The design of every Mac application

I was recently asked by a reader if I used any modelling program to model the classes and relationships in my Mac applications. The answer is no, I don't model the application side of my programs. The reason for this is not because applications are always small and simple. The reason is that all applications have approximately the same design — eventually everything in a well-designed application becomes intuitive.

Ontology

It would be a mis-statement to claim that all applications have the same design but it is accurate (although more confusing) to state that all well-designed applications share a very high degree of ontological similarity.

Ontology is a pretty obscure word so it might be a good idea to explain it here. I'm going to use an analogy...

When you have a lot of items, it's helpful to classify them into a structure to understand how they relate to each other. For example: all living species on Earth are classified into Domain, Kingdom, Phylum, Class, Order, Family, Genus and Species. We often just talk about animal or plant "species" but the classification system has many more levels (in fact, there are many intermediate levels between the ones I've listed). The classification of living things is an example of a "taxonomy" (classifying things into a single hierarchy).

Similarly, every class in a well-designed application can be classified. But the classification of an application isn't a simple tree structure — there are many different connections. And the connections aren't simply of heredity — there are subclasses, view hierarchies, event hierarchies, control hierarchies and more.

When each classified element may have more than one position in the tree, and when different connections may have different semantic meanings and the overall organization isn't a simple tree but it's more of a big ball of wibbly-wobbly, interconnected stuff — that's an ontology.

The ontology of an application

Due to the interconnected, non-simple structure of ontologies, it's difficult to give a simple diagram that fully models one. Instead, I'm just going to list the broad categories of classes that you find in an application and simply discuss how they actually interconnect.

[Image: ApplicationOntology.png]

The basic categories of class in an application ontology.

This diagram describes pretty much every class type you're likely to see in an application. In a well designed project, these are the names of the Groups in the class tree.

Where did I get this list? I've really just transcribed the Groups from some of my larger projects. If you're interested, I consider a large project to be approximately a couple hundred classes. The largest I've ever let a project get is just over 400 classes. Normally, as a project gets near that size, I break it apart into multiple sub-projects. Generally, in a break apart like this, the "application" stays where it is and the "document" or "job handler" components get spun out into their own projects.

Breaking a huge project into pieces offers lots of advantages: it makes building faster, forces better abstraction and interfaces and makes testing easier (because the interface between components is a great place for system tests). The drawback is that the build process and version management can get a little trickier but once the project becomes that big, you normally have a good process in place to handle that complexity.

Applications on other platforms: While applications on different operating systems share much in common, the description here applies most accurately to Cocoa Mac applications. The needs of interacting with different application frameworks and delivering different features to users on different platforms ultimately affect the ontological structure of the application's classes.

Application initiation, top level control, configuration

In a Cocoa program, this includes the main.m file and your App Delegate. In a Cocoa Mac document-based application, this also includes the NSDocumentController.

These classes perform startup and top-level handling and shouldn't do anything else.

A good indication of a hastily or poorly written application is when other functionality gets implemented in one of these classes. Since these classes are almost always present and are accessible at the top level, it's easy to be lazy and shove functionality into them because the functionality doesn't seem to fit elsewhere.

View and event hierarchies for primary workflow

This includes your primary NSDocument subclass, your NSWindowControllers, NSViewControllers, UIViewControllers and the primary (used for displaying the main document) NSViews or UIViews.

In a simple application that does more browsing than processing, this will be the bulk of the project. Fortunately, the view hierarchy is typically highly structured. Controllers at the top and nested views all the way down.

New communication between view elements should go through the top controller of any hierarchy or should go through connections established by a higher level of the hierarchy. Views talk to corresponding document objects but the correspondence should be entirely established by the controllers.

The view and event hierarchies are generally similar; the few differences there are likely to be will rarely cause a problem.

View and event hierarchies for secondary workflows

An example of a secondary workflow would be the downloading interface in a web browser — a fully contained user-interface that runs in a different view or window. It also includes complicated popup/modal or other interfaces like the "Get Info" interface in the Mac OS X Finder or the "Add to Playlist" functionality in the iPod app on the iPhone.

These tend to run like separate little programs within the greater application and you should write them as such. While they may share some views and data with the rest of the program, they generally use their own controllers to manage the workflow and as such should not simply be lumped in with the classes from the main workflow.

Data processing and handling

This is your "document" (or "model" in MVC parlance). In a basic application, the document may be vanishingly small (a single array of objects or even a single object). For the largest applications though, the document is generally the majority of the program.

Generally, the document includes an interface at the top level that connects it to the structure of the rest of the program — since the elements within the document should be as ignorant of the structure of the rest of the program as possible. This connective interface is sometimes rolled into the NSDocument class on the Mac (which is also the controller of the document's view hierarchy). That's not too bad (they're both controllers) but in larger programs, the connecting controller for the document's view and the document's model may end up being different classes.

Below the interface there is normally some form of management or coordination to handle caching, saving, memory and other issues related to the runtime state but not necessarily the persistent state of the document. This is the role that the NSManagedObjectContext performs for a document implemented using Core Data.

Since the document is the part of a program that is not inherently designed following "application" rules, it does not necessarily follow an application ontology. For this reason, you may need to keep diagrams to understand the structure of a highly complex document model. Fortunately, if you're using Core Data or another persistence framework for storing your model, you probably already have a diagram of your model.

Activity/job handling and execution

This is a category that unfortunately gets left out of many MVC discussions.

A functional pipeline fulfils a similar role to the document but instead of exposing data storage and manipulation to the user, it exposes actions and behaviors.

An example would be a web server. If you push the "Start Server" button in your user interface, that button is not connected to a document but to a top level controller that starts the server. That top-level controller is the interface to a functional pipeline.

In many respects, a functional pipeline should be treated like a document — connect it to views in the same way (through controllers that link values to display) and manage it in the same way (through a top level interface that exposes functionality to the rest of the program).

Preferences, styles and filesystem integration

This layer is largely filled with singletons; independent components that manage a small set of data that can be accessed at will from anywhere in the program.

These types of classes tend to drive Dependency Injection enthusiast Unit Testers crazy because they follow a state-machine rather than a command-pattern (state machines are very hard to unit test).

Settings management is an essential component of larger programs. Basic storage provided by NSUserDefaults becomes insufficient once different parts of the program need to coordinate access to the same settings.

A styles controller is really just a way of accessing the same fonts, colors and other elements that make your program look consistent.

Custom views and control elements

This level is the storage bin for controls and views that are not tied to a specific workflow but (like the styles controller) are reused throughout the program to give a consistent aesthetic and set of behaviors to the user.

Single purpose views generally live along with the workflow that uses them. For this reason, as a program grows, you may continuously generalize your single use views and move them out of a workflow into this more general section.

This level of the application ontology also tends to be the first so far that contains concrete classes that may be brought in from outside the program. While your view controllers or document may include reusable abstract classes, the concrete implementations are normally specific to your program. Custom controls however are easily used in concrete form across a range of different applications.

Programming and testing facilities and integration

This is the cluster of classes that the user should never see (with the possible exception of Error dialogs). Many of the debugging and testing classes simply don't appear in release builds.

Why would kludges, hacks and bug work-arounds appear here instead of closer to the classes they affect? Because they are parts of your code that the developer needs to keep a close eye on and should be able to disable easily in the event that they no longer apply. For these reasons, you should keep them from directly polluting your real code.

Drawing, processing, extensions and patches

These are the type of class that I've shared many times in this blog: handy tricks for accessing or achieving something with little or no dependency on the rest of your application.

While this occupies just a single level of the diagram, in a medium sized program, these classes could be more than half the total classes in the program — little tools, tricks, reusable code and behaviors you like to carry from program to program.

Fortunately, the large number of these classes rarely makes the program more complex — they have no significant connections to the rest of the program. The only difficulty tends to be cleaning them out when they're no longer used.

Scalability

In many situations, the size of a program increases complexity and makes it harder to understand.

For the "application" side of a program (not including the "document"), greater size actually tends to force the program to behave more regularly and fit the categories I've discussed better.

In smaller applications, classes tend to be more multi-functional, spanning many of these categories. As an application grows, classes tend to become more single-functional so they fit the standard categories better.

The "document" — and to a lesser extent, functional workflows — tend to be the exception to this scalability rule, since they don't adhere to a common ontology. You will probably want to model and document your document.

Conclusion

Applications are a relatively narrow domain within programming overall — they need to perform specific functions and tend to perform those functions in similar ways.

This similarity leads them to be structured similarly, to interconnect in similar ways; to share a common ontology.

While the pattern may not be obvious when looking at a large program for the first time, or if you're continuously looking at hastily-constructed, small-to-medium sized projects, applications are structurally simple.

I've drawn a distinction in this post between applications and other types of programs — even between the application part of a program and the non-application parts of a program. Obviously, non-application parts of an application will likely not follow the ontology of an application.

While a badly designed, badly organized project can escape many of the traits you'd normally expect from an application, the mechanics of hooking into a framework like Cocoa and the steps required to implement application-like features still dictate a structure that is hard to escape.

Sorting an NSMutableArray with a random comparison method

If you sorted an NSMutableArray with a comparison method that randomly returns either higher or lower, would the result be an even, random distribution? Spoiler: no it won't but the actual distribution is interesting nonetheless. I'll show you what would happen if you did sort this way (and also show you how to correctly randomize an array if you did want an even distribution).

Sorting an array

You can sort an NSMutableArray by any logic you want using a comparison method (implemented on the contained objects) and invoking sortUsingSelector: on the array.

What would happen if you sorted an array of numbers using the following category method which returned a random result?

@implementation NSNumber (RandomComparison)

- (NSComparisonResult)randomCompare:(id)other
{
    // randomly return either ascending or descending
    return (random() % 2) ? NSOrderedAscending : NSOrderedDescending;
}

@end

Naively, you might think that the result would be an even, random distribution.

Distribution from a random sort

For an array 20 elements long, sorted using a random comparison, the following chart shows the percentage of runs (out of 1 million) in which the original 1st element ended up at each position:

[Image: firstelement.png]

The percentage of times that the original 1st element ended in each position of the final array.

In a proper random distribution, you'd expect the first element to be equally likely to end up in any position. In fact a correct result here should be 5.0% +/- 0.1% for every column.

The actual result reflects traits of NSMutableArray's internal sorting (which I'm presuming uses a sort similar to the CFQSortArray function from CoreFoundation).

The biggest feature of the distribution is a function of how quicksort works. For ranges between 7 and 40 elements (which includes this 20-element array), NSMutableArray's sort chooses a single pivot, which is most likely to be in the middle of the array. Elements on either half are sorted with respect to each other before being sorted with the whole.

This pivot selection causes the huge difference between the distributions in the top and bottom halves: since elements are compared with other elements on their own side of the pivot more often, they are more likely to remain on that side of the pivot.

Quicksort subdivides the array but for subdivided blocks smaller than 7 elements, NSMutableArray's sort algorithm uses a bubble sort instead of quicksort. Elements 1 to 5 at the far left of the distribution show what happens in this bubble sort case: a bulge in the center of the 1 to 5 range, skewed in the direction of the element's original position (i.e. the bulge is lopsided towards 1).

Let's look at what happens to element 11:

[Image: tenthelement.png]

The percentage of times that the original 11th element ended in each position of the final array.

This is not a simple mirror of the previous results but it again follows the quicksort pattern: results from a random sort are most likely to stay in the partition where they began, with minor perturbations on the fringes due to the bubble sort.

The correct way to randomize in-place

If you're interested, the correct way to randomize in-place is:

@implementation NSMutableArray (Randomization)

- (void)randomize
{
    NSInteger count = [self count];
    for (NSInteger i = 0; i < count - 1; i++)
    {
        NSInteger swap = random() % (count - i) + i;
        [self exchangeObjectAtIndex:swap withObjectAtIndex:i];
    }
}

@end

A quick summary of what this does:

  1. conceptually split the array into "used" elements (in their final positions) and "unused" elements (initially, everything is "unused")
  2. randomly choose an element from the unused elements
  3. swap the chosen element with the first element in the unused range
  4. this first position in the unused range is now considered used (i.e. the element is in its final position)
  5. repeat steps 2 to 4 until all elements are in their final positions

This algorithm is the "Fisher–Yates shuffle" and dates to 1938 (which is an indication of how obvious it is).

If you're truly interested in a perfectly even, random distribution, you may want to use an approach that avoids modulo bias (when the range does not evenly divide the generator's range, lower results occur slightly more often). You can see one approach for avoiding this bias in this post by Brent Royal-Gordon on StackOverflow. And if you're really serious about an even distribution, you'll probably want to use a Mersenne Twister instead of the C random() function.

The reality is that the code I showed is faster than the other options and easily good enough unless you have a strong mathematical need for accuracy (in my tests on 1 million samples, it was evenly distributed to 2 significant figures in all cases).

Conclusion

A quick post this week. I was really just goofing around with comparison methods and thought the results were interesting. In retrospect, it's obvious that the results from a sort with a random comparison would be a function of how the sort is performed, but I was surprised that the two different kinds of sort (bubble and quick) both had a visible influence on the results.

Of course, there are proper ways to randomize arrays and they're not very difficult — unless you're really fussy about the quality of your random numbers.

Avoiding deadlocks and latency in libdispatch

The system-wide thread pool of libdispatch's global queue is an easy way to efficiently manage concurrent operations but it is not the solution to all threading problems and it is not without its own class of problems. In this post I look at deadlocks and latency problems that are inherent in thread-pool based solutions like libdispatch's global concurrent queue so that you will know when you should use this option and when you need something else.

Introduction: libdispatch's global queue is a constrained resource

In Mac OS X 10.6 (Snow Leopard), Apple introduced Grand Central Dispatch (GCD) for managing a system-wide thread pool. The normal API for GCD is libdispatch (despite the name, this API is part of libSystem).

The simplest way to interact with libdispatch is to execute blocks on the global queue:

dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_async(queue, ^{ /* perform some work */ });

For discrete packets of work that are not order dependent, this is a great solution. In this post, however, I'm going to look at situations where this is a poor choice.

The key limitation of this approach is that the global concurrent thread pool is a constrained resource — there are only as many active threads in this pool as you have CPUs on your computer.

Using this constrained resource indiscriminately can lead to the following problems once the limits of the resource are reached:

  • higher latency than other solutions
  • deadlocks for interdependent jobs in the queue

Latency problems

The following code simulates a tiny web server or similar program that needs to serve a number of clients. In this case, it has 20 simultaneous requests from different clients.

In reality, this program just writes a small string to /dev/null 100,000 times for every client but that's sufficient to simulate any similar network or other non-CPU based operation.

int main(int argc, const char * argv[])
{
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();
    
    FILE *devNull = fopen("/dev/null", "a");

    const int NumConcurrentBlocks = 20;
    const int NumLoopsPerBlock = 100000;

    for (int i = 0; i < NumConcurrentBlocks; i++)
    {
        dispatch_group_async(group, queue, ^{
            NSLog(@"Started block %d", i);
            for (int j = 0; j < NumLoopsPerBlock; j++)
            {
                fprintf(devNull, "Write to /dev/null\n");
            }
            NSLog(@"Finished block %d", i);
        });
    }

    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    dispatch_release(group);
    fclose(devNull);
    
    return 0;
}

This scenario illustrates the latency problems inherent in using the libdispatch global queue: on my computer, the first 4 blocks start immediately and the first of these finishes 1.5 seconds later, at which point the next blocks are started. The final block does not begin until 7.5 seconds after it is queued, finishing at the 9 second mark.

While the global queue in libdispatch is called "concurrent", it is only concurrent up to a fixed width. Once all of the concurrent slots are full, the global queue behaves serially: each new block must wait for a slot to free up, and in this case that waiting is what increases the latency.

If this were a heavily loaded web server, instead of simply slowing down evenly for all users, the first few users would get a response and the last few would simply time out.

The solution to this type of problem is to create a specific queue for every block. Instead of pushing everything into the global queue (which is capped at 4 blocks on my test computer), we will give each block its own queue so that all of them can run simultaneously.

int main(int argc, const char * argv[])
{
    dispatch_group_t group = dispatch_group_create();
    FILE *devNull = fopen("/dev/null", "a");

    const int NumConcurrentBlocks = 20;
    dispatch_queue_t *queues = malloc(sizeof(dispatch_queue_t) * NumConcurrentBlocks);
    for (int q = 0; q < NumConcurrentBlocks; q++)
    {
        char label[20];
        sprintf(label, "Queue%d", q);
        queues[q] = dispatch_queue_create(label, NULL);
    }

    const int NumLoopsPerBlock = 100000;
    for (int i = 0; i < NumConcurrentBlocks; i++)
    {
        dispatch_group_async(group, queues[i], ^{
            NSLog(@"Started block %d", i);
            for (int j = 0; j < NumLoopsPerBlock; j++)
            {
                fprintf(devNull, "Write to /dev/null\n");
            }
            NSLog(@"Finished block %d", i);
        });
    }

    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    dispatch_release(group);
    for (int q = 0; q < NumConcurrentBlocks; q++)
    {
        dispatch_release(queues[q]);
    }
    free(queues);
    fclose(devNull);
    
    return 0;
}

The result is that all 20 blocks are started simultaneously. All run at approximately the same speed and all finish at approximately the same time.

Alternative solution: as suggested by Keith in the comments, since these operations are I/O bound, not CPU bound, a better solution would be to use a file write source in the queue instead of standard operation queue blocks. File write sources are removed from the queue when they are blocked on I/O and this would allow all 20 sources to operate equitably in the global concurrent queue (or any other single queue).

Deadlocking

Deadlocking occurs when a first block stops and waits for a second to complete but the second can't proceed because it needs a resource that the first is holding.

In the following program, 20 parent blocks are queued (which will more than fill the constrained resource of the global concurrent queue). Each of these blocks spawns a child block which is queued in the same global concurrent queue. The parent runs a busy wait loop until the child adds its own integer to the completedSubblocks set.

NSMutableSet *completedSubblocks;
NSLock *subblocksLock;

int main (int argc, const char * argv[])
{
    completedSubblocks = [[NSMutableSet alloc] init];
    subblocksLock = [[NSLock alloc] init];

    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();
        
    const int NumConcurrentBlocks = 20;
    for (int i = 0; i < NumConcurrentBlocks; i++)
    {
        dispatch_group_async(group, queue, ^{
            NSLog(@"Starting parent block %d", i);
            
            NSDate *endDate = [NSDate dateWithTimeIntervalSinceNow:1.0];
            while ([[NSDate date] compare:endDate] == NSOrderedAscending)
            {
                // Busy wait for 1 second to let the queue fill
            }
            
            dispatch_async(queue, ^{
                NSLog(@"Starting child block %d", i);

                [subblocksLock lock];
                [completedSubblocks addObject:[NSNumber numberWithInt:i]];
                [subblocksLock unlock];
                
                NSLog(@"Finished child block %d", i);
            });

            BOOL complete = NO;
            while (!complete)
            {
                [subblocksLock lock];
                if ([completedSubblocks containsObject:[NSNumber numberWithInt:i]])
                {
                    complete = YES;
                }
                [subblocksLock unlock];
            }

            NSLog(@"Finished parent block %d", i);
        });
    }
    
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    dispatch_release(group);
    
    [completedSubblocks release];
    [subblocksLock release];

    return 0;
}

On my computer, the first 8 blocks are started (filling all of the global queue's concurrent slots). These parent blocks then occupy the queue so that the child blocks can never run. Since each parent is waiting for its child and the children can never run, the result is a deadlock: the program never completes.

You'll notice that I've used a busy wait loop both for the 1 second delay and to detect when the child has completed. Normally, you would dispatch the child using dispatch_sync (a different way to wait for the child to complete). The reality is that libdispatch is smart enough to take a block off its queue's thread while that block is waiting via one of the libdispatch wait mechanisms (or one of the many Cocoa or OS functions that similarly yield CPU time, like sleep).

While using the correct wait functions would fix this trivial example, it will not fix situations where the blocks are genuinely busy.

The best solution is to avoid dependencies between blocks in the same queue.

In a situation like this, where one of the blocks (the child) consumes a trivial amount of time and the other is time consuming (the parent always takes at least 1 second), you can simply enqueue the trivial block in a separate serial queue. That will prevent deadlocking in all cases.

In a situation where both parent and child are time consuming, you could try:

  • create a separate queue for every invocation of either the parent or the child, and queue the other in the global queue
  • roll the functionality of the child into the parent so that they are always done as a single block
  • carefully write your code so that every wait on a dependency goes through a libdispatch wait mechanism (allowing the waiting block to yield its thread)

Conclusion

While libdispatch does an excellent job of making concurrent, multithreaded programming easier, it does not magically eliminate all of the pitfalls of threaded programming.

The problems I've looked at in this post are mostly a result of the fact that queues are shared resources and there are situations where you must be careful about how resources are shared.

In that sense, these problems are not directly related to libdispatch (although I've used libdispatch to implement the scenarios): they can be caused by misusing any system where a fixed number of workers pulls jobs out of a queue.

Grand Central Dispatch's global queue is intended for CPU-intensive operations. That it limits concurrency to the number of CPUs in your computer is good for CPU-intensive operations, since it prevents wasting CPU time on task swapping. The examples here demonstrate the downside: the limit may increase latency for queued blocks, and you need to be careful about dependencies between blocks on the same queue.