Advanced programming tips, tricks and hacks for Mac development in C/Objective-C and Cocoa.

Handling unhandled exceptions and signals

When an application crashes on the iPhone, it disappears without telling the user what happened. However, it is possible to add exception and signal handling to your applications so that an error message can be displayed to the user or you can save changes. It is even possible to try to recover from this situation without crashing at all.

Debug-only hack warning

Warning: The code in this post is for quick and dirty information gathering after a crash bug. The purpose is not to make your shipping code crash-proof; it is to allow better debugging of your application when a crash or exception occurs during user testing.

The code in this post performs signal handling in a non re-entrant way. This is not a reliable thing to do and is only done because proper re-entrant coding is brutally difficult and the assumption is that your program has already fatally crashed, so we're not too worried. If multiple signals are caught, this code probably won't help at all.

Introduction

This post will present a sample application that deliberately raises Objective-C exceptions, EXC_BAD_ACCESS exceptions and related BSD signals. All exceptions and signals are caught, presenting debug information and allowing the application to continue after these events.

uncaughtexception.png
You can download the sample project: UncaughtExceptions.zip (25kB)

This application will deliberately trigger an unhandled message exception after 4 seconds and then will deliberately trigger an EXC_BAD_ACCESS/SIGBUS signal at the 10 second mark.

Why do applications crash on the iPhone?

A crash (or more accurately: an unexpected termination) is the result of an unhandled signal sent to your application.

An unhandled signal can come from three places: the kernel, other processes or the application itself. The two most common signals that cause crashes are:

  • EXC_BAD_ACCESS is a Mach exception sent by the kernel to your application when you try to access memory that is not mapped for your application. If not handled at the Mach level, it will be translated into a SIGBUS or SIGSEGV BSD signal.
  • SIGABRT is a BSD signal sent by an application to itself when an NSException or objc_exception_throw is not caught.

The most common reason why unexpected Objective-C exceptions are thrown is sending an unimplemented selector to an object (due to a typo, an object mixup or sending to an already released object that's been replaced by something else).

Mac application note: NSApplication on the Mac always catches all Objective-C exceptions in the main run loop — so an exception on the main thread of a Mac application will not immediately crash the program, it will simply log the error. However, an unexpected exception can still leave the application in such a bad state that a crash will subsequently occur.

Catching uncaught exceptions

The correct way to handle an uncaught exception is to fix the cause in your code. If your program is working perfectly, then the approaches shown here should not be necessary.

Of course, programs do sometimes get released with bugs that may lead to a crash. In addition, you may simply want more information back from your testers when you know that there are bugs in your program.

In these cases, there are two ways to catch otherwise uncaught conditions that will lead to a crash:

  • Use the function NSSetUncaughtExceptionHandler to install a handler for uncaught Objective-C exceptions.
  • Use the signal function to install handlers for BSD signals.

For example, installing an Objective-C exception handler and handlers for common signals might look like this:

void InstallUncaughtExceptionHandler(void)
{
    NSSetUncaughtExceptionHandler(&HandleException);
    signal(SIGABRT, SignalHandler);
    signal(SIGILL, SignalHandler);
    signal(SIGSEGV, SignalHandler);
    signal(SIGFPE, SignalHandler);
    signal(SIGBUS, SignalHandler);
    signal(SIGPIPE, SignalHandler);
}

Responding to the exceptions and signals can then happen in the implementation of the HandleException and SignalHandler. In the sample application, these both call through to the same internal implementation so that the same work can be done in either case.

Save your data: The very first task to perform in your uncaught exception handler should be to save data that might need saving or otherwise clean up your application. However: if the exception may have left the data in an invalid state, you may need to save to a separate location (like a "Recovered Documents" folder) so you don't overwrite good data with potentially corrupt data.

While these cover the most common signals, there are many more signals that may be sent that you can add if required.

There are two signals which cannot be caught: SIGKILL and SIGSTOP. These are sent to your application to end it or suspend it without notice (a SIGKILL is what is sent by the command-line command kill -9 if you're familiar with that; typing Control-Z in a terminal sends the related SIGTSTP, which suspends a process in the same way as SIGSTOP but can be caught).

Requirements of the exception handler

An unhandled exception handler may never return

The types of situations which would cause an unhandled exception or signal handler to be invoked are the types of situations that are generally considered unrecoverable in an application.

However, sometimes it is simply the stack frame or current function which is unrecoverable. If you can prevent the current stack frame from continuing, then sometimes the rest of the program can continue.

If you wish to attempt this, then your unhandled exception handler must never return control to the calling function — the code which raised the exception or triggered the signal should not be used again.

In order to continue the program without ever returning control to the calling function, we must return to the main thread (if we are not already there) and permanently block the old thread. On the main thread, we must start our own run loop and never return to the original run loop.

This will mean that the stack memory used by the thread that caused the exception will be permanently leaked. This is the price of this approach.

Attempt to recover

Since a run loop will be used to display the dialog, we can keep that run loop running indefinitely and it can serve as a possible replacement for the application's main run loop.

For this to work, the run loop must handle all the modes of the main run loop. Since the main run loop includes a few private modes (for GSEvent handling and scroll tracking), the default NSDefaultRunLoopMode is insufficient.

Fortunately, if the UIApplication has already created all the modes for the main loop, then we can get all of these modes by reading from the loop. Assuming it is run on the main thread after the main loop is created, the following code will run the loop in all UIApplication modes:

CFRunLoopRef runLoop = CFRunLoopGetCurrent();
CFArrayRef allModes = CFRunLoopCopyAllModes(runLoop);

while (!dismissed)
{
    for (NSString *mode in (NSArray *)allModes)
    {
        CFRunLoopRunInMode((CFStringRef)mode, 0.001, false);
    }
}

CFRelease(allModes);

As part of the debug information, we want the stack addresses

You can get the backtrace using the function backtrace and attempt to convert this to symbols using backtrace_symbols.

+ (NSArray *)backtrace
{
    void *callstack[128];
    int frames = backtrace(callstack, 128);
    char **strs = backtrace_symbols(callstack, frames);
    if (strs == NULL)
    {
        // backtrace_symbols returns NULL if it cannot allocate the strings
        return [NSArray array];
    }
    
    // Never read past the frames that were actually captured
    int end = UncaughtExceptionHandlerSkipAddressCount +
        UncaughtExceptionHandlerReportAddressCount;
    if (end > frames)
    {
        end = frames;
    }
    
    int i;
    NSMutableArray *backtrace = [NSMutableArray arrayWithCapacity:frames];
    for (i = UncaughtExceptionHandlerSkipAddressCount; i < end; i++)
    {
        [backtrace addObject:[NSString stringWithUTF8String:strs[i]]];
    }
    free(strs);
    
    return backtrace;
}

Notice that we skip the first few addresses: this is because they will be the addresses of the signal or exception handling functions (not very interesting). Since we want to keep the data minimal (for display in a UIAlertView dialog), I chose not to display the exception handling functions.

If the user selects "Quit" we want the crash to be logged

If the user selects "Quit" to abort the application instead of attempting to continue, it's a good idea to generate the crash log so that normal crash log handling can track the problem.

In this case, we need to remove all the exception handlers and re-raise the exception or resend the signal. This will cause the application to crash as normal (although the uncaught exception handler will appear at the top of the stack, lower frames will be the same).
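For the signal half of that step, a minimal sketch in plain C might look like the following. The helper names are hypothetical (they are not taken from the sample project), and the signal list simply mirrors the installation example earlier in this post. The Objective-C half would be a call to NSSetUncaughtExceptionHandler(NULL) followed by re-raising the NSException.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* The same signals that the installation example registers handlers for. */
static const int kHandledSignals[] = {
    SIGABRT, SIGILL, SIGSEGV, SIGFPE, SIGBUS, SIGPIPE
};

/* Hypothetical helper: put every handled signal back to its default action. */
void RestoreDefaultSignalHandlers(void)
{
    size_t count = sizeof(kHandledSignals) / sizeof(kHandledSignals[0]);
    for (size_t i = 0; i < count; i++)
    {
        signal(kHandledSignals[i], SIG_DFL);
    }
}

/* Hypothetical helper: resend `sig` so the default action ends the process. */
void QuitByResendingSignal(int sig)
{
    assert(sig > 0);
    RestoreDefaultSignalHandlers();
    raise(sig);
}
```

One caveat with this sketch: while a handler for a signal is still running, that same signal is normally blocked, so a resent signal may only be delivered once the handler returns.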

Limitations

The signal handler is not re-entrant

Remember from the paragraph at the beginning:

The code in this post performs signal handling in a non re-entrant way. This is not a reliable thing to do and is only done because proper re-entrant coding is brutally difficult and the assumption is that your program has already fatally crashed, so we're not too worried. If multiple signals are caught, this code probably won't help at all.

If you want to learn how to write signal handlers for non-crash related signals or learn how to write proper re-entrant signal handling, I'm afraid you'll need to look elsewhere — there's not enough space here for me to show you and it's really hard. Ignoring this constraint here is okay for debug code only where we assume we're only going to get 1 signal.

This approach won't work if the application hasn't configured the main run loop

The exact way that UIApplication constructs windows and the main run loop is private. This means that if the main run loop and initial windows are not already constructed, the exception code I've given won't work: the code will run but the UIAlertView dialog will never appear. For this reason, I install the exception handlers with a performSelector:withObject:afterDelay:0 from the applicationDidFinishLaunching: method on the App Delegate to ensure that this exception handler is only installed after the main run loop is fully configured. Any exception that occurs prior to this point on startup will crash the application as normal.

Your application may be left in an unstable or invalid state

You cannot simply continue from all situations that trigger exceptions. If you're in the middle of a situation that must be completed in its entirety (a transaction on your document) then your application's document may now be invalid.

Alternately, the conditions which led to the exception or signal may have left your stack or heap in a state so corrupted that nothing is possible. In this type of situation, you're going to crash and there's little you can do.

The exception or signal could just happen again

The initial causes of the exception or signal will not be fixed by ignoring it. The application might simply raise the same exception immediately. In fact, you could become overwhelmed by exceptions in some cases — for this reason, I've limited the number of uncaught exceptions that may be handled to 10 in the sample application.
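A limit like this can be sketched with a single atomic counter. This is an illustration using C11 atomics, with hypothetical names and not necessarily how the sample application implements its limit:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed limit, matching the figure of 10 mentioned above. */
enum { kMaximumHandledEvents = 10 };

/* Hypothetical counter of exceptions and signals handled so far. */
static atomic_int gHandledEventCount;

/* Returns true for the first 10 calls and false for every call after that. */
bool ShouldHandleCrashEvent(void)
{
    int previous = atomic_fetch_add(&gHandledEventCount, 1);
    assert(previous >= 0);
    return previous < kMaximumHandledEvents;
}
```

A handler would call ShouldHandleCrashEvent() first and, once it returns false, fall through to the normal crash behavior instead of attempting another recovery.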

Resources used up to the time of the exception are leaked

Since the stack is blocked from returning, everything allocated on the stack or the autorelease pool between the main run loop and the exception will be leaked.

It might be bad behavior for the user

Depending on the style of your application, it might be better to simply let the crash happen — not all users care about debug information and your application might not have data that needs saving, so a very rare crash might not be too offensive.

gdb will interfere with signal handling

When you're debugging, the SIGBUS and SIGSEGV signal handlers may not get called. This is because gdb inserts Mach exception handlers which pick them up at the EXC_BAD_ACCESS stage (and refuse to continue). Other signal types may also be handled by gdb, preventing the signals from reaching your handlers.

If you want to test signal handling properly, you'll need to run without gdb (Run with Breakpoints off).

Conclusion

You can download the sample project: UncaughtExceptions.zip (25kB)

It is possible to make your application continue running for a short period of time after a "crash" signal occurs by handling common exceptional signals and attempting to recover.

There are real risks though in terms of signal re-entrancy problems, leaked memory and potentially corrupted application data, so this type of approach should be viewed as either a debugging tool or a measure of last resort.

However, it is comforting to have a level of fallback in the situation where a hard to reproduce crash occurs during testing and you'd like more information on the application state when the crash happened.


5 ways to draw a 2D shape with a hole in CoreGraphics

In this post, I look at 5 different ways that you can draw a very simple shape: a square with a triangular hole cut out of the center. In a drawing environment like CoreGraphics which offers double buffering, winding count path filling, even-odd path filling and clipping regions, there's no single answer. An iPhone sample project is provided containing the code but all drawing functions are identical on the Mac.

Introduction

This post will look at different ways to draw the following shape:

shape.png

This is a very simple shape but it requires a non-simple topology: you must cut the center out of the shape to draw it. This post will look at 5 different ways that this can be done and the advantages or disadvantages with each.

To make the explanations easier, I've named each of the coordinates used to draw the shape:

// Coordinates are:
//
// A-------------B     A(0,0), B(100,0), C(100,100), D(0,100)
// |      E      |     E(50,10), F(10,90), G(90,90)
// |     / \     |     H(50,90), I(50,100)
// |    /   \    |
// |   /     \   |
// |  F---H---G  |
// D------I------C

Technique 1: overpainting

The most naive approach to drawing the shape is to draw the square (ABCD) in the shape's color and then draw the triangle (EFG) over the top in the color of the background.

To be clear though: this is an example of a technique that you should not use.

// Technique 1: overpaint
CGContextMoveToPoint(context, Ax, Ay);
CGContextAddLineToPoint(context, Bx, By);
CGContextAddLineToPoint(context, Cx, Cy);
CGContextAddLineToPoint(context, Dx, Dy);
CGContextAddLineToPoint(context, Ax, Ay);
CGContextSetRGBFillColor(context, 0.5, 0, 0, 1);
CGContextFillPath(context);
CGContextMoveToPoint(context, Ex, Ey);
CGContextAddLineToPoint(context, Fx, Fy);
CGContextAddLineToPoint(context, Gx, Gy);
CGContextAddLineToPoint(context, Ex, Ey);
CGContextSetRGBFillColor(context, 1, 1, 1, 1);
CGContextFillPath(context);

Advantages: if you're only familiar with the "painter's algorithm" (everything is just painted over the top of everything else) then this might be the easiest concept to understand.

Disadvantages: if your background changes, the effect won't work.

shape1problem.png

This approach also has the problem that the pixels within the triangle of the image are drawn twice in the offscreen buffer (once in the square's color and once in the background color). If you are drawing a very large number of shapes this way, this overdrawing will be slower than not drawing the center pixels in the first place.

Technique 2: false hole

Another attempt to cheat when drawing this shape would be to draw the shape as a single, simple polygon — i.e. cut the shape along the line segment between H and I and draw it like a horseshoe (ABCI then HGEFH then finish with IDA).

Again, this is an example of a technique that you should not use.

// Technique 2: false hole
CGContextMoveToPoint(context, Ax, Ay);
CGContextAddLineToPoint(context, Bx, By);
CGContextAddLineToPoint(context, Cx, Cy);
CGContextAddLineToPoint(context, Ix, Iy);
CGContextAddLineToPoint(context, Hx, Hy);
CGContextAddLineToPoint(context, Gx, Gy);
CGContextAddLineToPoint(context, Ex, Ey);
CGContextAddLineToPoint(context, Fx, Fy);
CGContextAddLineToPoint(context, Hx, Hy);
CGContextAddLineToPoint(context, Ix, Iy);
CGContextAddLineToPoint(context, Dx, Dy);
CGContextAddLineToPoint(context, Ax, Ay);
CGContextSetRGBFillColor(context, 0, 0.5, 0, 1);
CGContextFillPath(context);

Advantages: avoids the previous problem of the background being overdrawn in the wrong color.

Disadvantages: contains extra edges which won't draw correctly if you attempt to stroke the shape.

shape2problem.png

This approach can also suffer from precision problems: if the cut at the bottom does not actually overlap, it may be visible as a gap in the object when drawn at very high resolutions (for example on the Mac once resolution independence becomes user settable and objects can be drawn at unexpected sizes).

Technique 3: Winding count

This is the first of the correct approaches to drawing a hole in a path and uses the "winding count" polygon interior algorithm to label the inside of the triangle as "outside" the bounds of our path.

Winding count is the default way that CoreGraphics determines if a pixel is inside or outside a path. It works like this:

  1. CoreGraphics draws every horizontal row within the path's bounding rectangle from left-to-right
  2. At the start of each row, CoreGraphics sets the winding count for the shape to zero.
  3. If CoreGraphics crosses a line in the shape at any point during the row, it notes if the line was going upwards or downwards at the point where CoreGraphics crossed it.
  4. An upward line increases the winding count of the shape by 1.
  5. A downward line decreases the winding count of the shape by 1.
  6. If the winding count for the shape is ever non-zero (positive or negative) then pixels are filled according to the color of the shape.

If that's a little hard to follow then the simple description of winding count is:

Simple winding count: If a boundary is drawn clockwise, then a counter-clockwise boundary inside it will switch the shape off. If a boundary is drawn counter-clockwise, then a clockwise boundary inside it will switch it off.

This is how we use winding count to draw the shape:

// Technique 3: winding count fill rule
CGContextMoveToPoint(context, Ax, Ay);
CGContextAddLineToPoint(context, Bx, By);
CGContextAddLineToPoint(context, Cx, Cy);
CGContextAddLineToPoint(context, Dx, Dy);
CGContextAddLineToPoint(context, Ax, Ay);
CGContextClosePath(context);
CGContextMoveToPoint(context, Ex, Ey);
CGContextAddLineToPoint(context, Fx, Fy);
CGContextAddLineToPoint(context, Gx, Gy);
CGContextClosePath(context);
CGContextSetRGBFillColor(context, 0.5, 0.0, 0.75, 1);
CGContextFillPath(context);

The ABCD boundary is clockwise, so the counter-clockwise EFG creates a hole. To start the inner boundary, we just close the first boundary and move to the next (all boundaries become part of the current path).

Advantages: genuinely draws a shape with a hole cut out of it.

Disadvantages: accidentally draw the inner shape in the order EGF and it won't work (clockwise plus clockwise leads to a winding count of 2, which is still non-zero, so the shape will still be filled).

Winding counts require a little extra effort to ensure that directions are maintained at all times.
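To check the direction rule without drawing anything, the row-by-row description above can be sketched in plain C. This is only an illustration of the rule, not CoreGraphics' actual implementation: the type and function names are hypothetical, and the coordinates are the named points from the diagram (y increases downwards, so an edge with decreasing y is "upward").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { double x, y; } Point;

/* Named points from the diagram: square ABCD and the triangle in both orders. */
static const Point kSquareABCD[] = { {0, 0}, {100, 0}, {100, 100}, {0, 100} };
static const Point kTriangleEFG[] = { {50, 10}, {10, 90}, {90, 90} };
static const Point kTriangleEGF[] = { {50, 10}, {90, 90}, {10, 90} };

/* Signed count of boundary edges crossed on p's row before reaching p.x. */
int WindingCount(const Point *boundary, size_t count, Point p)
{
    int winding = 0;
    assert(count >= 3);
    for (size_t i = 0; i < count; i++)
    {
        Point a = boundary[i];
        Point b = boundary[(i + 1) % count]; /* wraps to close the boundary */
        if ((a.y <= p.y) == (b.y <= p.y))
        {
            continue; /* this edge does not cross the row */
        }
        double crossingX = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
        if (crossingX < p.x)
        {
            winding += (b.y < a.y) ? 1 : -1; /* upward adds, downward subtracts */
        }
    }
    return winding;
}

/* A pixel is filled when the winding counts of both boundaries sum to non-zero. */
bool IsFilled(const Point *outer, size_t outerCount,
              const Point *inner, size_t innerCount, Point p)
{
    return WindingCount(outer, outerCount, p) +
        WindingCount(inner, innerCount, p) != 0;
}
```

With the inner boundary in EFG order, the point (50, 50) inside the triangle has a total winding count of zero and is left unfilled. With EGF order the total is 2, so the "hole" is filled in.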

Technique 4: Even-odd paths

Even-odd is the other rule used for paths in CoreGraphics. The rule is a little simpler to explain than winding count: in even-odd, the outermost boundary begins the object, the next outermost turns it off again, and so on for other nested paths.

The code is very similar to the winding count version except that we fill using CGContextEOFillPath and the direction of EFG with respect to ABCD does not matter (EFG and EGF both produce the hole).

// Technique 4: even-odd fill rule
CGContextMoveToPoint(context, Ax, Ay);
CGContextAddLineToPoint(context, Bx, By);
CGContextAddLineToPoint(context, Cx, Cy);
CGContextAddLineToPoint(context, Dx, Dy);
CGContextAddLineToPoint(context, Ax, Ay);
CGContextClosePath(context);
CGContextMoveToPoint(context, Ex, Ey);
CGContextAddLineToPoint(context, Fx, Fy);
CGContextAddLineToPoint(context, Gx, Gy);
CGContextClosePath(context);
CGContextSetRGBFillColor(context, 0.75, 0.5, 0, 1);
CGContextEOFillPath(context);

Advantages: less prone to ordering issues than winding count.

Disadvantages: there are some situations where winding count may give a better result. Consider the following pentagram drawn in a single continuous path 12345.

eoversuswinding.png

In this case, if you actually wanted to fill the center of the shape, you'd need to use winding count (the shape is drawn in a continuous clockwise direction so the winding count is always positive).
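The difference between the two rules on the pentagram can be checked numerically. The sketch below is an illustration only: the names are hypothetical and the five vertices are approximate coordinates that I've assumed for a pentagram centered on (0, 0), not values taken from the image. It counts the crossings on one row both ways: as a signed winding count and as a plain crossing count for even-odd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { double x, y; } Point;
typedef struct { int winding; int crossings; } RowCounts;

/* Assumed pentagram vertices in path order 1-2-3-4-5 (y increases downwards). */
static const Point kPentagram[] = {
    {0, -100}, {59, 81}, {-95, -31}, {95, -31}, {-59, 81}
};

/* Counts the path edges crossed on p's row before reaching p.x. */
RowCounts CountCrossings(const Point *path, size_t count, Point p)
{
    RowCounts result = { 0, 0 };
    assert(count >= 3);
    for (size_t i = 0; i < count; i++)
    {
        Point a = path[i];
        Point b = path[(i + 1) % count]; /* wraps to close the path */
        if ((a.y <= p.y) == (b.y <= p.y))
        {
            continue; /* this edge does not cross the row */
        }
        double crossingX = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
        if (crossingX < p.x)
        {
            result.winding += (b.y < a.y) ? 1 : -1;
            result.crossings += 1;
        }
    }
    return result;
}

/* Winding count rule: filled when the signed count is non-zero. */
bool FilledByWindingCount(RowCounts counts) { return counts.winding != 0; }

/* Even-odd rule: filled when an odd number of edges has been crossed. */
bool FilledByEvenOdd(RowCounts counts) { return counts.crossings % 2 != 0; }
```

At the center of the star, two edges have been crossed in the same direction: the winding count is 2 (filled) but the crossing count is even (not filled). Inside one of the points of the star, both rules agree that the pixel is filled.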

Technique 5: Clipping region

The final approach that I'll show is using a clipping region to remove the triangle at the center of the shape from the enabled drawing region.

// Technique 5: remove the inner hole using a clipping region
CGContextSaveGState(context);
CGContextAddRect(context, CGContextGetClipBoundingBox(context));
CGContextMoveToPoint(context, Ex, Ey);
CGContextAddLineToPoint(context, Fx, Fy);
CGContextAddLineToPoint(context, Gx, Gy);
CGContextClosePath(context);
CGContextEOClip(context);
CGContextMoveToPoint(context, Ax, Ay);
CGContextAddLineToPoint(context, Bx, By);
CGContextAddLineToPoint(context, Cx, Cy);
CGContextAddLineToPoint(context, Dx, Dy);
CGContextAddLineToPoint(context, Ax, Ay);
CGContextSetRGBFillColor(context, 0, 0, 0.5, 1);
CGContextFillPath(context);
CGContextRestoreGState(context);

You can see that this approach is actually quite complicated since a clipping region with a hole requires the same effort as a shape with a hole: I subtracted the triangle from the clipping region's bounding rectangle using an even-odd clipping rule.

Advantages: a clipping region can subtract or cut very complicated shapes — even clusters of shapes — very simply.

Disadvantages: for cutting a single shape like this, the extra effort required to save the old graphics state and restore it after we're done, plus the fact that the clipping region is just as complicated as the shape itself, makes this approach more effort than the previous two.

Conclusion

You can download the code for drawing these shapes here: GraphicalSubtraction.zip (25kB)

A very simple shape but you can draw it in some very different ways. As you can see, the "wrong" ways of solving the problem don't actually save any code — the proper solutions are roughly the same length.

The best ways to solve this problem are to use the even-odd or winding count approaches. While the shape might look like the center is clipped out, the clipping region turns out to be more work and non-rectangular clipping regions are actually more computationally difficult as well.


A look at how malloc works on the Mac

In this post, I'll take a high-level look at how malloc is implemented on the Mac. I'll look at how memory is allocated for "tiny", "small" and "large" allocation scales, the multi-core performance improvements introduced in Snow Leopard and some inbuilt debugging features you can trigger for finding memory problems including buffer overruns.

Introduction

All memory in Mac OS X is allocated to applications by the kernel. The kernel allocates memory to applications by mapping virtual memory pages (which are 4 kilobytes in size) into the application's memory space.

You can allocate memory in your application this way (using the mmap function) to get your memory 4 kilobytes at a time. Alternatively, you can simply use memory on the stack, which is mapped automatically for every thread (by default, 8 megabytes for the main thread and 512 kilobytes for secondary threads).

But most of the time when people talk about allocating memory, they are referring to allocations using malloc.

Compared to requesting pages of virtual memory directly, malloc is finer grained (you can request much smaller sizes than 4 kilobytes) and considerably faster (since it doesn't require an mmap every time).

Of course, internally, malloc does receive its memory from the kernel by mapping virtual memory pages. Malloc gains its advantage by dividing those pages (or clusters of pages) into smaller regions and returning pointers to addresses within those regions when smaller allocations are requested.

I'm using the term "malloc" but this article applies to Objective-C's alloc and allocWithZone:, all CoreFoundation allocations and all related C functions like calloc, realloc, valloc, malloc_zone_malloc, malloc_zone_calloc, malloc_zone_valloc, malloc_zone_realloc and malloc_zone_batch_malloc since all these functions go through the same internal implementation on the Mac.

A generic malloc implementation

Malloc implementations work by requesting a handful of virtual memory pages from the kernel and returning pointers to free areas within those pages when memory is requested. To know which areas within the pages are free at any given time, a malloc implementation must maintain metadata about the size and location of each allocated block in use and any free space between blocks.

The pages of memory managed by malloc are collectively called "the heap" but it should be noted that malloc generally does not use a heap data structure (which is a form of sorted tree) to track blocks; the two uses of the term "heap" are unrelated.

As the program requires more memory, the malloc implementation requests more virtual memory pages, increasing the application's memory footprint. Every reasonable effort should be made to allocate new blocks in the spaces left by previously freed blocks to keep the memory footprint low.

The main difficulty for a memory allocator is to keep the amount of metadata and the amount of processing time low. This can be very difficult to do when memory becomes a mottled pattern of allocated blocks and freed space.

If the memory allocator kept an array of every single allocation, its location and size, the metadata could easily take as much memory as the allocations themselves. To increase efficiency, most allocators do not track every single byte in memory. Most use a resolution of 16 bytes or more. This allows reduced metadata to track allocated and freed runs.

Further, most allocators use free lists; instead of traversing memory on each allocation looking for an empty space of the appropriate size, the allocator keeps lists of freed areas categorized by their approximate sizes. As a further optimization, these free lists do not always track all free areas in a block (the free lists normally track a finite number of free areas).
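To make the free list idea concrete, here is a toy sketch. It is not the Mac's implementation: the names, the 16 byte quantum and the four size classes are all assumptions chosen for illustration. Each size class gets its own singly linked list, and a freed block stores the "next" pointer inside its own (now unused) memory, so the lists need no extra storage.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed toy parameters: size classes of 16, 32, 48 and 64 bytes. */
enum { kQuantum = 16, kSizeClassCount = 4 };

/* A freed block is reused to hold the link to the next freed block. */
typedef struct FreeBlock { struct FreeBlock *next; } FreeBlock;

/* One list head per size class; all lists start out empty. */
static FreeBlock *gFreeLists[kSizeClassCount];

/* Maps a request of 1 to 64 bytes to the index of its free list. */
static size_t SizeClassFor(size_t size)
{
    assert(size > 0 && size <= (size_t)kQuantum * kSizeClassCount);
    return (size + kQuantum - 1) / kQuantum - 1;
}

/* Records a freed block at the head of the list for its size class. */
void ToyFree(void *block, size_t size)
{
    FreeBlock *freed = block;
    size_t sizeClass = SizeClassFor(size);
    freed->next = gFreeLists[sizeClass];
    gFreeLists[sizeClass] = freed;
}

/* Reuses a listed block of the right class, or returns NULL if there is none. */
void *ToyAllocate(size_t size)
{
    size_t sizeClass = SizeClassFor(size);
    FreeBlock *block = gFreeLists[sizeClass];
    if (block != NULL)
    {
        gFreeLists[sizeClass] = block->next;
    }
    return block;
}
```

A real allocator would fall back to carving fresh space out of its pages when ToyAllocate finds nothing, but the lookup itself is a constant time operation with no traversal of memory.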

The Mac's implementation of malloc

Despite the fact that all C standard library implementations offer a function named malloc, they all have very different internal implementations.

The Mac's implementation of malloc is open source and is composed of two key implementation files: malloc.c and magazine_malloc.c.

The malloc.c file is mostly a wrapper around the internal implementation in magazine_malloc.c. This external wrapper routes regular malloc invocations through the malloc_zone_malloc function using the default malloc zone — so all malloc allocations on the Mac are actually zoned allocations sharing the same implementation that's used by Objective-C's +[NSObject allocWithZone:].

Up until Snow Leopard, the internal implementation was named scalable_malloc.c. It is scalable because it contains different code paths for allocations based on their size. The newer version magazine_malloc.c retains much of the code from scalable_malloc.c but adds multithreaded improvements for smaller allocations and removes the old "huge" scale (instead using the same allocation approach for "large" and "huge" scale allocations).

The Mac's malloc regions

The malloc allocation on the Mac has different code paths for the following allocation sizes:

Allocation size                          Code path name   Quantum size (allocation resolution)   Region size
32-bit: 1 byte to 496 bytes              Tiny             16 bytes                               32-bit: 1MB
64-bit: 1 byte to 992 bytes                                                                      64-bit: 2MB
32-bit: 497 bytes to "Large" threshold   Small            512 bytes                              32-bit: 8MB
64-bit: 993 bytes to "Large" threshold                                                           64-bit: 16MB
< 1GB RAM: 15kB or greater               Large            4kB                                    N/A
>= 1GB RAM: 127kB or greater

The key point to note about the Mac's scalable malloc implementation is that it doesn't just divide up virtual memory pages to return smaller blocks but it allocates "tiny" and "small" allocations from their own separate regions ("large" allocations are simply allocated as virtual memory pages).

Allocating a "tiny" amount results in memory being returned from within a 2MB block (1MB in 32-bit programs). Allocations are always rounded up to the nearest 16 byte boundary since the tiny regions only track allocations at a resolution of 16 bytes. This has the additional advantage that it ensures all memory allocations are 16-byte aligned (helpful for SSE/Altivec instructions).

Of course, the 1 or 2 MB regions used for tiny allocations are only big enough to hold around 64,000 allocations at the smallest size — less for larger allocations. More regions are created as needed.

Beyond 15kB (127kB on systems with 1GB of RAM or more) the Mac's malloc allocates memory purely as virtual memory pages. No additional tracking or metadata is maintained (although the kernel does maintain tracking for these pages).
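The thresholds and quantum sizes given above can be expressed as a small function. This is a sketch of the figures listed in this post only, not code from magazine_malloc.c: the names are hypothetical and I've assumed that 1kB means 1024 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { kTinyPath, kSmallPath, kLargePath } AllocationPath;

/* Picks the code path for a request, using the size ranges listed above. */
AllocationPath ClassifyAllocation(size_t size, bool is64Bit, bool hasAtLeast1GBRAM)
{
    size_t tinyLimit = is64Bit ? 992 : 496;
    size_t largeThreshold = (size_t)(hasAtLeast1GBRAM ? 127 : 15) * 1024;
    assert(size > 0);
    if (size >= largeThreshold) return kLargePath;
    if (size <= tinyLimit) return kTinyPath;
    return kSmallPath;
}

/* Rounds a request up to the quantum of its code path: 16, 512 or 4096 bytes. */
size_t RoundedAllocationSize(size_t size, bool is64Bit, bool hasAtLeast1GBRAM)
{
    static const size_t quantumForPath[] = { 16, 512, 4096 };
    size_t quantum = quantumForPath[ClassifyAllocation(size, is64Bit, hasAtLeast1GBRAM)];
    return (size + quantum - 1) / quantum * quantum;
}
```

By these figures, a 17 byte request occupies 32 bytes in a tiny region, while a 993 byte request in a 64-bit program is a "small" allocation that occupies 1024 bytes.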

To summarize the levels of hierarchy then:

  • Malloc zones allocate "tiny" and "small" regions or "large" blocks directly
  • Regions return blocks from their contents as results for "tiny" and "small" malloc operations

The importance of regions

The reason for separate regions for tiny and small allocations is that it allows the region's metadata to be more efficiently tuned to tracking the size of object it contains. For "tiny" allocations, tracking memory down to 16 byte units is worthwhile (it allows freed space to be reclaimed better and doesn't waste large amounts of memory around allocated objects). For "small" sized objects though, tracking at this resolution would be a poor tradeoff between CPU time (traversing through lists of freed or allocated blocks) and memory efficiency — the coarser 512 byte resolution is more efficient.

The tiny regions also have the advantage of reducing the impact of memory fragmentation. By keeping small and large allocations separate, the pattern of small allocations in the midst of otherwise large, free areas that is a cause of memory fragmentation is avoided. Of course, it doesn't eliminate fragmentation but helps.

The existence of the "tiny" region is of great importance to Objective-C. Since Objective-C allocates all objects within malloc zones and almost all Objective-C objects fall within the "tiny" size bounds, it is of great benefit to have an optimized code path for such allocations.

Threading improvements in Mac OS X 10.6

In Snow Leopard, Apple replaced the old scalable_malloc.c with the newer magazine_malloc.c. This new implementation introduces a new approach of creating special "tiny" memory allocation regions for each thread.

This approach is "inspired" by the Hoard memory allocator with the thread-specific clusters and "superblocks" of that approach implemented as the magazine and regions in the Mac malloc implementation. Most of the allocation metadata that was previously kept in the "zone" structures moves into the new "magazine" level of the hierarchy so that it can be kept thread-specific. This includes all data that needs to be updated per allocation, like the number of allocated regions, the free lists and available memory counts.

Each "tiny" region is itself allocated by the top level allocator but is then assigned to a specific thread. Since the region is then thread-specific, the top-level, shared, allocator does not need to be locked and the chance of any thread contention is very low.

To summarize the levels of hierarchy for "tiny" allocations with these additions:

  • Malloc zones allocate magazines for "tiny" regions (1 per thread) and allocate the regions themselves when requested by the magazines.
  • Magazines manage the regions for a thread
  • Regions return blocks from their contents
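
The hierarchy in this list can be modeled with a toy sketch. Everything here is hypothetical (the type names, the fixed sizes, one region per magazine, a simple bump pointer in place of free lists) and it omits the locking the real allocator still performs. It is only meant to show that each thread slot touches its own metadata.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_REGION_BYTES 1024  /* assumed size, far smaller than a real region */
#define TOY_QUANTUM      16    /* "tiny" tracking unit from the text */
#define TOY_MAX_THREADS  4     /* assumed number of thread slots */

/* A region hands out blocks from its own storage. */
typedef struct {
    unsigned char bytes[TOY_REGION_BYTES];
    size_t used;
} toy_region_t;

/* A magazine owns the region and the per-allocation metadata for one thread. */
typedef struct {
    toy_region_t region;
    size_t allocation_count;
} toy_magazine_t;

/* The zone owns the magazines but is not touched per allocation. */
typedef struct {
    toy_magazine_t magazines[TOY_MAX_THREADS];
} toy_zone_t;

/* Allocate from the magazine belonging to thread_slot; NULL if it won't fit. */
void *toy_tiny_alloc(toy_zone_t *zone, unsigned thread_slot, size_t size)
{
    assert(zone != NULL && thread_slot < TOY_MAX_THREADS);
    if (size == 0 || size > TOY_REGION_BYTES)
        return NULL;

    toy_magazine_t *magazine = &zone->magazines[thread_slot];
    size_t rounded = (size + TOY_QUANTUM - 1) / TOY_QUANTUM * TOY_QUANTUM;
    if (rounded > TOY_REGION_BYTES - magazine->region.used)
        return NULL;

    void *block = magazine->region.bytes + magazine->region.used;
    magazine->region.used += rounded;
    magazine->allocation_count++;
    return block;
}
```

Two calls with different thread slots update different magazines, so in this model neither would need to wait on the other.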

Of course, locks are still used internally when allocating or freeing from a region (since a malloc from one thread can still be freed in a different thread) but the chance of two threads contending for a lock is significantly reduced. The likelihood of cache lines being shared unnecessarily between CPUs is also reduced since threads don't allocate memory in the same regions.

The end result is that the majority of memory allocations in Objective-C (which are typically allocated in a thread and released by the autorelease pool on that thread) will have near perfect thread independence.

A pair of minor notes

When you free memory, your application footprint will not immediately go down. Freeing region allocated memory adds the space to the free list for the region but will not cause the region to be released unless it was the last block in the region. Your application's footprint will only go down if an entire region is freed and the zone can unmap the virtual memory pages.

calloc takes two parameters: a number of elements and the size of each element. malloc takes only a single size, but that size is often calculated by multiplying sizeof(SomeType) by a number of elements. The resulting block size is identical: internally, calloc just multiplies its two parameters together. Other than some overflow checking on the multiplication (and calloc zero-filling the memory it returns), the block returned by calloc(a, b) is the same size as the block returned by malloc(a * b).
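
As a rough sketch of that relationship (not the actual library source), a calloc-style function can be written as an overflow-checked multiplication followed by malloc and a zero fill. The names checked_multiply and calloc_like are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Store count * size in *total and return 1, or return 0 if the
   multiplication would overflow size_t. */
int checked_multiply(size_t count, size_t size, size_t *total)
{
    assert(total != NULL);
    if (size != 0 && count > SIZE_MAX / size)
        return 0;
    *total = count * size;
    return 1;
}

/* A calloc-like wrapper: same block size as malloc(count * size),
   plus the overflow check and zero fill. */
void *calloc_like(size_t count, size_t size)
{
    size_t total;
    if (!checked_multiply(count, size, &total))
        return NULL;

    void *block = malloc(total);
    if (block != NULL)
        memset(block, 0, total);
    return block;
}
```

The overflow check is the part worth copying if you compute sizes by hand: malloc(a * b) silently allocates too little if a * b wraps around.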

Debugging information

One important point to note from reading the malloc.c file is that the Mac memory allocator can be configured at runtime to generate logging information. The following environment variables can be set to have the memory allocator perform debug behaviors:

  • MallocLogFile <f> to create/append messages to file <f> instead of stderr
  • MallocGuardEdges to add 2 guard pages for each large block
  • MallocDoNotProtectPrelude to disable protection (when previous flag set)
  • MallocDoNotProtectPostlude to disable protection (when previous flag set)
  • MallocStackLogging to record all stacks. Tools like leaks can then be applied
  • MallocStackLoggingNoCompact to record all stacks. Needed for malloc_history
  • MallocStackLoggingDirectory to set location of stack logs, which can grow large; default is /tmp
  • MallocScribble to detect writing on free blocks and missing initializers: 0x55 is written upon free and 0xaa is written on allocation
  • MallocCheckHeapStart <n> to start checking the heap after <n> operations
  • MallocCheckHeapEach <s> to repeat the checking of the heap after <s> operations
  • MallocCheckHeapSleep <t> to sleep <t> seconds on heap corruption
  • MallocCheckHeapAbort <b> to abort on heap corruption if <b> is non-zero
  • MallocCorruptionAbort to abort on malloc errors, but not on out of memory for 32-bit processes. MallocCorruptionAbort is always set on 64-bit processes
  • MallocErrorAbort to abort on any malloc error, including out of memory
  • MallocHelp - this help!
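
When running with MallocScribble, it can help to test whether a suspicious buffer consists entirely of one of the scribble bytes listed above. A minimal sketch follows; the helper name and macros are hypothetical, and only the 0x55 and 0xaa values come from the list.

```c
#include <assert.h>
#include <stddef.h>

#define SCRIBBLE_ON_FREE  0x55  /* written when a block is freed */
#define SCRIBBLE_ON_ALLOC 0xaa  /* written when a block is allocated */

/* Return 1 if every byte in the buffer equals pattern, otherwise 0.
   An empty buffer is reported as not matching. */
int filled_with(const void *buffer, size_t length, unsigned char pattern)
{
    assert(buffer != NULL);
    const unsigned char *bytes = buffer;
    for (size_t i = 0; i < length; i++)
    {
        if (bytes[i] != pattern)
            return 0;
    }
    return length > 0;
}
```

A structure that reads back as all 0x55 suggests it is being used after it was freed, while all 0xaa suggests a missing initializer.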

Unfortunately, the normal build of magazine_malloc.c in Mac OS X has the limitation that it won't apply guard pages to "small" or "tiny" allocations. To apply guard pages to all data, you'll need to use the libgmalloc library. Do this by setting the following environment variable:

export DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib

For more information, see the libgmalloc Manual Page.

You can also set environment variables in Xcode by right-clicking on the executable in the Tree view, selecting "Get Info" and going to the "Arguments" tab.

Conclusion

I'm not sure there are a huge number of lessons to be directly taken from this article and immediately used in a program. The debug options are useful to know should the need arise but this article is mostly an exercise in investigating one of the more heavily used library functions on the Mac.

The memory allocator on the Mac has lots of different OS-specific behaviors related to alignment, granularity and thread performance. Depending on the size of the allocation, and even the amount of installed memory, some of these behaviors will change from allocation to allocation.

One reassuring point to take from this is to know that the many thousands of tiny Objective-C object allocations in a typical Cocoa program do have a heavily optimized path on the Mac.


Finding or creating the application support directory

A simple post this week but one which optimizes a common task: locating the application support directory for the current application, creating it if it doesn't exist. The result makes accessing the current application's support directory a single line and provides a structure for locating and creating folders at other standard locations with similar ease.

The Application Support Directory

On the Mac, the correct location to store persistent user-related files for your application is in a directory with the same name as your application in the Application Support directory for the current user.

As an example, if the current user "person" runs an application named "ExampleApp" it would store such files in the following location:

/Users/person/Library/Application Support/ExampleApp/

On iPhone OS devices, the running application gets its own copy of the Library directory, so you could write files wherever you choose within this. However, I use the same code on both platforms for consistency and there is no real penalty in doing this. For an iPhone OS device then, the path to the application support directory would look like this:

/User/Applications/12345678-AAAA-BBBB-CCCC-0123456789AB/Library/Application Support/ExampleApp/

where 12345678-AAAA-BBBB-CCCC-0123456789AB is whatever UUID has been assigned to your application.

The application support directory is only for user-related persistent files. If you want to store user-related preferences, it is generally better to store them in NSUserDefaults.

Getting paths correctly

The correct way to get the path to the Application Support directory is to use the NSSearchPathForDirectoriesInDomains function passing NSApplicationSupportDirectory for the search path and NSUserDomainMask for the domain.

This can be done in one line but on its own, it is only ever part of the solution.

While NSSearchPathForDirectoriesInDomains can return a path to the Application Support directory, it does not guarantee that the application support directory exists. For an iPhone OS device, it almost certainly won't exist the first time you run the application.

Further, we need to append the name of the current application to this path and create this application-specific subdirectory if needed.

Finally, we need to handle all of this in a way that is tolerant of errors including failure to create the directory or existence of files where we need a directory to go.

A simple idea — get the application support directory — turns out to be a multi-step operation.

Design of the solution

The solution that I use in many of my applications is based around a method with the following declaration:

- (NSString *)findOrCreateDirectory:(NSSearchPathDirectory)searchPathDirectory
    inDomain:(NSSearchPathDomainMask)domainMask
    appendPathComponent:(NSString *)appendComponent
    error:(NSError **)errorOut

This is a flexible method that can be used for resolving/creating a directory/subdirectory at any standard location searchable by NSSearchPathForDirectoriesInDomains.

The first two parameters are passed straight through to NSSearchPathForDirectoriesInDomains. The third parameter is a subpath to append to the result from NSSearchPathForDirectoriesInDomains (which we can use to append the current application's name to get our application specific subdirectory). The final parameter is used to return information about any of the three errors that can occur (no path found, file exists at directory location or unable to create directories).

I further supplement this with a convenience method to invoke this with all the appropriate parameters for creating the application support directory:

- (NSString *)applicationSupportDirectory

On error, this method simply logs any error result using NSLog.

In my solution, both of these methods are part of a category on NSFileManager. There is no technical requirement that it be a category on NSFileManager but these methods do use NSFileManager internally and do share the same goals of providing access to directories within the filesystem. Further refinements could also add a URL version based on the NSFileManager method -URLsForDirectory:inDomains: which would make the association less arbitrary.

Implementation

I've omitted the creation of the error objects for space but otherwise the implementation is as follows:

- (NSString *)findOrCreateDirectory:(NSSearchPathDirectory)searchPathDirectory
    inDomain:(NSSearchPathDomainMask)domainMask
    appendPathComponent:(NSString *)appendComponent
    error:(NSError **)errorOut
{
    // Search for the path
    NSArray* paths = NSSearchPathForDirectoriesInDomains(
        searchPathDirectory,
        domainMask,
        YES);
    if ([paths count] == 0)
    {
        // *** creation and return of error object omitted for space
        return nil;
    }

    // Normally only need the first path
    NSString *resolvedPath = [paths objectAtIndex:0];
    
    if (appendComponent)
    {
        resolvedPath = [resolvedPath
            stringByAppendingPathComponent:appendComponent];
    }
    
    // Create the path if it doesn't exist
    NSError *error = nil;
    BOOL success = [self
        createDirectoryAtPath:resolvedPath
        withIntermediateDirectories:YES
        attributes:nil
        error:&error];
    if (!success) 
    {
        if (errorOut)
        {
            *errorOut = error;
        }
        return nil;
    }
    
    // If we've made it this far, we have a success
    if (errorOut)
    {
        *errorOut = nil;
    }
    return resolvedPath;
}

I've noted that we "Normally only need the first path". If you pass multiple values ORed together for the domain mask (e.g. NSUserDomainMask | NSLocalDomainMask) then this assumption will not be true. If your program needs to handle multiple directories in this way, you'll need to add appropriate handling of multiple results.

Finally, we have the implementation of the applicationSupportDirectory method:

- (NSString *)applicationSupportDirectory
{
    NSString *executableName =
        [[[NSBundle mainBundle] infoDictionary] objectForKey:@"CFBundleExecutable"];
    NSError *error = nil;
    NSString *result =
        [self
            findOrCreateDirectory:NSApplicationSupportDirectory
            inDomain:NSUserDomainMask
            appendPathComponent:executableName
            error:&error];
    // Test the return value, not the error pointer, to detect failure
    if (!result)
    {
        NSLog(@"Unable to find or create application support directory:\n%@", error);
    }
    return result;
}

As you can see, it pulls the application name out of the main bundle's infoDictionary and uses this to create the Application Support directory.

The end result is that you can get the path to the application support directory with the following line:

NSString *path = [[NSFileManager defaultManager] applicationSupportDirectory];

Conclusion

You can download the complete category here: NSFileManager_DirectoryLocations.zip (6kb)

I like it when a task that can be unambiguously described in a simple sentence ("Get the path to the application support directory.") is correspondingly achieved in a single line.

The code presented in this post implements a simple set of steps but since I need to do this in most applications, it is one of my most commonly used categories.
