Advanced programming tips, tricks and hacks for Mac development in C/Objective-C and Cocoa.

Propagate deletes immediately in Core Data

Learn some limitations associated with cascading deletes in Core Data and find out how to immediately propagate deletes in Core Data, overcoming these potential problems.

Correction: this post previously indicated that NSManagedObject deletes would fail inside processPendingChanges. This issue was fixed in Mac OS X 10.5. The post has been edited to reflect that in Mac OS X 10.5 it is only processPendingChanges that cannot be invoked inside itself.

Limitations to the default "deleteObject" for change propagation

Core Data has the ability to perform full object-graph deletes through the "cascade" delete rule available to each entity in the object model.

If you have overridden the Key Value Coding methods for relationships on your model entities, you can encounter the following issues.

Always deferred

Deletes in Core Data are always deferred, either to the end of the event (when -[NSManagedObjectContext processPendingChanges] is next called) or until the context is saved.

This means that if you want to read a value changed by a delete, you must:

  • not be inside a -[NSManagedObjectContext processPendingChanges] invocation (i.e. during a delete propagation)
  • call -[NSManagedObjectContext processPendingChanges] after the delete to flush the work through

If you are inside processPendingChanges (or you otherwise don't want to call it) but you need to see changes immediately, then you will need to implement your own change propagation instead of relying on the default deleteObject: to apply changes.

Relationships deleted in no particular order

The "cascade" Delete Rule does not cascade through relationships in any particular order. If you need relationships to be deleted in a specific order (i.e. if you may read from the object during deletion), you will need to implement your own change propagation.

No recursive deletes in 10.4

In Mac OS X 10.4, Core Data's delete propagation is incompatible with recursively invoked deletes. If a delete propagation triggers another delete propagation while it is doing its work, the second propagation won't work (it will have no effect).

An example of how these problems can occur

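Consider an .xcdatamodel containing three entities, "A", "B" and "C", where "A" has a to-many relationship "b" to "B" objects and a to-many relationship "c" to "C" objects.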

In this example, "A" uses a "cascade" Delete Rule for its properties but "B" and "C" use "nullify".

Our program must obey the following rule:

    Every B object attached to an A object must have a matching C object attached to the same A object at all times.

Our program maintains this rule by overriding the Key Value Coding methods for the "A" class.

The addBObject: method is overridden to create a C object and add it to the "A" object's "c" property before the "b" property is changed. Similarly, the removeBObject: method is overridden to delete the matching C object in the "A" object's "c" property after the "b" property is changed.

By adding the "C" object before the "b" property is changed and removing the "C" object after the "b" property is changed, we ensure that while a "B" object is attached to an "A" object, there is always a matching "C" object attached.

Once we've deleted the "C" object, we use NSAssert to check that the number of "B" objects and "C" objects attached to "A" are the same. To ensure that the "C" object has been deleted before this read, we must call processPendingChanges to flush through the delete.

Problems with nested processPendingChanges

The example described above will work fine, unless it is invoked from inside a delete propagation.

If we delete a "B" object, instead of simply removing it from "A", then the removal happens inside the processPendingChanges for the "B" object's delete.

In Mac OS X 10.4, the deleteObject: invocation in removeBObject: will not work correctly from inside processPendingChanges. This is a documented limitation. Any use of deleteObject: in an overridden accessor in Mac OS X 10.4 should propagate deletes itself.

In Mac OS X 10.5, the call to processPendingChanges in removeBObject: will have no effect. This means that the NSAssert will fail.

In this trivial example, the NSAssert isn't important but it shows that if you need to flush deletes to read back immediately and may need this functionality during a delete propagation, then you will need to propagate this delete yourself.

Problems with order of delete

If you delete an "A" object, then you have no control over whether the "b" property or "c" property is deleted first.

The "c" property could be deleted first. If this occurs, then the rule "must have a matching C object attached to the same A object at all times" is violated.

If you need control over the order that relationships are deleted, then you must propagate the delete yourself.

Solution: actively perform the delete propagation yourself

A solution to the above listed problems is to actively perform the delete propagation yourself. This means iterating over all relationships on the object and using the deleteRule information from the relationship to decide how to handle the object on the other end.

This is going to be slower than the default delete propagation but we are deliberately replicating its behavior so that we can have greater control over what is done during the propagation.

Objects will still be deleted using -[NSManagedObjectContext deleteObject:] but before this occurs, they will be correctly disconnected from the object graph. To ensure that no infinite loops occur, all relationships in affected objects are set to nil as propagation passes through.

A brief warning: the following method doesn't handle the "Deny" delete rule. You would need to add support for this yourself.

So here's the big block of code. This is intended to be a Category method on NSManagedObject (otherwise it won't work as written). The priorityDeletionRelationships method should be overridden by classes which need some relationships deleted first — the array returned should be the ordered list of relationship keys to delete first.

- (NSArray *)priorityDeletionRelationships
{
   return nil;
}

- (void)propagateDelete
{
   NSEntityDescription *entityDescription = [self entity];
  
   // Get the set of relationships
   NSDictionary *relationships = [entityDescription relationshipsByName];
   NSArray *unsortedKeys = [relationships allKeys];
   NSArray *priorityKeys = [self priorityDeletionRelationships];
   NSArray *keys;
   if ([priorityKeys count] > 0)
   {
       keys = [[unsortedKeys mutableCopy] autorelease];
       [(NSMutableArray *)keys
           removeObjectsInArray:priorityKeys];
       [(NSMutableArray *)keys
           replaceObjectsInRange:NSMakeRange(0, 0)
           withObjectsFromArray:priorityKeys];
   }
   else
   {
       keys = unsortedKeys;
   }
  
   // Iterate over the set of relationships
   // "keys" is an NSArray, so enumerate it with objectEnumerator
   NSEnumerator *relationshipEnumerator = [keys objectEnumerator];
   NSString *relationshipName;
   while ((relationshipName = [relationshipEnumerator nextObject]) != nil)
   {
       NSRelationshipDescription *relationshipDescription =
           [relationships objectForKey:relationshipName];
      
       // If the relationship is not "cascade", then just nullify it.
       if ([relationshipDescription deleteRule] != NSCascadeDeleteRule)
       {
           if (![relationshipDescription isToMany])
           {
               [self setValue:nil forKey:relationshipName];
           }
           else
           {
               NSMutableSet *relationshipSet =
                   [self mutableSetValueForKey:relationshipName];
               [relationshipSet removeAllObjects];
           }
           continue;
       }
      
       // Propagate the delete to the object at the other end of the
       // relationship
       if (![relationshipDescription isToMany])
       {
           NSManagedObject *destination = [self valueForKey:relationshipName];
           [self setValue:nil forKey:relationshipName];
           [destination propagateDelete];
           continue;
       }
      
       // Propagate the delete to every object in the to-many relationship.
       // We copy the set because we plan to change it during iteration.
       NSMutableSet *mutableRelationship =
           [self mutableSetValueForKey:relationshipName];
       NSSet *iterateSet = [[mutableRelationship copy] autorelease];
       NSEnumerator *enumerator = [iterateSet objectEnumerator];
       NSManagedObject *setObject;
       while ((setObject = [enumerator nextObject]) != nil)
       {
           [mutableRelationship removeObject:setObject];
           [setObject propagateDelete];
       }
   }
  
   // Delete this object
   [[self managedObjectContext] deleteObject:self];

}

viewWillDraw - a welcome addition to NSView in 10.5

A method named viewWillDraw appeared in NSView in Mac OS X 10.5. If you have cause to use it, this method replaces 6 other methods from earlier versions of Mac OS X. Read more to find out if you should use it and how it helps.

Drawing in NSView

Drawing in Cocoa happens in the NSView method drawRect:. Almost all programs perform some form of custom drawing, so this is one of the most commonly overridden methods in Cocoa.

The general procedure to follow in this method is:

  • Determine what needs to be drawn
  • For each component that needs to be drawn, draw the component

But what happens if, in the course of determining what needs to be drawn, you need to alter the area being updated? If, after determining what needs to be drawn, we realize that a different region needs to be updated, it's too late — the drawing rectangle can't be changed from inside drawRect:.

Layout code

The above described problem is exactly what is faced by deferred or pending layout code.

Layout code deferred until drawRect: is unable to change the bounds for drawing and must trigger a separate drawRect: invocation. This can create pauses in display, tearing or other drawing problems.

Let me be clear here: when I say "layout", I don't mean the arrangement of NSViews in a window (these will handle themselves). "Layout" in this sense means graphical objects within a single view which may need to be arranged in some way (e.g. column alignment, animation or other procedural generation or positioning).

Deferred layout isn't very common (since drawing is normally triggered after layout is performed) but there may be cases where drawing begins while layout updates are still pending and you want to block the current drawRect: invocation until the layout update is complete. If those layout updates alter the bounds of objects being drawn, we'll want the current invocation of -[NSView drawRect:] to update those bounds, to avoid drawing glitches.

The old solution

The solution to this problem is to perform the layout required for drawing slightly earlier than drawRect: — before the NSView is locked into drawing a fixed rectangle. This gets around the limitation that drawRect: can only update the area it is given.

Previously, this meant performing pending layout in these NSView methods:

  • displayIfNeeded
  • displayIfNeededIgnoringOpacity
  • _recursiveDisplayAllDirtyWithLockFocus:visRect:
  • _recursiveDisplayRectIfNeededIgnoringOpacity:isVisibleRect:rectIsVisibleRectForView:topView:
  • _recursiveDisplayRectIfNeededIgnoringOpacity:inContext:topView:
  • _lightWeightRecursiveDisplayInRect

These are all the immediate predecessor methods to drawRect:. When these methods are invoked, it is early enough to extend the drawing region by invoking setNeedsDisplayInRect: and the new drawing region will get included when drawRect: is invoked.

Unfortunately, only the first two of these methods are documented. The remainder are undocumented and Apple is free to change them without notice — potentially messing up your program. So "the old solution" was not a good solution.

Conclusion: the Leopard solution

Fortunately, Mac OS X 10.5 gives a much cleaner method to override before drawRect: is invoked, namely viewWillDraw. This method is the perfect spot to perform deferred or pending layout. You can alter the bounds that need to be updated and this new altered region will be included on the very next update. This means no tearing and no holes in your drawing.

Type punning isn't funny: Using pointers to recast in C is bad.

A very common C technique for reinterpreting data types has the potential to cause nasty bugs. Apple knows this, which is why the implementation of NSRectToCGRect (correctly) doesn't do what the documentation claims. I show you a technique to perform reinterpret casts safely in your own code.

Apple's documentation for the function NSRectToCGRect claims that it is implemented as follows:

CGRect NSRectToCGRect(NSRect nsrect) {
   return (*(CGRect *)&(nsrect));
}

If you have seen a lot of C code, chances are that you've seen this approach before. You can't cast one struct to another to reinterpret — even if they have the same fields — so it is common to see reinterpreting by making a pointer and casting the pointer.

The implication is that NSRectToCGRect reinterprets an NSRect as a CGRect without altering the contained data.

While the implied functionality is accurate, the displayed implementation is not. In actuality, the function looks like this:

NS_INLINE CGRect NSRectToCGRect(NSRect nsrect) {
    union _ {NSRect ns; CGRect cg;};
    return ((union _ *)&nsrect)->cg;
}

Why the difference? Why bother creating a union? Why shouldn't you simply cast through a pointer?

Type punning

As common as casting through a pointer is, it is actually bad practice and potentially risky code. Casting through a pointer has the potential to create bugs because of type punning.

Type punning
A form of pointer aliasing where two pointers refer to the same location in memory but represent that location as different types. The compiler will treat both "puns" as unrelated pointers. Type punning has the potential to cause dependency problems for any data accessed through both pointers.

Most of the time, type punning won't cause any problems. It is considered undefined behavior by the C standard but will usually do the work you expect.

That is unless you're trying to squeeze more performance out of your code through optimizations. Specifically, if you ever turn on "Enforce Strict Aliasing" in Xcode (a.k.a. -fstrict-aliasing in GCC) you run the risk of unpredictable and errant behavior. With strict aliasing, the compiler may start doing things in the wrong order or leaving instructions out entirely.

To be clear, these bugs can only occur if you dereference both pointers (or otherwise access their shared data) within a single scope or function. Just creating a pointer should be safe.

An example of a punning bug

Before the NSRectToCGRect function existed, I had some code which did the following:

NSRect ellipseBounds;
ellipseBounds.origin.x = 0;
ellipseBounds.origin.y = 0;
ellipseBounds.size.width = WIDGET_SIZE - 1.0;
ellipseBounds.size.height = WIDGET_SIZE - 1.0;
ellipseBounds = NSInsetRect(ellipseBounds, 4, 4);

CGContextAddEllipseInRect(context, *(CGRect *)&ellipseBounds);
CGContextFillPath(context);

This code creates and sets up an NSRect and then reinterprets it as a CGRect before using it.

In this case, with -fstrict-aliasing enabled, GCC chose to order the NSInsetRect after the call to CGContextAddEllipseInRect because the dependency between the two was broken by type punning when the pointer to ellipseBounds was dereferenced as a different type.

Union solves the problem

The traditional solution to this problem, to allow the code to be correct with -fstrict-aliasing enabled, is to use a union. As shown in the NSRectToCGRect code, the union should contain the source and destination types and you simply set or cast to the source type before reading from the destination type.

According to the C standard, anything involving type punning is implementation specific. So in a "standard" sense, using a union doesn't necessarily solve the problem. According to the standard, if you set data in a union on one field, you are required to read back from the same field.

Fortunately, GCC explicitly gives permission to do otherwise. From the GCC documentation:

The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type.

Excellent.
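
As a minimal sketch of the union approach described above, here is the same pattern applied to two stand-in struct types. RectA, RectB and rect_a_to_rect_b are hypothetical names invented for illustration (they are not Apple's types); the only assumption is that both structs have an identical layout, as NSRect and CGRect do in the code shown earlier.

```c
/* Hypothetical stand-ins for NSRect and CGRect: two distinct struct
   types with the same fields in the same order. */
typedef struct { double x, y, width, height; } RectA;
typedef struct { double x, y, width, height; } RectB;

/* Reinterpret a RectA as a RectB by writing one union member and
   reading the other. The access goes through the union type, which is
   the case the GCC documentation quoted above describes as allowed
   even with -fstrict-aliasing. */
static RectB rect_a_to_rect_b(RectA a)
{
    union { RectA a; RectB b; } u;
    u.a = a;    /* write through the source member */
    return u.b; /* read through the destination member */
}
```

Unlike the pointer cast, the compiler can see that the write and the read touch the same union object, so it has no excuse to reorder them.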

A macro to reinterpret your own data safely

Really simple:

#define UNION_CAST(x, destType) \
	(((union {__typeof__(x) a; destType b;})x).b)

This example now incorporates the "__typeof__" suggestion made by Daniel Néri in the comments.

So you could reinterpret the bits of a float variable named myFloat as an int as follows:

int myInt = UNION_CAST(myFloat, int);

You might notice that I don't bother with an inline function, I don't give the union a name, and I don't make a pointer to the value before casting. The Apple NSRectToCGRect function did these things but they are unnecessary. Although, since the compiler should optimize away the extra work, the function, the extra pointer and the dereference in Apple's code shouldn't matter.
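
To make it concrete that the macro reinterprets bits rather than converting values, here is a small sketch. The float_bits helper is a hypothetical name of my own, and the sketch assumes a 32-bit IEEE 754 float and a compiler that accepts the GNU cast-to-union extension the macro relies on (GCC and Clang do).

```c
#include <stdint.h>

/* The macro from this post. The cast-to-union syntax is a GNU C
   extension, so this is not portable to every compiler. */
#define UNION_CAST(x, destType) \
    (((union {__typeof__(x) a; destType b;})x).b)

/* Hypothetical helper: return the raw bit pattern of a float.
   Assumes float is a 32-bit IEEE 754 value. */
static uint32_t float_bits(float value)
{
    return UNION_CAST(value, uint32_t);
}
```

Under those assumptions, float_bits(1.0f) is 0x3F800000, whereas the ordinary value conversion (int)1.0f is simply 1. If you want the numeric value, use a normal cast; UNION_CAST is only for reinterpreting storage.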

Conclusion

Creating a pointer to a value and recasting the pointer to a new type is the most common way to reinterpret data in C that I've seen. Despite its prevalence, you shouldn't do it. Always do your reinterpret casts through a union. It could save you a lot of trouble if you're ever trying to squeeze performance through compiler options.

The value of immutable values

This is an explanation of why Cocoa contains immutable value classes and why the value classes you create in your own program should be immutable too. This post is brought to you by yet another bug release of Magic Number Machine.

Mea culpa

In 2002, I wrote the BigFloat class that would end up being the floating-point number class in my calculator, Magic Number Machine.

When I wrote this class, I was new to Cocoa programming and didn't really understand some of its design philosophies. Specifically, I didn't understand why Cocoa chooses to store its values in immutable classes.

Without this understanding as I wrote the BigFloat class, I designed the operator methods (add:, subtract:, multiplyBy:, divideBy:) to store their result in the receiving object, thereby changing or "mutating" the receiving object.

For example, in Magic Number Machine's BigFloat class, [a add:b] adds "b" to "a" and stores the result in "a".

Storing the result in the same location as one of the operands is common in assembly language and lots of classes have methods that change the underlying object. But this is the wrong approach for an Objective-C value class.

So let's look at why value objects in Cocoa should be immutable where possible.

Immutable objects in Cocoa

It may be useful to clarify what is meant by "immutable":

Immutable object
In object-oriented and functional programming, an immutable object is an object whose state cannot be modified after it is created. This is in contrast to a mutable object, which can be modified after it is created.
From http://en.wikipedia.org/wiki/Immutable_object

It may sound like immutable classes are worse than mutable classes since they contain less functionality. It may then be surprising how many Cocoa classes are immutable. Immutable classes in Cocoa include:

  • NSValue
  • NSNumber
  • NSURL
  • NSDate
  • NSString
  • NSArray
  • NSColor
  • NSEvent
  • NSFont

The reality is that these classes don't necessarily do less than their mutable equivalent but they do always return changes in a newly allocated object instead of changing themselves.

A further trait in common is that all of these classes can be considered "value" classes. They store an indivisible piece of data which is intended to be read and used in its entirety.

The problem with mutable value objects

Objective-C shared references

In Objective-C, it is common to retain objects returned from other classes. "Get" accessor methods often return objects owned by another object that you are free to retain as though it is your own.

The result is that multiple parent objects often maintain references to the same child object. Even if you're using Garbage Collection and you don't explicitly type "retain", it doesn't matter; the effect is the same: multiple objects have a pointer to the same object.

This leads to the biggest problem with mutable values: if one of the sharing objects changes the value object, then both sharing objects see a changed value. This is fine if the value object is supposed to be a shared state object between the sharing objects but is very bad in all other cases.

To prevent external objects changing your internal state without permission, all internal state should be returned immutable. If no immutable class exists, you will need to copy the internal object first and return the copy.
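
The sharing problem is not specific to Objective-C, so here is a rough sketch of it in plain C. Value, Owner and both accessor functions are hypothetical names made up for this illustration; the first accessor hands out the owner's internal pointer, the second returns a copy.

```c
/* Hypothetical mutable value type. */
typedef struct { double amount; } Value;

/* Hypothetical owner that keeps a value as internal state. */
typedef struct { Value *value; } Owner;

/* Risky accessor: returns the owner's own pointer, so any caller can
   change the owner's internal state without the owner knowing. */
static Value *owner_shared_value(const Owner *owner)
{
    return owner->value;
}

/* Safer accessor: returns a copy, so changes made by the caller stay
   with the caller. */
static Value owner_value_copy(const Owner *owner)
{
    return *owner->value;
}
```

Writing through the pointer from owner_shared_value changes what the owner sees; writing to the result of owner_value_copy does not. An immutable value class gives you the second behavior without needing the copy.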

Threading concerns

Multiple objects sharing references to the same data can also create race conditions in threaded code.

Immutable objects largely avoid race conditions because one value can be swapped for another atomically.
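
As a sketch of what "swapped atomically" can look like, here is one way to do it with C11 atomics. Point, current_point, publish_point and read_point are hypothetical names, and the sketch assumes published values are never modified and outlive every reader (in Cocoa, retain counts or Garbage Collection would handle that lifetime).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical immutable value: never modified once published. */
typedef struct { double x, y; } Point;

/* Shared slot holding a pointer to the current value. */
static _Atomic(const Point *) current_point = NULL;

/* Writer: replace the whole value with a single atomic pointer store.
   Readers never observe a half-updated Point. */
static void publish_point(const Point *p)
{
    atomic_store(&current_point, p);
}

/* Reader: a single atomic load yields a pointer to a complete value
   that nobody will change afterwards. */
static const Point *read_point(void)
{
    return atomic_load(&current_point);
}
```

A mutable Point updated field by field would need a lock around both the write and every read; here the only shared operation is the pointer swap.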

Consecutive Use

Another problem with mutable values is that the original value is lost if you don't make a copy before applying an operator.

This causes problems depending on usage pattern. If value objects are ever used in a situation where more than one operator needs to be applied to the original object, you must remember to copy the original object before applying an operator.

For example, if you have value X, it is common to want to calculate X + Y and X + Z. In a mutable object world, if calculating X + Y destroys X, then you must pre-copy X so that it still exists when you want to calculate X + Z.
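
Here is the X + Y and X + Z case as a small C sketch. Number, number_add_in_place and number_add are hypothetical names standing in for a value class and its operators; the in-place version mimics the way [a add:b] stores its result in "a".

```c
/* Hypothetical numeric value type standing in for a value class. */
typedef struct { double value; } Number;

/* Mutating style: the result overwrites the receiver, so the
   receiver's original value is lost. */
static void number_add_in_place(Number *receiver, Number operand)
{
    receiver->value += operand.value;
}

/* Immutable style: both operands are left untouched and the result
   is returned as a new value. */
static Number number_add(Number lhs, Number rhs)
{
    Number result = { lhs.value + rhs.value };
    return result;
}
```

With number_add, you can compute number_add(x, y) and then number_add(x, z) and x is still intact for the second call. With number_add_in_place, applying y and then z to the same receiver yields X + Y + Z, not X + Z, unless you remembered to copy X first.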

Aesthetic reasons

A final reason to make your value classes immutable is aesthetic.

When writing a description of an operation using mathematical syntax, we write:

result = function(operand1, operand2)

or

result = operand1 operator operand2 

In both cases, the mathematical notation is modelling a case where "operand1" and "operand2" are immutable and a new value "result" is created. To closely reflect this model, "function" and "operator" must act on the operands in an immutable fashion.

When to avoid immutable objects

Any of the following traits can rule out the use of immutable design:

  • Very large data size - since any change to an immutable object requires copying
  • Lazily or progressively constructed data - since immutable objects require all data at construction
  • Classes containing structured data, especially where child elements of the structure are exchanged or manipulated
  • Where a class is intended as a container for shared state

This normally rules out most classes which can't be described as values. For example, you'll never see an immutable application control, view or document state class, like Windows or Views or Documents.

Compound objects like collections (NSArray, NSSet, NSDictionary, etc) regularly exist in both mutable and immutable forms. They are normally mutable when they are progressively constructed, contain large data sets or are used for maintaining shared structural state. But they are often immutable to take advantage of immutable traits (read-only internal access and thread safety).

Some value objects with large data or structural values (large blocks of text for example) can create a gray area where the choice can be hard to make. In the case of text, Cocoa includes the immutable NSString and NSAttributedString as well as the mutable NSMutableString and NSTextStorage to cover different sizes and usage patterns for text.

Conclusion

As a word, "immutable" may sound like it is removing potential functionality from a class. In reality, using it where appropriate can make your interfaces between classes cleaner and easier to control. It also allows you to write code which closely reflects how values behave.

I have spent most of this post talking about immutable "value" classes. This is because classes which model entities that can be described as "values" are the canonical case for immutable classes.

Values are a good match for "immutable" design because an operator which acts on a value usually creates a completely new value. Values are also typically small objects (a few bytes to a few kilobytes).

Magic Number Machine's BigFloat class is still mutable since I never took the time to rewrite it. It didn't prevent the program working, it just required lots of careful copying and careful passing between classes. It also made the code for invoking operators less aesthetically pleasing.