Cocoa with Love

Advanced programming tips, tricks and hacks for Mac development in C/Objective-C and Cocoa.

A big weakness in Objective-C's weak typing

We generally assume that we can send any message we want to a variable in our code typed as "id" and Objective-C's dynamic message handling will make the invocation work correctly at runtime. In some rare cases, this assumption is wrong. I'll look at situations where you need to be careful about sending messages to "id" typed variables and a situation where a limitation in the Objective-C language requires a hideous workaround to avoid serious bugs.

Introduction

We generally assume we can send any message we like to an "id" variable (or a Class variable). In fact, that's the real purpose of the "id" type: it is the "any" type in Objective-C to which any Objective-C message may be sent.

We use this in lots of different situations but one of the most common is sending messages to objects stored in an NSArray:

NSString *description = [[someArray objectAtIndex:0] substringFromIndex:5];

In this code sample, we don't need to cast the result of the objectAtIndex: invocation to an NSString before sending it the substringFromIndex: message — we know that as long as the object at index 0 actually is an object that responds to the substringFromIndex: selector, it will work.

This post is about invoking methods on either of Objective-C's two "weak" types: id or Class. This post does not apply if the variable you invoke a method on is typed to anything else (even id<SomeProtocol> will count as "anything else" and avoid the issues in this post).

False assumptions

The false assumption often applied here is that the compiler doesn't need to know any type information. In reality, even though the method lookup happens at runtime, that's only enough to ensure the correct method is invoked; it is not enough to ensure the parameters and return value are passed correctly.

The compiler definitely does need to infer some information about the method signature involved. Even though the compiler does not necessarily need to know the type of the id, it does need to know the byte lengths of all parameters and the explicit type of any return values. This is because marshalling of the parameters (pushing and popping them from the stack) is configured at compile time.

Normally, we don't need to take any steps for this to happen though. The parameter information is obtained by looking at the name of the method you're trying to invoke, searching through the included headers for methods matching the invoked method name and then getting the parameter lengths from the first matching method it finds.

99.99% of the time, there's no problem with this: even if there's ambiguity about the exact method you're really targeting, the parameters are likely to be the same between matching methods, because method names in Objective-C generally imply the types of the data. A name conflict of this kind is therefore unlikely to produce a difference in signature.

And then there's that other 0.01% of the time...

Catastrophic failure

Imagine you have class MyClass with an instance method named currentPoint that returns an int. You want to get the currentPoint from an object stored in an array, so you use the code:

int result = [[someArray objectAtIndex:0] currentPoint];

When you run the code, you know the exact value returned from invoking currentPoint on the first object in the array should be zero (because you set it to zero and you can see in the debugger that it is still zero) but the value that ends up in the result is 2,147,483,647 (or some other partial garbage value).

What has gone wrong?

The correct method is invoked at runtime. The problem is that the compiler marshalled the parameters for this invocation incorrectly, corrupting the return value.

The compiler needs to push parameters onto the stack correctly before the message send and perform the message send using the correct variant of objc_msgSend to get the return value back afterwards. This is what has failed.

The compiler prepares parameters using the method signature (which it gets by looking at the type of the receiver and all method names that are valid for the receiver) and trying to work out which method you're likely to be invoking. Since the type of the receiver (i.e. the result from objectAtIndex:) is just id then we have no explicit type information so the compiler will look through the list of all known methods.

Unfortunately, instead of our MyClass method, the compiler decided to match against the NSBezierPath method named currentPoint and has prepared the parameters to match that method's signature. The NSBezierPath method returns an NSPoint, which is a struct, and a struct return value is handled very differently from the int our real method actually returns. This has led to our return value getting corrupted.

I use the example of a struct value return type here because it is the most likely to generate a bug, since a struct return type causes the compiler to generate an objc_msgSend_stret for the message invocation instead of a regular objc_msgSend. However, it is also possible to create problems with non-return parameters of different lengths, particularly if the conflict is between floating point and non-floating point parameters or struct and non-struct parameters.
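The split between runtime lookup and compile-time marshalling can be sketched in plain C, with no Objective-C runtime involved. This is only an analogy: method_entry, lookup_imp and the two currentPoint stand-ins are hypothetical names, and a table of function pointers is far simpler than real message dispatch. The lookup finds the right function at runtime, but the caller still has to pick the signature at compile time.

```c
#include <stddef.h>
#include <string.h>

/* Untyped function pointer: all that a runtime lookup can hand back. */
typedef void (*generic_imp)(void);

typedef struct { double x, y; } point_t;

/* Two unrelated "classes" sharing a selector name but not a return type. */
static int     counter_current_point(void) { return 0; }
static point_t path_current_point(void)    { point_t p = { 3.0, 4.0 }; return p; }

typedef struct {
    const char *class_name;
    const char *selector;
    generic_imp imp;
} method_entry;

static const method_entry method_table[] = {
    { "Counter", "currentPoint", (generic_imp)counter_current_point },
    { "Path",    "currentPoint", (generic_imp)path_current_point },
};

/* Runtime lookup: finds the right function but knows nothing about types. */
static generic_imp lookup_imp(const char *class_name, const char *selector)
{
    size_t count = sizeof(method_table) / sizeof(method_table[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(method_table[i].class_name, class_name) == 0 &&
            strcmp(method_table[i].selector, selector) == 0) {
            return method_table[i].imp;
        }
    }
    return NULL;
}

/* The caller chooses the signature at compile time. Calling the Counter
   function through the point_t signature would be undefined behaviour. */
static int counter_point(void)
{
    return ((int (*)(void))lookup_imp("Counter", "currentPoint"))();
}

static point_t path_point(void)
{
    return ((point_t (*)(void))lookup_imp("Path", "currentPoint"))();
}
```

Here counter_point() and path_point() both succeed because each cast matches the function that the lookup returns. Swapping the casts corresponds to the bug described above.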

Fixing the problem (most of the time)

The best way to fix the problem is to add additional type information about the receiver, so that there will only be 1 possible match for the method name.

We do this by casting the receiver:

int result = [(MyClass *)[someArray objectAtIndex:0] currentPoint];

A MyClass receiver has only one match for currentPoint so there is no possible conflict with the NSBezierPath method and the problem here is solved.

Why is this allowed to happen? Why isn't there a compiler error?

Technically, there is a compiler warning that will alert you to this category of problem. The compiler warning "Strict Selector Matching" (a.k.a. -Wstrict-selector-match) will tell you when the compiler finds multiple methods with the same name but different signatures while you're invoking a method on either of Objective-C's two weak types (id or Class).

It would be great if Strict Selector Matching always worked and we could turn it on at all times. That Apple don't turn it on by default is either because they consider the problem rare enough to ignore or it is an admission of the significant limitations of the warning as it currently behaves:

  1. It will over-warn you. There's no reason to care about a conflict between two methods with different but totally compatible signatures, but this compiler warning will still occur.
  2. Plenty of Apple's own methods will cause spurious warnings due to point (1). I'm looking at you, different implementations of objectForKey:, count and most methods in NSNotificationCenter versus NSDistributedNotificationCenter. These spurious warnings may force you to carefully typecast large numbers of method calls that won't actually cause any problem.
  3. It will not warn you about conflicts between a class and instance methods. This one is a bit absurd since a Class object is regularly handled as a generic id.
  4. It won't help if you haven't imported the correct definition at all. If you failed to import the declaration of the correct method but you did import the declaration of a different method with a matching name but different signature, then I'm not sure the compiler could warn you about this problem.

A scenario where casting won't fix the problem

Imagine in the example above, the problem was between two class methods, instead of two instance methods.

i.e. instead of a conflict between -[NSBezierPath currentPoint] and -[MyClass currentPoint], the conflict was between +[NSBezierPath currentPoint] and +[MyClass currentPoint].

For the instance methods, we fixed the problem by casting to the specific object type required but when your objects are both Class, it is not possible to cast to the specific Class involved. Seriously: you cannot cast classes in Objective-C.

I consider this a serious failing of Objective-C. It makes avoiding this scenario with conflicting class method names hideous. If you're not able to change the name of the method, then the only workaround looks like this:

int result = objc_msgSend([someArray objectAtIndex:0], @selector(currentPoint));

That's right: we would need to bypass the compiler entirely and insert the objc_msgSend call ourselves.

It gets worse though. Since objc_msgSend uses a variable argument list by default, if the parameters you need to pass to (or receive from) objc_msgSend are not signature compatible with a variable argument list, then you'll need to fully cast the objc_msgSend yourself to make certain that the parameters are passed correctly:

SomeReturnType result =
    ((SomeReturnType(*)(id, SEL, float, short, WeirdStruct))objc_msgSend)
    (
        [someArray objectAtIndex:0],
        @selector(methodWithMultipleVariables:thatAreNot:varArgCompatible:),
        someFloat,
        someShort,
        someWeirdStruct
    );

And if SomeReturnType is a struct, you'll need to use objc_msgSend_stret instead, and for floating point return types you may need to use objc_msgSend_fpret (whether it is required depends on the architecture).
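The reason for the full cast can be shown in plain C. This is a rough sketch that uses a generic function pointer in place of objc_msgSend; generic_fn, scaled_sum and call_scaled_sum are hypothetical names. In a variadic call, a float is promoted to double and a short to int, so the callee would receive its arguments in a different form from the one it was compiled to expect. Casting to the exact signature avoids that.

```c
/* Generic pointer type standing in for objc_msgSend's untyped prototype. */
typedef void (*generic_fn)(void);

/* Hypothetical target whose parameters are not varargs-compatible:
   float and short would be promoted in a variadic call. */
static double scaled_sum(float scale, short count, double base)
{
    return base + (double)scale * (double)count;
}

/* The exact signature the callee was compiled with. */
typedef double (*scaled_sum_fn)(float, short, double);

static double call_scaled_sum(generic_fn fn, float scale, short count, double base)
{
    /* Cast back to the precise type before calling, so the compiler
       marshals each argument and the return value correctly. */
    return ((scaled_sum_fn)fn)(scale, count, base);
}
```

A call such as call_scaled_sum((generic_fn)scaled_sum, 0.5f, 4, 1.0) passes the float and the short unpromoted, exactly as scaled_sum expects them.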

Conclusion

I would like to give a blanket suggestion that you switch on "Strict Selector Matching" in your Workspace/Project/Target settings for every project but unfortunately, the limitations of this warning make that suggestion difficult.

Spurious warnings from Apple's code every time you try to use objectForKey: (or a large group of other methods) on an id typed variable can be infuriating and pointlessly increase your workload.

The warning doesn't catch all problems because it doesn't check conflicts between methods in the Class and id spaces. Combined with the nuisance of spurious warnings, the argument that you could skip the warning (and simply keep this potential problem in mind) is probably valid.

The warning itself should be fixed in GCC and Clang to make it useable enough to leave on in all situations. It shouldn't give a warning when the parameters in conflicting signatures are compatible. It should also gain cross checking between Class and id when the type in the code is id.

And Objective-C really needs a way to declare a type that is a specific Class. Even syntax as ugly as @classtype(SomeClass) would do it but I'm sure something more graceful could be found.

I've seen people argue that if you ever require a specific Class then you're designing things badly. I think the bugs caused by method name conflicts are a counterexample: if you can't rename either method, you may not be able to cleanly design your way around this serious problem without the ability to cast a Class to something more specific.

In addition to allowing a conflict between two Class methods to be resolved more gracefully, it would also help in the unrelated situation where a method wants a Class object passed as a parameter but that Class must be a subclass of a specific class.

An RSS-feed and location-based iOS application

The purpose of this post is so that I will have a link to give people when they ask: how do I write an iOS application that pulls data from an RSS feed, displays it pretty and can put things on a map. I'll show you all of that and more as I rewrite my oldest iOS application from scratch: FuelView.

The app presented in this post, FuelView, is freely available on the Australian App Store.

Introduction

I am commonly asked to write a post where I show a basic "pull data from the network and display" application.

But I think "basic" is boring and I try to avoid it in my blog posts. Instead, I decided to rewrite an application that looks simple but actually has a deceptively large amount of work to perform. It's a far better example of a real-world iOS application since there should always be enough work done transparently by your application that it leaves the user with a "Wow!"

The actual application is only 3 screens. While a truly simple three screen application might take about 2-3 hours per screen to implement (even if you're just stumbling your way through), this program actually took me about twice that time to implement (around 15 hours or two full days of programming) and I definitely knew what I was doing, since I've written the entire program before.

If you're curious though, in its first incarnation, this program took me 2 weeks to fully implement — way back in July 2008. It was the first program I ever tried to write for iPhone OS 2.0 and I did a lot of things the wrong way. Having a vague idea what you're trying to do really does make a 5 to 10 times difference to implementation time and unfortunately, the only way to learn is to stumble headlong into things and run the risk of getting everything wrong the first time.

There's a reason why this version is a rewrite from scratch.

Useful code in this post

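While the application in this post is only directly useful to people living in Western Australia, I think this is a really interesting project as it contains a lot of very useful snippets of code (some of which I've written posts on before) including:

  • An iOS version of my Gloss Gradient drawing code
  • Two persistent stores with one NSPersistentStoreCoordinator
  • A full set of "single line Core Data fetch" methods
  • Getting the GPS location
  • Pulling data from an RSS feed
  • Caching data in the Application Support directory
  • Function to create a two point CGGradientRef from two UIColors
  • An example of using a category as an Adapter interface
  • Scrolling a text field that isn't in a table
  • A Core Data Postcode database
  • A flexible, reusable controller/table/cell structure
  • A CheckmarkCell that self-manages radio button style selection
  • Forward geocoding using Google's Maps API

Plus a whole lot more. It really is a densely packed little program.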

So where is all the "useful code"? If you skip forward to the second last section, I reveal where in the program you can find all of these code samples.

About FuelView

FuelView is an application for getting fuel (petrol/gasoline, diesel, etc) prices in Western Australia. It pulls its information from an RSS feed provided by the Western Australian government's "Fuel Watch" scheme that provides fuel prices for all stations in that state.

(No, I don't live in Western Australia — I'm on the opposite coast of Australia — but it seemed like a good idea for an iPhone app at the time.)

The application looks like this:

[Image: FuelView screenshots]

Download the complete project associated with this post: FuelView.zip (330kb).

Note: the code is all freely available under a zlib-style license but this license does not extend to the other assets. You may not use the icons or application name in your own programs.

The previous version of FuelView (1.1.10) is available for free from the iTunes App Store in Australia. I'll be resubmitting this version (probably 1.2) in a week or two.

Deceptive complexity

It all looks very simple: you get the location from the GPS, pull the correct RSS feed for the location and stick pins in a map for the result.

If that were the actual number of steps involved, it would be great. However, it's not so simple. Let's take a quick look at the issues that will cause the most trouble for this program.

User location issues

The GPS gives latitude/longitude but getting the RSS feed requires a Western Australian suburb name. In order to make this work, you need to be able to look up all the Western Australian suburbs and find the nearest one for your longitude/latitude. This requires a database of suburb names with their longitude/latitude and some code to search it.

Additionally, I want to be able to let the user specify a postcode for their location instead of using the GPS. Again, I need to be able to look up the suburb name for the postcode. Any code that requires the user's location must also be able to tolerate the location being a postcode, not a raw longitude/latitude.

A further complication arises because the FuelWatch RSS feed only exists for "larger" suburbs. Names of smaller suburbs can't be used, so the list of suburbs must be filtered to match the list that the FuelWatch website recognizes.
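The nearest-suburb search described above can be sketched as a linear scan. This is not the FuelView implementation (which keeps its suburbs in Core Data); it is a minimal C illustration over an in-memory array, and the suburb struct, its has_feed flag and nearest_suburb are hypothetical names.

```c
#include <stddef.h>

typedef struct {
    const char *name;
    double latitude;
    double longitude;
    int has_feed;   /* non-zero if the feed recognizes this suburb */
} suburb;

/* Returns the nearest suburb that has a feed, or NULL if none qualifies.
   Squared differences in degrees are enough to rank candidates over a
   small area, so no square root is needed for the comparison. */
static const suburb *nearest_suburb(const suburb *suburbs, size_t count,
                                    double latitude, double longitude)
{
    const suburb *best = NULL;
    double best_score = 0.0;
    for (size_t i = 0; i < count; i++) {
        if (!suburbs[i].has_feed) {
            continue;   /* smaller suburbs the feed doesn't know about */
        }
        double dlat = suburbs[i].latitude - latitude;
        double dlon = suburbs[i].longitude - longitude;
        double score = dlat * dlat + dlon * dlon;
        if (best == NULL || score < best_score) {
            best = &suburbs[i];
            best_score = score;
        }
    }
    return best;
}
```

Filtering inside the loop means a small suburb can never be returned, even when it is the closest entry in the list.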

Station location issues

The RSS feed gives the fuel station locations as street addresses but I need longitude and latitudes for them so I can stick pins in the map or calculate the distance between the user and the station.

While this is a similar problem to resolving the user's location, it is actually trickier since new stations appear all the time, so this database must be dynamic (unlike the static postcode database).

I need to actually perform geocoding of fuel stations but there's a problem: Google's APIs are highly restrictive about how often you can make requests. The only way to avoid problems is to pre-populate the database and then have users only perform forward geocoding requests when an unknown fuel station appears and immediately add the new station to the database.

Custom drawing

Then I have the more straightforward complexity of custom drawing. I want to have most of the interface follow a custom aesthetic (because plain is boring) and that takes time and effort. I also need to ensure that drawing and layout function properly on an iPad and an iPhone.

The real design of the program

[Image: FuelView design diagram]

In the diagram here, the green objects are the 3 main view controllers, the orange objects are the main "model" of the program (being the data that is actually displayed and passed between the ResultsViewController and the MapViewController) and the blue objects are the data controllers or functional pipelines of the program.

You can see some of the "deceptive complexity" here: the ResultsViewController is managing three different kinds of model data, and is also managing input from the CLLocationManager and UITextField. All this is apart from the normal responsibility of showing and displaying the UITableView and its rows.

Initial design of the program

While I'd love to say: I had this entire design ready when I started, it's simply not true. What I actually had was a quick UI design and a quick sketch of the data pipeline.

My initial sketch for the UI looked like this:

[Image: FuelView interface design sketch]

I didn't really need to do this sketch for this program (since the UI is pretty simple) but it's a good way to start all user programs.

[Image: FuelView Initial Dataflow design]

As you can see, in the three years since I wrote FuelView, I had entirely forgotten about the need to resolve postcodes to suburb names and resolve station locations — instead I've simply shown the network pipeline.

This forgetfulness is pretty disappointing. Not only does it make me think the program is taking longer than it should but it ultimately leads to some of the design problems I discuss later.

But we're not there yet.

First implementation iteration

I'm going to refer to the design stages as "iterations". These steps are often called milestones (or mini-milestones when they're this small) but milestone implies you're only going forward, instead of the reality where you normally need to update the existing interfaces on your classes as part of integrating new features — so I prefer to think about it as iterating the program.

The first iteration involved implementing the program as described in the "FuelView Initial Dataflow design" shown above.

  1. Created a new project from my PageViewController template
  2. Pulled in an XMLFetcher
  3. Hardwired the code to pull the RSS feed for a specific suburb
  4. Displayed the list of addresses from the results as a basic text list
[Image: screenshot of the first implementation iteration]

Second design iteration

Now I need to be able to resolve locations from the CLLocationManager to suburb names. I also need to be able to resolve user-entered postcodes to actual suburb names. Finally, there needs to be logic so that a manually entered postcode supersedes the GPS.

[Image: location sources diagram]

Again, I wouldn't ordinarily draw little design diagrams for this but it's a blog post and big walls of text are boring.
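The precedence rule for this iteration (manual entry wins over the GPS) is small enough to sketch directly. This is an illustrative C version, not the ResultsViewController code; the field names loosely follow usingManualLocation and postcodeValue, and falling back to the GPS when the manual postcode is empty is my assumption.

```c
typedef enum { LOCATION_NONE, LOCATION_GPS, LOCATION_POSTCODE } location_source;

typedef struct {
    int using_manual_location;  /* the user has chosen to type a postcode */
    int postcode_value;         /* the typed postcode, 0 if nothing entered */
    int has_gps_location;       /* a GPS fix has been received */
} location_state;

/* A valid manual postcode takes priority; otherwise use the GPS if a fix
   exists; otherwise there is nothing to search with yet. */
static location_source active_location_source(const location_state *state)
{
    if (state->using_manual_location && state->postcode_value > 0) {
        return LOCATION_POSTCODE;
    }
    if (state->has_gps_location) {
        return LOCATION_GPS;
    }
    return LOCATION_NONE;
}
```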

Second implementation iteration

To make the second design iteration work, I went through the following steps:

  1. Modified the CSVImporter from my Writing a parser using NSScanner post to generate a list of Western Australian postcodes with longitude and latitude. At the same time, I needed to filter out suburbs that won't be recognized by the FuelWatch website
  2. Brought in a standard class for managing a Core Data NSManagedObjectContext and used the class to perform lookups on the list of Postcodes
  3. Added code to the ResultsViewController for CLLocationManager GPS locations
  4. Added code to the ResultsViewController to switch between the CLLocationManager and user-entered Postcode data sources

Design Mistake #1

In a general sense, the program is "well-designed" but it still contains two design mistakes. I could fix these mistakes but (a) I'm lazy and (b) I wanted to talk about both of them since I think they're under-discussed design problems.

By this point in the implementation, the first mistake has emerged: the ambiguous overuse of the word "location".

Yes, ambiguous naming is a design mistake. It's not exactly an aspect of design that is thoroughly discussed — probably because it's considered self-evident — but as your program evolves and new functions, roles and elements are added to existing classes, it is sometimes necessary to change the names of the classes.

In this program, the word location has a few different meanings:

  • GPS location from the CLLocationManager (gpsLocation in the program)
  • User-specified postcode (usingManualLocation/postcodeValue in the program)
  • Resolved suburb name used for searching (location property on ResultsViewController, a Postcode object)
  • The location of fuel stations (instances of the Location class)

Yuck, what a nightmare.

The single biggest mistake here is that the location property on the ResultsViewController is a Postcode object, despite the name implying the Location class. This is an ambiguity you should work really hard to avoid — and fix when it occurs.

A far better approach would actually be: rename the PostcodesController to SuburbsController, rename Postcode to Suburb, rename the suburb property on Suburb to name and rename the location property to suburb. In addition, it would be better to rename Location to Station.

Third implementation iteration

Now that the main data path is working, it's time to start on the custom views.

The GradientBackgroundTable will be the main UITableView class in the program. Its name is a bit of a misnomer (though not enough of one to count as a design mistake): the table can draw its background as a gradient but can also draw it as a flat color, so it would be better named "Custom colored background table" or something like that.

Each result row will be represented using the following classes:

  • ResultCell — controller that constructs the rest of the row
  • ResultCellBackground — set as the backgroundView of the cell. Draws the gray gradient background and not much else
  • ResultView — draws all of the text (does not use UILabels) and uses the Gloss Gradient code to construct the backing for the actual price display

This now brings us a main screen that looks like this:

Customdrawing

Note that each row simply shows an orange "cents per litre" line under the price. Ultimately, this should show the distance from the user's current GPS location to the station but since I don't yet have the locations for the stations, I can only show "cents per litre". Note that this "cents per litre" will continue to display in the final program if the user is using a manually-entered postcode (and I don't necessarily know their GPS location).

Fourth implementation iteration

Now I need to resolve station locations. I have the street addresses from the FuelWatch RSS feed results but I need to turn this into longitude and latitude to calculate the distance or stick a pin in a map.

I'll use the Google APIs for this. As I've said though, the Google Maps API won't let you perform a large number of requests per second so I need to aggressively pre-cache station locations and only perform a geocoding request when a new fuel station appears for the first time.

When the station is cached, the LocationsController and the Location lookup generally work like the Postcode lookup on the PostcodesController. Otherwise, I need a callback when the actual response comes back from Google.

The LocationsController normally uses two Core Data data stores: one is read-only and is the "pre-cached" set of fuel station results (shipped with the application). The second is the read-write data store, saved in the Application Support directory (which is writeable, unlike the application's bundle). This second location will get all new locations for which we need to query Google.
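The lookup order (shipped cache, then writable cache, then a geocoding request whose result is stored) can be sketched as follows. This is a simplified, synchronous C illustration rather than the LocationsController code: the real lookup uses Core Data stores and an asynchronous callback, and station_cache, resolve_station and fake_geocode are all hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *address;
    double latitude;
    double longitude;
} station;

#define MAX_ADDED_STATIONS 32

typedef struct {
    const station *shipped;             /* read-only, pre-cached results */
    size_t shipped_count;
    station added[MAX_ADDED_STATIONS];  /* read-write, grows at runtime */
    size_t added_count;
} station_cache;

/* Stand-in for the network geocoding request. Returns non-zero on success. */
typedef int (*geocode_fn)(const char *address, double *latitude, double *longitude);

static const station *find_station(const station *list, size_t count, const char *address)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(list[i].address, address) == 0) {
            return &list[i];
        }
    }
    return NULL;
}

static const station *resolve_station(station_cache *cache, const char *address,
                                      geocode_fn geocode)
{
    const station *hit = find_station(cache->shipped, cache->shipped_count, address);
    if (hit == NULL) {
        hit = find_station(cache->added, cache->added_count, address);
    }
    if (hit != NULL) {
        return hit;   /* cached: no request is made */
    }

    double latitude, longitude;
    if (cache->added_count == MAX_ADDED_STATIONS ||
        !geocode(address, &latitude, &longitude)) {
        return NULL;
    }

    /* Store the new station immediately so it is never requested again. */
    station *slot = &cache->added[cache->added_count++];
    slot->address = address;   /* the caller must keep this string alive */
    slot->latitude = latitude;
    slot->longitude = longitude;
    return slot;
}

/* Fake geocoder for trying the sketch out: counts calls, returns fixed values. */
static int fake_geocode_calls = 0;
static int fake_geocode(const char *address, double *latitude, double *longitude)
{
    (void)address;
    fake_geocode_calls++;
    *latitude = -30.0;
    *longitude = 115.0;
    return 1;
}
```

Resolving the same unknown address twice makes only one geocoding call, which is the behaviour needed to stay under the request limits.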

To prepare the pre-cached file of stations that we can ship with the application, the best approach is simply to run in the iPhone Simulator with the LocationsController's primary data store set to Read/Write (and the read-only store removed). When the primary data store fills with results, let it save to file, then copy the cached results file from the iPhone Simulator's directory back into the Project.

Once the location is resolved to a longitude and latitude, I can calculate the approximate distance to the user's GPS location, using the approximate kilometres per longitude and latitude at 30 degrees latitude (this is not highly accurate but is sufficient given that most of Western Australia is relatively close to 30 degrees latitude).

With the locations available, I can display the distance in the ResultView and color code the distance bar based on how far away the station is.
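A rough sketch of the fixed kilometres-per-degree approximation and the color-coding step, in plain C. The constants are my own approximate values for 30 degrees latitude (about 110.9 km per degree of latitude and 96.5 km per degree of longitude), not values taken from the FuelView source. The band thresholds and all function names are hypothetical. The sketch works with squared distances so it needs no math library; taking sqrt() of the result gives kilometres for display.

```c
/* Assumed constants: approximate kilometres per degree near 30 degrees
   latitude. These are not taken from the FuelView source. */
#define KM_PER_DEGREE_LATITUDE  110.9
#define KM_PER_DEGREE_LONGITUDE 96.5

typedef enum { DISTANCE_NEAR, DISTANCE_MEDIUM, DISTANCE_FAR } distance_band;

/* Flat-earth approximation: scale each axis to kilometres, then apply
   Pythagoras. Returns the distance squared, in square kilometres. */
static double approx_distance_squared_km(double lat1, double lon1,
                                         double lat2, double lon2)
{
    double north_km = (lat2 - lat1) * KM_PER_DEGREE_LATITUDE;
    double east_km  = (lon2 - lon1) * KM_PER_DEGREE_LONGITUDE;
    return north_km * north_km + east_km * east_km;
}

/* Picks a band for color coding. Thresholds are hypothetical (2 km, 5 km)
   and are squared so they compare directly against a squared distance. */
static distance_band band_for_distance_squared(double distance_squared_km)
{
    const double near_km = 2.0;
    const double medium_km = 5.0;
    if (distance_squared_km <= near_km * near_km) {
        return DISTANCE_NEAR;
    }
    if (distance_squared_km <= medium_km * medium_km) {
        return DISTANCE_MEDIUM;
    }
    return DISTANCE_FAR;
}
```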

[Image: main screen with station locations available]

Maps Key Note: I have removed my Google Maps API key from the code. If you want to use this code, you'll need to apply for your own Google Maps API key and set it as the MapsKey at the top of the LocationsController.m file.

Fifth implementation iteration

Implementing the SettingsViewController to switch fuel types and the MapViewController to show the current array of results on a map turns out to be very simple. There's not a significant amount of complexity in either of these views.

Design Mistake #2

One point to notice in the implementation of the MapViewController is that I needed to implement an "Adapter category" on NSDictionary to allow the NSDictionary to respond to the MKAnnotation protocol so I could use the dictionaries to display the pins in the map.

How is this a design mistake?

Needing to put categories on generic classes like this is an indication that you probably should have used a dedicated class to contain your data. The results in the program should not be generic NSDictionary objects.

Until this point, the "generic"-ness of the main data type in the program has been ignored. The reality though is that the construction of NSDictionary results from XPathResultNodes and the resolving of station location for each result has been handled by the ResultsViewController. This is all work that a Result class should be performing for itself, instead of using a generic NSDictionary class and making the ResultsViewController handle all the work.

But ad hoc trickery like adding categories to generic container classes is a big flag that you've forgotten to use a custom class for objects that genuinely need their own behaviors. If you find yourself needing something like this: replace the generic container with a proper custom class.

I'm not saying that adapter categories are a bad idea. Sometimes you can't or shouldn't change the underlying class, and in that case an adapter is a good thing. But here in FuelView I can change the underlying class and should, in order to reduce the code burden on our controllers.

So where is all the "useful code"?

An iOS version of my Gloss Gradient drawing code

The GlossGradients.m code is in the project. It's very similar to the original code except that there aren't HSV conversion methods on UIColor like there are on NSColor, so I've had to write these methods myself. It is used in the ResultView drawing code.
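For reference, an RGB to HSV conversion of the kind mentioned above looks roughly like this in plain C. This is a generic textbook conversion, not the code from GlossGradients.m, and hsv_t and rgb_to_hsv are hypothetical names. All components, including hue, are in the 0.0 to 1.0 range.

```c
typedef struct { double hue, saturation, value; } hsv_t;

/* Converts RGB components (each 0.0 to 1.0) to HSV (each 0.0 to 1.0).
   Hue is reported as 0.0 for grays, where it is undefined. */
static hsv_t rgb_to_hsv(double red, double green, double blue)
{
    double max = red > green ? red : green;
    if (blue > max) max = blue;
    double min = red < green ? red : green;
    if (blue < min) min = blue;
    double delta = max - min;

    hsv_t out = { 0.0, 0.0, max };
    if (max > 0.0) {
        out.saturation = delta / max;
    }
    if (delta > 0.0) {
        /* The hue circle has six sectors; find the position from 0 to 6. */
        double sector;
        if (max == red) {
            sector = (green - blue) / delta;
        } else if (max == green) {
            sector = 2.0 + (blue - red) / delta;
        } else {
            sector = 4.0 + (red - green) / delta;
        }
        if (sector < 0.0) {
            sector += 6.0;
        }
        out.hue = sector / 6.0;
    }
    return out;
}
```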

Two persistent stores with one NSPersistentStoreCoordinator

The LocationsController uses two different persistent stores. A read-only store inside the application bundle, shipped with the application, contains the pre-supplied results for station lookups. But the application bundle can't be changed, so I create a read-write store in the Application Support directory. The NSPersistentStoreCoordinator is smart enough to save new objects to the writable store automatically.

A full set of "single line Core Data fetch" methods

The NSManagedObjectContext+FetchAdditions.m file contains a range of different fetch request creation methods and single line fetching implementations (for set, array and single object results). It is used in the LocationsController and the PostcodesController to perform the actual queries on the Core Data context.

Getting the GPS location

The ResultsViewController operates as a CLLocationManagerDelegate. The location receiving code is pretty simple but I think the error handling code in locationFailedWithCode: is more interesting.

Pulling data from an RSS feed

Of course, an RSS feed is just XML. We're after the <item> nodes in the result so I use an XMLFetcher with an XPath query of "//item". You can see this in the setLocation: method and the response is handled in the responseReceived: method (the XPathQueryNodes are turned into an NSDictionary).

Caching data in the Application Support directory

The application support directory is accessed/created in the persistentStoreCoordinator method of the LocationsController. As I said above, this is for writing extra Locations to the Locations Core Data context.

Function to create a two point CGGradientRef from two UIColors

The function TwoPointGradient is pretty simple; it just creates a CGGradientRef taking two UIColors to use as the endpoints of the gradient. However, it's 23 lines of code that don't need to be retyped in the ResultCellBackground, ResultView and the PageCellBackground.
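TwoPointGradient itself wraps Core Graphics, which isn't reproduced here. As a rough illustration of what a two-stop linear gradient computes at each position, here is the color interpolation in plain C. The names rgba_t and gradient_color_at are hypothetical, and straight linear interpolation of the components is my assumption about the simplest case.

```c
typedef struct { double red, green, blue, alpha; } rgba_t;

/* Color of a two-stop linear gradient at a position from 0.0 (start color)
   to 1.0 (end color). Positions outside that range are clamped. */
static rgba_t gradient_color_at(rgba_t start, rgba_t end, double position)
{
    if (position < 0.0) position = 0.0;
    if (position > 1.0) position = 1.0;

    rgba_t out;
    out.red   = start.red   + (end.red   - start.red)   * position;
    out.green = start.green + (end.green - start.green) * position;
    out.blue  = start.blue  + (end.blue  - start.blue)  * position;
    out.alpha = start.alpha + (end.alpha - start.alpha) * position;
    return out;
}
```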

An example of using a category as an Adapter interface

Putting an adapter on a generic container class is a bad idea if you can easily change the class to something of your own implementation. But this is still an example of adapting a class' interface to suit your own needs — something that is very useful when you don't have control over the underlying class.

Scrolling a text field that isn't in a table

The manually entered postcode is in a text field on a UIToolbar and when the keyboard appears, the entire UIToolbar scrolls up with the keyboard. This behavior is handled by the PageViewController (everything below the "Handle the sliding/scrolling of the view when the keyboard appears" pragma except the dealloc method). The PageViewController needs to be set as the delegate of the UITextField for this to work.

A Core Data Postcode database

The PostcodesController shows how to implement a static data store using Core Data. I think I could probably write a base-class for this type of singleton that would dramatically reduce the common code between the PostcodesController and the LocationsController.

A flexible, reusable controller/table/cell structure

The PageViewController, PageCell, all the view controllers and all the table cells are directly based on the code I presented in UITableView construction, drawing and management.

A CheckmarkCell that self-manages radio button style selection

It's strongly reliant on the PageViewController and PageCell classes but the CheckmarkCell shows an easy way (easy for the rest of the program) to manage a section in a table where only one row can be selected.
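The rule behind radio button style selection is simply "checking one row unchecks the rest of the section". A minimal C sketch of that rule, independent of the CheckmarkCell, PageViewController and PageCell classes (select_row is a hypothetical name):

```c
#include <stddef.h>

/* Marks selected_row as checked and every other row in the section as
   unchecked. An out-of-range row leaves the selection unchanged. */
static void select_row(int *checked, size_t row_count, size_t selected_row)
{
    if (selected_row >= row_count) {
        return;
    }
    for (size_t i = 0; i < row_count; i++) {
        checked[i] = (i == selected_row);
    }
}
```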

Forward geocoding using Google's Maps API

The locationForAddress:receiver: method in LocationsController performs an XML request on Google's Maps API to forward geocode addresses into longitude and latitude (the response is handled in mapsResponseReceived:). Again: you'll need your own Maps API Key to make this work.

Conclusion

Download the complete project associated with this post: FuelView.zip (330kb).

Note: the code is all freely available under a zlib-style license but this license does not extend to the other assets. You may not use the icons or application name in your own programs.

The app presented in this post, FuelView, is freely available on the Australian App Store.

I did this rewrite of FuelView for three reasons:

  • I hadn't written any code at work for a week or two — I needed to scratch my codemonkey itch with a few hours of actual programming instead of meetings and documentation.
  • Preparing code like this for my blog motivates me to review my reusable classes and make them more presentable.
  • I'm often asked to show an iOS program that pulls data from an RSS feed.

The first two points are entirely for my own purposes and went fine, thanks.

On the third point: this is definitely code that pulls from an RSS feed. I hope the scale of the program doesn't make it hard to see how the basics work. In addition to RSS feed handling though, there's a lot of other code here so I hope there's something here for programmers at a range of different skill levels.

Classes for fetching and parsing XML or JSON via HTTP

In this post I show two reusable classes for fetching data via HTTP: one that parses the result as XML and another that parses as JSON. These are relatively simple tasks but due to the number of required steps, they can become tiresome if you don't have robust, reusable code for the task. These classes will work on iOS or on the Mac but the optional error alerts and password dialogs are only implemented for iOS.

Introduction

In my experience, "fetching data via HTTP" is probably the second most common task that iOS applications perform after "displaying a list of things in a table". Since I wrote a recent post showing how I handle display in tables, showing my reusable classes for fetching via HTTP seemed like a reasonable follow up.

As with the post on UITableView management, this post is all about trying to make the HTTP fetching, handling and processing as simple and reusable as possible.

What I hope to demonstrate is that even though the Cocoa API makes it look like you need to bolt NSURLConnection delegate methods onto your own classes every time you need a network connection, you don't actually need to repeat all this work each time. For the most common tasks like this, you should develop your own, reusable approaches that you like, that serve your needs and that make new code easier.

There are lots of alternative approaches around that demonstrate similar ideas. My implementation is a simple implementation compared to full frameworks (for a more thorough implementation along similar lines, you may want to look at RestKit). I hope you'll still be able to see the contrast compared to ad hoc solutions though, especially if you've ever jammed HTTP communication into your projects without thinking about keeping the interface clean and simple.

You can download the four classes discussed in this project: HTTPXMLJSONFetchers.zip (16kB)

HTTP connections in Cocoa

BSD sockets and CFHTTPStream are generally too low level to use regularly. Unless your program requires meticulous control of the network layer, you probably want to use NSURLConnection for handling HTTP fetching.

Technically, NSURLConnection can perform network connections in a single instruction: +[NSURLConnection sendSynchronousRequest:returningResponse:error:]. Synchronous connection should be avoided in all but a few rare worker-thread situations because it blocks the calling thread (stalling your program's user interface if that is the main thread) and it doesn't allow careful error handling.

This means that when fetching via HTTP, you should be using NSURLConnection's delegate methods. The delegate methods are:

- (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response
- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
- (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error
- (void)connection:(NSURLConnection *)aConnection didReceiveAuthenticationChallenge:(NSURLAuthenticationChallenge *)aChallenge
- (void)connectionDidFinishLoading:(NSURLConnection *)connection

and a thorough implementation means implementing all 5 of these methods.

A commonly seen case for this is to add the NSURLConnection delegate methods to your UITableViewController and make that view controller manage the connection.

While this might seem like a good idea (the view controller can track the status of the connection and provide visual updates and also present its own errors) the reality is that fully handling the connection takes a lot of code. How much code? The code I use is 530 lines long (including comments and spacing).

But there's also a more serious problem: bolting NSURLConnection to your UITableViewController limits code reuse. If your network code is tied closely to the view controller, there's more work involved in adding network behaviors to other view controllers or parts of your program.

Why do NSURLConnection delegates take so much code to implement? In the simplest case, they don't (you could probably manage a connection in 20 lines or so) but you'd be overlooking a lot of subtler behaviors. Errors, password authentication, cancelling the connection cleanly and offering simple construction versus meticulous construction are the type of behaviors that get left out if you're rewriting the code every time or operating under serious time constraints.
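To make the "20 lines or so" case concrete, here is a minimal sketch of a bare-bones delegate. It is not code from the download: MinimalFetcher and its properties are hypothetical names, and the sketch assumes ARC (under manual retain/release you would add a dealloc that releases the properties). It deliberately leaves out everything listed above: authentication, clean cancellation and error presentation.

```objc
#import <Foundation/Foundation.h>

// MinimalFetcher is a hypothetical name used only for this illustration.
// It accumulates the response body and records success or failure.
@interface MinimalFetcher : NSObject
@property (nonatomic, strong) NSMutableData *receivedData;
@property (nonatomic, strong) NSError *failure;
@property (nonatomic, assign) BOOL finished;
@end

@implementation MinimalFetcher

- (void)connection:(NSURLConnection *)connection
    didReceiveResponse:(NSURLResponse *)response
{
    // This can arrive more than once (for example after a redirect),
    // so start again with an empty buffer each time.
    self.receivedData = [NSMutableData data];
}

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
{
    // Data arrives in pieces; append each piece in order.
    [self.receivedData appendData:data];
}

- (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)anError
{
    // Discard any partial data and remember why the connection failed.
    self.receivedData = nil;
    self.failure = anError;
    self.finished = YES;
}

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    self.finished = YES;
}

@end
```

An instance of this class would be passed as the delegate to -[NSURLConnection initWithRequest:delegate:]. Everything it skips is what pushes a thorough implementation well past these few methods.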

HTTPFetcher

The idea behind my HTTPFetcher class is really simple: it's a reusable NSURLConnection delegate. It handles all the NSURLConnection delegate work and calls back when it has the results. It provides default error handling and password authentication, and while it has a very simple default constructor, it still provides enough hooks that you can customize its behavior.

The interface to the class is really just construction methods, a start, a cancel and some properties. The assign properties are for configuring the connection before you start it. The readonly properties are for gathering information once the connection is complete.

@interface HTTPFetcher : NSObject <UITextFieldDelegate>

@property (nonatomic, readonly) NSData *data;
@property (nonatomic, readonly) NSURLRequest *urlRequest;
@property (nonatomic, readonly) NSDictionary *responseHeaderFields;
@property (nonatomic, readonly) NSInteger failureCode;
@property (nonatomic, assign) BOOL showAlerts;
@property (nonatomic, assign) BOOL showAuthentication;
@property (nonatomic, assign) void *context;

- (id)initWithURLRequest:(NSURLRequest *)aURLRequest
    receiver:(id)aReceiver
    action:(SEL)receiverAction;
- (id)initWithURLString:(NSString *)aURLString
    receiver:(id)aReceiver
    action:(SEL)receiverAction;
- (id)initWithURLString:(NSString *)aURLString
    timeout:(NSTimeInterval)aTimeoutInterval
    cachePolicy:(NSURLCacheStoragePolicy)aCachePolicy
    receiver:(id)aReceiver
    action:(SEL)receiverAction;
- (void)start;
- (void)cancel;

@end

You initialize the class in whatever way you choose (the middle init method shown here is the simplest), optionally configure the class (the most common configuration is to set the context pointer so that when the connection completes, you can remember where to set the data), start the connection and then it will invoke the receiverAction on your receiver object (the receiver action takes one parameter: the HTTPFetcher itself).

// Example fetcher creation
fetcher = [[HTTPFetcher alloc]
    initWithURLString:@"http://some-domain.com/some/path"
    receiver:self
    action:@selector(receiveResponse:)];
[fetcher start];

// Example fetcher response handling
- (void)receiveResponse:(HTTPFetcher *)aFetcher
{
    NSAssert(aFetcher == fetcher,
        @"In this example, aFetcher is always the same as the fetcher ivar we set above");
    if ([fetcher.data length] > 0)
    {
        [self doSomethingWithTheData:fetcher.data];
    }
    [fetcher release];
    fetcher = nil;
}

Ordinarily, your program will want to customize the code that presents the errors and make the presentation consistent to your application. You can do this with the HTTPFetcher class by either subclassing or editing the class itself or you can disable the alerts and authentication functionality and perform the work outside the class. However, if you don't have time to do this customization, there is default behavior in the class that will suffice.

HTTPFetcher memory management: the HTTPFetcher does not retain itself while running and does not retain the receiver. This is because the expected behavior is that the receiver retains the HTTPFetcher and we don't want a retain cycle. If you create the HTTPFetcher and don't have a retain count on it, it will immediately auto-cancel itself and dealloc.

XMLFetcher

The HTTPFetcher is fine if you simply want the data from an HTTP connection. For my own purposes though, I've never used the HTTPFetcher on its own — I've always used it as the base-class for classes which post-process the HTTP data before invoking the receiver's callback method.

The XMLFetcher class is for turning an XML response into something more useful. Instead of needing to look at the data property of the HTTPFetcher, you can use the results property which is the array of nodes matching a given XPath query on the XML result.

@interface XMLFetcher : HTTPFetcher

@property (nonatomic, copy, readonly) NSString *xPathQuery;
@property (nonatomic, retain, readonly) NSArray *results;

- (id)initWithURLString:(NSString *)aURLString
    xPathQuery:(NSString *)query
    receiver:(id)aReceiver
    action:(SEL)receiverAction;

@end

I've previously spoken about how I'm not a fan of the event-driven model (sometimes called a SAX parser) promoted by Apple in the iOS API. It is certainly memory efficient and faster for large files but it requires you perform your own structured handling which is tiresome, prone to mistakes and not really reusable. I personally prefer a document-based model like the NSXML API that exists in Mac OS X but not in iOS.

The XMLFetcher class blends libXML-based XPath parsing and querying with the HTTPFetcher.

However, I've addressed a number of the shortcomings of my previous libXML-based parsing. The biggest problem with that earlier code was that it simply packaged the XML into NSDictionarys (which is inelegant at best) — so instead, the results are now a dedicated XPathResultNode class which can cleanly represent attributes, childNodes and contentStrings. There's also better handling of content strings either side of subnodes and concatenating of text data spread over subnodes.

@interface XPathResultNode : NSObject

@property (nonatomic, retain, readonly) NSString *name;
@property (nonatomic, retain, readonly) NSMutableDictionary *attributes;
@property (nonatomic, retain, readonly) NSMutableArray *content;

+ (NSArray *)nodesForXPathQuery:(NSString *)query onHTML:(NSData *)htmlData;
+ (NSArray *)nodesForXPathQuery:(NSString *)query onXML:(NSData *)xmlData;

- (NSArray *)childNodes;
- (NSString *)contentString;
- (NSString *)contentStringByUnifyingSubnodes;

@end

XPath query note: XPath queries can be a little difficult to get used to — if you're not accustomed to XPath, it can be hard to extract the exact nodes you want. Like regular expressions though, they're a highly specialized language for extracting data and once you understand the different functions available, they are the quickest way of getting specific nodes out of XML.
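For readers new to XPath, a small sketch may help show what a query selects. This one uses NSXMLDocument (the Mac-only NSXML API mentioned earlier) rather than the libXML-based XPathResultNode from the download, purely because it needs no extra project settings. The function name StringsForXPathQuery is hypothetical and the sketch assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Returns the string values of every node matching an XPath query,
// or nil if the XML cannot be parsed or the query fails.
// Mac only: NSXMLDocument is not available in iOS.
static NSArray *StringsForXPathQuery(NSString *query, NSString *xml)
{
    NSError *error = nil;
    NSXMLDocument *document =
        [[NSXMLDocument alloc] initWithXMLString:xml options:0 error:&error];
    if (document == nil)
    {
        return nil;
    }

    // Each matching node is an NSXMLNode; collect their text content.
    NSArray *nodes = [document nodesForXPath:query error:&error];
    return [nodes valueForKey:@"stringValue"];
}
```

Given a feed containing three item elements whose title children are "Unleaded", "Other" and "Diesel", where the first and last items carry the attribute category="fuel", the query //item/title selects all three titles, while //item[@category='fuel']/title selects only "Unleaded" and "Diesel".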

Compiler note: the XPathResultNode.m file contains a comment at the top which explains the Xcode compiler settings required to make it work. Basically, you need to include libxml in the include path and link your project with libxml2.dylib.

JSONFetcher

The JSONFetcher is really just the same idea as the XMLFetcher — parse the result from HTTPFetcher once complete, this time as JSON data.

The class I've written relies on SBJSON, Stig Brautaset's BSD-licensed JSON parsing library. You will need to download these files separately and include them in your project (it's 3 .m files and 4 .h files).

SBJSON isn't your only option for JSON handling in iOS or Mac OS X. There are a few other JSON libraries for iOS and Mac discussed here on Stackoverflow if you'd prefer options. Obviously though, you'd need to make minor adjustments to integrate a different parser.

With a JSON response, there's not the same expectation of needing to find a subnode within a larger result (as is the common case for XML), so the JSON parser simply parses the whole JSON structure and returns it all.

@interface JSONFetcher : HTTPFetcher

@property (nonatomic, readonly) id result;

@end
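The "parse everything and hand it back" step can be sketched in a few lines. Note the swap: this sketch uses Apple's NSJSONSerialization (available from iOS 5 and Mac OS X 10.7) instead of SBJSON, so it is not how the downloadable JSONFetcher is written. ParsedJSONResult is a hypothetical function name.

```objc
#import <Foundation/Foundation.h>

// Parses a complete JSON response in one step and returns the top-level
// object (an NSDictionary or NSArray), or nil if the data is empty or
// is not valid JSON.
static id ParsedJSONResult(NSData *data)
{
    if ([data length] == 0)
    {
        return nil;
    }

    NSError *error = nil;
    return [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];
}
```

The caller then walks the returned structure with ordinary objectForKey: and objectAtIndex: calls. There is no query step as there is with XPath, because the whole structure is already in memory.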

Conclusion

You can download the four classes discussed in this project: HTTPXMLJSONFetchers.zip (16kB)

I've presented my classes for handling these tasks. I don't expect that everyone has the same data and network requirements as I do, so there's every chance that you would need very different classes to suit your own exact needs.

The point is really to consider reuse in your own code — how can you evolve your classes so that when you start a new project you need to rewrite as little as possible — you can simply bring in your own class for handling network data, pass different parameters into its constructor and your network connection is done.

Until I had composed these classes for my own purposes, new projects involved hundreds of lines of code that went through a copy, paste, refactor process from existing projects I'd written. While copy, paste, refactor will work, it is slower, more prone to errors and harder to keep up-to-date than properly reusable classes. In most cases, you should view copy and paste as a failure of your own processes. That's a hard rule to adhere to, since copy, paste, refactor is faster than designing a reusable class — or at least it is initially (compared to an up-front design effort). You need to have the discipline to recognize the common behaviors between classes or projects and refactor into shared classes if required.

A final thought: I realize I haven't really shown these classes at work in an example program. If you can't work out how to use them in a real program, please wait a week or two: I plan to share a real-world project that uses them to handle all its network communication.

Presenting a Mac dialog sheet with visual cue effects

In this post, I'll show you how to use visual effects over a window to make a dialog sheet stand out when it is presented over the top. It's a pretty simple use of Core Image but is a useful technique to capture attention when needed.

Introduction

Getting the user's attention when required and drawing their focus to important areas is an important point when trying to iterate and improve your user interface.

Of particular importance: how do you force the user's attention from one part of the screen to another when an important event occurs?

That's a task that the code in this post aims to address. By visually disrupting the normal window area, we inform the user that their attention is briefly needed elsewhere.

Of course, this is not a good thing to do frequently. Forcing the user to shift their attention from one side of the screen to another is generally considered poor form. However, some screen features (especially sheets and alerts) have a fixed location so keeping the activity local is not always possible.

The sample application

Download the Xcode project for this post: PresentSheetWithEffect.zip (190kB)

The sample application shows the following window:

normalwindow.png

When either of the two buttons is pressed, a sheet (either a window loaded from a NIB file or an NSAlert, depending on the button) is presented over the window.

As the sheet is presented, a trio of Core Image effects are applied over the window:

  • Reducing saturation
  • Reducing exposure
  • Applying gloom (darkened glow from dark areas)

The result is shown in this second screenshot:

sheetpresented.png

The aim is to make the alert — which is pretty bland — visually pop out and draw the user's attention to it (since it is modal and they're not allowed to do anything else until they dismiss it).

Visual overlays

Visual disruption of the background is common in iOS. Users of Safari in iOS would be familiar with the dark, semi-transparent overlays over the main webpage area when the keyboard focus is in the Safari address bar. Even Safari on the Mac darkens the main page except for search terms when searching for text within a page.

However, this overlay normally needs to be very dark since the color of the main page behind the overlay is not known and the overlay needs to provide contrast in as many cases as possible.

With Core Image on the Mac, we have a range of different options that don't require turning the screen completely dark. For this post, I've chosen to reduce saturation, take a little out of the brightness and apply a gloom. Of course you might prefer a different selection of filters.

An important consideration with the filters to apply: they shouldn't look too flashy and they shouldn't hurt the eyes of the user who will likely be focussed on a region within the filtered area. For example: I had initially tried a gaussian blur filter but this actually hurts your eyes a little if it is applied while you're trying to focus — you subconsciously try to focus as the blur gradually makes this impossible.

A great big block of code

The following method applies the filters and presents the sheet. The filters will only work if the window's contentView has a Core Animation layer (in this program, the Core Animation is enabled in the XIB file).

- (void)presentSheetWithWindow:(id)aSheetWindow
    delegate:(id)modalWindowDelegate
    didEndSelector:(SEL)didEndSelector
{
    // 'sheetWindow' is an instance variable tracking the currently presented
    // window. If a window is already being presented, dismiss it first before
    // presenting this new one
    if (sheetWindow)
    {
        [self dismissSheetForWindow:sheetWindow];
    }
    sheetWindow = [aSheetWindow retain];
    
    // We're going to fade the effect in
    CATransition *animation = [CATransition animation];
    [animation setType:kCATransitionFade];
    [[[[self window] contentView] layer] addAnimation:animation forKey:@"layerAnimation"];
    
    // The effect will be applied to this new view that we'll lay over the top
    // of everything else
    blankingView =
        [[[NSView alloc] initWithFrame:[[[self window] contentView] bounds]] autorelease];
    [[[self window] contentView] addSubview:blankingView];

    // Construct the three effects
    CIFilter *exposureFilter = [CIFilter filterWithName:@"CIExposureAdjust"];
    [exposureFilter setDefaults];
    [exposureFilter setValue:[NSNumber numberWithDouble:-1.25] forKey:@"inputEV"];
    CIFilter *saturationFilter = [CIFilter filterWithName:@"CIColorControls"];
    [saturationFilter setDefaults];
    [saturationFilter setValue:[NSNumber numberWithDouble:0.35] forKey:@"inputSaturation"];
    CIFilter *gloomFilter = [CIFilter filterWithName:@"CIGloom"];
    [gloomFilter setDefaults];
    [gloomFilter setValue:[NSNumber numberWithDouble:0.75] forKey:@"inputIntensity"];
    
    // Apply the effects to the blankingView layer
    [[blankingView layer] setBackgroundFilters:
        [NSArray arrayWithObjects:exposureFilter, saturationFilter, gloomFilter, nil]];

    // Present the sheet -- different code depending on whether we're presenting
    // a dialog or regular window
    if ([sheetWindow isKindOfClass:[NSAlert class]])
    {
        if (modalWindowDelegate == nil)
        {
            modalWindowDelegate = self;
            didEndSelector = @selector(didEndPresentedAlert:returnCode:contextInfo:);
        }
        [(NSAlert *)sheetWindow
            beginSheetModalForWindow:[self window]
            modalDelegate:modalWindowDelegate
            didEndSelector:didEndSelector
            contextInfo:NULL];
    }
    else
    {
        [[NSApplication sharedApplication]
            beginSheet:sheetWindow
            modalForWindow:[self window]
            modalDelegate:modalWindowDelegate
            didEndSelector:didEndSelector
            contextInfo:NULL];
    }
}

The rest of the sample application

Download the Xcode project for this post: PresentSheetWithEffect.zip (190kB)

The sample application also shows the expected usage: a separate subview controller controls the subview but invokes the window controller to actually present the sheet.

The presentation code is written to account for the fact that different subview controllers may attempt to present errors at different times without any real coordination but the current incarnation does not allow for sheet "stacking" (i.e. if a new sheet is presented, any existing sheet is immediately dismissed). If you need stacked sheets, you'd need to make changes to permit that.

Conclusion

Core Image allows a lot of flexibility with this type of visual effect. Even if you're not trying to be as "cute" as the trio of effects used in this sample post, the ability to do something as basic as turning down the saturation is quite powerful.

Remember: the purpose with visual effects should be to speed up the user experience by drawing focus to where it is needed. If you're slowing the user down, you're doing it wrong.

Background audio through an iOS movie player

Background audio in iOS is supposed to be as simple as entering a setting in your Info.plist and making sure your kAudioSessionProperty_AudioCategory is appropriate. This is true unless your audio is part of a movie file or is played in a movie player that has just played video — suddenly it becomes fiddly, hard to test, unreliable and changeable from version to version of iOS.

Introduction

I was not sure I wanted to write this post. It runs the risk of pointing out that I'm not perfect. But all programs have bugs and my programs are no different.

And anyway, as both Han Solo and Lando Calrissian validly said of the Millennium Falcon's failure to reach light speed, "it's not my fault". Of course, as it was in Star Wars, so it is in real life: your users don't care whose fault it is, they just want it fixed.

Obviously, I develop and sell a product named StreamToMe, available through the iOS App Store, that plays video and music and lists "Background audio" as one of its features. In this post, I'm going to talk about why background audio has worked and then not worked, been fixed and then not worked again only to be mostly fixed with some issues outstanding.

How can a feature that is simple, according to Apple's documentation, cause such a quality headache in a program?

In this post I'll be looking at playing background audio through the iOS movie playing APIs (either MPMoviePlayerController or AVPlayer/AVQueuePlayer). I've recently written a post on the history of iOS media APIs but as you'll see in this post, background audio is functionality that relates to the implications of the APIs, not the APIs themselves. You need to discover the "de facto" behavior yourself and hope you're correct.

Specific points will include:

  • why an application that also plays video has so much more difficulty with background audio than other kinds of applications
  • why background audio has broken multiple times in StreamToMe since iOS 4 was released, despite using no undocumented functionality and despite the documented API remaining nominally unchanged
  • why background audio is affected by seemingly unrelated choices like Apple's HTTP live streaming and 3G network

I'll also briefly look at quality management on a complicated program and how the largely undocumented behaviors of Apple's video APIs make perfect testing impossible.

Apple's documentation for background audio in iOS

Apple's documentation for background audio makes it sound very simple. It is 4 paragraphs long under the heading "Playing Background Audio" on the Executing Code in the Background page.

Additionally, Technical Q&A QA1668 discusses "How to play audio in the background with MPMoviePlayerController" by ensuring the Audio Session Category is correct.

Background audio is mentioned in a few other pages but it mostly repeats the information found in these two locations.

It all sounds pretty simple: it seems like background audio should "just work".

What happens to a file that contains video?

The movie players in iOS are explicitly capable of working in the background

But in the above linked Technical Q&A QA1668, the question explicitly mentions "audio-only movies and other audio files". There is no mention of what happens to files that have a video track.

In fact, there is no mention anywhere in the iOS documentation that I could find about what happens to a video file when you switch into the background.

All we can do is examine the behaviors experimentally. The following are the behaviors I've noticed in iOS 4.3 when switching video into the background.

Any file that contains a video track of any sort will be paused if the application switches into the background.

This pause is sent from the CALayer displaying the video frames. This is a private class for an MPMoviePlayerController and is your own AVPlayerLayer for an AVPlayer.

You can't really control this — even in the situation where it's your own AVPlayerLayer — the pause is sent from private methods (so you can't legally override them), during a private "UIApplicationDidSuspendNotification" (so you can't legally block or intercept this). This notification occurs between the UIApplicationWillResignActiveNotification and the UIApplicationDidEnterBackgroundNotification.

Nor can you simply disconnect the AVPlayerLayer of an AVPlayer to avoid the pause being sent — this actually leads to a crash if the file is still playing for reasons that are not explained and could be either a bug in iOS or expected behavior (it's not at all clear).

If you attempt to start a file playing video in the background it will fail with an error

While a video file started in the foreground will simply pause, a video file started in the background will actually give an error and abort playback entirely.

This can even occur for a file that was paused on entering the background but which you attempt to resume.

If you attempt to play a file without video but the previous file contained video, the new file will also fail in many cases

The video system in iOS has a degree of latency between commands you request and actual changes in playback.

My guess (again, none of this is explained in the documentation) is that this latency occurs because your video commands need to be sent to the separate mediaserverd process in iOS that handles all media playback. This process then makes the required changes and sends back response notifications.

This seems to create a situation where if you cancel the playback of a file and immediately start a new file, some of the properties of the old file will remain for a time.

In the case of playing an audio-only file immediately after a video file, this latency appears to be long enough for the audio-only file to be rejected with an error as though it was a file with video.

Even a file with the video tracks disabled will still fail

If you're using an AVPlayer or AVQueuePlayer, you can disable all the video tracks any time after the AVPlayerItemStatusReadyToPlay notification is sent using the following code:

for (AVPlayerItemTrack *track in player.currentItem.tracks)
{
    if ([track.assetTrack.mediaType isEqual:AVMediaTypeVideo])
    {
        track.enabled = NO;
    }
}

This will stop the tracks playing but despite the tracks being disabled, the effect on background play remains the same: presence of video in the file still causes the player to pause.

How StreamToMe has handled video in the background

As you can tell by the summary of experimentally determined functionality above, iOS really strongly doesn't want you to play video in the background.

Frankly, iOS's restrictions in this area are contrary to what people want.

Where iOS makes a huge distinction between audio-only media and media with both video and audio, many users do not. We are accustomed to Quicktime and iTunes and VLC and MPlayer and most other media applications being able to perform all the same tasks with either video or audio.

Even for users who only use StreamToMe to play music, it's hard to avoid video in StreamToMe because StreamToMe puts a still image for the album artwork into a video track to display artwork for music files. In the eyes of iOS, basically every file StreamToMe plays counts as video.

It was necessary to find a way around these restrictions. And so begins the story of half a dozen application updates over 3 major iOS updates.

iOS 4.0

In iOS 4.0, StreamToMe used MPMoviePlayerController and was able, through a bizarre sequence of layer manipulation operations in the MPMovieMediaTypesAvailableNotification method (basically removing the video render layer and reinserting at the right time), to convince the MPMoviePlayerController to proceed, even when it was playing video in the background.

Technically, you didn't need to remove the layer to get it to play in the background (all you needed was to resume after the "UIApplicationDidSuspendNotification" pause) but if you didn't remove the video layer, video frames would still be rendered and queued for display, leading to out of memory problems or weird speedy video quirks when the video came back to the foreground.

I'm not going to share the code that did this: it was messy, not advisable and doesn't work anymore. I was fully aware that this was a bizarre thing to do and that I would need to keep a really close eye on iOS updates to ensure that it kept working.

iOS 4.2

From the betas of iOS 4.2, it became apparent that the layer manipulation would no longer work to allow background video to work smoothly and no combination of actions I could find would make it work again. Playing the audio from a file also containing video looked like it would be impossible.

Fortunately, with StreamToMe, I control both ends of the client-server communication and there was another solution: upon entering the background, StreamToMe could reload the stream from the server with the video track stripped off by the server.

This server reconnection results in a pause of a second or so (more over 3G) while the new stream is started, and sometimes a jump back to the start of the previous HTTP live stream segment, but otherwise the experience is tolerable.

However, there was a catch: MPMoviePlayerController didn't like being torn down and recreated in a short space of time. In iOS 4.2, doing this would actually result in an error.

But the new AVQueuePlayer API introduced in iOS 4.1 did support queuing a new stream and then switching to it. In fact, it did it pretty well (after all, that's what the whole "queue" is about). Unfortunately, switching to AVQueuePlayer from MPMoviePlayerController is not a small task: AVQueuePlayer offers no user interface (you have to implement one entirely for yourself) and the entire property observation model is completely different.

The following code sample shows how a switch to a background track was managed in the UIApplicationDidEnterBackgroundNotification. A new "background" variant of the URL for the current item is generated by the STMQueuePlayerController (the StreamToMe class that relates the StreamToMe representation of files to the AVQueuePlayer representation), seeked to the same point as the current file, inserted into the queue and played.

if (resyncTask)
{
    [[UIApplication sharedApplication] endBackgroundTask:resyncTask];
}
resyncTask = [[UIApplication sharedApplication]
    beginBackgroundTaskWithExpirationHandler:^{resyncTask = 0;}];

AVPlayerItem *backgroundItem =
    [[AVPlayerItem alloc]
        initWithURL:[[STMQueuePlayerController sharedPlayerController]
            urlForFile:[[STMQueuePlayerController sharedPlayerController] currentFile]
            inBackground:YES
            offset:CMTimeGetSeconds(player.currentTime)]];
[backgroundItem                              // seek the item, not the player
    seekToTime:player.currentTime
    toleranceBefore:kCMTimeZero
    toleranceAfter:kCMTimeZero];
[player insertItem:backgroundItem afterItem:currentItem];

[self stopObservingPlayerItem:currentItem];  // stop observing the old AVPlayerItem
[currentItem release];
currentItem = [backgroundItem retain];
[self startObservingPlayerItem:currentItem]; // begin observing the new AVPlayerItem

[player advanceToNextItem];
[player play];

The resyncTask is ended when the status of this new item changes to AVPlayerItemStatusReadyToPlay; it is used to ensure that we don't get suspended while restarting the playback.

Needing to rewrite the code for AVQueuePlayer left a brief gap, from the start of iOS 4.2 until StreamToMe 3.3 was released, during which background audio was broken in StreamToMe.

iOS 4.3

But iOS 4.3 turned out to be a bit of a one-two punch. On paper, the big change was AirPlay video — the new feature in iOS 4.3 that didn't work with AVQueuePlayer (seriously) — but it turns out that iOS 4.3 also changed how movie players were paused when going into the background. This change to pausing behavior was not clear to me until after iOS 4.3 was released, so StreamToMe's background behavior broke again.

What happened is that StreamToMe used to read whether the stream was currently playing (i.e. not paused) and only transition to the background version of the stream if it was actively playing. Unfortunately, the -[AVPlayer rate] which previously returned 1.0 for a previously playing video stream during the UIApplicationDidEnterBackgroundNotification would now return 0.0 (i.e. reporting that the stream was paused).

The fix is pretty simple: when we receive UIApplicationWillResignActiveNotification we need to record whether the current file is playing or paused and use that information later in the UIApplicationDidEnterBackgroundNotification (the private "UIApplicationDidSuspendNotification" that pauses the file occurs between these two notifications).
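A minimal sketch of that bookkeeping follows. The class name STMBackgroundSwitcher, its playerRate property (a stand-in for -[AVPlayer rate]) and the didSwitchToBackground flag are all hypothetical, invented for illustration. In a real application the two methods would be registered with NSNotificationCenter as observers of the two notifications named above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical controller: records the play state before iOS pauses the player.
@interface STMBackgroundSwitcher : NSObject
{
    float playerRate;           // stand-in for -[AVPlayer rate]
    BOOL wasPlayingAtResign;    // snapshot taken when resigning active
    BOOL didSwitchToBackground; // set when the background stream is requested
}
@property (nonatomic, assign) float playerRate;
@property (nonatomic, readonly) BOOL didSwitchToBackground;
- (void)applicationWillResignActive:(NSNotification *)notification;
- (void)applicationDidEnterBackground:(NSNotification *)notification;
@end

@implementation STMBackgroundSwitcher

@synthesize playerRate;
@synthesize didSwitchToBackground;

// Observer for UIApplicationWillResignActiveNotification.
- (void)applicationWillResignActive:(NSNotification *)notification
{
    // The rate is still trustworthy here: the player has not been paused yet.
    wasPlayingAtResign = (playerRate != 0.0f);
}

// Observer for UIApplicationDidEnterBackgroundNotification.
- (void)applicationDidEnterBackground:(NSNotification *)notification
{
    // From iOS 4.3 the rate already reads 0.0 here, so use the snapshot.
    if (wasPlayingAtResign)
    {
        // The real application would queue the background item here.
        didSwitchToBackground = YES;
    }
}

@end
```

The point of the design is that the decision is made from the snapshot, never from the live rate, so it no longer matters when iOS chooses to pause the player.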

Unfortunately, I didn't realize until the last moment on an update that the AVPlayerLayer had also started pausing audio-only files, not just files with video. To me, this seems like a significant change in behavior; why should an audio-only file suddenly start getting paused when the application enters the background? It's not my fault but I need to fix it anyway. Unfortunately, because I was slow to realize the problem, this separate fix for audio-only files in StreamToMe (files with neither video nor album artwork in a video track) had to be held over until the 3.5.4 update.

More than StreamToMe was affected: iOS 4.3 actually broke background video for Apple's apps too. While Apple's apps (iPod, Movies, Safari, YouTube) have always paused the current video when switching into the background, you used to be able to resume the video from the multitasking bar, lock screen or headphones. From iOS 4.3, this behavior has been blocked; the video may play for a fraction of a second but then will immediately stop again.

3G and slow WiFi affecting background audio?

Even after fixing these problems it turns out iOS 4.3 had one more surprise. It now appears that the code I showed above that handles the track change:

[player insertItem:backgroundItem afterItem:currentItem];
[player advanceToNextItem];
[player play];

will work on a local WiFi network but on a high latency WiFi or 3G connection can cause the proper, background-safe version of the file (which is the "next item" loaded here) to be rejected.

Why on earth would the speed of the network affect this?

I'm not entirely certain but it appears that when the network is fast enough, the command:

[player insertItem:backgroundItem afterItem:currentItem];

actually fetches the first segment of the stream and updates all the track information, so it correctly realizes that there is no video track.

But on a slower network, this first segment of the stream is not loaded so the call to [player play]; immediately results in an error and the file being rejected from the stream.

The fix for this is that you need to defer the call to [player play]; until the status of the new item changes to AVPlayerItemStatusReadyToPlay.

Yuck.
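One way to defer the call is to observe the new item's status with key-value observing. This is only a sketch under stated assumptions, not the code that shipped in StreamToMe: the STMDeferredPlayer class, its method names and its didStartPlayback flag are hypothetical, and ownership of the player (retain/release) is left out for brevity.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: calls -play only once the queued item is ready.
@interface STMDeferredPlayer : NSObject
{
    AVQueuePlayer *player;
    BOOL didStartPlayback;
}
@property (nonatomic, readonly) BOOL didStartPlayback;
- (id)initWithPlayer:(AVQueuePlayer *)aPlayer;
- (void)queueItem:(AVPlayerItem *)newItem afterItem:(AVPlayerItem *)oldItem;
@end

@implementation STMDeferredPlayer

@synthesize didStartPlayback;

- (id)initWithPlayer:(AVQueuePlayer *)aPlayer
{
    self = [super init];
    if (self)
    {
        player = aPlayer; // ownership handling omitted in this sketch
    }
    return self;
}

- (void)queueItem:(AVPlayerItem *)newItem afterItem:(AVPlayerItem *)oldItem
{
    // Watch the new item's status instead of calling -play straight away.
    [newItem addObserver:self
        forKeyPath:@"status"
        options:NSKeyValueObservingOptionNew
        context:NULL];
    [player insertItem:newItem afterItem:oldItem];
    [player advanceToNextItem];
}

- (void)observeValueForKeyPath:(NSString *)keyPath
    ofObject:(id)object
    change:(NSDictionary *)change
    context:(void *)context
{
    if (![keyPath isEqualToString:@"status"])
    {
        return;
    }
    NSInteger status =
        [[change objectForKey:NSKeyValueChangeNewKey] integerValue];
    if (status == AVPlayerItemStatusReadyToPlay)
    {
        // The first segment has loaded, so the track information is known.
        [object removeObserver:self forKeyPath:@"status"];
        [player play];
        didStartPlayback = YES;
    }
    else if (status == AVPlayerItemStatusFailed)
    {
        // Stop observing; the real application would report the error.
        [object removeObserver:self forKeyPath:@"status"];
    }
}

@end
```

The ready-to-play branch is also the natural place to end the background task that keeps the application alive during the switch.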

Why was this not caught in testing?

As I write this, the current version of StreamToMe is 3.5.4 and it still contains this 3G/slow WiFi problem.

Yes, I know what the cause of the bug is. Yes, I already have a fix for it. Unfortunately, the agony of release cycles and the nuisance of the App Store approval process means that I'm going to sit on this fix until I've finished the other features I wanted to include in the next update — the background audio over 3G/slow WiFi is simply too narrow a niche to justify an update right now.

However, there's one thing I've noticed about media application users: people seem to use their media within specific niches and if their specific niche isn't working, they're prepared to eviscerate you.

How StreamToMe and ServeToMe are tested

As an independent developer, it is very difficult to handle quality assurance. I don't have a dedicated tester; I have a few people who help me test but they're all volunteers and tend to use the application however they feel. They're not really robust testers. While I use the application all day, I don't really exercise the whole application: on any given day, I focus on pretty specific issues.

Despite these resource limitations, I do have a pretty extensive set of tests. Unfortunately, the scope of the application means that the tests are arguably too extensive for my ability to run them all.

For file compatibility, I have 280 different test files in a regression suite that I run (literally media files in a folder that I run through the program and process the log file to ensure no unexpected errors). This takes 8 hours.

For server functionality, I have a test harness that tests every server command (fortunately, this takes just 30 seconds).

For client functionality, I have a 166-step, user-operated test script. This takes about an hour to perform, sitting in front of the application, pressing buttons in order.

Just 10 hours for these steps but it only tests 1 version of the program.

If you include all the different platforms for which there is platform-specific code, there are 4 versions of the server that need testing (Windows XP, Windows 7, Mac OS X 10.5, Mac OS X 10.6) and 6 versions of the client (iOS 3 on any iOS device, iOS 3.2 iPad, iOS 4 on iPhone 3G, iOS 4 on iPhone 3GS, iOS 4 on iPhone 4, iOS 4 on iPad).

You should realize that just running this suite in these testing environments would take me about a week. And that's if I worked non-stop on StreamToMe, which I don't.

But the bug slips through: how do you fix it?

It is unreasonable for me to fully test minor releases and sometimes minor issues slip through. Needing to limit testing so that it is manageable has resulted in some minor bugs but it does not explain why this latest 3G/slow-WiFi problem escaped testing.

Even if I had run my full test suite on version 3.5.4 of StreamToMe, it would not have detected the problem between 3G and background audio. This is because the test script tested background audio on local WiFi — you don't generally insert repeats of tests into your script unless you suspect something about the repeat in a new context will actually affect the test. In this case, I had no reason to suspect that the two ideas would be connected.

An interesting thought to consider here: the code coverage through my program is identical on local WiFi and 3G. The difference is either somewhere in Apple's code or purely a timing problem.

All you can do in these situations is add the scenario to your test cases, fix the bug and make sure it keeps getting tested in new releases.

Conclusion

I hope you can see that even when APIs are documented, the usage and implications of the API can be unknown and subject to misunderstandings and change over time. The fact that video cannot play in the background is barely mentioned by the documentation but the details of video being paused, stopped or rejected with an error are completely absent from documentation (you will only discover this by experimentation).

Lack of information is always hard to deal with in testing. You can only exercise documented or otherwise suspected behavior, and even so, you need to be practical. You can't simply say: test everything about background audio. You need to formulate your tests based on what you think is likely to have different effects.

The decision by Apple to forbid video in the background is frustrating and puzzling from my perspective. Why can't iOS simply ignore video packets in the background — particularly for disabled tracks? I can only presume that there's a technical reason for this behavior but since we haven't been informed of the boundaries, it remains frustrating.

Additionally, the entire iOS environment makes this type of problem exceptionally difficult to characterize and test. UI automation is insanely difficult in iOS and even if it improved (I'm keeping my eye on Cucumber+Frank) it probably wouldn't be able to exercise background switches and realtime and network issues easily.

User interface strings in Cocoa

In this post, I'll look at best practice for using and managing text strings in your user interface. This is a fairly simple topic but Cocoa has established "best practices" for handling user interface strings that new Cocoa developers should be aware of. Since it is inevitably related, I'll also look at the steps involved in localizing the strings in your applications but remember: you should follow good practice for string handling, even if you have no intention of ever translating your application.

Introduction (the wrong way)

Putting a text string in your user interface is not a difficult thing to do on a technical level. In code, filling in text can be as simple as setting the text property of a UILabel to a literal string:

someUserInterfaceLabel.text = @"Text to display";

(This code is for an iOS UILabel. On Mac OS X, you would set the stringValue property of an NSTextField but otherwise the step is the same.)

While this will work, you should never set a user interface string this way.

Setting labels with literal strings (the right way)

The most thorough way to put a literal string into your Cocoa application's user interface is:

someUserInterfaceLabel.text =
    NSLocalizedStringFromTable(
        @"Text to display",       // the native language string
        @"SomePageLabels",        // the category
        @"Label display string"); // a comment describing context

This is pretty verbose though. It is often okay to just use:

someUserInterfaceLabel.text = NSLocalizedString(@"Text to display", nil);

If you take no other steps, this will produce exactly the same output as the "wrong way" example.

You should always use the NSLocalizedString[...] macros for every user interface string in your code.

But wait... this NSLocalizedString[...] stuff requires more typing and unless you take yet more additional steps, it won't have any functional difference? If I'm not planning to translate my program right now, aren't they a complete waste of time?

Why NSLocalizedString is important, even if you don't intend to translate

Obviously, the NSLocalizedString[...] functions (and the less common CFCopyLocalizedString[...] variants) are functions that exist to enable localization (i.e. letting you translate your application into different languages).

Technically, they're not even functions — they're just macros that invoke the -[NSBundle localizedStringForKey:value:table:] method — but you should always use the macro and not the underlying method for reasons I'll discuss in the "Mechanics of Translation" section below.
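To make the relationship concrete, here is a small sketch of the call the basic macro reduces to. The function name STMLookupLikeMacro is hypothetical; the NSBundle method and its arguments are the real ones.

```objc
#import <Foundation/Foundation.h>

// Returns what NSLocalizedString(key, comment) evaluates to at runtime.
static NSString *STMLookupLikeMacro(NSString *key)
{
    // value:@"" means "fall back to the key itself" when no translation
    // exists; table:nil means "Localizable.strings". The macro's comment
    // argument never reaches the runtime: only genstrings reads it.
    return [[NSBundle mainBundle]
        localizedStringForKey:key
        value:@""
        table:nil];
}
```

With no ".strings" files in the bundle, both this function and the macro simply hand back the key, which is why adding the macro changes nothing until translations exist.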

However, even if you're not intending to ever localize your application, you should always use NSLocalizedString.

There are a few reasons for this:

  1. Futureproofing: The future is hard to predict: you never know if you'll want to translate in the future. Needing to go through your code and find rogue literal strings is time consuming and prone to mistakes. Instead, everything should always have NSLocalizedString from the beginning.
  2. MVC practices: It keeps the exact details of your model/presentation layer at least one level of indirection removed from your controller code. In some cases, you can simply change the .strings files for your program to update the user interface and not need to change code due to this separation.
  3. Separation of concerns: It clearly identifies text strings intended for user presentation as opposed to text strings used as keys for programming use only.
  4. Discourages other bad practices: With your user interface strings detached from your controller, you'll be less likely to try to read static strings back from the user interface (a very bad idea) or place programmer-targeted strings in the user interface.

Get into the habit of using NSLocalizedString. It's really simple to do — even when you're hacking code together quickly, you should be able to use it.

The first two points in the previous list are self-explanatory but the second two merit further explanation.

Separation of concerns

It is always helpful in programming to be able to glance at code and understand the intent. Consider the following piece of code in isolation:

[someDictionary setObject:@"value" forKey:SomeKeyNameString];
[someDictionary setObject:NSLocalizedString(@"value", nil) forKey:SomeOtherKeyNameString];

Without knowing what someDictionary is for or what the purpose of the SomeKeyNameString and SomeOtherKeyNameString values are, we know that the second string is intended for display in the user interface at some point whereas the first string is purely for tracking a value internally.

This clear labelling of intent is helpful as strings for user display have a very different role in a program compared to strings for internal use.

Discourages other bad practices

If you treat NSLocalizedString in your mind as though its output is a black box, this can help you avoid poor controller design when managing user interface elements. It can act as a conceptual tool to encourage you to design things the right way, instead of a lazy way.

Your controller code should treat user interface strings as something that can be written but not read. Reading static strings back from the user interface is always bad (it ends up being a form of "common or data coupling" — a bad design practice).

In the "Separation of concerns" example above, you might consider that since the keys SomeKeyNameString and SomeOtherKeyNameString are defined in global variables in this example, that perhaps you'd want to define your localized strings in global variables. In most cases a global variable for a user string is actually a bad idea.

We define dictionary keys in global variables because more than one location in the program may need to use exactly the same value or the exchange of information between the two points will fail. But with user interface strings, you should never have a second piece of code that requires the exact same value: you should never read back from the user interface or require user interface collusion. Generally, the only situation where the same string should appear multiple times is if the same user interface code is displaying it (i.e. you're drawing the same object) but in this case, the code is common and the string should only need to appear once in the code.

If you need to uniquely identify a label or the state of a text displaying item, testing the text it contains is the wrong way to do that. A far better way is to use the tag value of any UIView/NSActionCell and then map the tag value onto the object's role or function (tag is a pointer sized value so you can store a non-retained object reference here if needed, not just an integer). The tag property is not reserved for any other purpose; it is intended for the controller to track user interface items and their state.
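As a rough illustration of the tag approach, the sketch below maps tag values onto roles. The enum constants and the STMRoleNameForTag function are hypothetical names; only the integer tag itself comes from the view.

```objc
#import <Foundation/Foundation.h>

// Hypothetical roles for the labels on one screen, stored in each view's tag.
// Values start at 1 because a UIView's tag is 0 until one is assigned.
enum
{
    STMLabelRoleTitle = 1,
    STMLabelRoleStatus = 2
};

// Maps a tag (e.g. someView.tag) onto a role name for internal use only.
static NSString *STMRoleNameForTag(NSInteger tag)
{
    switch (tag)
    {
        case STMLabelRoleTitle:
            return @"title";
        case STMLabelRoleStatus:
            return @"status";
        default:
            return nil; // untagged or unknown view
    }
}
```

On iOS you would assign someUserInterfaceLabel.tag = STMLabelRoleStatus when the label is created and later pass someUserInterfaceLabel.tag to the function, so the displayed (and possibly translated) text is never consulted.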

Mechanics of translation (when you're ready)

Eventually, you may actually have to translate your program. Let's look at the steps involved.

Create your ".strings" files

The files you need to translate are the ".strings" files in your application. By default though, your project probably won't have any ".strings" files (except possibly an InfoPlist.strings file which is for translating your Info.plist file's strings).

The first step is to make sure you have a localized directory somewhere (probably in the Resources subdirectory of your project's folder). The localized directory should be named "en.lproj" if you're starting with English strings, otherwise you'll want to replace "en" with the appropriate ISO 639-1 or ISO 639-2 designators. If needed you can use the script and region identifiers too as described in Apple's Language and Locale Designations.

A note on folder names: it is common to see "English.lproj" used as the name for English localization instead of "en.lproj". In fact, Xcode 3 still generates folders with this name if you Get Info on a file and select "Make file localizable". Apple have stated that these old, "full" names are deprecated from Mac OS X 10.4 onwards in favor of ISO 639-1 or ISO 639-2 designators. Don't use the old "English.lproj" style names anymore and replace them with "en.lproj" if they are autocreated (yes, you might need to update your Xcode paths if you change the folder name).

Now we can create ".strings" files automatically from all the NSLocalizedString references in your program. To do this, open a Terminal in your Project's root directory and run the following command:

find -E . -iregex '.*\.(m|h|mm)$' -print0 | xargs -0 genstrings -a -o Resources/en.lproj

This will process all .m, .h and .mm files in your directory hierarchy and create ".strings" files for them in the en.lproj directory (note, the en.lproj directory must already exist). This assumes that the localized resources directory you created is located at "Resources/en.lproj", relative to your Project's root directory; obviously, you'll need to change this if your localized resources are elsewhere.

The ".strings" file will be filled with entries that look like this:

/* This comment provided comes from the last parameter of the NSLocalizedString macro. */
"Some UI string %@ to translate %@" = "Some UI string %1$@ to translate %2$@";

Your translator just needs to translate the right-hand side of the equality statement. Notice that placeholders in your strings are given ordinal positions (1 and 2 in this case) so that the translation can change the order of placeholders if necessary (obviously, if you use placeholders, you should include a comment that explains what they're going to be).
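The effect of the ordinal positions can be seen with a short sketch. STMFormatPair is a hypothetical helper and the two format strings stand in for an original and a translated entry from a ".strings" file.

```objc
#import <Foundation/Foundation.h>

// Formats two values using a (possibly translated) format string.
// The arguments are always supplied in the same order; the ordinal
// positions in the format decide where each one is displayed.
static NSString *STMFormatPair(
    NSString *format, NSString *first, NSString *second)
{
    return [NSString stringWithFormat:format, first, second];
}
```

Passing @"%1$@ sent to %2$@" with the arguments @"file" and @"server" gives "file sent to server", while a translation that needs the opposite order, such as @"%2$@ received %1$@", gives "server received file" without any change to the calling code.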

Localization versus Internationalization: generally, the whole process of creating new language variants is referred to as localization. In reality though, it comprises two steps:
  1. Internationalization: where you decouple the program from the original locale
  2. Localization: where you add translations and behaviors for each new locale
By that terminology, the inclusion of NSLocalizedString wrappers and the creation of ".strings" files is the "Internationalizing" phase.

genstrings will only handle static NSLocalizedString and CFCopyLocalizedString strings

The only strings that will be automatically extracted are those wrapped in NSLocalizedString[...] and CFCopyLocalizedString[...] macros. Obviously, all your user interface text needs to be wrapped in these but also remember that strings passed directly to the underlying -[NSBundle localizedStringForKey:value:table:] method will not be automatically extracted.

Why would you ever use -[NSBundle localizedStringForKey:value:table:] directly then? The answer is for dynamically generated strings.

The genstrings command will raise an error if it detects anything other than a static string in the localization macros. This is appropriate because you don't want your translators translating variable names and function calls (they only need to translate the results of those calls).

The answer to why you would use -[NSBundle localizedStringForKey:value:table:] is then: the actual strings to be translated are located elsewhere in the code (or are in a ".strings" file that was not generated from code) and you are simply looking them up dynamically.
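A dynamic lookup of this kind might look like the following sketch. The function name, the "ErrorCode" key prefix and the "Errors" table (an Errors.strings file maintained by hand rather than by genstrings) are all assumptions made for the example.

```objc
#import <Foundation/Foundation.h>

// Looks up a message whose key is only known at runtime.
// "Errors" is a hypothetical, hand-maintained table (Errors.strings).
static NSString *STMMessageForErrorCode(NSInteger code)
{
    // genstrings cannot extract this key because it is built dynamically.
    NSString *key = [NSString stringWithFormat:@"ErrorCode%ld", (long)code];

    // The fallback is a static string, so it still goes through the macro.
    NSString *fallback = NSLocalizedString(@"Unknown error",
        @"Shown when no message exists for an error code");

    // A non-empty value: is returned when the table has no such key.
    return [[NSBundle mainBundle]
        localizedStringForKey:key
        value:fallback
        table:@"Errors"];
}
```

Only the fallback is visible to genstrings; the per-code entries have to be added to the table by whoever defines the error codes.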

Encoding problems

From Mac OS X 10.5 onwards, you can put any UTF-8 characters in your NSLocalizedString constants. Prior to this, they were required to be pure 7-bit ASCII with all Unicode escaped with \\Uxxxx style escaping or you could use MacRoman with the -macRoman command-line option to use MacRoman high-ASCII characters.

A quick swipe at almost everybody: UTF-8 has been around since 1993 and Unicode 2.0 since 1996; if you have created any 8-bit character content since 1996 in anything other than UTF-8, then I hate you.

I weep to think of the years of programmer time that are still wasted attempting to support non-Unicode formats without characters getting garbled because people are still creating content using ancient encodings without useful identifiers to indicate what nonsense encoding they're using (or worse, people creating content that explicitly uses the wrong encoding for an encoding-specific text field).

MacRoman? Atrocious. Big-5? I hope you want to see garbage output. Windows Latin? You suck. If you're creating new content using anything other than UTF-8, UTF-16 or UTF-32 then you should be forced to serve prison time with whatever idiot monkey decided that UTF-16 should be allowed little-endian and big-endian variants instead of a single authoritative encoding.

The actual text files generated by genstrings are UTF-16 in whatever byte order your system happens to use. i.e. UTF-16BE on PowerPC and UTF-16LE on Intel Macs.

Grumble.

Translating XIB files

Not all your strings will come from your code. The other common text location is in XIB files. XIB files can be a little bit trickier than strings in code due to two factors:

  1. While you can extract the strings from a XIB file easily, you also have to merge them back in once the translation is complete — basically another step that can go wrong
  2. The ".strings" file format extracted from XIB files is uglier and doesn't have easy room for comments to send to the translator

For these two reasons, I generally avoid putting text in XIB files if reasonably possible — it is normally easier to have text inserted at NIB load time by the code. Of course, menus, button labels and automator labels can't reasonably be moved into code so you're still likely to need a number of XIB files translated.

You extract the .strings from XIB files in a similar way to extracting the strings from code. However, first we must make all of our XIB files localizable (if they aren't already).

To localize your XIB files, select them all in the Xcode project Group Tree, Get Info on them and then from the first tab, select "Make file localizable".

Then, go to the localized directory where your files all ended up (if there are multiple, you'll need to do this for each one) and run the following in Terminal:

for file in *.xib; do
    ibtool --export-strings-file "$file".strings "$file"
done

This will generate all the ".strings" files for your XIB files.

Once the ".strings" files are localized, create a new ".lproj" directory with the appropriate language name for the new translations and put all the ".strings" files in it. Then open a Terminal in this new folder and run:

for file in *.xib.strings; do
    basename=`basename "$file" .strings`
    ibtool --strings-file "$file" --write "$basename" "../en.lproj/$basename"
done

This will merge all the ".xib.strings" files in the current directory with the XIB files from the en.lproj directory, creating the translated XIB files.

Translating other resources

The same "Make file localizable" step that we used for the XIB files in the previous section can be applied to any resource file in your Xcode group tree so you can localize other resources in whatever way is appropriate.

Here's a tip though: avoid localizing anything other than strings and XIB files by whatever means possible. Having non-strings files for translation will cause you nothing but pain and suffering.

In particular: avoid localizing images. Work as hard as you can to keep all text out of images (except in logos that don't require translation). You can perform quite sophisticated drawing and text handling in Cocoa code if needed and this will almost always be easier than localizing images.

I haven't really touched on non-string code localization topics in this post. There's date, time, numbers, error descriptions and other stuff — most of the time, the classes and APIs for these make it clear what you need to do. Just read Apple's Internationalization documentation.

Conclusion

Most programmers should already know the information in this post. Numerous other Mac programming blogs have discussed the topic:

See how anciently old those second two links are? I'm not telling you new information. The advice remains the same: always, ALWAYS use NSLocalizedString for your user interface strings.

Actual translation can happen later or not at all — my point is that your need for translation (or lack of need) should not determine whether you use NSLocalizedString as it is best practice in any case. Of course you can rest assured that translation will all work out if your code is already NSLocalizedString-compliant.