Advanced programming tips, tricks and hacks for Mac development in C/Objective-C and Cocoa.

Verifying that a string contains an email address using NSPredicate

To celebrate the official release of iPhone OS 3.0 this week, I will show you how to verify that an NSString contains a syntactically valid email address using NSPredicate — a class that joins the iPhone SDK 3.0 as part of the Core Data additions. This code will work on Mac OS X too since, as with the rest of Core Data, NSPredicate has been part of Mac OS X since 10.4 (Tiger).

Before I begin...

I gave an interview to Anthony Agius of MacTalk.com.au this week. You can download the MP3 from their website. I'm hesitant to listen to my own voice but I think I talked about what it's like to be an independent Mac/iPhone developer in Melbourne, Australia.

Back to predicates

In programming, a predicate is a condition that returns true or false depending on whether the object it processes has the properties that the predicate describes. The key difference between a predicate and a regular boolean expression is that a predicate only considers the properties of one object, where a boolean expression may consider multiple, unrelated objects.

Many programmers are familiar with predicates as used in SQL database queries. For example a query to extract the complete row from the "people" database table for every person named "John Smith" might look like this in SQL:

SELECT * FROM people WHERE firstname = 'John' AND lastname = 'Smith'

Everything after the "WHERE" is the predicate — it looks at properties of the row only and is either true (the row will be extracted) or false (the row will be ignored).
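
To make the idea concrete outside of SQL, here is a minimal sketch in plain C. The person struct, the person_predicate type and both function names are hypothetical, invented for this illustration; none of them are part of Cocoa. The predicate looks only at the single row it is handed, and the filter keeps the rows for which it returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical row type standing in for one row of the "people" table. */
typedef struct
{
    const char *firstname;
    const char *lastname;
} person;

/* A predicate: examines a single row and answers true or false. */
typedef bool (*person_predicate)(const person *row);

/* Equivalent of: WHERE firstname = 'John' AND lastname = 'Smith' */
bool is_john_smith(const person *row)
{
    return strcmp(row->firstname, "John") == 0 &&
        strcmp(row->lastname, "Smith") == 0;
}

/* Copies the rows that satisfy the predicate into "matches" (which must
   have room for "count" entries) and returns how many were copied. */
size_t filter_people(
    const person *rows,
    size_t count,
    person_predicate predicate,
    person *matches)
{
    assert(rows != NULL && predicate != NULL && matches != NULL);

    size_t matched = 0;
    for (size_t i = 0; i < count; i++)
    {
        if (predicate(&rows[i]))
        {
            matches[matched++] = rows[i];
        }
    }
    return matched;
}
```

Calling filter_people(rows, count, is_john_smith, matches) plays the role of the whole SELECT statement; is_john_smith alone plays the role of the WHERE clause.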

Using NSPredicate to evaluate predicates

In Cocoa, NSPredicate works in much the same way as the "WHERE" clause of SQL. The main reason that NSPredicate is being brought to the iPhone is that NSPredicate fulfils the same role in Core Data that "WHERE" clauses fulfil in SQL — to allow the persistent store to fetch objects that satisfy specific criteria.

Imagine we had an NSDictionary created using the following method:

- (NSDictionary *)personRowWithFirstname:(NSString *)aFirstname
    lastname:(NSString *)aLastname
{
    return
        [NSDictionary dictionaryWithObjectsAndKeys:
            aFirstname, @"firstname",
            aLastname, @"lastname",
        nil];
}

we could test if a given row created by this method matched the predicate "firstname = 'John' AND lastname = 'Smith'" with the following:

// given an NSDictionary named "row", created using the above method...
NSPredicate *johnSmithPredicate =
    [NSPredicate predicateWithFormat:@"firstname = 'John' AND lastname = 'Smith'"];
BOOL rowMatchesPredicate = [johnSmithPredicate evaluateWithObject:row];

The string format used to construct an NSPredicate in Cocoa is very similar to the syntax of the "WHERE" clause in SQL. You can also construct this NSPredicate in code by building it from two NSComparisonPredicates and an NSCompoundPredicate.

A more common use of NSPredicate is filtering — extracting rows that match an NSPredicate from a larger collection:

// given an NSArray of rows named "rows" and the above "johnSmithPredicate"...
NSArray *rowsMatchingPredicate = [rows filteredArrayUsingPredicate:johnSmithPredicate];

This is then more like an SQL query where we have selected matching rows from the larger table of data.

Note: NSPredicate handles filtering only. If you'd like to replicate SQL's "ORDER BY" clause, you can apply NSSortDescriptor as a separate step.

Verifying an email address

The "MATCHES" comparison operator in NSPredicate (NSMatchesPredicateOperatorType) is commonly used as a convenient means of testing if an NSString matches a Regular Expression. (The "LIKE" operator, NSLikePredicateOperatorType, only performs simple ? and * wildcard matching.) Its advantage over full libraries with greater options and replacement capability is that it is already in Cocoa: no libraries, no linkage, no hassle.

To test if an NSString matches a regular expression, we can use the following code:

NSPredicate *regExPredicate =
    [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regularExpressionString];
BOOL myStringMatchesRegEx = [regExPredicate evaluateWithObject:myString];

The only question that remains is: what is a regular expression that can be used to verify that an NSString contains a syntactically valid email address?

NSString *emailRegEx =
    @"(?:[a-z0-9!#$%\\&'*+/=?\\^_`{|}~-]+(?:\\.[a-z0-9!#$%\\&'*+/=?\\^_`{|}"
    @"~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\"
    @"x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-"
    @"z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5"
    @"]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-"
    @"9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21"
    @"-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])";

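This regular expression is adapted from a version at regular-expressions.info that is intended to implement the address syntax of RFC 2822. Note that it only lists lowercase letters, so lowercase the string before testing it (or make the comparison case-insensitive) if addresses containing capitals should be accepted.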

This adaptation involved escaping all backslashes with another backslash (otherwise they are interpreted as string escapes before they reach the Regular Expression), escaping the ampersands (&) so they aren't interpreted as Unicode escape sequences, and escaping the caret (^) characters because, actually, I have no idea why except that the expression wouldn't parse without it.

The linked regular-expressions.info page recommends using slightly different regular expressions that force the top-level domain to be a country code or a known top-level domain. With the number of top-level domains due to increase in the near future, I'm not sure this is a good constraint to impose — since this check isn't intended to verify that the provided domain name is valid.

Since this regular expression NSString is so long, I've split it over 7 lines. This is an underused feature in Standard C languages — if you split a string into pieces but put nothing except whitespace between the pieces, the compiler will treat it as one continuous string. You don't need to write an extremely long string on a single long line.
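
As a small illustration of that literal-joining rule, here is a sketch in plain C (the same rule applies to the @"..." pieces above). The variable names, the function name and the short pattern are made up for the example; the pattern is not the email expression from this post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Three adjacent literals separated only by whitespace and newlines.
   The compiler joins them into one string during translation. */
const char *splitPattern =
    "[a-z0-9]+"
    "@"
    "[a-z0-9.-]+";

/* The same text written as one literal, for comparison. */
const char *singlePattern = "[a-z0-9]+@[a-z0-9.-]+";

/* Returns the length of the joined pattern, after checking at run time
   that the split and single spellings produced the same string. */
size_t JoinedPatternLength(void)
{
    assert(strcmp(splitPattern, singlePattern) == 0);
    return strlen(splitPattern);
}
```

No operator or function call is involved in the joining: by the time the program runs, splitPattern already points at a single 21 character string.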

Conclusion

Just a few lines of code this week but I thought it would be good to draw attention to one of the minor additions making its way to the iPhone in SDK 3.0. I use NSPredicate all the time on Mac OS to perform searches, extract subarrays and perform quick Regular Expression tests. Without it, the only alternatives on the iPhone were methodical array iterations and manual comparisons in code.


Revisiting an old post: Streaming and playing an MP3 stream

Given the attention it received and the number of bugs I know it contained, I wanted to revisit an old post of mine: Streaming and playing an MP3 stream. In this post, I'll talk about the problems the original contained, how I fixed those problems and I'll present the updated result.

Introduction

Last September, I wrote a post titled "Streaming and playing an MP3 stream". The post was largely an experiment — I just wanted to see if I could play a streaming MP3 by quickly adapting Apple's AudioFileStreamExample to accept an HTTP data stream.

Unexpectedly, the post became one of my most popular. The attention quickly revealed the limitations in my approach:

  • The blend of Objective-C and C was muddled and led to a situation where neither was being used cleanly.
  • The boolean flags I copied from the original example were a bad way to describe the playback state and lots of situations were not covered by these flags.
  • Sending notifications to the user-interface on a thread that isn't the main thread causes problems.
  • The extra thread I added (the download thread) was never thread-safe.

I've finally decided to take the time to present a solution to these issues and present an approach which is a little more robust and a little easier to extend if needed.

You can download the complete AudioStreamer project as a zip file (around 110kB) which contains Xcode projects for both iPhone and Mac OS. You can also browse the source code repository.

Limited scope

One point should be clarified before I continue: this class is intended for streaming audio. By streaming, I don't simply mean "an audio file transferred over HTTP". Instead, I mean a continuous HTTP source without an end that continues indefinitely (like a radio station, not a single song).

Yes, this class will handle fixed-length files transferred over HTTP but it is not ideal for the task.

This class does not handle:

  1. Buffering of data to a file
  2. Seeking within downloaded data
  3. Feedback about the total length of the file
  4. Parsing of ID3 metadata

These things often can't be done on streaming data, so this class doesn't try. See the "Adding other functionality" section for hints about how the class could be reorganised to handle some of these features.

Taking code out of C functions

Since I had borrowed the AudioFileStream and AudioQueue callback functions from Apple's example, they were Standard C.

My first change was to make these 6 callback functions (7 including the CFReadStream callback) little more than wrappers around Objective-C methods:

void MyPacketsProc(
    void *inClientData,
    UInt32 inNumberBytes,
    UInt32 inNumberPackets,
    const void *inInputData,
    AudioStreamPacketDescription *inPacketDescriptions)
{
    // this is called by audio file stream when it finds packets of audio
    AudioStreamer* streamer = (AudioStreamer *)inClientData;
    [streamer
        handleAudioPackets:inInputData
        numberBytes:inNumberBytes
        numberPackets:inNumberPackets
        packetDescriptions:inPacketDescriptions];
}

At a compiled code level, this is a step backwards: all I've done is slowed the program down by an extra Objective-C message send.

Technically, a C function that takes a "context" pointer (like the inClientData pointer here) is not significantly different to a method. What a method does is make data hiding and data-abstracted actions easier. Within a method, you can easily access the instance variables of an object and you don't need to explicitly pass context into each function.
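
The equivalence is easiest to see when both halves are written in plain C. The following is a sketch only: PacketCounter and both function names are hypothetical and are not taken from the AudioStreamer project. The callback receives its object as an untyped context pointer and does nothing except forward to a function that behaves like a method.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object: tracks how much audio data has been received. */
typedef struct
{
    size_t totalBytes;
    size_t totalPackets;
} PacketCounter;

/* The "method": all state is reached through the object pointer,
   which plays the same role as "self" in Objective-C. */
void PacketCounterHandlePackets(
    PacketCounter *counter,
    size_t numberBytes,
    size_t numberPackets)
{
    assert(counter != NULL);
    counter->totalBytes += numberBytes;
    counter->totalPackets += numberPackets;
}

/* A callback in the style of a C API. The library calling it knows
   nothing about PacketCounter, so the object travels as a void pointer
   and the callback is only a thin wrapper around the "method". */
void PacketsCallback(
    void *clientData,
    size_t numberBytes,
    size_t numberPackets)
{
    PacketCounterHandlePackets(
        (PacketCounter *)clientData, numberBytes, numberPackets);
}
```

The only thing Objective-C adds over this arrangement is that the compiler passes the object pointer for you and lets you reach its fields without writing counter-> in front of each one.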

This is the cliché argument in favor of object-orientation — but it isn't why I reorganized these functions and methods.

The honest reason why I did it is aesthetics: it is easier to read a class that is implemented using Objective-C methods alone — it's more consistent. I chose to move towards an Objective-C aesthetic and away from the Standard C aesthetic of the CoreAudio sample code to promote consistent formatting, consistent means of accessing state variables, consistent ways of invoking methods and consistent ways of synchronizing access to the class.

Describing state

With the majority of code now inside the class, I was in a better position to start handling changes through methods rather than direct member access.

My original approach to state came from Apple's original example. This example had just one piece of state: a bool named finished (which indicated that the run loop should exit).

The problem with this flag is how simple it is. It is unable to distinguish between the following:

  1. End of file, normal automatic stop.
  2. The user has asked the AudioStreamer to stop but the AudioQueue thread has not yet responded.
  3. An error has occurred before the AudioQueue thread is created and we must exit.
  4. We are stopping the AudioQueue for temporary reasons (clearing it, changing device, seeking to a new point) but we don't want the loop to stop.

For Apple's example, there was no problem: the first case was the only one that ever occurred.

As a hasty solution, I had added started and failed flags but these really only covered the first and third case adequately.

In the end, I realized that the AudioStreamer needed much more descriptive state where every combination of progress within each thread had a different position:

typedef enum
{
    AS_INITIALIZED = 0,
    AS_STARTING_FILE_THREAD,
    AS_WAITING_FOR_DATA,
    AS_WAITING_FOR_QUEUE_TO_START,
    AS_PLAYING,
    AS_BUFFERING,
    AS_STOPPING,
    AS_STOPPED,
    AS_PAUSED
} AudioStreamerState;

and when stopping, one of the following values would also be needed:

typedef enum
{
    AS_NO_STOP = 0,
    AS_STOPPING_EOF,
    AS_STOPPING_USER_ACTION,
    AS_STOPPING_ERROR,
    AS_STOPPING_TEMPORARILY
} AudioStreamerStopReason;

In this way, the state always describes where every thread is and the stop reason explains why a transition is occurring.

Combining this with an error code that replaces the old failed flag, I now have a complete description of the state.

By cleaning up the state of the object, I was able to make the object capable of state transitions that weren't previously possible including pausing/unpausing and returning to the AS_INITIALIZED state after a stop (instead of requiring that the class be released after stopping).
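
To show how the state, the stop reason and the error code work together, here is a minimal sketch in C. The two enums are repeated from above so the block stands alone. The StreamerStatus struct and both functions are hypothetical, written for this illustration, and are not the code in the AudioStreamer project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
    AS_INITIALIZED = 0,
    AS_STARTING_FILE_THREAD,
    AS_WAITING_FOR_DATA,
    AS_WAITING_FOR_QUEUE_TO_START,
    AS_PLAYING,
    AS_BUFFERING,
    AS_STOPPING,
    AS_STOPPED,
    AS_PAUSED
} AudioStreamerState;

typedef enum
{
    AS_NO_STOP = 0,
    AS_STOPPING_EOF,
    AS_STOPPING_USER_ACTION,
    AS_STOPPING_ERROR,
    AS_STOPPING_TEMPORARILY
} AudioStreamerStopReason;

/* Hypothetical grouping of the three pieces of state described above. */
typedef struct
{
    AudioStreamerState state;
    AudioStreamerStopReason stopReason;
    int errorCode; /* 0 means no error; replaces the old "failed" flag */
} StreamerStatus;

/* Records that a stop has begun and why. Returns false (and changes
   nothing) if a stop is already underway or complete. */
bool StreamerStatusBeginStop(
    StreamerStatus *status,
    AudioStreamerStopReason reason,
    int errorCode)
{
    assert(status != NULL && reason != AS_NO_STOP);

    if (status->state == AS_STOPPING || status->state == AS_STOPPED)
    {
        return false;
    }
    status->state = AS_STOPPING;
    status->stopReason = reason;
    status->errorCode = (reason == AS_STOPPING_ERROR) ? errorCode : 0;
    return true;
}

/* True when the run loop should exit: a stop is underway or complete
   and it is not one of the temporary stops. */
bool StreamerStatusShouldExitLoop(const StreamerStatus *status)
{
    return (status->state == AS_STOPPING || status->state == AS_STOPPED) &&
        status->stopReason != AS_STOPPING_TEMPORARILY;
}
```

With a single finished flag, the temporary stop and the error stop would look identical. Here the loop can tell them apart by reading the stop reason.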

Notifications

In the old version of the project the only way for the user-interface to follow the playback state was to observe the isPlaying property on the object which reflected the kAudioQueueProperty_IsRunning property of the AudioQueue.

This observing was handled through KeyValueObserving. I'm a big fan of KeyValueObserving for its simplicity and ubiquity but this was not the correct place to use it.

KeyValueObserving always invokes the observer methods in the same thread as the change. Since all changes in AudioStreamer happen in secondary threads, this means that the observer methods were getting invoked in secondary threads.

Why is this bad? A minor drawback is simply the unexpectedness for the observer but the biggest reason was that the sole purpose of observing this property was to update the user-interface and the user-interface on the iPhone cannot be updated from any thread except the main thread. Even on the Mac, performing updates off the main thread can have unexpected and glitchy results.

The solution is to retain the NSNotificationCenter of the thread that first calls start on the object and use this center to send messages as follows:

NSNotification *notification =
    [NSNotification
        notificationWithName:ASStatusChangedNotification
        object:self];
[notificationCenter
    performSelector:@selector(postNotification:)
    onThread:[NSThread mainThread]
    withObject:notification
    waitUntilDone:NO];

Don't invoke postNotification: directly from the secondary thread as, like most methods, it is not thread safe and it could be in use from the main thread.

Thread safety

Despite adding an extra thread on top of Apple's AudioFileStreamExample, I never really spent any time thinking about thread safety — a reckless approach to stability. In my defence Apple's example wasn't exactly cautious with its threads and would quit while the AudioQueue's thread was still playing the last buffer.

The most efficient approach to threading is to carefully enter @synchronized (or NSLock or pthread_mutex_lock) in a tight region around any use of a shared variable.

Unfortunately for the AudioStreamer class, almost everything in the class is shared. Instead, I decided to go for the decidedly less efficient approach of running almost everything in the class within a @synchronized section, emerging only at points when control must be yielded to other threads.

The drawback is that the code rarely runs simultaneously on multiple threads (although threading here is for blocking and I/O, not for multi-threaded performance reasons, so that's not a problem). The advantage of this heavy-handed locking approach is that the only threading condition that may cause problems is deadlock.

When do deadlocks occur? Only when you're waiting for another thread to do something while you're inside the synchronized section needed by that other thread. The simple solution: never wait for another thread inside a synchronized section.

AudioStreamer has three situations where one thread waits for another:

  1. The run loop (the AudioFileStream thread waits for any kind of control communication from the main thread or playback finished notification from the AudioQueue thread).
  2. The enqueueBuffer method (AudioFileStream thread waits for the AudioQueue thread to free up a buffer).
  3. Synchronous AudioQueueStop invocations (waits for the AudioQueue to release all buffers).

The first two points are easy: perform these actions (and any method invocation which invokes them) outside of the @synchronized section.

The final point is harder: the synchronous stop must be performed inside the @synchronized section to prevent multiple AudioQueueStop actions occurring at once. To address this, the release of buffers by the AudioQueue (in handleBufferCompleteForQueue:buffer:) must perform its work without entering the @synchronized section (although it's allowed to use the queueBuffersMutex as normal since that isn't used by anything else during a synchronous stop).

Of course, every time the @synchronized section is re-entered, a check must be performed to see if "control communication" has occurred (the class checks this by invoking the isFinishing method and exiting if it returns YES).
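
The second of the three waits listed above (waiting for a buffer to be freed) can be sketched with plain pthreads. This is an illustration under assumptions, not the AudioStreamer code: BufferSlot and its functions are hypothetical and there is only one buffer. The point it shows is that the waiting thread holds only a small dedicated mutex, and pthread_cond_wait gives even that up while it sleeps, so the thread that frees the buffer is never locked out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-buffer pool with its own small mutex, separate
   from any larger lock that protects the rest of the object. */
typedef struct
{
    pthread_mutex_t mutex;
    pthread_cond_t bufferFreed;
    bool inUse;
} BufferSlot;

void BufferSlotInit(BufferSlot *slot, bool inUse)
{
    assert(slot != NULL);
    pthread_mutex_init(&slot->mutex, NULL);
    pthread_cond_init(&slot->bufferFreed, NULL);
    slot->inUse = inUse;
}

/* Called by the thread that fills buffers. pthread_cond_wait unlocks
   the mutex while sleeping and relocks it before returning, so the
   releasing thread can always get in. */
void BufferSlotWaitUntilFree(BufferSlot *slot)
{
    pthread_mutex_lock(&slot->mutex);
    while (slot->inUse)
    {
        pthread_cond_wait(&slot->bufferFreed, &slot->mutex);
    }
    slot->inUse = true; /* claim the buffer for refilling */
    pthread_mutex_unlock(&slot->mutex);
}

/* Called by the playback thread when it has finished with the buffer. */
void BufferSlotRelease(BufferSlot *slot)
{
    pthread_mutex_lock(&slot->mutex);
    slot->inUse = false;
    pthread_cond_signal(&slot->bufferFreed);
    pthread_mutex_unlock(&slot->mutex);
}

/* Thread entry point: releases the slot passed as the context. */
void *BufferSlotReleaseThread(void *context)
{
    BufferSlotRelease((BufferSlot *)context);
    return NULL;
}
```

If BufferSlotWaitUntilFree were called while holding a second lock that BufferSlotRelease's caller also needed, the two threads could wait on each other forever. That is the situation the "never wait inside a synchronized section" rule avoids.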

Adding other functionality

Get metadata

The easiest source of metadata comes from the HTTP headers. Inside the handleReadFromStream:eventType: method, use CFReadStreamCopyProperty to copy the kCFStreamPropertyHTTPResponseHeader property from the CFReadStreamRef, then you can use CFHTTPMessageCopyAllHeaderFields to copy the header fields out of the response. For many streaming audio servers, the stream name is one of these fields.

The considerably harder source of metadata is the ID3 tags. ID3v1 is always at the end of the file (so is useless when streaming). ID3v2 is located at the start so may be more accessible.

I've never read the ID3 tags but I suspect that if you cache the first few hundred kilobytes of the file somewhere as it loads, open that cache with AudioFileOpenWithCallbacks and then read the kAudioFilePropertyID3Tag with AudioFileGetProperty you may be able to read the ID3 data (if it exists). Like I said though: I've never actually done this so I don't know for certain that it would work.

Stream fixed-length files

The biggest variation you may want to make to the class is to download fixed-length files, rather than streaming audio.

To handle this, the best approach is to remove the download from the class entirely. Download elsewhere and when "enough" (an amount you should determine on your own) of the file is downloaded, start a variation of the class that plays by streaming from a file on disk.

To adapt the class for streaming from a file on disk, remove the CFHTTPMessageRef and CFReadStreamRef code from openFileStream and replace it with NSFileHandle code that uses waitForDataInBackgroundAndNotify to asynchronously stream the file in the same way that CFReadStreamRef streamed the network data.

Once you're streaming from a file, you'll probably want to permit seeking within the file. I've already put hooks within the file to seek (set the seekNeeded flag to true and set the seekTime to the time in seconds to which you want to seek) — however, the mechanics of seeking within the file would be dependent on how you access the file.

Incidentally, the AudioFileStreamSeek function seems completely broken. If you can't get it to work (as I couldn't) just seek to a new point in the file, set discontinuous to true and let AudioFileStream deal with it.

Handling data interruptions

At the moment, if the AudioQueue has no more buffers to play, the state will transition to AS_BUFFERING. At this point, no specific action is taken to resolve this situation — it assumes that the network will eventually resume and requeue enough buffers.

I actually expect there will be cases where this action is insufficient — you may need to ensure that the AudioQueue is paused until enough buffers are filled before resuming or even restart the download entirely. I haven't experimented much since it is easiest with streaming audio just to stop and start new.

Incidentally, if you're curious to know how many audio buffers are in use at any given time, uncomment the NSLog line in the handleBufferCompleteForQueue:buffer: method. This will log how many 1 kilobyte audio buffers are queued waiting for playback (when the queue reaches zero, the AudioStreamer enters the AS_BUFFERING state).

Conclusion

You can download the complete AudioStreamer project as a zip file (around 110kB) which contains Xcode projects for both iPhone and Mac OS. You can also browse the source code repository.

The functionality of this new version has not changed greatly. My purpose was to present a version that is more stable and tolerant of unexpected situations, rather than add new features.

As before, the AudioStreamer class should work on Mac OS X 10.5 and on the iPhone (SDK 2.0 and greater).

The source repository is hosted on github so you can browse, fork or track updates as you choose. I will likely update again in future (I can't imagine I've written this much code without causing more problems) and this way, you can see the changes I've made.

I hope this post has shown you a number of problems that can happen when code is written hastily. This doesn't mean you should always avoid hastily written code (timeliness and proof of concepts are important) but it does mean you should be practised at refactoring code and not simply slap poor fixes onto code that doesn't cleanly solve a problem in the first place.


Method names in Objective-C

Compared to other languages, method names in Objective-C are weird. They're long, they're wordy, they include names for the parameters and they seem to repeat information you can get elsewhere. Despite these apparent negatives, Objective-C method naming can save you time and effort. I'll show you how methods are named so that you can predict them without documentation and understand how methods work and how they use their parameters from their names alone.

Introduction: Objective-C is an ugly duckling

Along with square-brackets and its largely Apple-exclusive nature, method names are one of the most commonly decried parts of Objective-C. I've seen them criticized for being:

  • Long and wordy
  • Repetitious or redundant
  • Filled with names for each parameter
  • Camel-case

While every one of these points is true, none of them is really a negative — although camel-case is a subjective, aesthetic point and I'll ignore it since I'll only consider the structural aspects of method naming conventions in this post.

Objective-C method names are some of the most predictable, regular and self-descriptive in any C-derivative language, and it is the very points criticized above that allow them to be so. Of course, these benefits are lost unless you know the conventions well enough to understand and predict them.

Background: naming conventions in other C-like languages

To help you understand the reasoning behind Objective-C's method names, I'll start by describing how methods and functions are named in C. I'm starting with C because Objective-C's method naming is, in some respects, an extension of C's naming style, adapted to overcome the limitations.

Standard C's naming style

Yes, Standard C does have some implicit naming conventions. Many standard C functions are composed of three parts:

  1. An indication of the data type acted upon
  2. The action
  3. A description of the secondary object

We can see examples of these in these functions:

  • sscanf - "s" (acts upon a char *), "scan" (extract character data from the string), "f" (format string is the secondary object)
  • fprintf - "f" (acts upon a FILE), "print" (outputs character data to the file), "f" (format string is the secondary object)
  • strlen - "str" (acts upon a char *), "len" (compute the length). No secondary object.
  • fesetround - "fe" (acts upon the "floating point environment"), "set" (changes a state value), "round" (a new rounding value is the secondary object).
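
To see the three parts at work, here is a short sketch that uses two of the functions listed above. ParseSize and PrintSize are hypothetical helper names chosen for the example; only sscanf and fprintf come from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Parses "WIDTHxHEIGHT" (for example "640x480") from text.
   Returns 1 on success and 0 if the text does not match. */
int ParseSize(const char *text, int *width, int *height)
{
    assert(text != NULL && width != NULL && height != NULL);

    /* sscanf: "s" (reads from a string), "scan" (extract values),
       "f" (guided by a format string). It returns the number of
       values it stored, so 2 means both numbers were found. */
    return sscanf(text, "%dx%d", width, height) == 2;
}

/* Writes "WIDTHxHEIGHT" and a newline to the given file. Returns the
   number of characters written, or a negative value on error. */
int PrintSize(FILE *file, int width, int height)
{
    /* fprintf: "f" (writes to a FILE), "print" (output characters),
       "f" (guided by a format string). */
    return fprintf(file, "%dx%d\n", width, height);
}
```

In both calls the first argument is the thing "acted upon" and the format string is the secondary object, which is exactly the order the name describes.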

Standard C's naming style fails

For these functions, this style works well. The problem is that the components are so short (often single letters) that it is difficult to know for certain to what they refer. Many of the functions in math.h show these limitations:

  • lround - "l" (no longer an "acts upon", this first part now indicates "returns a long"), "round" (round to the nearest integer). No description of the primary parameter — you are expected to assume a double.
  • acosl - The "l" in this case is the primary parameter and means long double whereas it meant long in the lround function above. The "a" here is not a prefix, it is part of the action component.

This is where naming styles in C break down — while math.h does have naming conventions, they are all its own. You can learn the tricks of math.h but they are unique to that library, not a standard that applies to all functions.

Beyond these issues, C suffers from a broader lack of adherence to any convention, let alone a consistent one. Most functions are really just a 4 to 6 character description of the action performed. While this may be all that is required at a technical level, it makes it almost impossible to know how to use functions like system, raise, atexit or even malloc for the first time without reading the documentation.

Other languages

Few other languages have a distinct approach to method structure. The overriding convention in the main languages I use (C++, C#, Java) is simply a short verb plus possible modifier.

Some languages do have their own aesthetics (for example, the C++ Standard Library uses terse underscore delimited words) but this doesn't represent a structural convention.

The reason why these languages have abandoned the three part style of Standard C functions is that:

  1. The data type acted upon is now the object upon which the method is invoked, removing the need to explain this part.
  2. Parameter overloading means that a description of any parameter needs to be vague (since the parameter itself may be used inconsistently).

So methods end up being little more than a verb plus an optional modifier term.

The result is that methods with subtle actions or that use their parameters in specific ways can only be used by thoroughly examining the documentation.

Method naming conventions in Objective-C

Objective-C aims to be substantially more self-documenting than its peers. The intent is that all methods in all classes should be able to follow the same set of rules so that subtleties of behavior are easy to see and understand — even when you are new to the class.

This aim combines with the named parameters in Objective-C to produce methods which are quite distinct compared to other languages.

Given the lack of convention in other languages and the unusual nature of Objective-C's named parameters, it's not so surprising that method names in Objective-C are confronting to new Objective-C programmers.

Structure of an Objective-C method name

Objective-C methods are composed of a few different components. I'll list the components here; examples follow:

  1. If there is a return value, the method should begin with the property name of the return value (for accessor methods), or the class of the return value (for factory methods).
  2. A verb describing either an action that changes the state of the receiver or a secondary action the receiver may take. Methods that don't result in a change of state to the receiver often omit this part.
  3. A noun if the first verb acts on a direct object (for example a property of the receiver) and that property is not explicit in the name of the first parameter.
  4. If the primary parameter is an indirect object to the main verb of the method, then a preposition (i.e. "by", "with", "from") is used. This preposition partly serves to indicate the manner in which the parameter is used but mostly serves to make the sentence more legible.
  5. If a preposition was used and the first verb doesn't apply to the primary parameter or isn't present then another verb describing direct action involving the primary parameter may be used.
  6. A noun description (often a class or class-like noun) of the primary parameter, if this is not explicit in one of the verbs.
  7. Subsequent parameter names are noun descriptions of those parameters. Prepositions, conjunctions or secondary verbs may precede the name of a subsequent parameter but only where the subsequent parameter is of critical importance to the method. These extra terms are a way to highlight importance of secondary parameters. In some rarer cases secondary parameter names may be a preposition without a noun to indicate a source or destination.

In addition, an Objective-C method should be maximally readable — its reading should flow like a sentence and abbreviations should be avoided except where they are universally known (and even then, abbreviations to syllables rather than single letters are preferred).

That's a lot to consider. Fortunately, most components are only relevant to specific kinds of method.

Applying the naming convention to property accessors

length

This method shows the typical form of a getter method for a property or calculated value. Only the first component of the method name applies.

The biggest point to note with respect to other languages is that getter methods don't use the verb "get" because no explicit action or state change is required to extract this property (implicit actions or actions taken for internal reasons should not be communicated through the method name).

encodedLength

This method shows a variant length method where a modifier is used to describe the length. We put the modifier before the property name because a structure like lengthWhenEncoded gets too close to lengthForEncoding: which would be used if the encoding was passed in as a parameter.

sharedApplication

Singleton accessors (like this accessor for the global UIApplication object) and other global data accessors take exactly the same form as instance accessors: a description of the return value ("Application" in this case) plus an adjective to disambiguate it (in this case, disambiguation is required because a class name on its own is expected to be a factory method, as described below). The term "shared" is used by convention to identify singletons.

doubleValue

This method has the same structure as the previous method but shows how the returned class is sometimes the modifier when used in conjunction with an abstract property like "value".

isEditable

This method is actually an exception to the conventional rules. The verb "is" shouldn't be there — verbs are normally reserved for describing state change or secondary action.

I guess that this style for accessors that return a BOOL developed to make the method read more like a sentence. You could also argue that "is" is a passive verb so it isn't indicating an action.

setLength:

The setter method is not composed in the same way as the getter. With no return value, the first part is a verb describing the action: "set".

The second part, "Length", may be considered the direct object of the verb component (i.e. the internal property). In this case, it is also the name of the first parameter. Since they communicate the same information, only one is used.

Setter methods with prepositions before the first parameter like setValueUsingDouble: and setValueUsingNumber: are rarely used, even though they may seem like a good way to unambiguously set a property like "value" using different data types. Instead, the property itself is normally given a different name (i.e. setDoubleValue: or setNumberValue:). In this way, there is only ever one setter for a given property (although properties may contain dependencies).

Applying the naming convention to factory methods

string

This NSString method returns an empty string. It takes exactly the same format as a getter method, except that you invoke it on a class object, not an instance and it doesn't include a property name.

It may seem like this method is providing redundant information — we already know that a factory method for a class will return an instance of the same class — but identifying the return type here has two purposes:

  • factory methods are identified precisely because they start with the class's name
  • it maintains consistency (always describe the return value when it is the purpose of the method)

A quick follow up to that second point though: the return value only needs to be described when it is the purpose of the method. BOOL values returned as an error indicator and other cases where the return value is a secondary effect (like autorelease which returns the receiver) do not need to describe the return value.

stringWithString:

The parameter here is named "String", indicating that it is an instance of NSString with no other constraints. The correct way to read this method is that it creates a new string in the simplest way possible from its string parameter — i.e. a copy of the string.

The preposition here is of little semantic purpose except to make the method name read like a sentence. The choice of preposition is simply whatever is most appropriate if you read it like a sentence. For this reason class factory methods normally use "With" but instance factory methods normally use "By" because it implies some involvement of the receiver's data (like stringByAppendingString:).

The NSArray method isEqualToArray: shows a state interrogation with a preposition but again, the preposition is chosen simply to make the method name read like a sentence.

Applying the naming convention to action methods

release

This is an example of a simple action method: a verb describing how the method changes state or performs secondary actions.

addObject:

Again, a very simple method: verb plus a noun describing the first parameter (any "Object").

addObserver:forKeyPath:options:context:

The important addition to see here is the use of the preposition "for" in front of the second parameter. Yes, "KeyPath" is the indirect object of the method's verb "add" but the real purpose here is to highlight the importance of this parameter — pointing out that this parameter is more important than parameters which follow.

To explain this, consider that the method is not addObserver:forKeyPath:usingOptions:andContext:. The options and context are really peripheral parameters whereas keyPath is an important consideration, despite being the second parameter.

This method can be compared to the very similar NSNotificationCenter method addObserver:selector:name:object:. In the case of the NSNotificationCenter method, name: is the corresponding parameter to forKeyPath: and yet no preposition is used (it is not forName:). This reflects the fact that the name parameter is optional (can be nil).

However, you should not infer that lack of a preposition means "optional". Far from it. Instead, the correct reading is that a preposition marks a parameter as more important than the other secondary parameters. The selector: parameter of the addObserver:selector:name:object: method is mandatory but is given no special status in the method. It is necessary for routing the notification but much of the time will simply reflect the notifications it observes (i.e. @selector(windowWillCloseNotification:)).

A small exception: delegate methods

windowDidChangeScreen:

This method begins with the name of a class but it is not a factory method nor a method which returns a mutated version of the receiver. It is actually a delegate method and delegate methods don't really follow any of the conventions.

If it were following the conventional standard its name would be receiveWindowDidChangeScreenNotification:. You could argue that delegate methods are passive — the delegate isn't being asked to perform an action, it is being told that something else performed an action — so omitting a verb is permissible. For my part, the break with convention makes me sad inside.

Further confusing things, the first parameter of the delegate method is rarely identified by the closest noun. Instead the first parameter is either the object sending the notification or it is a notification object (which will contain the sender). In the example above, the closest noun is "screen" but the first parameter is an NSNotification. In the case of applicationOpenUntitledFile: the closest noun is "File" but the parameter is the NSApplication object that sent the message.

Summary

In describing method formats, I've repeated myself quite a few times. The reality is that Objective-C uses one method naming convention almost everywhere.

Objective-C method names are very regular. So regular that you should be able to predict the names for methods without checking the documentation — start typing the method name as you guess it, then autocomplete. More than simply predicting the name, you can normally predict how parameters are used, so again you avoid documentation and know what you need.

Looking at the criticisms levelled at Objective-C method names that I listed at the start:

Long and wordy

Yes, Objective-C methods are made to read like sentences. They contain prepositions (something almost never seen in other languages), they contain the type of the return value, and they use full words instead of abbreviations.

The purpose is to make the method as quick to read as possible.

This is a good trade to make since you will read a method many times but only type it once (with code completion, less than once).

Xcode will suggest code completion automatically. Hit return at any time to pick the completion it offers. Hit the Code Sense Completion key (F5 by default) and it will present a popup list of matching options. Control-/ will step through the parameters so you can fill them in.

Repetitious or redundant

While methods like +[UIApplication sharedApplication] may seem redundant, the reality is that +[UIApplication shared] would have a different meaning (it would be an accessor for the static class property named "shared") and +[UIApplication singleton] by omitting a class name fails to communicate that the method also works as a factory method on the first invocation — repeating the class name has meaning, it is not redundant.

Filled with names for each parameter

Yes they are, but these names make it much easier to determine what each parameter is and allow metadata like importance to be conveyed.

Camel-case

Yes. The choice over underscore delimited words is largely aesthetic. The choice over Pascal-case (same as camel-case but where the first letter of the first word is uppercase not lowercase) is less arbitrary: uppercase initial letters are reserved for names with global scope, like class names, function names and global constants.

Conclusion

Few other languages have conventions that are as rigorously applied as those in Objective-C. Conventions are annoying to learn, as they may seem arbitrary and unnecessary at first glance, but once learned, they are regular and predictable with few surprises. When trying to get the computer to obey, that's good.

Base64 encoding options on the Mac and iPhone

On Unix platforms, a common approach for Base64 encoding is to use libcrypto (the OpenSSL library). However, like most C libraries, you need to wrap it to integrate with Objective-C data types (like NSData and NSString) and it isn't available on the iPhone. I'll show you how to handle base64 encoding/decoding with OpenSSL and without so you can handle the Mac and iPhone equally.

Introduction

Base64 is an encoding for transferring binary data in 7-bit text. Originally used in email, it is also used for encoding binary data in HTML files. Another common use for Base64 is in HTTP Basic Access Authentication where it is used to transfer login details (which might not be printable characters).

The key library for handling Base64 on the Mac is normally libcrypto (the OpenSSL library), so it's a little disappointing that libcrypto isn't available on the iPhone.

Using OpenSSL

Via the command line

On the Mac, you can handle simple encoding tasks like base64 encoding with OpenSSL on the command line:

echo "Base64 encode this text." | openssl enc -base64

gives the encoding result:

QmFzZTY0IGVuY29kZSB0aGlzIHRleHQuCg==

The reverse is handled in the following manner:

echo "QmFzZTY0IGVuY29kZSB0aGlzIHRleHQuCg==" | openssl enc -d -base64

giving

Base64 encode this text.

In code

As you'd expect, doing the same work in code takes a little more typing. First, we're using a library, so we need to include it (in your Project's Build Settings under Other Linker Flags add the flag -lcrypto). Once that's done, you should be able to use the following method in a category on NSData:

#include <openssl/bio.h>
#include <openssl/evp.h>

- (NSString *)base64EncodedString
{
    // Construct an OpenSSL context
    BIO *context = BIO_new(BIO_s_mem());

    // Tell the context to encode base64
    BIO *command = BIO_new(BIO_f_base64());
    context = BIO_push(command, context);

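    // Encode all the data
    BIO_write(context, [self bytes], (int)[self length]);
    BIO_flush(context);

    // Get the data out of the context
    char *outputBuffer;
    long outputLength = BIO_get_mem_data(context, &outputBuffer);

    // outputBuffer is not NUL-terminated, so pass the length explicitly.
    // (autorelease assumes manual reference counting; omit it under ARC.)
    NSString *encodedString = [[[NSString alloc]
        initWithBytes:outputBuffer
        length:outputLength
        encoding:NSASCIIStringEncoding] autorelease];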

    BIO_free_all(context);

    return encodedString;
}

This method handles Base64 encoding.

By default, encodedString will have newlines every 64 characters. If needed, you can disable the inclusion of newlines by adding the following line before the BIO_write:

BIO_set_flags(context, BIO_FLAGS_BASE64_NO_NL);

The "BIO" system (OpenSSL's documentation describes it as a basic I/O abstraction) is not very symmetric, so the code for decoding is quite different:

+ (NSData *)dataByBase64DecodingString:(NSString *)decode
{
    decode = [decode stringByAppendingString:@"\n"];
    NSData *data = [decode dataUsingEncoding:NSASCIIStringEncoding];
    
    // Construct an OpenSSL context
    BIO *command = BIO_new(BIO_f_base64());
    BIO *context = BIO_new_mem_buf((void *)[data bytes], [data length]);
        
    // Tell the context to decode base64
    context = BIO_push(command, context);

    // Decode all the data
    NSMutableData *outputData = [NSMutableData data];
    
    #define BUFFSIZE 256
    int len;
    char inbuf[BUFFSIZE];
    while ((len = BIO_read(context, inbuf, BUFFSIZE)) > 0)
    {
        [outputData appendBytes:inbuf length:len];
    }

    BIO_free_all(context);
    [data self]; // extend GC lifetime of data to here

    return outputData;
}

An interesting point to note at the top of this function: I append an extra newline to the end of the string. This is because if you have not disabled newlines and the string does not contain at least 1 newline, BIO_read will fail.

Handling Base64 on the iPhone

Using libcrypto isn't possible by default on the iPhone — the library isn't there. You could probably build libcrypto.a and link it statically against your app but that can be difficult to set up and would require that you notify Apple that your app contains encryption.

Normally, it is better to avoid libcrypto on the iPhone. The other functions that libcrypto handles can be found elsewhere:

  • md5: use the CommonCrypto implementation CC_MD5
  • sha: use the CommonCrypto implementations such as CC_SHA1 or CC_SHA256
  • Public/Private Key Encryption/Decryption: use the SecKeyEncrypt/SecKeyDecrypt functions in the Security framework

You can find the documentation for the Security Framework by performing a standard Xcode API lookup. For some reason though, the CommonCrypto functions only appear in a full-text search.

The Base64 functionality of OpenSSL doesn't have an accessible equivalent on the iPhone, even though NSURLConnection, CFHTTPMessageRef and WebKit must all have access to an implementation — whatever they use is not accessible.

Encoding Base64

Fortunately, Base64 is a fairly simple encoding. At its heart, it looks like this:

static unsigned char base64EncodeLookup[65] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

//
// Inner loop: turn 3 bytes into 4 base64 characters
//
outputBuffer[j++] = base64EncodeLookup[(inputBuffer[i] & 0xFC) >> 2];
outputBuffer[j++] = base64EncodeLookup[((inputBuffer[i] & 0x03) << 4)
    | ((inputBuffer[i + 1] & 0xF0) >> 4)];
outputBuffer[j++] = base64EncodeLookup[((inputBuffer[i + 1] & 0x0F) << 2)
    | ((inputBuffer[i + 2] & 0xC0) >> 6)];
outputBuffer[j++] = base64EncodeLookup[inputBuffer[i + 2] & 0x3F];

This might be a little ugly to look at if you're not used to seeing bitmasks and bitshifts but it is only a couple of lines. It does little more than the comment states: it turns 3 bytes into 4 chars, with the specific chars specified by the base64EncodeLookup mapping.

Of course, while this code handles the center of the main loop, there are almost a hundred lines in total in the complete implementation that I wrote.

As part of keeping the function optimal, I wanted to keep the conditionals out of the inner loop (making vectorizing easier). I succeeded and there are no conditionals in the inner loop but this means that there are a few tail conditions to handle in the epilogue.
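To make the "conditional-free inner loop plus epilogue" structure concrete, here is a minimal sketch in plain C. It is not the downloadable implementation: the function name base64_encode_sketch is hypothetical, it emits no line breaks, and it assumes the caller has allocated at least ((length + 2) / 3) * 4 + 1 bytes for the output.

```c
#include <stddef.h>

static const char base64EncodeLookup[65] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper: encodes `length` bytes from inputBuffer into
 * outputBuffer (NUL-terminated) and returns the number of characters
 * written, excluding the NUL. No newlines are inserted.
 */
size_t base64_encode_sketch(const unsigned char *inputBuffer, size_t length,
                            char *outputBuffer)
{
    size_t i = 0;
    size_t j = 0;

    /* Inner loop: whole 3-byte groups only, so it needs no conditionals */
    for (; i + 3 <= length; i += 3)
    {
        outputBuffer[j++] = base64EncodeLookup[(inputBuffer[i] & 0xFC) >> 2];
        outputBuffer[j++] = base64EncodeLookup[((inputBuffer[i] & 0x03) << 4)
            | ((inputBuffer[i + 1] & 0xF0) >> 4)];
        outputBuffer[j++] = base64EncodeLookup[((inputBuffer[i + 1] & 0x0F) << 2)
            | ((inputBuffer[i + 2] & 0xC0) >> 6)];
        outputBuffer[j++] = base64EncodeLookup[inputBuffer[i + 2] & 0x3F];
    }

    /* Epilogue: 1 or 2 leftover bytes are zero-filled and padded with '=' */
    if (length - i == 1)
    {
        outputBuffer[j++] = base64EncodeLookup[(inputBuffer[i] & 0xFC) >> 2];
        outputBuffer[j++] = base64EncodeLookup[(inputBuffer[i] & 0x03) << 4];
        outputBuffer[j++] = '=';
        outputBuffer[j++] = '=';
    }
    else if (length - i == 2)
    {
        outputBuffer[j++] = base64EncodeLookup[(inputBuffer[i] & 0xFC) >> 2];
        outputBuffer[j++] = base64EncodeLookup[((inputBuffer[i] & 0x03) << 4)
            | ((inputBuffer[i + 1] & 0xF0) >> 4)];
        outputBuffer[j++] = base64EncodeLookup[(inputBuffer[i + 1] & 0x0F) << 2];
        outputBuffer[j++] = '=';
    }

    outputBuffer[j] = '\0';
    return j;
}
```

Feeding this the 25 bytes of "Base64 encode this text." followed by a newline should reproduce the string that the openssl command line produced earlier.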

I also wanted to calculate the exact size that would be required for the output buffer, so it can be allocated once with no waste, but this too occupies a few lines worth of space.
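The size calculation itself can be sketched in a few lines. The function name and the lineLength parameter below are assumptions made for illustration; in particular, counting one newline per output line (including a final partial line) is a convention chosen for this sketch, not necessarily the one the downloadable code uses.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of characters needed to Base64 encode
 * inputLength bytes, excluding any terminating NUL.
 *
 * Every 3 input bytes (rounded up) become 4 output characters. If
 * lineLength is non-zero, one newline is counted for every line of
 * lineLength characters, including a final partial line.
 */
size_t base64_encoded_length_sketch(size_t inputLength, size_t lineLength)
{
    size_t encoded = ((inputLength + 2) / 3) * 4;

    if (lineLength > 0)
    {
        encoded += (encoded + lineLength - 1) / lineLength;
    }

    return encoded;
}
```

With the result known up front, the output buffer can be allocated once (plus one byte if a terminating NUL is wanted) and never resized.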

Decoding Base64

Decoding works similarly to encoding, except that in decoding we are reducing 4 characters down to 3 bytes instead of vice versa:

//
// Store the 6 bits from each of the 4 characters as 3 bytes
//
outputBuffer[j] = (accumulated[0] << 2) | (accumulated[1] >> 4);
outputBuffer[j + 1] = (accumulated[1] << 4) | (accumulated[2] >> 2);
outputBuffer[j + 2] = (accumulated[2] << 6) | accumulated[3];

More interesting than the code in this case is the lookup table that each of these accumulated bytes passes through before being used here:

//
// Definition for "masked-out" areas of the base64DecodeLookup mapping
//
#define xx 65

//
// Mapping from ASCII character to 6 bit pattern.
//
static unsigned char base64DecodeLookup[256] =
{
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 62, xx, xx, xx, 63, 
    52, 53, 54, 55, 56, 57, 58, 59, 60, 61, xx, xx, xx, xx, xx, xx, 
    xx,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 
    15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, xx, xx, xx, xx, xx, 
    xx, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 
    41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
    xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, xx, 
};

The "xx"s in this table are just a #define of 65 (i.e. outside the valid range of Base64) but they provide an interesting visual representation of the 6-bits that each Base64 character can occupy within the 8-bit byte.
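If you would rather not trust a hand-typed table, the same mapping can be generated from the encoding alphabet. The following is only a sketch: the names BASE64_INVALID, base64Alphabet and base64_build_decode_table_sketch are hypothetical, and the sentinel value 65 simply mirrors the "xx" define above.

```c
#include <stddef.h>

#define BASE64_INVALID 65 /* same sentinel value as "xx" in the table above */

static const char base64Alphabet[65] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper: fills table so that table[c] is the 6-bit value of
 * ASCII character c, or BASE64_INVALID for any character outside the
 * Base64 alphabet (including '=' and newlines).
 */
void base64_build_decode_table_sketch(unsigned char table[256])
{
    size_t c;

    /* Start with everything masked out */
    for (c = 0; c < 256; c++)
    {
        table[c] = BASE64_INVALID;
    }

    /* Invert the encode lookup: the character at position c decodes to c */
    for (c = 0; c < 64; c++)
    {
        table[(unsigned char)base64Alphabet[c]] = (unsigned char)c;
    }
}
```

Building the table at startup costs a few hundred assignments; the static table in the post avoids even that, which is presumably why it was written out by hand.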

I was unable to remove all of the conditionals from the inner loop of the decode side and keep the "skip over invalid characters" requirement.

This "skip over invalid characters" stage (where characters are accumulated until 4 valid characters are found) is handled by the following loop (which immediately precedes the previous "store the 6 bits from each of the 4 characters as 3 bytes" code):

//
// Accumulate 4 valid characters (ignore everything else)
//
unsigned char accumulated[BASE64_UNIT_SIZE];
size_t accumulateIndex = 0;
while (i < length)
{
    unsigned char decode = base64DecodeLookup[inputBuffer[i++]];
    if (decode != xx)
    {
        accumulated[accumulateIndex] = decode;
        accumulateIndex++;
        
        if (accumulateIndex == BASE64_UNIT_SIZE)
        {
            break;
        }
    }
}

This is the only part which makes the decode stage sub-optimal. If you had Base64 input data with no newlines and no other characters requiring skipping, I think you could remove this section entirely so that the inner loop of the decode function could be vectorized.
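Putting the two decode fragments together, a complete decoder might look like the sketch below. It is an illustration rather than the downloadable code: all names are hypothetical, the lookup uses strchr on the alphabet instead of the 256-entry table to keep the example short, and '=' padding is treated like any other skipped character. The caller is assumed to provide at least ((length + 3) / 4) * 3 bytes of output.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_UNIT_SIZE 4
#define SKETCH_INVALID 65

static const char sketchAlphabet[65] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Returns the 6-bit value of c, or SKETCH_INVALID if c is not Base64 */
static unsigned char sketch_decode_char(unsigned char c)
{
    const char *found = (c != '\0') ? strchr(sketchAlphabet, c) : NULL;
    return found ? (unsigned char)(found - sketchAlphabet) : SKETCH_INVALID;
}

/* Hypothetical helper: decodes inputBuffer, returns bytes written */
size_t base64_decode_sketch(const char *inputBuffer, size_t length,
                            unsigned char *outputBuffer)
{
    size_t i = 0;
    size_t j = 0;

    while (i < length)
    {
        /* Accumulate up to 4 valid characters (ignore everything else) */
        unsigned char accumulated[SKETCH_UNIT_SIZE] = {0, 0, 0, 0};
        size_t accumulateIndex = 0;
        while (i < length && accumulateIndex < SKETCH_UNIT_SIZE)
        {
            unsigned char decode =
                sketch_decode_char((unsigned char)inputBuffer[i++]);
            if (decode != SKETCH_INVALID)
            {
                accumulated[accumulateIndex++] = decode;
            }
        }

        /* n valid characters carry n - 1 complete bytes */
        if (accumulateIndex >= 2)
        {
            outputBuffer[j++] = (unsigned char)
                ((accumulated[0] << 2) | (accumulated[1] >> 4));
        }
        if (accumulateIndex >= 3)
        {
            outputBuffer[j++] = (unsigned char)
                ((accumulated[1] << 4) | (accumulated[2] >> 2));
        }
        if (accumulateIndex >= 4)
        {
            outputBuffer[j++] = (unsigned char)
                ((accumulated[2] << 6) | accumulated[3]);
        }
    }

    return j;
}
```

The three tail conditionals are what let a final group of 2 or 3 valid characters (a padded ending) produce 1 or 2 bytes instead of 3.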

Conclusion

Download the NSData+Base64 class and header (4kB).

In this post, I've shown you how to use the default command-line and library options for Base64 handling on Mac OS X. I've also shown you the approach I use for Base64 encoding and decoding on the iPhone.

The libcrypto libraries (when available) are not as tight and simple as custom code for the task but do have the advantage that the pipeline for feeding data into them is more configurable.

I'm certainly not the only one to present C libraries for Base64 encoding that will work on the iPhone but the approach I've used should be efficient (especially the internal implementations in the C-functions) and it should drop into a Cocoa project on the iPhone very easily.
