With iOS 8, Apple introduced App Extensions. App Extensions are self-contained apps that developers can ship along with their main application. They can be launched on demand by the system to perform tasks such as sharing, editing a photo, displaying a widget in the Notification Center, presenting a custom keyboard or even providing content to an app running on the Apple Watch.

In order to allow an extension and its hosting application to share data and resources, application groups were also introduced (It’s worth noting that they were originally introduced on OS X in 10.7.4 to support sharing data between an application and a Login Item application shipped within its bundle).

An application group is an identifier shared between various applications in the same group (it can be enabled through an entitlement). Applications in the same group can share a group container directory that is located outside of their own sandbox but accessible by all applications in the group. Think of it as another sandbox shared between applications. An application and its extensions can share this group container but so can multiple applications released by the same developer (assuming they have all specified the same identifier for the com.apple.security.application-groups entitlement).

While it might not seem much at first, this means that applications belonging to the same group can now:

  • Share a directory on the file system. Multiple applications can read and write files in this directory (as long as access is coordinated, more about this later) and could for example share a Core Data store or a SQLite database.
  • Share preferences. When creating an instance of NSUserDefaults with a suite name equal to the application group identifier, cfprefsd will store the preferences plist in the group container directory and all applications in the group will be able to share its content (A couple of years ago I built shared user defaults on the Mac before the application group suite was made available, it was fun).

While this is pretty cool and is a big step forward (previously, multiple apps by the same developer could only share keychain entries) there are times when you wish you could just send regular messages between applications without having to store them somewhere on the file system. Also, while NSUserDefaults, Core Data and SQLite all support concurrent access, none of them actually notifies other processes when one makes changes to the shared data. What this means is that you could be inserting an object in one process but the other process wouldn’t be aware of the change until it attempts fetching the data again.

If we had an IPC mechanism to communicate between applications in the same group we could make sure that other processes are notified of changes in real-time without the need for polling.

This article will describe a complete and general solution to this problem on iOS. However, before diving into it I’d like to briefly discuss the IPC situation on OS X. If you can’t wait and just want to see the code, you can find the project on GitHub.

Current state of IPC on OS X

Mike Ash did an excellent job discussing the state of IPC on the Mac a few years ago so I will not repeat everything but will rather recommend that you read his post (it’s from 2009 but apart from XPC not yet being a thing it’s still a very good overview).

As you probably already know, both OS X and iOS are built on top of Darwin. Darwin itself is built around XNU, a hybrid kernel that combines the Mach 3 microkernel and various elements of BSD, itself a Unix derivative. The dual nature of XNU means that many features of both Mach and Unix are available on Darwin, including several IPC mechanisms.

Mach ports

Mach ports are the fundamental IPC mechanism on Mach. Using them directly is hard but there are Core Foundation (CFMachPort) and Foundation (NSMachPort) wrappers available that make things slightly easier. You rarely use a Mach port directly but, assuming you have a sending and a receiving port available, you should be able to construct an NSPortMessage and send it over.

While creating a local port is just a matter of initializing a new NSMachPort instance, retrieving the remote one requires the Mach bootstrap server. On OS X this usually means having the server side of the connection registering the port for the name with the shared NSMachBootstrapServer such as:

NSMachPort *port = [NSMachPort port];
NSString *name = @"com.ddeville.myapp.myport";
[[NSMachBootstrapServer sharedInstance] registerPort:port name:name];

On the client side of the connection, one could retrieve the remote port by doing:

NSString *name = @"com.ddeville.myapp.myport";
NSMachPort *port = [[NSMachBootstrapServer sharedInstance] portForName:name];

Alternatively, one could use CFMessagePortCreateLocal that registers the service with the bootstrap server under the cover and CFMessagePortCreateRemote that uses the bootstrap server to look up a registered service by name.

Note that if the application is sandboxed, the system will not let you register service names that don’t start with the application group identifier. When creating a Login Item application, LaunchServices implicitly registers a Mach service for the login item whose name is the same as the login item’s bundle identifier. See the App Sandbox Design Guide and the iDecide sample project for more info.

Since working directly with Mach messages can be cumbersome, Distributed Objects, Distributed Notifications and Apple Events were built on top of them and simplify things a lot. I won’t discuss these here so go read Mike Ash’s post and the Distributed Objects Architecture if you want more info.

It’s also worth noting that with OS X 10.7 Lion, Apple shipped a revolutionary API built on top of Mach messages and libdispatch: XPC. As per its man page, XPC is “a structured, asynchronous interprocess communication library”. With 10.8, a new NSXPCConnection class was also released. NSXPCConnection brings back the Distributed Objects idea with an XPC flavor by letting you send messages to a remote proxy object through an XPC connection. Since the introduction of XPC, using Mach ports or one of their derivatives is essentially unnecessary. However, it’s important to keep in mind that service name bootstrapping is still required with XPC. In particular, the docs for xpc_connection_create_mach_service clearly state that “the service name must exist in a Mach bootstrap that is accessible to the process and be advertised in a launchd.plist”.

POSIX file descriptors

Being a BSD derivative, Darwin has all the Unix IPC goodies, in particular Berkeley sockets a.k.a. Unix Domain Sockets. Since these will be the core part of the article I’ll save their discussion for a later section.

Current state of IPC on iOS

As you’re probably aware, a big part of App Extensions on iOS is built around XPC. An XPC connection is set up between the hosting application and the extension in order to communicate. Most of the methods in NSExtensionRequestHandling and NSExtensionContext end up sending some message to the host application through an XPC connection.

However, XPC is also private on iOS and third-party developers cannot use it directly (yet). Similarly, Distributed Objects, Distributed Notifications and Apple Events are not available on the platform.

Regarding using Mach ports directly, while the Mach port API is indeed available on iOS, NSMachBootstrapServer isn’t, and registering a port with CFMessagePortCreateLocal will return NULL and log a sandbox violation to the console.

[Update: This was proven incorrect. See this follow-up post for more info.]

<notify.h>

OS X has had support for the <notify.h> Core OS notification mechanism since 10.3. Similarly, iOS has had support for this mechanism since its inception. <notify.h> allows processes to exchange stateless notification events. As explained in the <notify.h> header:

These routines allow processes to exchange stateless notification events. Processes post notifications to a single system-wide notification server, which then distributes notifications to client processes that have registered to receive those notifications, including processes run by other users.

Notifications are associated with names in a namespace shared by all clients of the system. Clients may post notifications for names, and may monitor names for posted notifications. Clients may request notification delivery by a number of different methods.

In a nutshell, processes can post and receive system-wide notifications with the notifyd daemon acting as the server by receiving and broadcasting notifications. Observers can be registered on a dispatch queue, a signal, a mach port or a file descriptor!

A simple example would be registering for lock notifications on iOS:

#import <notify.h>

const char *notification_name = "com.apple.springboard.lockcomplete";
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
int registration_token;
notify_register_dispatch(notification_name, &registration_token, queue, ^ (int token) {
    // do something now that the device is locked
});

Pretty straightforward.

CFNotificationCenterGetDarwinNotifyCenter

Core Foundation offers a notification center based on <notify.h>: CFNotificationCenterGetDarwinNotifyCenter. By retrieving this notification center, one can post and observe system-level notifications that originate in <notify.h>.

static void notificationCallback(CFNotificationCenterRef center, void *observer, CFStringRef name, const void *object, CFDictionaryRef userInfo)
{
    // do something now that the device is locked
}

CFStringRef notificationName = CFSTR("com.apple.springboard.lockcomplete");
CFNotificationCenterRef notificationCenter = CFNotificationCenterGetDarwinNotifyCenter();
CFNotificationCenterAddObserver(notificationCenter, NULL, notificationCallback, notificationName, NULL, CFNotificationSuspensionBehaviorDeliverImmediately);

As you can see, this is actually more code since it requires adding a callback function. So, unless you need the behavior provided by CFNotificationSuspensionBehavior, I’d say stick with <notify.h> if you want to post or receive these notifications.

It’s also worth noting that the userInfo parameter of the notification is not supported for the Darwin notify center. What this means is that even if you specify a userInfo dictionary when sending the notification from one process, it will not be available on the receiving side (Notification Center simply ignores it).

This drastically reduces the appeal of this method and essentially restricts its usage to simple stateless system-wide notifications – as it was designed for.

One use case I can think of is to reduce polling. You could for example post a notification whenever one process changes a user default so that other processes can fetch the new value without having to constantly check whether the value has changed. However, this method is racy by nature since it requires the receiver to fetch the value after the notification was received, by which time the value could have changed again or even been reverted. Ideally the new value would be sent with the notification but as we’ve seen the API prevents this from happening.

NSFilePresenter

I previously discussed NSFilePresenter and NSFileCoordinator extensively in this article. In a few words, the NSFileCoordinator class coordinates the reading and writing of files and directories among multiple processes and objects in the same process. Similarly, by adopting the NSFilePresenter protocol, an object can be notified whenever the file is changed on disk by another process.

I have used this method to implement messaging between processes on OS X and people have been attempting the same on iOS.

Unfortunately, with TN2408 Apple made it clear that using file coordination to coordinate reads and writes between an app extension and its containing app is a very bad idea:

Important: When you create a shared container for use by an app extension and its containing app in iOS 8, you are obliged to write to that container in a coordinated manner to avoid data corruption. However, you must not use file coordination APIs directly for this. If you use file coordination APIs directly to access a shared container from an extension in iOS 8.0, there are certain circumstances under which the file coordination machinery deadlocks.

I suspect file coordination is not handling the case where the app extension is killed by the system while the containing app is blocked waiting to read or write. This will likely be fixed soon but for now this means using this API is not an option.

Unix Domain Sockets

As previously stated, Darwin has its foundations in BSD. BSD being a Unix derivative, Darwin inherited many Unix goodies, one of them being Berkeley sockets, also known as Unix domain sockets or BSD sockets.

You might be familiar with the “Everything in Unix is a file” adage. Well, it turns out to be quite true: in Unix pretty much everything is a file descriptor!

As you probably remember, we previously said that an application group provided a shared container directory to multiple applications in the same group. What if we could use a file to send messages between processes without having to deal with file coordination? Well, turns out a socket is a file and sockets are a pretty good channel for sending and receiving messages. As a side note, Apple wrote a nice section in favor of Unix Domain Sockets vs Mach Messages in TN2083.

In the next section we will discuss how to build a general solution for interprocess communication on iOS based on Berkeley sockets.

IPC on iOS based on Berkeley sockets

In order to build a general solution for IPC, we need to narrow down some requirements. I have used XPC as a model so the feature set I’m aiming at will be a subset of the one offered by XPC. Without further ado, here’s the list of requirements for our solution:

  1. A communication channel
  2. A client-server connection architecture
  3. Non-blocking communication (i.e. Asynchronous I/O)
  4. Message framing for data that is sent through the channel
  5. Support for complex data to be sent through the channel
  6. Secure encoding and decoding of data on each end of the channel
  7. Error and invalidation handling

Let’s now discuss each requirement individually and let’s try to figure out a way to solve each of them.

A communication channel

By creating a stream socket of Unix local type at a given path in the application group container directory we can support multiple processes connecting and sending data to each other. By specifying a unique name for the socket we can even support multiple simultaneous connections. It is however important for the client and server to agree on the name so that they can find each other.

Creating such a socket is as simple as:

dispatch_fd_t fd = socket(AF_UNIX, SOCK_STREAM, 0);

AF_UNIX creates a Unix domain socket (vs an Internet socket for example) and SOCK_STREAM makes the socket a stream socket (other options could be datagram or raw).

Note that creating a socket doesn’t even require a path to be specified. The path is usually specified later so we’ll discuss it in the next section. For now, we can just remember that calling socket returns a file descriptor.

A client-server connection architecture

For our hosts to communicate with each other we need a particular architecture where we can make sure that multiple hosts can connect to each other. In order to achieve this, we use a client-server architecture where one host acts as the server by creating the socket on disk and waiting for connections and the other host(s) act as clients by connecting to the server.

In socket parlance, the server binds to the socket and listens for connections while the client connects. For the client to fully connect, the server has to accept the connection. Let’s see how that translates to code.

We can construct the socket path by appending a unique identifier to the path of the application group container directory that all applications in the group have read/write access to. As long as both server and clients agree on the unique identifier, we now have a channel through which they can communicate.
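One practical detail worth sketching here: sun_path is a small fixed-size buffer (104 bytes on Darwin) and the group container path can be long, so it pays to check that the joined path actually fits rather than letting it be silently truncated. The helper name make_socket_address and the example paths below are assumptions made for illustration, not part of the library.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: joins the group container directory and a socket
 * identifier into addr->sun_path.
 * Returns 0 on success, -1 if the resulting path would not fit. */
static int make_socket_address(const char *container_dir,
                               const char *identifier,
                               struct sockaddr_un *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;

    int written = snprintf(addr->sun_path, sizeof(addr->sun_path),
                           "%s/%s", container_dir, identifier);
    if (written < 0 || (size_t)written >= sizeof(addr->sun_path)) {
        /* Truncated: bind() and connect() would end up using a different path. */
        return -1;
    }
    return 0;
}
```

Both the server and the clients would build their address this way, which guarantees they agree on the path or fail early.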

First, on the server side we would start the connection by doing (error handling removed for brevity):

const char *socket_path = ...

dispatch_fd_t fd = socket(AF_UNIX, SOCK_STREAM, 0);

struct sockaddr_un addr;
memset(&addr, 0, sizeof(addr));
addr.sun_family = AF_UNIX;

unlink(socket_path);
strncpy(addr.sun_path, socket_path, sizeof(addr.sun_path) - 1);

bind(fd, (struct sockaddr *)&addr, sizeof(addr));

listen(fd, kLLBSDServerConnectionsBacklog);

Similarly, on the client side we would connect to the server with:

const char *socket_path = ...

dispatch_fd_t fd = socket(AF_UNIX, SOCK_STREAM, 0);

struct sockaddr_un addr;
memset(&addr, 0, sizeof(addr));
addr.sun_family = AF_UNIX;

strncpy(addr.sun_path, socket_path, sizeof(addr.sun_path) - 1);

connect(fd, (struct sockaddr *)&addr, sizeof(addr));

A couple of notes. On the server, we call unlink() before binding to the socket. This is to make sure that a previous socket that was not correctly cleaned up is removed before creating a new one. Also, note that connect() will fail if the server is not listening for connections. Usually, one should make sure that the server is running before attempting to connect a client (ideally, the server runs all the time, like a daemon process for example). Given that applications are short-lived on iOS, a typical scenario would involve attempting to connect the client multiple times until it succeeds.
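The retry scenario could look roughly like the sketch below. This is a simplified, blocking illustration: the function name connect_with_retry, the attempt count and the fixed delay are all assumptions of mine, and a real client would more likely schedule the retries asynchronously.

```c
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: tries to connect to the Unix domain socket at
 * socket_path up to max_attempts times, waiting delay_ms between attempts.
 * Returns a connected descriptor, or -1 if every attempt failed. */
static int connect_with_retry(const char *socket_path, int max_attempts, int delay_ms)
{
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    if (strlen(socket_path) >= sizeof(addr.sun_path)) {
        return -1; /* the path would be truncated */
    }
    strncpy(addr.sun_path, socket_path, sizeof(addr.sun_path) - 1);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) {
            return -1;
        }
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            return fd; /* the server is listening */
        }
        close(fd); /* start from a fresh socket on the next attempt */
        if (attempt + 1 < max_attempts) {
            poll(NULL, 0, delay_ms); /* poll() with no descriptors acts as a sleep */
        }
    }
    return -1;
}
```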

Note that whereas the client has already called connect() and returned, the server hasn’t yet accepted the connection. Since this could potentially involve a blocking call we will discuss this in the next section.

Since accepting a new client is really a decision that the server application should make, the connection also has a -server:shouldAcceptNewConnection: delegate method that the application can implement in order to decide whether a client should connect. It is also the perfect opportunity for the server to keep track of which clients have connected.

Since there could potentially be multiple clients connected to the server, sending a message from the server comes in two flavors:

  • Broadcasting a message. This will send the message to every connected client.
  • Sending a message to a particular client, assuming that the application knows its identity.
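At the socket level, the two flavors boil down to writing to one descriptor or to all of them. The sketch below assumes the server keeps its connected clients in a plain array of descriptors; send_all and broadcast are hypothetical names, and the sketch ignores SIGPIPE, which a real server would want to suppress for clients that have gone away.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sends the whole buffer to one client, looping because send() may accept
 * fewer bytes than requested. Returns 0 on success, -1 on error. */
static int send_all(int client_fd, const void *buf, size_t len)
{
    const char *cursor = buf;
    while (len > 0) {
        ssize_t sent = send(client_fd, cursor, len, 0);
        if (sent < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, try again */
            }
            return -1;
        }
        cursor += sent;
        len -= (size_t)sent;
    }
    return 0;
}

/* Broadcast: sends the same message to every connected client and returns
 * how many of them received it in full. */
static size_t broadcast(const int *client_fds, size_t count,
                        const void *buf, size_t len)
{
    size_t delivered = 0;
    for (size_t i = 0; i < count; i++) {
        if (send_all(client_fds[i], buf, len) == 0) {
            delivered++;
        }
    }
    return delivered;
}
```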

Non-blocking communication (Asynchronous I/O)

In order to accept the client connection, the server needs to call accept(). By default, if no pending connections are present on the queue, accept() blocks the caller until a connection is present. Since one of our requirements is non-blocking communication this is clearly not acceptable. Luckily, we can use the O_NONBLOCK flag and a dispatch source to solve the problem. Since a pending connection makes the listening socket readable, we can set up a dispatch source for DISPATCH_SOURCE_TYPE_READ:

dispatch_source_t listeningSource = dispatch_source_create(DISPATCH_SOURCE_TYPE_READ, fd, 0, NULL);
dispatch_source_set_event_handler(listeningSource, ^ {
    struct sockaddr client_addr;
    socklen_t client_addrlen = sizeof(client_addr);
    dispatch_fd_t client_fd = accept(fd, &client_addr, &client_addrlen);
});
dispatch_resume(listeningSource);
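The dispatch source snippet assumes the listening socket has already been put in non-blocking mode. A minimal sketch of that step, using fcntl(), is shown here; set_nonblocking and is_would_block are names I made up for the illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>

/* Puts a descriptor in non-blocking mode while preserving its other flags.
 * Returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* True when a failed accept(), read() or recv() simply had nothing to
 * deliver yet, as opposed to a real error. */
static int is_would_block(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK;
}
```

With the flag set, an accept() that finds no pending connection returns -1 straight away with errno set to EAGAIN or EWOULDBLOCK instead of parking the thread.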

Now that we have both server and client correctly connected, we need a way to send and receive messages. Like with any other socket, this can be achieved by means of read() and write(). Remember that these two calls are blocking by default so not acceptable for our solution. Luckily both also support non-blocking operation and dispatch IO provides a very nice API to deal with such reads and writes asynchronously.

We can first create a dispatch IO channel with the following:

dispatch_io_t channel = dispatch_io_create(DISPATCH_IO_STREAM, fd, NULL, ^ (int error) {});
dispatch_io_set_low_water(channel, 1);
dispatch_io_set_high_water(channel, SIZE_MAX);

To read on the channel asynchronously:

dispatch_io_read(channel, 0, SIZE_MAX, NULL, ^ (bool done, dispatch_data_t data, int error) {
    if (error) {
        return;
    }
    // read data
    if (done) {
        // cleanup
    }
});

And to write on the channel asynchronously:

dispatch_data_t message_data = ...
dispatch_io_write(channel, 0, message_data, NULL, ^ (bool done, dispatch_data_t data, int write_error) {
    // check for errors
});

Message framing for data that is sent through the channel

As seen in the previous section, we can now read and write data asynchronously on the connection. However, this data is raw bytes and we have no idea where it starts and ends (remember, we are using a stream socket so the data arrives in order but it also arrives broken into pieces).

We need some mechanism in which we could wrap our raw data and that would make it easy to know where it starts and ends. This is known as message framing.

We could implement a new format that has some sort of delimiters for the start and end of a message. It would also probably have some kind of a header that lets one specify the content length of the message as a whole, the encoding to expect, etc… Finally, our framing should support binary data and not simply text since we want to be able to send anything through our connection.

As you can imagine, this is a somewhat solved problem. Many message framing “formats” have been created over the years and HTTP is definitely the most well-known and maybe the most widely used. HTTP also supports binary data (one misconception about HTTP is that it only supports a text body. This is actually not true: the HTTP headers have to be text but the body itself can be any binary data). Since CFNetwork has very good support for HTTP we’ll just use it to frame our messages.

Given a data object to send through the connection, we can create an HTTP message for it.

NSData *contentData = ...
CFHTTPMessageRef response = CFHTTPMessageCreateResponse(kCFAllocatorDefault, 200, NULL, kCFHTTPVersion1_1);
CFHTTPMessageSetHeaderFieldValue(response, (__bridge CFStringRef)@"Content-Length", (__bridge CFStringRef)[NSString stringWithFormat:@"%lu", (unsigned long)[contentData length]]);
CFHTTPMessageSetBody(response, (__bridge CFDataRef)contentData);

NSData *messageData = CFBridgingRelease(CFHTTPMessageCopySerializedMessage(response));
CFRelease(response);

As you can see, this is actually pretty simple. Given a raw NSData we can create an HTTP message by setting the raw data as the body and specifying a Content-Length header of the size of the data. We can then serialize the HTTP message as an NSData instance to send through the connection.

On the other end, we can do a similar thing, by creating a new HTTP message and appending bytes as they come in through the connection. Once the number of bytes is equal to the one we’re expecting from the Content-Length header, we can get our original data by looking at the message body.

CFHTTPMessageRef framedMessage = CFHTTPMessageCreateEmpty(kCFAllocatorDefault, false);

while (... get more bytes ...) {
    NSData *data = ...
    CFHTTPMessageAppendBytes(framedMessage, data.bytes, (CFIndex)data.length);

    if (!CFHTTPMessageIsHeaderComplete(framedMessage)) {
        continue;
    }

    NSInteger contentLength = [CFBridgingRelease(CFHTTPMessageCopyHeaderFieldValue(framedMessage, CFSTR("Content-Length"))) integerValue];
    NSInteger bodyLength = (NSInteger)[CFBridgingRelease(CFHTTPMessageCopyBody(framedMessage)) length];
    if (contentLength != bodyLength) {
        continue;
    }

    NSData *rawData = CFBridgingRelease(CFHTTPMessageCopyBody(framedMessage));
}

This is indeed very similar. There’s one little gotcha to be aware of: since we are receiving the HTTP message in chunks, the header might not actually be complete after receiving the first few bytes. Luckily CFNetwork has a handy CFHTTPMessageIsHeaderComplete function that we can use to check for it. Once the header is complete, for each chunk of data that we receive we can check whether the body length matches the expected Content-Length and just retrieve the body data once it’s been fully received.
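To make the logic concrete without CFNetwork, here is a deliberately simplified C sketch of the same completeness check run over the bytes accumulated so far. It is not how CFHTTPMessage is implemented: frame_is_complete is a hypothetical function, it matches the header name case-sensitively (real HTTP does not), and it only looks at Content-Length.

```c
#include <stddef.h>
#include <string.h>

/* Inspects the len bytes received so far and reports whether they hold one
 * complete message framed as: headers, CRLF CRLF, body.
 * Returns 1 and fills body_offset/body_length when the message is complete,
 * 0 when more bytes are needed, -1 when there is no usable Content-Length. */
static int frame_is_complete(const char *buf, size_t len,
                             size_t *body_offset, size_t *body_length)
{
    static const char field[] = "Content-Length:";
    const size_t field_len = sizeof(field) - 1;
    size_t header_end = 0;

    /* Step 1: the header is complete once the blank line has arrived. */
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0) {
            header_end = i + 4;
            break;
        }
    }
    if (header_end == 0) {
        return 0;
    }

    /* Step 2: find the announced body length inside the header. */
    for (size_t i = 0; i + field_len <= header_end; i++) {
        if (memcmp(buf + i, field, field_len) != 0) {
            continue;
        }
        size_t j = i + field_len;
        while (j < header_end && buf[j] == ' ') {
            j++;
        }
        if (j >= header_end || buf[j] < '0' || buf[j] > '9') {
            return -1;
        }
        size_t announced = 0;
        while (j < header_end && buf[j] >= '0' && buf[j] <= '9') {
            announced = announced * 10 + (size_t)(buf[j] - '0');
            j++;
        }

        /* Step 3: complete once the body holds as many bytes as announced. */
        if (len - header_end < announced) {
            return 0;
        }
        *body_offset = header_end;
        *body_length = announced;
        return 1;
    }
    return -1;
}
```

The receiver would call this after appending each chunk and only hand the body over once it returns 1.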

Support for complex data to be sent through the channel

As discussed previously, one of the major drawbacks of the current solutions is the lack of support for complex data to be sent through the channel. <notify.h> for example only lets us send a message name.

Since our solution supports sending an NSData instance through the channel we can use an encoding and decoding mechanism to transform data <-> objects on each side of the connection.

Once again, luckily Foundation has very good support for this through NSKeyedArchiver and NSKeyedUnarchiver. By encoding the object graph on one end and decoding on the other end we can give the appearance that real objects are being sent through the connection. As far as the library user is concerned, a message containing Objective-C objects is being sent and Objective-C objects are similarly being received on the other side. Magical!

One requirement for this to work is that the object has to conform to the NSCoding protocol. Most of the Foundation classes already do and it’s pretty easy to implement for one’s custom classes. Obviously, when using custom classes one must make sure that such a class is available on both ends of the connection.

Secure encoding and decoding of data on each end of the channel

One concern with encoding and decoding random objects between processes is security. Not only do we have to make sure that the class is available on the client and the server for it to be decoded but we also have to make sure that the right class is being decoded and not one pretending to be.

Along with NSXPCConnection in OS X 10.8, Apple introduced NSSecureCoding. This protocol extends NSCoding and by adopting NSSecureCoding an object indicates that it handles encoding and decoding instances of itself in a manner that is robust against object substitution attacks.

As per the NSSecureCoding header:

NSSecureCoding guarantees only that an archive contains the classes it claims. It makes no guarantees about the suitability for consumption by the receiver of the decoded content of the archive. Archived objects which may trigger code evaluation should be validated independently by the consumer of the objects to verify that no malicious code is executed (i.e. by checking key paths, selectors etc. specified in the archive).

By requiring the messaged objects to conform to NSSecureCoding and by providing a whitelist of classes to the connection, we can make sure that our messages are encoded and decoded in a secure fashion.

id content = ...
NSMutableData *contentData = [NSMutableData data];

NSKeyedArchiver *archiver = [[NSKeyedArchiver alloc] initForWritingWithMutableData:contentData];
archiver.requiresSecureCoding = YES;

@try {
    [archiver encodeObject:content forKey:NSKeyedArchiveRootObjectKey];
}
@catch (NSException *exception) {
    if ([exception.name isEqualToString:NSInvalidUnarchiveOperationException]) {
        return;
    }
    @throw exception;
}
@finally {
    [archiver finishEncoding];
}

On the receiving end, decoding against the whitelist of classes looks like this:

NSSet *allowedClasses = ...
NSData *contentData = ...
id content = nil;

NSKeyedUnarchiver *unarchiver = [[NSKeyedUnarchiver alloc] initForReadingWithData:contentData];
unarchiver.requiresSecureCoding = YES;

NSSet *classes = [NSSet setWithObjects:[NSDictionary class], [NSString class], [NSNumber class], [LLBSDProcessInfo class], nil];
classes = [classes setByAddingObjectsFromSet:allowedClasses];

@try {
    content = [unarchiver decodeObjectOfClasses:classes forKey:NSKeyedArchiveRootObjectKey];
}
@catch (NSException *exception) {
    if ([exception.name isEqualToString:NSInvalidUnarchiveOperationException]) {
        return;
    }
    @throw exception;
}
@finally {
    [unarchiver finishDecoding];
}

Error and invalidation handling

In an environment involving multiple processes that could start and die at any time, we cannot assume that errors and connection invalidation will never happen.

Luckily, as stated in TN2083, since Berkeley sockets are a connection-oriented API the server automatically learns about the death of a client and it’s easy for the server to asynchronously notify the client.

We can thus easily implement an invalidation handler where the server or client can be notified whenever the connection becomes invalid.

Similarly, sending a message through the connection can take a completion handler that can pass a possible error that occurred while sending the message.
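At the socket level, the death of the peer shows up as an end of stream: a read on the connection returns 0 bytes. The sketch below peeks at the socket without blocking and without consuming data to detect this; connection_is_invalid is a hypothetical helper written for illustration. With dispatch IO I would expect the same condition to surface in the read handler rather than through an explicit check like this one.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Checks, without blocking and without consuming any data, whether the peer
 * has closed its end of a stream socket.
 * Returns 1 if the connection is invalid (peer closed or hard error),
 * 0 if it is still usable. */
static int connection_is_invalid(int fd)
{
    char byte;
    ssize_t n;
    do {
        n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
    } while (n == -1 && errno == EINTR);

    if (n == 0) {
        return 1; /* orderly shutdown: end of stream */
    }
    if (n > 0) {
        return 0; /* data is waiting, so the connection is alive */
    }
    /* Nothing to read yet is fine; any other error invalidates the connection. */
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : 1;
}
```

An invalidation handler could then be called whenever this returns 1, or whenever a read completes with no data.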

Conclusion

We were able to build a full-fledged IPC mechanism to communicate between applications and extensions within an application group on iOS. Our solution is based on powerful yet simple technologies such as Berkeley sockets, HTTP message framing and libdispatch.

You can find LLBSDMessaging on GitHub. It is built as a framework so that you can easily integrate it in an existing project.