Caching and NSURLConnection

By Daniel Pasco

05/26/2012

Shortly after the launch of the Mule Radio App we started to notice that episode lists weren’t refreshing as new show audio was posted to the site. We hadn’t seen this during development and hadn’t really had a good opportunity to test this against the production server as new episodes went live before release, so we missed this behavior before the app shipped.

I had a hard time understanding why this was happening. We’ve shipped a lot of applications using NSURLConnection, and I hadn’t personally run into this kind of thing before.

Here’s what I found out: it looks like NSURLConnection’s default cache policy may keep things around for a really, really long time if the web service you are talking to doesn’t send a Cache-Control max-age directive or an Expires header.

NSURLConnection’s Default Caching Policy

When you use NSURLConnection without specifying a cache policy to use, NSURLRequestUseProtocolCachePolicy is used as the default.

The documentation for this is as follows:

NSURLRequestUseProtocolCachePolicy Specifies that the caching logic defined in the protocol implementation, if any, is used for a particular URL load request. This is the default policy for URL load requests.

That is actually totally accurate, but it’s not really clear what they’re saying. Here’s what the document “Understanding Cache Access” says:

Cache Use Semantics for the http Protocol

The most complicated cache use situation is when a request uses the http protocol and has set the cache policy to NSURLRequestUseProtocolCachePolicy.

If an NSCachedURLResponse does not exist for the request, then the data is fetched from the originating source. If there is a cached response for the request, the URL loading system checks the response to determine if it specifies that the contents must be revalidated. If the contents must be revalidated a connection is made to the originating source to see if it has changed. If it has not changed, then the response is returned from the local cache. If it has changed, the data is fetched from the originating source.

If the cached response doesn’t specify that the contents must be revalidated, the maximum age or expiration specified in the response is examined. If the cached response is recent enough, then the response is returned from the local cache. If the response is determined to be stale, the originating source is checked for newer data. If newer data is available, the data is fetched from the originating source, otherwise it is returned from the cache.

RFC 2616, Section 13 specifies the semantics involved in detail.

So, in a nutshell, it says that the caching implementation looks at the response from the server to decide whether or not it should actually go and fetch the data again.

A field trip to the RFC mentioned revealed that what NSURLRequestUseProtocolCachePolicy actually does is look at the expiration headers returned along with the response for cues on how to handle caching. These cues come in the form of either an Expires: header or a Cache-Control: header with a max-age or s-maxage parameter.

All in all, this seems very sensible. Presumably the people that set up the server know what reasonable cache lifetimes are for their data, and this provides a way for applications to update their caching algorithms automatically, just by changing the values at the server. That’s nice.

The part that bit us

If none of Expires, Cache-Control: max-age, or Cache-Control: s-maxage (see section 14.9.3) appears in the response, and the response does not include other restrictions on caching, the cache MAY compute a freshness lifetime using a heuristic. The cache MUST attach Warning 113 to any response whose age is more than 24 hours if such warning has not already been added.

It also reads:

Also, if the response does have a Last-Modified time, the heuristic expiration value SHOULD be no more than some fraction of the interval since that time. A typical setting of this fraction might be 10%.

So, if expiration information isn’t provided by the server, the client is allowed to work out cache lifetimes on its own. Apparently NSURLConnection’s caching logic does this by picking a rather large lifetime for the downloaded data. From our experience, I’d guess that the implementation behind NSURLConnection is not using the 10% rule.
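To make the RFC's 10% suggestion concrete, here is a minimal sketch of how a cache could compute a heuristic lifetime from the Last-Modified time. This is not what NSURLConnection does internally (that isn't documented); the function name and the fixed fraction are assumptions made purely for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of Foundation: the heuristic freshness
// lifetime suggested by RFC 2616, i.e. a fraction of the interval between
// the Last-Modified time and the time the response was generated.
static NSTimeInterval HeuristicFreshnessLifetime(NSDate *lastModified, NSDate *responseDate) {
    static const double kFraction = 0.10; // the RFC's "typical setting"
    NSTimeInterval sinceModified = [responseDate timeIntervalSinceDate:lastModified];
    if (sinceModified <= 0) {
        return 0; // Last-Modified is not in the past: treat as immediately stale
    }
    return sinceModified * kFraction;
}
```

Under this rule, a resource last modified ten days before it was served would be considered fresh for roughly one day, and one modified an hour ago for about six minutes.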

The problem is that Apple’s documentation doesn’t detail what it does when no cache policy is specified by the application and no expiration details are provided by the server. It would be great if a default expiration period were listed in the documentation. At least one person has suggested that the default timeout should be 24 hours, according to the RFC. I saw a lot of things that suggested it, but nothing that struck me as hard evidence that this is the case here.

Fortunately, empirical evidence from pissed off users suggests that the expiration time is somewhere between 6 hours and a day.

Clearly, the answer in our case was to update the server to provide explicit expiration information. But that may not always be something you can influence when you’re developing an application to work with a third party service supplying time-sensitive information.

How you can detect this and work around it in your application code

So, if the default expiration time isn’t going to work for you, here’s some code that might get you started figuring out how to work around it.

    - (NSCachedURLResponse *)connection:(NSURLConnection *)connection willCacheResponse:(NSCachedURLResponse *)cachedResponse {

        // Only HTTP responses carry headers; leave anything else alone
        NSURLResponse *response = [cachedResponse response];
        if (![response isKindOfClass:[NSHTTPURLResponse class]]) {
            return cachedResponse;
        }
        NSHTTPURLResponse *httpResponse = (NSHTTPURLResponse *)response;

        // Look up the cache policy used in our request
        if ([connection currentRequest].cachePolicy == NSURLRequestUseProtocolCachePolicy) {
            NSDictionary *headers = [httpResponse allHeaderFields];
            NSString *cacheControl = [headers objectForKey:@"Cache-Control"];
            NSString *expires = [headers objectForKey:@"Expires"];
            if ((cacheControl == nil) && (expires == nil)) {
                NSLog(@"server does not provide expiration information and we are using NSURLRequestUseProtocolCachePolicy");
                return nil; // don't cache this
            }
        }
        return cachedResponse;
    }

- connection:willCacheResponse: will get called whenever NSURLConnection has downloaded valid data and is about to save it to the cache. You can get a reference to the original request, and see what cache policy was used, and you can also inspect the headers returned by the server yourself.

If the cache policy is set to NSURLRequestUseProtocolCachePolicy and expiration data has been provided, this method lets the response be cached, and the data will be downloaded again on the first request we make after the expiration period has passed. This method also allows caching under any other cache policy.

If there are no expiration-related headers in the response, and the cache policy is set to NSURLRequestUseProtocolCachePolicy, you should be able to intercede here if the default time interval is too long for your needs.

In this case, I simply return nil, which prevents the copy from being saved at all. That means every time our app tries to download the data (and it already tries at reasonable intervals), it will get a fresh copy.

This could be more robust. For instance, the Cache-Control header might be present but not contain valid expiration data. Still, it’s a starting point.
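As one possible way to tighten the check, the sketch below looks headers up without regard to case and only accepts a Cache-Control header that actually contains a numeric max-age or s-maxage value. Both helper names are hypothetical, and the rule that any non-empty Expires header counts as explicit expiration information is an assumption; adjust it to what your server really sends.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: look up a header value ignoring the case of its name.
static NSString *HeaderValue(NSDictionary *headers, NSString *name) {
    for (NSString *key in headers) {
        if ([key caseInsensitiveCompare:name] == NSOrderedSame) {
            return [headers objectForKey:key];
        }
    }
    return nil;
}

// Hypothetical helper: YES if the headers carry usable expiration information,
// meaning a numeric max-age/s-maxage directive or a non-empty Expires header.
static BOOL HasExplicitExpiration(NSDictionary *headers) {
    NSString *cacheControl = HeaderValue(headers, @"Cache-Control");
    NSCharacterSet *spaces = [NSCharacterSet whitespaceCharacterSet];
    // Messaging nil returns nil, so a missing header simply skips this loop
    for (NSString *part in [cacheControl componentsSeparatedByString:@","]) {
        NSString *directive = [[part stringByTrimmingCharactersInSet:spaces] lowercaseString];
        if ([directive hasPrefix:@"max-age="] || [directive hasPrefix:@"s-maxage="]) {
            NSUInteger valueStart = [directive rangeOfString:@"="].location + 1;
            NSScanner *scanner = [NSScanner scannerWithString:[directive substringFromIndex:valueStart]];
            NSInteger seconds = 0;
            // Accept only a complete, non-negative integer value
            if ([scanner scanInteger:&seconds] && [scanner isAtEnd] && seconds >= 0) {
                return YES;
            }
        }
    }
    // Assumption: any Expires value, even an unparseable one, is an explicit cue
    return [HeaderValue(headers, @"Expires") length] > 0;
}
```

In the delegate method, the two nil checks could then be replaced with something like `if (!HasExplicitExpiration([httpResponse allHeaderFields])) return nil;`.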

Not all cache policies are actually implemented

Although the documentation lists a number of different cache policies, some of them are not actually implemented, or do something different from what the documentation suggests.

Here’s what’s actually in NSURLRequest.h:

    enum
    {
        NSURLRequestUseProtocolCachePolicy = 0,

        NSURLRequestReloadIgnoringLocalCacheData = 1,
        NSURLRequestReloadIgnoringLocalAndRemoteCacheData = 4, // Unimplemented
        NSURLRequestReloadIgnoringCacheData = NSURLRequestReloadIgnoringLocalCacheData,

        NSURLRequestReturnCacheDataElseLoad = 2,
        NSURLRequestReturnCacheDataDontLoad = 3,

        NSURLRequestReloadRevalidatingCacheData = 5, // Unimplemented
    };
    typedef NSUInteger NSURLRequestCachePolicy;

I have opened a radar on this as well.
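If skipping the local cache entirely is acceptable for a given request, one of the implemented policies from the enum above can be set when the request is created. This is a minimal sketch: the helper name is hypothetical and the 60 second timeout is just an illustrative choice.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: build a request that always goes back to the server
// instead of consulting the local cache.
static NSURLRequest *MakeUncachedRequest(NSURL *url) {
    return [NSURLRequest requestWithURL:url
                            cachePolicy:NSURLRequestReloadIgnoringLocalCacheData
                        timeoutInterval:60.0];
}
```

A request created with plain `requestWithURL:` keeps NSURLRequestUseProtocolCachePolicy, so only requests built through a helper like this one would bypass the cache.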