Friday, February 18, 2011

Bulletproof Rails Asset Caching

We've been using Rails at Wealthfront for nearly two years and we love it, but there's no denying that, out of the box, Rails asset caching is broken. Below we'll identify the problems, define requirements for a successful strategy, and then describe how we meet those requirements at Wealthfront.

Background

For the uninitiated, the term "asset" refers to a resource referenced by a web page, such as an image, a script, or a stylesheet. Since assets change relatively infrequently, it's common to serve them with a cache expiration date far in the future, so browsers will store and reuse them rather than request them repeatedly.¹

For this to work, however, you must be able to "bust the cache" and force browsers to pick up the latest version whenever you modify an asset. Browser caches are keyed on URL, so modified assets need new URLs. Rails achieves this by including each asset's last-modified timestamp in its query string. For example, image_tag("rails.png") produces:
<img src="/images/rails.png?1230601161" ...>
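To make the origin of that number concrete, here is a minimal sketch that appends a file's modification time, in Unix seconds, to its public path. The helper name timestamped_uri and the throwaway directory are assumptions made for this example; it illustrates the idea and is not Rails's actual implementation.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not Rails's own code): append the file's mtime, in
# seconds since the epoch, as a cache-busting query string.
def timestamped_uri(public_dir, source)
  path = File.join(public_dir, source)
  return source unless File.exist?(path) # no file, so no timestamp to add
  "#{source}?#{File.mtime(path).to_i}"
end

# Demo in a temporary directory standing in for public/.
Dir.mktmpdir do |public_dir|
  png = File.join(public_dir, 'images', 'rails.png')
  FileUtils.mkdir_p(File.dirname(png))
  FileUtils.touch(png)
  File.utime(Time.at(1230601161), Time.at(1230601161), png)

  puts timestamped_uri(public_dir, '/images/rails.png')
  # /images/rails.png?1230601161
end
```

Touching the file again would change the number, and with it the URL that browsers use as their cache key.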

What's Wrong with Cache-Busting Query Strings

So where's the problem? It's rooted in the fact that common web servers ignore query strings when serving static files.² Web servers don't know to check that an asset's last-modified timestamp matches the request's query string, and even if they did, they couldn't respond successfully in the event of a mismatch, since they only have one version of each asset at any given time.

Asset mix-ups come in two varieties:
  1. A new asset is served in response to a request for an old asset. This can happen, for example, if a new version of an asset is deployed to an asset server before it's deployed to all Rails servers. It can even happen on a single Rails server that serves its own assets due to the delay between when it formats an asset URI and when it receives and handles the request for that asset. Getting the wrong version of an asset can seriously break a page. Fortunately, reloading the page a short time later should fix this kind of mix-up, since the new page will no longer include any old asset URIs.
  2. An old asset is served in response to a request for a new asset. If you have multiple servers, either for redundancy or to offload the responsibility of serving assets, there will be a brief period during deployment when your servers' assets are out of sync. If one Rails server gets a new version of an asset before all asset servers, it will begin formatting URIs for that asset with the new timestamp, and a request for that new asset might be handled by a server that has the old version. This is worse than case 1 because a simple page reload doesn't fix it. Unless the affected user clears his browser cache, he will continue to see a broken page until the asset expires, is modified again, or becomes obsolete. It gets worse: if your assets are publicly cacheable (a best practice), proxies can further compound the problem, repeating one bad asset response to many users, poisoning all of their browser caches.³
These two cases both arise from race conditions during deployment. This means they will happen more often as your traffic increases and as you progress toward continuous deployment. The only solution is to improve upon Rails's default cache-busting strategy. Before diving into Wealthfront's approach, let's first define precisely what characteristics a bulletproof asset caching strategy must have.

Properties of a Correct Asset Caching Strategy

There are many ways to serve assets, from Apache to CDNs, and all of them can work. For correctness, just be sure that your strategy has these properties:
  1. Either:
    1. Asset servers can serve multiple versions of the same asset, or
    2. You manually version your assets (i.e. you rename every time you edit).
  2. New assets are fully deployed to asset servers (alongside old assets) before any server begins serving pages that reference them. This implies a two-step deployment process: assets first, then code.
  3. Asset servers keep old assets until no servers are still serving pages that reference them and any cacheable pages that reference them have expired.
Note: This prescription is not specific to Rails.

Properties of an Optimal Asset Caching Strategy

When aiming for correctness, it's easy to sacrifice optimality. For example, the Rails AssetTagHelper doc provides two alternatives to query string timestamps (involving RAILS_ASSET_ID and RELEASE_NUMBER), neither of which you'd ever want to use if you deploy more than once a month, because a release-wide identifier changes every asset's URI on every release, whether or not the asset itself changed. The condition for optimality is simple:
  1. An asset's URI should change only when its content changes.
Last-modified timestamps are virtually optimal because it's rare to revert an asset back to an earlier version. If you choose timestamps, just be sure your deployment process preserves them.⁴ Digests (e.g. MD5, SHA-1) are also a reasonable choice—optimal with a minuscule risk of incorrectness.
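For comparison, a digest-based name might be computed as in the sketch below. The helper name digest_versioned_name and the choice of MD5 are assumptions for illustration; we use timestamps ourselves, so this is not our production code.

```ruby
require 'digest/md5'
require 'tmpdir'

# Hypothetical helper: put a content digest before the extension, e.g.
# base.css -> base.<32 hex chars>.css. The name depends only on the bytes.
def digest_versioned_name(path)
  digest = Digest::MD5.file(path).hexdigest
  ext = File.extname(path) # ".css"
  "#{File.basename(path, ext)}.#{digest}#{ext}"
end

Dir.mktmpdir do |dir|
  css = File.join(dir, 'base.css')
  File.write(css, "body { color: black; }\n")
  original = digest_versioned_name(css)

  File.utime(Time.at(0), Time.at(0), css)     # new timestamp, same bytes
  puts digest_versioned_name(css) == original # true

  File.write(css, "body { color: navy; }\n")  # new bytes
  puts digest_versioned_name(css) == original # false
end
```

Unlike a timestamp, the digest survives a deployment process that fails to preserve modification times, at the cost of reading every asset.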

Wealthfront's Asset Caching Strategy

So how do we satisfy all of these constraints at Wealthfront? Glad you asked! It's a four-pronged approach:
  1. Renaming assets to include their version
  2. Coercing Rails to format asset URIs our way
  3. Fixing stylesheets before deployment
  4. Guaranteeing asset availability

Renaming assets to include their version

We add timestamps to our asset filenames just before packaging them up for deployment to our asset servers. This way we can serve them using any dumb file server, and multiple versions of the same asset can sit side-by-side in the file system. Here's our dead-simple renaming script:
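#!/bin/sh
# Insert each file's mtime (Unix seconds) before its extension, e.g.
# public/stylesheets/base.css -> public/stylesheets/base.1292435738.css
# Assumes GNU ls and sed, and asset paths without whitespace.
for f in $(find public -type f)
do
  ts=$(ls -o --time-style=+%s "$f" | cut -d' ' -f5)
  mv "$f" "$(echo -n "$f" | sed "s/[a-z0-9]*$/$ts.\0/")"
done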
It puts the timestamp right before the file extension (e.g. base.css -> base.1292435738.css). Note that this timestamp format is Unix time (seconds since the epoch), precisely what Rails uses. It's agnostic to the system time zone: one less thing to worry about.
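If you'd rather not depend on GNU ls and sed, the same renaming can be sketched in Ruby. The method name version_assets! is an assumption for this example; it applies the same trailing [a-z0-9]* substitution as the shell script and is a rough port, not the script we run.

```ruby
require 'fileutils'
require 'find'
require 'tmpdir'

# Rename every file under root so that its mtime (Unix seconds) sits just
# before the extension. Returns the new paths.
def version_assets!(root)
  files = Find.find(root).select { |path| File.file?(path) } # list first, then rename
  files.map do |path|
    versioned = path.sub(/[a-z0-9]*$/, "#{File.mtime(path).to_i}.\\0")
    FileUtils.mv(path, versioned)
    versioned
  end
end

Dir.mktmpdir do |root|
  css = File.join(root, 'stylesheets', 'base.css')
  FileUtils.mkdir_p(File.dirname(css))
  File.write(css, "body {}\n")
  File.utime(Time.at(1292435738), Time.at(1292435738), css)

  puts version_assets!(root).map { |path| path.sub(root, '') }
  # /stylesheets/base.1292435738.css
end
```

Collecting the file list before renaming keeps the traversal from tripping over files it has already moved.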

Coercing Rails to format asset URIs our way

We'd love to specify our URI format using config.action_controller.asset_path_template in production.rb. Unfortunately, it doesn't have access to the asset timestamp, so we monkey patch. Here's our config/initializers/assets.rb:
require 'action_view/helpers/asset_tag_helper'

module ActionView
  module Helpers
    module AssetTagHelper
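      # Replaces the default behavior (asset ID appended as a query string)
      # with an asset ID embedded in the filename, just before the extension.
      def rewrite_asset_path(source, path = nil)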
        asset_id = rails_asset_id(source)
        if asset_id.blank?
          source
        else  # foo.png -> foo.1252928347.png
          source.sub(/[a-z0-9]*$/, asset_id + '.\0')
        end
      end
    end
  end
end if Rails.env.production?
Note that this patch only affects the production environment. We develop with timestamps in query strings, like everyone else.
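The substitution at the heart of the patch can be exercised without Rails. The sketch below pulls it into a standalone method (the name versioned_source is an assumption) and runs it against a few paths, including one case where the pattern does not behave the way you might hope.

```ruby
# The same String#sub call as in the patch, isolated so it runs anywhere.
def versioned_source(source, asset_id)
  return source if asset_id.nil? || asset_id.empty?
  source.sub(/[a-z0-9]*$/, asset_id + '.\0')
end

puts versioned_source('/images/foo.png', '1252928347')
# /images/foo.1252928347.png
puts versioned_source('/javascripts/jquery.min.js', '1252928347')
# /javascripts/jquery.min.1252928347.js
puts versioned_source('/images/LOGO.PNG', '1252928347')
# /images/LOGO.PNG1252928347.
```

The last case shows a limitation: the pattern only matches lowercase letters and digits, so an uppercase extension gets the timestamp appended after it. Keeping extensions lowercase avoids the problem.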

Fixing stylesheets before deployment

In a standard Rails setup, background images in stylesheets are requested without versioned URIs, so for correctness, you must 1) never edit images referenced by stylesheets and 2) keep them around for at least one release after they're no longer referenced. Or you can do what we do: version the background image URIs in your stylesheets before you deploy them. Here's our script that inserts timestamps into the filenames of the background image URIs in our stylesheets. It writes the resulting files to a separate directory, given as its first command-line argument ($1):
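#!/bin/bash
# Usage: pass the output directory as the first argument ($1).
# Assumes GNU ls, sed, and touch, at most one url(...) per line, and
# unquoted, root-relative image paths such as url(/images/bg.png).
mkdir -p "$1"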
for f in $(ls public/stylesheets)
do
  awk '
    BEGIN { OFS="" }
    /^(.*)url\(([^\)]*)(.*)/ {
      split($0, a, /(url\(|\))/);
      cmd = "ls -o --time-style=+%s public" a[2];
      cmd | getline ls_out;
      close(cmd)
      split(ls_out, file_info)
      cmd = "echo -n " a[2] " | sed -e \"s/[a-z0-9]*$/" file_info[5] ".\\0/\""
      cmd | getline file_name
      close(cmd)
      print a[1], "url(", file_name, ")", a[3];
      next
     }
     { print $0 }' public/stylesheets/$f > $1/$f
  # stylesheet needs to get the max timestamp of itself and all referenced images
  ts1=$(grep -o '\b1[0-9]\{9\}\b' $1/$f | sort -r | head -1)
  ts2=$(ls -o --time-style=+%s public/stylesheets/$f | cut -d' ' -f5)
  ts=$(echo -e "$ts1\n$ts2" | sort -r | head -1)
  touch -d @$ts $1/$f
done
The last four lines may require a brief explanation. If any images referenced by a stylesheet have a later timestamp than the stylesheet, these lines update the stylesheet's timestamp to match the newest referenced image. This is necessary because a change to a referenced image is also a change to the stylesheet (since our production stylesheets contain versioned image URIs). This script may feel messy, but it's an essential piece of our bulletproof asset caching strategy. It allows us to modify any image freely, without remembering to rename.
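The timestamp logic is easier to follow in a single-language sketch. The Ruby below is an illustrative rewrite under stated assumptions, not the script we run: the method name version_stylesheet is made up, url(...) values must be unquoted and root-relative, and every referenced image must exist under public_dir.

```ruby
require 'fileutils'
require 'tmpdir'

# Rewrites url(...) references so each image filename carries its mtime,
# writes the result to out_dir, and stamps the output with the newest mtime
# among the stylesheet and the images it references.
def version_stylesheet(css_path, public_dir, out_dir)
  newest = File.mtime(css_path).to_i
  css = File.read(css_path).gsub(/url\(([^)]*)\)/) do
    image = Regexp.last_match(1)
    ts = File.mtime(File.join(public_dir, image)).to_i
    newest = [newest, ts].max
    versioned = image.sub(/[a-z0-9]*$/, "#{ts}.\\0")
    "url(#{versioned})"
  end
  FileUtils.mkdir_p(out_dir)
  out_path = File.join(out_dir, File.basename(css_path))
  File.write(out_path, css)
  File.utime(Time.at(newest), Time.at(newest), out_path)
  out_path
end

Dir.mktmpdir do |root|
  public_dir = File.join(root, 'public')
  image = File.join(public_dir, 'images', 'bg.png')
  css = File.join(public_dir, 'stylesheets', 'base.css')
  FileUtils.mkdir_p([File.dirname(image), File.dirname(css)])
  File.write(image, 'fake image bytes')
  File.write(css, "body { background: url(/images/bg.png); }\n")
  File.utime(Time.at(1292435999), Time.at(1292435999), image) # image is newer
  File.utime(Time.at(1292435738), Time.at(1292435738), css)

  out = version_stylesheet(css, public_dir, File.join(root, 'release'))
  puts File.read(out)       # body { background: url(/images/bg.1292435999.png); }
  puts File.mtime(out).to_i # 1292435999
end
```

Because the image is newer than the stylesheet in this demo, the output stylesheet inherits the image's timestamp, which is what forces a new stylesheet URI when only the image changed.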

Guaranteeing asset availability

This last part's simple. To satisfy correctness conditions 2 & 3, we push new assets to our asset servers before pushing code that references them, and we keep old versions of assets around for a while. Partly to help achieve these goals, we've outsourced the responsibility of serving our assets from our Rails servers to a couple of Nginx servers.⁵ We push new assets to them without downtime and without deleting old assets. Never deleting old assets from asset servers is a perfectly acceptable policy, but to free disk space, we installed a cron job on our asset servers that occasionally looks for assets with multiple versions and deletes all but the latest three.
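We haven't published our cleanup job, so the following is only a sketch of the selection logic such a job might use. The method name stale_versions, the single flat directory, and the assumption that versions are 10-digit Unix timestamps are all ours for this example. It only lists candidates; deleting them is left to the caller.

```ruby
require 'tmpdir'

# For each asset in dir, return the paths of every version except the `keep`
# newest. Versioned names look like name.<10-digit timestamp>.ext.
def stale_versions(dir, keep = 3)
  pattern = /\A(.+)\.(\d{10})\.([a-z0-9]+)\z/
  matches = Dir.children(dir).map { |name| pattern.match(name) }.compact
  matches.group_by { |m| [m[1], m[3]] }.flat_map do |_asset, versions|
    versions.sort_by { |m| -m[2].to_i }  # newest first
            .drop(keep)
            .map { |m| File.join(dir, m[0]) }
  end
end

Dir.mktmpdir do |dir|
  (1..5).each { |i| File.write(File.join(dir, "base.129243573#{i}.css"), '') }
  File.write(File.join(dir, 'logo.1292435738.png'), '')

  puts stale_versions(dir).map { |path| File.basename(path) }.sort
  # base.1292435731.css
  # base.1292435732.css
end
```

Whatever retention count you pick, it has to leave old assets in place long enough to satisfy the third correctness condition above.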

Parting Tips

If you decide to try our approach, be sure that:
  • Your Rails server(s) end up with non-versioned assets (no timestamps in their filenames).
  • Your asset server(s) end up with the versioned assets (timestamps in their filenames).
    Note: If your Rails servers are your asset servers, it's okay for the versioned and non-versioned assets to sit side-by-side in the same directory tree.
  • If you use the Rails :cache => '...' option to concatenate stylesheets or scripts, be sure to generate the concatenated files and give them the max timestamp of their constituent source files when preparing a release for deployment.
Curious how this looks in practice? Go view source on our home page and check out our asset URIs. We're not shy.
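For the last tip above, about concatenated files, the release-time step might look something like this sketch. The method name concatenate_with_max_mtime and the newline separator are assumptions; adapt them to however your files are actually packaged.

```ruby
require 'tmpdir'

# Concatenate sources into target and give target the newest source mtime,
# so its versioned name changes whenever any constituent file changes.
def concatenate_with_max_mtime(sources, target)
  File.write(target, sources.map { |source| File.read(source) }.join("\n"))
  newest = sources.map { |source| File.mtime(source) }.max
  File.utime(newest, newest, target)
  newest.to_i
end

Dir.mktmpdir do |dir|
  reset = File.join(dir, 'reset.css')
  base = File.join(dir, 'base.css')
  File.write(reset, "* { margin: 0; }\n")
  File.write(base, "body { color: black; }\n")
  File.utime(Time.at(1292435738), Time.at(1292435738), reset)
  File.utime(Time.at(1292436000), Time.at(1292436000), base)

  puts concatenate_with_max_mtime([reset, base], File.join(dir, 'all.css'))
  # 1292436000
end
```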

Additional Resources

While you've got assets on the brain, check out:
  • asset_fingerprint: a gem that renames assets as we've suggested and also provides you the option of using digests instead of timestamps. Just be sure to ignore Eliot's two suggestions for getting your asset server to respond correctly to versioned asset requests. You know better.
  • jammit: a gem that does just about everything else you might want to do with assets: pre-packaging, minification, compression, image embedding, font embedding, and more.

Footnotes

  1. Google has published a thorough caching guide.
  2. Ironic, isn't it? The fact that web servers ignore query strings when serving static files is precisely what motivated Rails's default cache-busting strategy.
  3. Steve Souders, among others, has reported that some old proxies incorrectly discard query strings entirely when storing and reading from their caches. This dramatically increases the likelihood of Rails asset mix-ups for the unlucky users behind them.
  4. Contrary to what the Rails AssetTagHelper doc states, preserving asset timestamps doesn't require the system clocks of all asset servers to be synchronized. cp -p, touch -r, and Subversion's use-commit-times may come in handy.
  5. We plan to move our assets to Amazon's CloudFront soon, now that it supports SSL. We also plan to reintroduce multiple asset hosts and put our assets on a cookieless domain. So get off our back.