Cache-Control Header for Amazon S3

Or “How to set a far future Expires header in S3 to appease the YSlow gods”.

I’m working on a Ruby on Rails site that stores images and other static content on Amazon S3. We want Amazon to serve all of our images with a Cache-Control or Expires header set to a point in the very far future. This will avoid unnecessary HTTP requests on subsequent page views, making the site faster for users and consuming less bandwidth.

Amazon provides an option for specifying the Cache-Control header, but we use the AWS::S3 gem and the attachment_fu plugin for uploading our files to S3. The gem and plugin don’t provide a convenient way to set the Cache-Control header. My solution is to enhance the behavior of the store() method within the AWS::S3 gem so that it always specifies a Cache-Control max-age of 10 years if another value is not specified. Here’s my patch, which I placed in a file called s3_cache_control.rb in the lib directory of my Rails project:

module AWS
  module S3
    class S3Object
      class << self
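        # Wraps store() so every upload gets a far-future Cache-Control
        # header unless the caller already supplied one.
        def store_with_cache_control(key, data, bucket = nil, options = {})
          if options['Cache-Control'].blank?
            options['Cache-Control'] = 'max-age=315360000' # 10 years, in seconds
          end
          store_without_cache_control(key, data, bucket, options)
        end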

        alias_method_chain :store, :cache_control
      end
    end
  end
end
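
If alias_method_chain is unfamiliar, here is a minimal sketch of what that one line does, written with plain alias_method so it runs without Rails or the gem. FakeS3Object is a hypothetical stand-in for AWS::S3::S3Object, and its store() just returns the options it was given instead of uploading anything.

```ruby
# Hypothetical stand-in for AWS::S3::S3Object (not part of the gem).
class FakeS3Object
  class << self
    # Pretend upload: return the headers that would have been sent.
    def store(key, data, bucket = nil, options = {})
      options
    end

    # Same idea as the patch: fill in a default only when none was given.
    def store_with_cache_control(key, data, bucket = nil, options = {})
      if options['Cache-Control'].to_s.empty?
        options['Cache-Control'] = 'max-age=315360000'
      end
      store_without_cache_control(key, data, bucket, options)
    end

    # Roughly what alias_method_chain :store, :cache_control does by hand:
    alias_method :store_without_cache_control, :store # keep the original
    alias_method :store, :store_with_cache_control    # swap in the wrapper
  end
end
```

A call to FakeS3Object.store('photo.jpg', 'data') now goes through the wrapper first, which adds the default header and then hands off to the original method under its new name. A caller that passes its own 'Cache-Control' value keeps it.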

In my config/environment.rb file, I added the following lines to load my patch:

require 'aws/s3'
require 's3_cache_control'

Restart your server, and from now on, anything stored to S3 via the AWS::S3 gem will automatically get a Cache-Control header with max-age set to 10 years. Rockin’ tacos.
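
If the magic number looks arbitrary, it is just ten 365-day years expressed in seconds. The sketch below shows the arithmetic and, as an aside, how a matching Expires value could be formatted. The method names are made up for illustration and are not part of the gem.

```ruby
require 'time' # adds Time#httpdate

# Ten 365-day years in seconds (leap days ignored).
def ten_years_in_seconds
  10 * 365 * 24 * 60 * 60
end

# Builds both header values relative to the given time.
def far_future_headers(now = Time.now)
  {
    'Cache-Control' => "max-age=#{ten_years_in_seconds}",
    'Expires'       => (now + ten_years_in_seconds).httpdate # HTTP date in GMT
  }
end

far_future_headers['Cache-Control'] # => "max-age=315360000"
```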

But what about all those existing images our users have already uploaded? Those need to be updated too, so I added a class method to my Photo model which iterates through all photos and sets the Cache-Control header on each one. Here’s the method:

def self.set_cache_control
  photos = Photo.find(:all)
  photos.each do |photo|
    begin
      s3_object = AWS::S3::S3Object.find(photo.full_filename,
        'your_bucket_name')
      s3_object.cache_control = 'max-age=315360000'
      s3_object.save({:access => :public_read})
    rescue StandardError => e
      logger.error("Unable to update photo with key " +
        "#{photo.full_filename}: #{e}")
    end
  end
end

You can run the update using script/runner:

$ RAILS_ENV=production ./script/runner Photo.set_cache_control

The set_cache_control() method assumes you have a full_filename() method on your Photo class that provides the S3 key. You’ll already have the full_filename() method if you’re using attachment_fu. You’ll also need to replace your_bucket_name with your Amazon S3 bucket name in the code above.
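
To confirm the update took, one option is to send a HEAD request to a photo’s public URL and look at the header that comes back. This is a rough sketch using Ruby’s standard Net::HTTP. The helper names and the example URL are hypothetical, and it assumes the object is publicly readable.

```ruby
require 'net/http'
require 'uri'

# Fetches only the response headers for a URL and returns the
# Cache-Control value, or nil if the header is absent.
def cache_control_for(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             :use_ssl => uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  response['Cache-Control']
end

# Pulls the max-age value, in seconds, out of a Cache-Control string.
# Returns nil when there is no max-age directive.
def max_age_from(cache_control)
  match = cache_control.to_s.match(/max-age=(\d+)/)
  match && match[1].to_i
end

# Hypothetical usage (bucket name and key are placeholders):
# header = cache_control_for('http://your_bucket_name.s3.amazonaws.com/photos/1/example.jpg')
# max_age_from(header)
```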

Now you can sing Cache-Control to Major Tom like I’ve been doing all afternoon. In my head. I’ve only been singing it in my head. Mostly.


Comments

7 responses to “Cache-Control Header for Amazon S3”

  1. Adrian B. Danieli

    Nice. I wanted to add “Expires” as well (I know, old school) so here’s a tweaked version that does it. This version forces the headers, and uses symbolized keys.

    require 'aws/s3'
    
    # Adds expiration headers to all stored S3 objects through duck-punching.
    # Based on Keaka Jackson's original work.
    #
    module AWS::S3
      class S3Object
        class << self
          MAX_AGE = 8.years
          def store_with_cache_control(key, data, bucket = nil, options = {})
            options[:cache_control] = "max-age=#{MAX_AGE.to_i}"
            options[:expires]       = MAX_AGE.from_now.httpdate
            store_without_cache_control(key, data, bucket, options)
          end
          alias_method_chain :store, :cache_control
        end
      end
    end
    
    # To update existing photos, run this in production script/console:
    #
    # Photo.find(:all).each do |p|
    #   s3 = AWS::S3::S3Object.find(p.full_filename, 'bucketname')
    #   s3.save(:access => :public_read) unless s3.cache_control && s3.expires
    # end
    
  2. […] Cache-Control Header for Amazon S3: Amazon provides an option for specifying the Cache-Control header, but we use the AWS::S3 gem and the attachment_fu plugin for uploading our files to S3. The gem and plugin don’t provide a convenient way to set the Cache-Control header. My solution is to enhance the behavior of the store() method within the AWS::S3 gem so that it always specifies a Cache-Control header of 10 years if another value is not specified. […]

  3. Joe Martinez

    What do you do about gzipping your s3 assets, or do you only use s3 for images?

  4. Keaka Jackson

    I mainly use S3 for images, so I haven’t had to deal with serving gzipped assets yet.

  5. Travis

    Do you happen to know the procedure and in what API method the header is ultimately being set so that when the object is put to S3 it has that header stored? I am using a PHP class which works well but I can’t figure out how to get my own custom headers (such as cache-control) to be included with the file…

  6. Keaka Jackson

    It’s been a while, but I believe you include a regular HTTP header in your HTTP PUT request. So wherever your PHP class generates the actual request to Amazon, you’d add a Cache-Control header to the request.

    This “Getting Started Guide” has a “Writing an Object” section with PHP sample code you might be able to use:
    http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/

  7. […] technique we use for setting the Cache-Control header was originally posted in this guide. We’ll enhance the AWS::S3 store method to add a Cache-Control header that’s 10 years […]