Caching might be one of the two hard things in computer science, but wow, it's fun when it works! Let's dive into some cool new caching strategies!
By Kristofer Giltvedt Selbekk · December 8, 2023
Sometimes, old stuff is the coolest. In a world saturated with new technology every day, I find solace in learning something new about a technology I have been using for decades. The fact that even the building blocks of the technologies I use every day still have something new to teach me just feels good for some reason.
This year, my team and I spent three months implementing a new version of a fairly popular website in Norway, and during that process I spent a bunch of time trying to make everything as fast, responsive and available as possible. Since we were using Remix to implement the new web platform, I started looking into more of the core web fundamentals. What I learned turned out to be incredibly useful for me – and hopefully it will be for you too.
This article will teach you two (fairly) new caching strategies that helped us save money, reduce load time by a ton, and greatly simplify our code.
Before we dive into my new cool findings, I want to give a quick introduction on what caching is, and how caching works on the web.
Caching is a way to temporarily store data, so we can retrieve it quicker than we would from the original source. We do it all the time, often without noticing. Our CPUs cache the data they need within a millisecond or two, memory caches store data almost as quickly, and even our hard drives cache stuff we've accessed recently so that we don't have to look it up from the storage unit itself.
When doing web development, there are a few ways we can cache data as well. We can cache data in our JavaScript applications (hello useState), we can ask the browser to cache stuff for us, and we can use a globally distributed cache layer, called a Content Delivery Network (or CDN, for short), to cache documents and assets closer to the user than our origin servers. There are lots of other ways, of course, but these seem to be the most common ones.
When caching data on the web, we typically use something called caching headers to specify what we want to cache, and how we want to cache it. There are lots of fun cache headers to play with, but the most important one – at least for the strategies mentioned in this article – is the Cache-Control header. Headers are part of the requests and responses passed between the server and the client, providing meta information (information about information) to the receiver.
The Cache-Control header has a bunch of functionality, but here are a few examples:
Cache-Control: max-age=3600 // Cache something for 3600 seconds (1 hour)
Cache-Control: no-store // Don't cache this at all!
Cache-Control: private // Only cache in private caches (i.e. the browser)
There are lots more to learn about this header, and I highly recommend diving into the MDN documentation for more details.
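Since we were using Remix, the natural place to set this header is a route's headers export. Here's a minimal sketch of what that can look like – the route itself is hypothetical, but the headers export is how Remix lets you set response headers per route:

// app/routes/articles.$slug.tsx – a hypothetical article route
import type { HeadersFunction } from "@remix-run/node";

// Remix calls this when it builds the response for this route
export const headers: HeadersFunction = () => ({
  // Let any cache along the way keep this page for an hour
  "Cache-Control": "max-age=3600",
});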
A few of its directives in particular turned out to be great tools for reducing our response times, server load and costs.
The site we implemented had a CDN (content delivery network, remember?) between the site's servers and the end user. A CDN is a network of globally distributed machines, acting as intermediate caches for your resources.
You typically see CDNs storing assets that never change, like images, videos and hashed JavaScript files. This lets the user download these assets from a nearby server (typically within the same state or part of the country) without going to the origin server. This is very practical, since it both drastically reduces response times for the user and reduces the load on the origin servers.
Our CDN (AWS CloudFront, if you were wondering) receives every request to our domain, checks whether it has that particular resource (file path) stored, and returns it to the user without ever bothering the origin server. However, we also use it to serve files that do change, like CMS-backed HTML pages and JSON responses, which is what I want to talk about in this article.
As I mentioned earlier, the Cache-Control header accepts a bunch of different directives. One of the most powerful ones is called stale-while-revalidate. This strategy tells our cache (the CDN) to do the following: If the cached resource is still fresh, serve it straight from the cache. If it has gone stale, serve the stale copy right away anyway, and fetch an updated version from the origin server in the background, so the next visitor gets the fresh one.
The tradeoff is basically this: Some users might get stale content, but will still get it quickly.
It looks like this:
Cache-Control: max-age=3600, stale-while-revalidate=3600
What happens here is the following: For the first hour (max-age=3600), the response is considered fresh and is served directly from the cache. For the following hour (stale-while-revalidate=3600), the response is considered stale, but is still served straight from the cache while the CDN fetches an updated version in the background. After that, the cache has to go to the origin server before responding.
This is a great strategy to use when you have a bunch of consistent traffic, or if any changes to the content you're caching can wait for a few seconds. If you have consistent traffic, the cache will always be updated, since any stale request will trigger what's called a revalidation ("go check the origin server in the background"). Cool stuff!
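To make the mechanics concrete, here's a tiny in-memory sketch of the logic the CDN runs on our behalf, written in TypeScript. This is an illustration of the stale-while-revalidate flow, not CloudFront's actual implementation:

type Entry = { body: string; fetchedAt: number };

const MAX_AGE = 3_600_000; // fresh for 1 hour (in milliseconds)
const STALE_WHILE_REVALIDATE = 3_600_000; // serve stale for 1 more hour

const cache = new Map<string, Entry>();

async function fetchFromOrigin(url: string): Promise<string> {
  const response = await fetch(url);
  return response.text();
}

async function getWithSWR(url: string): Promise<string> {
  const entry = cache.get(url);
  const age = entry ? Date.now() - entry.fetchedAt : Infinity;

  // Fresh: serve straight from the cache – the origin never sees the request
  if (entry && age < MAX_AGE) return entry.body;

  // Stale, but within the revalidation window: serve the stale copy
  // right away, and refresh the cache in the background
  if (entry && age < MAX_AGE + STALE_WHILE_REVALIDATE) {
    fetchFromOrigin(url)
      .then((body) => cache.set(url, { body, fetchedAt: Date.now() }))
      .catch(() => {}); // a failed background refresh just keeps the stale copy
    return entry.body;
  }

  // Missing or too old: this visitor has to wait for the origin server
  const body = await fetchFromOrigin(url);
  cache.set(url, { body, fetchedAt: Date.now() });
  return body;
}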
Our site was based on a CMS, and any changes in the returned HTML (or JSON data) would typically be changes to a landing page, an article and so on. Sure, it would be cool if the updated page were available right away, but it doesn't really break anyone's experience to be a few seconds behind the breaking news. This seemed like a wonderful strategy for us.
The result is that, for resources that change, we have a > 95 % cache hit rate. The only pages that are retrieved from the origin server are the ones that are rarely visited. We've set the cache threshold to about 1 hour, which means you might get hour-old content if you're the first visitor to a page in a while – but if that's the case, chances are that page isn't the most frequently updated one you have anyway.
The stale-while-revalidate directive is such a useful technique in combination with CDNs that I don't think I'll ever serve a site without it again.
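For reference, wiring this up in a Remix route can look something like the sketch below. The CMS call is a made-up stand-in, but json's headers option and the loaderHeaders argument are part of Remix's API:

// app/routes/_index.tsx – a hypothetical CMS-backed landing page
import { json } from "@remix-run/node";
import type { HeadersFunction } from "@remix-run/node";

// Hypothetical stand-in for the real CMS client
async function fetchLandingPageFromCms() {
  return { title: "Landing page" };
}

export async function loader() {
  const page = await fetchLandingPageFromCms();
  return json(page, {
    headers: {
      // Fresh for an hour, then served stale for another hour
      // while the CDN revalidates in the background
      "Cache-Control": "max-age=3600, stale-while-revalidate=3600",
    },
  });
}

// Forward the loader's header to the document response
export const headers: HeadersFunction = ({ loaderHeaders }) => ({
  "Cache-Control": loaderHeaders.get("Cache-Control") ?? "",
});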
We all make mistakes from time to time. In my experience, making sure the production environment we serve is resilient to errors is the best investment you'll ever make.
Meet the stale-if-error directive. It works in a pretty similar fashion to the stale-while-revalidate directive, but with a very specific difference: If the origin server returns a 5xx server error (500 Internal Server Error, 503 Service Unavailable, etc.) for a specific resource, or if there's a network error (like a timeout), the cache returns the last successful response it has stored.
In other words, we can specify that if our website goes down for some time (it happens), the CDN returns the last working version of that website for a specified amount of time. Sure, the user won't get any new data, but at least they'll get something back. Your monitoring will hopefully trigger an alarm, and you can get your server back online without the user ever noticing.
In our case, we set this cache timeout to be 12 hours:
Cache-Control: stale-if-error=43200 // 12 hours * 60 minutes * 60 seconds
In other words – if our website goes down, we have 12 hours to get it fixed before we ever show a 500 error page to the user. Neat!
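Mirroring the earlier sketch, here's roughly what that fallback logic looks like – again an illustration of the directive's behavior, not what CloudFront actually runs:

type CachedResponse = { body: string; fetchedAt: number };

const lastGood = new Map<string, CachedResponse>();
const STALE_IF_ERROR = 43_200_000; // 12 hours in milliseconds

async function getWithStaleIfError(url: string): Promise<string> {
  try {
    const response = await fetch(url);
    // Treat 5xx responses like network failures – fall through to the cache
    if (response.status >= 500) throw new Error(`Origin returned ${response.status}`);
    const body = await response.text();
    if (response.ok) lastGood.set(url, { body, fetchedAt: Date.now() }); // only remember successes
    return body;
  } catch (error) {
    const cached = lastGood.get(url);
    const age = cached ? Date.now() - cached.fetchedAt : Infinity;
    // Serve the last successful response if it's less than 12 hours old
    if (cached && age < STALE_IF_ERROR) return cached.body;
    throw error; // nothing usable cached – the error reaches the user
  }
}

And since the directives combine, a single header can cover both strategies at once – something like Cache-Control: max-age=3600, stale-while-revalidate=3600, stale-if-error=43200.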
The previous version of our website had some intricate code that dealt with downtime in downstream systems, like our previous CMS or other internal APIs. Now, however, we could delete all that code and just use The Platform like it was supposed to work.
These two directives have worked wonders for our users, our servers and our pocketbooks. We spent a bunch of time tuning the different parameters to make them just right, and we'll probably keep doing so for the foreseeable future, as we learn more about our users' usage patterns.
Caching stuff is fun. Sure, it might be one of the two hard things in computer science, but it sure is a great tool to keep in your toolbox.
I hope this article inspired you to start experimenting with these caching directives on your own site, saving your team and users a bunch of headaches.