Browser Caching: A Practical Introduction

Before we start, let me remind you what a cache is. It is a software component that stores data locally so that future requests for that data can be served faster. A cache reduces access time by keeping frequently used data in a location that is easier to reach than the original data source.

Browser caching helps any website or application provide a faster, more seamless user experience. It is especially important for solutions that handle significant amounts of data (e.g., a website with lots of pictures).

It is also a great help for users of handheld devices, as it saves mobile traffic and compensates for a possibly slow connection.

First, let us have a look at the basic practices of caching technology:

HTTP Cache Headers

The HTTP protocol gives us a powerful tool for caching data: HTTP cache headers. They allow us to cache data that is exchanged between the browser and the server. You can find exhaustive information about HTTP caching on MDN, but here we look at some practical recipes.

Web apps have two types of content: dynamically generated content (like web pages) and static resources (JavaScript, CSS, fonts, images). To cache static resources, we can use the cache busting technique.

In this case, we cache a resource for a long period of time (in some instances, it can be considered indefinite). This is safe as long as the static resource has a version in its URL, because a changed file gets a new URL and the browser fetches it afresh:

<script src="main.v98.js"></script>

To cache such resources, the server should send a response header like this (31536000 seconds is one year):

Cache-Control: public, max-age=31536000
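As a rough illustration of how a server might pick this header, here is a minimal sketch. It assumes that versioned files follow the name.v<number>.ext pattern from the example above. The names isVersioned and cacheControlFor are hypothetical, and falling back to no-cache (which makes the browser revalidate before reusing a response) is an assumed default rather than a rule.

```javascript
// One year in seconds, matching max-age=31536000 from the header above.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Hypothetical check: does the file name carry a version such as ".v98."?
function isVersioned(path) {
  return /\.v\d+\.[a-z0-9]+$/i.test(path);
}

// Hypothetical helper: choose a Cache-Control value for a request path.
function cacheControlFor(path) {
  return isVersioned(path)
    ? `public, max-age=${ONE_YEAR_SECONDS}`
    : 'no-cache'; // assumed default: revalidate before every reuse
}

console.log(cacheControlFor('/main.v98.js')); // public, max-age=31536000
console.log(cacheControlFor('/index.html'));  // no-cache
```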

This technique cannot be applied to dynamic resources because they can change at any time, so we must validate them on each request. However, upon validation, the server can return an HTTP 304 (‘Not Modified’) if the content has remained unchanged since the previous request.

That means the browser can take the content from its cache, which saves the time needed to transfer tens or hundreds of kilobytes. For this purpose, it is convenient to use the ETag header, which typically carries a hash of the dynamic content (for instance, a web page).

The server compares the ETag that comes from the client (sent in the If-None-Match request header) with the hash of the requested page and tells the browser if it can use the cached version.
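The validation step can be sketched on the server side roughly as follows. This is only an illustration under several assumptions: hashContent is a small stand-in hash (32-bit FNV-1a) chosen to keep the example dependency-free, respond is a hypothetical handler, and the comparison covers a single exact ETag value rather than every form the If-None-Match header can take.

```javascript
// Stand-in hash (32-bit FNV-1a), used only to keep the sketch dependency-free.
// A real server would more likely use a cryptographic digest.
function hashContent(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical handler: "body" is the freshly rendered page, "ifNoneMatch"
// is the value of the request's If-None-Match header (or undefined).
function respond(body, ifNoneMatch) {
  const etag = `"${hashContent(body)}"`; // ETag values are quoted strings
  if (ifNoneMatch === etag) {
    // Content unchanged: send headers only, the browser reuses its cached copy.
    return { status: 304, headers: { ETag: etag }, body: null };
  }
  return {
    status: 200,
    headers: { ETag: etag, 'Cache-Control': 'no-cache' },
    body,
  };
}

const first = respond('<h1>Hello</h1>');
const second = respond('<h1>Hello</h1>', first.headers.ETag);
const third = respond('<h1>Hello again</h1>', first.headers.ETag);
console.log(first.status, second.status, third.status); // 200 304 200
```

Note that the page is still rendered on the server in order to compute the hash; the saving is in the response body that no longer has to travel over the network.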

Cache API

Another way of browser caching is the Cache API. Together with a service worker, it gives us the experience of a standalone (offline) application: once a web page is downloaded, it can be used in offline mode.

This is possible because the Cache API provides a mechanism for caching HTTP requests. However, unlike HTTP cache headers, we control it in the browser by means of JavaScript.
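To make this more concrete, here is a minimal sketch of a cache-first strategy, one of several possible ways to combine a service worker with the Cache API. The function name cacheFirst and the cache name pages-v1 are illustrative. The cache and the fetch function are passed in as parameters only so that the logic can be exercised outside a service worker.

```javascript
// Cache-first lookup. "cache" is an opened Cache object (from caches.open),
// "fetchFn" defaults to the global fetch.
async function cacheFirst(request, cache, fetchFn = fetch) {
  const cached = await cache.match(request);
  if (cached) {
    return cached; // served from the cache, no network round trip
  }
  const response = await fetchFn(request);
  if (response.ok) {
    // A Response body can be read only once, so a clone goes into the cache.
    await cache.put(request, response.clone());
  }
  return response;
}

// Assumed wiring inside a service worker file.
if (typeof self !== 'undefined' && typeof caches !== 'undefined') {
  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') {
      return; // the Cache API stores GET requests only
    }
    event.respondWith(
      caches.open('pages-v1').then((cache) => cacheFirst(event.request, cache))
    );
  });
}
```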

Also, when you use cache headers, the browser is the one to decide when to evict a cached entry, and you have no control over it.

Using the Cache API, on the contrary, means that evicting old entries is solely your own responsibility. Even though the Cache API serves the purpose of caching HTTP requests, it can also be used for storing data. Here is a simple example of how it can be done:

// Note: await at the top level works only inside an ES module or an async function

// Open cache store
const cache = await caches.open('test');
const data = {value: 'Hello world'};

// Put data to cache
await cache.put('some_data', new Response(JSON.stringify(data)));

// Get data from cache
const response = await cache.match('some_data');
const json = await response.json();
console.log(json); // {value: 'Hello world'}
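Since eviction is left to the application, a cache that only grows will eventually need trimming. The sketch below shows one simple possibility: a hypothetical trimCache helper that keeps at most maxEntries items and deletes the oldest ones first. It relies on cache.keys() returning requests in insertion order, so it is a first-in, first-out policy rather than one of the smarter algorithms discussed later.

```javascript
// Hypothetical helper: keep at most maxEntries items in a Cache,
// deleting the oldest entries first.
async function trimCache(cache, maxEntries) {
  const keys = await cache.keys(); // requests in insertion order
  const excess = keys.length - maxEntries;
  for (let i = 0; i < excess; i++) {
    await cache.delete(keys[i]);
  }
  return Math.max(excess, 0); // number of entries removed
}

// Assumed usage in a page or a service worker (the limit of 50 is arbitrary):
// const cache = await caches.open('test');
// await trimCache(cache, 50);
```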

To check how much data you can store, you can use navigator.storage.estimate(). It returns a promise that resolves to an object with the following properties:

quota - total storage available to the origin, in bytes

usage - how many bytes are currently in use
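A short sketch of how these two numbers might be used is given below. The helper names percentUsed and logStorageUsage are hypothetical. The wrapper simply returns null when navigator.storage is unavailable, which is an assumed way of handling older browsers.

```javascript
// Pure helper: turn an estimate object into the percentage of the quota in use.
function percentUsed({ quota, usage }) {
  return quota > 0 ? (usage / quota) * 100 : 0;
}

// Browser-only wrapper around navigator.storage.estimate().
async function logStorageUsage() {
  if (typeof navigator === 'undefined' || !navigator.storage?.estimate) {
    return null; // Storage API not available in this environment
  }
  const estimate = await navigator.storage.estimate();
  const percent = percentUsed(estimate);
  console.log(
    `Using ${estimate.usage} of ${estimate.quota} bytes (${percent.toFixed(1)}%)`
  );
  return percent;
}
```

Keep in mind that both values are estimates reported by the browser, so they are better suited for rough decisions (for example, whether to trim a cache) than for exact accounting.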

However, as with all good things, caching comes at a price. One of the problems is determining what data should be cached and for how long.

If too much data is cached, it can consume a significant amount of memory or storage space, which can negatively impact the system’s performance.

To address this problem, cache eviction algorithms were created. The two best-known examples are Least Recently Used (LRU) and Least Frequently Used (LFU). Both of them are especially popular in technical interviews, so it is important to keep them in mind.

Let us take a closer look at them and see how they can be implemented in a simple way using JavaScript.

Least Recently Used (LRU)

The basic idea behind LRU is to remove the least recently used items from the cache when the cache is full and a new item needs to be added.

The assumption behind this algorithm is that the items that have not been accessed for a long time are less likely to be accessed in the future than items that were used more recently.

In most cases, a doubly linked list is used to implement this algorithm, but in JavaScript, we can do it in a much simpler way by using the Map object, which remembers the order in which keys were inserted:

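class LRUCache {
   constructor(capacity) {
       this.capacity = capacity;
       // A Map iterates its keys in insertion order, so the first key is the oldest one
       this.map = new Map();
   }
   get(key) {
       const value = this.map.get(key);
       if (value !== undefined) {
           // Re-insert the key to mark it as the most recently used
           this.map.delete(key);
           this.map.set(key, value);
           return value;
       }

       return -1;
   }
   put(key, value) {
       // Remove an existing key first so that it moves to the end of the Map
       this.map.delete(key);
       this.map.set(key, value);
       if (this.map.size > this.capacity) {
           // Evict the least recently used entry (the first key in the Map)
           const keyToDel = this.map.keys().next().value;
           this.map.delete(keyToDel);
       }
   }
}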
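For comparison, the classic approach mentioned above (a hash map combined with a doubly linked list) might look roughly like the sketch below. The class name LinkedLRUCache and its helper methods are illustrative choices. The behaviour is meant to mirror the Map-based version, including returning -1 for a missing key.

```javascript
class LinkedLRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.nodes = new Map(); // key -> list node
    // Sentinel nodes: head.next is the least recently used entry,
    // tail.prev is the most recently used one.
    this.head = { prev: null, next: null };
    this.tail = { prev: this.head, next: null };
    this.head.next = this.tail;
  }

  // Detach a node from the list.
  _unlink(node) {
    node.prev.next = node.next;
    node.next.prev = node.prev;
  }

  // Insert a node right before the tail sentinel (most recently used position).
  _appendToTail(node) {
    node.prev = this.tail.prev;
    node.next = this.tail;
    this.tail.prev.next = node;
    this.tail.prev = node;
  }

  get(key) {
    const node = this.nodes.get(key);
    if (node === undefined) {
      return -1;
    }
    this._unlink(node);
    this._appendToTail(node);
    return node.value;
  }

  put(key, value) {
    const existing = this.nodes.get(key);
    if (existing !== undefined) {
      existing.value = value;
      this._unlink(existing);
      this._appendToTail(existing);
      return;
    }
    const node = { key, value, prev: null, next: null };
    this.nodes.set(key, node);
    this._appendToTail(node);
    if (this.nodes.size > this.capacity) {
      // Evict the least recently used entry, which sits right after the head.
      const oldest = this.head.next;
      this._unlink(oldest);
      this.nodes.delete(oldest.key);
    }
  }
}

const lru = new LinkedLRUCache(2);
lru.put('a', 1);
lru.put('b', 2);
lru.get('a');    // 'a' becomes the most recently used entry
lru.put('c', 3); // capacity exceeded: 'b' is evicted
console.log(lru.get('b')); // -1
console.log(lru.get('a')); // 1
```

Both get and put run in constant time here, just as in the Map-based version; the linked list only makes the ordering explicit.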

Least Frequently Used (LFU)

The idea here is to remove the least frequently used items from the cache when the cache is full and a new item needs to be added.

In this case, the assumption is that the least frequently accessed items are less likely to be accessed in the future compared to the more frequently used items.

Below, you can see a JavaScript implementation of this algorithm. It also employs the Map object, but is a bit more complex than LRU:

class LFU {
   constructor(capacity) {
       this.capacity = capacity;
       this.map = new Map();
       this.freq = new Map([[0, new Set()]]);
   };


   _incrementFreq(key, freqVal) {
       this.freq.get(freqVal++).delete(key);
       if(!this.freq.has(freqVal)) {
           this.freq.set(freqVal, new Set());
       }
       this.freq.get(freqVal).add(key);
      
       return freqVal;
   }


   get(key) {
       const entity = this.map.get(key);
       if(!entity) {
           return -1;
       }
       const { value, freq } = entity;
       // Each access moves the key into the next frequency bucket
       this.map.set(key, {value, freq: this._incrementFreq(key, freq)});
      
       return value;       
   }


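   put(key, value) {
       if(this.capacity === 0) {
           return;
       }
       if(this.get(key) !== -1) {
           // Existing key: get() has already bumped its frequency, so only the value changes
           this.map.set(key, { ...this.map.get(key), value });       
       } else {
           if(this.map.size === this.capacity) {
               // Buckets are created in ascending order of frequency, so the first
               // non-empty one holds the least frequently used keys
               for(const freqSet of this.freq.values()) {
                   if(freqSet.size !== 0) {
                       const kk = freqSet.values().next().value;
                       freqSet.delete(kk);
                       this.map.delete(kk);
                       break;
                   }   
               }   
           }
           this.map.set(key, { value, freq: 0 });
           this.freq.get(0).add(key);
       }  
   }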
}
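The implementation above scans the frequency buckets to find a key to evict. A common variation, sketched below under the same conventions (new keys start at frequency 0, a missing key yields -1), keeps track of the lowest frequency currently in use so that eviction needs no scan. The class name MinFreqLFU and its fields are illustrative, and this is only one of several ways to structure it.

```javascript
class MinFreqLFU {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Map(); // key -> { value, freq }
    this.buckets = new Map(); // freq -> Set of keys, oldest first
    this.minFreq = 0;         // lowest frequency that has at least one key
  }

  // Move a key from its current frequency bucket to the next one.
  _touch(key, entry) {
    const bucket = this.buckets.get(entry.freq);
    bucket.delete(key);
    if (bucket.size === 0) {
      this.buckets.delete(entry.freq);
      if (this.minFreq === entry.freq) {
        this.minFreq += 1; // the key itself now lives at minFreq + 1
      }
    }
    entry.freq += 1;
    if (!this.buckets.has(entry.freq)) {
      this.buckets.set(entry.freq, new Set());
    }
    this.buckets.get(entry.freq).add(key);
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return -1;
    }
    this._touch(key, entry);
    return entry.value;
  }

  put(key, value) {
    if (this.capacity <= 0) {
      return;
    }
    const entry = this.entries.get(key);
    if (entry !== undefined) {
      entry.value = value;
      this._touch(key, entry);
      return;
    }
    if (this.entries.size >= this.capacity) {
      // Evict the oldest key among the least frequently used ones.
      const bucket = this.buckets.get(this.minFreq);
      const victim = bucket.values().next().value;
      bucket.delete(victim);
      if (bucket.size === 0) {
        this.buckets.delete(this.minFreq);
      }
      this.entries.delete(victim);
    }
    this.entries.set(key, { value, freq: 0 });
    if (!this.buckets.has(0)) {
      this.buckets.set(0, new Set());
    }
    this.buckets.get(0).add(key);
    this.minFreq = 0; // a brand-new key always has the lowest frequency
  }
}

const lfu = new MinFreqLFU(2);
lfu.put('a', 1);
lfu.put('b', 2);
lfu.get('a');    // 'a' now has frequency 1, 'b' is still at 0
lfu.put('c', 3); // cache is full: 'b' is evicted
console.log(lfu.get('b')); // -1
console.log(lfu.get('a')); // 1
```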

This article is just a brief introduction to browser caching. It has two main goals. First, to provide a few quick and easy-to-use recipes for beginners. Second, to trigger further interest in the topic.

There are countless resources dedicated to browser caching; here are a few handpicked ones for your further reading. Enjoy!

For more fundamentals, check out this article: https://web.dev/love-your-cache/

A further introduction can be found here: https://web.dev/i18n/en/http-cache/

An article dedicated specifically to the Cache API: https://web.dev/i18n/en/cache-api-quick-guide/

Finally, a nice practical tutorial on caching: https://web.dev/codelab-http-cache/