How To Take Screenshots In The Browser Using JavaScript

Taking screenshots in-browser (or 'client-side') is all about tradeoffs - there's no perfect solution for every situation. Let's take a look at three different ways you can take screenshots, and then how you can use them by sending them to a server or letting the user download the image.

Screenshots can be a valuable part of your JavaScript application. Companies like Google use them to collect feedback from users, products like BugHerd build on screenshots as a core feature, and they're a convenient way to generate data exports (like charts).

Asking users to take a screenshot of what they're looking at introduces a lot of friction. First, they need to know the keyboard shortcut (which is different from macOS to Windows).

Then they need to know what to do with it. For example, they could upload it to somewhere like Dropbox or email it to you. Most users probably won't know how to resize or optimize the quality of the screenshot either. If you're hosting the screenshots yourself, you want to make sure they take up as little space as possible, which in turn ensures you spend as little as possible.

Here are three ways you can automatically capture screenshots for your users:

1. Using html2canvas for client-side screenshots

Niklas von Hertzen answered a Stack Overflow question in 2011 saying it's possible to render the DOM of the page into an HTML canvas and use it to generate a screenshot. After he made the code public, he updated his answer with the project that grew out of that idea, html2canvas. Later investigation revealed that Google uses a very similar technique to generate screenshots for users providing feedback, which suggests the approach scales and is robust enough for large products.

How to use html2canvas

The idea is pretty simple - you capture the DOM (HTML of the page) when you want to generate a screenshot, and html2canvas renders that DOM into an HTML canvas. Under certain restrictions, the canvas element can generate a data URI of its contents (as a base64 string). The example below generates a screenshot and displays it on the page as an image.

const screenshotTarget = document.body;

html2canvas(screenshotTarget).then((canvas) => {
    const base64image = canvas.toDataURL("image/png");
    // Browsers block top-level navigation to data URIs, so show the image in the page.
    const image = new Image();
    image.src = base64image;
    document.body.appendChild(image);
});
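Once you have the data URI from canvas.toDataURL, you can either send it to a server or let the user download it. The sketch below shows one possible way to do both. The helper names (parseDataUrl, uploadScreenshot, downloadScreenshot) and the "/api/screenshots" endpoint are assumptions made for illustration; they are not part of html2canvas.

```javascript
// Hypothetical helpers - names and endpoint are illustrative assumptions.

// Split a base64 data URI into its MIME type and raw bytes.
function parseDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64-encoded data URI");
  }
  const binary = atob(match[2]);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType: match[1], bytes };
}

// POST the raw image bytes to a server. fetchImpl is injectable so the
// function can be exercised without a network connection.
function uploadScreenshot(dataUrl, endpoint, fetchImpl = fetch) {
  const { mimeType, bytes } = parseDataUrl(dataUrl);
  return fetchImpl(endpoint, {
    method: "POST",
    headers: { "Content-Type": mimeType },
    body: bytes,
  });
}

// Browser only: offer the image to the user as a file download.
function downloadScreenshot(dataUrl, filename = "screenshot.png") {
  const link = document.createElement("a");
  link.href = dataUrl;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
}
```

In a page, you might call uploadScreenshot(canvas.toDataURL("image/png"), "/api/screenshots") inside the html2canvas callback, or downloadScreenshot with the same data URI. Sending raw bytes keeps the request roughly a quarter smaller than posting the base64 string itself.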

Pros/Cons of the canvas method

Pros

Quick - you don't need to wait on any external services; this is all done on the client.

Regularly updated - while html2canvas isn't updated daily, it regularly sees updates and fixes every few months.

Very popular - the community is strong and active, there are over 40 pull requests open and 600 issues logged (as of July 2020). There are regular questions (and answers!) on Stack Overflow too.

Good browser support - html2canvas is over 9 years old, and still maintains great browser support.

Cons

Doesn't support Shadow DOM or Web Components - there's a pull request open to add support, and it looks like the maintainers are keen to get it merged.

Cross-Origin restrictions - when you add elements that contain requests to the canvas (like images, stylesheets etc), they're subject to the same CORS policies that normal requests are.

Doesn't support every CSS property - properties are manually added and configured, so 100% coverage is difficult. The documentation states "html2canvas will never have full CSS support".

Fragile to updates - html2canvas can be fragile to browser updates, occasionally breaking existing functionality.
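For the cross-origin restriction, html2canvas exposes the useCORS and allowTaint options. The sketch below is one way to use them, assuming the image servers send the appropriate Access-Control-Allow-Origin headers. The captureWithCors name is hypothetical, and the html2canvas function is passed in as a parameter purely so the helper doesn't depend on a global.

```javascript
// Hypothetical helper: capture an element while loading cross-origin
// images through CORS. In a page, pass the html2canvas function itself
// as the second argument.
function captureWithCors(target, html2canvasFn) {
  const options = {
    useCORS: true,     // try to load cross-origin images using CORS
    allowTaint: false, // keep the canvas untainted so toDataURL() still works
  };
  return html2canvasFn(target, options).then((canvas) =>
    canvas.toDataURL("image/png")
  );
}
```

If a server doesn't allow CORS for its images, those images are likely to be missing from the screenshot; allowing them to taint the canvas instead would make toDataURL throw a security error.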

2. Generate screenshots with getDisplayMedia

Around the same time that html2canvas got released, the idea that the web could support native video calling was getting a lot of traction. In 2011, Google released Hangouts, a real-time video and audio platform. To support Hangouts, Google put forward the idea (and some implementation) of Real-Time Communication (RTC) which was later developed into WebRTC. WebRTC has since become a standard in all modern browsers, and it's the infrastructure that provides the web with real-time video, audio, and data communication.

Closely related to WebRTC is getDisplayMedia, the screen sharing API. Because the shared screen arrives as a video stream, you can capture a single frame from it, which essentially becomes a screenshot. Let's look at a basic example:

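const capture = async () => {
  const canvas = document.createElement("canvas");
  const context = canvas.getContext("2d");
  const video = document.createElement("video");

  try {
    const captureStream = await navigator.mediaDevices.getDisplayMedia();
    video.srcObject = captureStream;
    // Wait until the video is playing so there is a frame to draw.
    await video.play();
    // Size the canvas to the captured frame before drawing.
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    context.drawImage(video, 0, 0, canvas.width, canvas.height);
    const frame = canvas.toDataURL("image/png");
    captureStream.getTracks().forEach(track => track.stop());
    // Browsers block top-level navigation to data URIs, so show the image in the page.
    const image = new Image();
    image.src = frame;
    document.body.appendChild(image);
  } catch (err) {
    console.error("Error: " + err);
  }
};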

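// Browsers may require a user gesture for getDisplayMedia, so run capture() from an event handler.
document.addEventListener("click", capture, { once: true });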

Even though the above example isn't as small as the first, it's still relatively easy to follow what's happening. Firstly, canvas and video elements are created in memory, then the getDisplayMedia API is used to request access to either the user's desktop or tab. After the user grants permission, the screen stream is set to the video element and drawn to the canvas. An image is extracted from the canvas using the same method as described in the html2canvas example above.
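A full-desktop capture can be much larger than you need, especially on high-resolution displays. If you're storing or uploading the result, one option is to scale the frame down and export it as a JPEG before sending it anywhere. The sketch below is a minimal example of that idea; fitWithin and exportScaled are hypothetical names, and the default size and quality values are arbitrary assumptions.

```javascript
// Compute dimensions that fit inside maxWidth x maxHeight while keeping
// the aspect ratio. Images that already fit are never scaled up.
function fitWithin(width, height, maxWidth, maxHeight) {
  const scale = Math.min(1, maxWidth / width, maxHeight / height);
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

// Browser only: draw a source (canvas or video element) onto a smaller
// canvas and export it as a JPEG data URI at the given quality (0 to 1).
function exportScaled(source, sourceWidth, sourceHeight,
                      maxWidth = 1280, maxHeight = 720, quality = 0.8) {
  const size = fitWithin(sourceWidth, sourceHeight, maxWidth, maxHeight);
  const canvas = document.createElement("canvas");
  canvas.width = size.width;
  canvas.height = size.height;
  canvas.getContext("2d").drawImage(source, 0, 0, size.width, size.height);
  return canvas.toDataURL("image/jpeg", quality);
}
```

For a video element you would pass video.videoWidth and video.videoHeight as the source dimensions. JPEG is usually smaller than PNG for photographic content, but it can blur small text, so PNG may still be the better choice for UI screenshots.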

Pros/Cons of the getDisplayMedia API

Pros

Good modern browser support - it's a native API built into modern browsers, so there's no need for any third-party script or code.

Extensible - you can easily add support for video recording in the future, using the same API.

Pixel perfect - it's exactly what the user is seeing.

Desktop or Tab - the getDisplayMedia API can record either the desktop or a browser window - making it great for taking a screenshot of something other than your application.

Cons

Permissions - you need to get permissions from the user to record. Depending on the browser, the permissions dialogue can look different and can sometimes be confusing.

Slow - because you need to get permission, it can take the user a while to understand what is happening and click accept. If the user needs a screenshot instantly, this little UI interaction could delay capturing some time-sensitive content.
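Because the user can decline the prompt, and older browsers may not have the API at all, it can help to fall back to another method. Here's a rough sketch of that pattern. The function names are hypothetical, and captureWithDisplayMedia and captureWithHtml2canvas stand in for whatever async capture functions you've written (each assumed to return a data URI).

```javascript
// Feature-detect the screen capture API on a navigator-like object.
function supportsDisplayCapture(nav) {
  return Boolean(
    nav &&
      nav.mediaDevices &&
      typeof nav.mediaDevices.getDisplayMedia === "function"
  );
}

// Try getDisplayMedia first, and fall back to a DOM-based capture when the
// API is missing or the user denies the permission prompt.
async function captureWithFallback(nav, captureWithDisplayMedia, captureWithHtml2canvas) {
  if (!supportsDisplayCapture(nav)) {
    return captureWithHtml2canvas();
  }
  try {
    return await captureWithDisplayMedia();
  } catch (err) {
    // NotAllowedError is raised when the user denies or dismisses the prompt.
    if (err && err.name === "NotAllowedError") {
      return captureWithHtml2canvas();
    }
    throw err;
  }
}
```

In a page you would pass navigator as the first argument. Note that the display-media capture function has to let the error propagate (rather than catching and logging it) for the fallback to kick in.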

3. Screenshots as a Service

Screenshot services are growing in popularity as they offer an easy way to integrate screenshots into your existing application. While it's not client-side, using a screenshot service like url2png, Stillio or Urlbox means you don't need to manage any infrastructure that takes the screenshots.

While implementation can differ per service, the general idea is the same. You send the URL (and customizable parameters like dimensions, format and quality) to an external service and either wait for a response, or listen for the response elsewhere. This is usually handled with a backend service, so let's take a look using url2png with Node.js and Express:

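const express = require('express');
const url2png = require('url2png')('API_KEY', 'PRIVATE_KEY');
const fs = require('fs');

const app = express();

app.get('/screenshot', (req, res) => {
	// Respond only once the screenshot has been fully written to disk.
	url2png.readURL(req.query.url, {})
		.pipe(fs.createWriteStream('screenshot.png'))
		.on('finish', () => res.json({success: true}))
		.on('error', () => res.status(500).json({success: false}));
});
app.listen(3000);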

The above example takes a request like /screenshot?url=http://google.com.au and then uses the url2png library to generate a screenshot. That screenshot is then saved to the local filesystem - or you can return it in the response.
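On the client, the page URL needs to be encoded before it's placed in the query string, and it's worth rejecting anything that isn't a regular web address. The helper below is a small sketch of that; buildScreenshotRequest is a hypothetical name, and it assumes the /screenshot route from the example above.

```javascript
// Hypothetical client-side helper: build the request path for the
// /screenshot route, rejecting anything that is not an http(s) URL.
function buildScreenshotRequest(pageUrl, route = "/screenshot") {
  const parsed = new URL(pageUrl); // throws a TypeError on malformed input
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error("Only http and https pages can be captured");
  }
  // URLSearchParams percent-encodes the value for us.
  const query = new URLSearchParams({ url: pageUrl });
  return `${route}?${query.toString()}`;
}

// buildScreenshotRequest("http://google.com.au")
// returns "/screenshot?url=http%3A%2F%2Fgoogle.com.au"
```

You could then call fetch(buildScreenshotRequest(window.location.href)) from the page. The same protocol check is worth repeating on the server, since the server shouldn't trust whatever arrives in req.query.url.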

Pros/Cons of using a screenshot service

Pros

Post processing - services can handle post processing too, offering the flexibility to generate images in different file types, sizes, quality and more.

Robust APIs - the services are used by many people and facilitate large volumes of requests.

Infrastructure - you don't need to maintain the infrastructure to manage the process of taking a screenshot. If you would rather run it yourself, projects like PhantomJS and services like SauceLabs are a good place to start.

Cons

Not stateful - the screenshot might not be what the user sees. For example, if the user has interacted with the page, or if they're authenticated, the screenshot service won't be able to render the same page for a screenshot.

Cost - out of the three options, this is the only one that comes with a cost. Although it's quite cheap, if you're handling large volumes of screenshots, it could be expensive.

Time - screenshot services can take multiple minutes to generate and return an image.
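One way to soften the cost and time drawbacks is to avoid asking the service for the same page twice in a short period. The sketch below wraps a request function in a small in-memory cache. It's only an illustration under assumptions: createScreenshotCache is a hypothetical name, requestScreenshot is whatever promise-returning function calls your chosen service, and the one-hour default lifetime is arbitrary.

```javascript
// Hypothetical wrapper: reuse a recent result for the same URL instead of
// paying for (and waiting on) a new screenshot every time.
function createScreenshotCache(requestScreenshot, maxAgeMs = 60 * 60 * 1000, now = Date.now) {
  const entries = new Map();

  return function getScreenshot(url) {
    const cached = entries.get(url);
    if (cached && now() - cached.createdAt < maxAgeMs) {
      return cached.promise;
    }
    const promise = requestScreenshot(url);
    entries.set(url, { promise, createdAt: now() });
    // Drop failed requests so the next call retries instead of reusing the error.
    promise.catch(() => {
      const current = entries.get(url);
      if (current && current.promise === promise) {
        entries.delete(url);
      }
    });
    return promise;
  };
}
```

Caching the promise rather than the finished image means that concurrent requests for the same URL share a single call to the service. For anything beyond a single server process you would likely want a shared store instead of a Map.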

Roundup

Whether you're integrating screenshots to get feedback or for a critical feature in your application, you need to weigh up the pros and cons of each solution.

Using a client-side solution, like html2canvas or the getDisplayMedia API, means you don't have to manage any server infrastructure, and the images are generally generated pretty quickly.

If you need a pixel-perfect representation of what your user is looking at and don't mind the sometimes obtuse permissions popup, the getDisplayMedia API is a good place to start.

If you want to take semi-accurate screenshots quickly with no external service dependency, html2canvas could be the right choice for you.

Finally, if you're okay with outsourcing the technical implementation and the cost associated with it, a screenshot service might be the best option.