Encoding AV1

This article presents two tools with AV1 support and some technical details about the format and its support in ffmpeg.

I’m happy to announce that my two WebM converters, webm.py (command-line) and boram (graphical), now support encoding to the new AV1 video format, starting with versions 0.12.0 and 0.5.0 respectively. Both of them work on all three major platforms (Windows, macOS, Linux) and cover typical use cases of encoding videos on demand.

AV1

In case you still haven’t heard about it, AV1 is a modern royalty-free video coding format which aims to provide better quality than the patent-encumbered alternative, H.265 (HEVC). It was developed by AOMedia, a collaboration of large IT companies such as Google, Mozilla and Netflix, and its first stable release was made in March of this year.

You can read about the technologies used in AV1 in this paper; some test comparisons with competing solutions are available here.

The compression efficiency and planned hardware support of AV1 look fantastic, and it should be a really beneficial addition to any modern converter. So I went ahead and implemented support in my tools right away, even while it’s still a bleeding-edge technology.

Tools

The first tool, webm.py, is written in Python and provides a simple wrapper around ffmpeg with useful defaults. For example, you don’t have to deal with running commands twice for 2-pass encoding or calculating the bitrate by hand if you want to reach a specific target file size. webm.py will do that for you.
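To give an idea of what is being automated here, below is a rough Python sketch of the bitrate arithmetic and of the two ffmpeg invocations a 2-pass encode needs. This is not the actual webm.py code: the function names, the Opus audio settings and the log file prefix are assumptions made for illustration, and the -strict experimental flag is only needed while ffmpeg treats the libaom encoder as experimental.

```python
def video_bitrate_kbps(target_mib, duration_s, audio_kbps=0):
    """Video bitrate (kbit/s) that fits target_mib MiB into duration_s seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    # Total budget in kilobits, spread over the duration, minus the audio share.
    total_kbits = target_mib * 1024 * 1024 * 8 / 1000.0
    video_kbps = total_kbits / duration_s - audio_kbps
    if video_kbps <= 0:
        raise ValueError("target size is too small for this duration")
    return int(video_kbps)


def two_pass_commands(src, dst, bitrate_kbps, audio_kbps=64,
                      logfile="ffmpeg2pass"):
    """Return the ffmpeg argument lists for the first and second pass."""
    common = [
        "ffmpeg", "-hide_banner", "-y", "-i", src,
        "-c:v", "libaom-av1", "-strict", "experimental",
        "-b:v", "%dk" % bitrate_kbps,
        "-passlogfile", logfile,
    ]
    # Pass 1 only gathers statistics: no audio, output is discarded.
    first = common + ["-pass", "1", "-an", "-f", "null", "-"]
    # Pass 2 reuses the statistics and writes the real file.
    second = common + ["-pass", "2",
                       "-c:a", "libopus", "-b:a", "%dk" % audio_kbps, dst]
    return first, second
```

Each returned list can be handed to subprocess.check_call; the second pass must only be started after the first one has finished.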

Another nice thing about webm.py is that it works in both Python 2 and 3 (Python 2 is often the only Python installed on a computer even today), doesn’t have any dependencies except the ffmpeg executable in PATH, and is contained in a single source file. So it’s easy to just drop that file on any machine you want and start using it right away.

However, if you have pip available, by running

pip install webm

you will have the latest stable version installed along with a handy webm command. (Make sure to check the Add to PATH option in the Python for Windows installer.)

webm.py provides a command-line interface, so it’s mostly useful for batch encodes or usage on a server. However, it also has an interactive mode for cutting and cropping using the scriptable mpv player. To encode an AV1 video simply run

webm -i in.mkv -av1


It will use 2-pass mode with default quality. To see all available options run

webm --help

Note that you need ffmpeg from git compiled with AV1 support in your PATH to make it work. Additionally, a few patches are required (more on that later), so I recommend grabbing the ffmpeg binaries from boram releases on Windows and macOS (they are located in the resources/app subdirectory of the unpacked archive) and building from sources on Linux.
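If you are not sure whether the ffmpeg in your PATH was built with libaom, one way to check is to look for the libaom-av1 encoder in the output of ffmpeg -encoders. The following Python sketch does that; the helper names are made up for this example and it is not part of webm.py.

```python
import subprocess


def has_encoder(encoders_text, name="libaom-av1"):
    """Return True if `name` is listed in the text printed by `ffmpeg -encoders`."""
    for line in encoders_text.splitlines():
        fields = line.split()
        # Encoder rows look like: " V..... libaom-av1  libaom AV1 (codec av1)",
        # so the encoder name is the second whitespace-separated field.
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def ffmpeg_has_encoder(name="libaom-av1", ffmpeg="ffmpeg"):
    """Run the ffmpeg found in PATH and check its encoder list."""
    out = subprocess.check_output([ffmpeg, "-hide_banner", "-encoders"])
    return has_encoder(out.decode("utf-8", "replace"), name)
```

Note that this only tells you the encoder is compiled in; it says nothing about whether the patches described later are applied.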

The second tool, boram, provides a user-friendly graphical interface for encoding and supports basic editing options such as scale, crop, deinterlacing and so on. It wraps ffmpeg as well and is written in JavaScript using the Electron framework.

If you’re interested, take a look at the mpv.js component for Electron, which allows boram to open any video format, not just the ones supported in Chrome.

Download the latest release for your platform here, unpack and run it. The interface and controls should be self-explanatory; just make sure to select the AV1 codec in the codecs tab.


Player support

WebM formats traditionally target the Web platform, so which browsers can play AV1? According to the caniuse website, AV1 is supported in the latest Firefox Nightly behind the media.av1.enabled flag, and in Chrome 69+ out of the box. Unfortunately, Edge and Safari don’t support it yet. That’s not much, but good enough to play with it in the wild.

Among desktop players, at least the VLC and mpv builds for Windows support videos in AV1 format.

FFmpeg patches

As mentioned earlier, both converters wrap the ffmpeg executable, because it’s such a powerful framework for video processing and encoding. However, support for AV1 in ffmpeg is a bit lacking right now, due to the novelty of the format. So I had to write a few patches for it.

The first one is to leverage encoding parallelism. The reference encoder of AV1 is called libaom, and the thing to note is that it’s substantially based on the code of libvpx, the reference encoder for the VP8 and VP9 formats. libvpx currently provides two mechanisms for multithreaded VP9 encoding: -tile-columns and -row-mt.

-tile-columns basically divides your video into independent coding units (tiles) and processes them on different cores simultaneously to utilize a multicore CPU.

[Figure: tile-columns schematic]

ffmpeg doesn’t support -tile-columns for AV1 yet, but the patch is trivial:

diff --git a/libavcodec/libaomenc.c b/libavcodec/libaomenc.c
index 9431179886..55cb7ff72e 100644
--- a/libavcodec/libaomenc.c
+++ b/libavcodec/libaomenc.c
@@ -68,6 +68,8 @@ typedef struct AOMEncoderContext {
     int static_thresh;
     int drop_threshold;
     int noise_sensitivity;
+    int tile_columns;
+    int tile_rows;
 } AOMContext;

 static const char *const ctlidstr[] = {
@@ -75,6 +77,8 @@ static const char *const ctlidstr[] = {
     [AOME_SET_CQ_LEVEL] = "AOME_SET_CQ_LEVEL",
     [AOME_SET_ENABLEAUTOALTREF] = "AOME_SET_ENABLEAUTOALTREF",
     [AOME_SET_STATIC_THRESHOLD] = "AOME_SET_STATIC_THRESHOLD",
+    [AV1E_SET_TILE_COLUMNS] = "AV1E_SET_TILE_COLUMNS",
+    [AV1E_SET_TILE_ROWS] = "AV1E_SET_TILE_ROWS",
     [AV1E_SET_COLOR_RANGE] = "AV1E_SET_COLOR_RANGE",
     [AV1E_SET_COLOR_PRIMARIES] = "AV1E_SET_COLOR_PRIMARIES",
     [AV1E_SET_MATRIX_COEFFICIENTS] = "AV1E_SET_MATRIX_COEFFICIENTS",
@@ -449,6 +453,11 @@ static av_cold int aom_init(AVCodecContext *avctx,
     if (ctx->crf >= 0)
         codecctl_int(avctx, AOME_SET_CQ_LEVEL, ctx->crf);

+    if (ctx->tile_columns >= 0)
+        codecctl_int(avctx, AV1E_SET_TILE_COLUMNS, ctx->tile_columns);
+    if (ctx->tile_rows >= 0)
+        codecctl_int(avctx, AV1E_SET_TILE_ROWS, ctx->tile_rows);
+
     codecctl_int(avctx, AV1E_SET_COLOR_PRIMARIES, avctx->color_primaries);
     codecctl_int(avctx, AV1E_SET_MATRIX_COEFFICIENTS, avctx->colorspace);
     codecctl_int(avctx, AV1E_SET_TRANSFER_CHARACTERISTICS, avctx->color_trc);
@@ -746,6 +755,8 @@ static const AVOption options[] = {
     { "static-thresh", "A change threshold on blocks below which they will be skipped by the encoder", OFFSET(static_thresh), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, VE },
     { "drop-threshold", "Frame drop threshold", offsetof(AOMContext, drop_threshold), AV_OPT_TYPE_INT, {.i64 = 0 }, INT_MIN, INT_MAX, VE },
     { "noise-sensitivity", "Noise sensitivity", OFFSET(noise_sensitivity), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 4, VE},
+    { "tile-columns", "Number of tile columns to use, log2", OFFSET(tile_columns), AV_OPT_TYPE_INT, {.i64 = -1}, -1, 6, VE},
+    { "tile-rows", "Number of tile rows to use, log2", OFFSET(tile_rows), AV_OPT_TYPE_INT, {.i64 = -1}, -1, 6, VE},
     { NULL }
 };

This also adds a -tile-rows option, which can be used to parallelize even more thanks to the configurable prediction dependency between tile rows in AV1. A recent patch to libaom enables that feature.
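Note that both options in the patch take log2 values (in the range 0..6, with -1 meaning "not set"), so -tile-columns 2 asks for 4 columns, not 2. A small Python sketch of that arithmetic follows; the helper names are hypothetical, and the encoder may still use fewer tiles than requested depending on the frame dimensions.

```python
def tile_count(tile_columns_log2, tile_rows_log2=0):
    """Number of tiles requested by the log2 option values: 2^cols * 2^rows."""
    for value in (tile_columns_log2, tile_rows_log2):
        if not 0 <= value <= 6:
            raise ValueError("log2 tile value must be in 0..6")
    return (1 << tile_columns_log2) * (1 << tile_rows_log2)


def tile_columns_for_threads(threads, max_log2=6):
    """Smallest log2 column value that yields at least `threads` tile columns."""
    log2 = 0
    # Double the column count until there is one per thread or the cap is hit.
    while (1 << log2) < threads and log2 < max_log2:
        log2 += 1
    return log2
```

For example, tile_columns_for_threads(4) gives 2, which would correspond to passing -tile-columns 2 to a patched ffmpeg.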

In libvpx VP9 we have the -row-mt option to enable multithreading within a single column tile using a block-row-based threading approach. It was added to libaom a few days ago as well, but currently does nothing interesting: it basically aliases -tile-columns.

The second patch is required to enable AV1 support in WebM files. It’s also very simple:

diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
index 09a62e1922..76cb124221 100644
--- a/libavformat/matroskaenc.c
+++ b/libavformat/matroskaenc.c
@@ -1296,11 +1296,12 @@ static int mkv_write_track(AVFormatContext *s, MatroskaMuxContext *mkv,

     if (mkv->mode == MODE_WEBM && !(par->codec_id == AV_CODEC_ID_VP8 ||
                                     par->codec_id == AV_CODEC_ID_VP9 ||
+                                    par->codec_id == AV_CODEC_ID_AV1 ||
                                     par->codec_id == AV_CODEC_ID_OPUS ||
                                     par->codec_id == AV_CODEC_ID_VORBIS ||
                                     par->codec_id == AV_CODEC_ID_WEBVTT)) {
         av_log(s, AV_LOG_ERROR,
-               "Only VP8 or VP9 video and Vorbis or Opus audio and WebVTT subtitles are supported for WebM.\n");
+               "Only VP8 or VP9 or AV1 video and Vorbis or Opus audio and WebVTT subtitles are supported for WebM.\n");
         return AVERROR(EINVAL);
     }

Support for AV1 in WebM is a bit postponed because libaom’s WebM muxer hasn’t been updated to the latest spec yet; see this bug for details. But FFmpeg already does the right thing, so there should be nothing wrong with the WebM videos it produces.

You can also use the MP4 container to store your AV1 video (pass the -f mp4 option to ffmpeg for that). It already works in the latest Chrome and Firefox Nightly. The Matroska container (almost the same as WebM but with more features; it can be enabled with -f matroska), on the other hand, only works in Chrome and won’t work in Firefox due to limitations of its demuxer.

I sent both patches to the ffmpeg-devel mailing list, but they haven’t been accepted yet. However, the Windows and macOS releases of boram include ffmpeg binaries with those patches already applied, so everything will work out of the box. If you would like to build ffmpeg yourself, see the next section.

FFmpeg build

I compile custom builds of ffmpeg for my tools because it gives me control over which encoders and features are included. It also helps to greatly reduce the download size of releases.

On Windows I use the awesome media-autobuild_suite script, which supports compiling ffmpeg and a lot of other tools from sources. Just follow the instructions in the readme and the installation wizard.

To apply the patches mentioned above, add the following code to the build/media-suite_compile.sh file right after the if do_vcs "https://git.ffmpeg.org/ffmpeg.git"; then line:

do_patch "https://gist.github.com/Kagami/8d42f7b9d6c3d7e87a6da40b0fee10dc/raw/5f6cc9391e2796d2b2056cb15f90394cd8c7bcee/0001-lavc-libaomenc-Add-tile-columns-tile-rows.patch" am
do_patch "https://gist.github.com/Kagami/8d42f7b9d6c3d7e87a6da40b0fee10dc/raw/5f6cc9391e2796d2b2056cb15f90394cd8c7bcee/0002-lavf-matroska-Allow-AV1-in-WebM.patch" am

On macOS Homebrew does pretty much all the work: just put libaom.rb and ffmpeg.rb into the /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core/Formula directory and run

brew install ffmpeg --HEAD --with-libvpx --with-libaom --with-libvorbis --with-opus --with-libass

On Linux, follow any up-to-date instructions for your distribution, such as this one. Make sure to run

wget https://gist.github.com/Kagami/8d42f7b9d6c3d7e87a6da40b0fee10dc/raw/5f6cc9391e2796d2b2056cb15f90394cd8c7bcee/0001-lavc-libaomenc-Add-tile-columns-tile-rows.patch
git am 0001-lavc-libaomenc-Add-tile-columns-tile-rows.patch
wget https://gist.github.com/Kagami/8d42f7b9d6c3d7e87a6da40b0fee10dc/raw/5f6cc9391e2796d2b2056cb15f90394cd8c7bcee/0002-lavf-matroska-Allow-AV1-in-WebM.patch
git am 0002-lavf-matroska-Allow-AV1-in-WebM.patch

inside the directory with ffmpeg sources to apply patches.

rav1e

There is also an alternative encoder for AV1 called rav1e, written in Rust. It’s at an early experimental stage and doesn’t provide a C API yet, so it can’t be embedded into ffmpeg.

Why can’t we use rav1e from the command line though? Why do we need ffmpeg? (The same goes for the aomenc CLI utility, a part of libaom.) That’s entirely because ffmpeg is so powerful: you can pass it any video file as input, encode video, audio and subtitles to any combination of formats and containers, filter them (scale, deinterlace, change framerate, etc.) in between, and do all this with a single line of shell.

On the other hand, rav1e and aomenc accept a very limited number of formats (such as raw uncompressed video in a YUV4MPEG2 container) with no audio or filtering possible at all. Your typical input would have H.264 video in an MP4 container, so in order to pass it to rav1e you need to decompress it to a separate file or decode it on the fly and pass it via stdin, which is an awkward and messy process. That’s why encoding via ffmpeg is the preferred way.
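To illustrate the "decode on the fly" route, here is a minimal Python sketch that pipes ffmpeg’s YUV4MPEG2 output into a standalone encoder. The ffmpeg side uses the standard yuv4mpegpipe muxer; the encoder command line ("-" for stdin, -o for the output file) is an assumption about the rav1e CLI and may need adjusting for the version you have. Audio is simply dropped, which is part of why this approach is awkward.

```python
import subprocess


def decode_command(src):
    """ffmpeg arguments that decode `src` to raw YUV4MPEG2 on stdout."""
    return ["ffmpeg", "-hide_banner", "-i", src,
            "-an", "-pix_fmt", "yuv420p", "-f", "yuv4mpegpipe", "-"]


def encode_command(dst, encoder="rav1e"):
    """Assumed encoder invocation: read Y4M from stdin, write to `dst`."""
    return [encoder, "-", "-o", dst]


def pipe_encode(src, dst):
    """Run `decoder | encoder` without creating an intermediate file."""
    decoder = subprocess.Popen(decode_command(src), stdout=subprocess.PIPE)
    encoder = subprocess.Popen(encode_command(dst), stdin=decoder.stdout)
    # Close our copy of the pipe so the decoder is notified if the encoder exits.
    decoder.stdout.close()
    encoder_status = encoder.wait()
    decoder_status = decoder.wait()
    return decoder_status == 0 and encoder_status == 0
```

Calling pipe_encode("in.mp4", "out.ivf") would produce a video-only stream that still has to be muxed with separately encoded audio afterwards.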

Also, right now rav1e is focused on speed, so it’s mostly useful for testing purposes. Things are evolving quickly though, so let’s hope we will have a great alternative encoder for AV1 in the near future.

Future ideas

While you can already produce fully functional AV1 encodes, there are several things which are going to be improved and are worth keeping an eye on.

One of them is that the aomenc command provides many more encoder options than ffmpeg’s interface for libaom. These could potentially provide better compression of the resulting files or faster encodes. Hopefully ffmpeg will eventually catch up and implement at least the most important of them.

Secondly, right now libaom is very slow, so encodes are either very time-consuming or compromise on compression efficiency. For example, webm.py and boram both use the -cpu-used 4 speed parameter, while -cpu-used 1 is the default. libaom should eventually become much faster; the same thing happened with the libvpx VP9 encoder, which was terribly slow back in 2013. In the worst case we still have rav1e as a plan B.

Thirdly, there are a few interesting options in libaom’s build config that are not enabled by default. It might be interesting to try them to achieve better compression.