Audio/video encoding with FFmpeg

Posted on January 20, 2024, by Sébastien

This article is also available in French.

———

TLDR?
If you don’t have 15 minutes to spend reading this blog post, here’s a 3-minute version instead:

FFmpeg Cheat Sheet

———

FFmpeg is a powerful tool, able to decode and encode hundreds of different audio and video formats.

In fact, it isn’t limited to decoding and encoding.

This blog post is solely about transcoding, though.

If you’ve never heard of FFmpeg, know that it’s a command-line program.

Wait… don’t run away just yet, it’s actually quite easy to use. 😉

It provides libraries used by countless other programs, including media players, such as VLC, and a lot of audio/video transcoding software (for instance, HandBrake).

I used to use GUI apps for transcoding, but I’ve since cut out the middleman, and I now use FFmpeg directly.

Installation

If you’re running Windows, you can download prebuilt binaries:

https://ffmpeg.org/download.html#get-packages

Practically speaking, the last time I needed to download FFmpeg, I used the latest build from:

https://github.com/BtbN/FFmpeg-Builds/releases

More specifically:

ffmpeg-master-latest-win64-gpl.zip 

Extract the ZIP file somewhere.

For convenience, you can add the folder containing “ffmpeg.exe” to your “Path” environment variable, which will enable you to call “ffmpeg.exe” from any folder.

Searching for “environment variables” in the start menu should bring up the corresponding dialog, where you can edit the “Path” user variable.

To avoid breaking anything, make sure you don’t remove anything from this variable (only append to it).

Then, open a terminal: Shift + right-click on the desired folder, then choose “Open in Windows Terminal”.

If necessary, you can change the current folder using the “cd” command (“cd” stands for “change directory”), such as:

cd C:\Users\Sebastien\Videos\

If you’re on Linux, installation is usually very simple, as FFmpeg is packaged by most major distributions.

For instance, on Ubuntu or Debian, just type:

sudo apt install ffmpeg
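
Whichever system you’re on, it’s worth checking that the terminal can actually find FFmpeg before going further. Here’s a minimal sketch, assuming a POSIX shell such as bash (on Windows, that would mean Git Bash or WSL rather than PowerShell). The function name “check_ffmpeg” is made up for this example:

```shell
# Print the first line of FFmpeg's version banner if the program is reachable
# through PATH, or an error message (and a non-zero status) if it is not.
# "check_ffmpeg" is a hypothetical helper name, not part of FFmpeg.
check_ffmpeg() {
    if command -v ffmpeg >/dev/null 2>&1; then
        ffmpeg -version | head -n 1
    else
        echo "ffmpeg not found in PATH" >&2
        return 1
    fi
}
```

After pasting the function into your terminal, run “check_ffmpeg”. If it reports that FFmpeg was not found, the “Path” variable (or the package installation) is the first thing to double-check.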

Audio, lossless → FLAC

Imagine you have an uncompressed WAV file extracted straight from a CD.

You can convert it to a FLAC file, which can be around 30% smaller (sometimes up to 50%) without losing any quality, thanks to lossless compression.

ffmpeg -i input.wav output.flac

Yes, it’s that simple. The basic syntax only requires specifying the input file with the “-i” option, and ending the command with the output filename.

FFmpeg will automatically detect that it needs to use the FLAC encoder, based on the file extension.

Of course, there’s no point in converting lossy audio (such as MP3) to FLAC: you won’t recover any lost quality by doing that.
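
A CD rip usually means a whole folder of WAV files rather than a single one. The same command can be wrapped in a loop. This is only a sketch, assuming a POSIX shell; the function name “wav_to_flac” is invented for the example:

```shell
# Convert every .wav file in the given folder (default: current folder)
# to a .flac file with the same base name.
# -n tells ffmpeg never to overwrite an existing output file.
# "wav_to_flac" is a hypothetical helper name.
wav_to_flac() {
    dir="${1:-.}"
    for f in "$dir"/*.wav; do
        [ -e "$f" ] || continue      # no .wav files: the glob did not expand
        ffmpeg -n -i "$f" "${f%.wav}.flac"
    done
}
```

Calling “wav_to_flac ~/Music/MyAlbum” (a made-up path) would leave the WAV files untouched and write one FLAC file next to each of them.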

Audio, lossy, maximum compatibility → MP3

Lossy encoding enables significant savings in terms of file size, at the expense of some (irreversible) quality loss.

MP3 is a very famous lossy codec, with very wide compatibility. Virtually all devices/software that support lossy audio codecs support MP3.

ffmpeg -i input.wav -c:a libmp3lame -q:a 0 output.mp3

Compared to the FLAC example, there are 2 more options: “-c:a” selects the audio codec (that is, the encoder), and “-q:a” sets the audio quality.

Here, I’ve explicitly selected the LAME MP3 encoder (libmp3lame). I’m pretty sure it’s the default MP3 encoder, so you could probably omit this option in this case.

MP3 can be encoded either using a constant bitrate (up to 320 kbit/s), or using a variable bitrate (VBR). Constant bitrate is considered wasteful, as the signal complexity varies throughout audio streams, so it makes sense to allocate less bitrate to parts of the track that are easier to encode.

Specifying the “quality” parameter instructs the LAME encoder to select a variable bitrate. 0 is the highest quality, 9 would be the lowest.

Here are the very approximate bitrates you can expect – for stereo audio – by tweaking the quality setting: around 245 kbit/s with “-q:a 0”, around 190 kbit/s with “-q:a 2”, and around 130 kbit/s with “-q:a 5”.

The vast majority of people cannot differentiate between a lossless source and a properly encoded MP3 with an average bitrate of 245 kbit/s by ear. Even though MP3 isn’t lossless, high bitrates are said to be “transparent”.

Many years ago, I found out in a blind test that I was personally unable to tell apart lossless audio and variable bitrate MP3 from LAME averaging at 190 kbit/s.

If you aren’t aiming for transparency, you can go as low as 130 kbit/s (“-q:a 5”) with a decent quality – as long as your source file has a high quality.
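
Since the bitrate you get from a given quality setting depends on the music itself, it can be useful to measure it after encoding. The sketch below assumes that “ffprobe” (a companion tool that normally ships alongside “ffmpeg”) is installed, and that you’re in a POSIX shell; “mp3_vbr” is a name chosen for this example:

```shell
# Encode the input file ($1) to MP3 at VBR quality $2 (0-9, default 2),
# then print the average bitrate of the result in kbit/s.
# -y overwrites the output file if it already exists.
# "mp3_vbr" is a hypothetical helper name.
mp3_vbr() {
    in="$1"; q="${2:-2}"
    out="${in%.*}.q${q}.mp3"
    ffmpeg -y -i "$in" -c:a libmp3lame -q:a "$q" "$out" || return 1
    # ffprobe reports the overall bitrate of the file in bit/s
    bps=$(ffprobe -v error -show_entries format=bit_rate \
          -of default=noprint_wrappers=1:nokey=1 "$out")
    echo "$out: $((bps / 1000)) kbit/s"
}
```

Running “mp3_vbr input.wav 0” and then “mp3_vbr input.wav 5” on the same track gives two files you can compare, both by size and by ear.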

What about AAC?

AAC (Advanced Audio Coding) is another famous lossy codec, popularized by Apple with iPod devices and iTunes. AAC is not as ubiquitous as MP3, though.

Even though AAC is technically superior to MP3, the native AAC encoder available in FFmpeg is unfortunately not good – it’s actually worse than the LAME MP3 encoder.

FFmpeg supports another, superior AAC encoder, “libfdk_aac”, but due to licensing issues, you cannot download pre-built binaries including it. You would have to compile FFmpeg yourself (good luck…).

If you really want to encode AAC audio with a decent quality, your best bet is to use iTunes, not FFmpeg.
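
If you’re not sure which encoders your own FFmpeg build includes, you can ask it. The following sketch filters the output of “ffmpeg -encoders”. It assumes a POSIX shell with “awk” available, and “has_encoder” is a made-up helper name:

```shell
# Succeed (exit status 0) if the local ffmpeg build lists the encoder
# named in $1, fail otherwise.
# In the "ffmpeg -encoders" listing, the encoder name is the second column.
# "has_encoder" is a hypothetical helper name.
has_encoder() {
    ffmpeg -hide_banner -encoders 2>/dev/null |
        awk -v e="$1" '$2 == e { found = 1 } END { exit !found }'
}
```

For example, “has_encoder libfdk_aac && echo yes || echo no” tells you whether “libfdk_aac” is available, and the same check works for the other encoders used in this post, such as “libmp3lame” or “libopus”.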

Audio, lossy, state of the art → Opus

Opus is a more recent codec, and is technically superior to virtually every other lossy audio codec, both for music, and for low-bitrate voice communications.

It’s used by streaming providers such as SoundCloud, YouTube, or Vimeo, and for voice communications in Skype, Discord, WhatsApp, Signal, and many others.

ffmpeg -i input.wav -c:a libopus -b:a 160k output.opus

The above command will use the variable bitrate mode, with an average bitrate of roughly 160 kbit/s.

“-b:a” stands for “audio bitrate” (“-ab” is an alias to “-b:a”).

You can also use the “.ogg” extension for Opus files.

Opus consistently ranks first in listening tests (blind tests) featuring other lossy codecs.

For instance, have a look at the following results from 2014:

http://listening-test.coresv.net/results.htm

The quality loss of Opus at 96 kbit/s was considered by many to be imperceptible, and “perceptible, but not annoying” by others.

AAC (encoded with iTunes) wasn’t far behind, in terms of quality, at the same bitrate.

MP3 was only slightly worse than AAC (closer to “perceptible, but not annoying”), albeit with a higher bitrate (around 130 kbit/s).

As long as the target device/software supports it, you should use Opus, if you need a lossy audio codec.
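
To try different bitrates on your own files, the Opus command can be wrapped in a small function. This is a sketch under a few assumptions: a POSIX shell, a default of 96 kbit/s (the bitrate from the listening test above, not an official recommendation), and a helper name, “to_opus”, invented for the occasion:

```shell
# Encode an audio file ($1) to Opus at the average bitrate given in $2
# (default: 96k), writing the result next to the input file.
# -n refuses to overwrite an existing file; -vn drops any video stream,
# which includes cover art embedded in some audio files.
# "to_opus" is a hypothetical helper name.
to_opus() {
    in="$1"; rate="${2:-96k}"
    ffmpeg -n -i "$in" -vn -c:a libopus -b:a "$rate" "${in%.*}.opus"
}
```

“to_opus input.wav” would write “input.opus” at roughly 96 kbit/s, and “to_opus input.wav 160k” reproduces the settings used earlier.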

Video, wide compatibility → H.264 + MP3

H.264, also known as AVC (Advanced Video Coding), is a very popular video compression standard.

It’s the main video codec used in standard Blu-ray discs (Ultra HD Blu-rays use its successor, H.265).

Although newer codecs exist, it’s still widely used on the Web.

In addition to compatibility, its popularity brings another advantage: hardware acceleration.

Most devices built in the past 10 to 15 years can decode H.264 at the hardware level, which results in improved performance, and lower power usage, compared to software decoding.

The H.264 encoder used by FFmpeg is x264 (“libx264” on the command line):

ffmpeg -i input.mp4 -c:v libx264 -crf 24 -c:a libmp3lame -q:a 2 output.mp4

Feel free to use “-vcodec” in place of “-c:v” if you want (“video codec”).

Here, I used the MP4 container (“.mp4” extension), which is widely supported, including by Web browsers.

Matroska (MKV) is another interesting container format. It’s very versatile: it can hold an unlimited number of video, audio and subtitle tracks in one file. MP4, by contrast, only supports a limited set of subtitle formats, so subtitles often end up “burned” into the video stream instead. MKV is not as widely supported as MP4, though.

In order to use it, just replace “.mp4” with “.mkv” in the command above.

The “-crf” parameter allows you to adjust the output quality of the video stream. CRF stands for “Constant Rate Factor”, and it means that the encoder will try to keep a roughly constant quality throughout the video, rather than a constant bitrate.

Here’s the explanation from the FFmpeg documentation about CRF values:

The range of the CRF scale is 0–51, where 0 is lossless (for 8 bit only, for 10 bit use -qp 0), 23 is the default, and 51 is worst quality possible. A lower value generally leads to higher quality, and a subjectively sane range is 17–28. Consider 17 or 18 to be visually lossless or nearly so; it should look the same or nearly the same as the input but it isn’t technically lossless.
The range is exponential, so increasing the CRF value +6 results in roughly half the bitrate / file size, while -6 leads to roughly twice the bitrate.

https://trac.ffmpeg.org/wiki/Encode/H.264
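
The “+6 halves the bitrate” rule of thumb quoted above can be turned into a quick estimate: going from one CRF to another should multiply the file size by roughly 2^((old − new) / 6). Here’s a small sketch of that arithmetic, assuming a POSIX shell with “awk”. The function name “crf_size_ratio” is made up, and the result is only as accurate as the rule of thumb itself:

```shell
# Estimate the relative file size when moving from CRF $1 to CRF $2,
# using the rule of thumb "+6 CRF = about half the bitrate":
#   ratio = 2 ^ ((old - new) / 6)
# "crf_size_ratio" is a hypothetical helper name.
crf_size_ratio() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", 2 ^ ((old - new) / 6) }'
}
```

“crf_size_ratio 24 30” prints 0.50 (about half the size), while “crf_size_ratio 24 18” prints 2.00 (about twice the size).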

Video, state-of-the-art → AV1 + Opus

As of 2024, the most advanced video codec you can realistically use is AV1. (Not to be confused with the – very old – AVI container format!)

There are several AV1 encoders in FFmpeg: “libaom”, “SVT-AV1” and “rav1e”.

https://trac.ffmpeg.org/wiki/Encode/AV1

I got better results with “SVT-AV1” on my machine, in terms of encoding speed relative to the output quality.

ffmpeg -i input.mp4 -c:v libsvtav1 -crf 38 -preset 4 -c:a libopus -b:a 128k output.webm

This time, I used the WebM container. It’s an open format, and supports the following codecs: VP8, VP9 and AV1 for video; Vorbis and Opus for audio.

I could have used MKV or MP4 instead.

The “preset” option of “libsvtav1” is used to adjust the tradeoff between encoding speed and quality. It goes from 0 (highest quality) to 13 (fastest). You should set it as low as you can. That is to say, the encoding speed should be as slow as you can afford. Otherwise, you could end up with a disappointing quality.

The CRF values are not directly comparable to x264: in a quick test, “SVT-AV1” gave me a slightly higher quality with a CRF of 38 than x264 with a CRF of 24.
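
Because slow presets can take a long time on a full-length video, one way to pick a preset and a CRF is to encode only a short sample first. The sketch below uses FFmpeg’s “-t” option to stop after 30 seconds. It assumes a POSIX shell, and the function name, the default preset of 6 and the 30-second duration are arbitrary choices for this example:

```shell
# Encode only the first 30 seconds of the input ($1) with SVT-AV1,
# using preset $2 (default 6) and CRF $3 (default 38), then print
# the name of the sample file. -y overwrites a previous sample.
# "av1_sample" is a hypothetical helper name.
av1_sample() {
    in="$1"; preset="${2:-6}"; crf="${3:-38}"
    out="${in%.*}.p${preset}.crf${crf}.webm"
    ffmpeg -y -i "$in" -t 30 -c:v libsvtav1 -crf "$crf" -preset "$preset" \
           -c:a libopus -b:a 128k "$out" && echo "$out"
}
```

In bash, prefixing the call with “time” (for instance “time av1_sample input.mp4 4 38”) shows how long each preset takes on your machine, which helps in deciding how slow you can afford to go.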

AV1 can achieve roughly 30% better compression than VP9, and roughly 50% better than H.264.

Hardware decoding requires recent hardware, though.

It started appearing in hardware products in 2020 (from Intel, NVIDIA, AMD), and became more common in products released in 2023 (by Apple, Qualcomm, …).

Without hardware decoding, video playback is going to use a lot of CPU power, and drain the battery fast.

Intermediate video codecs: VP8, VP9, H.265

VP8 is an open format, designed as an alternative to H.264. It’s a bit inferior in terms of quality, though.

It was superseded by VP9, with an improved compression efficiency. It should be able to do better than H.264 on that front.

ffmpeg -i input.mp4 -c:v libvpx-vp9 -crf 40 -b:v 0 -c:a libopus -b:a 128k output.webm

H.265 is the successor to H.264, and competes with VP9 in terms of quality / file size ratio.

Hardware decoding is more common for VP9 and H.265 than for AV1 (but less common than for H.264).
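
For completeness, an H.265 encode follows the same pattern as the H.264 one. The sketch below assumes that your FFmpeg build includes the “libx265” encoder, and that you’re in a POSIX shell. It defaults to a CRF of 28, which is the default of “libx265” (its CRF scale isn’t directly comparable to the x264 one). The function name “encode_hevc” is made up:

```shell
# Encode the input ($1) to H.265 video + MP3 audio in an MP4 container,
# with CRF $2 (default 28, the libx265 default).
# -n refuses to overwrite an existing output file.
# "encode_hevc" is a hypothetical helper name.
encode_hevc() {
    in="$1"; crf="${2:-28}"
    ffmpeg -n -i "$in" -c:v libx265 -crf "$crf" \
           -c:a libmp3lame -q:a 2 "${in%.*}.x265.mp4"
}
```

“encode_hevc input.mp4” would write “input.x265.mp4” next to the source file.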

Compatibility with Web browsers

If you ever need to encode video for direct use in a Web browser (i.e. without re-encoding by a platform such as YouTube), you should check out “Can I use”:

https://caniuse.com/mpeg4

https://caniuse.com/webm

https://caniuse.com/av1

https://caniuse.com/opus

A note about software patents

On one hand, if you’re a private individual encoding audio/video files for your own use, you probably don’t have to care about software patents.

On the other hand, if you intend to release audio/video content to the public, or develop software or hardware involving codecs, beware.

Do your research to determine if you need a patent license.

Fortunately, some codecs are not subject to royalties, either because the related patents have expired (e.g. MP3), or because they were designed to be royalty-free from the start (e.g. FLAC, VP8, VP9, AV1).

Opus was also designed to use only royalty-free patents, but a group of companies with questionable practices is now trying to collect royalties for devices implementing Opus at the hardware level.