Posted on January 20, 2024, by Sébastien
This article is also available in French.
TLDR?
If you don’t have 15 minutes to spend reading this blog post, here’s a 3-minute version instead:
FFmpeg is a powerful tool, able to decode and encode hundreds of different audio and video formats.
In fact, it isn’t limited to decoding and encoding, it can also:
This blog post is solely about transcoding, though.
If you’ve never heard of FFmpeg, know that it’s a command-line program.
Wait… don’t run away just yet, it’s actually quite easy to use.
It provides libraries used by countless other programs, including media players, such as VLC, and a lot of audio/video transcoding software (for instance, Handbrake).
I used to use GUI apps for transcoding, but I’ve since cut out the middleman, and I now use FFmpeg directly.
If you’re running Windows, you can download prebuilt binaries:
https://ffmpeg.org/download.html#get-packages
Practically speaking, the last time I needed to download FFmpeg, I used the latest build from:
https://github.com/BtbN/FFmpeg-Builds/releases
More specifically:
ffmpeg-master-latest-win64-gpl.zip
Extract the ZIP file somewhere.
For convenience, you can add the folder containing “ffmpeg.exe” to your “Path” environment variable, which will enable you to call “ffmpeg.exe” from any folder.
Searching for “environment variables” in the start menu should bring up the corresponding dialog, where you can edit the “Path” user variable.
To avoid breaking anything, make sure you don’t remove anything from this variable (only append to it).
Then, open a terminal: Shift + right-click on the desired folder, then choose “Open in Windows Terminal”.
If necessary, you can change the current folder using the “cd” command (“cd” stands for “change directory”), such as:
cd C:\Users\Sebastien\Videos\
If you’re on Linux, installation is usually very simple, as FFmpeg is bundled by most major distributions.
For instance, on Ubuntu or Debian, just type:
sudo apt install ffmpeg
Imagine you have an uncompressed WAV file extracted straight from a CD.
You can convert it to a FLAC file, which can be around 30% smaller (sometimes up to 50%) without losing any quality, thanks to lossless compression.
ffmpeg -i input.wav output.flac
Yes, it’s that simple. The basic syntax only requires specifying the input file with the “-i” option, and ending the command with the output filename.
FFmpeg will automatically detect that it needs to use the FLAC encoder, based on the file extension.
Of course, there’s no point in converting lossy audio (such as MP3) to FLAC: you won’t recover any lost quality by doing that.
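If you have a whole folder of WAV files, the same command can be wrapped in a loop. Here is a minimal sketch, assuming a POSIX shell (Linux, macOS, or WSL / Git Bash on Windows). The function names “flac_name” and “convert_all_wav” are made up for this example.

```shell
# Derive the output name: "track.wav" -> "track.flac" (hypothetical helper).
flac_name() {
    printf '%s\n' "${1%.wav}.flac"
}

# Convert every WAV file in the current folder, skipping files already converted.
convert_all_wav() {
    for f in *.wav; do
        [ -e "$f" ] || continue      # no WAV files: the pattern stays unexpanded
        out=$(flac_name "$f")
        if [ ! -e "$out" ]; then
            ffmpeg -i "$f" "$out"
        fi
    done
}
```

Run “convert_all_wav” from the folder that contains the WAV files. The original files are left untouched.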
Lossy encoding enables significant savings in terms of file size, at the expense of some (irreversible) quality loss.
MP3 is a very famous lossy codec, with very wide compatibility. Virtually all devices/software that support lossy audio codecs support MP3.
ffmpeg -i input.wav -c:a libmp3lame -q:a 0 output.mp3
Compared to the FLAC example, there are 2 more options:
Here, I’ve explicitly selected the LAME MP3 encoder (libmp3lame). I’m pretty sure it’s the default MP3 encoder, so you could probably omit this option in this case.
MP3 can be encoded either using a constant bitrate (up to 320 kbit/s), or using a variable bitrate (VBR). Constant bitrate is considered wasteful, as the signal complexity varies throughout audio streams, so it makes sense to allocate less bitrate to parts of the track that are easier to encode.
Specifying the “quality” parameter (“-q:a”) instructs the LAME encoder to use a variable bitrate. 0 is the highest quality, 9 would be the lowest.
Here are the very approximate bitrates you can expect, for stereo audio, by tweaking the quality setting:
The vast majority of people cannot differentiate between a lossless source and a properly encoded MP3 with an average bitrate of 245 kbit/s by ear. Even though MP3 isn’t lossless, high bitrates are said to be “transparent”.
Many years ago, I found out in a blind test that I was personally unable to tell lossless audio apart from variable bitrate MP3 from LAME averaging at 190 kbit/s.
If you aren’t aiming for transparency, you can go as low as 130 kbit/s (“-q:a 5”) with decent quality, as long as your source file is of high quality.
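To get a feel for what these bitrates mean in practice, the file size is roughly the average bitrate multiplied by the duration. Here is a small sketch of that arithmetic, assuming a POSIX shell. “est_size_kb” is a hypothetical helper, and the result ignores tags and container overhead.

```shell
# Estimated size in kilobytes (1 kB = 1000 bytes):
#   kbit/s * seconds = kbit, divided by 8 = kB
est_size_kb() {
    echo $(( $1 * $2 / 8 ))
}

est_size_kb 245 240   # 4-minute track at ~245 kbit/s: prints 7350
est_size_kb 130 240   # same track at ~130 kbit/s:     prints 3900
```

In other words, about 7.4 MB versus 3.9 MB for a 4-minute stereo track.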
AAC (Advanced Audio Coding) is another famous lossy codec, popularized by Apple with iPod devices and iTunes. AAC is not as ubiquitous as MP3, though.
Even though AAC is technically superior to MP3, the native AAC encoder available in FFmpeg is unfortunately not good: it’s actually worse than the LAME MP3 encoder.
FFmpeg supports another, superior AAC encoder, “libfdk_aac”, but due to licensing issues, you cannot download pre-built binaries including it. You would have to compile FFmpeg yourself (good luck…).
If you really want to encode AAC audio with a decent quality, your best bet is to use iTunes, not FFmpeg.
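If you are not sure which AAC encoders your FFmpeg build includes, you can ask it. Here is a minimal sketch, assuming a POSIX shell with “grep”. “has_encoder” and “report_aac_encoders” are hypothetical helpers that do a rough text match on the output of “ffmpeg -encoders”.

```shell
# Succeeds if the given encoder name appears in this build's encoder list.
# Also fails (returns non-zero) if ffmpeg itself is not installed.
has_encoder() {
    ffmpeg -hide_banner -encoders 2>/dev/null | grep -q -- " $1 "
}

# Report which of the two AAC encoders mentioned above are present.
report_aac_encoders() {
    for enc in aac libfdk_aac; do
        if has_encoder "$enc"; then
            echo "$enc: available"
        else
            echo "$enc: not available"
        fi
    done
}
```

Calling “report_aac_encoders” prints one line per encoder. On most prebuilt binaries, you should expect “libfdk_aac” to be reported as not available.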
Opus is a more recent codec, and is technically superior to virtually every other lossy audio codec, both for music, and for low-bitrate voice communications.
It’s used by streaming providers such as SoundCloud, YouTube, or Vimeo, and for voice communications in Skype, Discord, WhatsApp, Signal, and many others.
ffmpeg -i input.wav -c:a libopus -b:a 160k output.opus
The above command will use the variable bitrate mode, with an average bitrate of roughly 160 kbit/s.
“-b:a” stands for “audio bitrate” (“-ab” is an alias for “-b:a”).
You can also use the “.ogg” extension for Opus files.
Opus consistently ranks first in listening tests (blind tests) featuring other lossy codecs.
For instance, have a look at the following results from 2014:
http://listening-test.coresv.net/results.htm
The quality loss of Opus at 96 kbit/s was considered by many to be imperceptible, and “perceptible, but not annoying” by others.
AAC (encoded with iTunes) wasn’t far behind, in terms of quality, at the same bitrate.
MP3 was only slightly worse than AAC (closer to “perceptible, but not annoying”), albeit with a higher bitrate (around 130 kbit/s).
If you need a lossy audio codec, you should use Opus, as long as the target device/software supports it.
H.264, also known as AVC (Advanced Video Coding), is a very popular video compression standard.
It’s the main video codec used in standard Blu-ray discs (Ultra HD Blu-rays use its successor, H.265).
Although newer codecs exist, it’s still widely used on the Web.
In addition to compatibility, its popularity brings another advantage: hardware-acceleration.
Most devices built in the past 10 to 15 years can decode H.264 at the hardware level, which results in improved performance, and lower power usage, compared to software decoding.
The H.264 encoder in FFmpeg is x264, exposed as “libx264”:
ffmpeg -i input.mp4 -c:v libx264 -crf 24 -c:a libmp3lame -q:a 2 output.mp4
Feel free to use “-vcodec” in place of “-c:v” if you want (“video codec”).
Here, I used the MP4 container (“.mp4” extension), which is widely supported, including by Web browsers.
Matroska (MKV) is another interesting container format. It’s very versatile: it can hold an unlimited number of video, audio and subtitle tracks in one file. MP4 is more restrictive when it comes to subtitles: it only supports a few text formats, and player support for them is uneven, so subtitles are often “burned” into the video stream instead. MKV is not as widely supported as MP4, though.
In order to use it, just replace “.mp4” by “.mkv” in the command above.
The “-crf” parameter allows you to adjust the output quality of the video stream. CRF stands for “Constant Rate Factor”, and it means that the encoder will try to keep a roughly constant quality throughout the video, rather than a constant bitrate.
Here’s the explanation from the FFmpeg documentation about CRF values:
The range of the CRF scale is 0–51, where 0 is lossless (for 8 bit only, for 10 bit use -qp 0), 23 is the default, and 51 is worst quality possible. A lower value generally leads to higher quality, and a subjectively sane range is 17–28. Consider 17 or 18 to be visually lossless or nearly so; it should look the same or nearly the same as the input but it isn’t technically lossless.
The range is exponential, so increasing the CRF value +6 results in roughly half the bitrate / file size, while -6 leads to roughly twice the bitrate.
https://trac.ffmpeg.org/wiki/Encode/H.264
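The rule of thumb quoted above (6 CRF points for a factor of two) can be turned into a quick estimate. This is only a sketch of that arithmetic, assuming a POSIX shell with “awk”. “crf_size_ratio” is a hypothetical helper, and real results vary with the content.

```shell
# Approximate size ratio when going from CRF $1 to CRF $2:
#   ratio = 2 ^ ((old_crf - new_crf) / 6)
crf_size_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 2 ^ ((a - b) / 6) }'
}

crf_size_ratio 24 30   # prints 0.50 (about half the size)
crf_size_ratio 24 18   # prints 2.00 (about twice the size)
```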
As of 2024, the most advanced video codec you can realistically use is AV1. (Not to be confused with the very old AVI container format!)
There are several AV1 encoders in FFmpeg: “libaom”, “SVT-AV1” and “rav1e”.
https://trac.ffmpeg.org/wiki/Encode/AV1
I got better results with “SVT-AV1” on my machine, in terms of encoding speed relative to the output quality.
ffmpeg -i input.mp4 -c:v libsvtav1 -crf 38 -preset 4 -c:a libopus -b:a 128k output.webm
This time, I used the WebM container. It’s an open format, and supports the following codecs:
I could have used MKV or MP4 instead.
The “preset” option of “libsvtav1” is used to adjust the tradeoff between encoding speed and quality. It goes from 0 (highest quality) to 13 (fastest). You should set it as low as you can. That is to say, the encoding speed should be as slow as you can afford. Otherwise, you could end up with disappointing quality.
The CRF values are not directly comparable to x264: in a quick test, “SVT-AV1” gave me a slightly higher quality with a CRF of 38 than x264 with a CRF of 24.
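Since CRF scales differ between encoders, the easiest way to find matching settings is to encode the same short excerpt with each encoder and compare the results. Here is a hedged sketch, assuming a POSIX shell and an FFmpeg build that includes both encoders. The helper names and the 20-second excerpt length are arbitrary choices for this example.

```shell
# Name of the test file for a given encoder and CRF (hypothetical helper).
sample_name() {
    printf 'sample_%s_crf%s.mkv\n' "$1" "$2"
}

# Encode the first 20 seconds of the input, video only (-an drops the audio).
# Extra arguments (such as -preset 4) are passed through to ffmpeg.
encode_sample() {
    input=$1; encoder=$2; crf=$3
    shift 3
    ffmpeg -i "$input" -t 20 -an -c:v "$encoder" -crf "$crf" "$@" \
        "$(sample_name "$encoder" "$crf")"
}

# Encode one excerpt with each encoder, then list the resulting file sizes.
compare_encoders() {
    encode_sample "$1" libx264 24
    encode_sample "$1" libsvtav1 38 -preset 4
    ls -l sample_*.mkv
}
```

“compare_encoders input.mp4” produces two files whose sizes and visual quality can be compared side by side.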
According to commonly cited figures, AV1 can achieve roughly 30% better compression than VP9, and roughly 50% better than H.264.
Hardware decoding requires recent hardware, though.
It started appearing in hardware products in 2020 (from Intel, NVIDIA, AMD), and became more common in products released in 2023 (by Apple, Qualcomm, …).
Without hardware decoding, video playback is going to use a lot of CPU power, and drain the battery fast.
VP8 is an open format, designed as an alternative to H.264. It’s a bit inferior in terms of quality, though.
It was superseded by VP9, with an improved compression efficiency. It should be able to do better than H.264 on that front.
ffmpeg -i input.mp4 -c:v libvpx-vp9 -crf 40 -b:v 0 -c:a libopus -b:a 128k output.webm
H.265 is the successor to H.264, and competes with VP9 in terms of quality / file size ratio.
Hardware decoding is more common for VP9 and H.265 than for AV1 (but less common than for H.264).
If you ever need to encode video for direct use in a Web browser (i.e. without re-encoding by a platform such as YouTube), you should check out “Can I use”:
On the one hand, if you’re a private individual encoding audio/video files for your own use, you probably don’t have to care about software patents.
On the other hand, if you intend to release audio/video content to the public, or develop software or hardware involving codecs, beware.
Do your research to determine if you need a patent license.
Fortunately, some codecs are not subject to royalties, either because the related patents have expired (e.g. MP3), or because they were designed to be royalty-free from the start (e.g. FLAC, VP8, VP9, AV1).
Opus was also designed to use only royalty-free patents, but a group of companies with questionable practices is now trying to collect royalties for devices implementing Opus at the hardware level.