knowledge/technology/files/media/video/AV1.md
2024-04-04 13:15:31 +02:00

obj: format
mime: video/av1
rev: 2024-04-04

AV1

AV1 is a royalty-free video codec designed to be an alternative to the widely used H.264 and HEVC codecs. It was developed by the Alliance for Open Media, a consortium of technology companies including Google, Mozilla, and Netflix.
Encoding can be done with ffmpeg or av1an.

Key Features

AV1 includes a range of features designed to provide high-quality video compression while remaining open and accessible to all users:

  • High compression efficiency: AV1 is designed to deliver a given visual quality at a lower bitrate than older codecs such as H.264. This makes it well suited to streaming video over the internet.
  • Scalable video coding: AV1 supports spatial and temporal scalability, so one stream can carry several quality layers and a receiver can decode only the layers that fit its available bandwidth.
  • Royalty-free: Unlike H.264 and HEVC, AV1 is intended to be usable without paying licensing fees or royalties.
  • Wide range of applications: AV1 is suitable for a wide range of applications, including live streaming, video-on-demand services, and video conferencing.
  • Open and accessible: AV1 is an open standard with a publicly available specification and open-source encoders, so anyone can use it and contribute to its development.

Advantages

AV1 offers a number of advantages over other video codecs:

  • Higher quality video: At the same bitrate, AV1 generally delivers higher visual quality than older codecs such as H.264.
  • More efficient compression: Files of comparable quality are smaller, which shortens download and upload times.
  • Better streaming performance: Because less bandwidth is needed for a given quality, streams are less likely to buffer on constrained connections.
  • Wide compatibility: AV1 is compatible with a wide range of devices and platforms, including desktop and mobile devices.

Usage with av1an

Av1an is a video encoding framework. It can increase encoding speed and improve CPU utilization by running multiple encoder processes in parallel. It also provides target quality, VMAF plotting, and more.

Encode a video file with the default parameters:

av1an -i input.mkv

Options

Option Description
-i <INPUT> Input file to encode. Can be a video or vapoursynth (.py, .vpy) script.
-o <OUTPUT_FILE> Video output file.
--temp <TEMP> Temporary directory to use. If not specified, the temporary directory name is a hash of the input file name.
-q, --quiet Disable printing progress to the terminal.
--verbose Print extra progress info and stats to terminal.
-l, --log-file <LOG_FILE> Log file location. [default: <temp dir>/log.log].
--log-level <LOG_LEVEL> Set log level for log file (does not affect command-line log level).
-r, --resume Resume previous session from temporary directory.
-k, --keep Do not delete the temporary folder after encoding has finished.
--force Do not check if the encoder arguments specified by -v/--video-params are valid.
-y Overwrite output file without confirmation.
--max-tries <MAX_TRIES> Maximum number of chunk restarts for an encode. [default: 3].
-w, --workers <WORKERS> Number of workers to spawn [0 = automatic]. [default: 0].
--set-thread-affinity <SET_THREAD_AFFINITY> Pin each worker to a specific set of threads of this size (disabled by default). This is currently only supported on Linux and Windows, and does nothing on unsupported platforms. Leaving this option unspecified allows the OS to schedule all processes spawned.
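
To show how the general options above combine on one command line, here is a minimal Python sketch that assembles an av1an invocation as an argument list. The helper name `build_av1an_command` and the file names are hypothetical, and the sketch only prints the command rather than running it. It uses only flags listed in the table above.

```python
import shlex


def build_av1an_command(input_path, output_path, workers=0, resume=False, keep=False):
    """Assemble an av1an command line from options listed in the table above.

    This helper is illustrative only; it is not part of av1an.
    """
    cmd = ["av1an", "-i", input_path, "-o", output_path, "--workers", str(workers)]
    if resume:
        cmd.append("--resume")  # continue a previous session from the temp directory
    if keep:
        cmd.append("--keep")    # keep the temporary folder after encoding
    return cmd


# Hypothetical file names; 4 workers, resuming an interrupted encode.
cmd = build_av1an_command("input.mkv", "output.mkv", workers=4, resume=True)
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems if it is later passed to `subprocess.run`.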

Scene Detection

Option Description
-s, --scenes <SCENES> File location for scenes.
--split-method <SPLIT_METHOD> Method used to determine chunk boundaries. Can be "av-scenechange" or "none". [default: av-scenechange].
--sc-method <SC_METHOD> Scene detection algorithm to use for av-scenechange. Can be "standard" or "fast". [default: standard].
--sc-only Run only the scene detection, then exit. Requires a scene file with --scenes.
--sc-pix-format <SC_PIX_FORMAT> Perform scene detection with this pixel format.
--sc-downscale-height <SC_DOWNSCALE_HEIGHT> Optional downscaling for scene detection. Specify as the desired maximum height to scale to (e.g., "720" to downscale to 720p — this will leave lower resolution content untouched). Downscaling improves scene detection speed but lowers accuracy, especially when scaling to very low resolutions. By default, no downscaling is performed.
-x, --extra-split <EXTRA_SPLIT> Maximum scene length. When a scenecut is found whose distance to the previous scenecut is greater than the value specified by this option, one or more extra splits (scenecuts) are added. Set this option to 0 to disable adding extra splits.
--min-scene-len <MIN_SCENE_LEN> Minimum number of frames for a scenecut [default: 24].
--force-keyframes <FORCE_KEYFRAMES> Comma-separated list of frames to force as keyframes. Can be useful for improving seeking with chapters, etc. Frame 0 will always be a keyframe and does not need to be specified here.
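
The effect of --extra-split can be sketched as follows. This is a simplified illustration, not av1an's actual implementation: it assumes that an overlong scene is divided into equal parts, and the function name `add_extra_splits` is hypothetical.

```python
import math


def add_extra_splits(scenecuts, total_frames, extra_split):
    """Return scenecut frames with extra splits so no scene exceeds extra_split frames.

    scenecuts: sorted frame numbers where scenes start (frame 0 included).
    extra_split: maximum scene length in frames; 0 disables extra splits.
    Assumption: long scenes are divided into equal parts.
    """
    if extra_split <= 0:
        return list(scenecuts)
    boundaries = list(scenecuts) + [total_frames]
    result = []
    for start, end in zip(boundaries, boundaries[1:]):
        length = end - start
        # Number of pieces needed so that each piece is at most extra_split long.
        pieces = math.ceil(length / extra_split)
        for i in range(pieces):
            result.append(start + (length * i) // pieces)
    return result


# A 600-frame scene (frames 100 to 699) is cut into three 200-frame pieces.
print(add_extra_splits([0, 100, 700], total_frames=1000, extra_split=240))
# [0, 100, 300, 500, 700, 850]
```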

Encoding

Option Description
-e, --encoder Specifies the video encoder to use. The default value is aom.
-v, --video-params Specifies parameters for the video encoder. These parameters are specific to each encoder and cannot be specified using ffmpeg syntax. For example, CRF is specified in ffmpeg via -crf <crf>, but the x264 binary takes this value with double dashes, as in --crf <crf>. See the --help output of each encoder for a list of valid options.
-p, --passes Specifies the number of encoder passes to use. Since aom and vpx benefit from two-pass mode even with constant quality mode, two-pass mode is used by default for these encoders. When using aom or vpx with RT mode (--rt), one-pass mode is always used regardless of the value specified by this flag (as RT mode in aom and vpx only supports one-pass encoding).
-a, --audio-params Specifies audio encoding parameters using ffmpeg syntax. If not specified, -c:a copy is used. Do not use ffmpeg's -map syntax with this option. Instead, use the colon syntax with each parameter you specify. Subtitles are always copied by default. Example to encode all audio tracks with libopus at 128k: -a="-c:a libopus -b:a 128k"
-f, --ffmpeg Specifies FFmpeg filter options. For more information, see the FFmpeg documentation.
-m, --chunk-method Method used for piping exact ranges of frames to the encoder.
Methods that require an external VapourSynth plugin:
  • lsmash: Generally the best and most accurate method. Does not require intermediate files. Errors generally only occur if the input file itself is broken (for example, if the video bitstream is invalid in some way, video players usually try to recover from the errors as much as possible even if it results in visible artifacts, while lsmash will instead throw an error). Requires the lsmashsource VapourSynth plugin to be installed.
  • ffms2: Accurate and does not require intermediate files. Can sometimes have bizarre bugs that are not present in lsmash (that can cause artifacts in the piped output). Slightly faster than lsmash for y4m input. Requires the ffms2 VapourSynth plugin to be installed.
Methods that only require FFmpeg:
  • hybrid: Uses a combination of segment and select. Usually accurate but requires intermediate files (which can be large). Avoids decoding irrelevant frames by seeking to the first keyframe before the requested frame and decoding only a (usually very small) number of irrelevant frames until relevant frames are decoded and piped to the encoder.
  • select: Extremely slow, but accurate. Does not require intermediate files. Decodes from the first frame to the requested frame, without skipping irrelevant frames (causing quadratic decoding complexity).
  • segment: Creates chunks based on keyframes in the source. Not frame exact, as it can only split on keyframes in the source. Requires intermediate files (which can be large).
Default: lsmash (if available), otherwise ffms2 (if available), otherwise hybrid.
--chunk-order The order in which av1an will encode chunks. Available methods:
  • long-to-short: The longest chunks will be encoded first. This method results in the smallest amount of time with idle cores, as the encode will not be waiting on a very long chunk to finish at the end of the encode after all other chunks have finished.
  • short-to-long: The shortest chunks will be encoded first.
  • sequential: The chunks will be encoded in the order they appear in the video.
  • random: The chunks will be encoded in a random order. This will provide a more accurate estimated filesize sooner in the encode.
Default: long-to-short
--photon-noise Generates a photon noise table and applies it using grain synthesis [strength: 0-64] (disabled by default). Photon noise tables are more visually pleasing than the film grain generated by aomenc, and provide a consistent level of grain regardless of the level of grain in the source. Strength values correlate to ISO values, e.g. 1 = ISO 100, and 64 = ISO 6400. This option currently only supports aomenc and rav1e. An encoder's own grain synthesis still works without this option if the corresponding parameter is passed to the encoder. However, the two should not be used together, and specifying this option will disable the encoder's internal grain synthesis.
--chroma-noise Adds chroma grain synthesis to the grain table generated by --photon-noise. Default: false
-c, --concat <CONCAT> Determines method used for concatenating encoded chunks and audio into output file. Choices: ffmpeg, mkvmerge, ivf (experimental). Default: ffmpeg
--pix-format <PIX_FORMAT> FFmpeg pixel format. Default: yuv420p10le
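
To illustrate why long-to-short is the default --chunk-order, the following Python sketch compares total encode time for two orderings. It is a toy model rather than av1an's scheduler: it assumes encode time is proportional to chunk length and that each free worker simply takes the next chunk in the list. The function name `makespan` and the chunk lengths are hypothetical.

```python
import heapq


def makespan(chunk_lengths, workers):
    """Total wall time when each free worker takes the next chunk in list order.

    Assumption: encode time is proportional to chunk length.
    """
    finish_times = [0] * workers  # time at which each worker becomes free
    heapq.heapify(finish_times)
    for length in chunk_lengths:
        earliest_free = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest_free + length)
    return max(finish_times)


chunks = [2, 2, 2, 2, 8]  # hypothetical chunk lengths in arbitrary time units

print(makespan(sorted(chunks, reverse=True), workers=2))  # long-to-short: 8
print(makespan(sorted(chunks), workers=2))                # short-to-long: 12
```

With the long chunk scheduled last, one worker sits idle while the other finishes it, which is the idle time the default ordering avoids.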

VMAF

Option Description
--vmaf Plot an SVG of the VMAF for the encode. This option is independent of --target-quality, i.e. it can be used with or without it. The SVG plot is created in the same directory as the output file.
--vmaf-path <VMAF_PATH> Path to VMAF model (used by --vmaf and --target-quality). If not specified, ffmpeg's default is used.
--vmaf-res <VMAF_RES> Resolution used for VMAF calculation [default: 1920x1080]
--vmaf-threads <VMAF_THREADS> Number of threads to use for VMAF calculation
--vmaf-filter <VMAF_FILTER> Filter applied to the source for the VMAF calculation. This option should be specified if the source is cropped, for example.

Target Quality

Option Description
--target-quality <TARGET_QUALITY> Target a VMAF score for encoding (disabled by default). For each chunk, target quality uses an algorithm to find the quantizer/crf needed to achieve a certain VMAF score. Target quality mode is much slower than normal encoding, but can improve the consistency of quality in some cases. The VMAF score range is 0-100 (where 0 is the worst quality, and 100 is the best). Floating-point values are allowed.
--probes <PROBES> Maximum number of probes allowed for target quality [default: 4]
--probing-rate <PROBING_RATE> Frame rate for probes; 1 uses the original frame rate [default: 1]
--probe-slow Use encoding settings for probes specified by --video-params rather than faster, less accurate settings. Note that this always performs encoding in one-pass mode, regardless of --passes.
--min-q <MIN_Q> Lower bound for target quality Q-search early exit. If min_q is tested and the probe's VMAF score is lower than target_quality, the Q-search early exits and min_q is used for the chunk. If not specified, the default value is used (chosen per encoder).
--max-q <MAX_Q> Upper bound for target quality Q-search early exit. If max_q is tested and the probe's VMAF score is higher than target_quality, the Q-search early exits and max_q is used for the chunk. If not specified, the default value is used (chosen per encoder).
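
The per-chunk search described above can be pictured with a simple bisection. This Python sketch is a stand-in, not av1an's actual algorithm: it assumes VMAF falls as the quantizer rises, and `probe_vmaf` is a hypothetical callback that encodes a probe at a given quantizer and returns its VMAF score. The name `find_q` is likewise made up for illustration.

```python
def find_q(probe_vmaf, target, min_q, max_q, max_probes=4):
    """Pick the highest probed quantizer whose VMAF still meets the target.

    probe_vmaf(q) is a hypothetical callback returning the VMAF of a probe at q.
    Assumption: VMAF decreases as q increases. Bisection stands in for
    whatever search av1an really uses.
    """
    low, high = min_q, max_q
    chosen = min_q  # fallback when no probe reaches the target
    for _ in range(max_probes):
        if low > high:
            break
        q = (low + high) // 2
        if probe_vmaf(q) >= target:
            chosen = q
            if q == max_q:
                break  # even max_q meets the target: stop early
            low = q + 1   # quality to spare, try a higher quantizer
        else:
            if q == min_q:
                break  # even min_q misses the target: stop early
            high = q - 1  # below target, try a lower quantizer
    return chosen


# Toy quality model for demonstration only: VMAF drops by 0.5 per quantizer step.
print(find_q(lambda q: 100 - q / 2, target=90, min_q=10, max_q=50))  # 19
```

With only four probes the search stops at 19 although 20 would also meet the target; raising `max_probes` narrows that gap at the cost of more probe encodes.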