JMARyA a97b7558d5
perf: extract embedded info.json from ffprobe extradata instead of 2nd ffmpeg pass
Replaces the dump_attachment() ffmpeg subprocess (which reads the
entire file) with ffprobe -show_data extradata available in the
same ffprobe JSON output. Eliminates a full-file read per video.
2026-05-26 12:08:56 +02:00

mlib

Media library CLI — scan, lint, verify, and index large media collections stored in S3 or on a local filesystem.

Overview

mlib manages per-directory .meta.toml sidecar files as a persistent index. These sidecars store file hashes, ffprobe metadata (codecs, tags, streams), and verification timestamps. Read-only commands such as lint, health, dupes, and stats work from the index without touching the original files. The exceptions are verify, which re-reads every file to re-hash it, and fix --apply, which can modify media files.

Dependencies

  • Rust (stable)
  • ffprobe (part of ffmpeg) — must be in $PATH
  • For S3: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env vars

Build

cargo build --release
# binary at target/release/mlib

With Nix:

nix build
nix develop   # dev shell with Rust + ffmpeg

Configuration

Copy mlib.example.toml to mlib.toml and fill in your values. mlib looks for ./mlib.toml first, then for ~/.config/mlib/config.toml.

[storage]
backend  = "s3"
endpoint = "https://s3.example.com"
region   = "us-east-1"

[libraries.music]
bucket = "media-music"   # S3 mode
path   = "./music"       # local mode
mirror = "./sidecars/music"  # optional local mirror for sidecars (see GitOps)

[verify]
max_age_days = 90

Commands

mlib [--config <path>] [--json] <command>
| Command | Description |
|---------|-------------|
| index | Show file counts and sizes for all libraries |
| scan | Probe files with ffprobe and write .meta.toml sidecars |
| lint | Check filenames and metadata against configured rules |
| verify | Re-hash files and compare against stored checksums |
| health | Aggregated lint dashboard: violation counts and bar charts per library |
| fix | Preview and apply automated fixes for lint violations |
| sync | Pull sidecars from remote into the local mirror |
| clean | Remove orphaned .meta.toml sidecars |
| dupes | Find duplicate files across libraries by blake3 hash |
| stats | Aggregate library analytics: codecs, bitrates, tag completeness |
| tui | Interactive browser TUI |

Most commands accept an optional path argument to scope to a single library or subpath (index and tui take none):

mlib scan                        # all libraries
mlib scan music                  # music library only
mlib scan music/Portishead       # subpath within music

Exit codes: 0 ok · 1 warnings · 2 errors
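
As a rough sketch of how those exit codes could be derived from a lint result: the worst severity seen decides the code. The Severity enum and exit_code function are hypothetical names used for illustration, not mlib's internals.

```rust
/// Hypothetical severity levels; mlib's internal types may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Warning,
    Error,
}

/// Map the worst severity seen to the documented exit code:
/// 0 = ok, 1 = warnings, 2 = errors.
fn exit_code(violations: &[Severity]) -> i32 {
    match violations.iter().copied().max() {
        None => 0,
        Some(Severity::Warning) => 1,
        Some(Severity::Error) => 2,
    }
}

fn main() {
    let found = [Severity::Warning, Severity::Error, Severity::Warning];
    println!("exit code: {}", exit_code(&found));
}
```

A CI job can then treat 1 as "report but pass" and 2 as a hard failure.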

scan

mlib scan [path] [--full] [--dry-run]

Walks the library, runs ffprobe on each media file, and writes .meta.toml sidecars. Files that have not changed (size + mtime) are skipped by default.

  • --full — re-probe all files regardless of mtime
  • --dry-run — show what would be scanned without writing

If a mirror path is configured for the library, the sidecar is also written there on every update.
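
The skip rule described above (unchanged size and mtime) could look roughly like this. It is a sketch under assumptions: the Indexed struct and needs_probe function are hypothetical, and where mlib actually keeps the previous mtime is not documented here.

```rust
use std::time::{Duration, SystemTime};

/// What a previous scan remembered about a file.
/// Hypothetical struct: storing the mtime like this is an assumption.
struct Indexed {
    size: u64,
    mtime: SystemTime,
}

/// Decide whether a file needs to be probed again.
/// `full` mirrors the --full flag: probe regardless of size and mtime.
fn needs_probe(indexed: Option<&Indexed>, size: u64, mtime: SystemTime, full: bool) -> bool {
    if full {
        return true;
    }
    match indexed {
        None => true, // never indexed before
        Some(prev) => prev.size != size || prev.mtime != mtime,
    }
}

fn main() {
    let t0 = SystemTime::UNIX_EPOCH + Duration::from_secs(1_700_000_000);
    let prev = Indexed { size: 1024, mtime: t0 };
    println!("unchanged file: probe = {}", needs_probe(Some(&prev), 1024, t0, false));
    println!("resized file:   probe = {}", needs_probe(Some(&prev), 2048, t0, false));
}
```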

lint

mlib lint [path]

Reads sidecars and reports violations against the rules in [lint.<library>]. Run scan first to populate the index. When a mirror is configured it reads from there, so no S3 access is needed.

verify

mlib verify [path]

Re-hashes every file and compares the result against the stored blake3 checksum. On a match, it stamps verified = "<date>" in the sidecar. In S3 mode this downloads every file, so it runs only when invoked explicitly and can be slow.

health

mlib health [path]

Runs a full lint pass (including verification-age checks) and renders a per-library aggregate dashboard: violation counts grouped by type, each with a proportional bar chart and the percentage of files affected. When a mirror is configured, it reads from there and needs no S3 access.

Violation codes that have at least one automated fix available are flagged with a fixable marker. The footer shows the total number of fixable violations and a hint to run mlib fix --apply.
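
A proportional bar with a percentage, as used in the dashboard, can be sketched as follows. The function name bar_row, the bar width, and the glyphs are illustrative assumptions and do not reproduce mlib's actual output.

```rust
/// Render one dashboard row: violation code, a bar proportional to the
/// share of affected files, the raw count, and the percentage.
/// Width, glyphs, and column order are illustrative, not mlib's layout.
fn bar_row(code: &str, affected: usize, total: usize, width: usize) -> String {
    let ratio = if total == 0 { 0.0 } else { affected as f64 / total as f64 };
    // Clamp so an inconsistent count can never overflow the bar.
    let filled = ((ratio * width as f64).round() as usize).min(width);
    let bar = format!("{}{}", "#".repeat(filled), ".".repeat(width - filled));
    let pct = ratio * 100.0;
    format!("{code:<16} {bar} {affected:>5} ({pct:.1}%)")
}

fn main() {
    // Hypothetical counts for a library of 40 files.
    for (code, affected) in [("cover", 10), ("artist_match", 3), ("no_hash", 0)] {
        println!("{}", bar_row(code, affected, 40, 20));
    }
}
```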

fix

mlib fix [path] [--apply] [--code <code>]

Previews automated fixes for lint violations (dry-run by default). Run with --apply to execute them. Use --code to scope to a single violation type.

mlib lint shows fixable violations inline with [fix: ...] and a footer count. mlib health marks fixable codes with a fixable marker.

After applying, the affected sidecar is invalidated so mlib scan will re-index the changed files.

Currently fixable violations:

| Code | Fix | Condition |
|------|-----|-----------|
| cover | Extract embedded cover art → cover.jpg via ffmpeg | Any audio file in the dir has an embedded cover stream |
| artist_match | Write correct artist tag | Value derived from artist directory name |
| album_match | Write correct album tag | Value derived from album directory name (year stripped) |
| year_match | Write correct date tag | Year parsed from album directory name |
| missing_tags (artist/album/date) | Backfill tag from directory path | Value unambiguously derivable from path |

Tag writes use lofty and modify the actual media file. Re-run mlib scan after applying to refresh the index.

dupes

mlib dupes [path]

Scans all libraries (or a single library/subpath) for files that share the same blake3 hash. Groups duplicates, shows each path, and reports wasted space. Results are sorted by largest wasted space first. All data is read from the sidecar index — no file access needed.

  a3f8c1d2e4b56789…  128 MiB  ·  256 MiB wasted
    movies/The Matrix (1999)/The Matrix (1999).mkv
    movies/backup/The Matrix (1999)/The Matrix (1999).mkv
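
Grouping by hash and ranking by wasted space can be sketched like this. The record tuples, DupeGroup, and find_dupes are hypothetical. Counting every copy beyond the first as wasted is an assumption as well, so mlib's own figure may be computed differently.

```rust
use std::collections::HashMap;

/// One duplicate group: files sharing the same hash.
struct DupeGroup {
    hash: String,
    size: u64,
    paths: Vec<String>,
}

impl DupeGroup {
    /// Assumption: every copy beyond the first counts as wasted space.
    fn wasted(&self) -> u64 {
        self.size * (self.paths.len() as u64).saturating_sub(1)
    }
}

/// Group (path, hash, size) records by hash, keep groups with more than
/// one path, and sort by wasted space, largest first.
fn find_dupes(records: &[(&str, &str, u64)]) -> Vec<DupeGroup> {
    let mut by_hash: HashMap<&str, DupeGroup> = HashMap::new();
    for &(path, hash, size) in records {
        by_hash
            .entry(hash)
            .or_insert_with(|| DupeGroup { hash: hash.to_string(), size, paths: Vec::new() })
            .paths
            .push(path.to_string());
    }
    let mut groups: Vec<DupeGroup> =
        by_hash.into_values().filter(|g| g.paths.len() > 1).collect();
    // Ties are broken by hash so the order is deterministic.
    groups.sort_by(|a, b| b.wasted().cmp(&a.wasted()).then_with(|| a.hash.cmp(&b.hash)));
    groups
}

fn main() {
    // Placeholder hashes and sizes, not real index data.
    let records = [
        ("movies/A/A.mkv", "hash-a", 128u64 << 20),
        ("movies/backup/A/A.mkv", "hash-a", 128 << 20),
        ("music/01 Track.flac", "hash-b", 32 << 20),
    ];
    for g in find_dupes(&records) {
        println!("{}  {} bytes wasted", g.hash, g.wasted());
        for p in &g.paths {
            println!("    {p}");
        }
    }
}
```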

stats

mlib stats [path]

Derives aggregate statistics from the sidecar index — no file access needed. Outputs a per-library breakdown with colored bar charts:

  • Audio: codec distribution, sample rates, bit depths, avg/min/max bitrate
  • Video: codec distribution, resolution buckets (SD → 4K), HDR count
  • Container: format distribution
  • Tag completeness: per-tag present/total counts for music (title, artist, album, …) and books (title, creator, language)
  • Archive: format distribution (cbz, epub, pdf), CBZ image formats, page count stats

sync

mlib sync [path] [--dry-run]

Pulls all .meta.toml sidecars from the remote (S3) into the configured local mirror directory. Remote is the source of truth — this is pull-only. Sidecars that no longer exist on the remote are deleted from the mirror.

  • --dry-run — show what would be pulled/deleted without writing

Requires mirror to be set in [libraries.<name>].
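
The pull-only behaviour amounts to a one-way set comparison between the remote listing and the mirror. A minimal sketch, assuming both sides can be listed as sets of sidecar keys; SyncPlan and plan_sync are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Result of comparing the remote and mirror sidecar listings.
#[derive(Debug, PartialEq)]
struct SyncPlan {
    pull: Vec<String>,
    delete: Vec<String>,
}

/// Pull-only plan: fetch every remote sidecar and delete mirror entries
/// that no longer exist on the remote. Nothing is ever pushed.
fn plan_sync(remote: &BTreeSet<String>, mirror: &BTreeSet<String>) -> SyncPlan {
    SyncPlan {
        pull: remote.iter().cloned().collect(),
        delete: mirror.difference(remote).cloned().collect(),
    }
}

fn main() {
    let remote: BTreeSet<String> =
        ["A/.meta.toml", "B/.meta.toml"].iter().map(|s| s.to_string()).collect();
    let mirror: BTreeSet<String> =
        ["B/.meta.toml", "Gone/.meta.toml"].iter().map(|s| s.to_string()).collect();
    // With --dry-run, a plan like this would only be printed, not executed.
    println!("{:?}", plan_sync(&remote, &mirror));
}
```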

clean

mlib clean [path] [--dry-run] [--all]

Removes orphaned .meta.toml sidecars — those whose directory no longer contains any media files.

  • --all — wipe every sidecar unconditionally instead of just orphaned ones
  • --dry-run — show what would be deleted without writing

tui

mlib tui

Opens an interactive terminal UI for browsing libraries, viewing per-file metadata, and inspecting lint violations.

index

mlib index [--json]

Prints a summary table of all libraries grouped by section (visual / audio / reading). --json outputs machine-readable JSON.


GitOps

Set mirror on each library to a path inside a git repository. mlib scan and mlib sync both write sidecars there, keeping the mirror in sync with the remote. Committing the mirror gives you a full version-controlled history of your library metadata — no S3 access needed for CI linting or auditing past state.

[libraries.music]
bucket = "media-music"
mirror = "library/music"   # relative to mlib.toml

# Sync all sidecars from S3 into the mirror
mlib sync

# Commit the changes
git add library/
git commit -m "chore: sync sidecars"

mlib lint and mlib health read from the mirror automatically when it is configured and exists on disk, so they work offline without S3 credentials.


Sidecar format

One .meta.toml per directory. Contains a directory snapshot and a per-file record for each media file.

[dir]
scanned    = "2026-05-24"
total_size = 499942778
files      = ["01 Track.flac", "cover.jpg", ...]
subdirs    = ["Disc 1", "Disc 2"]

[files."01 Track.flac"]
path     = "Artist/Album (2024)/01 Track.flac"
size     = 32763444
blake3   = "6342fac3..."
checked  = "2026-05-24"
verified = "2026-05-24"       # set by `verify`

[files."01 Track.flac".format]
container = "flac"
duration  = 154.0
bitrate   = 1701

[files."01 Track.flac".tags]
artist    = "Artist"
album     = "Album"
title     = "Track"
track     = "1"
date      = "2024"

[[files."01 Track.flac".streams]]
index       = 0
type        = "audio"
codec       = "flac"
sample_rate = 44100
channels    = 2
bit_depth   = 24

[[files."01 Track.flac".streams]]
index  = 1
type   = "video"
codec  = "mjpeg"
width  = 600
height = 600

[files."01 Track.flac".streams.disposition]
attached_pic = true    # embedded cover art — ignored by video codec checks

Lint rules

All rules are opt-in via [lint.<library>] in the config. See mlib.example.toml for the full reference.

Per-path overrides are supported via [lint.<library>.overrides.<prefix>]. The most specific (longest) matching prefix wins and is merged field-by-field with the base config.

[lint.music.overrides."Compilations"]
audio.require_artist_match = false
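
Longest-prefix selection followed by a field-by-field merge can be sketched as below. Matching on whole path components is an assumption, and AudioRules with its two fields is a cut-down hypothetical stand-in for the real config struct.

```rust
/// Pick the override whose prefix is the longest match for `path`.
/// Assumption: a prefix only matches on a whole path component boundary.
fn best_override<'a>(path: &str, prefixes: &[&'a str]) -> Option<&'a str> {
    prefixes
        .iter()
        .copied()
        .filter(|p| {
            path == *p || path.strip_prefix(*p).map_or(false, |rest| rest.starts_with('/'))
        })
        .max_by_key(|p| p.len())
}

/// Hypothetical subset of the audio rules; None means "not set here".
#[derive(Debug, Clone, Default, PartialEq)]
struct AudioRules {
    require_artist_match: Option<bool>,
    require_cover: Option<bool>,
}

/// Field-by-field merge: the override wins where it sets a value,
/// otherwise the base value is kept.
fn merge(base: &AudioRules, over: &AudioRules) -> AudioRules {
    AudioRules {
        require_artist_match: over.require_artist_match.or(base.require_artist_match),
        require_cover: over.require_cover.or(base.require_cover),
    }
}

fn main() {
    let base = AudioRules { require_artist_match: Some(true), require_cover: Some(true) };
    let compilations = AudioRules { require_artist_match: Some(false), ..Default::default() };
    if best_override("Compilations/Various/01 Track.flac", &["Compilations"]).is_some() {
        println!("{:?}", merge(&base, &compilations));
    }
}
```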

Format

| Key | Values | Description |
|-----|--------|-------------|
| format.container | "flac", "mkv" | Required container format |
| format.video | "hevc", "h264" | Required video codec |
| format.audio | "flac", "opus" | Required audio codec |
| format.channels | "2", ">=6" | Required channel count |
| format.sample_rate | "44100" | Required sample rate in Hz |
| format.bit_depth | "24", ">=16" | Required bit depth |
| audio.bitrate | ">=600" | Minimum bitrate in kbps |
| video.resolution | ">=1080", "1920x1080" | Minimum or exact resolution |
| video.hdr | "none", "allowed", "required" | HDR policy |
| audio.languages | ["jpn", "eng"] | Required audio track languages |
| video.subtitle_languages | ["eng"] | Required subtitle track languages |
| video.video_bitrate | ">=2000" | Minimum video bitrate in kbps |
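
The numeric values above come in two shapes, an exact number ("24") or a minimum (">=16"). A small sketch of parsing and checking them follows. Constraint and parse_constraint are hypothetical, and forms such as "1920x1080" are deliberately left out.

```rust
/// A numeric constraint as written in the config:
/// "24" means exactly 24, ">=16" means 16 or more.
#[derive(Debug, PartialEq)]
enum Constraint {
    Exact(u64),
    AtLeast(u64),
}

/// Parse a constraint string; returns None for anything that is not a
/// plain number or a ">=" minimum.
fn parse_constraint(s: &str) -> Option<Constraint> {
    let s = s.trim();
    match s.strip_prefix(">=") {
        Some(rest) => rest.trim().parse::<u64>().ok().map(Constraint::AtLeast),
        None => s.parse::<u64>().ok().map(Constraint::Exact),
    }
}

impl Constraint {
    /// Check a measured value (channels, bit depth, kbps, ...) against the rule.
    fn accepts(&self, value: u64) -> bool {
        match *self {
            Constraint::Exact(n) => value == n,
            Constraint::AtLeast(n) => value >= n,
        }
    }
}

fn main() {
    let rule = parse_constraint(">=600").expect("valid constraint");
    // 1701 kbps is the bitrate from the sidecar example in this README.
    println!("bitrate ok: {}", rule.accepts(1701));
}
```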

Tags

| Key | Description |
|-----|-------------|
| audio.require_tags | artist, album, title, track must be present |
| audio.require_albumartist | albumartist tag must be present |
| audio.require_genre | genre tag must be present |
| audio.require_replaygain | REPLAYGAIN_TRACK_GAIN + REPLAYGAIN_ALBUM_GAIN must be present |
| audio.require_artist_match | artist tag must match artist directory name |
| audio.require_album_match | album tag must match album directory name (year suffix stripped) |
| audio.require_year_match | date tag year must match year in album directory name |
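
The album and year match rules both depend on splitting a directory name of the form {Album} ({Year}). A minimal sketch, assuming the year is always four digits in trailing parentheses; split_album_dir is a hypothetical helper, not mlib's parser.

```rust
/// Split an album directory name such as "Album (2024)" into the album
/// title (year suffix stripped) and the year, if one is present.
fn split_album_dir(dir: &str) -> (&str, Option<u16>) {
    let trimmed = dir.trim_end();
    if let Some(open) = trimmed.rfind(" (") {
        let inner = &trimmed[open + 2..];
        if let Some(digits) = inner.strip_suffix(')') {
            if digits.len() == 4 && digits.bytes().all(|b| b.is_ascii_digit()) {
                return (&trimmed[..open], digits.parse::<u16>().ok());
            }
        }
    }
    // No recognisable year suffix: the whole name is the album title.
    (trimmed, None)
}

fn main() {
    let (album, year) = split_album_dir("BEWARE (2026)");
    println!("album = {album:?}, year = {year:?}");
}
```

A tag check then reduces to comparing the album tag with the first element and the year in the date tag with the second.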

Structure

| Key | Description |
|-----|-------------|
| min_size | minimum file size in bytes; flags suspiciously small/corrupt files |
| audio.require_cover | cover.{jpg,png,webp,...} must exist in the directory |
| audio.require_lrc | .lrc lyrics file must exist alongside each audio track |
| audio.require_embedded_cover | each audio file must have embedded cover art (attached_pic stream) |
| audio.require_no_gaps | track numbers must be sequential with no missing numbers |
| video.require_nfo | .nfo metadata file must exist in the directory |
| video.require_poster | poster.* or folder.* image must exist in the directory |
| video.require_single_video | directory must contain exactly one video file |
| sidecars | glob patterns; any matching file is flagged as an unwanted sidecar |
| disallowed_files | glob patterns; any matching file is flagged as a violation |
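
For audio.require_no_gaps, gap detection over the track numbers in one directory could look like this. The sketch assumes numbering starts at 1 and ignores multi-disc layouts; missing_tracks is a hypothetical helper.

```rust
use std::collections::BTreeSet;

/// Return the track numbers missing from 1..=max.
/// Assumes numbering starts at 1; duplicates in the input are ignored.
fn missing_tracks(tracks: &[u32]) -> Vec<u32> {
    let present: BTreeSet<u32> = tracks.iter().copied().collect();
    match present.iter().next_back() {
        Some(&max) => (1..=max).filter(|n| !present.contains(n)).collect(),
        None => Vec::new(),
    }
}

fn main() {
    // Track tags are stored as strings in the sidecar, so parse them first.
    let tags = ["1", "2", "4", "5", "7"];
    let numbers: Vec<u32> = tags.iter().filter_map(|t| t.parse().ok()).collect();
    println!("missing: {:?}", missing_tracks(&numbers));
}
```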

Archive (CBZ / PDF)

| Key | Description |
|-----|-------------|
| archive.image_format | required image format for CBZ pages: "jpeg", "png", "webp", "avif" |
| archive.require_uniform_format | flag archives mixing multiple image formats |
| archive.min_width | minimum page width in pixels |
| archive.min_height | minimum page height in pixels |
| archive.min_page_count | minimum page count; flags truncated/incomplete archives |
| archive.max_page_count | maximum page count; flags abnormally large archives |
| archive.require_no_volume_gaps | volume numbers must be sequential with no gaps (directory-level) |

Books (epub / PDF)

| Key | Description |
|-----|-------------|
| archive.require_title | title metadata must be present |
| archive.require_author | creator/author metadata must be present |
| archive.require_language | language metadata must be present |

Verification

Verification checks are part of the lint pass and only run when [verify] max_age_days is set.

| Code | Description |
|------|-------------|
| no_hash | no blake3 hash stored; re-run scan to compute |
| never_verified | file has never been run through verify |
| stale_verification | last verification is older than max_age_days |
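
The three codes can be read as a simple decision order. The sketch below assumes no_hash takes precedence over the other two and that the age is already available as a number of days; verification_code is a hypothetical function.

```rust
/// Classify the verification state of one file.
/// Returns None when the file is verified recently enough.
/// Assumption: a missing hash is reported before anything else.
fn verification_code(
    has_hash: bool,
    days_since_verified: Option<u32>,
    max_age_days: u32,
) -> Option<&'static str> {
    if !has_hash {
        return Some("no_hash");
    }
    match days_since_verified {
        None => Some("never_verified"),
        Some(age) if age > max_age_days => Some("stale_verification"),
        Some(_) => None,
    }
}

fn main() {
    // max_age_days = 90 as in the example configuration.
    println!("{:?}", verification_code(true, Some(120), 90));
    println!("{:?}", verification_code(true, Some(10), 90));
}
```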

Library layouts

Music

{Genre}/
  {Artist}/
    {Album} ({Year})/
      {NN} {Title}.flac
      cover.jpg

# Example
German Rap/Ufo361/BEWARE (2026)/01 I KNOW.flac

Soundtrack

{anime|film|game}/
  {Title} ({Year})/
    {NN} - {Title}.flac
    cover.jpg

# Example
anime/Cowboy Bebop (1998)/01 - Tank!.flac

Movies

{Title} ({Year})/
  {Title} ({Year}).mkv

Anime / Series

{Title} ({Year})/
  Season {NN}/
    {Title} - S{NN}E{NN}.mkv

Manga

{Title}/
  {Title} - Vol.001.cbz
  {Title} - Ch.0001.cbz

Books

{Author}/
  {Title}.epub
  {Title}.pdf

Roadmap

mlib fix — additional fixes

The fix framework is in place. Planned additions:

  • Filename rename — rename files to match the configured pattern, derived from existing tags
  • Tag normalization — normalize casing and whitespace in existing tags
  • Embedded cover embedding — embed a directory cover.jpg into audio files missing one

TUI enhancements

  • Violation filter — filter file list to only show files with a specific violation code
  • Full metadata panel — expand any file to see all streams, tags, and archive info inline
  • Library health bars — show per-library violation rate on the library selector screen
  • Search — fuzzy search across filenames and tags within the current library
  • Run commands — trigger scan or lint for the current selection without leaving the TUI