repo | obj |
---|---|
https://github.com/trapexit/mergerfs | filesystem |
MergerFS
mergerfs is a union filesystem geared towards simplifying storage and management of files across numerous commodity storage devices. It is similar to mhddfs, unionfs, and aufs.
Usage: mergerfs -o<options> <branches> <mountpoint>
mergerfs logically merges multiple paths together. Think of it as a union of sets. The files or directories acted on or presented through mergerfs are determined by the policy chosen for that particular action.
See MergerFS Tools for managing mergerfs.
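To make the "union of sets" idea concrete, here is a minimal sketch in Python. It is not how mergerfs is implemented; `pooled_listing` is a hypothetical helper, and two temporary directories stand in for real branches such as /mnt/hdd0 and /mnt/hdd1.

```python
import os
import tempfile


def pooled_listing(branches, relpath=""):
    """Return the sorted union of entry names found under relpath in each branch."""
    names = set()
    for branch in branches:
        path = os.path.join(branch, relpath)
        if os.path.isdir(path):
            names.update(os.listdir(path))
    return sorted(names)


# Two throwaway directories stand in for the branches of a pool.
with tempfile.TemporaryDirectory() as hdd0, tempfile.TemporaryDirectory() as hdd1:
    for branch, name in ((hdd0, "a.txt"), (hdd1, "b.txt"), (hdd1, "a.txt")):
        open(os.path.join(branch, name), "w").close()
    listing = pooled_listing([hdd0, hdd1])

print(listing)  # ['a.txt', 'b.txt']: a.txt exists on both branches but is listed once
```

Which copy of a.txt is actually read or modified is what the policies described later decide.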
Terminology
- branch: A base path used in the pool.
- pool: The mergerfs mount. The union of the branches.
- relative path: The path in the pool relative to the branch and mount.
- function: A filesystem call (open, unlink, create, getattr, rmdir, etc.)
- category: A collection of functions based on basic behavior (action, create, search).
- policy: The algorithm used to select a file when performing a function.
- path preservation: Aspect of some policies which includes checking the path for which a file would be created.
Basic Setup
Command Line:
mergerfs -o cache.files=partial,dropcacheonclose=true,category.create=mfs /mnt/hdd0:/mnt/hdd1 /media
/etc/fstab:
/mnt/hdd0:/mnt/hdd1 /media fuse.mergerfs cache.files=partial,dropcacheonclose=true,category.create=mfs 0 0
systemd service:
[Unit]
Description=mergerfs service
[Service]
Type=simple
KillMode=none
ExecStart=/usr/bin/mergerfs \
-f \
-o cache.files=partial \
-o dropcacheonclose=true \
-o category.create=mfs \
/mnt/hdd0:/mnt/hdd1 \
/media
ExecStop=/bin/fusermount -uz /media
Restart=on-failure
[Install]
WantedBy=default.target
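The command line and the /etc/fstab entry above carry the same information in a different order. The following Python sketch shows that correspondence; the helper names (`option_string`, `command_line`, `fstab_line`) are hypothetical and exist only for illustration.

```python
def option_string(options):
    """Join key=value pairs with commas, as used after -o and in fstab."""
    return ",".join(f"{key}={value}" for key, value in options.items())


def command_line(options, branches, mountpoint):
    """Build the mergerfs invocation: options, colon-joined branches, mountpoint."""
    return f"mergerfs -o {option_string(options)} {':'.join(branches)} {mountpoint}"


def fstab_line(options, branches, mountpoint):
    """Build the equivalent fstab entry with the fuse.mergerfs filesystem type."""
    return f"{':'.join(branches)} {mountpoint} fuse.mergerfs {option_string(options)} 0 0"


options = {
    "cache.files": "partial",
    "dropcacheonclose": "true",
    "category.create": "mfs",
}
branches = ["/mnt/hdd0", "/mnt/hdd1"]

print(command_line(options, branches, "/media"))
print(fstab_line(options, branches, "/media"))
```

Both printed lines match the Basic Setup examples character for character.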
Options
These options are the same regardless of whether you use them with the mergerfs
commandline program, in fstab, or in a config file.
mount option | description |
---|---|
config | Path to a config file. Same arguments as below in key=val / ini style format. |
branches | Colon delimited list of branches. |
minfreespace=SIZE | The minimum space value used for creation policies. Can be overridden by branch specific option. Understands 'K', 'M', and 'G' to represent kilobyte, megabyte, and gigabyte respectively. (default: 4G) |
moveonenospc=BOOL|POLICY | When enabled if a write fails with ENOSPC (no space left on device) or EDQUOT (disk quota exceeded) the policy selected will run to find a new location for the file. An attempt to move the file to that branch will occur (keeping all metadata possible) and if successful the original is unlinked and the write retried. (default: false, true = mfs) |
inodecalc=passthrough|path-hash|devino-hash|hybrid-hash | Selects the inode calculation algorithm. (default: hybrid-hash) |
dropcacheonclose=BOOL | When a file is requested to be closed call posix_fadvise on it first to instruct the kernel that we no longer need the data and it can drop its cache. Recommended when cache.files=partial|full|auto-full|per-process to limit double caching. (default: false) |
symlinkify=BOOL | When enabled and a file is not writable and its mtime or ctime is older than symlinkify_timeout files will be reported as symlinks to the original files. Please read more below before using. (default: false) |
symlinkify_timeout=UINT | Time to wait, in seconds, to activate the symlinkify behavior. (default: 3600) |
nullrw=BOOL | Turns reads and writes into no-ops. The request will succeed but do nothing. Useful for benchmarking mergerfs. (default: false) |
lazy-umount-mountpoint=BOOL | mergerfs will attempt to "lazy umount" the mountpoint before mounting itself. Useful when performing live upgrades of mergerfs. (default: false) |
ignorepponrename=BOOL | Ignore path preserving on rename. Typically rename and link act differently depending on the policy of create (read below). Enabling this will cause rename and link to always use the non-path preserving behavior. This means files, when renamed or linked, will stay on the same filesystem. (default: false) |
security_capability=BOOL | If false return ENOATTR when xattr security.capability is queried. (default: true) |
xattr=passthrough|noattr|nosys | Runtime control of xattrs. Default is to passthrough xattr requests. 'noattr' will short circuit as if nothing exists. 'nosys' will respond with ENOSYS as if xattrs are not supported or disabled. (default: passthrough) |
link_cow=BOOL | When enabled if a regular file is opened which has a link count > 1 it will copy the file to a temporary file and rename over the original. Breaking the link and providing a basic copy-on-write function similar to cow-shell. (default: false) |
statfs=base|full | Controls how statfs works. 'base' means it will always use all branches in statfs calculations. 'full' is in effect path preserving and only includes branches where the path exists. (default: base) |
statfs_ignore=none|ro|nc | 'ro' will cause statfs calculations to ignore available space for branches mounted or tagged as 'read-only' or 'no create'. 'nc' will ignore available space for branches tagged as 'no create'. (default: none) |
nfsopenhack=off|git|all | A workaround for exporting mergerfs over NFS where there are issues with creating files for write while setting the mode to read-only. (default: off) |
branches-mount-timeout=UINT | Number of seconds to wait at startup for branches to be a mount other than the mountpoint's filesystem. (default: 0) |
follow-symlinks=never|directory|regular|all | Turns symlinks into what they point to. (default: never) |
link-exdev=passthrough|rel-symlink|abs-base-symlink|abs-pool-symlink | When a link fails with EXDEV optionally create a symlink to the file instead. |
rename-exdev=passthrough|rel-symlink|abs-symlink | When a rename fails with EXDEV optionally move the file to a special directory and symlink to it. |
readahead=UINT | Set readahead (in kilobytes) for mergerfs and branches if greater than 0. (default: 0) |
posix_acl=BOOL | Enable POSIX ACL support (if supported by kernel and underlying filesystem). (default: false) |
async_read=BOOL | Perform reads asynchronously. If disabled or unavailable the kernel will ensure there is at most one pending read request per file handle and will attempt to order requests by offset. (default: true) |
fuse_msg_size=UINT | Set the max number of pages per FUSE message. Only available on Linux >= 4.20 and ignored otherwise. (min: 1; max: 256; default: 256) |
threads=INT | Number of threads to use. When used alone (process-thread-count=-1) it sets the number of threads reading and processing FUSE messages. When used together it sets the number of threads reading from FUSE. When set to zero it will attempt to discover and use the number of logical cores. If the thread count is set negative it will look up the number of cores then divide by the absolute value. ie. threads=-2 on an 8 core machine will result in 8 / 2 = 4 threads. There will always be at least 1 thread. If set to -1 in combination with process-thread-count then it will try to pick reasonable values based on CPU thread count. NOTE: higher number of threads increases parallelism but usually decreases throughput. (default: 0) |
read-thread-count=INT | Alias for threads. |
process-thread-count=INT | Enables separate thread pool to asynchronously process FUSE requests. In this mode read-thread-count refers to the number of threads reading FUSE messages which are dispatched to process threads. -1 means disabled otherwise acts like read-thread-count. (default: -1) |
process-thread-queue-depth=UINT | Sets the number of requests any single process thread can have queued up at one time. Meaning the total memory usage of the queues is queue depth multiplied by the number of process threads plus read thread count. 0 sets the depth to the same as the process thread count. (default: 0) |
pin-threads=STR | Selects a strategy to pin threads to CPUs (default: unset) |
scheduling-priority=INT | Set mergerfs' scheduling priority. Valid values range from -20 to 19. See setpriority man page for more details. (default: -10) |
fsname=STR | Sets the name of the filesystem as seen in mount, df, etc. Defaults to a list of the source paths concatenated together with the longest common prefix removed. |
func.FUNC=POLICY | Sets the specific FUSE function's policy. See below for the list of value types. Example: func.getattr=newest |
func.readdir=seq|cosr|cor|cosr:INT|cor:INT | Sets readdir policy. INT value sets the number of threads to use for concurrency. (default: seq) |
category.action=POLICY | Sets policy of all FUSE functions in the action category. (default: epall) |
category.create=POLICY | Sets policy of all FUSE functions in the create category. (default: epmfs) |
category.search=POLICY | Sets policy of all FUSE functions in the search category. (default: ff) |
cache.open=UINT | 'open' policy cache timeout in seconds. (default: 0) |
cache.statfs=UINT | 'statfs' cache timeout in seconds. (default: 0) |
cache.attr=UINT | File attribute cache timeout in seconds. (default: 1) |
cache.entry=UINT | File name lookup cache timeout in seconds. (default: 1) |
cache.negative_entry=UINT | Negative file name lookup cache timeout in seconds. (default: 0) |
cache.files=libfuse|off|partial|full|auto-full|per-process | File page caching mode (default: libfuse) |
cache.files.process-names=LIST | A pipe | delimited list of process comm names to enable page caching for when cache.files=per-process. (default: "rtorrent|qbittorrent-nox") |
cache.writeback=BOOL | Enable kernel writeback caching (default: false) |
cache.symlinks=BOOL | Cache symlinks (if supported by kernel) (default: false) |
cache.readdir=BOOL | Cache readdir (if supported by kernel) (default: false) |
parallel-direct-writes=BOOL | Allow the kernel to dispatch multiple, parallel (non-extending) write requests for files opened with cache.files=per-process (if the process is not in process-names) or cache.files=off. (This requires kernel support, which was added in Linux v6.2) |
direct_io | deprecated - Bypass page cache. Use cache.files=off instead. (default: false) |
kernel_cache | deprecated - Do not invalidate data cache on file open. Use cache.files=full instead. (default: false) |
auto_cache | deprecated - Invalidate data cache if file mtime or size change. Use cache.files=auto-full instead. (default: false) |
async_read | deprecated - Perform reads asynchronously. Use async_read=true instead. |
sync_read | deprecated - Perform reads synchronously. Use async_read=false instead. |
splice_read | deprecated - Does nothing. |
splice_write | deprecated - Does nothing. |
splice_move | deprecated - Does nothing. |
allow_other | deprecated - mergerfs always sets this FUSE option as normal permissions can be used to limit access. |
use_ino | deprecated - mergerfs should always control inode calculation so this is enabled all the time. |
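The arithmetic described for the threads option can be sketched as follows. This is a Python illustration, not the mergerfs source: `read_thread_count` and its `cores` parameter are hypothetical names, integer division is an assumption consistent with the 8 / 2 = 4 example, and the special handling of -1 in combination with process-thread-count is not modelled.

```python
import os


def read_thread_count(threads, cores=None):
    """Resolve the threads option to an actual thread count (always >= 1)."""
    if cores is None:
        cores = os.cpu_count() or 1  # stand-in for discovering logical cores
    if threads > 0:
        count = threads                 # explicit count
    elif threads == 0:
        count = cores                   # zero: use the number of logical cores
    else:
        count = cores // abs(threads)   # negative: cores divided by |threads|
    return max(1, count)                # there is always at least one thread


print(read_thread_count(-2, cores=8))   # 4
print(read_thread_count(0, cores=8))    # 8
print(read_thread_count(-16, cores=8))  # 1
```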
Value Types
Type | Value |
---|---|
BOOL | 'true' | 'false' |
INT | [MIN_INT,MAX_INT] |
UINT | [0,MAX_INT] |
SIZE | 'NNM'; NN = INT, M = 'K' | 'M' | 'G' | 'T' |
STR | string (may refer to an enumerated value, see details of argument) |
FUNC | filesystem function |
CATEGORY | function category |
POLICY | mergerfs function policy |
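As an illustration of the SIZE type, the Python sketch below converts a value such as 4G into bytes. Two points are assumptions rather than statements from this document: the suffixes are treated as powers of 1024, and a bare number with no suffix is treated as a byte count. `parse_size` is a hypothetical helper.

```python
import re

# Assumption: K/M/G/T are binary multiples; "" means plain bytes.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Parse 'NN' followed by an optional K, M, G or T suffix into bytes."""
    match = re.fullmatch(r"([0-9]+)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"not a SIZE value: {text!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix]


print(parse_size("4G"))    # 4294967296
print(parse_size("1234"))  # 1234
```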
branches
The 'branches' argument is a colon (':') delimited list of paths to be pooled together. It does not matter if the paths are on the same or different filesystems nor does it matter the filesystem type (within reason). Used and available space will not be duplicated for paths on the same filesystem and any features which aren't supported by the underlying filesystem (such as file attributes or extended attributes) will return the appropriate errors.
Branches currently have two options which can be set: a type, which determines whether or not the branch is included in a policy calculation, and an individual minfreespace value. The values are set by appending an = to the end of a branch designation, followed by the values separated by commas. Example: /mnt/drive=RW,1234
branch mode
- RW: (read/write) - Default behavior. Will be eligible in all policy categories.
- RO: (read-only) - Will be excluded from create and action policies. Same as a read-only mounted filesystem would be (though faster to process).
- NC: (no-create) - Will be excluded from create policies. You can't create on that branch but you can change or delete.
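The branch syntax and the three modes can be sketched in Python as below. This is a simplified model, not the mergerfs parser: `Branch`, `parse_branch`, and `parse_branches` are hypothetical names, and the sketch assumes that paths contain neither `=` nor `:` and that any option which is not a mode is the per-branch minfreespace.

```python
from dataclasses import dataclass
from typing import Optional

MODES = ("RW", "RO", "NC")


@dataclass
class Branch:
    path: str
    mode: str = "RW"                    # RW is the default mode
    minfreespace: Optional[str] = None  # None: fall back to the global value

    def eligible(self, category):
        """Is this branch considered by policies of the given category?"""
        if self.mode == "RO":
            return category == "search"   # excluded from create and action
        if self.mode == "NC":
            return category != "create"   # excluded from create only
        return True


def parse_branch(spec):
    """Parse one designation such as '/mnt/drive=RW,1234'."""
    path, sep, opts = spec.partition("=")
    branch = Branch(path)
    if sep:
        for opt in filter(None, opts.split(",")):
            if opt in MODES:
                branch.mode = opt
            else:
                branch.minfreespace = opt
    return branch


def parse_branches(arg):
    """Parse a colon-delimited 'branches' argument."""
    return [parse_branch(spec) for spec in arg.split(":")]


for b in parse_branches("/mnt/drive=RW,1234:/mnt/archive=RO:/mnt/full=NC"):
    print(b.path, b.mode, [c for c in ("action", "create", "search") if b.eligible(c)])
```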
globbing
To make it easier to include multiple branches mergerfs supports globbing. The globbing tokens MUST be escaped when used via the shell, otherwise the shell will expand the glob itself.
# mergerfs /mnt/hdd\*:/mnt/ssd /media
The above line will use all mount points in /mnt prefixed with hdd, plus /mnt/ssd.
To have the pool mounted at boot or otherwise accessible from related tools use /etc/fstab.
# <file system> <mount point> <type> <options> <dump> <pass>
/mnt/hdd*:/mnt/ssd /media fuse.mergerfs minfreespace=16G 0 0
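The effect of a glob in the branches argument can be approximated with Python's `glob` module. This is a sketch under assumptions: `expand_branches` is a hypothetical helper, the sorted order and the dropping of patterns that match nothing are choices made here, and a temporary directory stands in for /mnt.

```python
import glob
import os
import tempfile


def expand_branches(arg):
    """Split on ':' and expand each glob pattern into the paths it matches."""
    branches = []
    for pattern in arg.split(":"):
        branches.extend(sorted(glob.glob(pattern)))
    return branches


with tempfile.TemporaryDirectory() as mnt:
    for name in ("hdd0", "hdd1", "ssd", "backup"):
        os.mkdir(os.path.join(mnt, name))
    expanded = expand_branches(f"{mnt}/hdd*:{mnt}/ssd")

print([os.path.basename(p) for p in expanded])  # ['hdd0', 'hdd1', 'ssd']
```

The backup directory is not part of the result because no pattern matches it.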
Functions, Categories and Policies
The POSIX filesystem API is made up of a number of functions: creat, stat, chown, etc. For ease of configuration in mergerfs most of the core functions are grouped into 3 categories: action, create, and search. These functions and categories can be assigned a policy which dictates which branch is chosen when performing that function.
Functions and their Category classifications
Category | FUSE Functions |
---|---|
action | chmod, chown, link, removexattr, rename, rmdir, setxattr, truncate, unlink, utimens |
create | create, mkdir, mknod, symlink |
search | access, getattr, getxattr, ioctl (directories), listxattr, open, readlink |
N/A | fchmod, fchown, futimens, ftruncate, fallocate, fgetattr, fsync, ioctl (files), read, readdir, release, statfs, write, copy_file_range |
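The relationship between category.CATEGORY and func.FUNC options can be modelled as a table of per-function policies. The Python sketch below assumes options are applied in the order given, so a category option overwrites every function in that category and a later func option overwrites a single function; that ordering is an assumption of this sketch. `resolve_policies` is a hypothetical helper and ioctl is omitted because its category depends on the target.

```python
CATEGORIES = {
    "action": ["chmod", "chown", "link", "removexattr", "rename", "rmdir",
               "setxattr", "truncate", "unlink", "utimens"],
    "create": ["create", "mkdir", "mknod", "symlink"],
    "search": ["access", "getattr", "getxattr", "listxattr", "open", "readlink"],
}
# Category defaults as listed in the options table.
DEFAULTS = {"action": "epall", "create": "epmfs", "search": "ff"}


def resolve_policies(options):
    """Return {function: policy} after applying options in order."""
    policies = {f: DEFAULTS[c] for c, funcs in CATEGORIES.items() for f in funcs}
    for key, value in options:
        kind, _, name = key.partition(".")
        if kind == "category" and name in CATEGORIES:
            for func in CATEGORIES[name]:
                policies[func] = value
        elif kind == "func" and name in policies:
            policies[name] = value
    return policies


policies = resolve_policies([("category.create", "mfs"), ("func.getattr", "newest")])
print(policies["mkdir"], policies["getattr"], policies["open"], policies["unlink"])
# mfs newest ff epall
```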
Policies
A policy is the algorithm used to choose a branch or branches for a function to work on or generally how the function behaves.
A policy's behavior differs, as mentioned above, based on the function it is used with. Sometimes it really might not make sense to even offer certain policies because they are literally the same as others but it makes things a bit more uniform.
Policy | Description |
---|---|
all | Create: For mkdir, mknod, and symlink it will apply to all branches. create works like ff. |
epall (existing path, all) | For mkdir, mknod, and symlink it will apply to all found. create works like epff (but more expensive because it doesn't stop after finding a valid branch). |
epff (existing path, first found) | Given the order of the branches, as defined at mount time or configured at runtime, act on the first one found where the relative path exists. |
eplfs (existing path, least free space) | Of all the branches on which the relative path exists choose the branch with the least free space. |
eplus (existing path, least used space) | Of all the branches on which the relative path exists choose the branch with the least used space. |
epmfs (existing path, most free space) | Of all the branches on which the relative path exists choose the branch with the most free space. |
eppfrd (existing path, percentage free random distribution) | Like pfrd but limited to existing paths. |
eprand (existing path, random) | Calls epall and then randomizes. Returns 1 branch. |
ff (first found) | Given the order of the branches, as defined at mount time or configured at runtime, act on the first one found. |
lfs (least free space) | Pick the branch with the least available free space. |
lus (least used space) | Pick the branch with the least used space. |
mfs (most free space) | Pick the branch with the most available free space. |
msplfs (most shared path, least free space) | Like eplfs but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. |
msplus (most shared path, least used space) | Like eplus but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. |
mspmfs (most shared path, most free space) | Like epmfs but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. |
msppfrd (most shared path, percentage free random distribution) | Like eppfrd but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. |
newest | Pick the file / directory with the largest mtime. |
pfrd (percentage free random distribution) | Chooses a branch at random with the likelihood of selection based on a branch's available space relative to the total. |
rand (random) | Calls all and then randomizes. Returns 1 branch. |
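A few of the policies above can be sketched in Python to show how they differ. This is a simplified model, not the mergerfs implementation: `Branch` is a hypothetical record holding free space and the set of directories that exist on the branch, `relpath` is the directory the policy checks for, and minfreespace and branch modes are ignored.

```python
from dataclasses import dataclass, field
import posixpath
import random


@dataclass
class Branch:
    name: str
    free: int                                 # available space, arbitrary units
    paths: set = field(default_factory=set)   # directories present on the branch


def ff(branches):
    return branches[0]                        # first in configured order


def mfs(branches):
    return max(branches, key=lambda b: b.free)


def lfs(branches):
    return min(branches, key=lambda b: b.free)


def epmfs(branches, relpath):
    """mfs restricted to branches where relpath already exists."""
    candidates = [b for b in branches if relpath in b.paths]
    return mfs(candidates) if candidates else None


def mspmfs(branches, relpath):
    """Like epmfs, but retry with the parent directory until a branch is found."""
    while True:
        choice = epmfs(branches, relpath)
        if choice is not None or relpath in ("/", ""):
            return choice
        relpath = posixpath.dirname(relpath)


def pfrd(branches, rng=random):
    """Random choice weighted by each branch's share of the free space."""
    return rng.choices(branches, weights=[b.free for b in branches], k=1)[0]


pool = [
    Branch("hdd0", 100, {"/", "/movies"}),
    Branch("hdd1", 500, {"/"}),
    Branch("hdd2", 300, {"/", "/movies", "/movies/new"}),
]
print(ff(pool).name, mfs(pool).name, lfs(pool).name)  # hdd0 hdd1 hdd0
print(epmfs(pool, "/movies").name)                    # hdd2
print(epmfs(pool, "/music"))                          # None
print(mspmfs(pool, "/music/album").name)              # hdd1
```

The last line shows the fallback: neither /music/album nor /music exists anywhere, so the search ends at /, where every branch qualifies and the one with the most free space wins.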