vfs: add restrictions to read(2) of a directory [1/2]

Historically, we've allowed read() of a directory and some filesystems will
accommodate (e.g. ufs/ffs, msdosfs). From the history department staffed by
Warner: <<EOF

pdp-7 unix seemed to allow reading directories, but they were weird, special
things there so I'm unsure (my pdp-7 assembler sucks).

1st Edition's sources are lost, mostly. The kernel allows it. The
reconstructed sources from 2nd or 3rd edition read it though.

V6 to V7 changed the filesystem format, and should have been a warning, but
reading directories wasn't materially changed.

4.1b BSD introduced readdir because of UFS. UFS broke all directory reading
programs in 1983. ls, du, find, etc all had to be rewritten. readdir() and
friends were introduced here.

SysVr3 picked up readdir() in 1987 for the AT&T fork of Unix. SysVr4 updated
all the directory reading programs in 1988 because different filesystem
types were introduced.

In the 90s, these interfaces became completely ubiquitous as PDP-11s running
V7 faded from view and all the folks that initially started on V7 upgraded
to SysV. Linux never supported this (though I've not done the software
archeology to check) because it has always had a pathological diversity of
filesystems.
EOF

Disallowing read(2) on a directory has the side-effect of masking
application bugs that come from relying on other implementations' behavior
(e.g. Linux) of rejecting these with EISDIR across the board, but allowing
it has been a vector for at least one stack disclosure bug in the past[0].
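
To make the difference between implementations concrete, here is a minimal
userland sketch that probes what read(2) does with a directory. The helper
name read_dir_errno() is hypothetical and not part of this change; it only
reports what the running system chooses to do.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: attempt read(2) on the directory at `path`.
 * Returns 0 if the read succeeded (raw directory bytes were returned),
 * otherwise the errno value reported by open(2) or read(2).
 */
static int
read_dir_errno(const char *path)
{
	char buf[512];
	ssize_t n;
	int error, fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (errno);
	n = read(fd, buf, sizeof(buf));
	error = (n == -1) ? errno : 0;
	close(fd);
	return (error);
}
```

On a system that rejects directory reads, read_dir_errno(".") is expected to
return EISDIR; a return of 0 means the filesystem handed back raw directory
data.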

Per POSIX, it is implementation-defined whether read() handles directories
or not. Popular implementations have chosen to reject them, and this seems
sensible: the data you're reading from a directory is not structured in some
unified way across filesystem implementations like with readdir(3), so it is
impossible for applications to portably rely on this.
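
For comparison, a portable directory walk goes through opendir(3) and
readdir(3) rather than read(2). The sketch below is illustrative only;
count_dir_entries() is a hypothetical helper name, not something this commit
adds.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the entries in `path`, skipping "." and "..".
 * Returns -1 if the directory cannot be opened.
 */
static long
count_dir_entries(const char *path)
{
	DIR *dirp;
	struct dirent *dp;
	long count = 0;

	assert(path != NULL);
	dirp = opendir(path);
	if (dirp == NULL)
		return (-1);
	/* readdir(3) hides the on-disk layout behind struct dirent. */
	while ((dp = readdir(dirp)) != NULL) {
		if (strcmp(dp->d_name, ".") == 0 ||
		    strcmp(dp->d_name, "..") == 0)
			continue;
		count++;
	}
	closedir(dirp);
	return (count);
}
```

Because only struct dirent is inspected, the same code works regardless of
which filesystem backs the directory.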

With this patch, we will reject most read(2) of a dirfd with EISDIR. Users
that know what they're doing can conscientiously set
security.bsd.allow_read_dir=1 to allow read(2) of directories, as it has
proven useful for debugging or recovery. A future commit will further limit
the sysctl to allow only the system root to read(2) directories, to make it
at least relatively safe to leave on for longer periods of time.
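
A tool that wants to know whether the knob is enabled could query it with
sysctlbyname(3). This is a rough sketch: the helper name
allow_read_dir_knob() is hypothetical, and the non-FreeBSD fallback exists
only so the example builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: return the value of security.bsd.allow_read_dir,
 * or -1 if the knob cannot be read (kernel without this change, or a
 * system that is not FreeBSD at all).
 */
static int
allow_read_dir_knob(void)
{
#ifdef __FreeBSD__
	int value = 0;
	size_t len = sizeof(value);

	if (sysctlbyname("security.bsd.allow_read_dir", &value, &len,
	    NULL, 0) == -1)
		return (-1);
	assert(len == sizeof(value));
	return (value);
#else
	/* Assumption: no such knob exists outside FreeBSD. */
	return (-1);
#endif
}
```

A result of 0 corresponds to the new default, where read(2) of a directory
fails with EISDIR.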

While we're adding logic pertaining to directory vnodes to vn_io_fault, an
additional assertion has also been added to ensure that we're not reaching
vn_io_fault with any write request on a directory vnode. Such a request would
be a logical error in the kernel, and must be debugged rather than allowing
it to potentially silently error out.
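
As a userland illustration of why a write should never get that far: POSIX
specifies that open(2) fails with EISDIR when a directory is opened with a
write access mode, so a program cannot normally obtain a writable directory
descriptor in the first place. The sketch below only demonstrates that
behavior; open_dir_for_write_errno() is a hypothetical helper and does not
exercise the new assertion itself.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open the directory at `path` for writing.
 * Returns 0 if that unexpectedly succeeded, otherwise the errno value
 * reported by open(2).
 */
static int
open_dir_for_write_errno(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY);
	if (fd == -1)
		return (errno);
	close(fd);
	return (0);
}
```

On the systems discussed here, open_dir_for_write_errno(".") is expected to
return EISDIR.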

Commented-out shell aliases have been placed in root's cshrc/shrc to promote
awareness that grep may become noisy after this change, depending on your
usage.

A tentative MFC plan has been put together to try and make it as trivial as
possible to identify issues and collect reports; note that this will be
strongly re-evaluated. Tentatively, I will MFC this knob with the default as
it is in HEAD to improve our odds of actually getting reports. The future
priv(9) to further restrict the sysctl WILL NOT BE MERGED BACK, so the knob
will be a faithful reversion on stable/12. We will go into the merge
acknowledging that the sysctl default may be flipped back to restore
historical behavior at *any* point if it's warranted.

[0] https://www.freebsd.org/security/advisories/FreeBSD-SA-19:10.ufs.asc

PR:		246412
Reviewed by:	mckusick, kib, emaste, jilles, cy, phk, imp (all previous)
Reviewed by:	rgrimes (latest version)
MFC after:	1 month (note the MFC plan mentioned above)
Relnotes:	absolutely, but will amend previous RELNOTES entry
Differential Revision:	https://reviews.freebsd.org/D24596
Kyle Evans 2020-06-04 18:09:55 +00:00
parent c847212986
commit dcef4f65ae
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=361798
4 changed files with 35 additions and 4 deletions

@@ -12,6 +12,10 @@ alias la ls -aF
 alias lf ls -FA
 alias ll ls -lAF
+# read(2) of directories may not be desirable by default, as this will provoke
+# EISDIR errors from each directory encountered.
+# alias grep grep -d skip
 # A righteous umask
 umask 22

@@ -31,6 +31,9 @@
 # alias mv='mv -i'
 # alias rm='rm -i'
+# read(2) of directories may not be desirable by default, as this will provoke
+# EISDIR errors from each directory encountered.
+# alias grep='grep -d skip'
 # set prompt: ``username@hostname:directory $ ''
 PS1="\u@\h:\w \\$ "

@@ -28,7 +28,7 @@
 .\" @(#)read.2 8.4 (Berkeley) 2/26/94
 .\" $FreeBSD$
 .\"
-.Dd March 30, 2020
+.Dd June 4, 2020
 .Dt READ 2
 .Os
 .Sh NAME
@@ -199,9 +199,14 @@ was negative.
 The file was marked for non-blocking I/O,
 and no data were ready to be read.
 .It Bq Er EISDIR
-The file descriptor is associated with a directory residing
-on a file system that does not allow regular read operations on
-directories (e.g.\& NFS).
+The file descriptor is associated with a directory.
+Directories may only be read directly if the filesystem supports it and
+the
+.Dv security.bsd.allow_read_dir
+sysctl MIB is set to a non-zero value.
+For most scenarios, the
+.Xr readdir 3
+function should be used instead.
 .It Bq Er EOPNOTSUPP
 The file descriptor is associated with a file system and file type that
 do not allow regular read operations on it.

@@ -136,6 +136,11 @@ static u_long vn_io_faults_cnt;
 SYSCTL_ULONG(_debug, OID_AUTO, vn_io_faults, CTLFLAG_RD,
     &vn_io_faults_cnt, 0, "Count of vn_io_fault lock avoidance triggers");
+static int vfs_allow_read_dir = 0;
+SYSCTL_INT(_security_bsd, OID_AUTO, allow_read_dir, CTLFLAG_RW,
+    &vfs_allow_read_dir, 0,
+    "Enable read(2) of directory by root for filesystems that support it");
 /*
  * Returns true if vn_io_fault mode of handling the i/o request should
  * be used.
@@ -1216,6 +1221,20 @@ vn_io_fault(struct file *fp, struct uio *uio, struct ucred *active_cred,
 	doio = uio->uio_rw == UIO_READ ? vn_read : vn_write;
 	vp = fp->f_vnode;
+	/*
+	 * The ability to read(2) on a directory has historically been
+	 * allowed for all users, but this can and has been the source of
+	 * at least one security issue in the past. As such, it is now hidden
+	 * away behind a sysctl for those that actually need it to use it.
+	 */
+	if (vp->v_type == VDIR) {
+		KASSERT(uio->uio_rw == UIO_READ,
+		    ("illegal write attempted on a directory"));
+		if (!vfs_allow_read_dir)
+			return (EISDIR);
+	}
 	foffset_lock_uio(fp, uio, flags);
 	if (do_vn_io_fault(vp, uio)) {
 		args.kind = VN_IO_FAULT_FOP;