Commit graph

337 commits

Author SHA1 Message Date
Andreas Kling 3879e5b9d4 Kernel: Start working on a syscall for logging performance events
This patch introduces sys$perf_event() with two event types:

- PERF_EVENT_MALLOC
- PERF_EVENT_FREE

After the first call to sys$perf_event(), a process will begin keeping
these events in a buffer. When the process dies, that buffer will be
written out to "perfcore" in the current directory unless that filename
is already taken.

This is probably not the best way to do this, but it's a start and will
make it possible to start doing memory allocation profiling. :^)
2020-02-02 20:26:27 +01:00
Andreas Kling c9e877a294 Kernel: Address validation helpers should take size_t, not ssize_t 2020-01-30 21:51:27 +01:00
Andreas Kling f4302b58fb Kernel: Remove SmapDisablers in sys$getsockname() and sys$getpeername()
Instead use the user/kernel copy helpers to only copy the minimum stuff
needed from to/from userspace.

Based on work started by Brian Gianforcaro.
2020-01-27 21:11:36 +01:00
Andreas Kling 30ad7953ca Kernel: Rename UnveilState to VeilState 2020-01-21 19:28:59 +01:00
Andreas Kling f38cfb3562 Kernel: Tidy up debug logging a little bit
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.

This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
2020-01-21 16:16:20 +01:00
Andreas Kling 0569123ad7 Kernel: Add a basic implementation of unveil()
This syscall is a complement to pledge() and adds the same sort of
incremental relinquishing of capabilities for filesystem access.

The first call to unveil() will "drop a veil" on the process, and from
now on, only unveiled parts of the filesystem are visible to it.

Each call to unveil() specifies a path to either a directory or a file
along with permissions for that path. The permissions are a combination
of the following:

- r: Read access (like the "rpath" promise)
- w: Write access (like the "wpath" promise)
- x: Execute access
- c: Create/remove access (like the "cpath" promise)

Attempts to open a path that has not been unveiled with fail with
ENOENT. If the unveiled path lacks sufficient permissions, it will fail
with EACCES.

Like pledge(), subsequent calls to unveil() with the same path can only
remove permissions, not add them.

Once you call unveil(nullptr, nullptr), the veil is locked, and it's no
longer possible to unveil any more paths for the process, ever.

This concept comes from OpenBSD, and their implementation does various
things differently, I'm sure. This is just a first implementation for
SerenityOS, and we'll keep improving on it as we go. :^)
2020-01-20 22:12:04 +01:00
Andreas Kling 8d9dd1b04b Kernel: Add a 1-deep cache to Process::region_from_range()
This simple cache gets hit over 70% of the time on "g++ Process.cpp"
and shaves ~3% off the runtime.
2020-01-19 16:44:37 +01:00
Andreas Kling ae0c435e68 Kernel: Add a Process::add_region() helper
This is a private helper for adding a Region to Process::m_regions.
It's just for convenience since it's a bit cumbersome to do this.
2020-01-19 16:26:42 +01:00
Andreas Kling 94ca55cefd Meta: Add license header to source files
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.

For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.

Going forward, all new source files should include a license header.
2020-01-18 09:45:54 +01:00
Sergey Bugaev e0013a6b4c Kernel+LibC: Unify sys$open() and sys$openat()
The syscall is now called sys$open(), but it behaves like the old sys$openat().
In userspace, open_with_path_length() is made a wrapper over openat_with_path_length().
2020-01-17 21:49:58 +01:00
Andreas Kling 26a31c7efb Kernel: Add "accept" pledge promise for accepting incoming connections
This patch adds a new "accept" promise that allows you to call accept()
on an already listening socket. This lets programs set up a socket for
for listening and then dropping "inet" and/or "unix" so that only
incoming (and existing) connections are allowed from that point on.
No new outgoing connections or listening server sockets can be created.

In addition to accept() it also allows getsockopt() with SOL_SOCKET
and SO_PEERCRED, which is used to find the PID/UID/GID of the socket
peer. This is used by our IPC library when creating shared buffers that
should only be accessible to a specific peer process.

This allows us to drop "unix" in WindowServer and LookupServer. :^)

It also makes the debugging/introspection RPC sockets in CEventLoop
based programs work again.
2020-01-17 11:19:06 +01:00
Andrew Kaster 7a7e7c82b5 Kernel: Tighten up exec/do_exec and allow for PT_INTERP iterpreters
This patch changes how exec() figures out which program image to
actually load. Previously, we opened the path to our main executable in
find_shebang_interpreter_for_executable, read the first page (or less,
if the file was smaller) and then decided whether to recurse with the
interpreter instead. We then then re-opened the main executable in
do_exec.

However, since we now want to parse the ELF header and Program Headers
of an elf image before even doing any memory region work, we can change
the way this whole process works. We open the file and read (up to) the
first page in exec() itself, then pass just the page and the amount read
to find_shebang_interpreter_for_executable. Since we now have that page
and the FileDescription for the main executable handy, we can do a few
things. First, validate the ELF header and ELF program headers for any
shenanigans. ELF32 Little Endian i386 only, please. Second, we can grab
the PT_INTERP interpreter from any ET_DYN files, and open that guy right
away if it exists. Finally, we can pass the main executable's and
optionally the PT_INTERP interpreter's file descriptions down to do_exec
and not have to feel guilty about opening the file twice.

In do_exec, we now have a choice. Are we going to load the main
executable, or the interpreter? We could load both, but it'll be way
easier for the inital pass on the RTLD if we only load the interpreter.
Then it can load the main executable itself like any old shared object,
just, the one with main in it :). Later on we can load both of them
into memory and the RTLD can relocate itself before trying to do
anything. The way it's written now the RTLD will get dibs on its
requested virtual addresses being the actual virtual addresses.
2020-01-13 13:03:30 +01:00
Brian Gianforcaro 4cee441279 Kernel: Combine validate and copy of user mode pointers (#1069)
Right now there is a significant amount of boiler plate code required
to validate user mode parameters in syscalls. In an attempt to reduce
this a bit, introduce validate_read_and_copy_typed which combines the
usermode address check and does the copy internally if the validation
passes. This cleans up a little bit of code from a significant amount
of syscalls.
2020-01-13 11:19:17 +01:00
Sergey Bugaev 33c0dc08a7 Kernel: Don't forget to copy & destroy root_directory_for_procfs
Also, rename it to root_directory_relative_to_global_root.
2020-01-12 20:02:11 +01:00
Sergey Bugaev dd54d13d8d Kernel+LibC: Allow passing mount flags to chroot()
Since a chroot is in many ways similar to a separate root mount, we can also
apply mount flags to it as if it was an actual mount. These flags will apply
whenever the chrooted process accesses its root directory, but not when other
processes access this same directory for the outside. Since it's common to
chdir("/") immediately after chrooting (so that files accessed through the
current directory inherit the same mount flags), this effectively allows one to
apply additional limitations to a process confined inside a chroot.

To this effect, sys$chroot() gains a mount_flags argument (exposed as
chroot_with_mount_flags() in userspace) which can be set to all the same values
as the flags argument for sys$mount(), and additionally to -1 to keep the flags
set for that file system. Note that passing 0 as mount_flags will unset any
flags that may have been set for the file system, not keep them.
2020-01-12 20:02:11 +01:00
Andreas Kling 017b34e1ad Kernel: Add "video" pledge for accessing framebuffer devices
WindowServer becomes the only user.
2020-01-12 02:18:30 +01:00
Andreas Kling 409a4f7756 ping: Use pledge() 2020-01-11 20:48:43 +01:00
Andreas Kling 24c736b0e7 Kernel: Use the Syscall string and buffer types more
While I was updating syscalls to stop passing null-terminated strings,
I added some helpful struct types:

    - StringArgument { const char*; size_t; }
    - ImmutableBuffer<Data, Size> { const Data*; Size; }
    - MutableBuffer<Data, Size> { Data*; Size; }

The Process class has some convenience functions for validating and
optionally extracting the contents from these structs:

    - get_syscall_path_argument(StringArgument)
    - validate_and_copy_string_from_user(StringArgument)
    - validate(ImmutableBuffer)
    - validate(MutableBuffer)

There's still so much code around this and I'm wondering if we should
generate most of it instead. Possible nice little project.
2020-01-11 12:47:47 +01:00
Andreas Kling 0ca6d6c8d2 Kernel: Remove validate_read_str() as nothing uses it anymore :^) 2020-01-11 10:57:50 +01:00
Andreas Kling f5092b1c7e Kernel: Pass a parameter struct to mount()
This was the last remaining syscall that took a null-terminated string
and figured out how long it was by walking it in kernelspace *shudder*.
2020-01-11 10:56:02 +01:00
Andreas Kling e380142853 Kernel: Pass a parameter struct to rename() 2020-01-11 10:36:54 +01:00
Andreas Kling 46830a0c32 Kernel: Pass a parameter struct to symlink() 2020-01-11 10:31:33 +01:00
Andreas Kling c97bfbd609 Kernel: Pass a parameter struct to mknod() 2020-01-11 10:27:37 +01:00
Andreas Kling 6536a80aa9 Kernel: Pass a parameter struct to chown() 2020-01-11 10:17:44 +01:00
Andreas Kling 29b3d95004 Kernel: Expose a process's filesystem root as a /proc/PID/root symlink
In order to preserve the absolute path of the process root, we save the
custody used by chroot() before stripping it to become the new "/".
There's probably a better way to do this.
2020-01-10 23:48:44 +01:00
Andreas Kling ddd0b19281 Kernel: Add a basic chroot() syscall :^)
The chroot() syscall now allows the superuser to isolate a process into
a specific subtree of the filesystem. This is not strictly permanent,
as it is also possible for a superuser to break *out* of a chroot, but
it is a useful mechanism for isolating unprivileged processes.

The VFS now uses the current process's root_directory() as the root for
path resolution purposes. The root directory is stored as an uncached
Custody in the Process object.
2020-01-10 23:14:04 +01:00
Andreas Kling 485443bfca Kernel: Pass characters+length to link() 2020-01-10 21:26:47 +01:00
Andreas Kling 0695ff8282 Kernel: Pass characters+length to readlink()
Note that I'm developing some helper types in the Syscall namespace as
I go here. Once I settle on some nice types, I will convert all the
other syscalls to use them as well.
2020-01-10 20:13:23 +01:00
Andreas Kling 952bb95baa Kernel: Enable SMAP protection during the execve() syscall
The userspace execve() wrapper now measures all the strings and puts
them in a neat and tidy structure on the stack.

This way we know exactly how much to copy in the kernel, and we don't
have to use the SMAP-violating validate_read_str(). :^)
2020-01-10 12:20:36 +01:00
Andreas Kling 197e73ee31 Kernel+LibELF: Enable SMAP protection during non-syscall exec()
When loading a new executable, we now map the ELF image in kernel-only
memory and parse it there. Then we use copy_to_user() when initializing
writable regions with data from the executable.

Note that the exec() syscall still disables SMAP protection and will
require additional work. This patch only affects kernel-originated
process spawns.
2020-01-10 10:57:06 +01:00
Andreas Kling ff16298b44 Kernel: Removed an unused global variable 2020-01-09 18:02:37 +01:00
Andreas Kling 4b4d369c5d Kernel: Take path+length in the unlink() and umount() syscalls 2020-01-09 16:23:41 +01:00
Andreas Kling 532f240f24 Kernel: Remove unused syscall for setting the signal mask 2020-01-08 15:21:06 +01:00
Andreas Kling faf32153f6 Kernel: Take const Process& in InodeMetadata::may_{read,write,execute} 2020-01-07 19:24:06 +01:00
Andreas Kling 5387a19268 Kernel: Make Process::file_description() vend a RefPtr<FileDescription>
This encourages callers to strongly reference file descriptions while
working with them.

This fixes a use-after-free issue where one thread would close() an
open fd while another thread was blocked on it becoming readable.

Test: Kernel/uaf-close-while-blocked-in-read.cpp
2020-01-07 15:53:42 +01:00
Andreas Kling 53bda09d15 Kernel: Make utime() take path+length, remove SmapDisabler 2020-01-06 12:23:30 +01:00
Andreas Kling 33025a8049 Kernel: Pass name+length to set_mmap_name() and remove SmapDisabler 2020-01-06 11:56:59 +01:00
Andreas Kling 7c916b9fe9 Kernel: Make realpath() take path+length, get rid of SmapDisabler 2020-01-06 11:32:25 +01:00
Andreas Kling d6b06fd5a3 Kernel: Make watch_file() syscall take path length as a size_t
We don't care to handle negative path lengths anyway.
2020-01-06 11:15:49 +01:00
Andreas Kling 0df72d4712 Kernel: Pass path+length to mkdir(), rmdir() and chmod() 2020-01-06 11:15:49 +01:00
Andreas Kling 642137f014 Kernel: Make access() take path+length
Also, let's return EFAULT for nullptr at the LibC layer. We can't do
all bad addresses this way, but we can at least do null. :^)
2020-01-06 11:15:48 +01:00
Andreas Kling c5890afc8b Kernel: Make chdir() take path+length 2020-01-05 22:06:25 +01:00
Andreas Kling f231e9ea76 Kernel: Pass path+length to the stat() and lstat() syscalls
It's not pleasant having to deal with null-terminated strings as input
to syscalls, so let's get rid of them one by one.
2020-01-05 22:02:54 +01:00
Andreas Kling d4761762f2 Kernel: Remove some unused Process members 2020-01-04 19:53:29 +01:00
Andreas Kling 95ba0d5a02 Kernel: Remove unused "putch" syscall 2020-01-04 16:00:25 +01:00
Andreas Kling 24cc67d199 Kernel: Remove read_tsc() syscall
Since nothing is using this, let's just remove it. That's one less
thing to worry about.
2020-01-03 09:27:09 +01:00
Andreas Kling fdde5cdf26 Kernel: Don't include the process GID in the "extra GIDs" table
Process::m_extra_gids is for supplementary GIDs only.
2020-01-02 23:45:52 +01:00
Andreas Kling 7f04334664 Kernel: Remove broken implementation of Unix SHM
This code never worked, as was never used for anything. We can build
a much better SHM implementation on top of TmpFS or similar when we
get to the point when we need one.
2020-01-02 12:44:21 +01:00
Andrew Kaster bc50a10cc9 Kernel: sys$mprotect protects sub-regions as well as whole ones
Split a region into two/three if the desired mprotect range is a strict
subset of an existing region. We can then set the access bits on a new
region that is just our desired range and add both the new
desired subregion and the leftovers back to our page tables.
2020-01-02 12:27:13 +01:00
Tibor Nagy 624116a8b1 Kernel: Implement AltGr key support 2019-12-31 19:31:42 +01:00