Commit graph

4027 commits

Author SHA1 Message Date
Andreas Kling 84b2d4c475 Kernel: Add "map_fixed" pledge promise
This is a new promise that guards access to mmap() with MAP_FIXED.

Fixed-address mappings are rarely used, but can be useful if you are
trying to groom the process address space for malicious purposes.

None of our programs need this at the moment, as the only user of
MAP_FIXED is DynamicLoader, but the fixed mappings are constructed
before the process has had a chance to pledge anything.
2021-02-21 01:08:48 +01:00
Andreas Kling 1bc859fb68 Kernel: Make UNMAP_AFTER_INIT imply NEVER_INLINE as well
We want to make sure these functions actually do get unmapped. If they
were inlined somewhere, the inlined version(s) would remain mapped.

Thanks to "thislooksfun" for the suggestion! :^)
2021-02-21 00:43:29 +01:00
Andreas Kling fa581a7470 Kernel: Mark some IDEController functions with UNMAP_AFTER_INIT 2021-02-20 17:28:29 +01:00
Andreas Kling efd4f66f36 Kernel: Don't take debug logging lock in sprintf()
This function doesn't write to the log, and so doesn't need to acquire
the logging lock. (This is only used by GCC's name demangling thingy.)
2021-02-20 17:21:53 +01:00
Andreas Kling cc0f5917d3 Kernel: Slap a handful more things with UNMAP_AFTER_INIT 2021-02-20 00:00:19 +01:00
Andreas Kling 2b2828ae52 Kernel: Slap UNMAP_AFTER_INIT on a bunch more functions
We're now able to unmap 100 KiB of kernel text after init. :^)
2021-02-19 21:42:18 +01:00
Andreas Kling fdf03852c9 Kernel: Slap UNMAP_AFTER_INIT on a whole bunch of functions
There's no real system here, I just added it to various functions
that I don't believe we ever want to call after initialization
has finished.

With these changes, we're able to unmap 60 KiB of kernel text
after init. :^)
2021-02-19 20:23:05 +01:00
Andreas Kling 32e93c8808 Kernel: Mark write_cr0() and write_cr4() as UNMAP_AFTER_INIT
This removes a very useful tool for attackers trying to disable
SMAP/SMEP/etc. :^)
2021-02-19 20:23:05 +01:00
Andreas Kling 6136faa4eb Kernel: Add .unmap_after_init section for code we don't need after init
You can now declare functions with UNMAP_AFTER_INIT and they'll get
segregated into a separate kernel section that gets completely
unmapped at the end of initialization.

This can be used for anything we don't need to call once we've booted
into userspace.

There are two nice things about this mechanism:

- It allows us to free up entire pages of memory for other use.
  (Note that this patch does not actually make use of the freed
  pages yet, but in the future we totally could!)

- It allows us to get rid of obviously dangerous gadgets like
  write-to-CR0 and write-to-CR4 which are very useful for an attacker
  trying to disable SMAP/SMEP/etc.

I've also made sure to include a helpful panic message in case you
hit a kernel crash because of this protection. :^)
2021-02-19 20:23:05 +01:00
Andreas Kling da100f12a6 Kernel: Add helpers for manipulating x86 control registers
Use read_cr{0,2,3,4} and write_cr{0,3,4} helpers instead of inline asm.
2021-02-19 20:23:05 +01:00
Andreas Kling 6e83be67b8 Kernel: Release ptrace lock in exec before stopping due to PT_TRACE_ME
If we have a tracer process waiting for us to exec, we need to release
the ptrace lock before stopping ourselves, since otherwise the tracer
will block forever on the lock.

Fixes #5409.
2021-02-19 12:13:54 +01:00
Andreas Kling 37d8faf1b4 ProcFS: Fix /proc/PID/* hardening bypass
This enabled trivial ASLR bypass for non-dumpable programs by simply
opening /proc/PID/vm before exec'ing.

We now hold the target process's ptrace lock across the refresh/write
operations, and deny access if the process is non-dumpable. The lock
is necessary to prevent a TOCTOU race on Process::is_dumpable() while
the target is exec'ing.

Fixes #5270.
2021-02-19 09:46:36 +01:00
Andreas Kling eb92ec3149 Kernel: Factor out mmap & friends range expansion to a helper function
sys$mmap() and related syscalls must pad to the nearest page boundary
below the base address *and* above the end address of the specified
range. Since we have to do this in many places, let's make a helper.
2021-02-18 18:04:58 +01:00
Andreas Kling 55a9a4f57a Kernel: Use KResult a bit more in sys$execve() 2021-02-18 09:37:33 +01:00
Andreas Kling 6c2f0316d9 Kernel: Convert snprintf() => String::formatted()/number() 2021-02-17 16:37:11 +01:00
Andreas Kling 5f610417d0 Kernel: Remove kprintf()
There are no remaining users of this API.
2021-02-17 16:33:43 +01:00
Andreas Kling 40e5210036 Kernel: Convert dbgprintf()/klog() => dbgln()/dmesgln() in UHCI code 2021-02-17 16:30:55 +01:00
Andreas Kling e4d84b5e79 Kernel: Remove dbgprintf() from kernel
There are no remaining users of this API in the kernel.
2021-02-17 16:22:34 +01:00
Andreas Kling 5a595ef134 Kernel: Use dbgln_if() in sys$fork() 2021-02-17 15:34:32 +01:00
AnotherTest 4043e770e5 Kernel: Don't go through ARP for IP broadcast messages 2021-02-17 14:41:36 +01:00
Andreas Kling 575c7ed414 Kernel: Make sys$msyscall() EFAULT on non-user address
Fixes #5361.
2021-02-16 11:32:00 +01:00
Ben Wiederhake fbb85f9b2f Kernel: Refuse excessively long iovec list, also in readv
This bug is a good example why copy-paste code should eventually be eliminated
from the code base: Apparently the code was copied from read.cpp before
c6027ed7cc, so the same bug got introduced here.

To recap: A malicious program can ask the Kernel to prepare sys-ing to
a huge amount of iovecs. The Kernel must first copy all the vector locations
into 'vecs', and before that allocates an arbitrary amount of memory:
    vecs.resize(iov_count);
This can cause Kernel memory exhaustion, triggered by any malicious userland
program.
2021-02-15 22:09:01 +01:00
Jean-Baptiste Boric 0d22ec9d32 Kernel: Handle 'Menu' key on PS/2 keyboard 2021-02-15 19:37:14 +01:00
AnotherTest 5729e76c7d Meta: Make it possible to (somewhat) build the system inside Serenity
This removes some hard references to the toolchain, some unnecessary
uses of an external install command, and disables a -Werror flag (for
the time being) - only if run inside serenity.

With this, we can build and link the kernel :^)
2021-02-15 17:32:56 +01:00
AnotherTest 4519950266 Kernel+LibC: Add the _SC_GETPW_R_SIZE_MAX sysconf enum
It just returns 4096 :P
2021-02-15 17:32:56 +01:00
AnotherTest a3a7ab83c4 Kernel+LibC: Implement readv
We already had writev, so let's just add readv too.
2021-02-15 17:32:56 +01:00
AnotherTest 1e79c04616 Kernel+LibC: Stub out SO_{SND_RCV}BUF 2021-02-15 17:32:56 +01:00
Brian Gianforcaro 7482cb6531 Kernel: Avoid some un-necessary copies coming from range based for loops
- The irq_controller was getting add_ref/released needlessly during enumeration.

- Used ranges were also getting needlessly copied.
2021-02-15 15:25:23 +01:00
Brian Gianforcaro 96943ab07c Kernel: Initial integration of Kernel Address Sanitizer (KASAN)
KASAN is a dynamic analysis tool that finds memory errors. It focuses
mostly on finding use-after-free and out-of-bound read/writes bugs.

KASAN works by allocating a "shadow memory" region which is used to store
whether each byte of memory is safe to access. The compiler then instruments
the kernel code and a check is inserted which validates the state of the
shadow memory region on every memory access (load or store).

To fully integrate KASAN into the SerenityOS kernel we need to:

 a) Implement the KASAN interface to intercept the injected loads/stores.

      void __asan_load*(address);
      void __asan_store(address);

 b) Setup KASAN region and determine the shadow memory offset + translation.
    This might be challenging since Serenity is only 32bit at this time.

    Ex: Linux implements kernel address -> shadow address translation like:

      static inline void *kasan_mem_to_shadow(const void *addr)
      {
          return ((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
                  + KASAN_SHADOW_OFFSET;
      }

 c) Integrating KASAN with Kernel allocators.
    The kernel allocators need to be taught how to record allocation state
    in the shadow memory region.

This commit only implements the initial steps of this long process:
- A new (default OFF) CMake build flag `ENABLE_KERNEL_ADDRESS_SANITIZER`
- Stubs out enough of the KASAN interface to allow the Kernel to link clean.

Currently the KASAN kernel crashes on boot (triple fault because of the crash
in strlen other sanitizer are seeing) but the goal here is to just get started,
and this should help others jump in and continue making progress on KASAN.

References:
* ASAN Paper: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37752.pdf
* KASAN Docs: https://github.com/google/kasan
* NetBSD KASAN Blog: https://blog.netbsd.org/tnf/entry/kernel_address_sanitizer_part_3
* LWN KASAN Article: https://lwn.net/Articles/612153/
* Tracking Issue #5351
2021-02-15 11:41:53 +01:00
Brian Gianforcaro 69df3cfae7 Kernel: Mark KBuffer and its getters as [[nodiscard]]
There is no reason to call a getter without observing the result, doing
so indicates an error in the code. Mark these methods as [[nodiscard]]
to find these cases.
2021-02-15 09:34:52 +01:00
Brian Gianforcaro 0cbede91b8 Kernel: Mark Lock getters as [[nodiscard]]
There is no reason to call a getter without observing the result, doing
so indicates an error in the code. Mark these methods as [[nodiscard]]
to find these cases.
2021-02-15 09:34:52 +01:00
Brian Gianforcaro a75d7958cc Kernel: Mark UserOrKernelBuffer and it's getters as [[nodicard]]
`UserOrKernelBuffer` objects should always be observed when created, in
turn there is no reason to call a getter without observing the result.
Doing either of these indicates an error in the code. Mark these methods
as [[nodiscard]] to find these cases.
2021-02-15 09:34:52 +01:00
Brian Gianforcaro 01a66efe9d Kernel: Mark KResult getters as [[nodiscard]]
There is no reason to call a getter without observing the result, doing
so indicates an error in the code. Mark these methods as [[nodiscard]]
to find these cases.
2021-02-15 09:34:52 +01:00
Brian Gianforcaro 8752a27519 Kernel: Mark PhysicalAddress/VirtualAddress getters as [[nodiscard]]
There is no reason to call a getter without observing the result, doing
so indicates an error in the code. Mark these methods as [[nodiscard]]
to find these cases.
2021-02-15 09:34:52 +01:00
Brian Gianforcaro d71e235894 Kernel: Mark more StdLib functions as [[nodiscard]]
In the never ending journey to catch bugs, mark more functions
as [[nodiscard]] to find incorrect call sites.
2021-02-15 09:34:52 +01:00
Brian Gianforcaro f12754ee10 Kernel: Mark BlockResult as [[nodiscard]]
In the majority of cases we want to force callers to observe the
result of a blocking operation as it's not grantee to succeed as
they expect. Mark BlockResult as [[nodiscard]] to force any callers
to observe the result of the blocking operation.
2021-02-15 08:28:57 +01:00
Brian Gianforcaro a8a834782c Kernel: Ignore unobserved BlockResult from Thread::Sleep
Suppress these in preparation for making BlockResult [[nodiscard]].
2021-02-15 08:28:57 +01:00
Brian Gianforcaro ddd79fe2cf Kernel: Add WaitQueue::wait_forever and it use it for all infinite waits.
In preparation for marking BlockingResult [[nodiscard]], there are a few
places that perform infinite waits, which we never observe the result of
the wait. Instead of suppressing them, add an alternate function which
returns void when performing and infinite wait.
2021-02-15 08:28:57 +01:00
Andreas Kling 68e3616971 Kernel: Forked children should inherit the signal trampoline address
Fixes #5347.
2021-02-14 18:38:46 +01:00
Andreas Kling 8ee42e47df Kernel: Mark a handful of things in CPU.cpp as READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling 99f596fd51 Kernel: Mark a handful of things in kmalloc.cpp as READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling 7a78a4915a Kernel: Mark a handful of things in init.cpp as READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling 49f463f557 Kernel: Mark a handful of things in Thread.cpp READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling c5c68bbd84 Kernel: Mark a handful of things in Scheduler.cpp READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling 00107a0dc1 Kernel: Mark a handful of things in Process.cpp READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling f0a1d9bfa5 Kernel: Mark the x86 IDT as READONLY_AFTER_INIT
We never need to modify the interrupt descriptor table after finishing
initialization, so let's make it an error to do so.
2021-02-14 18:12:00 +01:00
Andreas Kling a10accd48c Kernel: Print a helpful panic message for READONLY_AFTER_INIT crashes 2021-02-14 18:12:00 +01:00
Andreas Kling d8013c60bb Kernel: Add mechanism to make some memory read-only after init finishes
You can now use the READONLY_AFTER_INIT macro when declaring a variable
and we will put it in a special ".ro_after_init" section in the kernel.

Data in that section remains writable during the boot and init process,
and is then marked read-only just before launching the SystemServer.

This is based on an idea from the Linux kernel. :^)
2021-02-14 18:11:32 +01:00
Andreas Kling 6ee499aeb0 Kernel: Round old address/size in sys$mremap() to page size multiples
Found by fuzz-syscalls. :^)
2021-02-14 13:15:05 +01:00
Andreas Kling 0e92a80434 Kernel: Add some bits of randomness to kernel stack pointers
Since kernel stacks are much smaller (64 KiB) than userspace stacks,
we only add a small bit of randomness here (0-256 bytes, 16b aligned.)

This makes the location of the task context switch buffer not be
100% predictable. Note that we still also add extra randomness upon
syscall entry, so this patch primarily affects context switching.
2021-02-14 12:30:07 +01:00
Andreas Kling e47bffdc8c Kernel: Add some bits of randomness to the userspace stack pointer
This patch adds a random offset between 0 and 4096 to the initial
stack pointer in new processes. Since the stack has to be 16-byte
aligned, the bottom bits can't be randomized.

Yet another thing to make things less predictable. :^)
2021-02-14 11:53:49 +01:00
Andreas Kling 4188373020 Kernel: Fix TOCTOU in syscall entry region validation
We were doing stack and syscall-origin region validations before
taking the big process lock. There was a window of time where those
regions could then be unmapped/remapped by another thread before we
proceed with our syscall.

This patch closes that window, and makes sys$get_stack_bounds() rely
on the fact that we now know the userspace stack pointer to be valid.

Thanks to @BenWiederhake for spotting this! :^)
2021-02-14 11:47:14 +01:00
Andreas Kling 10b7f6b77e Kernel: Mark handle_crash() as [[noreturn]] 2021-02-14 11:47:14 +01:00
Ben Wiederhake c0692f1f95 Kernel: Avoid magic number in sys$poll 2021-02-14 10:57:33 +01:00
Andreas Kling cc341c95aa Kernel: Panic on sys$get_stack_bounds() in stack-less process 2021-02-14 10:51:18 +01:00
Andreas Kling 3131281747 Kernel: Panic on syscall from process with IOPL != 0
If this happens then the kernel is in an undefined state, so we should
rather panic than attempt to limp along.
2021-02-14 10:51:17 +01:00
Andreas Kling 781d29a337 Kernel+Userland: Give sys$recvfd() an options argument for O_CLOEXEC
@bugaevc pointed out that we shouldn't be setting this flag in
userspace, and he's right of course.
2021-02-14 10:39:48 +01:00
Andreas Kling 09b1b09c19 Kernel: Assert if rounding-up-to-page-size would wrap around to 0
If we try to align a number above 0xfffff000 to the next multiple of
the page size (4 KiB), it would wrap around to 0. This is most likely
never what we want, so let's assert if that happens.
2021-02-14 10:01:50 +01:00
Andreas Kling 198d641808 Kernel: Panic on attempt to map mmap'ed page at a kernel address
If we somehow get tricked into mapping user-controlled mmap memory
at a kernel address, let's just panic the kernel.
2021-02-14 09:36:58 +01:00
Andreas Kling b712345c92 Kernel: Use PANIC() in a bunch of places :^) 2021-02-14 09:36:58 +01:00
Andreas Kling c598a95b1c Kernel: Add a PANIC() function
Let's be a little more expressive when inducing a kernel panic. :^)
PANIC(...) passes any arguments you give it to dmesgln(), then prints
a backtrace and hangs the machine.
2021-02-14 09:36:58 +01:00
Andreas Kling 4021264201 Kernel: Make the Region constructor private
We can use adopt_own(*new T) instead of make<T>().
2021-02-14 01:39:04 +01:00
Andreas Kling 8415866c03 Kernel: Remove user/kernel flags from Region
Now that we no longer need to support the signal trampolines being
user-accessible inside the kernel memory range, we can get rid of the
"kernel" and "user-accessible" flags on Region and simply use the
address of the region to determine whether it's kernel or user.

This also tightens the page table mapping code, since it can now set
user-accessibility based solely on the virtual address of a page.
2021-02-14 01:34:23 +01:00
Andreas Kling 1593219a41 Kernel: Map signal trampoline into each process's address space
The signal trampoline was previously in kernelspace memory, but with
a special exception to make it user-accessible.

This patch moves it into each process's regular address space so we
can stop supporting user-allowed memory above 0xc0000000.
2021-02-14 01:33:17 +01:00
Andreas Kling ffdfbf1dba Kernel: Fix wrong sizeof() type in sys$execve() argument overflow check 2021-02-14 00:15:01 +01:00
Andreas Kling 34a83aba71 Kernel: Convert klog() => dbgln()/dmesgln() in Arch/i386/CPU.cpp 2021-02-13 21:51:16 +01:00
Jean-Baptiste Boric 9ce0639383 Kernel: Use divide_rounded_up inside write_block_list_for_inode 2021-02-13 19:56:49 +01:00
Jean-Baptiste Boric 869b33d6dd Kernel: Support triply indirect blocks for BlockListShape computation 2021-02-13 19:56:49 +01:00
Tom b445f15131 Kernel: Avoid flushing the tlb if there's only one thread
If we're flushing user space pointers and the process only has one
thread, we do not need to broadcast this to other processors as
they will all discard that request anyway.
2021-02-13 19:46:45 +01:00
Andreas Kling c877612211 Kernel: Round down base of partial ranges provided to munmap/mprotect
We were failing to round down the base of partial VM ranges. This led
to split regions being constructed that could have a non-page-aligned
base address. This would then trip assertions in the VM code.

Found by fuzz-syscalls. :^)
2021-02-13 01:49:44 +01:00
Andreas Kling af0e52ca54 Kernel: Don't assert on sys$setsockopt() with unexpected level
Just error out with ENOPROTOOPT instead.

Found by fuzz-syscalls. :^)
2021-02-13 01:29:28 +01:00
Andreas Kling a5def4e98c Kernel: Sanity check the VM range when constructing a Region
This should help us catch bogus VM ranges ending up in a process's
address space sooner.
2021-02-13 01:18:03 +01:00
Andreas Kling 62f0f73bf0 Kernel: Limit the number of file descriptors sys$poll() can handle
Just slap an arbitrary limit on there so we don't panic if somebody
asks us to poll 1 fajillion fds.

Found by fuzz-syscalls. :^)
2021-02-13 01:18:03 +01:00
Andreas Kling 7551090056 Kernel: Round up ranges to page size multiples in munmap and mprotect
This prevents passing bad inputs to RangeAllocator who then asserts.

Found by fuzz-syscalls. :^)
2021-02-13 01:18:03 +01:00
Ben Wiederhake 46e5890152 Kernel: Add forgotten 'const' flag 2021-02-13 00:40:31 +01:00
Ben Wiederhake 546cdde776 Kernel: clock_nanosleep's 'flags' is not a bitset
This had the interesting effect that most, but not all, non-zero values
were interpreted as an absolute value.
2021-02-13 00:40:31 +01:00
Ben Wiederhake e1db8094b6 Kernel: Avoid casting arbitrary user-controlled int to enum
This caused a load-invalid-value warning by KUBSan.

Found by fuzz-syscalls. Can be reproduced by running this in the Shell:

    $ syscall waitid [ 1234 ]
2021-02-13 00:40:31 +01:00
Ben Wiederhake c6027ed7cc Kernel: Refuse excessively long iovec list
If a program attempts to write from more than a million different locations,
there is likely shenaniganery afoot! Refuse to write to prevent kmem exhaustion.

Found by fuzz-syscalls. Can be reproduced by running this in the Shell:

    $ syscall writev 1 [ 0 ] 0x08000000
2021-02-13 00:40:31 +01:00
Ben Wiederhake 987b7f7917 Kernel: Forbid empty and whitespace-only process names
Those only exist to confuse the user anyway.

Found while using fuzz-syscalls.
2021-02-13 00:40:31 +01:00
Ben Wiederhake 4c42d1e35a Kernel: Do not try to print the string that cannot be read
What a silly bug :^)

Found by fuzz-syscalls. Can be reproduced by running this in the Shell:

    $ syscall set_thread_name 14 14 14
2021-02-13 00:40:31 +01:00
Ben Wiederhake 1e630fb78a Kernel: Avoid creating unkillable processes
Found by fuzz-syscalls. Can be reproduced by running this in the Shell:

    $ syscall exit_thread

This leaves the process in the 'Dying' state but never actually removes it.

Therefore, avoid this scenario by pretending to exit the entire process.
2021-02-13 00:40:31 +01:00
Ben Wiederhake b5e5e43d4b Kernel: Fix typo 2021-02-13 00:40:31 +01:00
Ben Wiederhake caeb41d92b Kernel: Don't crash on syscall with kernel-space argument
Fixes #5198.
2021-02-13 00:40:31 +01:00
Andreas Kling 9ae02d4c92 Kernel: Don't use a VLA for outgoing UDP packets
We had the same exact problem as da981578e3 but for UDP sockets.
2021-02-12 23:46:15 +01:00
Andreas Kling da981578e3 Kernel: Don't use a VLA for outgoing TCP packets
Since the payload size is user-controlled, this could be used to
overflow the kernel stack.

We should probably also be breaking things into smaller packets at a
higher level, e.g TCPSocket::protocol_send(), but let's do that as
a separate exercise.

Fixes #5310.
2021-02-12 23:00:25 +01:00
Andreas Kling 29045f84d4 Kernel: Decrease default userspace stack size to 1 MiB
Not sure why this was 4 MiB in the first place, but that's a lot of
memory to reserve for each thread when we're running with 512 MiB
total in the default testing setup. :^)
2021-02-12 19:17:09 +01:00
Andreas Kling e050577f0a Kernel: Make MAP_RANDOMIZED honor alignment requests
Previously, we only cared about the alignment on the fallback path.
2021-02-12 19:15:59 +01:00
Andreas Kling 4e2802bf91 Kernel: Move region dumps from dmesg to debug log
Also fix a broken format string caught by the new format string checks.
2021-02-12 16:33:58 +01:00
Andreas Kling 1ef43ec89a Kernel: Move get_interpreter_load_offset() out of Process class
This is only used inside the sys$execve() implementation so just make
it a execve.cpp local function.
2021-02-12 16:30:29 +01:00
Andreas Kling c4db224c94 Kernel: Convert klog() => dmesgln() / dbgln() in MemoryManager 2021-02-12 16:24:40 +01:00
Andreas Kling 5af69d6e93 Kernel: Convert klog() to dmesgln() in RangeAllocator 2021-02-12 16:24:40 +01:00
Andreas Kling 0a45cfee01 DevFS: Use strongly typed InodeIndex
Also add an assertion for the DevFS inode index allocator overflowing.
2021-02-12 16:24:40 +01:00
Sergey Bugaev 4717009e3e Kernel: Hold less locks when receiving ICMP packets
* We don't have to lock the "all IPv4 sockets" in exclusive mode, shared mode is
  enough for just reading the list (as opposed to modifying it).
* We don't have to lock socket's own lock at all, the IPv4Socket::did_receive()
  implementation takes care of this.
* Most importantly, we don't have to hold the "all IPv4 sockets" across the
  IPv4Socket::did_receive() call(s). We can copy the current ICMP socket list
  while holding the lock, then release the lock, and then call
  IPv4Socket::did_receive() on all the ICMP sockets in our list.

These changes fix a deadlock triggered by receiving ICMP messages when using tap
networking setup (as opposed to QEMU's default user/SLIRP networking) on the host.
2021-02-12 15:37:28 +01:00
Andreas Kling ffa39f98e8 Kernel: Fix build with BBFS_DEBUG 2021-02-12 13:51:34 +01:00
Andreas Kling c62c00e7db Ext2FS: Make Ext2FS::GroupIndex a distinct integer type 2021-02-12 13:33:58 +01:00
Andreas Kling 489317e573 Kernel: Make BlockBasedFS::BlockIndex a distinct integer type 2021-02-12 11:59:27 +01:00
Andreas Kling e44c1792a7 Kernel: Add distinct InodeIndex type
Use the DistinctNumeric mechanism to make InodeIndex a strongly typed
integer type.
2021-02-12 10:26:29 +01:00
Andreas Kling c8a90a31b6 Kernel: Remove default arguments from Inode::resolve_as_link()
Nobody was calling it without specifying all arguments anyway.
2021-02-12 09:06:03 +01:00
Owen Smith c2de22a635 Kernel: Merge split function and data sections into one during linking
Also add an assertion to make sure the safemem sections are never
discarded by the linker.
2021-02-12 08:57:26 +01:00
Andreas Kling 8c694ed6eb Kernel: Don't call Thread::set_should_die() twice on every thread
This stops the "should already die" debug spam we've been seeing.
2021-02-11 23:33:42 +01:00
Andreas Kling 95064f8b58 Ext2FS: Convert #if EXT2_DEBUG => dbgln_if() and constexpr-if 2021-02-11 23:05:16 +01:00
Andreas Kling abe4463b1c Kernel: Remove an unnecessary InterruptDisabler in early initialization 2021-02-11 22:56:14 +01:00
Andreas Kling a280cdf9ba Ext2FS: Shrink Ext2FSDirectoryEntry from 16 to 12 bytes
The way we read/write directories is very inefficient, and this doesn't
solve any of that. It does however reduce memory usage of directory
entry vectors by 25% which has nice immediate benefits.
2021-02-11 22:45:50 +01:00
Andreas Kling cef73f2010 Kernel: Remove CMake spam when setting up KUBSAN flags 2021-02-11 22:16:28 +01:00
Andreas Kling 54986228bf Kernel: Oops, add missing #include to fix ENABLE_ALL_THE_DEBUG_MACROS 2021-02-11 22:15:55 +01:00
Andreas Kling 0dbb22e9e0 Kernel: Remove a handful of unused things in VM/ directory
Also add some missing initializers.
2021-02-11 22:02:39 +01:00
Andreas Kling ba42d741cb Kernel: Add explicit __serenity__ define to workaround CLion problem
CLion doesn't understand that we switch compilers mid-build (which I
can understand since it's a bit unusual.) Defining __serenity__ makes
the majority of IDE features work correctly in the kernel context.
2021-02-11 21:23:31 +01:00
Jean-Baptiste Boric f8c352a022 Kernel: Fix undefined signed overflow in KernelRng's RTC fallback 2021-02-11 20:58:39 +01:00
Jean-Baptiste Boric eedb6480df Kernel: Don't assert if RTC believes we're in the past 2021-02-11 20:58:39 +01:00
Hendiadyoin1 4d5496b2b2
KUBSAN: Add nearly all missing -fsanitize handlers (#5254) 2021-02-11 20:58:01 +01:00
Andreas Kling 085f80aeac Kernel: Remove unused root directory computation in Process creation
sys$fork() already takes care of children inheriting the parent's root
directory, so there was no need to do the same thing when creating a
new user process.
2021-02-09 19:18:13 +01:00
Andreas Kling 1f277f0bd9 Kernel: Convert all *Builder::appendf() => appendff() 2021-02-09 19:18:13 +01:00
Andreas Kling e8f040139b Kernel: Remove unused Thread::is_runnable_state() 2021-02-08 23:07:33 +01:00
Andreas Kling 4ff0f971f7 Kernel: Prevent execve/ptrace race
Add a per-process ptrace lock and use it to prevent ptrace access to a
process after it decides to commit to a new executable in sys$execve().

Fixes #5230.
2021-02-08 23:05:41 +01:00
Andreas Kling 4b7b92c201 Kernel: Remove two unused fields from sys$execve's LoadResult 2021-02-08 22:31:03 +01:00
Andreas Kling 4cd2c475a8 Kernel: Make the space lock a RecursiveSpinLock 2021-02-08 22:28:48 +01:00
Andreas Kling 0d7af498d7 Kernel: Move ShouldAllocateTls enum from Process to execve.cpp 2021-02-08 22:24:37 +01:00
Andreas Kling 9ca42c4c0e Kernel: Always hold space lock while calculating memory statistics
And put the locker at the top of the functions for clarity.
2021-02-08 22:23:29 +01:00
Andreas Kling 8bda30edd2 Kernel: Move memory statistics helpers from Process to Space 2021-02-08 22:23:29 +01:00
Andreas Kling b1c9f93fa3 Kernel: Skip generic region lookup in sys$futex and sys$get_stack_bounds
Just ask the process space directly instead of using the generic region
lookup that also checks for kernel regions.
2021-02-08 22:23:29 +01:00
Andreas Kling f39c2b653e Kernel: Reorganize ptrace implementation a bit
The generic parts of ptrace now live in Kernel/Syscalls/ptrace.cpp
and the i386 specific parts are moved to Arch/i386/CPU.cpp
2021-02-08 19:34:41 +01:00
Andreas Kling 45231051e6 Kernel: Set the dumpable flag before switching spaces in sys$execve() 2021-02-08 19:15:42 +01:00
Andreas Kling d746639171 Kernel: Remove outdated code to dump memory layout after exec load 2021-02-08 19:07:29 +01:00
Andreas Kling f1b5def8fd Kernel: Factor address space management out of the Process class
This patch adds Space, a class representing a process's address space.

- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.

This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.

Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.

Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
2021-02-08 18:27:28 +01:00
Andreas Kling b2cba3036e Kernel: Remove unused MemoryManager::validate_range()
This is no longer used since we've switched to using the MMU to
generate EFAULT errors.
2021-02-08 18:27:28 +01:00
Andreas Kling cf5ab665e0 Kernel: Remove unused Process::for_each_thread_in_coredump() 2021-02-08 18:27:28 +01:00
AnotherTest 09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
AnotherTest 1f8a633cc7 Kernel: Make Arch/i386/CPU.cpp safe to run through clang-format
This file was far too messy, and touching it was a major pain.
Also enable clang-format linting on it.
2021-02-08 18:08:55 +01:00
AnotherTest 53ce923e10 Everywhere: Fix obvious dbgln() bugs
This will allow compiletime dbgln() checks to pass
2021-02-08 18:08:55 +01:00
Ben Wiederhake 0a2304ba05 Everywhere: Fix weird includes 2021-02-08 18:03:57 +01:00
Tom 1d843c46eb Kernel: KResultOr can use the same storage as the object for the error
Since it can only hold either an object or an error code, we can share
the same storage to hold either.
2021-02-08 18:00:38 +01:00
Tom 27a395d964 Kernel: Fix KResultOr copy-move from itself case
If move-assigning from itself we shouldn't do anything.
2021-02-07 23:02:57 +01:00
Tom b22740c08e Kernel: Use KResultOr::release_value in Process::create_kernel_thread
This should avoid an unneccessary reference bump.
2021-02-07 22:25:15 +01:00
Tom f74e31c74d Kernel: Change KResultOr::take_value to use move semantics
This may be more light weight than copying the object.
2021-02-07 22:25:15 +01:00
Andreas Kling ad42d873e5 Kernel: Remove ancient unused Scheduler::beep() declaration 2021-02-07 20:45:09 +01:00
Andreas Kling 0d8262cbab Kernel: Remove a handful of unused things from Thread 2021-02-07 20:26:53 +01:00
Andreas Kling 2ec8b4e177 Kernel: Don't allocate kernel stack twice per thread :^) 2021-02-07 20:13:51 +01:00
Andreas Kling b466ede1ea Kernel: Make sure we can allocate kernel stack before creating thread
Wrap thread creation in a Thread::try_create() helper that first
allocates a kernel stack region. If that allocation fails, we propagate
an ENOMEM error to the caller.

This avoids the situation where a thread is half-constructed, without a
valid kernel stack, and avoids having to do messy cleanup in that case.
2021-02-07 19:27:00 +01:00
Andreas Kling 5c45b0d32d Kernel: Combine Thread::backtrace() and backtrace_impl() into one 2021-02-07 19:27:00 +01:00
Andreas Kling fd3eca3acc Kernel: Add initializer for Thread::m_tss 2021-02-07 19:27:00 +01:00
Andreas Kling 5c1c82cd33 Kernel: Remove unused function Process::backtrace() 2021-02-07 19:27:00 +01:00
Andreas Kling b1813e5dae Kernel: Remove some unused declarations from Process 2021-02-07 19:27:00 +01:00
Brian Gianforcaro c95d48c8d6 Kernel: KUBSAN implementation of returns-nonnull-attribute
This didn't find anything in the current source.
2021-02-07 10:22:03 +01:00
William Bowling b97d23a71f
Kernel: Use the resolved parent path when testing create veil (#5231) 2021-02-06 19:11:44 +01:00
Andreas Kling 04ff46bff4 Kernel: And some more KUBSAN checks :^)
Here comes a few more:

* enum
* object-size
* vptr
2021-02-06 17:39:49 +01:00
Andreas Kling fad0332898 Kernel: Implement some more KUBSAN checks :^)
This patch enables the following -fsanitize sub-options:

* bounds
* bounds-strict
* integer-divide-by-zero
* return
* shift
* shift-base
* shift-exponent
2021-02-06 17:39:49 +01:00
Andreas Kling 930e3ce00d Kernel: Don't left-shift 1 (signed) 31 times
Found by KUBSAN :^)
2021-02-05 21:28:06 +01:00
Andreas Kling 4c0707e56c Kernel: Don't create a zero-length VLA in Ext2FS block list walk
Found by KUBSAN :^)
2021-02-05 21:23:11 +01:00
Andreas Kling d164f89ada Kenrel: Implement two more KUBSAN checks
This patch adds the following UndefinedBehaviorSanitizer sub-options:

* signed-integer-overflow
* vla-bound
2021-02-05 21:23:11 +01:00
Andreas Kling f4eb1f261f Kernel: Add missing initializer for SharedIRQHandler::m_enabled
Found by KUBSAN :^)
2021-02-05 21:23:11 +01:00
Andreas Kling d44be96893 Kernel: KUBSAN! (Kernel Undefined Behavior SANitizer) :^)
We now build the kernel with partial UBSAN support.
The following -fsanitize sub-options are enabled:

* nonnull-attribute
* bool

If the kernel detects UB at runtime, it will now print a debug message
with a stack trace. This is very cool! I'm leaving it on by default for
now, but we'll probably have to re-evaluate this as more options are
enabled and slowdown increases.
2021-02-05 21:23:11 +01:00
Andreas Kling e87eac9273 Userland: Add LibSystem and funnel all syscalls through it
This achieves two things:

- Programs can now intentionally perform arbitrary syscalls by calling
  syscall(). This allows us to work on things like syscall fuzzing.

- It restricts the ability of userspace to make syscalls to a single
  4KB page of code. In order to call the kernel directly, an attacker
  must now locate this page and call through it.
2021-02-05 12:23:39 +01:00
Jean-Baptiste Boric edd2362f39 Kernel: Add NE2000 network card driver
Remember, friends don't let friends use NE2000 network cards :^)
2021-02-05 09:35:02 +01:00
Liav A 865aade42b Kernel: Clear pending interrupts before enabling IRQs of IDE Channel
Calling detect_disks() can generate interrupts, so we must clear it to
allow proper function when booting with kernel argument smp=on.
2021-02-05 09:10:37 +01:00
Liav A f2faf11d61 Kernel: Try to detect Sound Blaster 16 before creating an instance
We shouldn't create a SB16 instance without checking if the Sound
Blaster 16 card is actually installed in the system.
2021-02-05 08:54:02 +01:00
Andreas Kling 54d28df97d Kernel: Make /proc/PID/stacks/TID a JSON array
The contents of these files are now raw JSON arrays. We no longer
symbolicate the addresses. That's up to userspace from now on.
2021-02-04 22:55:39 +01:00
asynts 6a00e338a8 Make it possible to overwrite debug macros locally.
Leaking macros across headers is a terrible thing, but I can't think of
a better way of achieving this.

  - We need some way of modifying debug macros from CMake to implement
    ENABLE_ALL_THE_DEBUG_MACROS.

  - We need some way of modifying debug macros in specific source files
    because otherwise we need to rebuild too many files.

This was done using the following script:

    sed -i -E 's/#cmakedefine01 ([A-Z0-9_]+)/#ifndef \1\n\0\n#endif\n/' AK/Debug.h.in
    sed -i -E 's/#cmakedefine01 ([A-Z0-9_]+)/#ifndef \1\n\0\n#endif\n/' Kernel/Debug.h.in
2021-02-04 18:26:22 +01:00
Andreas Kling c10e0adaca Kernel: Move perf event backtrace capture out of Thread class
There's no need for this to be generic and support running from an
arbitrary thread context. Perf events are always generated from within
the thread being profiled, so take advantage of that to simplify the
code. Also use Vector capacity to avoid heap allocations.
2021-02-03 11:53:05 +01:00
Andreas Kling 9c77980965 Everywhere: Remove some bitrotted "#if 0" blocks 2021-02-03 11:17:47 +01:00
Andreas Kling ac59903c89 Kernel: Don't try to symbolicate user addresses with ksyms
That's just not gonna work. :^)
2021-02-03 11:08:23 +01:00
Andreas Kling e1236dac3e Kernel: Check for off_t overflow in FileDescription::read/write
We were checking for size_t (unsigned) overflow but the current offset
is actually stored as off_t (signed). Fix this, and also fail with
EOVERFLOW correctly.
2021-02-03 10:54:35 +01:00
Andreas Kling 9f05044c50 Kernel: Check for off_t overflow before reading/writing InodeFile
Let's double-check before calling the Inode. This way we don't have to
trust every Inode subclass to validate user-supplied inputs.
2021-02-03 10:51:37 +01:00
Liav A 36f6351edc Kernel: Restore IDE PIO functionality
This change can be actually seen as two logical changes, the first
change is about to ensure we only read the ATA Status register only
once, because if we read it multiple times, we acknowledge interrupts
unintentionally. To solve this issue, we always use the alternate Status
register and only read the original status register in the IRQ handler.

The second change is how we handle interrupts - if we use DMA, we can
just complete the request and return from the IRQ handler. For PIO mode,
it's more complicated. For PIO write operation, after setting the ATA
registers, we send out the data to IO port, and wait for an interrupt.
For PIO read operation, we set the ATA registers, and wait for an
interrupt to fire, then we just read from the data IO port.
2021-02-03 10:18:16 +01:00
Andreas Kling d4dd4a82bb Kernel: Don't allow sys$msyscall() on non-mmap regions 2021-02-02 20:16:13 +01:00
Andreas Kling 823186031d Kernel: Add a way to specify which memory regions can make syscalls
This patch adds sys$msyscall() which is loosely based on an OpenBSD
mechanism for preventing syscalls from non-blessed memory regions.

It works similarly to pledge and unveil, you can call it as many
times as you like, and when you're finished, you call it with a null
pointer and it will stop accepting new regions from then on.

If a syscall later happens and doesn't originate from one of the
previously blessed regions, the kernel will simply crash the process.
2021-02-02 20:13:44 +01:00
Andreas Kling d4f40241f1 Ext2FS: Avoid unnecessary parent inode lookup during inode creation
Creation of new inodes is always driven by the parent inode, so we can
just refer directly to it instead of looking up the parent by ID.
2021-02-02 18:58:26 +01:00
Andreas Kling 9e4dd834ab Ext2FS: Simplify inode creation by always starting empty
We had two ways of creating a new Ext2FS inode. Either they were empty,
or they started with some pre-allocated size.

In practice, the pre-sizing code path was only used for new directories
and it didn't actually improve anything as far as I can tell.

This patch simplifies inode creation by simply always allocating empty
inodes. Block allocation and block list generation now always happens
on the same code path.
2021-02-02 18:58:26 +01:00
Andreas Kling dbb668ddd3 Ext2FS: Propagate error codes from write_directory() 2021-02-02 18:58:26 +01:00
Andreas Kling b0b51c3955 Kernel: Limit the size of stack traces
Let's not allow infinitely long stack traces. Cap it at 4096 frames.
2021-02-02 18:58:26 +01:00
Linus Groh ee41d6e154 Base: Rename some keymaps to use xx-xx format where appropriate
- en.json -> en-us.json
- gb.json -> en-gb.json
- ptbr.json -> pt-br.json
- ptpt.json -> pt-pt.json
2021-02-02 16:53:11 +01:00
Liav A 65c27bfe52 Kernel: Set file size for smbios_entry_point and DMI blobs in ProcFS
This is needed to support dmidecode version 3.3, so it can read the 2
blobs in ProcFS.
2021-02-01 17:13:23 +01:00
Liav A a1e20aa04f Kernel: Fix enum of sysconf values to be in the correct order
This prevented from dmidecode to get the right PAGE_SIZE when using the
sysconf syscall.

I found this bug, when I tried to figure why dmidecode fails to mmap
/dev/mem when I passed --no-procfs, and the conclusion is that it tried
to mmap unaligned physical address 0xf5ae0 (SMBIOS data), and that was
caused by a wrong value returned after using the sysconf syscall to get
the plaform page size, therefore, allowing to send an unaligned address
to the mmap syscall.
2021-02-01 17:13:23 +01:00
Liav A 5ab1864497 Kernel: Introduce the MemoryDevice
This is a character device that is being used by the dmidecode utility.
We only allow to map the BIOS ROM area to userspace with this device.
2021-02-01 17:13:23 +01:00
Liav A df59b80e23 Kernel: Expose SMBIOS blobs in ProcFS 2021-02-01 17:13:23 +01:00
Ben Wiederhake cbee0c26e1 Kernel+keymap+KeyboardMapper: New pledge for getkeymap 2021-02-01 09:54:32 +01:00
Ben Wiederhake 272df54a3e Kernel+LibKeyboard: Define the default keymap only in one place
Because it was 'static const' and also shared with userland programs,
the default keymap was defined in multiple places. This commit should
save several kilobytes! :^)
2021-02-01 09:54:32 +01:00
Ben Wiederhake 0e3408d4d6 LibKeyboard: Don't assert on failure 2021-02-01 09:54:32 +01:00
Ben Wiederhake dd4e670f72 LibKeyboard+keymap: Support querying the keymap via commandline 2021-02-01 09:54:32 +01:00
Ben Wiederhake a2c21a55e1 Kernel+LibKeyboard: Enable querying the current keymap 2021-02-01 09:54:32 +01:00
Jean-Baptiste Boric b48d8d1d6d Userland: Rename PCI slot to PCI device terminology 2021-01-31 19:06:40 +01:00
Jean-Baptiste Boric 06d76a4717 Kernel: Fix PCI bridge enumeration
The enumeration code is already enumerating all buses, recursively
enumerating bridges (which are buses) makes devices on bridges being
enumerated multiple times. Also, the PCI code was incorrectly mixing up
terminology; let's settle down on bus, device and function because ever
since PCIe came along "slots" isn't really a thing anymore.
2021-01-31 19:06:40 +01:00
Andreas Kling 1320b9351e Revert "Kernel: Don't clone kernel mappings for bottom 2 MiB VM into processes"
This reverts commit da7b21dc06.

This broke SMP boot, oops! :^)
2021-01-31 19:00:53 +01:00
Andreas Kling da7b21dc06 Kernel: Don't clone kernel mappings for bottom 2 MiB VM into processes
I can't think of anything that needs these mappings anymore, so let's
get rid of them.
2021-01-31 15:20:18 +01:00
Andreas Kling 9984201634 Kernel: Use KResult a bit more in the IPv4 networking code 2021-01-31 12:13:16 +01:00
Ben Wiederhake b00799b9ce Kernel: Make /proc/self/ work again
I have no idea when it broke.

Inspired by https://www.thanassis.space/bashheimer.html
2021-01-31 12:03:14 +01:00
Andreas Kling 6e4e3a7612 Kernel: Remove pledge exception for sys$getsockopt() with SO_PEERCRED
We had an exception that allowed SOL_SOCKET + SO_PEERCRED on local
socket to support LibIPC's PID exchange mechanism. This is no longer
needed so let's just remove the exception.
2021-01-31 09:29:27 +01:00
Andreas Kling 4d777a9bf4 Kernel: Allow changing thread names with the "stdio" promise
It's useful for programs to change their thread names to say something
interesting about what they are working on. Let's not require "thread"
for this since single-threaded programs may want to do it without
pledging "thread".
2021-01-30 23:38:57 +01:00
Peter Elliott c0e88b9710 Kernel: Add FIBMAP ioctl to Ext2FileSystem
FIBMAP is a linux ioctl that gives the location on disk of a specific
block of a file
2021-01-30 22:54:51 +01:00
Andreas Kling 90343eeaeb Revert "Kernel: Return -ENOTDIR for non-directory mount target"
This reverts commit b7b09470ca.

Mounting a file on top of a file is a valid thing we support.
2021-01-30 13:52:12 +01:00
Andreas Kling 123c37e1c0 Kernel: Fix mix-up between MAP_STACK/MAP_ANONYMOUS in prot validation 2021-01-30 10:30:17 +01:00
Andreas Kling e55ef70e5e Kernel: Remove "has made executable exception for dynamic loader" flag
As Idan pointed out, this flag is actually not needed, since we don't
allow transitioning from previously-executable to writable anyway.
2021-01-30 10:06:52 +01:00
Linus Groh a3da5bc925 Meta: Expect sync-local.sh script at repository root
This used to be in Kernel/, next to the build-root-filesystem.sh script,
which was then moved to Meta/ during the transition to CMake but has the
working directory set to Build/, effectively expecting it there - which
seems silly.

TL;DR: Very confusing. Use an explicit path relative to SERENITY_ROOT
instead and update the .gitignore files.
2021-01-30 09:18:46 +01:00
Andreas Kling a8c823f242 Kernel: Bump the number of fd's that can be queued on a local socket 2021-01-29 22:11:59 +01:00
Luke 40de84ba67 Kernel/Storage: Rewrite IDE disk detection and disk access
This replaces the current disk detection and disk access code with
code based on https://wiki.osdev.org/IDE

This allows the system to boot on VirtualBox with serial debugging
enabled and VMWare Player.

I believe there were several issues with the current code:
- It didn't utilise the last 8 bits of the LBA in 24-bit mode.
- {read,write}_sectors_with_dma was not setting the obsolete bits,
  which according to OSdev wiki aren't used but should be set.
- The PIO and DMA methods were using slightly different copy
  and pasted access code, which is now put into a single
  function called "ata_access"
- PIO mode doesn't work. This doesn't fix that and should
  be looked into in the future.
- The detection code was not checking for ATA/ATAPI.
- The detection code accidentally had cyls/heads/spt as 8-bit,
  when they're 16-bit.
- The capabilities of the device were not considered. This is now
  brought in and is currently used to check if the device supports
  LBA. If not, use CHS.
2021-01-29 21:20:38 +01:00
Andreas Kling d0c5979d96 Kernel: Add "prot_exec" pledge promise and require it for PROT_EXEC
This prevents sys$mmap() and sys$mprotect() from creating executable
memory mappings in pledged programs that don't have this promise.

Note that the dynamic loader runs before pledging happens, so it's
unaffected by this.
2021-01-29 18:56:34 +01:00
Jorropo df30b3e54c
Kernel: RangeAllocator randomized correctly check if size is in bound. (#5164)
The random address proposals were not checked with the size so it was
increasely likely to try to allocate outside of available space with
larger and larger sizes.

Now they will be ignored instead of triggering a Kernel assertion
failure.

This is a continuation of: c8e7baf4b8
2021-01-29 17:18:23 +01:00
Andreas Kling 51df44534b Kernel: Disallow mapping anonymous memory as executable
This adds another layer of defense against introducing new code into a
running process. The only permitted way of doing so is by mmapping an
open file with PROT_READ | PROT_EXEC.

This does make any future JIT implementations slightly more complicated
but I think it's a worthwhile trade-off at this point. :^)
2021-01-29 14:52:34 +01:00
Andreas Kling af3d3c5c4a Kernel: Enforce W^X more strictly (like PaX MPROTECT)
This patch adds enforcement of two new rules:

- Memory that was previously writable cannot become executable
- Memory that was previously executable cannot become writable

Unfortunately we have to make an exception for text relocations in the
dynamic loader. Since those necessitate writing into a private copy
of library code, we allow programs to transition from RW to RX under
very specific conditions. See the implementation of sys$mprotect()'s
should_make_executable_exception_for_dynamic_loader() for details.
2021-01-29 14:52:27 +01:00
Andreas Kling c8e7baf4b8 Kernel: Check for alignment size overflow when allocating VM ranges
Also add some sanity check assertions that we're generating and
returning ranges contained within the RangeAllocator's total range.

Fixes #5162.
2021-01-29 12:11:42 +01:00
Linus Groh dbbc378fb2 Kernel: Return -ENOTBLK for non-block device Ext2FS mount source
When mounting an Ext2FS, a block device source is required. All other
filesystem types are unaffected, as most of them ignore the source file
descriptor anyway.

Fixes #5153.
2021-01-29 08:45:56 +01:00