Commit graph

921 commits

Author SHA1 Message Date
Linus Groh f646d49ac1 Kernel: Add _SC_HOST_NAME_MAX 2021-09-11 00:28:39 +02:00
Andreas Kling dd82f68326 Kernel: Use KString all the way in sys$execve()
This patch converts all the usage of AK::String around sys$execve() to
using KString instead, allowing us to catch and propagate OOM errors.

It also required changing the kernel CommandLine helper class to return
a vector of KString for the userspace init program arguments.
2021-09-09 21:25:10 +02:00
Andreas Kling 524ef5e475 Kernel: Add KBuffer::bytes() and use it
(Instead of hand-wrapping { data(), size() } in a bunch of places.)
2021-09-08 20:16:00 +02:00
Liav A 3d5ddbab74 Kernel: Rename DevFS => DevTmpFS
The current implementation of DevFS resembles the linux devtmpfs, and
not the traditional DevFS, so let's rename it to better represent the
direction of the development in regard to this filesystem.

The abbreviation for DevTmpFS is still "dev", because it doesn't add
value as a commandline option to make it longer.

In quick summary - DevFS in unix OSes is simply a static filesystem, so
device nodes are generated and removed by the kernel code. DevTmpFS
is a "modern reinvention" of the DevFS, so it is much more like a TmpFS
in the sense that not only it's stored entirely in RAM, but the userland
is responsible to add and remove devices nodes as it sees fit, and no
kernel code is directly being involved to keep the filesystem in sync.
2021-09-08 00:42:20 +02:00
Andreas Kling a01b19c878 Kernel: Remove KBuffer::try_copy() in favor of try_create_with_bytes()
These were already equivalent, so let's only have one of them.
2021-09-07 16:22:29 +02:00
Andreas Kling b300f9aa2f Kernel: Convert KBuffer::copy() => KBuffer::try_copy()
This was a weird KBuffer API that assumed failure was impossible.
This patch converts it to a modern KResultOr<NonnullOwnPtr<KBuffer>> API
and updates the two clients to the new style.
2021-09-07 15:36:39 +02:00
Andreas Kling 250b52d6e5 Kernel: Make KBuffer::try_create_with_bytes() return KResultOr 2021-09-07 15:22:24 +02:00
Andreas Kling ed5d04b0ea Kernel: Use KResultOr and TRY() for FIFO 2021-09-07 13:58:16 +02:00
Andreas Kling 213b8868af Kernel: Rename file_description(fd) => open_file_description(fd)
To go with the class rename.
2021-09-07 13:53:14 +02:00
Andreas Kling 4a9c18afb9 Kernel: Rename FileDescription => OpenFileDescription
Dr. POSIX really calls these "open file description", not just
"file description", so let's call them exactly that. :^)
2021-09-07 13:53:14 +02:00
Andreas Kling dbd639a2d8 Kernel: Convert much of sys$execve() to using KString
Make use of the new FileDescription::try_serialize_absolute_path() to
avoid String in favor of KString throughout much of sys$execve() and
its helpers.
2021-09-07 13:53:14 +02:00
Andreas Kling 226383f45b LibELF: Use StringView to carry temporary strings in auxiliary vector
Let's not force clients to provide a String.
2021-09-07 13:53:14 +02:00
Andreas Kling a27c6f5226 Kernel: Avoid unnecessary String allocation in sys$statvfs() 2021-09-07 13:53:14 +02:00
Andreas Kling 55b0b06897 Kernel: Store process names as KString 2021-09-07 13:53:14 +02:00
Andreas Kling fe2e25edad Kernel: Add a comment explaining an alternate path in Process::exec()
I had to look at this for a moment before I realized that sys$execve()
and the spawning of /bin/SystemServer at boot are taking two different
paths out of exec().

Add a comment to help the next person looking at it. :^)
2021-09-07 01:34:26 +02:00
Andreas Kling 5d06ab6531 Kernel: Fix file description leak in sys$execve()
Before this patch, we were leaking a ref on the open file description
used for the interpreter (the dynamic loader) in sys$execve().

This surfaced when adapting the syscall to use TRY(), since we were now
correctly transferring ownership of the interpreter to Process::exec()
and no longer holding on to a local copy of it (in `elf_result`).

Fixing the leak uncovered another problem. The interpreter description
would now get destroyed when returning from do_exec(), which led to a
kernel panic when attempting to acquire a mutex.

This happens because we're in a particularly delicate state when
returning from do_exec(). Everything is primed for the upcoming context
switch into the new executable image, and trying to block the thread
at this point will panic the kernel.

We fix this by destroying the interpreter description earlier in
do_exec(), at the point where we no longer need it.
2021-09-07 01:18:02 +02:00
Andreas Kling e226400dd8 Kernel: Don't seek the program executable description in sys$execve()
The dynamic loader doesn't care if the kernel has moved the file
cursor around before it gains control.
2021-09-07 01:18:02 +02:00
Andreas Kling f4624e4ee1 Kernel: Hoist allocation of main program FD in sys$execve()
When executing a dynamically linked program, we need to pass the main
program executable via a file descriptor to the dynamic loader.

Before this patch, we were allocating an FD for this purpose long after
it was safe to do anything fallible. If we were unable to allocate an
FD we would simply panic the kernel(!)

We now hoist the allocation so it can fail before we've committed to
a new executable.
2021-09-07 01:18:02 +02:00
Andreas Kling b141bfe53b Kernel: Reorganize ELF loading so it can use TRY()
Due to the use of ELF::Image::for_each_program_header(), we were
previously unable to use TRY() in the ELF loading code (since the return
statement inside TRY() would only return from the iteration callback.)
2021-09-07 01:18:02 +02:00
Andreas Kling e6929835d2 Kernel: Make copy_time_from_user() helpers use KResultOr<Time>
...and use TRY() for smooth error propagation everywhere.
2021-09-07 01:18:02 +02:00
Andreas Kling 274d535d0e Kernel: Use TRY() in sys$module_load() and sys$module_unload() 2021-09-06 20:23:08 +02:00
Andreas Kling 56a2594de7 Kernel: Make KString factories return KResultOr + use TRY() everywhere
There are a number of places that don't have an error propagation path
right now, so I've added FIXME's about that.
2021-09-06 19:25:36 +02:00
Andreas Kling f16b9a691f Kernel: Rename ProcessPagingScope => ScopedAddressSpaceSwitcher 2021-09-06 18:56:51 +02:00
Andreas Kling cd8d52e6ae Kernel: Improve API names for switching address spaces
- enter_space => enter_address_space
- enter_process_paging_scope => enter_process_address_space
2021-09-06 18:56:51 +02:00
Andreas Kling 298cd57fe7 Kernel: Allocate signal trampoline before committing to a sys$execve()
Once we commit to a new executable image in sys$execve(), we can no
longer return with an error to whoever called us from userspace.
We must make sure to surface any potential errors before that point.

This patch moves signal trampoline allocation before the commit.
A number of other things remain to be moved.
2021-09-06 18:56:51 +02:00
Andreas Kling 6863d015ec Kernel: Use TRY() more in sys$execve()
I just keep finding more and more places to make use of this. :^)
2021-09-06 18:56:51 +02:00
Andreas Kling 009ea5013d Kernel: Use TRY() in find_elf_interpreter_for_executable() 2021-09-06 18:56:51 +02:00
Andreas Kling 511ebffd94 Kernel: Improve find_elf_interpreter_for_executable() parameter names 2021-09-06 18:56:51 +02:00
Andreas Kling 645e29a88b Kernel: Don't turn I/O errors during sys$execve() into ENOEXEC
Instead, just propagate whatever the real error was.
2021-09-06 13:06:05 +02:00
Andreas Kling 84addef10f Kernel: Improve arguments retrieval error propagation in sys$execve()
Instead of turning any arguments related error into an EFAULT, we now
propagate the innermost error during arguments retrieval.
2021-09-06 13:06:05 +02:00
Andreas Kling 6e3381ac32 Kernel: Use KResultOr and TRY() for {Shared,Private}InodeVMObject 2021-09-06 13:06:05 +02:00
Andreas Kling e3a716ceff Kernel: Make Memory::Region::map() return KResult
..and use TRY() at the call sites to propagate errors. :^)
2021-09-06 13:06:05 +02:00
Andreas Kling 7981422500 Kernel: Make Threads always have a name
We previously allowed Thread to exist in a state where its m_name was
null, and had to work around that in various places.

This patch removes that possibility and forces those who would create a
thread (or change the name of one) to provide a NonnullOwnPtr<KString>
with the name.
2021-09-06 13:06:05 +02:00
Andreas Kling cda2b9e71c Kernel: Improvements to Custody absolute path serialization
- Renamed try_create_absolute_path() => try_serialize_absolute_path()
- Use KResultOr and TRY() to propagate errors
- Don't call this when it's only for debug logging
2021-09-06 13:06:05 +02:00
Andreas Kling e3b063581e Kernel: Use TRY() in sys$alarm() 2021-09-06 13:06:05 +02:00
Andreas Kling d34f2b643e Kernel: Tidy up Plan9FS construction a bit 2021-09-06 13:06:05 +02:00
Andreas Kling 36725228fa Kernel: Tidy up Ext2FS construction a bit 2021-09-06 13:06:05 +02:00
Andreas Kling 47bfbe343b Kernel: Tidy up SysFS construction
- Use KResultOr and TRY() to propagate errors
- Check for OOM errors
- Move allocation out of constructors

There's still a lot more to do here, as SysFS is still quite brittle
in the face of memory pressure.
2021-09-06 13:06:05 +02:00
Andreas Kling 788b91a65c Kernel: Tidy up DevFS construction and handle OOM errorso
- Use KResultOr and TRY() to propagate errors
- Check for OOM
- Move allocations out of the DevFS constructor
2021-09-06 13:06:05 +02:00
Andreas Kling efe4e230ee Kernel: Tidy up DevPtsFS construction and handle OOM errors
- Use KResultOr and TRY() to propagate errors
- Check for OOM when creating new inodes
2021-09-06 13:06:05 +02:00
Andreas Kling f2512071f2 Kernel: Use TRY() in sys$getrandom() 2021-09-06 02:36:21 +02:00
Andreas Kling a8516681b7 Kernel: Tidy up TmpFS and TmpFSInode construction
- Use KResultOr<NonnullRefPtr<T>>
- Propagate errors
- Use TRY() at call sites
2021-09-06 02:36:21 +02:00
Andreas Kling a994f11f10 Kernel: Make AddressSpace::add_region() return KResultOr<Region*>
This allows us to use TRY() in a few places.
2021-09-06 02:02:06 +02:00
Andreas Kling 75564b4a5f Kernel: Make kernel region allocators return KResultOr<NOP<Region>>
This expands the reach of error propagation greatly throughout the
kernel. Sadly, it also exposes the fact that we're allocating (and
doing other fallible things) in constructors all over the place.

This patch doesn't attempt to address that of course. That's work for
our future selves.
2021-09-06 01:55:27 +02:00
Andreas Kling f4a9a0d561 Kernel: Make VirtualRangeAllocator return KResultOr<VirtualRange>
This achieves two things:
- The allocator can report more specific errors
- Callers can (and now do) use TRY() :^)
2021-09-06 01:55:27 +02:00
Andreas Kling 98dc08fe56 Kernel: Use KResultOr better in ProcessGroup construction
This allows us to use TRY() more.
2021-09-06 01:55:27 +02:00
Andreas Kling 12d9a6c1fa Kernel: Use TRY() in sys$waitid() 2021-09-05 18:42:32 +02:00
Andreas Kling c076d765c4 Kernel: Simplify sys$inode_watcher_remove_watch() a little bit 2021-09-05 18:41:28 +02:00
Andreas Kling d912bfdf38 Kernel: Use TRY() in sys$create_inode_watcher() and friends 2021-09-05 18:41:01 +02:00
Andreas Kling a9204510a4 Kernel: Make file description lookup return KResultOr
Instead of checking it at every call site (to generate EBADF), we make
file_description(fd) return a KResultOr<NonnullRefPtr<FileDescription>>.

This allows us to wrap all the calls in TRY(). :^)

The only place that got a little bit messier from this is sys$mount(),
and there's a whole bunch of things there in need of cleanup.
2021-09-05 18:36:13 +02:00
Andreas Kling 2d2ea05c97 Kernel: Use TRY() in sys$sethostname() 2021-09-05 18:22:18 +02:00
Andreas Kling 963f847579 Kernel: Use TRY() in sys$mount() 2021-09-05 18:20:57 +02:00
Andreas Kling 3580c5a72e Kernel: Use TRY() in sys$umount() 2021-09-05 18:18:23 +02:00
Andreas Kling 76f2596ce8 Kernel: Use TRY() in sys$set_process_name() 2021-09-05 18:17:06 +02:00
Andreas Kling 53aa01384d Kernel: Use TRY() in sys$set_coredump_metadata() 2021-09-05 18:15:42 +02:00
Andreas Kling bfe4c84541 Kernel: Use TRY() in sys$create_thread() 2021-09-05 18:15:05 +02:00
Andreas Kling 257fa80312 Kernel: Use TRY() in sys$set_thread_name() 2021-09-05 18:15:05 +02:00
Andreas Kling afc5bbd56b Kernel: Use TRY() in sys$write() 2021-09-05 18:15:05 +02:00
Andreas Kling 17933b193a Kernel: Use TRY() in sys$perf_register_string() 2021-09-05 18:15:05 +02:00
Andreas Kling 77b7a44691 Kernel: Use TRY() in sys$recvfd() 2021-09-05 18:15:05 +02:00
Andreas Kling ea911bc125 Kernel: Use TRY() in sys$pledge() 2021-09-05 18:15:05 +02:00
Andreas Kling 95e74d1776 Kernel: Use TRY() even more in sys$mmap() and friends :^) 2021-09-05 18:15:05 +02:00
Andreas Kling cf2c04eb13 Kernel: Use TRY() in sys$dbgputstr() 2021-09-05 18:15:05 +02:00
Andreas Kling 6dddd500bf Kernel: Use TRY() in sys$map_time_page() 2021-09-05 18:15:05 +02:00
Andreas Kling d53c60fd9f Kernel: Use TRY() in sys$setkeymap() 2021-09-05 18:15:05 +02:00
Andreas Kling 1f475f7bbc Kernel: Use TRY() in sys$realpath() 2021-09-05 18:15:05 +02:00
Andreas Kling 4ea3dc77f0 Kernel: Use TRY() in sys$statvfs() 2021-09-05 18:15:05 +02:00
Andreas Kling 7efa742a39 Kernel: Use TRY() in sys$stat() 2021-09-05 17:58:08 +02:00
Andreas Kling de2c1bc5c3 Kernel: Use TRY() in sys$unlink() 2021-09-05 17:56:40 +02:00
Andreas Kling 4e4b7c272c Kernel: Use TRY() in sys$readlink() 2021-09-05 17:55:43 +02:00
Andreas Kling e0cf9152ca Kernel: Use TRY() in sys$rmdir() 2021-09-05 17:54:33 +02:00
Andreas Kling 780737de7a Kernel: Use TRY() in sys$rename() 2021-09-05 17:53:59 +02:00
Andreas Kling c5d046c23a Kernel: Use TRY() in sys$utime() 2021-09-05 17:53:28 +02:00
Andreas Kling 789db813d3 Kernel: Use copy_typed_from_user<T> for fetching syscall parameters 2021-09-05 17:51:37 +02:00
Andreas Kling 48a0b31c47 Kernel: Make copy_{from,to}_user() return KResult and use TRY()
This makes EFAULT propagation flow much more naturally. :^)
2021-09-05 17:38:37 +02:00
Andreas Kling 9903f5c6ef Kernel: Use TRY() in sys$read() and sys$readv() 2021-09-05 16:25:40 +02:00
Andreas Kling c1e18befe8 Kernel: Reorder sys$pipe() to fail more nicely
Try to do both FD allocations up front instead of interleaved between
assigning them to the descriptor table. This prevents us from failing
in the middle of setting up the pipes.
2021-09-05 16:25:40 +02:00
Andreas Kling 4483110990 Kernel: Use TRY() in sys$open() 2021-09-05 16:25:40 +02:00
Andreas Kling b01e4b171d Kernel: Use TRY() in sys$mknod() 2021-09-05 16:25:40 +02:00
Andreas Kling 2658ab4a80 Kernel: Use TRY() in sys$mkdir() 2021-09-05 16:25:40 +02:00
Andreas Kling e899ea459c Kernel: Use TRY() in sys$lseek() 2021-09-05 16:25:40 +02:00
Andreas Kling f605cd04b6 Kernel: Use TRY() in sys$fork()
There's a lot of work to do on improving error propagation in the
fork system call. This just scratches the surface.
2021-09-05 16:25:40 +02:00
Andreas Kling 29bafec43b Kernel: Use is_error() in sys$fcntl() 2021-09-05 16:25:40 +02:00
Andreas Kling c767e20f91 Kernel: Use TRY() in sys$chown() 2021-09-05 16:25:40 +02:00
Andreas Kling 24e8ad5ade Kernel: Use TRY() in sys$chmod() 2021-09-05 16:25:40 +02:00
Andreas Kling aa4d5817af Kernel: Use TRY() in sys$ptrace() 2021-09-05 16:25:40 +02:00
Andreas Kling 4a2b0f6bec Kernel: Use TRY() in sys$access() 2021-09-05 16:25:40 +02:00
Andreas Kling f8fba5f017 Kernel: Use TRY() in sys$socket() and friends 2021-09-05 16:25:40 +02:00
Andreas Kling 83fed5b2de Kernel: Tidy up Memory::AddressSpace construction
- Return KResultOr<T> in places
- Propagate errors
- Use TRY()
2021-09-05 15:13:20 +02:00
Andreas Kling f2f5df793a Kernel: Use TRY() in sys$chdir() 2021-09-05 14:41:13 +02:00
Andreas Kling 8cd4879946 Kernel: Use TRY() in sys$link() and sys$symlink() 2021-09-05 14:38:28 +02:00
Andreas Kling c902b3cb0d Kernel: Use TRY() in sys$anon_create() 2021-09-05 14:36:40 +02:00
Andreas Kling 4012099338 Kernel: Tidy up AnonymousFile construction a bit
- Rename create() => try_create()
- Use adopt_nonnull_ref_or_enomem()
2021-09-05 14:33:25 +02:00
Andreas Kling 9a1fdb523f Kernel: Use TRY() in sys$unveil() 2021-09-05 14:30:08 +02:00
Andreas Kling 36efecf3c3 Kernel: Use try in sys$mmap() and friends :^) 2021-09-05 14:27:41 +02:00
Andreas Kling 6bf901b414 Kernel: Use TRY() in sys$execve()
There are more opportunities to use TRY() here, but it will require
improvements to error propagation first.
2021-09-05 14:20:03 +02:00
Andreas Kling 68a6d4c30a Kernel: Tidy up InodeWatcher construction
- Rename create() => try_create()
- Use adopt_nonnull_ref_or_enomem()
2021-09-05 01:10:56 +02:00
Andreas Kling ac85fdeb1c Kernel: Handle ProcessGroup allocation failures better
- Rename create* => try_create*
- Don't null out existing process group on allocation failure
2021-09-04 23:11:04 +02:00
Andreas Kling 12f820eb08 Kernel: Make Process::try_create() propagate errors better 2021-09-04 23:11:04 +02:00
Andreas Kling 5e2e17c38c Kernel: Rename Process::create() => try_create() 2021-09-04 23:11:03 +02:00
Daniel Bertalan d7b6cc6421 Everywhere: Prevent risky implicit casts of (Nonnull)RefPtr
Our existing implementation did not check the element type of the other
pointer in the constructors and move assignment operators. This meant
that some operations that would require explicit casting on raw pointers
were done implicitly, such as:
- downcasting a base class to a derived class (e.g. `Kernel::Inode` =>
  `Kernel::ProcFSDirectoryInode` in Kernel/ProcFS.cpp),
- casting to an unrelated type (e.g. `Promise<bool>` => `Promise<Empty>`
  in LibIMAP/Client.cpp)

This, of course, allows gross violations of the type system, and makes
the need to type-check less obvious before downcasting. Luckily, while
adding the `static_ptr_cast`s, only two truly incorrect usages were
found; in the other instances, our casts just needed to be made
explicit.
2021-09-03 23:20:23 +02:00
Brian Gianforcaro 668c429900 Kernel: Convert UserOrKernelBuffer callbacks to use AK::Bytes 2021-09-01 18:06:14 +02:00
Brian Gianforcaro f3baa5d8c9 Kernel: Convert random bytes interface to use AK::Bytes 2021-09-01 18:06:14 +02:00
Andrew Kaster fcdd7aa990 Kernel: Only unlock Mutex once in execve when PT_TRACE_ME is enabled
Fixes a regression introduced in 70518e6. Fixes #9704.
2021-09-01 13:36:26 +02:00
Andreas Kling 5046a1fe38 Kernel: Ignore zero-sized PT_LOAD headers when loading ELF images 2021-08-31 16:46:16 +02:00
Andreas Kling 68bf6db673 Kernel: Rename Spinlock::is_owned_by_current_thread()
...to is_owned_by_current_processor(). As Tom pointed out, this is
much more accurate. :^)
2021-08-29 22:19:42 +02:00
Andreas Kling 0b4671add7 Kernel: {Mutex,Spinlock}::own_lock() => is_locked_by_current_thread()
Rename these API's to make it more clear what they are checking.
2021-08-29 12:53:11 +02:00
Andreas Kling 48a1a3c0ce Kernel: Rename LocalSocket::create_connected_pair() => try_*() 2021-08-29 01:33:15 +02:00
Andreas Kling ae197deb6b Kernel: Strongly typed user & group ID's
Prior to this change, both uid_t and gid_t were typedef'ed to `u32`.
This made it easy to use them interchangeably. Let's not allow that.

This patch adds UserID and GroupID using the AK::DistinctNumeric
mechanism we've already been employing for pid_t/ProcessID.
2021-08-29 01:09:19 +02:00
Andreas Kling 59335bd8ea Kernel: Rename FileDescription::create() => try_create() 2021-08-29 01:09:19 +02:00
Andrew Kaster 54161bf5b4 Kernel: Acquire reference to waitee before trying to block in sys$waitid
Previously, we would try to acquire a reference to the all processes
lock or other contended resources while holding both the scheduler lock
and the thread's blocker lock. This could lead to a deadlock if we
actually have to block on those other resources.
2021-08-28 20:53:38 +02:00
Andrew Kaster dea62fe93c Kernel: Guard the all processes list with a Spinlock rather than a Mutex
There are callers of processes().with or processes().for_each that
require interrupts to be disabled. Taking a Mutexe with interrupts
disabled is a recipe for deadlock, so convert this to a Spinlock.
2021-08-28 20:53:38 +02:00
Andrew Kaster 70518e69f4 Kernel: Unlock ptrace lock before entering a critical section in execve
While it might not be as bad to release a mutex while interrupts are
disabled as it is to acquire one, we don't want to mess with that.
2021-08-28 20:53:38 +02:00
Andrew Kaster 8e70b85215 Kernel: Don't disable interrupts in validate_inode_mmap_prot
There's no need to disable interrupts when trying to access an inode's
shared vmobject. Additionally, Inode::shared_vmobject() acquires a Mutex
which is a recipe for deadlock if acquired with interrupts disabled.
2021-08-28 20:53:38 +02:00
Brian Gianforcaro 485f51690d Kernel: Always observe the return value of Region::map and remap
We have seen cases where the map fails, but we return the region
to the caller, causing them to page fault later on when they touch
the region.

The fix is to always observe the return code of map/remap.
2021-08-25 00:18:42 +02:00
Andreas Kling 6c16bedd69 Kernel: Remove unnecessary FutexQueue::did_remove()
This was only ever called immediately after FutexQueue::try_remove()
to VERIFY() that the state looks exactly like it should after returning
from try_remove().
2021-08-23 00:02:09 +02:00
Andreas Kling 85546af417 Kernel: Rename Thread::BlockCondition to BlockerSet
This class represents a set of Thread::Blocker objects attached to
something that those blockers are waiting on.
2021-08-23 00:02:09 +02:00
Andreas Kling bcd2025311 Everywhere: Core dump => Coredump
We all know what a coredump is, and it feels more natural to refer to
it as a coredump (most code already does), so let's be consistent.
2021-08-23 00:02:09 +02:00
Andreas Kling 1b9916439f Kernel: Make Processor::platform_string() return StringView 2021-08-23 00:02:09 +02:00
Andreas Kling c922a7da09 Kernel: Rename ScopedSpinlock => SpinlockLocker
This matches MutexLocker, and doesn't sound like it's a lock itself.
2021-08-22 03:34:10 +02:00
Andreas Kling 55adace359 Kernel: Rename SpinLock => Spinlock 2021-08-22 03:34:10 +02:00
Idan Horowitz cf271183b4 Kernel: Make Process::current() return a Process& instead of Process*
This has several benefits:
1) We no longer just blindly derefence a null pointer in various places
2) We will get nicer runtime error messages if the current process does
turn out to be null in the call location
3) GCC no longer complains about possible nullptr dereferences when
compiling without KUBSAN
2021-08-19 23:49:53 +02:00
Andreas Kling 961f727448 Kernel: Consolidate a bunch of i386/x86_64 code paths
Add some arch-specific getters and setters that allow us to merge blocks
that were previously specific to either ARCH(I386) or ARCH(X86_64).
2021-08-19 23:22:02 +02:00
Andreas Kling 4226b662cd Kernel+Userland: Remove global futexes
We only ever use private futexes, so it doesn't make sense to carry
around all the complexity required for global (cross-process) futexes.
2021-08-17 01:21:47 +02:00
sin-ack 0a18425cbb Kernel: Make Memory::Region allocation functions return KResultOr
This makes for some nicer handling of errors compared to checking an
OwnPtr for null state.
2021-08-15 15:41:02 +02:00
sin-ack 4bfd6e41b9 Kernel: Make Kernel::VMObject allocation functions return KResultOr
This makes for nicer handling of errors compared to checking whether a
RefPtr is null. Additionally, this will give way to return different
types of errors in the future.
2021-08-15 15:41:02 +02:00
Andreas Kling 1b739a72c2 Kernel+Userland: Remove chroot functionality
We are not using this for anything and it's just been sitting there
gathering dust for well over a year, so let's stop carrying all this
complexity around for no good reason.
2021-08-15 12:44:35 +02:00
Andreas Kling 0f6f863382 Kernel: Convert remaining users of copy_string_from_user()
This patch replaces the remaining users of this API with the new
try_copy_kstring_from_user() instead. Note that we still convert to a
String for continued processing, and I've added FIXME about continuing
work on using KString all the way.
2021-08-15 12:44:35 +02:00
sin-ack 748938ea59 Kernel: Handle allocation failure in ProcFS and friends
There were many places in which allocation failure was noticed but
ignored.
2021-08-15 02:27:13 +02:00
Andreas Kling 7676edfb9b Kernel: Stop allowing implicit conversion from KResult to int
This patch removes KResult::operator int() and deals with the fallout.
This forces a lot of code to be more explicit in its handling of errors,
greatly improving readability.
2021-08-14 15:19:00 +02:00
Andreas Kling d30d776ca4 Kernel: Make FileSystem::initialize() return KResult
This forced me to also come up with error codes for a bunch of
situations where we'd previously just panic the kernel.
2021-08-14 15:19:00 +02:00
Brian Gianforcaro 296452a981 Kernel: Make cloning of FileDescriptions OOM safe 2021-08-13 11:09:25 +02:00
Brian Gianforcaro ed6d842f85 Kernel: Fix OOB read in sys$dbgputstr(..) during fuzzing
The implementation uses try_copy_kstring_from_user to allocate a kernel
string using, but does not use the length of the resulting string.
The size parameter to the syscall is untrusted, as try copy kstring will
attempt to perform a `safe_strlen(..)` on the user mode string and use
that value for the allocated length of the KString instead. The bug is
that we are printing the kstring, but with the usermode size argument.

During fuzzing this resulted in us walking off the end of the allocated
KString buffer printing garbage (or any kernel data!), until we stumbled
in to the KSym region and hit a fatal page fault.

This is technically a kernel information disclosure, but (un)fortunately
the disclosure only happens to the Bochs debug port, and or the serial
port if serial debugging is enabled. As far as I can tell it's not
actually possible for an untrusted attacker to use this to do something
nefarious, as they would need access to the host. If they have host
access then they can already do much worse things :^).
2021-08-13 11:08:11 +02:00
Brian Gianforcaro 40a942d28b Kernel: Remove char* versions of path argument / kstring copy methods
The only two paths for copying strings in the kernel should be going
through the existing Userspace<char const*>, or StringArgument methods.

Lets enforce this by removing the option for using the raw cstring APIs
that were previously available.
2021-08-13 11:08:11 +02:00
Brian Gianforcaro 5121e58d4a Kernel: Fix sys$dbgputstr(...) to take a char* instead of u8*
We always attempt to print this as a string, and it's defined as such in
LibC, so fix the signature to match.
2021-08-13 11:08:11 +02:00
Liav A 01b79910b3 Kernel/Process: Move protected values to the end of the object
The compiler can re-order the structure (class) members if that's
necessary, so if we make Process to inherit from ProcFSExposedComponent,
even if the declaration is to inherit first from ProcessBase, then from
ProcFSExposedComponent and last from Weakable<Process>, the members of
class ProcFSExposedComponent (including the Ref-counted parts) are the
first members of the Process class.

This problem made it impossible to safely use the current toggling
method with the write-protection bit on the ProcessBase members, so
instead of inheriting from it, we make its members the last ones in the
Process class so we can safely locate and modify the corresponding page
write protection bit of these values.

We make sure that the Process class doesn't expand beyond 8192 bytes and
the protected values are always aligned on a page boundary.
2021-08-12 20:57:32 +02:00
Andreas Kling a29788e58a Kernel: Don't record sys$perf_event() if profiling is not enabled
If you want to record perf events, just enable profiling. This allows us
to add random perf events to programs without littering the file system
with perfcore files.
2021-08-12 00:03:40 +02:00
Andreas Kling 1e90a3a542 Kernel: Make sys$perf_register_string() generate the string ID's
Making userspace provide a global string ID was silly, and made the API
extremely difficult to use correctly in a global profiling context.

Instead, simply make the kernel do the string ID allocation for us.
This also allows us to convert the string storage to a Vector in the
kernel (and an array in the JSON profile data.)
2021-08-12 00:03:39 +02:00
Andreas Kling 4657c79143 Kernel+LibC: Add sys$perf_register_string()
This syscall allows userspace to register a keyed string that appears in
a new "strings" JSON object in profile output.

This will be used to add custom strings to profile signposts. :^)
2021-08-12 00:03:39 +02:00
Gunnar Beutner 3322efd4cd Kernel: Fix kernel panic when blocking on the process' big lock
Another thread might end up marking the blocking thread as holding
the lock before it gets a chance to finish invoking the scheduler.
2021-08-10 22:33:50 +02:00
Andreas Kling fdfc66db61 Kernel+LibC: Allow clock_gettime() to run without syscalls
This patch adds a vDSO-like mechanism for exposing the current time as
an array of per-clock-source timestamps.

LibC's clock_gettime() calls sys$map_time_page() to map the kernel's
"time page" into the process address space (at a random address, ofc.)
This is only done on first call, and from then on the timestamps are
fetched from the time page.

This first patch only adds support for CLOCK_REALTIME, but eventually
we should be able to support all clock sources this way and get rid of
sys$clock_gettime() in the kernel entirely. :^)

Accesses are synchronized using two atomic integers that are incremented
at the start and finish of the kernel's time page update cycle.
2021-08-10 19:21:16 +02:00
Andreas Kling fa64ab26a4 Kernel+UserspaceEmulator: Remove unused sys$gettimeofday()
Now that LibC uses clock_gettime() to implement gettimeofday(), we can
get rid of this entire syscall. :^)
2021-08-10 13:01:39 +02:00
Andreas Kling 0a02496f04 Kernel/SMP: Change critical sections to not disable interrupts
Leave interrupts enabled so that we can still process IRQs. Critical
sections should only prevent preemption by another thread.

Co-authored-by: Tom <tomut@yahoo.com>
2021-08-10 02:49:37 +02:00
Andreas Kling 9babb92a4b Kernel/SMP: Make entering/leaving critical sections multi-processor safe
By making these functions static we close a window where we could get
preempted after calling Processor::current() and move to another
processor.

Co-authored-by: Tom <tomut@yahoo.com>
2021-08-10 02:49:37 +02:00
Andreas Kling c94c15d45c Everywhere: Replace AK::Singleton => Singleton 2021-08-08 00:03:45 +02:00
Andreas Kling 15d033b486 Kernel: Remove unused Process pointer in Memory::AddressSpace
Nobody was using the back-pointer to the process, so let's lose it.
2021-08-08 00:03:45 +02:00
Idan Horowitz 9d21c79671 Kernel: Disable big process lock for sys$sync
This syscall doesn't touch any intra-process shared resources and only
calls VirtualFileSystem::sync, which is self-locking.
2021-08-07 15:30:26 +02:00
sin-ack 0d468f2282 Kernel: Implement a ISO 9660 filesystem reader :^)
This commit implements the ISO 9660 filesystem as specified in ECMA 119.
Currently, it only supports the base specification and Joliet or Rock
Ridge support is not present. The filesystem will normalize all
filenames to be lowercase (same as Linux).

The filesystem can be mounted directly from a file. Loop devices are
currently not supported by SerenityOS.

Special thanks to Lubrsi for testing on real hardware and providing
profiling help.

Co-Authored-By: Luke <luke.wilde@live.co.uk>
2021-08-07 15:21:58 +02:00
Andreas Kling 5acb7e4eba Kernel: Remove outdated FIXME about ProcessHandle
ProcessHandle hasn't been a thing since Process became ref-counted.
2021-08-07 12:29:26 +02:00
Jean-Baptiste Boric 08891e82a5 Kernel: Migrate process list locking to ProtectedValue
The existing recursive spinlock is repurposed for profiling only, as it
was shared with the process list.
2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric 8554b66d09 Kernel: Make process list a singleton 2021-08-07 11:48:00 +02:00
Jean-Baptiste Boric 626b99ce1c Kernel: Migrate hostname locking to ProtectedValue 2021-08-07 11:48:00 +02:00
Andreas Kling f770b9d430 Kernel: Fix bad search-and-replace renames
Oops, I didn't mean to change every *Range* to *VirtualRange*!
2021-08-07 00:39:06 +02:00
Idan Horowitz ad419a669d Kernel: Disable big process lock for sys$sysconf
This syscall only reads constant kernel globals, and as such does not
need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz efeb01e35f Kernel: Disable big process lock for sys$get_stack_bounds
This syscall only reads from the shared m_space field, but that field
is only over written to by Process::attach_resources, before the
process was initialized (aka, before syscalls can happen), by
Process::finalize which is only called after all the process' threads
have exited (aka, syscalls can not happen anymore), and by
Process::do_exec which calls all other syscall-capable threads before
doing so. Space's find_region_containing already holds its own lock,
and as such there's no need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz d40038a04f Kernel: Disable big process lock for sys$gettimeofday
This syscall doesn't touch any intra-process shared resources and only
accesses the time via the atomic TimeManagement::now so there's no need
to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz 3ba2449058 Kernel: Disable big process lock for sys$clock_nanosleep
This syscall doesn't touch any intra-process shared resources and only
accesses the time via the atomic TimeManagement::current_time so there's
no need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz fbd848e6eb Kernel: Disable big process lock for sys$clock_gettime()
This syscall doesn't touch any intra-process shared resources and
reads the time via the atomic TimeManagement::current_time, so it
doesn't need to hold any lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz 1a08694dfc Kernel: Disable big process lock for sys$getkeymap
This syscall only reads non process-related global values, and as such
doesn't need to hold the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz 48325e2959 Kernel: Disable big process lock for sys$getrandom
This syscall doesn't touch any intra-process shared resources and
already holds the global kernel RNG lock so there's no reason to hold
the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz b1f4f6ee15 Kernel: Disable big process lock for sys$dbgputch
This syscall doesn't touch any intra-process shared resources and
already holds the global logging lock so there's no reason to hold
the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz c7ad4c6c32 Kernel: Disable big process lock for sys$dbgputstr
This syscall doesn't touch any intra-process shared resources and
already holds the global logging lock so there's no reason to hold
the big lock.
2021-08-06 23:36:12 +02:00
Idan Horowitz 00818b8447 Kernel: Disable big process lock for sys$dump_backtrace()
This syscall only dumps the current thread's backtrace and as such
doesn't touch any shared intra-process resources.
2021-08-06 23:36:12 +02:00
Idan Horowitz da0b7d1737 Kernel: Disable big process lock for sys$beep()
The PCSpeaker is global and not locked anyways, so there's no need for
mutual exclusion between threads in the same process.
2021-08-06 23:36:12 +02:00
Idan Horowitz c3f668a758 Kernel: Make Process's m_promises & m_execpromises fields atomic
This is essentially free on x86 and allows us to not hold the big
process lock just to check the required promises for a syscall.
2021-08-06 23:36:12 +02:00
Andreas Kling 2cd8b21974 Kernel: Add convenience values to the Memory::Region::Access enum
Instead of `Memory::Region::Access::Read | Memory::Region::AccessWrite`
you can now say `Memory::Region::Access::ReadWrite`.
2021-08-06 22:25:00 +02:00
Andreas Kling 47bdd7c3a0 Kernel: Rename a very long enum to ShouldDeallocateVirtualRange
ShouldDeallocateVirtualMemoryVirtualRange was a bit on the long side.
2021-08-06 21:45:05 +02:00
Andreas Kling 208147c77c Kernel: Rename Process::space() => Process::address_space()
We commonly talk about "a process's address space" so let's nudge the
code towards matching how we talk about it. :^)
2021-08-06 14:05:58 +02:00
Andreas Kling b7476d7a1b Kernel: Rename Memory::Space => Memory::AddressSpace 2021-08-06 14:05:58 +02:00
Andreas Kling cd5faf4e42 Kernel: Rename Range => VirtualRange
...and also RangeAllocator => VirtualRangeAllocator.

This clarifies that the ranges we're dealing with are *virtual* memory
ranges and not anything else.
2021-08-06 14:05:58 +02:00
Andreas Kling 93d98d4976 Kernel: Move Kernel/Memory/ code into Kernel::Memory namespace 2021-08-06 14:05:58 +02:00
Andreas Kling a1d7ebf85a Kernel: Rename Kernel/VM/ to Kernel/Memory/
This directory isn't just about virtual memory, it's about all kinds
of memory management.
2021-08-06 14:05:58 +02:00
Andreas Kling 3377cc74df Kernel: Use try_copy_kstring_from_user() in sys$mount() 2021-08-06 00:37:47 +02:00
Andreas Kling 33adc3a42d Kernel: Store coredump metadata properties as KStrings
This patch also replaces the HashMap previously used to store coredump
properties with a plain AK::Array.
2021-08-06 00:37:47 +02:00
Andreas Kling 95669fa861 Kernel: Use try_copy_kstring_from_user() in sys$link() 2021-08-06 00:37:47 +02:00
Andreas Kling d5d8fba579 Kernel: Store Thread name as a KString 2021-08-06 00:37:47 +02:00
Brian Gianforcaro 187c086270 Kernel: Handle OOM from KBuffer creation in sys$module() 2021-08-03 18:54:23 +02:00
Brian Gianforcaro 8d3b819daf Kernel: Handle OOM from DoubleBuffer creation in FIFO creation 2021-08-03 18:54:23 +02:00
Brian Gianforcaro fc91eb365d Kernel: Do not cancel stale timers when servicing sys$alarm
The sys$alarm() syscall has logic to cache a m_alarm_timer to avoid
allocating a new timer for every call to alarm. Unfortunately that
logic was broken, and there were conditions in which we could have
a timer allocated, but it was no longer on the timer queue, and we
would attempt to cancel that timer again resulting in an infinite
loop waiting for the timers callback to fire.

To fix this, we need to track if a timer is currently in use or not,
allowing us to avoid attempting to cancel inactive timers.

Luke and Tom did the initial investigation, I just happened to have
time to write a repro and attempt a fix, so I'm adding them as the
as co-authors of this commit.

Co-authored-by: Luke <luke.wilde@live.co.uk>
Co-authored-by: Tom <tomut@yahoo.com>
2021-08-03 18:44:01 +02:00
Brian Gianforcaro 0fc853f5ba Kernel: Remove ThreadTracer.h include from Process.h / Thread.h
This isn't needed for Process / Thread as they only reference it
by pointer and it's already part of Kernel/Forward.h. So just include
it where the implementation needs to call it.
2021-08-01 08:10:16 +02:00
Brian Gianforcaro ed996fcced Kernel: Remove unused header includes 2021-08-01 08:10:16 +02:00
Andreas Kling b807f1c3fc Kernel: Fail madvise() volatile change with EINVAL for non-purgeable mem
AnonymousVMObject::set_volatile() assumes that nobody ever calls it on
non-purgeable objects, so let's make sure we don't do that.

Also return EINVAL instead of EPERM for non-anonymous VM objects so the
error codes match.
2021-07-28 20:42:49 +02:00
Brian Gianforcaro ddc950ce42 Kernel: Avoid file descriptor leak in Process::sys$socketpair on error
Previously it was possible to leak the file descriptor if we error out
after allocating the first descriptor. Now we perform both fd
allocations back to back so we can handle the potential error when
processing the second fd allocation.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro 4b2651ddab Kernel: Track allocated FileDescriptionAndFlag elements in each Process
The way the Process::FileDescriptions::allocate() API works today means
that two callers who allocate back to back without associating a
FileDescription with the allocated FD, will receive the same FD and thus
one will stomp over the other.

Naively tracking which FileDescriptions are allocated and moving onto
the next would introduce other bugs however, as now if you "allocate"
a fd and then return early further down the control flow of the syscall
you would leak that fd.

This change modifies this behavior by tracking which descriptions are
allocated and then having an RAII type to "deallocate" the fd if the
association is not setup the end of it's scope.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro ba03b6ad02 Kernel: Make Process::FileDescriptions::allocate return KResultOr<int>
Modernize more error checking by utilizing KResultOr.
2021-07-28 19:07:00 +02:00
Brian Gianforcaro d2cee9cbf6 Kernel: Remove unused fd allocation from Process::sys$connect(..) 2021-07-28 19:07:00 +02:00
Andreas Kling a085168c52 Kernel: Rename Space::create => Space::try_create() 2021-07-27 14:54:35 +02:00
Andreas Kling 4648bcd3d4 Kernel: Remove unnecessary weak pointer from Region to owning Process
This was previously used for a single debug logging statement during
memory purging. There are no remaining users of this weak pointer,
so let's get rid of it.
2021-07-25 17:28:06 +02:00
Andreas Kling 09bc4cee15 Kernel: Remove unused madvise(MADV_GET_VOLATILE)
This was used to query the volatile state of a memory region, however
nothing ever actually used it.
2021-07-25 17:28:06 +02:00
Andreas Kling 2d1a651e0a Kernel: Make purgeable memory a VMObject level concept (again)
This patch changes the semantics of purgeable memory.

- AnonymousVMObject now has a "purgeable" flag. It can only be set when
  constructing the object. (Previously, all anonymous memory was
  effectively purgeable.)

- AnonymousVMObject now has a "volatile" flag. It covers the entire
  range of physical pages. (Previously, we tracked ranges of volatile
  pages, effectively making it a page-level concept.)

- Non-volatile objects maintain a physical page reservation via the
  committed pages mechanism, to ensure full coverage for page faults.

- When an object is made volatile, it relinquishes any unused committed
  pages immediately. If later made non-volatile again, we then attempt
  to make a new committed pages reservation. If this fails, we return
  ENOMEM to userspace.

mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
2021-07-25 17:28:05 +02:00
Brian Gianforcaro 9d8482c3e8 Kernel: Use StringView when parsing pledges in sys$pledge(..)
This ensures no potential allocation as in some cases the pledge char*
could be promoted to AK::String by the compiler to execute the
comparison.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro e4b86aa5d8 Kernel: Fix bug where we half apply pledges in sys$pledge(..)
This bug manifests it self when the caller to sys$pledge() passes valid
promises, but invalid execpromises. The code would apply the promises
and then return an error for the execpromises. This leaves the user in
a confusing state, as the promises were silently applied, but we return
an error suggesting the operation has failed.

Avoid this situation by tweaking the implementation to only apply the
promises / execpromises after all validation has occurred.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro 36ff717c54 Kernel: Migrate sys$pledge to use the KString API
This avoids potential unhandled OOM that's possible with the old
copy_string_from_user API.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro baec9e2d2d Kernel: Migrate sys$unveil to use the KString API
This avoids potential unhandled OOM that's possible with the old
copy_string_from_user API.
2021-07-23 19:02:25 +02:00
Brian Gianforcaro 2e7728bb05 Kernel: Use StringView literals for fs_type match in sys$mount(..) 2021-07-23 19:02:25 +02:00
Andreas Kling 082ed6f417 Kernel: Simplify VMObject locking & page fault handlers
This patch greatly simplifies VMObject locking by doing two things:

1. Giving VMObject an IntrusiveList of all its mapping Region objects.
2. Removing VMObject::m_paging_lock in favor of VMObject::m_lock

Before (1), VMObject::for_each_region() was forced to acquire the
global MM lock (since it worked by walking MemoryManager's list of
all regions and checking for regions that pointed to itself.)

With each VMObject having its own list of Regions, VMObject's own
m_lock is all we need.

Before (2), page fault handlers used a separate mutex for preventing
overlapping work. This design required multiple temporary unlocks
and was generally extremely hard to reason about.

Instead, page fault handlers now use VMObject's own m_lock as well.
2021-07-23 03:24:44 +02:00
Gunnar Beutner 31f30e732a Everywhere: Prefix hexadecimal numbers with 0x
Depending on the values it might be difficult to figure out whether a
value is decimal or hexadecimal. So let's make this more obvious. Also
this allows copying and pasting those numbers into GNOME calculator and
probably also other apps which auto-detect the base.
2021-07-22 08:57:01 +02:00
Peter Elliott 3fa2816642 Kernel+LibC: Implement fcntl(2) advisory locks
Advisory locks don't actually prevent other processes from writing to
the file, but they do prevent other processes looking to acquire and
advisory lock on the file.

This implementation currently only adds non-blocking locks, which are
all I need for now.
2021-07-20 17:44:30 +04:30
Brian Gianforcaro 8f01a8b741 Kernel: Disable big process lock for sys$yield() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro 5c10fb4007 Kernel: Disable big process lock for sys$gettid()
This syscall reads a read only value from the current thread, and hence
has no need for the big process lock.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro 638598b15d Kernel: Disable big process lock for sys$getpid() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro bfd4635274 Kernel: Disable big process lock for sys$uname() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro 10ce896d4f Kernel: Disable big process lock in sys$gethostname() sys$sethostname() 2021-07-20 03:21:14 +02:00
Brian Gianforcaro 9201a06027 Kernel: Annotate all syscalls with VERIFY_PROCESS_BIG_LOCK_ACQUIRED
Before we start disabling acquisition of the big process lock for
specific syscalls, make sure to document and assert that all the
lock is held during all syscalls.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro 308396bca1 Kernel: No lock validate_user_stack variant, switch to Space as argument
The entire process is not needed, just require the user to pass in the
Space. Also provide no_lock variant to use when you already have the
VM/Space lock acquired, to avoid unnecessary recursive spinlock
acquisitions.
2021-07-20 03:21:14 +02:00
Brian Gianforcaro 1cffecbe8d Kernel: Push ARCH specific ifdef's down into RegisterState functions
The non CPU specific code of the kernel shouldn't need to deal with
architecture specific registers, and should instead deal with an
abstract view of the machine. This allows us to remove a variety of
architecture specific ifdefs and helps keep the code slightly more
portable.

We do this by exposing the abstract representation of instruction
pointer, stack pointer, base pointer, return register, etc on the
RegisterState struct.
2021-07-19 08:46:55 +02:00
Andreas Kling 9457d83986 Kernel: Rename Locker => MutexLocker 2021-07-18 01:53:04 +02:00
Andreas Kling cee9528168 Kernel: Rename Lock to Mutex
Let's be explicit about what kind of lock this is meant to be.
2021-07-17 21:10:32 +02:00
Brian Gianforcaro c0987453e6 Kernel: Remove double RedBlackTree lookup in VM/Space region removal
We should never request a regions removal that we don't currently
own. We currently assert this everywhere else by all callers.

Instead lets just push the assert down into the RedBlackTree removal
and assume that we will always successfully remove the region.
2021-07-17 16:22:59 +02:00
Tom 704e1c2e3d Kernel: Rename functions to be less confusing
Thread::yield_and_release_relock_big_lock releases the big lock, yields
and then relocks the big lock.

Thread::yield_assuming_not_holding_big_lock yields assuming the big
lock is not being held.
2021-07-16 20:30:04 +02:00
Timothy 9715311837 AK+Kernel: Implement and use EnumBits has_any_flag()
This duplicates the old functionality of has_flag and will return true
when any flags present in the mask are also in the value.
2021-07-16 11:49:50 +02:00
Idan Horowitz be475cd6a8 Kernel: Handle OOM when adding memory regions to Spaces :^) 2021-07-15 00:49:41 +02:00
Andreas Kling 859e5741ff Kernel: Fix Process use-after-free in Thread finalization
We leak a ref() onto every user process when constructing them,
either via Process::create_user_process(), or via Process::sys$fork().

This ref() is balanced by a corresponding unref() in
Thread::WaitBlockCondition::finalize().

Since kernel processes don't have a leaked ref() on them, this led to
an extra Process::unref() on kernel processes during finalization.
This happened during every boot, with the `init_stage2` process.

Found by turning off kfree() scrubbing. :^)
2021-07-14 22:36:29 +02:00
Brian Gianforcaro 84b4b9447d Kernel: Move new process registration out of Space spinlock scope
There appears to be no reason why the process registration needs
to happen under the space spin lock. As the first thread is not started
yet it should be completely uncontested, but it's still bad practice.
2021-07-12 10:20:21 +02:00
Andreas Kling cd7a49b90d Kernel: Make Region splitting OOM-safe
Region allocation failures during splitting are now propagated all the
way out to where we can return ENOMEM for them.
2021-07-11 18:52:27 +02:00
Andreas Kling 88d490566f Kernel: Rename various *VMObject::create*() => try_create()
try_*() implies that it can fail (and they all return RefPtr with
nullptr signalling failure.)
2021-07-11 17:55:29 +02:00
Andreas Kling af8c74a328 Kernel: Make SharedInodeVMObject allocation OOM-safe 2021-07-11 17:52:07 +02:00
Andreas Kling 07c4c89297 Kernel: Make VirtualFileSystem::sync() static 2021-07-11 00:26:17 +02:00
Andreas Kling 0d39bd04d3 Kernel: Rename VFS => VirtualFileSystem 2021-07-11 00:25:24 +02:00
Andreas Kling d53d9d3677 Kernel: Rename FS => FileSystem
This matches our common naming style better.
2021-07-11 00:20:38 +02:00
Ralf Donau 8ee3a5e09e Kernel: Logic fix in the pledge syscall
Pledge should check m_has_promises. Calling pledge("", nullptr)
does not fail on an already pledged process anymore.
2021-07-10 21:59:29 +02:00
Gunnar Beutner 06883ed8a3 Kernel+Userland: Make the stack alignment comply with the System V ABI
The System V ABI for both x86 and x86_64 requires that the stack pointer
is 16-byte aligned on entry. Previously we did not align the stack
pointer properly.

As far as "main" was concerned the stack alignment was correct even
without this patch due to how the C++ _start function and the kernel
interacted, i.e. the kernel misaligned the stack as far as the ABI
was concerned but that misalignment (read: it was properly aligned for
a regular function call - but misaligned in terms of what the ABI
dictates) was actually expected by our _start function.
2021-07-10 01:41:57 +02:00
Ali Mohammad Pur e37f9fa7db LibPthread+Kernel: Add pthread_kill() and the thread_kill syscall 2021-07-09 15:36:50 +02:00
Daniel Bertalan d30dbf47f5 Kernel: Map non-page-aligned text segments correctly
`.text` segments with non-aligned offsets had their lengths applied to
the first page's base address. This meant that in some cases the last
PAGE_SIZE - 1 bytes weren't mapped. Previously, it did not cause any
problems as the GNU ld insists on aligning everything; but that's not
the case with the LLVM toolchain.
2021-07-07 22:26:53 +02:00
Max Wipfli d5722eab36 Kernel: Custody::absolute_path() => try_create_absolute_path()
This converts most users of Custody::absolute_path() to use the new
try_create_absolute_path() API, and return ENOMEM if the KString
allocation fails.
2021-07-07 15:32:17 +02:00
Max Wipfli ee342f5ec3 Kernel: Replace usage of LexicalPath with KLexicalPath
This replaces all uses of LexicalPath in the Kernel with the functions
from KLexicalPath. This also allows the Kernel to stop including
AK::LexicalPath.
2021-07-07 15:32:17 +02:00
Edwin Hoksberg 99328e1038 Kernel+KeyboardSettings: Remove numlock syscall and implement ioctl 2021-07-07 10:44:20 +02:00
Tom 864b50b5c2 Kernel: Do not hold spinlock while touching user mode futex values
The user_atomic_* functions are subject to the same rules as
copy_from/to/user, which may require preemption.
2021-07-07 10:05:55 +02:00
Tom 7593418b77 Kernel: Fix futex race that could lead to thread waiting forever
There is a race condition where we would remove a FutexQueue from
our futex map and in the meanwhile another thread started to queue
itself into that very same futex, leading to that thread to wait
forever as no other wake operation could discover that removed
FutexQueue.

This fixes the problem by:
* Tracking imminent waits, which prevents a new FutexQueue from being
  deleted that a thread will wait on momentarily
* Atomically marking a FutexQueue as removed, which prevents a thread
  from waiting on it before it is actually removed from the futex map.
2021-07-07 10:05:55 +02:00
Andreas Kling 565796ae4e Kernel+LibC: Remove sys$donate()
This was an old SerenityOS-specific syscall for donating the remainder
of the calling thread's time-slice to another thread within the same
process.

Now that Threading::Lock uses a pthread_mutex_t internally, we no
longer need this syscall, which allows us to get rid of a surprising
amount of unnecessary scheduler logic. :^)
2021-07-05 23:30:15 +02:00
ForLoveOfCats ce6658acc1 KeyboardSettings+Kernel: Setting to enable Num Lock on login 2021-07-05 06:19:59 +02:00
Idan Horowitz 301c1a3a58 Everywhere: Fix incorrect usages of AK::Checked
Specifically, explicitly specify the checked type, use the resulting
value instead of doing the same calculation twice, and break down
calculations to discrete operations to ensure no intermediary overflows
are missed.
2021-07-04 20:08:28 +01:00
Gunnar Beutner c51b49a8cb Kernel: Implement TLS support for x86_64 2021-07-04 01:07:28 +02:00
Gunnar Beutner a09e6171a6 Kernel: Don't allow allocate_tls() if the process has multiple threads
We can't safely update the other threads' FS selector. This shouldn't
be a problem in practice because allocate_tls() is only used by the
loader.
2021-07-04 01:07:28 +02:00
Daniel Bertalan b9f30c6f2a Everywhere: Fix some alignment issues
When creating uninitialized storage for variables, we need to make sure
that the alignment is correct. Fixes a KUBSAN failure when running
kernels compiled with Clang.

In `Syscalls/socket.cpp`, we can simply use local variables, as
`sockaddr_un` is a POD type.

Along with moving the `alignas` specifier to the correct member,
`AK::Optional`'s internal buffer has been made non-zeroed by default.
GCC emitted bogus uninitialized memory access warnings, so we now use
`__builtin_launder` to tell the compiler that we know what we are doing.
This might disable some optimizations, but judging by how GCC failed to
notice that the memory's initialization is dependent on `m_has_value`,
I'm not sure that's a bad thing.
2021-07-03 01:56:31 +04:30
Gunnar Beutner 16b9a2d2e1 Kernel+LibPthread: Add support for usermode threads on x86_64 2021-07-01 17:22:22 +02:00
Gunnar Beutner 93c741018e Kernel+LibPthread: Remove m_ prefix for public members 2021-07-01 17:22:22 +02:00
Gunnar Beutner fe2716df21 Kernel: Disable __thread and TLS on x86_64 for now
They're not yet properly supported.
2021-06-30 15:13:30 +02:00
Gunnar Beutner cafccb866c Kernel: Don't start usermode threads on x86_64 for now
Starting usermode threads doesn't currently work on x86_64. With this
stubbed out we can get text mode to boot though.
2021-06-30 15:13:30 +02:00
Max Wipfli 7405536a1a AK+Everywhere: Use mostly StringView in LexicalPath
This changes the m_parts, m_dirname, m_basename, m_title and m_extension
member variables to StringViews onto the m_string String. It also
removes the m_is_absolute member in favour of computing if a path is
absolute in the is_absolute() getter. Due to this, the canonicalize()
method has been completely rewritten.

The parts() getter still returns a Vector<String>, although it is no
longer a const reference as m_parts is no longer a Vector<String>.
Rather, it is constructed from the StringViews in m_parts upon request.
The parts_view() getter has been added, which returns Vector<StringView>
const&. Most previous users of parts() have been changed to use
parts_view(), except where Strings are required.

Due to this change, it's is now no longer allow to create temporary
LexicalPath objects to call the dirname, basename, title, or extension
getters on them because the returned StringViews will point to possible
freed memory.
2021-06-30 11:13:54 +02:00
Max Wipfli fc6d051dfd AK+Everywhere: Add and use static APIs for LexicalPath
The LexicalPath instance methods dirname(), basename(), title() and
extension() will be changed to return StringView const& in a further
commit. Due to this, users creating temporary LexicalPath objects just
to call one of those getters will recieve a StringView const& pointing
to a possible freed buffer.

To avoid this, static methods for those APIs have been added, which will
return a String by value to avoid those problems. All cases where
temporary LexicalPath objects have been used as described above haven
been changed to use the static APIs.
2021-06-30 11:13:54 +02:00
Liav A 7c87891c06 Kernel: Don't copy a Vector<FileDescriptionAndFlags>
Instead of copying a Vector everytime we need to enumerate a Process'
file descriptions, we can just temporarily lock so it won't change.
2021-06-29 20:53:59 +02:00
Liav A 12b6e69150 Kernel: Introduce the new ProcFS design
The new ProcFS design consists of two main parts:
1. The representative ProcFS class, which is derived from the FS class.
The ProcFS and its inodes are much more lean - merely 3 classes to
represent the common type of inodes - regular files, symbolic links and
directories. They're backed by a ProcFSExposedComponent object, which
is responsible for the functional operation behind the scenes.
2. The backend of the ProcFS - the ProcFSComponentsRegistrar class
and all derived classes from the ProcFSExposedComponent class. These
together form the entire backend and handle all the functions you can
expect from the ProcFS.

The ProcFSExposedComponent derived classes split to 3 types in the
manner of lifetime in the kernel:
1. Persistent objects - this category includes all basic objects, like
the root folder, /proc/bus folder, main blob files in the root folders,
etc. These objects are persistent and cannot die ever.
2. Semi-persistent objects - this category includes all PID folders,
and subdirectories to the PID folders. It also includes exposed objects
like the unveil JSON'ed blob. These object are persistent as long as the
the responsible process they represent is still alive.
3. Dynamic objects - this category includes files in the subdirectories
of a PID folder, like /proc/PID/fd/* or /proc/PID/stacks/*. Essentially,
these objects are always created dynamically and when no longer in need
after being used, they're deallocated.
Nevertheless, the new allocated backend objects and inodes try to use
the same InodeIndex if possible - this might change only when a thread
dies and a new thread is born with a new thread stack, or when a file
descriptor is closed and a new one within the same file descriptor
number is opened. This is needed to actually be able to do something
useful with these objects.

The new design assures that many ProcFS instances can be used at once,
with one backend for usage for all instances.
2021-06-29 20:53:59 +02:00
Liav A 92c0dab5ab Kernel: Introduce the new SysFS
The intention is to add dynamic mechanism for notifying the userspace
about hotplug events. Currently, the DMI (SMBIOS) blobs and ACPI tables
are exposed in the new filesystem.
2021-06-29 20:53:59 +02:00
Gunnar Beutner a5b4b95a76 Kernel: Report correct architecture for uname() 2021-06-29 20:03:36 +02:00
Gunnar Beutner 85561feb40 Kernel: Rename some variables to arch-independent names 2021-06-29 20:03:36 +02:00
Gunnar Beutner 90e3aa35ef Kernel: Fix correct argument order for userspace entry point 2021-06-29 20:03:36 +02:00
Gunnar Beutner 6dde7dac8f Kernel: Implement signal handling for x86_64 2021-06-29 20:03:36 +02:00
Gunnar Beutner df9e73de25 Kernel: Add x86_64 support for fork() 2021-06-29 20:03:36 +02:00
Gunnar Beutner 2a78bf8596 Kernel: Fix the return type for syscalls
The Process::Handler type has KResultOr<FlatPtr> as its return type.
Using a different return type with an equally-sized template parameter
sort of works but breaks once that condition is no longer true, e.g.
for KResultOr<int> on x86_64.

Ideally the syscall handlers would also take FlatPtrs as their args
so we can get rid of the reinterpret_cast for the function pointer
but I didn't quite feel like cleaning that up as well.
2021-06-28 22:29:28 +02:00