Commit graph

385 commits

Author SHA1 Message Date
asynts ea7b7d8ceb Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-22 22:14:30 +01:00
Andreas Kling 928ee2c791 Kernel: Don't let signals unblock threads while handling a page fault
It was possible to signal a process while it was paging in an inode
backed VM object. This would cause the inode read to EINTR, and the
page fault handler would assert.

Solve this by simply not unblocking threads due to signals if they are
currently busy handling a page fault. This is probably not the best way
to solve this issue, so I've added a FIXME to that effect.
2021-01-21 00:14:56 +01:00
Andreas Kling 57ca15f126 Kernel: Remove commented-out code from Thread::dispatch_signal() 2021-01-20 23:27:23 +01:00
asynts 94bb544c33 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.

This commit touches some dbg() calls which are enclosed in macros. This
should be fine because with the new constexpr stuff, we ensure that the
stuff actually compiles.
2021-01-16 11:54:35 +01:00
Andreas Kling 64b0d89335 Kernel: Make Process::allocate_region*() return KResultOr<Region*>
This allows region allocation to return specific errors and we don't
have to assume every failure is an ENOMEM.
2021-01-15 19:10:30 +01:00
Lenny Maiorani e6f907a155 AK: Simplify constructors and conversions from nullptr_t
Problem:
- Many constructors are defined as `{}` rather than using the ` =
  default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
  instead of requiring the caller to default construct. This violates
  the C++ Core Guidelines suggestion to declare single-argument
  constructors explicit
  (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).

Solution:
- Change default constructors to use the compiler-provided default
  constructor.
- Remove implicit conversion operators from `nullptr_t` and change
  usage to enforce type consistency without conversion.
2021-01-12 09:11:45 +01:00
Andreas Kling 7c4ddecacb Kernel: Convert a bunch of String::format() => String::formatted() 2021-01-11 22:07:01 +01:00
Sahan Fernando 9bf76a85c8 Everywhere: Fix incorrect uses of String::format and StringBuilder::appendf
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-11 21:06:32 +01:00
Sahan Fernando 099b83fd28 Everywhere: Fix incorrect uses of String::format and StringBuilder::appendf
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-11 21:06:32 +01:00
Andreas Kling 5dafb72370 Kernel+Profiler: Make profiling per-process and without core dumps
This patch merges the profiling functionality in the kernel with the
performance events mechanism. A profiler sample is now just another
perf event, rather than a dedicated thing.

Since perf events were already per-process, this now makes profiling
per-process as well.

Processes with perf events would already write out a perfcore.PID file
to the current directory on death, but since we may want to profile
a process and then let it continue running, recorded perf events can
now be accessed at any time via /proc/PID/perf_events.

This patch also adds information about process memory regions to the
perfcore JSON format. This removes the need to supply a core dump to
the Profiler app for symbolication, and so the "profiler coredump"
mechanism is removed entirely.

There's still a hard limit of 4MB worth of perf events per process,
so this is by no means a perfect final design, but it's a nice step
forward for both simplicity and stability.

Fixes #4848
Fixes #4849
2021-01-11 11:36:00 +01:00
asynts 4e8fd0216b Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-09 21:11:09 +01:00
Tom a0c91719d8 Kernel: Restore thread count if thread cannot be fully created 2021-01-01 23:43:44 +01:00
Tom 476f17b3f1 Kernel: Merge PurgeableVMObject into AnonymousVMObject
This implements memory commitments and lazy-allocation of committed
memory.
2021-01-01 23:43:44 +01:00
Andreas Kling ed5c26d698 AK: Remove custom %w format string specifier
This was a non-standard specifier alias for %04x. This patch replaces
all uses of it with new-style formatting functions instead.
2020-12-25 17:05:05 +01:00
Andreas Kling cb2c8f71f4 AK: Remove custom %b format string specifier
This was a non-standard specifier alias for %02x. This patch replaces
all uses of it with new-style formatting functions instead.
2020-12-25 17:04:28 +01:00
Andreas Kling 89d3b09638 Kernel: Allocate new main thread stack before committing to exec
If the allocation fails (e.g ENOMEM) we want to simply return an error
from sys$execve() and continue executing the current executable.

This patch also moves make_userspace_stack_for_main_thread() out of the
Thread class since it had nothing in particular to do with Thread.
2020-12-25 16:22:01 +01:00
Andreas Kling 40e9edd798 LibELF: Move AuxiliaryValue into the ELF namespace 2020-12-25 14:48:30 +01:00
Andreas Kling d7ad082afa Kernel+LibELF: Stop doing ELF symbolication in the kernel
Now that the CrashDaemon symbolicates crashes in userspace, let's take
this one step further and stop trying to symbolicate userspace programs
in the kernel at all.
2020-12-25 01:03:46 +01:00
Tom 5f51d85184 Kernel: Improve time keeping and dramatically reduce interrupt load
This implements a number of changes related to time:
* If a HPET is present, it is now used only as a system timer, unless
  the Local APIC timer is used (in which case the HPET timer will not
  trigger any interrupts at all).
* If a HPET is present, the current time can now be as accurate as the
  chip can be, independently from the system timer. We now query the
  HPET main counter for the current time in CPU #0's system timer
  interrupt, and use that as a base line. If a high precision time is
  queried, that base line is used in combination with quering the HPET
  timer directly, which should give a much more accurate time stamp at
  the expense of more overhead. For faster time stamps, the more coarse
  value based on the last interrupt will be returned. This also means
  that any missed interrupts should not cause the time to drift.
* The default system interrupt rate is reduced to about 250 per second.
* Fix calculation of Thread CPU usage by using the amount of ticks they
  used rather than the number of times a context switch happened.
* Implement CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE and use it
  for most cases where precise timestamps are not needed.
2020-12-21 18:26:12 +01:00
Lenny Maiorani 765936ebae
Everywhere: Switch from (void) to [[maybe_unused]] (#4473)
Problem:
- `(void)` simply casts the expression to void. This is understood to
  indicate that it is ignored, but this is really a compiler trick to
  get the compiler to not generate a warning.

Solution:
- Use the `[[maybe_unused]]` attribute to indicate the value is unused.

Note:
- Functions taking a `(void)` argument list have also been changed to
  `()` because this is not needed and shows up in the same grep
  command.
2020-12-21 00:09:48 +01:00
Tom c4176b0da1 Kernel: Fix Lock race causing infinite spinning between two threads
We need to account for how many shared lock instances the current
thread owns, so that we can properly release such references when
yielding execution.

We also need to release the process lock when donating.
2020-12-16 23:38:17 +01:00
Itamar b4842d33bb Kernel: Generate a coredump file when a process crashes
When a process crashes, we generate a coredump file and write it in
/tmp/coredumps/.

The coredump file is an ELF file of type ET_CORE.
It contains a segment for every userspace memory region of the process,
and an additional PT_NOTE segment that contains the registers state for
each thread, and a additional data about memory regions
(e.g their name).
2020-12-14 23:05:53 +01:00
Itamar 5b87904ab5 Kernel: Add ability to load interpreter instead of main program
When the main executable needs an interpreter, we load the requested
interpreter program, and pass to it an open file decsriptor to the main
executable via the auxiliary vector.

Note that we do not allocate a TLS region for the interpreter.
2020-12-14 23:05:53 +01:00
Tom c455fc2030 Kernel: Change wait blocking to Process-only blocking
This prevents zombies created by multi-threaded applications and brings
our model back to closer to what other OSs do.

This also means that SIGSTOP needs to halt all threads, and SIGCONT needs
to resume those threads.
2020-12-12 21:28:12 +01:00
Tom da5cc34ebb Kernel: Fix some issues related to fixes and block conditions
Fix some problems with join blocks where the joining thread block
condition was added twice, which lead to a crash when trying to
unblock that condition a second time.

Deferred block condition evaluation by File objects were also not
properly keeping the File object alive, which lead to some random
crashes and corruption problems.

Other problems were caused by the fact that the Queued state didn't
handle signals/interruptions consistently. To solve these issues we
remove this state entirely, along with Thread::wait_on and change
the WaitQueue into a BlockCondition instead.

Also, deliver signals even if there isn't going to be a context switch
to another thread.

Fixes #4336 and #4330
2020-12-12 21:28:12 +01:00
Tom 12cf6f8650 Kernel: Add CLOCK_REALTIME support to the TimerQueue
This allows us to use blocking timeouts with either monotonic or
real time for all blockers. Which means that clock_nanosleep()
now also supports CLOCK_REALTIME.

Also, switch alarm() to use CLOCK_REALTIME as per specification.
2020-12-02 13:02:04 +01:00
Tom 1f86d88dc4 Kernel: Don't assert if we can't deliver a signal due to thread state
Fixes an assertion found in #3990
2020-12-01 16:09:15 +01:00
Tom 78f1b5e359 Kernel: Fix some problems with Thread::wait_on and Lock
This changes the Thread::wait_on function to not enable interrupts
upon leaving, which caused some problems with page fault handlers
and in other situations. It may now be called from critical
sections, with interrupts enabled or disabled, and returns to the
same state.

This also requires some fixes to Lock. To aid debugging, a new
define LOCK_DEBUG is added that enables checking for Lock leaks
upon finalization of a Thread.
2020-12-01 09:48:34 +01:00
Tom 046d6855f5 Kernel: Move block condition evaluation out of the Scheduler
This makes the Scheduler a lot leaner by not having to evaluate
block conditions every time it is invoked. Instead evaluate them as
the states change, and unblock threads at that point.

This also implements some more waitid/waitpid/wait features and
behavior. For example, WUNTRACED and WNOWAIT are now supported. And
wait will now not return EINTR when SIGCHLD is delivered at the
same time.
2020-11-30 13:17:02 +01:00
Tom 6a620562cc Kernel: Allow passing a thread argument for new kernel threads
This adds the ability to pass a pointer to kernel thread/process.
Also add the ability to use a closure as thread function, which
allows passing information to a kernel thread more easily.
2020-11-30 13:17:02 +01:00
Tom 6cb640eeba Kernel: Move some time related code from Scheduler into TimeManagement
Use the TimerQueue to expire blocking operations, which is one less thing
the Scheduler needs to check on every iteration.

Also, add a BlockTimeout class that will automatically handle relative or
absolute timeouts as well as overriding timeouts (e.g. socket timeouts)
more consistently.

Also, rework the TimerQueue class to be able to fire events from
any processor, which requires Timer to be RefCounted. Also allow
creating id-less timers for use by blocking operations.
2020-11-30 13:17:02 +01:00
Tom 97b3035c14 Kernel: Don't resume thread into Running state directly on SIGCONT
We should never resume a thread by directly setting it to Running state.
Instead, if a thread was in Running state when stopped, record the state
as Runnable.

Fixes #4150
2020-11-23 18:33:19 +01:00
Andreas Kling 94ff04b536 Kernel: Make CLOCK_MONOTONIC respect the system tick frequency
The time returned by sys$clock_gettime() was not aligned with the delay
calculations in sys$clock_nanosleep(). This patch fixes that by taking
the system's ticks_per_second value into account in both functions.

This patch also removes the need for Thread::sleep_until() and uses
Thread::sleep() for both absolute and relative sleeps.

This was causing the nesalizer emulator port to sleep for a negative
amount of time at the end of each frame, making it run way too fast.
2020-11-22 17:20:58 +01:00
Tom 6b97118e89 Kernel: Fix race during thread destruction if it is preempted
This fixes a lot of crashes in Bochs, which is more likely to
preempt thread destruction.
2020-11-12 10:18:16 +01:00
Tom 75f61fe3d9 AK: Make RefPtr, NonnullRefPtr, WeakPtr thread safe
This makes most operations thread safe, especially so that they
can safely be used in the Kernel. This includes obtaining a strong
reference from a weak reference, which now requires an explicit
call to WeakPtr::strong_ref(). Another major change is that
Weakable::make_weak_ref() may require the explicit target type.
Previously we used reinterpret_cast in WeakPtr, assuming that it
can be properly converted. But WeakPtr does not necessarily have
the knowledge to be able to do this. Instead, we now ask the class
itself to deliver a WeakPtr to the type that we want.

Also, WeakLink is no longer specific to a target type. The reason
for this is that we want to be able to safely convert e.g. WeakPtr<T>
to WeakPtr<U>, and before this we just reinterpret_cast the internal
WeakLink<T> to WeakLink<U>, which is a bold assumption that it would
actually produce the correct code. Instead, WeakLink now operates
on just a raw pointer and we only make those constructors/operators
available if we can verify that it can be safely cast.

In order to guarantee thread safety, we now use the least significant
bit in the pointer for locking purposes. This also means that only
properly aligned pointers can be used.
2020-11-10 19:11:52 +01:00
Tom 1e2e3eed62 Kernel: Fix a few deadlocks with Thread::m_lock and g_scheduler_lock
g_scheduler_lock cannot safely be acquired after Thread::m_lock
because another processor may already hold g_scheduler_lock and wait
for the same Thread::m_lock.
2020-10-26 08:57:25 +01:00
Linus Groh bcfc6f0c57 Everywhere: Fix more typos 2020-10-03 12:36:49 +02:00
Tom 838d9fa251 Kernel: Make Thread refcounted
Similar to Process, we need to make Thread refcounted. This will solve
problems that will appear once we schedule threads on more than one
processor. This allows us to hold onto threads without necessarily
holding the scheduler lock for the entire duration.
2020-09-27 19:46:04 +02:00
Tom 69a9c78783 Kernel: Allow killing queued threads
We need to dequeue and wake threads that are waiting if the process
terminates.

Fixes #3603 without the HackStudio fixes in #3606.
2020-09-26 20:03:16 +02:00
Tom 1727b2d7cd Kernel: Fix thread joining issues
The thread joining logic hadn't been updated to account for the subtle
differences introduced by software context switching. This fixes several
race conditions related to thread destruction and joining, as well as
finalization which did not properly account for detached state and the
fact that threads can be joined after termination as long as they're not
detached.

Fixes #3596
2020-09-26 13:03:13 +02:00
Ben Wiederhake 64cc3f51d0 Meta+Kernel: Make clang-format-10 clean 2020-09-25 21:18:17 +02:00
Luke 68b361bd21 Kernel: Return ENOMEM in more places
There are plenty of places in the kernel that aren't
checking if they actually got their allocation.

This fixes some of them, but definitely not all.

Fixes #3390
Fixes #3391

Also, let's make find_one_free_page() return nullptr
if it doesn't get a free index. This stops the kernel
crashing when out of memory and allows memory purging
to take place again.

Fixes #3487
2020-09-16 20:38:19 +02:00
Tom f6d1e45bf3 Kernel: Don't symbolicate stack traces in IRQ handlers
If we're capturing a stack trace in an IRQ handler, don't try to
symbolicate it as we may not be able to access all pages.
2020-09-15 23:30:44 +02:00
Tom f5330304a4 Kernel: Stop back trace on a null base pointer
This silences some warnings trying to copy from null when capturing
a stack trace.
2020-09-14 11:31:12 +02:00
Tom c8d9f1b9c9 Kernel: Make copy_to/from_user safe and remove unnecessary checks
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.

So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.

To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.

Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
2020-09-13 21:19:15 +02:00
Tom 0fab0ee96a Kernel: Rename Process::is_ring0/3 to Process::is_kernel/user_process
Since "rings" typically refer to code execution and user processes
can also execute in ring 0, rename these functions to more accurately
describe what they mean: kernel processes and user processes.
2020-09-10 19:57:15 +02:00
Andreas Kling f8e59addf7 Kernel+LibC+UE: Introduce SIGINFO (generated with ^T)
This signal is ignored by default, but can be caught to implement state
reporting a la BSD. :^)
2020-09-09 21:10:23 +02:00
Tom 92bfe40954 Kernel: Keep signal state in sync
In c3d231616c we added the atomic variable
m_have_any_unmasked_pending_signals tracking the state of pending signals.
Add helper functions that automatically update this variable as needed.
2020-09-09 12:43:56 +02:00
Tom c3d231616c Kernel: Fix crash when delivering signal to barely created thread
We need to wait until a thread is fully set up and ready for running
before attempting to deliver a signal. Otherwise we may not have a
user stack yet.

Also, remove the Skip0SchedulerPasses and Skip1SchedulerPass thread
states that we don't really need anymore with software context switching.

Fixes the kernel crash reported in #3419
2020-09-07 16:49:19 +02:00
Tom 4b66692a55 Kernel: Make Heap implementation reusable, and make kmalloc expandable
Add an ExpandableHeap and switch kmalloc to use it, which allows
for the kmalloc heap to grow as needed.

In order to make heap expansion to work, we keep around a 1 MiB backup
memory region, because creating a region would require space in the
same heap. This means, the heap will grow as soon as the reported
utilization is less than 1 MiB. It will also return memory if an entire
subheap is no longer needed, although that is rarely possible.
2020-08-30 11:39:38 +02:00
Andreas Kling 079d2fe088 Kernel: Pack arguments, environment and auxiliary values better
Previously we were putting strings at the bottom of the allocated stack
region, and pointer arrays (argv, env, auxv) at the top. There was no
reason for this other than "it seemed easier to do it that way at the
time I wrote it."

This patch packs the strings and pointer vectors into the same area at
the top of the stack.

This reduces the memory footprint of all programs by 4 KiB. :^)
2020-08-20 16:34:44 +02:00
Nico Weber 96ff0971ea Kernel: Remove a comment that has been stale since 7a64f55c0f 2020-08-17 09:10:06 +02:00
Tom 72960fedc6 Kernel: Briefly resume stopped threads when being killed
We need to briefly put Stopped threads back into Running state
so that the kernel stacks can get cleaned up when they're being
killed.

Fixes #3130
2020-08-15 00:15:00 +02:00
Ben Wiederhake 42b057b0c9 Kernel: Mark compilation-unit-only functions as static
This enables a nice warning in case a function becomes dead code. Also, in case
of signal_trampoline_dummy, marking it external (non-static) prevents it from
being 'optimized away', which would lead to surprising and weird linker errors.

I found these places by using -Wmissing-declarations.

The Kernel still shows these issues, which I think are false-positives,
but don't want to touch:
- Kernel/Arch/i386/CPU.cpp:1081:17: void Kernel::enter_thread_context(Kernel::Thread*, Kernel::Thread*)
- Kernel/Arch/i386/CPU.cpp:1170:17: void Kernel::context_first_init(Kernel::Thread*, Kernel::Thread*, Kernel::TrapFrame*)
- Kernel/Arch/i386/CPU.cpp:1304:16: u32 Kernel::do_init_context(Kernel::Thread*, u32)
- Kernel/Arch/i386/CPU.cpp:1347:17: void Kernel::pre_init_finished()
- Kernel/Arch/i386/CPU.cpp:1360:17: void Kernel::post_init_finished()
	No idea, not gonna touch it.
- Kernel/init.cpp:104:30: void Kernel::init()
- Kernel/init.cpp:167:30: void Kernel::init_ap(u32, Kernel::Processor*)
- Kernel/init.cpp:184:17: void Kernel::init_finished(u32)
	Called by boot.S.
- Kernel/init.cpp:383:16: int Kernel::__cxa_atexit(void (*)(void*), void*, void*)
- Kernel/StdLib.cpp:285:19: void __cxa_pure_virtual()
- Kernel/StdLib.cpp:300:19: void __stack_chk_fail()
- Kernel/StdLib.cpp:305:19: void __stack_chk_fail_local()
	Not sure how to tell the compiler that the compiler is already using them.
	Also, maybe __cxa_atexit should go into StdLib.cpp?
- Kernel/Modules/TestModule.cpp:31:17: void module_init()
- Kernel/Modules/TestModule.cpp:40:17: void module_fini()
	Could maybe go into a new header. This would also provide type-checking for new modules.
2020-08-12 20:40:59 +02:00
Tom 49d5232f33 Kernel: Always return from Thread::wait_on
We need to always return from Thread::wait_on, even when a thread
is being killed. This is necessary so that the kernel call stack
can clean up and release references held by it. Then, right before
transitioning back to user mode, we check if the thread is
supposed to die, and at that point change the thread state to
Dying to prevent further scheduling of this thread.

This addresses some possible resource leaks similar to #3073
2020-08-11 14:54:36 +02:00
Ben Wiederhake bee08a4b9f Kernel: More PID/TID typing 2020-08-10 11:51:45 +02:00
Ben Wiederhake f5744a6f2f Kernel: PID/TID typing
This compiles, and contains exactly the same bugs as before.
The regex 'FIXME: PID/' should reveal all markers that I left behind, including:
- Incomplete conversion
- Issues or things that look fishy
- Actual bugs that will go wrong during runtime
2020-08-10 11:51:45 +02:00
Tom 41d2a0e9f7 Kernel: Dequeue dying threads from WaitQueue
If a thread is waiting but getting killed, we need to dequeue
the thread from the WaitQueue so that a potential wake before
finalization doesn't happen.
2020-08-06 10:02:55 +02:00
Tom f4a5c9b6c2 Kernel: Consolidate timeout logic
Allow passing in an optional timeout to Thread::block and move
the timeout check out of Thread::Blocker. This way all Blockers
implicitly support timeouts and don't need to implement it
themselves. Do however allow them to override timeouts (e.g.
for sockets).
2020-08-03 18:23:00 +02:00
Tom c813bb7355 Kernel: Fix a few Thread::block related races
We need to have a Thread lock to protect threading related
operations, such as Thread::m_blocker which is used in
Thread::block.

Also, if a Thread::Blocker indicates that it should be
unblocking immediately, don't actually block the Thread
and instead return immediately in Thread::block.
2020-08-03 15:59:11 +02:00
Tom f011c420c1 Kernel: Fix signal delivery when no syscall is made
This fixes a regression introduced by the new software context
switching where the Kernel would not deliver a signal unless the
process is making system calls. This is because the TSS no longer
updates the CS value, so the scheduler never considered delivery
as the process always appeared to be in kernel mode. With software
context switching we can just set up the signal trampoline at
any time and when the processor returns back to user mode it'll
get executed. This should fix e.g. killing programs that are
stuck in some tight loop that doesn't make any system calls and
is only pre-empted by the timer interrupt.

Fixes #2958
2020-08-02 20:50:29 +02:00
Tom 538b985487 Kernel: Remove ProcessInspectionHandle and make Process RefCounted
By making the Process class RefCounted we don't really need
ProcessInspectionHandle anymore. This also fixes some race
conditions where a Process may be deleted while still being
used by ProcFS.

Also make sure to acquire the Process' lock when accessing
regions.

Last but not least, there's no reason why a thread can't be
scheduled while being inspected, though in practice it won't
happen anyway because the scheduler lock is held at the same
time.
2020-08-02 17:15:11 +02:00
Ben Wiederhake d8c8820ee9 Kernel: Allow Thread::sleep for more than 388 days
Because Thread::sleep is an internal interface, it's easy to check that there
are only few callers: Process::sys$sleep, usleep, and nanosleep are happy
with this increased size, because now they support the entire range of their
arguments (assuming small-ish values for ticks_per_second()).
SyncTask doesn't care.

Note that the old behavior wasn't "cap out at 388 days", which would have been
reasonable. Instead, the code resulted in unsigned overflow, meaning that a
very long sleep would "on average" end after about 194 days, sometimes much
quicker.
2020-07-25 20:21:25 +02:00
Tom 419703a1f2 Kernel: Fix checking BlockResult
We now have BlockResult::WokeNormally and BlockResult::NotBlocked,
both of which indicate no error. We can no longer just check for
BlockResult::WokeNormally and assume anything else must be an
interruption.
2020-07-07 15:46:58 +02:00
Andrew Kaster f96b827990 Kernel+LibELF: Expose ELF Auxiliary Vector to Userspace
The AT_* entries are placed after the environment variables, so that
they can be found by iterating until the end of the envp array, and then
going even further beyond :^)
2020-07-07 10:38:54 +02:00
Tom bc107d0b33 Kernel: Add SMP IPI support
We can now properly initialize all processors without
crashing by sending SMP IPI messages to synchronize memory
between processors.

We now initialize the APs once we have the scheduler running.
This is so that we can process IPI messages from the other
cores.

Also rework interrupt handling a bit so that it's more of a
1:1 mapping. We need to allocate non-sharable interrupts for
IPIs.

This also fixes the occasional hang/crash because all
CPUs now synchronize memory with each other.
2020-07-06 17:07:44 +02:00
Andreas Kling 163c9d5f8f Kernel: Thread::wait_on() must always leave interrupts enabled on exit
The short-circuit path added for waiting on a queue that already had a
pending wake was able to return with interrupts disabled, which breaks
the API contract of wait_on() always returning with IF=1.

Fix this by adding a way to override the restored IF in ScopedCritical.
2020-07-06 11:33:32 +02:00
Tom 9725bda63e Kernel: Enhance WaitQueue to remember pending wakes
If WaitQueue::wake_all, WaitQueue::wake_one, or WaitQueue::wake_n
is called but nobody is currently waiting, we should remember that
fact and prevent someone from waiting after such a request. This
solves a race condition where the Finalizer thread is notified
to finalize a thread, but it is not (yet) waiting on this queue.

Fixes #2693
2020-07-06 10:00:24 +02:00
Tom 2a82a25fec Kernel: Various context switch fixes
These changes solve a number of problems with the software
context swithcing:

* The scheduler lock really should be held throughout context switches
* Transitioning from the initial (idle) thread to another needs to
  hold the scheduler lock
* Transitioning from a dying thread to another also needs to hold
  the scheduler lock
* Dying threads cannot necessarily be finalized if they haven't
  switched out of it yet, so flag them as active while a processor
  is running it (the Running state may be switched to Dying while
  it still is actually running)
2020-07-06 10:00:24 +02:00
Tom 788b2d64c6 Kernel: Require a reason to be passed to Thread::wait_on
The Lock class still permits no reason, but for everything else
require a reason to be passed to Thread::wait_on. This makes it
easier to diagnose why a Thread is in Queued state.
2020-07-06 10:00:24 +02:00
Tom bb84fad0bf Kernel: Fix retreiving frame pointer from a thread
If we're trying to walk the stack for another thread, we can
no longer retreive the EBP register from Thread::m_tss. Instead,
we need to look at the top of the kernel stack, because all threads
not currently running were last in kernel mode. Context switches
now always trigger a brief switch to kernel mode, and Thread::m_tss
only is used to save ESP and EIP.

Fixes #2678
2020-07-03 21:16:56 +02:00
Tom e373e5f007 Kernel: Fix signal delivery
When delivering urgent signals to the current thread
we need to check if we should be unblocked, and if not
we need to yield to another process.

We also need to make sure that we suppress context switches
during Process::exec() so that we don't clobber the registers
that it sets up (eip mainly) by a context switch. To be able
to do that we add the concept of a critical section, which are
similar to Process::m_in_irq but different in that they can be
requested at any time. Calls to Scheduler::yield and
Scheduler::donate_to will return instantly without triggering
a context switch, but the processor will then asynchronously
trigger a context switch once the critical section is left.
2020-07-03 19:32:34 +02:00
Tom 16783bd14d Kernel: Turn Thread::current and Process::current into functions
This allows us to query the current thread and process on a
per processor basis
2020-07-01 12:07:01 +02:00
Tom fb41d89384 Kernel: Implement software context switching and Processor structure
Moving certain globals into a new Processor structure for
each CPU allows us to eventually run an instance of the
scheduler on each CPU.
2020-07-01 12:07:01 +02:00
Sergey Bugaev d2b500fbcb AK+Kernel: Help the compiler inline a bunch of trivial methods
If these methods get inlined, the compiler is able to statically eliminate most
of the assertions. Alas, it doesn't realize this, and believes inlining them to
be too expensive. So give it a strong hint that it's not the case.

This *decreases* the kernel binary size.
2020-05-20 14:11:13 +02:00
Brian Gianforcaro eeb5318c25 Kernel: Expose timers via a TimerId type
The public consumers of the timer API shouldn't need to know
the how timer id's are tracked internally. Expose a typedef
instead to allow the internal implementation to be protected
from potential churn in the future.

It's also just good API design.
2020-04-27 11:14:41 +02:00
Brian Gianforcaro faf15e3721 Kernel: Add timeout support to Thread::wait_on
This change plumbs a new optional timeout option to wait_on.
The timeout is enabled by enqueing a timer on the timer queue
while we are waiting. We can then see if we were woken up or
timed out by checking if we are still on the wait queue or not.
2020-04-26 21:31:52 +02:00
Itamar 9e51e295cf ptrace: Add PT_SETREGS
PT_SETTREGS sets the regsiters of the traced thread. It can only be
used when the tracee is stopped.

Also, refactor ptrace.
The implementation was getting long and cluttered the alraedy large
Process.cpp file.

This commit moves the bulk of the implementation to Kernel/Ptrace.cpp,
and factors out peek & poke to separate methods of the Process class.
2020-04-13 00:53:22 +02:00
Itamar 4568a628f9 Thread: Set m_blocker to null in Thread::unblock()
Before this commit, m_blocker was only set to null in Thread::block,
after the thread has been unblocked.

Starting with this commit, m_blocker is also set to null in
Thread::unblock.

This change will allow us to implement a missing feature of the PT_TRACE
command of the ptrace syscall - stopping the traced thread when it
exits the execve syscall.

That feature will be implemented by sending a blocking SIGSTOP to the
traced thread after it has executed the execve logic and before it
starts executing the new program in userspace.

However, since Process::exec arranges the tss to return to userspace
(the so-called "yield-teleport"), the code in Thread::block that should
be run after the thread unblocks, and sets m_blocker to null, never
actually runs.

Setting m_blocker to null in Thread::unblock allows us to avoid an
incorrect state where the thread is in a Running state but conatins a
pointer to a Blocker.
2020-04-13 00:53:22 +02:00
Peter Nelson eff27f39d5
Kernel: Store previous thread state upon all transitions to Stopped (#1753)
We now store the previous thread state in m_stop_state for all
transitions to the Stopped state via Thread::set_state.

Fixes #1752 whereupon resuming a thread that was stopped with SIGTSTP,
the previous state of the thread is not remembered correctly, resulting
in m_stop_state == State::Invalid and the associated assertion fails.
2020-04-11 23:39:46 +02:00
Andrew Kaster 21b5909dc6 LibELF: Move ELF classes into namespace ELF
This is for consistency with other namespace changes that were made
a while back to the other libraries :)
2020-04-11 22:41:05 +02:00
Andreas Kling b7ff3b5ad1 Kernel: Include the current instruction pointer in profile samples
We were missing the innermost instruction pointer when sampling.
This makes the instruction-level profile info a lot cooler! :^)
2020-04-11 21:04:45 +02:00
Andreas Kling dc7340332d Kernel: Update cryptically-named functions related to symbolication 2020-04-08 17:19:46 +02:00
Itamar 6b74d38aab Kernel: Add 'ptrace' syscall
This commit adds a basic implementation of
the ptrace syscall, which allows one process
(the tracer) to control another process (the tracee).

While a process is being traced, it is stopped whenever a signal is
received (other than SIGCONT).

The tracer can start tracing another thread with PT_ATTACH,
which causes the tracee to stop.

From there, the tracer can use PT_CONTINUE
to continue the execution of the tracee,
or use other request codes (which haven't been implemented yet)
to modify the state of the tracee.

Additional request codes are PT_SYSCALL, which causes the tracee to
continue exection but stop at the next entry or exit from a syscall,
and PT_GETREGS which fethces the last saved register set of the tracee
(can be used to inspect syscall arguments and return value).

A special request code is PT_TRACE_ME, which is issued by the tracee
and causes it to stop when it calls execve and wait for the
tracer to attach.
2020-03-28 18:27:18 +01:00
Andreas Kling b1058b33fb AK: Add global FlatPtr typedef. It's u32 or u64, based on sizeof(void*)
Use this instead of uintptr_t throughout the codebase. This makes it
possible to pass a FlatPtr to something that has u32 and u64 overloads.
2020-03-08 13:06:51 +01:00
Liav A 0fc60e41dd Kernel: Use klog() instead of kprintf()
Also, duplicate data in dbg() and klog() calls were removed.
In addition, leakage of virtual address to kernel log is prevented.
This is done by replacing kprintf() calls to dbg() calls with the
leaked data instead.
Also, other kprintf() calls were replaced with klog().
2020-03-02 22:23:39 +01:00
Andreas Kling 678c87087d Kernel: Load executables on demand when symbolicating
Previously we would map the entire executable of a program in its own
address space (but make it unavailable to userspace code.)

This patch removes that and changes the symbolication code to remap
the executable on demand (and into the kernel's own address space
instead of the process address space.)

This opens up a couple of further simplifications that will follow.
2020-03-02 11:20:34 +01:00
Andreas Kling 5e0c4d689f Kernel: Move ProcessPagingScope to its own files 2020-03-01 15:38:09 +01:00
Andreas Kling 2839bb0be1 Kernel: Restore the previous thread state on SIGCONT after SIGSTOP
When stopping a thread with the SIGSTOP signal, we now store the thread
state in Thread::m_stop_state. That state is then restored on SIGCONT.
This fixes an issue where previously-blocked threads would unblock
upon resume. Now they simply resume in the Blocked state, and it's up
to the regular unblocking mechanism to unblock them.

Fixes #1326.
2020-03-01 15:14:17 +01:00
Andreas Kling 8b6d548b55 Kernel: Disable interrupts throughout Thread::raw_backtrace()
Otherwise we may hit an assertion when validating stack addresses.
2020-02-29 22:06:56 +01:00
Andreas Kling 7cd1bdfd81 Kernel: Simplify some dbg() logging
We don't have to log the process name/PID/TID, dbg() automatically adds
that as a prefix to every line.

Also we don't have to do .characters() on Strings passed to dbg() :^)
2020-02-29 13:39:06 +01:00
Liav A a506b2a48e Thread: Use dbg() instead of dbgprintf() 2020-02-27 13:05:12 +01:00
Cristian-Bogdan SIRB 05ce8586ea Kernel: Fix ASSERTION failed in join_thread syscall
set_interrupted_by_death was never called whenever a thread that had
a joiner died, so the joiner remained with the joinee pointer there,
resulting in an assertion fail in JoinBlocker: m_joinee pointed to
a freed task, filled with garbage.

Thread::current->m_joinee may not be valid after the unblock

Properly return the joinee exit value to the joiner thread.
2020-02-27 10:09:44 +01:00
Cristian-Bogdan SIRB 717cd5015e Kernel: Allow process with multiple threads to call exec and exit
This allows a process wich has more than 1 thread to call exec, even
from a thread. This kills all the other threads, but it won't wait for
them to finish, just makes sure that they are not in a running/runable
state.

In the case where a thread does exec, the new program PID will be the
thread TID, to keep the PID == TID in the new process.

This introduces a new function inside the Process class,
kill_threads_except_self which is called on exit() too (exit with
multiple threads wasn't properly working either).

Inside the Lock class, there is the need for a new function,
clear_waiters, which removes all the waiters from the
Process::big_lock. This is needed since after a exit/exec, there should
be no other threads waiting for this lock, the threads should be simply
killed. Only queued threads should wait for this lock at this point,
since blocked threads are handled in set_should_die.
2020-02-26 13:06:40 +01:00
Andreas Kling ceec1a7d38 AK: Make Vector use size_t for its size and capacity 2020-02-25 14:52:35 +01:00
Andreas Kling 94652fd2fb Kernel: Fully validate pointers when walking stack during profiling
It's not enough to just check that things wouldn't page fault, we also
need to verify that addresses are accessible to the profiled thread.
2020-02-22 10:09:54 +01:00
Andreas Kling 59b9e49bcd Kernel: Don't trigger page faults during profiling stack walk
The kernel sampling profiler will walk thread stacks during the timer
tick handler. Since it's not safe to trigger page faults during IRQ's,
we now avoid this by checking the page tables manually before accessing
each stack location.
2020-02-21 15:49:39 +01:00
Andreas Kling 9aa234cc47 Kernel: Reset FPU state on exec() 2020-02-18 13:44:27 +01:00
Andreas Kling 48f7c28a5c Kernel: Replace "current" with Thread::current and Process::current
Suggested by Sergey. The currently running Thread and Process are now
Thread::current and Process::current respectively. :^)
2020-02-17 15:04:27 +01:00
Andreas Kling 1d611e4a11 Kernel: Reduce header dependencies of MemoryManager and Region 2020-02-16 01:33:41 +01:00
Andreas Kling a356e48150 Kernel: Move all code into the Kernel namespace 2020-02-16 01:27:42 +01:00
Andreas Kling 0341ddc5eb Kernel: Rename RegisterDump => RegisterState 2020-02-16 00:15:37 +01:00
Andreas Kling 934b1d8a9b Kernel: Finalizer should not go back to sleep if there's more to do
Before putting itself back on the wait queue, the finalizer task will
now check if there's more work to do, and if so, do it first. :^)

This patch also puts a bunch of process/thread debug logging behind
PROCESS_DEBUG and THREAD_DEBUG since it was unbearable to debug this
stuff with all the spam.
2020-02-01 10:56:17 +01:00
Andreas Kling 5163c5cc63 Kernel: Expose the signal that stopped a thread via sys$waitpid() 2020-01-27 20:47:10 +01:00
Andreas Kling 17210a39e4 Kernel: Remove ancient hack that put the current PID in TSS.SS2
While I was bringing up multitasking, I put the current PID in the SS2
(ring 2 stack segment) slot of the TSS. This was so I could see which
PID was currently running when just inspecting the CPU state.
2020-01-27 13:10:24 +01:00
Andreas Kling ae0f92a0a1 Kernel: Simplify kernel thread stack allocation
We had two identical code paths doing this for some reason.
2020-01-27 12:52:45 +01:00
Andreas Kling f38cfb3562 Kernel: Tidy up debug logging a little bit
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.

This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
2020-01-21 16:16:20 +01:00
Andreas Kling e901a3695a Kernel: Use the templated copy_to/from_user() in more places
These ensure that the "to" and "from" pointers have the same type,
and also that we copy the correct number of bytes.
2020-01-20 13:41:21 +01:00
Andreas Kling 4b7a89911c Kernel: Remove some unnecessary casts to uintptr_t
VirtualAddress is constructible from uintptr_t and const void*.
PhysicalAddress is constructible from uintptr_t but not const void*.
2020-01-20 13:13:03 +01:00
Andreas Kling a246e9cd7e Use uintptr_t instead of u32 when storing pointers as integers
uintptr_t is 32-bit or 64-bit depending on the target platform.
This will help us write pointer size agnostic code so that when the day
comes that we want to do a 64-bit port, we'll be in better shape.
2020-01-20 13:13:03 +01:00
Andreas Kling 1d02ac35fc Kernel: Limit Thread::raw_backtrace() to the max profiler stack size
Let's avoid walking overly long stacks here, since kmalloc() is finite.
2020-01-19 13:54:09 +01:00
Andreas Kling 87583aea9c Kernel: Use copy_from_user() when appropriate during thread backtracing 2020-01-19 10:33:26 +01:00
Andreas Kling 94ca55cefd Meta: Add license header to source files
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.

For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.

Going forward, all new source files should include a license header.
2020-01-18 09:45:54 +01:00
Andreas Kling 65cb406327 Kernel: Allow unlocking a held Lock with interrupts disabled
This is needed to eliminate a race in Thread::wait_on() where we'd
otherwise have to wait until after unlocking the process lock before
we can disable interrupts.
2020-01-13 18:56:46 +01:00
Andreas Kling 41376d4662 Kernel: Fix Lock racing to the WaitQueue
There was a time window between releasing Lock::m_lock and calling into
the lock's WaitQueue where someone else could take m_lock and bring two
threads into a deadlock situation.

Fix this issue by holding Lock::m_lock until interrupts are disabled by
either Thread::wait_on() or WaitQueue::wake_one().
2020-01-12 19:04:16 +01:00
Andreas Kling a885719af5 Kernel: Keep SMAP protection enabled in Thread::backtrace_impl() 2020-01-12 10:47:01 +01:00
Andreas Kling f6c0fccc01 Kernel: Fix busted backtraces when a thread backtraces itself
When the current thread is backtracing itself, we now start walking the
stack from the current EBP register value, instead of the TSS one.

Now SystemMonitor always appears to be running Thread::backtrace() when
sampled, which makes perfect sense. :^)
2020-01-12 10:19:37 +01:00
Andreas Kling 8c5cd97b45 Kernel: Fix kernel null deref on process crash during join_thread()
The join_thread() syscall is not supposed to be interruptible by
signals, but it was. And since the process death mechanism piggybacked
on signal interrupts, it was possible to interrupt a pthread_join() by
killing the process that was doing it, leading to confusing due to some
assumptions being made by Thread::finalize() for threads that have a
pending joiner.

This patch fixes the issue by making "interrupted by death" a distinct
block result separate from "interrupted by signal". Then we handle that
state in join_thread() and tidy things up so that thread finalization
doesn't get confused by the pending joiner being gone.

Test: Tests/Kernel/null-deref-crash-during-pthread_join.cpp
2020-01-10 19:23:45 +01:00
Andreas Kling 17ef5bc0ac Kernel: Rename {ss,esp}_if_crossRing to userspace_{ss,esp}
These were always so awkwardly named.
2020-01-09 18:02:01 +01:00
Andreas Kling e23f05a157 Kernel: Remove unused variable Thread::m_userspace_stack_region 2020-01-09 12:31:18 +01:00
Andreas Kling f6691ad26e Kernel: Fix SMAP violation in thread signal dispatch 2020-01-05 18:19:26 +01:00
Andreas Kling 9eef39d68a Kernel: Start implementing x86 SMAP support
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.

Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.

There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.

THis patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward all kernel code
should be moved to using these and all uses of SmapDisabler are to be
considered FIXME's.

Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
2020-01-05 18:14:51 +01:00
Andreas Kling 3a27790fa7 Kernel: Use Thread::from_tid() in more places 2020-01-04 18:56:04 +01:00
Andreas Kling 32ec1e5aed Kernel: Mask kernel addresses in backtraces and profiles
Addresses outside the userspace virtual range will now show up as
0xdeadc0de in backtraces and profiles generated by unprivileged users.
2020-01-02 20:51:31 +01:00
Andreas Kling f598bbbb1d Kernel: Prevent executing I/O instructions in userspace
All threads were running with iomapbase=0 in their TSS, which the CPU
interprets as "there's an I/O permission bitmap starting at offset 0
into my TSS".

Because of that, any bits that were 1 inside the TSS would allow the
thread to execute I/O instructions on the port with that bit index.

Fix this by always setting the iomapbase to sizeof(TSS32), and also
setting the TSS descriptor's limit to sizeof(TSS32), effectively making
the I/O permissions bitmap zero-length.

This should make it no longer possible to do I/O from userspace. :^)
2020-01-01 17:31:41 +01:00
Andreas Kling fd740829d1 Kernel: Switch to eagerly restoring x86 FPU state on context switch
Lazy FPU restore is well known to be vulnerable to timing attacks,
and eager restore is a lot simpler anyway, so let's just do it eagerly.
2020-01-01 16:54:21 +01:00
Andreas Kling 54d182f553 Kernel: Remove some unnecessary leaking of kernel pointers into dmesg
There's a lot more of this and we need to stop printing kernel pointers
anywhere but the debug console.
2019-12-31 01:22:00 +01:00
Andreas Kling 610f3ad12f Kernel: Add a basic thread boosting mechanism
This patch introduces a syscall:

    int set_thread_boost(int tid, int amount)

You can use this to add a permanent boost value to the effective thread
priority of any thread with your UID (or any thread in the system if
you are the superuser.)

This is quite crude, but opens up some interesting opportunities. :^)
2019-12-30 19:23:13 +01:00
Andreas Kling 50677bf806 Kernel: Refactor scheduler to use dynamic thread priorities
Threads now have numeric priorities with a base priority in the 1-99
range.

Whenever a runnable thread is *not* scheduled, its effective priority
is incremented by 1. This is tracked in Thread::m_extra_priority.
The effective priority of a thread is m_priority + m_extra_priority.

When a runnable thread *is* scheduled, its m_extra_priority is reset to
zero and the effective priority returns to base.

This means that lower-priority threads will always eventually get
scheduled to run, once its effective priority becomes high enough to
exceed the base priority of threads "above" it.

The previous values for ThreadPriority (Low, Normal and High) are now
replaced as follows:

    Low -> 10
    Normal -> 30
    High -> 50

In other words, it will take 20 ticks for a "Low" priority thread to
get to "Normal" effective priority, and another 20 to reach "High".

This is not perfect, and I've used some quite naive data structures,
but I think the mechanism will allow us to build various new and
interesting optimizations, and we can figure out better data structures
later on. :^)
2019-12-30 18:46:17 +01:00
Andreas Kling 9e55bcb7da Kernel: Make kernel memory regions be non-executable by default
From now on, you'll have to request executable memory specifically
if you want some.
2019-12-25 22:41:34 +01:00
Andreas Kling 52deb09382 Kernel: Enable PAE (Physical Address Extension)
Introduce one more (CPU) indirection layer in the paging code: the page
directory pointer table (PDPT). Each PageDirectory now has 4 separate
PageDirectoryEntry arrays, governing 1 GB of VM each.

A really neat side-effect of this is that we can now share the physical
page containing the >=3GB kernel-only address space metadata between
all processes, instead of lazily cloning it on page faults.

This will give us access to the NX (No eXecute) bit, allowing us to
prevent execution of memory that's not supposed to be executed.
2019-12-25 13:35:57 +01:00
Conrad Pankoff 0fdbe08637 Kernel: Fix debug message and kernel stack region names in thread setup 2019-12-24 01:28:38 +01:00
Conrad Pankoff 0cb89f5927 Kernel: Mark kernel stack regions as... stack regions 2019-12-24 01:28:38 +01:00
Conrad Pankoff b557aab884 Kernel: Move ring0 stacks out of kmalloc_eternal
This allows us to use all the same fun memory protection features as the
rest of the system for ring0 processes. Previously a ring0 process could
over- or underrun its stack and nobody cared, since kmalloc_eternal is the
wild west of memory.
2019-12-24 01:28:38 +01:00
Conrad Pankoff 3aaeff483b Kernel: Add a size argument to validate_read_from_kernel 2019-12-24 01:28:38 +01:00
Andreas Kling 523fd6533e Kernel: Unlock the Process when exit()ing
If there are more threads in a process when exit()ing, we need to give
them a chance to unwind any kernel stacks. This means we have to unlock
the process lock before giving control to the scheduler.

Fixes #891 (together with all of the other "no more main thread" work.)
2019-12-22 12:38:01 +01:00
Andreas Kling f4978b2be1 Kernel: Use IntrusiveList to make WaitQueue allocation-free :^) 2019-12-22 12:38:01 +01:00
Andreas Kling 4b8851bd01 Kernel: Make TID's be unique PID's
This is a little strange, but it's how I understand things should work.

The first thread in a new process now has TID == PID.
Additional threads subsequently spawned in that process all have unique
TID's generated by the PID allocator. TIDs are now globally unique.
2019-12-22 12:38:01 +01:00
Andreas Kling 16812f0f98 Kernel: Get rid of "main thread" concept
The idea of all processes reliably having a main thread was nice in
some ways, but cumbersome in others. More importantly, it didn't match
up with POSIX thread semantics, so let's move away from it.

This thread gets rid of Process::main_thread() and you now we just have
a bunch of Thread objects floating around each Process.

When the finalizer nukes the last Thread in a Process, it will also
tear down the Process.

There's a bunch of more things to fix around this, but this is where we
get started :^)
2019-12-22 12:37:58 +01:00
Andreas Kling 3012b224f0 Kernel: Fix intermittent assertion failure in sys$exec()
While setting up the main thread stack for a new process, we'd incur
some zero-fill page faults. This was to be expected, since we allocate
a huge stack but lazily populate it with physical pages.

The problem is that page fault handlers may enable interrupts in order
to grab a VMObject lock (or to page in from an inode.)

During exec(), a process is reorganizing itself and will be in a very
unrunnable state if the scheduler should interrupt it and then later
ask it to run again. Which is exactly what happens if the process gets
pre-empted while the new stack's zero-fill page fault grabs the lock.

This patch fixes the issue by creating new main thread stacks before
disabling interrupts and going into the critical part of exec().
2019-12-18 23:03:23 +01:00
Andreas Kling 7a64f55c0f Kernel: Fix get_register_dump_from_stack() after IRQ entry changes
I had to change the layout of RegisterDump a little bit to make the new
IRQ entry points work. This broke get_register_dump_from_stack() which
was expecting the RegisterDump to be badly aligned due to a goofy extra
16 bits which are no longer there.
2019-12-15 17:58:53 +01:00
Andreas Kling b32e961a84 Kernel: Implement a simple process time profiler
The kernel now supports basic profiling of all the threads in a process
by calling profiling_enable(pid_t). You finish the profiling by calling
profiling_disable(pid_t).

This all works by recording thread stacks when the timer interrupt
fires and the current thread is in a process being profiled.
Note that symbolication is deferred until profiling_disable() to avoid
adding more noise than necessary to the profile.

A simple "/bin/profile" command is included here that can be used to
start/stop profiling like so:

    $ profile 10 on
    ... wait ...
    $ profile 10 off

After a profile has been recorded, it can be fetched in /proc/profile

There are various limits (or "bugs") on this mechanism at the moment:

- Only one process can be profiled at a time.
- We allocate 8MB for the samples, if you use more space, things will
  not work, and probably break a bit.
- Things will probably fall apart if the profiled process dies during
  profiling, or while extracing /proc/profile
2019-12-11 20:36:56 +01:00
Andrew Kaster 9058962712 Kernel: Allow setting thread names
The main thread of each kernel/user process will take the name of
the process. Extra threads will get a fancy new name
"ProcessName[<tid>]".

Thread backtraces now list the thread name in addtion to tid.

Add the thread name to /proc/all (should it get its own proc
file?).

Add two new syscalls, set_thread_name and get_thread_name.
2019-12-08 14:09:29 +01:00
Andreas Kling 8bb98aa31b Kernel: Use a WaitQueue to implement finalizer wakeup
This gets rid of the special "Lurking" thread state and replaces it
with a generic WaitQueue :^)
2019-12-01 19:17:17 +01:00
Andreas Kling 5859e16e53 Kernel: Use a dedicated thread state for wait-queued threads
Instead of using the generic block mechanism, wait-queued threads now
go into the special Queued state.

This fixes an issue where signal dispatch would unblock a wait-queued
thread (because signal dispatch unblocks blocked threads) and cause
confusion since the thread only expected to be awoken by the queue.
2019-12-01 16:02:58 +01:00
Andreas Kling f067730f6b Kernel: Add a WaitQueue for Thread queueing/waking and use it for Lock
The kernel's Lock class now uses a proper wait queue internally instead
of just having everyone wake up regularly to try to acquire the lock.

We also keep the donation mechanism, so that whenever someone tries to
take the lock and fails, that thread donates the remainder of its
timeslice to the current lock holder.

After unlocking a Lock, the unlocking thread calls WaitQueue::wake_one,
which unblocks the next thread in queue.
2019-12-01 12:07:43 +01:00
Andreas Kling f75a6b9daa Kernel: Demangle kernel C++ symbols correctly again
I broke this while implementing module linking. Also move the actual
demangling work to AK, in AK::demangle(const char*)
2019-11-29 14:59:15 +01:00
Andreas Kling e34ed04d1e Kernel+LibPthread+LibC: Create secondary thread stacks in userspace
Have pthread_create() allocate a stack and passing it to the kernel
instead of this work happening in the kernel. The more of this we can
do in userspace, the better.

This patch also unexposes the raw create_thread() and exit_thread()
syscalls since they are now only used by LibPthread anyway.
2019-11-17 17:29:20 +01:00
Andreas Kling 794758df3a Kernel: Implement some basic stack pointer validation
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.

If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.

Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
2019-11-17 12:15:43 +01:00
Andreas Kling 73d6a69b3f Kernel: Release the big process lock while yielding in sys$yield()
Otherwise, a thread calling sched_yield() will prevent other threads
in that process from entering the kernel.
2019-11-16 12:18:59 +01:00
Andreas Kling cb5021419e Kernel: Move Thread::m_joinee_exit_value into the JoinBlocker
There's no need for this to be a permanent Thread member. Just use a
reference in the JoinBlocker instead.
2019-11-14 21:04:34 +01:00
Andreas Kling 69efa3f630 Kernel+LibPthread: Implement pthread_join()
It's now possible to block until another thread in the same process has
exited. We can also retrieve its exit value, which is whatever value it
passed to pthread_exit(). :^)
2019-11-14 20:58:23 +01:00
Sergey Bugaev 1e1ddce9d8 Kernel: Unwind kernel stacks before dying
While executing in the kernel, a thread can acquire various resources
that need cleanup, such as locks and references to RefCounted objects.
This cleanup normally happens on the exit path, such as in destructors
for various RAII guards. But we weren't calling those exit paths when
killing threads that have been executing in the kernel, such as threads
blocked on reading or sleeping, thus causing leaks.

This commit changes how killing threads works. Now, instead of killing
a thread directly, one is supposed to call thread->set_should_die(),
which will unblock it and make it unwind the stack if it is blocked
in the kernel. Then, just before returning to the userspace, the thread
will automatically die.
2019-11-14 20:05:58 +01:00
Andreas Kling 083c5f8b89 Kernel: Rework Process::Priority into ThreadPriority
Scheduling priority is now set at the thread level instead of at the
process level.

This is a step towards allowing processes to set different priorities
for threads. There's no userspace API for that yet, since only the main
thread's priority is affected by sched_setparam().
2019-11-06 16:30:06 +01:00
Andreas Kling 49635e62fa LibELF: Move AK/ELF/ into Libraries/LibELF/
Let's arrange things like this instead. It didn't feel right for all of
the ELF handling code to live in AK.
2019-11-06 13:42:38 +01:00
Drew Stratford 5efbb4ae95 Kernel: Fix bug in Thread::dispatch_signal().
dispatch_signal() expected a RegisterDump on the kernel stack. However
in certain cases, like just after a clone, this was not the case and
dispatch_signal() would instead write to an incorrect user stack pointer.

We now use the threads TSS in situations where the RegisterDump may not
be valid, fixing the issue.
2019-11-04 10:12:59 +01:00
Drew Stratford 44f22c99ef Thread.cpp: add method get_RegisterDump_from_stack().
This refactors some the RegisterDump code from dispatch_signal
into a stand-alone function, allowing for better reuse.
2019-11-04 10:12:59 +01:00
Andreas Kling cc68654a44 Kernel+LibC: Implement clock_gettime() and clock_nanosleep()
Only the CLOCK_MONOTONIC clock is supported at the moment, and it only
has millisecond precision. :^)
2019-11-02 19:34:06 +01:00
Andreas Kling 904c871727 Kernel: Allow userspace stacks to grow up to 4 MB by default
Make userspace stacks lazily allocated and allow them to grow up to
4 megabytes. This avoids a lot of silly crashes we were running into
with software expecting much larger stacks. :^)
2019-10-31 13:57:07 +01:00
Andrew Kaster 98c86e5109 Kernel: Move E2BIG calculation from Thread to Process
Thread::make_userspace_stack_for_main_thread is only ever called from
Process::do_exec, after all the fun ELF loading and TSS setup has
occured.

The calculations in there that check if the combined argv + envp
size will exceed the default stack size are not used in the rest of
the stack setup. So, it should be safe to move this to the beginning
of do_exec and bail early with -E2BIG, just like the man pages say.

Additionally, advertise this limit in limits.h to be a good POSIX.1
citizen. :)
2019-10-23 07:45:41 +02:00
Andreas Kling 40beb4c5c0 Kernel: Don't leak an FPU state buffer for every spawned thread
We were leaking 512 bytes of kmalloc memory for every new thread.
This patch fixes that, and also makes sure to zero out the FPU state
buffer after allocating it, and finally also makes the LogStream
operator<< for Thread look a little bit nicer. :^)
2019-10-13 14:36:55 +02:00
Drew Stratford c136fd3fe2 Kernel: Send SIGSEGV on seg-fault
Now programs can catch the SIGSEGV signal when they segfault.

This commit also introduced the send_urgent_signal_to_self method,
which is needed to send signals to a thread when handling exceptions
caused by the same thread.
2019-10-07 16:39:47 +02:00
Andreas Kling d5f3972012 Kernel: No need to manually deallocate kernel stack Region in ~Thread()
Since we're keeping this Region in an OwnPtr, it will be torn down when
we get to ~OwnPtr anyway.
2019-09-27 19:10:52 +02:00
Drew Stratford b65bedd610 Kernel: Change m_blockers to m_blocker.
Because of the way signals now work there should
not be more than one blocker per thread. This
changes the blocker and thread class to reflect
that.
2019-09-09 08:35:43 +02:00
Drew Stratford e529042895 Kernel: Remove reduntant kernel/user signal stacks.
Due to the changes in signal handling m_kernel_stack_for_signal_handler_region
and m_signal_stack_user_region are no longer necessary, and so, have been
removed. I've also removed the similarly reduntant m_tss_to_resume_kernel.
2019-09-09 08:35:43 +02:00
Andreas Kling e386579436 Kernel: Fix bitrotted code behind #ifdef SIGNAL_DEBUG 2019-09-08 14:29:59 +02:00
Andreas Kling 899233a925 Kernel: Handle running programs that don't have a TLS image
Programs without a PT_TLS header won't have a master TLS image for us
to copy, so we shouldn't try to copy the m_master_tls_region then.
2019-09-07 17:06:25 +02:00
Andreas Kling ec6bceaa08 Kernel: Support thread-local storage
This patch adds support for TLS according to the x86 System V ABI.
Each thread gets a thread-specific memory region, and the GS segment
register always points _to a pointer_ to the thread-specific memory.

In other words, to access thread-local variables, userspace programs
start by dereferencing the pointer at [gs:0].

The Process keeps a master copy of the TLS segment that new threads
should use, and when a new thread is created, they get a copy of it.
It's basically whatever the PT_TLS program header in the ELF says.
2019-09-07 15:55:36 +02:00
Drew Stratford 95fe775d81 Kernel: Add SysV stack alignment to signal trampoline
In both dispatch signal and asm_signal_trampoline we
now ensure that the stack is 16 byte aligned, as per
the System V ABI.
2019-09-05 16:37:09 +02:00
Drew Stratford 81d0f96f20 Kernel: Use user stack for signal handlers.
This commit drastically changes how signals are handled.

In the case that an unblocked thread is signaled it works much
in the same way as previously. However, when a blocking syscall
is interrupted, we set up the signal trampoline on the user
stack, complete the blocking syscall, return down the kernel
stack and then jump to the handler. This means that from the
kernel stack's perspective, we only ever get one system call deep.

The signal trampoline has also been changed in order to properly
store the return value from system calls. This is necessary due
to the new way we exit from signaled system calls.
2019-09-05 16:37:09 +02:00
Drew Stratford 259a1d56b0 Thread: added member m_kernel_stack_top.
This value stores the top of a threads kernel_stack.
2019-09-05 16:37:09 +02:00
Andreas Kling 77737be7b3 Kernel: Stop eagerly loading entire executables
We were forced to do this because the page fault code would fall apart
when trying to generate a backtrace for a non-current thread.

This issue has been fixed for a while now, so let's go back to lazily
loading executable pages which should make everything a little better.
2019-08-15 10:29:44 +02:00
Andreas Kling 83fdad25ed Kernel: For signal-killed threads, dump backtrace from finalizer thread
Instead of dumping the dying thread's backtrace in the signal handling
code, wait until we're finalizing the thread. Since signalling happens
during scheduling, the less work we do there the better.

Basically the less that happens during a scheduler pass the better. :^)
2019-08-06 19:45:08 +02:00
Andreas Kling 5e01ebfc56 Kernel: Clean up thread stacks when a thread dies
We were forgetting where we put the userspace thread stacks, so added a
member called Thread::m_userspace_thread_stack to keep track of it.

Then, in ~Thread(), we now deallocate the userspace, kernel and signal
stacks (if present.)

Out of curiosity, the "init_stage2" process doesn't have a kernel stack
which I found surprising. :^)
2019-08-01 20:17:12 +02:00
Andreas Kling 3ad6ae1842 Kernel: Delete non-main threads immediately after finalizing them
Previously we would wait until the whole process died before actually
deleting its threads.
2019-08-01 20:01:23 +02:00
Andreas Kling be4d33fb2c Kernel+LibC: A lot of the signal handling code was off-by-one.
There is no signal 0. The valid ones are 1 (SIGHUP) through 31 (SIGSYS)
Found by PVS-Studio.
2019-08-01 11:03:48 +02:00
Andreas Kling a79d8d8ae5 Kernel: Add (expensive) but valuable userspace symbols to stacks.
This is expensive because we have to page in the entire executable for every
process up front for this to work. This is due to the page fault code not
being strong enough to run while another process is active.

Note that we already had userspace symbols in *crash* stacks. This patch
adds them generally, so they show up in /proc, Process Manager, etc.

There's room for improvement here, but the debugging benefits way overshadow
the performance penalty right now. :^)
2019-07-27 12:02:56 +02:00
Andreas Kling 4316fa8123 Kernel: Dump backtrace to debugger for DefaultSignalAction::DumpCore.
This makes assertion failures generate backtraces again. Sorry to everyone
who suffered from the lack of backtraces lately. :^)

We share code with the /proc/PID/stack implementation. You can now get the
current backtrace for a Thread via Thread::backtrace(), and all the traces
for a Process via Process::backtrace().
2019-07-25 21:02:19 +02:00
Robin Burchell 342f7a6b0f Move runnable/non-runnable list control entirely over to Scheduler
This way, we can change how the scheduler works without having to change Thread too.
2019-07-22 09:42:39 +02:00
Robin Burchell dea7f937bf Scheduler: Allow reentry into block()
With the presence of signal handlers, it is possible that a thread might
be blocked multiple times. Picture for instance a signal handler using
read(), or wait() while the thread is already blocked elsewhere before
the handler is invoked.

To fix this, we turn m_blocker into a chain of handlers. Each block()
call now prepends to the list, and unblocking will only consider the
most recent (first) blocker in the chain.

Fixes #309
2019-07-21 12:42:22 +02:00
Robin Burchell d48c73b10a Thread: Cleanup m_blocker handling
The only two places we set m_blocker now are Thread::set_state(), and
Thread::block(). set_state is mostly just an issue of clarity: we don't
want to end up with state() != Blocked with an m_blocker, because that's
weird. It's also possible: if we yield, someone else may set_state() us.

We also now set_state() and set m_blocker under lock in block(), rather
than unlocking which might allow someone else to mess with our internals
while we're in the process of trying to block.

This seems to fix sending STOP & CONT causing a panic.

My guess as to what was happening is this:

    thread A blocks in select(): Blocking & m_blocker != nullptr
    thread B sends SIGSTOP: Stopped & m_blocker != nullptr
    thread B sends SIGCONT: we continue execution. Runnable & m_blocker != nullptr
    thread A tries to block in select() again:
        * sets m_blocker
        * unlocks (in block_helper)
        * someone else tries to unblock us? maybe from the old m_blocker? unclear -- clears m_blocker
        * sets Blocked (while unlocked!)

So, thread A is left with state Blocked & m_blocker == nullptr, leading
to the scheduler assert (m_blocker != nullptr) failing.

Long story short, let's do all our data management with the lock _held_.
2019-07-20 19:31:52 +02:00
Robin Burchell 96de90ceef Net: Merge Thread::wait_for_connect into LocalSocket (as the only place that uses it)
Also do this more like other blockers, don't call yield ourselves, as
block will do that for us.
2019-07-20 12:15:24 +02:00
Robin Burchell 833d444cd8 Thread: Return a result from block() indicating why the block terminated
And use this to return EINTR in various places; some of which we were
not handling properly before.

This might expose a few bugs in userspace, but should be more compatible
with other POSIX systems, and is certainly a little cleaner.
2019-07-20 12:15:24 +02:00
Andreas Kling f8beb0f665 Kernel: Share the "return to ring 0/3 from signal" trampolines globally.
Generate a special page containing the "return from signal" trampoline code
on startup and then route signalled threads to it. This avoids a page
allocation in every process that ever receives a signal.
2019-07-19 17:01:16 +02:00
Robin Burchell 53262cd08b AK: Introduce IntrusiveList
And use it in the scheduler.

IntrusiveList is similar to InlineLinkedList, except that rather than
making assertions about the type (and requiring inheritance), it
provides an IntrusiveListNode type that can be used to put an instance
into many different lists at once.

As a proof of concept, port the scheduler over to use it. The only
downside here is that the "list" global needs to know the position of
the IntrusiveListNode member, so we have to position things a little
awkwardly to make that happen. We also move the runnable lists to
Thread, to avoid having to publicize the node.
2019-07-19 15:42:30 +02:00
Andreas Kling 705cd2491c Kernel: Some small refinements to the thread blockers.
Committing some things my hands did while browsing through this code.

- Mark all leaf classes "final".
- FileDescriptionBlocker now stores a NonnullRefPtr<FileDescription>.
- FileDescriptionBlocker::blocked_description() now returns a reference.
- ConditionBlocker takes a Function&&.
2019-07-19 13:19:47 +02:00
Robin Burchell e74dce65e6 Thread: Normalize all for_each constructs to use IterationDecision
This way a caller can abort the for_each early if they want.
2019-07-19 13:19:02 +02:00
Robin Burchell cd76b691fb Kernel: Remove memory allocations from the new Blocker API 2019-07-19 11:03:22 +02:00
Robin Burchell 99c5377653 Kernel: Remove old block(State) API
New API should be used always :)
2019-07-19 11:03:22 +02:00
Robin Burchell 762333ba95 Kernel: Restore state strings for block states
"Blocking" is not terribly informative, but now that everything is
ported over, we can force the blocker to provide us with a reason.

This does mean that to_string(State) needed to become a member, but
that's OK.
2019-07-19 11:03:22 +02:00
Robin Burchell b13f1699fc Kernel: Rename Condition state to Blocked now we only have one blocking mechanism :) 2019-07-19 11:03:22 +02:00
Robin Burchell d2ca91c024 Kernel: Convert BlockedSignal and BlockedLurking to the new Blocker mechanism
The last two of the old block states gone :)
2019-07-19 11:03:22 +02:00
Robin Burchell 52743f9eec Kernel: Rename ThreadBlocker classes to avoid stutter
Thread::ThreadBlockerFoo is a lot less nice to read than Thread::FooBlocker
2019-07-19 11:03:22 +02:00
Robin Burchell 782e4ee6e1 Kernel: Port wait to ThreadBlocker 2019-07-19 11:03:22 +02:00
Robin Burchell 4f9ae9b970 Kernel: Port select to ThreadBlocker 2019-07-19 11:03:22 +02:00
Robin Burchell 32fcfb79e9 Kernel: Port sleep to ThreadBlocker 2019-07-19 11:03:22 +02:00
Robin Burchell 0c8813e6d9 Kernel: Introduce ThreadBlocker as a way to make unblocking neater :)
And port all the descriptor-based blocks over to it as a proof of concept.
2019-07-19 11:03:22 +02:00
Robin Burchell f2fdac789c Kernel: Add a new block state for accept() on a blocking socket
Rather than asserting, which really ruins everyone's day.
2019-07-18 10:56:49 +02:00
Robin Burchell 4f94fbc9e1 Kernel: Split SCHEDULER_DEBUG into a new SCHEDULER_RUNNABLE_DEBUG
And use dbgprintf() consistently on a few of the pieces of logging here.

This is useful when trying to track thread switching when you don't
really care about what it's switching _to_.
2019-07-17 14:23:15 +02:00
Andreas Kling b2e502e533 Kernel: Add Thread::block_until(Condition).
Replace the class-based snooze alarm mechanism with a per-thread callback.
This makes it easy to block the current thread on an arbitrary condition:

    void SomeDevice::wait_for_irq() {
        m_interrupted = false;
        current->block_until([this] { return m_interrupted; });
    }
    void SomeDevice::handle_irq() {
        m_interrupted = true;
    }

Use this in the SB16 driver, and in NetworkTask :^)
2019-07-14 14:54:54 +02:00