Commit graph

62 commits

Author SHA1 Message Date
kleines Filmröllchen b645f87b7a Kernel: Overhaul system shutdown procedure
For a long time, our shutdown procedure has basically been:
- Acquire big process lock.
- Switch framebuffer to Kernel debug console.
- Sync and lock all file systems so that disk caches are flushed and
  files are in a good state.
- Use firmware and architecture-specific functionality to perform
  hardware shutdown.

This naive and simple shutdown procedure has multiple issues:
- No processes are terminated properly, meaning they cannot perform more
  complex cleanup work. If they were in the middle of I/O, for instance,
  only the data that already reached the Kernel is written to disk, and
  data corruption due to unfinished writes can therefore still occur.
- No file systems are unmounted, meaning that any important unmount work
  will never happen. This is important for e.g. Ext2, which has
  facilites for detecting improper unmounts (see superblock's s_state
  variable) and therefore requires a proper unmount to be performed.
  This was also the starting point for this PR, since I wanted to
  introduce basic Ext2 file system checking and unmounting.
- No hardware is properly shut down beyond what the system firmware does
  on its own.
- Shutdown is performed within the write() call that asked the Kernel to
  change its power state. If the shutdown procedure takes longer (i.e.
  when it's done properly), this blocks the process causing the shutdown
  and prevents any potentially-useful interactions between Kernel and
  userland during shutdown.

In essence, current shutdown is a glorified system crash with minimal
file system cleanliness guarantees.

Therefore, this commit is the first step in improving our shutdown
procedure. The new shutdown flow is now as follows:
- From the write() call to the power state SysFS node, a new task is
  started, the Power State Switch Task. Its only purpose is to change
  the operating system's power state. This task takes over shutdown and
  reboot duties, although reboot is not modified in this commit.
- The Power State Switch Task assumes that userland has performed all
  shutdown duties it can perform on its own. In particular, it assumes
  that all kinds of clean process shutdown have been done, and remaining
  processes can be hard-killed without consequence. This is an important
  separation of concerns: While this commit does not modify userland, in
  the future SystemServer will be responsible for performing proper
  shutdown of user processes, including timeouts for stubborn processes
  etc.
- As mentioned above, the task hard-kills remaining user processes.
- The task hard-kills all Kernel processes except itself and the
  Finalizer Task. Since Kernel processes can delay their own shutdown
  indefinitely if they want to, they have plenty opportunity to perform
  proper shutdown if necessary. This may become a problem with
  non-cooperative Kernel tasks, but as seen two commits earlier, for now
  all tasks will cooperate within a few seconds.
- The task waits for the Finalizer Task to clean up all processes.
- The task hard-kills and finalizes the Finalizer Task itself, meaning
  that it now is the only remaining process in the system.
- The task syncs and locks all file systems, and then unmounts them. Due
  to an unknown refcount bug we currently cannot unmount the root file
  system; therefore the task is able to abort the clean unmount if
  necessary.
- The task performs platform-dependent hardware shutdown as before.

This commit has multiple remaining issues (or exposed existing ones)
which will need to be addressed in the future but are out of scope for
now:
- Unmounting the root filesystem is impossible due to remaining
  references to the inodes /home and /home/anon. I investigated this
  very heavily and could not find whoever is holding the last two
  references.
- Userland cannot perform proper cleanup, since the Kernel's power state
  variable is accessed directly by tools instead of a proper userland
  shutdown procedure directed by SystemServer.

The recently introduced Firmware/PowerState procedures are removed
again, since all of the architecture-independent code can live in the
power state switch task. The architecture-specific code is kept,
however.
2023-07-15 00:12:01 +02:00
Liav A 9b8b8c0e04 Kernel: Simplify reboot & poweroff code flow a bit
Instead of using ifdefs to use the correct platform-specific methods, we
can just use the same pattern we use for the microseconds_delay function
which has specific implementations for each Arch CPU subdirectory.

When linking a kernel image, the actual correct and platform-specific
power-state changing methods will be called in Firmware/PowerState.cpp
file.
2023-06-27 20:04:42 +02:00
Liav A d550b09871 Kernel: Move PC BIOS-related code to the x86_64 architecture directory
All code that is related to PC BIOS should not be in the Kernel/Firmware
directory as this directory is for abstracted and platform-agnostic code
like ACPI (and device tree parsing in the future).

This fixes a problem with the aarch64 architecure, as these machines
don't have any PC-BIOS in them so actually trying to access these memory
locations (EBDA, BIOS ROM) does not make any sense, as they're specific
to x86 machines only.
2023-06-19 23:49:00 +02:00
Liav A 5fd975da8f Kernel: Move MultiProcessor parsing code to the Arch/x86_64 directory
This code is very x86-specific, because Intel introduced the actual
MultiProcessor specification back in 1993, qouted here as a proof:

"The MP specification covers PC/AT-compatible MP platform designs based
on Intel processor architectures and Advanced Programmable Interrupt
Controller (APIC) architectures"
2023-06-19 23:49:00 +02:00
Liav A 428afca32b Kernel/ACPI: Make most of StaticParsing methods to be platform-agnostic
Most of the ACPI static parsing methods (methods that can be called
without initializing a full AML parser) are not tied to any specific
platform or CPU architecture.

The only method that is platform-specific is the one that finds the RSDP
structure. Thus, each CPU architecture/platform needs to implement it.
This means that now aarch64 can implement its own method to find the
ACPI RSDP structure, which would be hooked into the rest of the ACPI
code elegantly, but for now I just added a FIXME and that method returns
empty value of Optional<PhysicalAddress>.
2023-06-19 23:49:00 +02:00
Liav A be16a91aec Kernel: Rename FirmwareSysFSDirectory => SysFSFirmwareDirectory
This matches how we give the pattern names for other classses for SysFS
components.
2023-06-19 23:49:00 +02:00
Liav A 336fb4f313 Kernel: Move InterruptDisabler to the Interrupts subdirectory 2023-06-04 21:32:34 +02:00
Liav A 8f21420a1d Kernel: Move all boot-related code to the new Boot subdirectory 2023-06-04 21:32:34 +02:00
Liav A 7c0540a229 Everywhere: Move global Kernel pattern code to Kernel/Library directory
This has KString, KBuffer, DoubleBuffer, KBufferBuilder, IOWindow,
UserOrKernelBuffer and ScopedCritical classes being moved to the
Kernel/Library subdirectory.

Also, move the panic and assertions handling code to that directory.
2023-06-04 21:32:34 +02:00
Liav A aaa1de7878 Kernel: Move {Virtual,Physical}Address classes to the Memory directory 2023-06-04 21:32:34 +02:00
Liav A 1f9d3a3523 Kernel/PCI: Hold a reference to DeviceIdentifier in the Device class
There are now 2 separate classes for almost the same object type:
- EnumerableDeviceIdentifier, which is used in the enumeration code for
  all PCI host controller classes. This is allowed to be moved and
  copied, as it doesn't support ref-counting.
- DeviceIdentifier, which inherits from EnumerableDeviceIdentifier. This
  class uses ref-counting, and is not allowed to be copied. It has a
  spinlock member in its structure to allow safely executing complicated
  IO sequences on a PCI device and its space configuration.
  There's a static method that allows a quick conversion from
  EnumerableDeviceIdentifier to DeviceIdentifier while creating a
  NonnullRefPtr out of it.

The reason for doing this is for the sake of integrity and reliablity of
the system in 2 places:
- Ensure that "complicated" tasks that rely on manipulating PCI device
  registers are done in a safe manner. For example, determining a PCI
  BAR space size requires multiple read and writes to the same register,
  and if another CPU tries to do something else with our selected
  register, then the result will be a catastrophe.
- Allow the PCI API to have a united form around a shared object which
  actually holds much more data than the PCI::Address structure. This is
  fundamental if we want to do certain types of optimizations, and be
  able to support more features of the PCI bus in the foreseeable
  future.

This patch already has several implications:
- All PCI::Device(s) hold a reference to a DeviceIdentifier structure
  being given originally from the PCI::Access singleton. This means that
  all instances of DeviceIdentifier structures are located in one place,
  and all references are pointing to that location. This ensures that
  locking the operation spinlock will take effect in all the appropriate
  places.
- We no longer support adding PCI host controllers and then immediately
  allow for enumerating it with a lambda function. It was found that
  this method is extremely broken and too much complicated to work
  reliably with the new paradigm being introduced in this patch. This
  means that for Volume Management Devices (Intel VMD devices), we
  simply first enumerate the PCI bus for such devices in the storage
  code, and if we find a device, we attach it in the PCI::Access method
  which will scan for devices behind that bridge and will add new
  DeviceIdentifier(s) objects to its internal Vector. Afterwards, we
  just continue as usual with scanning for actual storage controllers,
  so we will find a corresponding NVMe controllers if there were any
  behind that VMD bridge.
2023-01-26 23:04:26 +01:00
Liav A 91db482ad3 Kernel: Reorganize Arch/x86 directory to Arch/x86_64 after i686 removal
No functional change.
2022-12-28 11:53:41 +01:00
Liav A 5ff318cf3a Kernel: Remove i686 support 2022-12-28 11:53:41 +01:00
Timon Kruiper 9827c11d8b Kernel: Move InterruptDisabler out of Arch directory
The code in this file is not architecture specific, so it can be moved
to the base Kernel directory.
2022-10-17 20:11:31 +02:00
Liav A 462802ef0c Kernel/SysFS: Expose file size of ACPI tables in /sys/firmware/acpi
It costs us nothing, and some utilities (such as the known file utility)
rely on the exposed file size (after doing lstat on it), to show
anything useful besides saying the file is "empty".
2022-10-16 17:26:35 +02:00
Liav A 11a5f2c508 Kernel: Initialize primitive class member of ACPISysFSComponent to zero 2022-10-16 17:26:35 +02:00
minus cf48200e7b Kernel: Make the ACPI DSDT table accessible
Expose the DSDT table in ACPI::Parser and in
/sys/firmware/acpi as a first little step toward
interpreting the AML bytecode from ACPI.
2022-10-12 00:11:36 -06:00
Liav A 05ba034000 Kernel: Introduce the IOWindow class
This class is intended to replace all IOAddress usages in the Kernel
codebase altogether. The idea is to ensure IO can be done in
arch-specific manner that is determined mostly in compile-time, but to
still be able to use most of the Kernel code in non-x86 builds. Specific
devices that rely on x86-specific IO instructions are already placed in
the Arch/x86 directory and are omitted for non-x86 builds.

The reason this works so well is the fact that x86 IO space acts in a
similar fashion to the traditional memory space being available in most
CPU architectures - the x86 IO space is essentially just an array of
bytes like the physical memory address space, but requires x86 IO
instructions to load and store data. Therefore, many devices allow host
software to interact with the hardware registers in both ways, with a
noticeable trend even in the modern x86 hardware to move away from the
old x86 IO space to exclusively using memory-mapped IO.

Therefore, the IOWindow class encapsulates both methods for x86 builds.
The idea is to allow PCI devices to be used in either way in x86 builds,
so when trying to map an IOWindow on a PCI BAR, the Kernel will try to
find the proper method being declared with the PCI BAR flags.
For old PCI hardware on non-x86 builds this might turn into a problem as
we can't use port mapped IO, so the Kernel will gracefully fail with
ENOTSUP error code if that's the case, as there's really nothing we can
do within such case.

For general IO, the read{8,16,32} and write{8,16,32} methods are
available as a convenient API for other places in the Kernel. There are
simply no direct 64-bit IO API methods yet, as it's not needed right now
and is not considered to be Arch-agnostic too - the x86 IO space doesn't
support generating 64 bit cycle on IO bus and instead requires two 2
32-bit accesses. If for whatever reason it appears to be necessary to do
IO in such manner, it could probably be added with some neat tricks to
do so. It is recommended to use Memory::TypedMapping struct if direct 64
bit IO is actually needed.
2022-09-23 17:22:15 +01:00
Liav A 1b7b360ca1 Kernel: Move x86-specific IRQ controller code to Arch/x86 directory
The PIC and APIC code are specific to x86 platforms, so move them out of
the general Interrupts directory to Arch/x86/common/Interrupts directory
instead.
2022-09-20 18:43:05 +01:00
Liav A 485d4e01ed Kernel: Move VMWare backdoor communication code to the x86 directory
The VMWare backdoor handling code involves many x86-specific
instructions and therefore should be in the Arch/x86 directory. This
ensures we can easily omit the code in compile-time for non-x86 builds.
2022-09-20 18:43:05 +01:00
Andreas Kling 11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
Andreas Kling e475263113 AK+Kernel: Add AK::AtomicRefCounted and use everywhere in the kernel
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
2022-08-20 17:15:52 +02:00
Liav A 70afa0b171 Kernel/SysFS: Mark SysFSDirectory traverse and lookup methods as final
This enforces us to remove duplicated code across the SysFS code. This
results in great simplification of how the SysFS works now, because we
enforce one way to treat SysFSDirectory objects.
2022-07-15 12:29:23 +02:00
sin-ack 3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Tim Schumacher 3b3af58cf6 Kernel: Annotate all KBuffer and DoubleBuffer with a custom name 2022-07-12 00:55:31 +01:00
Liav A 9c6834698f Kerenl/Firmware: Add map_ebda and map_bios methods in the original place
In a previous commit I moved everything into the new subdirectories in
FileSystem/SysFS directory without trying to actually make changes in
the code itself too much. Now it's time to split the code to make it
more readable and understandable, hence this change occurs now.
2022-06-17 11:01:27 +02:00
Liav A 290eb53cb5 Kernel/SysFS: Stop cluttering the codebase with pieces of SysFS parts
Instead, start to put everything in one place to resemble the directory
structure of the SysFS when actually using it.
2022-06-17 11:01:27 +02:00
Timon Kruiper a4534678f9 Kernel: Implement InterruptDisabler using generic Processor functions
Now that the code does not use architectural specific code, it is moved
to the generic Arch directory and the paths are modified accordingly.
2022-06-02 13:14:12 +01:00
Liav A 02566d8091 Kernel: Move VMWareBackdoor to new directory in the Firmware directory 2022-04-20 19:21:32 +02:00
Idan Horowitz 086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Liav A ae2ec45e78 Kernel: Allow SysFS components to have non-zero size
This is important for dmidecode because it does an fstat on the DMI
blobs, trying to figure out their size. Because we already know the size
of the blobs when creating the SysFS components, there's no performance
penalty whatsoever, and this allows dmidecode to not use the /dev/mem
device as a fallback.
2022-04-01 11:27:19 +02:00
Liav A 66ff60db07 Kernel: Declare DMI SysFS BIOS classes as final 2022-04-01 11:27:19 +02:00
Liav A 338b4b27eb Kernel: Declare blob sizes of SysFS BIOS classes as const 2022-04-01 11:27:19 +02:00
Liav A 96aae59e9c Kernel: Initialize primitive data members of SysFS BIOS classes 2022-04-01 11:27:19 +02:00
Lenny Maiorani c6acf64558 Kernel: Change static constexpr variables to constexpr where possible
Function-local `static constexpr` variables can be `constexpr`. This
can reduce memory consumption, binary size, and offer additional
compiler optimizations.

These changes result in a stripped x86_64 kernel binary size reduction
of 592 bytes.
2022-02-09 21:04:51 +00:00
Idan Horowitz d5189b6677 Kernel: Replace {String => KString}::formatted in ACPISysFSDirectory 2022-01-21 16:27:21 +01:00
Idan Horowitz fdfdb5bd1c Kernel: Make ACPI reboot OOM-fallible 2022-01-21 16:27:21 +01:00
Idan Horowitz 3945e239e1 Kernel: Don't populate the ACPI SysFS directory with a disabled ACPI
This would cause a nullptr dereference on ACPI::Parser::the().
2022-01-18 21:00:46 +02:00
Idan Horowitz fb3e46e930 Kernel: Make map_typed() & map_typed_writable() fallible using ErrorOr
This mostly just moved the problem, as a lot of the callers are not
capable of propagating the errors themselves, but it's a step in the
right direction.
2022-01-13 22:40:25 +01:00
Idan Horowitz e2e5d4da16 Kernel: Make map_bios() and map_ebda() fallible using ErrorOr 2022-01-13 22:40:25 +01:00
mjz19910 3102d8e160 Everywhere: Fix many spelling errors 2022-01-07 10:56:59 +01:00
Brian Gianforcaro 6c66311ade Kernel: Use MUST + Vector::try_empend instead of Vector::empend
In preparation for making Vector::empend unavailable during
compilation of the Kernel.
2022-01-05 14:04:18 +01:00
Brian Gianforcaro 24066ba5ef Kernel: Use MUST + Vector::try_append instead of Vector::append
In preparation for making Vector::append unavailable during
compilation of the Kernel.
2022-01-05 14:04:18 +01:00
Tom 10efbfb09e Kernel: Scan ACPI memory ranges for the RSDP table
On some systems the ACPI RSDP table may be located in ACPI reserved
memory ranges rather than in the EBDA or BIOS areas.
2022-01-04 17:46:36 +00:00
Tom e70aa690d2 Kernel: Fix determining EBDA size
The first byte of the EBDA structure contains the size of the EBDA
in 1 KiB units. We were incorrectly using the word at offset 0x413
of the BDA which specifies the number of KiB before the EBDA structure.
2022-01-04 17:46:36 +00:00
Guilherme Goncalves 33b78915d3 Kernel: Propagate overflow errors from Memory::page_round_up
Fixes #11402.
2021-12-28 23:08:50 +01:00
Liav A 52e01b46eb Kernel: Move Multi Processor Parser code to a separate directory 2021-12-23 23:18:58 -08:00
Liav A bbdb55126c Kernel/SysFS: Don't allocate ACPISysFS components in constructors
Instead defer it to a method to be called after the construction of
ACPISysFSDirectory.
2021-12-14 09:01:33 +01:00
Liav A 381fdaa163 Kernel/SysFS: Make it clear that some components must be created in boot
Using the phrase "create" doesn't give information on whether the object
must be allocated or a failure to do so can be handled gracefully.
Therefore, we must use better phrase for such purpose, so "must_create"
for the allocate-and-construct static methods is definitely good choice.
2021-12-14 09:01:33 +01:00
Liav A 478f543899 Kernel/SysFS: Prevent allocation for component name during construction
Instead, allocate before constructing the object and pass NonnullOwnPtr
of KString to the object if needed. Some classes can determine their
names as they have a known attribute to look for or have a static name.
2021-12-14 09:01:33 +01:00