Commit graph

352 commits

Author SHA1 Message Date
Andreas Kling d6ef18f587 Kernel: Don't hog the MM lock while unmapping regions
We were holding the MM lock across all of the region unmapping code.
This was previously necessary since the quickmaps used during unmapping
required holding the MM lock.

Now that it's no longer necessary, we can leave the MM lock alone here.
2022-08-24 14:57:51 +02:00
Andreas Kling dc9d2c1b10 Kernel: Wrap RegionTree objects in SpinlockProtected
This makes locking them much more straightforward, and we can remove
a bunch of confusing use of AddressSpace::m_lock. That lock will also
be converted to use of SpinlockProtected in a subsequent patch.
2022-08-24 14:57:51 +02:00
Andreas Kling 6cd3695761 Kernel: Stop taking MM lock while using regular quickmaps
You're still required to disable interrupts though, as the mappings are
per-CPU. This exposed the fact that our CR3 lookup map is insufficiently
protected (but we'll address that in a separate commit.)
2022-08-22 17:56:03 +02:00
Andreas Kling c8375c51ff Kernel: Stop taking MM lock while using PD/PT quickmaps
This is no longer required as these quickmaps are now per-CPU. :^)
2022-08-22 17:56:03 +02:00
Andreas Kling a838fdfd88 Kernel: Make the page table quickmaps per-CPU
While the "regular" quickmap (used to temporarily map a physical page
at a known address for quick access) has been per-CPU for a while,
we also have the PD (page directory) and PT (page table) quickmaps
used by the memory management code to edit page tables. These have been
global, which meant that SMP systems had to keep fighting over them.

This patch makes *all* quickmaps per-CPU. We reserve virtual addresses
for up to 64 CPUs worth of quickmaps for now.

Note that all quickmaps are still protected by the MM lock, and we'll
have to fix that too, before seeing any real throughput improvements.
2022-08-22 17:56:03 +02:00
Andreas Kling 11eee67b85 Kernel: Make self-contained locking smart pointers their own classes
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:

- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable

This patch renames the Kernel classes so that they can coexist with
the original AK classes:

- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable

The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
2022-08-20 17:20:43 +02:00
Andreas Kling e475263113 AK+Kernel: Add AK::AtomicRefCounted and use everywhere in the kernel
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
2022-08-20 17:15:52 +02:00
kleines Filmröllchen 4314c25cf2 Kernel: Require lock rank for Spinlock construction
All users which relied on the default constructor use a None lock rank
for now. This will make it easier to in the future remove LockRank and
actually annotate the ranks by searching for None.
2022-08-19 20:26:47 -07:00
Liav A a1a1462a22 Kernel/Memory: Use scope guard to remove a region if we failed to map it 2022-08-19 15:26:04 +03:00
Andreas Kling 5ada38f9c3 Kernel: Reduce time under VMObject lock while handling zero faults
We only need to hold the VMObject lock while inspecting and/or updating
the physical page array in the VMObject.
2022-08-19 12:52:48 +02:00
Andreas Kling a84d893af8 Kernel/x86: Re-enable interrupts ASAP when handling page faults
As soon as we've saved CR2 (the faulting address), we can re-enable
interrupt processing. This should make the kernel more responsive under
heavy fault loads.
2022-08-19 12:14:57 +02:00
Andreas Kling 4bc3745ce6 Kernel: Make Region's physical page accessors safer to use
Region::physical_page() now takes the VMObject lock while accessing the
physical pages array, and returns a RefPtr<PhysicalPage>. This ensures
that the array access is safe.

Region::physical_page_slot() now VERIFY()'s that the VMObject lock is
held by the caller. Since we're returning a reference to the physical
page slot in the VMObject's physical page array, this is the best we
can do here.
2022-08-18 19:20:33 +02:00
Andreas Kling b560442fe1 Kernel: Don't hog VMObject lock when remapping a region page
We really only need the VMObject lock when accessing the physical pages
array, so once we have a strong pointer to the physical page we want to
remap, we can give up the VMObject lock.

This fixes a deadlock I encountered while building DOOM on SMP.
2022-08-18 18:56:35 +02:00
Andreas Kling 10399a258f Kernel: Move Region physical page accessors out of line 2022-08-18 18:52:34 +02:00
Andreas Kling c14dda14c4 Kernel: Add a comment about what the MM lock protects 2022-08-18 18:52:34 +02:00
Andreas Kling 75348bdfd3 Kernel: Don't require MM lock for Region::set_page_directory()
The MM lock is not required for this, it's just a simple ref-counted
pointer assignment.
2022-08-18 18:52:34 +02:00
Andreas Kling 27c1135d30 Kernel: Don't remap all regions from Region::remap_vmobject_page()
When handling a page fault, we only need to remap the faulting region in
the current process. There's no need to traverse *all* regions that map
the same VMObject and remap them cross-process as well.

Those other regions will get remapped lazily by their own page fault
handlers eventually. Or maybe they won't and we avoided some work. :^)
2022-08-18 18:52:34 +02:00
Andreas Kling 45e6123de8 Kernel: Shorten time under spinlocks while handling inode faults
- Instead of holding the VMObject lock across physical page allocation
  and quick-map + copy, we now only hold it when updating the VMObject's
  physical page slot.
2022-08-18 18:52:34 +02:00
Mike Akers de980de0e4 Kernel: Lock the inode before writing in SharedInodeVMObject::sync
We ensure that when we call SharedInodeVMObject::sync we lock the inode
lock before calling Inode virtual write_bytes method directly to avoid
assertion on the unlocked inode lock, as it was regressed recently. This
is not a complete fix as the need to lock from each path before calling
the write_bytes method should be avoided because it can lead to
hard-to-find bugs, and this commit only fixes the problem temporarily.
2022-08-16 16:54:03 +02:00
dylanbobb 8180211431 Kernel: Release 1 page instead of all pages when starved for pages
Previously, when starved for pages, *all* clean file-backed memory
would be released, which is quite excessive.

This patch instead releases just 1 page, since only 1 page is needed
to satisfy the request to `allocate_physical_page()`
2022-08-16 01:13:17 +02:00
dylanbobb 09d0fae2c6 Kernel: Allow release of a specific amount of of clean pages
Previously, we could only release *all* clean pages.
This patch makes it possible to release a specific amount of clean
pages. If the attempted number of pages to release is more than the
amount of clean pages, all clean pages will be released.
2022-08-16 01:13:17 +02:00
Idan Horowitz 4edae21bd1 Kernel: Remove regions from the region tree after failing to map them
At the point at which we try to map the Region it was already added to
the Process region tree, so we have to make sure to remove it before
freeing it in the mapping failure path, otherwise the tree will contain
a dangling pointer to the free'd instance.
2022-08-15 02:42:28 +02:00
Jorropo ec4b83326b Kernel: Don't release file-pages if volatile memory purge did it 2022-08-15 00:11:33 +02:00
Andreas Kling 3c7b0dab0b Kernel: Dump list of processes and their memory usage when OOMing 2022-08-14 23:33:28 +02:00
Andreas Kling 9e9924115f Kernel: Release some clean file-backed memory when starved for pages
Until now, our only backup plan when running out of physical pages
was to try and purge volatile memory. If that didn't work out, we just
hung userspace out to dry with an ENOMEM.

This patch improves the situation by also considering clean, file-backed
pages (that we could page back in from disk).

This could be better in many ways, but it already allows us to boot to
WindowServer with 256 MiB of RAM. :^)
2022-08-14 23:33:28 +02:00
Andreas Kling 92556e07d3 Kernel: Update outdated "user physical pages" terminology
These are now just "physical pages".
2022-08-14 23:33:28 +02:00
Brian Gianforcaro 2d06f6399f Kernel: Fix SMP deadlock in MM::allocate_contiguous_physical_pages
This deadlock was introduced with the creation of this API. The lock
order is such that we always need to take the page directory lock
before we ever take the MM lock.

This function violated that, as both Region creation and region
destruction require the pd and mm locks, but with the mm lock
already acquired we deadlocked with SMP mode enabled while other
threads were allocating regions.

With this change SMP boots to the desktop successfully for me,
(and then subsequently has other issues). :^)
2022-08-09 12:09:59 +02:00
Hendiadyoin1 66fc06001d Kernel: Add some inline capacity to find_regions_intersecting
This should avoid some allocations during simple cases of munmap,
mprotect and msync, where you usually don't have a lot of regions anyway
2022-07-15 12:42:43 +02:00
Liav A e4e5fa74d0 Kernel+Userland: Rename prefix of user_physical => physical
There's no such supervisor pages concept, so there's no need to call
physical pages with the "user_physical" prefix anymore.
2022-07-14 23:27:46 +02:00
Liav A 1c499e75bd Kernel+Userland: Remove supervisor pages concept
There's no real value in separating physical pages to supervisor and
user types, so let's remove the concept and just let everyone to use
"user" physical pages which can be allocated from any PhysicalRegion
we want to use. Later on, we will remove the "user" prefix as this
prefix is not needed anymore.
2022-07-14 23:27:46 +02:00
Liav A 37b4133c51 Kernel: Allocate user physical pages instead of supervisor ones for DMA
We are limited on the amount of supervisor pages we can allocate, so
don't allocate from that pool. Supervisor pages are always below 16 MiB
barrier so using those was crucial when we used devices like the ISA
SoundBlaster 16 card, because that device required very low physical
addresses to be used.
2022-07-14 13:15:24 +02:00
sin-ack 3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Idan Horowitz 8717e78918 Kernel: Stop committing pages for COW of uncommitted pages on sys$fork
Uncommitted pages (shared zero pages) can not contain any existing data
and can not be modified, so there's no point to committing a bunch of
extra pages to cover for them in the forked child.
2022-07-11 16:29:10 +02:00
Idan Horowitz 1d96c30488 Kernel: Stop leaking leftover committed cow pages from forked processes
Since both the parent process and child process hold a reference to the
COW committed set, once the child process exits, the committed COW
pages are effectively leaked, only being slowly re-claimed each time
the parent process writes to one of them, realizing it's no longer
shared, and uncommitting it.
In order to mitigate this we now hold a weak reference the parent
VMObject from which the pages are cloned, and we use it on destruction
when available to drop the reference to the committed set from it as
well.
2022-07-10 22:17:21 +03:00
Tim Schumacher cedec9751a Kernel: Decrease the amount of address space offset randomization
This is basically unchanged since the beginning of 2020, which is a year
before we had proper ASLR.

Now that we have a proper ASLR implementation, we can turn this down a
bit, as it is no longer our only protection against predictable dynamic
loader addresses, and it actually obstructs the default loading address
of x86_64 quite frequently.
2022-06-21 22:38:15 +01:00
Andrew Kaster 1d3b5d330d Kernel: Tolerate cloning MAP_STACK regions that are PROT_NONE
There's nothing stopping a userspace program from keeping a bunch of
threads around with a custom signal stack in a suspended state with
their normal thread stack mprotected to PROT_NONE.

OpenJDK seems to do this, for example.
2022-06-19 09:05:35 +02:00
Liav A 3d22917548 Kernel/Memory: Introduce the SharedFramebufferVMObject class
This new type of VMObject will be used to coordinate switching safely
from graphical mode to text mode and vice-versa, by supplying a way to
remap all Regions that were created with this object, so mappings can be
changed according to the given state of system mode. This makes it quite
easy to give applications like WindowServer the feeling of having full
access to the framebuffer device from a DisplayConnector, but still keep
the Kernel in control to be able to safely switch to text console.
2022-06-06 20:11:05 +01:00
Idan Horowitz ea1e5b630d Kernel: Verify system memory info consistency 2022-06-06 01:36:18 +03:00
Idan Horowitz b4e45a6636 Kernel: Tighten assertion in MM::find_free_user_physical_page
If our book-keeping of user physical pages is correct, we should always
find a physical page, regardless if it was committed or uncommitted.
2022-06-06 01:36:18 +03:00
Idan Horowitz 427f1f7e31 Kernel: Only use uncommitted pages when allocating contiguous user pages 2022-06-06 01:36:18 +03:00
Timon Kruiper a4534678f9 Kernel: Implement InterruptDisabler using generic Processor functions
Now that the code does not use architectural specific code, it is moved
to the generic Arch directory and the paths are modified accordingly.
2022-06-02 13:14:12 +01:00
Liav A e6ebf9e5c1 Kernel/Memory: Add TypedMapping base_address method
This method will be used to ease usage with the structure when we need
to do virtual pointer arithmetics.
2022-05-05 20:55:57 +02:00
Timon Kruiper feba7bc8a8 Kernel: Move Kernel/Arch/x86/SafeMem.h to Kernel/Arch/SafeMem.h
The file does not contain any specific architectural code, thus it can
be moved to the Kernel/Arch directory.
2022-05-03 21:53:36 +02:00
Tim Schumacher 71f2c342c4 Kernel: Limit free space between randomized memory allocations 2022-04-21 13:16:56 +02:00
Andreas Kling 0a83c03546 Kernel: Don't unregister Region from RegionTree *before* unmapping it
If we unregister from the RegionTree before unmapping, there's a race
where a new region can get inserted at the same address that we're about
to unmap. If this happens, ~Region() will then unmap the newly inserted
region, which now finds itself with cleared-out page table entries.
2022-04-05 13:46:50 +02:00
Andreas Kling a3db0ab14f Kernel: Remove MemoryManager::region_tree() accessor
Let's not have a way to grab at the RegionTree from outside of MM.
2022-04-05 13:45:10 +02:00
Andreas Kling f8d798b667 Kernel: Move allocate_unbacked_region_anywhere() to MemoryManager
This didn't need to be in RegionTree, and since it's specific to kernel
VM anyway, let's move it to MemoryManager.
2022-04-05 13:45:10 +02:00
Andreas Kling e0da8da657 Kernel: Move create_identity_mapped_region() to MemoryManager
This had no business being in RegionTree, since RegionTree doesn't track
identity-mapped regions anyway. (We allow *any* address to be identity
mapped, not just the ones that are part of the RegionTree's range.)
2022-04-05 13:45:10 +02:00
Andreas Kling cfb61cbd54 Kernel: Add RegionTree::find_region_containing(address or range)
Let's encapsulate looking up regions so clients don't have to dig into
RegionTree internals.
2022-04-05 12:23:47 +02:00
Andreas Kling da7ea2556e Kernel: Add RegionTree::remove(Region&)
This allows clients to remove a region from the tree without reaching
into the RegionTree internals.
2022-04-05 11:57:53 +02:00
Andreas Kling f0f97e1db0 Kernel: Take the RegionTree spinlock when inspecting tree from outside
This patch adds RegionTree::get_lock() which exposes the internal lock
inside RegionTree. We can then lock it from the outside when doing
lookups or traversal.

This solution is not very beautiful, we should find a way to protect
this data with SpinlockProtected or something similar. This is a stopgap
patch to try and fix the currently flaky CI.
2022-04-05 01:15:22 +02:00
Andreas Kling e3e1d79a7d Kernel: Remove unused ShouldDeallocateVirtualRange parameters
Since there is no separate virtual range allocator anymore, this is
no longer used for anything.
2022-04-05 01:15:22 +02:00
Andreas Kling 9bb45ab3a6 Kernel: Add debug logging to learn more about unexpected NP faults 2022-04-04 17:10:30 +02:00
Andreas Kling d1f2d63840 Kernel: Remove unused Region::try_create_kernel_only() 2022-04-04 12:34:13 +02:00
Idan Horowitz d6e4a25e0c Kernel: Use the InstrusiveRedBlackTree::begin_from(V&) API
This let's us skip an O(logn) tree traversal.
2022-04-04 00:16:11 +02:00
Idan Horowitz 30e6b313b4 Kernel: Remove false condition in RegionTree::allocate_range_specific
Since find_largest_not_above returns the highest region that is below
the end of the request range, no region after it can intersect with it.
2022-04-04 00:16:11 +02:00
Andreas Kling db75bab493 Kernel: Actually fix accidental overlaps in allocate_range_specific()
Thanks to Idan for spotting this! :^)
2022-04-03 23:58:57 +02:00
Andreas Kling 9765f9f67e Kernel: Fix accidental overlaps in RegionTree::allocate_range_specific()
Thanks to Idan for spotting this! :^)
2022-04-03 23:07:29 +02:00
Andreas Kling 92dfcdb6b1 Kenrel: Update a dmesgln() to say "RegionTree" instead of old class name 2022-04-03 22:00:19 +02:00
Andreas Kling 9e1da1f4f5 Kernel: Add a little explainer comment above RegionTree 2022-04-03 21:59:48 +02:00
Andreas Kling 8b01789ec4 Kernel: Improve RegionTree's internal helper function names
It's a bit nicer if functions that allocate ranges have some kind of
name that includes both "allocate" and "range". :^)
2022-04-03 21:56:16 +02:00
Andreas Kling 32dea6bde5 Kernel: Add missing include to PageDirectory.h 2022-04-03 21:51:58 +02:00
Andreas Kling 858b196c59 Kernel: Unbreak ASLR in the new RegionTree world
Functions that allocate and/or place a Region now take a parameter
that tells it whether to randomize unspecified addresses.
2022-04-03 21:51:58 +02:00
Andreas Kling e89c9ed2ca Kernel: Stop exposing RegionTree API for VM range allocation
...and remove the last remaining client of the API. It's no longer
possible to ask the RegionTree for a VM range. You can only ask it to
place your Region somewhere in available space.
2022-04-03 21:51:58 +02:00
Andreas Kling 07f3d09c55 Kernel: Make VM allocation atomic for userspace regions
This patch move AddressSpace (the per-process memory manager) to using
the new atomic "place" APIs in RegionTree as well, just like we did for
MemoryManager in the previous commit.

This required updating quite a few places where VM allocation and
actually committing a Region object to the AddressSpace were separated
by other code.

All you have to do now is call into AddressSpace once and it'll take
care of everything for you.
2022-04-03 21:51:58 +02:00
Andreas Kling e852a69a06 LibWeb: Make VM allocation atomic for kernel regions
Instead of first allocating the VM range, and then inserting a region
with that range into the MM region tree, we now do both things in a
single atomic operation:

    - RegionTree::place_anywhere(Region&, size, alignment)
    - RegionTree::place_specifically(Region&, address, size)

To reduce the number of things we do while locking the region tree,
we also require callers to provide a constructed Region object.
2022-04-03 21:51:58 +02:00
Andreas Kling cbf52d474c Kernel: Remove now-unused VirtualRangeAllocator
This has been replaced with the allocation-free RegionTree. :^)
2022-04-03 21:51:58 +02:00
Andreas Kling e8f543c390 Kernel: Use intrusive RegionTree solution for kernel regions as well
This patch ports MemoryManager to RegionTree as well. The biggest
difference between this and the userspace code is that kernel regions
are owned by extant OwnPtr<Region> objects spread around the kernel,
while userspace regions are owned by the AddressSpace itself.

For kernelspace, there are a couple of situations where we need to make
large VM reservations that never get backed by regular VMObjects
(for example the kernel image reservation, or the big kmalloc range.)
Since we can't make a VM reservation without a Region object anymore,
this patch adds a way to create unbacked Region objects that can be
used for this exact purpose. They have no internal VMObject.)
2022-04-03 21:51:58 +02:00
Andreas Kling ffe2e77eba Kernel: Add Memory::RegionTree to share code between AddressSpace and MM
RegionTree holds an IntrusiveRedBlackTree of Region objects and vends a
set of APIs for allocating memory ranges.

It's used by AddressSpace at the moment, and will be used by MM soon.
2022-04-03 21:51:58 +02:00
Andreas Kling 02a95a196f Kernel: Use AddressSpace region tree for range allocation
This patch stops using VirtualRangeAllocator in AddressSpace and instead
looks for holes in the region tree when allocating VM space.

There are many benefits:

- VirtualRangeAllocator is non-intrusive and would call kmalloc/kfree
  when used. This new solution is allocation-free. This was a source
  of unpleasant MM/kmalloc deadlocks.

- We consolidate authority on what the address space looks like in a
  single place. Previously, we had both the range allocator *and* the
  region tree both being used to determine if an address was valid.
  Now there is only the region tree.

- Deallocation of VM when splitting regions is no longer complicated,
  as we don't need to keep two separate trees in sync.
2022-04-03 21:51:58 +02:00
Andreas Kling 2617adac52 Kernel: Store AddressSpace memory regions in an IntrusiveRedBlackTree
This means we never need to allocate when inserting/removing regions
from the address space.
2022-04-03 21:51:58 +02:00
James Mintram d79c772c87 Kernel: Make MemoryManager compile on aarch64 2022-04-02 19:34:20 -07:00
James Mintram 6299a69253 Kernel: Make handle_crash available to aarch64 2022-04-02 19:34:20 -07:00
James Mintram d3b6201b40 Kernel: Make PageDirectory.cpp compile on aarch64 2022-04-02 19:34:20 -07:00
James Mintram 0d7eee625f Kernel: Make AddressSpace.cpp compile on aarch64 2022-04-02 19:34:20 -07:00
James Mintram 627fd231d5 Kernel: Make Region.cpp compile on aarch64 2022-04-02 19:34:20 -07:00
Idan Horowitz 086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Idan Horowitz f0166efe8c Kernel: Use the whole kernel PD range when randomizing the KASLR offset
Now that we reclaim the memory range that is created by KASLR before
the start of the kernel image, there's no need to be conservative with
the KASLR offset.
2022-03-23 19:49:49 +02:00
Idan Horowitz e18632660f Kernel: Use the pre-image kernel memory range introduced by KASLR
This ensures we don't just waste the memory range between the default
base load address and the actual load address that was shifted by the
KASLR offset.
2022-03-22 16:46:51 +01:00
Lenny Maiorani 190cf1507b Kernel: Use default constructors/destructors
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#cother-other-default-operation-rules

"The compiler is more likely to get the default semantics right and
you cannot implement these functions better than the compiler."
2022-03-17 00:51:36 -07:00
Idan Horowitz e37e4a7980 Kernel: Make Inode::set_shared_vmobject() OOM-fallible
Allocating a WeakPtr can fail, so this let's us properly propagate said
failure.
2022-02-14 11:35:20 +01:00
Idan Horowitz 75fe51a9ca Kernel: Stop trying to write unmapped Process regions into CoreDumps
If we crashed in the middle of mapping in Regions, some of the regions
may not have a page directory yet, and will result in a crash when
Region::remap() is called.
2022-02-11 17:49:46 +02:00
Idan Horowitz 57bce8ab97 Kernel: Set up Regions before adding them to a Process's AddressSpace
This reduces the amount of time in which not fully-initialized Regions
are present inside an AddressSpace's region tree.
2022-02-11 17:49:46 +02:00
Idan Horowitz d9d3362722 Kernel: Make SharedInodeVMObject pages Bitmap allocation OOM-fallible 2022-02-11 17:49:46 +02:00
Idan Horowitz 8030e2a88f Kernel: Make AnonymousVMObject COW-Bitmap allocation OOM-fallible 2022-02-11 17:49:46 +02:00
Idan Horowitz 871a53db76 AK: Make Bitmap construction OOM-fallible 2022-02-11 17:49:46 +02:00
Andreas Kling 2ff9db0245 Kernel: Make contiguous VM objects use "user physical pages" by default
If someone specifically wants contiguous memory in the low-physical-
address-for-DMA range ("super pages"), they can use the
allocate_dma_buffer_pages() helper.
2022-02-11 12:45:38 +01:00
Lenny Maiorani c6acf64558 Kernel: Change static constexpr variables to constexpr where possible
Function-local `static constexpr` variables can be `constexpr`. This
can reduce memory consumption, binary size, and offer additional
compiler optimizations.

These changes result in a stripped x86_64 kernel binary size reduction
of 592 bytes.
2022-02-09 21:04:51 +00:00
Idan Horowitz 8289727fac Kernel: Stop using the make<T> factory method in the Kernel
As make<T> is infallible, it really should not be used anywhere in the
Kernel. Instead replace with fallible `new (nothrow)` calls, that will
eventually be error-propagated.
2022-02-03 23:33:20 +01:00
Andreas Kling d85f062990 Revert "Kernel: Only update page tables for faulting region"
This reverts commit 1c5ffaae41.

This broke shared memory as used by OutOfProcessWebView. Let's do
a revert until we can figure out what went wrong.
2022-02-02 11:02:54 +01:00
Andreas Kling 1c5ffaae41 Kernel: Only update page tables for faulting region
When a page fault led to the mapping of a new physical page, we were
updating the page tables for *every* region that shared the same
underlying VMObject.

Let's just not do that, avoiding a bunch of unnecessary page table
updates and TLB invalidations.
2022-02-02 02:16:49 +01:00
Andreas Kling a44316fa8b Kernel: Release page directory and MM locks sooner in space finalization
We don't need to hold these locks when tearing down the region tree.
Release them as soon as unmapping is finished.
2022-01-30 16:21:59 +01:00
Andreas Kling 3845c90e08 Kernel: Remove unnecessary includes from Thread.h
...and deal with the fallout by adding missing includes everywhere.
2022-01-30 16:21:59 +01:00
Lenny Maiorani b0a54518d8 Everywhere: Remove redundant inline keyword
`constexpr` implies `inline` so when both are used it is redundant.
2022-01-29 21:45:17 +02:00
Andreas Kling a12e19c015 Kernel: Move kernel region checks from x86 page fault handler to MM
Ideally the x86 fault handler would only do x86 specific things and
delegate the rest of the work to MemoryManager. This patch moves some of
the address checks to a more generic place.
2022-01-28 23:41:18 +01:00
Andreas Kling 5092813a45 Kernel: Quickly reject userspace addresses in kernel_region_from_vaddr()
This avoids taking and releasing the MM lock just to reject an address
that we can tell from just looking at it that it won't ever be in the
kernel regions tree.
2022-01-28 23:41:18 +01:00
Idan Horowitz 5146315a15 Kernel: Convert MemoryManager::allocate_user_physical_page to ErrorOr
This allows is to use the TRY macro at the call sites, instead of using
clunky null checks.
2022-01-28 19:05:52 +02:00
Idan Horowitz bd5b56cab0 Kernel: Make allocate_supervisor_physical_page OOM-fallible 2022-01-28 19:05:52 +02:00
Idan Horowitz 4d2f1a05ec Kernel: Make allocate_contiguous_supervisor_physical_pages OOM-fallible 2022-01-28 19:05:52 +02:00
Idan Horowitz 956824afe2 Kernel: Use memset instead of fast_u32_fill in MemoryManager zero fills
When the values we're setting are not actually u32s and the size of the
area we're setting is PAGE_SIZE-aligned and a multiple of PAGE_SIZE in
size, there's no point in using fast_u32_fill, as that forces us to use
STOSDs instead of STOSQs.
2022-01-28 19:05:52 +02:00
Idan Horowitz 1e941fc3cc Kernel: Make VirtualRangeAllocator::carve_from_region OOM-fallible 2022-01-26 22:05:34 +00:00
Tom 6e46e21c42 Kernel: Implement Page Attribute Table (PAT) support and Write-Combine
This allows us to enable Write-Combine on e.g. framebuffers,
significantly improving performance on bare metal.

To keep things simple we right now only use one of up to three bits
(bit 7 in the PTE), which maps to the PA4 entry in the PAT MSR, which
we set to the Write-Combine mode on each CPU at boot time.
2022-01-26 09:21:04 +02:00
Idan Horowitz a6f0ab358a Kernel: Make AddressSpace::find_regions_intersecting OOM-fallible 2022-01-26 02:37:03 +02:00
Idan Horowitz bd603003b5 Kernel: Make AddressSpace::amount_clean_inode() OOM-fallible 2022-01-26 02:37:03 +02:00
Marco Cutecchia a7ce0c2297 Kernel: Add missing #include <AK/Badge.h> to MemoryManager.h
This missing header broke the aarch64 build
2022-01-23 14:31:50 +01:00
Idan Horowitz d65347d39d Kernel: Make Memory::RingBuffer construction fallible 2022-01-21 16:27:21 +01:00
Andreas Kling df34f7b90b Kernel: Use an IntrusiveRedBlackTree for kernel regions
We were already using a non-intrusive RedBlackTree, and since the kernel
regions tree is non-owning, this is a trivial conversion that makes a
bunch of the tree operations infallible (by being allocation-free.) :^)
2022-01-16 23:31:01 +01:00
creator1creeper1 326c6130a5 Kernel: Don't access directory table of uninitialized PageDirectory
PageDirectory gets initialized step-by-step in
PageDirectory::try_create_for_userspace(). This initialization may fail
anywhere in this function - for example, we may not be able to
allocate a directory table, in which case
PageDirectory::try_create_for_userspace() will return a null pointer.
We recognize this condition and early-return ENOMEM. However, at this
point, we need to correctly destruct the only partially initialized
PageDirectory. Previously, PageDirectory::~PageDirectory() would assume
that the object it was destructing was always fully initialized. It now
uses the new helper PageDirectory::is_cr3_initialized() to correctly
recognize when the directory table was not yet initialized. This helper
checks if the pointer to the directory table is null. Only if it is not
null does the destructor try to fetch the directory table using
PageDirectory::cr3().
2022-01-16 12:08:57 -08:00
creator1creeper1 1a2f71581b Kernel: Remove infallible VMObject resource factory functions
These infallible resource factory functions were only there to ease the
conversion to the new factory functions. Since all child classes of
VMObject now use the fallible resource factory functions, we don't
need the infallible versions anymore.
2022-01-15 22:16:00 +02:00
creator1creeper1 2a4e410b63 Kernel: Make SharedInodeVMObject construction OOM-aware
This commit moves the allocation of the resources required for
SharedInodeVMObject from its constructors to its factory functions.

We're making this change to expose the fallibility of the allocation.
2022-01-15 22:16:00 +02:00
creator1creeper1 9a1dfe70fe Kernel: Make PrivateInodeVMObject construction OOM-aware
This commit moves the allocation of the resources required for
PrivateInodeVMObject from its constructors to its factory functions.

We're making this change to expose the fallibility of the allocation.
2022-01-15 22:16:00 +02:00
creator1creeper1 ad480ff18b Kernel: Make InodeVMOBject construction OOM-aware
This commit moves the allocation of the resources required for
InodeVMObject from its constructors to the constructors of its child
classes.

We're making this change to give the child classes the chance to expose
the fallibility of the allocation.
2022-01-15 22:16:00 +02:00
creator1creeper1 3879e70447 Kernel: Make AnonymousVMObject construction OOM-aware
This commit moves the allocation of the resources required for
AnonymousVMObject from its constructors to its factory functions.

We're making this change to expose the fallibility of the allocation.
2022-01-15 22:16:00 +02:00
creator1creeper1 d1f265e851 Kernel: Make VMOBject construction OOM-aware
This commit moves the allocation of the resources required for VMObject
from its constructors to the constructors of its child classes.

We're making this change to give the child classes the chance to expose
the fallibility of the allocation.
2022-01-15 22:16:00 +02:00
Andreas Kling 094b88e6a5 Kernel: Don't remap already non-writable regions when they become CoW
The only purpose of the remap() in Region::try_clone() is to ensure
non-writable page table entries for CoW regions. If a region is already
non-writable, there's no need to waste time updating the page tables.
2022-01-15 19:51:16 +01:00
Andreas Kling c6adefcfc0 Kernel: Don't bother with page tables for PROT_NONE mappings
When mapping or unmapping completely inaccessible memory regions,
we don't need to update the page tables at all. This saves a bunch of
time in some situations, most notably during dynamic linking, where we
make a large VM reservation and immediately throw it away. :^)
2022-01-15 19:51:16 +01:00
Andreas Kling 9ddccbc6da Kernel: Use move() in Region::try_clone() to avoid a VMObject::unref() 2022-01-15 19:51:15 +01:00
Andreas Kling 8e0387e674 Kernel: Only register kernel regions with MemoryManager
We were already only tracking kernel regions, this patch just makes it
more clear by having it reflected in the name of the registration
helpers.

We also stop calling them for userspace regions, avoiding some spinlock
action in such cases.
2022-01-15 19:51:15 +01:00
Andreas Kling 4fa3c1bf2d Kernel: Remove old "region lookup cache" optimization
This optimization was added when region lookup was O(n), before we had
the O(log n) RedBlackTree. Let's remove it to simplify the code, as we
have no evidence that it remains valuable.
2022-01-15 19:51:15 +01:00
Idan Horowitz 6e37487477 Kernel: Always remove PageDirectories from the cr3 map on destruction
Previously we would only remove them from the map if they were attached
to an AddressSpace, even though we would always add them to the map on
construction. This results in an assertion failure on destruction if
the page directory was never attached to an AddressSpace. (for example,
on an allocation failure of said AddressSpace)
2022-01-15 11:04:07 +01:00
Idan Horowitz fb3e46e930 Kernel: Make map_typed() & map_typed_writable() fallible using ErrorOr
This mostly just moved the problem, as a lot of the callers are not
capable of propagating the errors themselves, but it's a step in the
right direction.
2022-01-13 22:40:25 +01:00
Andreas Kling de05223122 Kernel: Don't flush TLB after creating brand-new mappings
The CPU will not cache TLB entries for non-present mappings, so we
can simply skip flushing the TLB after creating entirely new mappings.
2022-01-13 11:22:11 +01:00
Idan Horowitz 618f123463 Kernel: Use StringView instead of String in RingBuffer's constructor
This String was being copied into a KString internally anyways.
2022-01-13 00:20:08 -08:00
Andreas Kling 5f71925aa4 Kernel: Actually clear page slots in Region::clear_to_zero()
We were copying the RefPtr<PhysicalPage> and zeroing the copy instead
of zeroing the slot itself.
2022-01-12 14:52:47 +01:00
Andreas Kling d8206c1059 Kernel: Don't release/relock spinlocks repeatedly during space teardown
Grab the page directory and MM locks once at the start of address space
teardown, then hold onto them across all the region unmapping work.
2022-01-12 14:52:47 +01:00
Andreas Kling 2323cdd914 Kernel: Do less unnecessary work when tearing down process address space
When deleting an entire AddressSpace, we don't need to do TLB flushes
at all (since the entire page directory is going away anyway).

We also don't need to deallocate VM ranges one by one, since the entire
VM range allocator will be deleted anyway.
2022-01-12 14:52:47 +01:00
Andreas Kling 24ecf1d021 Kernel: Remove redundant hash map of page tables in PageDirectory
The purpose of the PageDirectory::m_page_tables map was really just
to act as ref-counting storage for PhysicalPage objects that were
being used for the directory's page tables.

However, this was basically redundant, since we can find the physical
address of each page table from the page directory, and we can find the
PhysicalPage object from MemoryManager::get_physical_page_entry().
So if we just manually ref() and unref() the pages when they go in and
out of the directory, we no longer need PageDirectory::m_page_tables!

Not only does this remove a bunch of kmalloc() traffic, it also solves
a race condition that would occur when lazily adding a new page table
to a directory:

Previously, when MemoryManager::ensure_pte() would call HashMap::set()
to insert the new page table into m_page_tables, if the HashMap had to
grow its internal storage, it would call kmalloc(). If that kmalloc()
would need to perform heap expansion, it would end up calling
ensure_pte() again, which would clobber the page directory mapping used
by the outer invocation of ensure_pte().

The net result of the above bug would be that any invocation of
MemoryManager::ensure_pte() could erroneously return a pointer into
a kernel page table instead of the correct one!

This whole problem goes away when we remove the HashMap, as ensure_pte()
no longer does anything that allocates from the heap.
2022-01-10 16:22:37 +01:00
Andreas Kling bdbff9df24 Kernel: Don't relock MM lock for every page when remapping region
Make sure that callers already hold the MM lock, and we don't have to
worry about reacquiring it every time.
2022-01-10 16:22:37 +01:00
Hendiadyoin1 1cdace7898 Kernel: Add implied auto qualifiers in Memory 2022-01-09 23:29:57 -08:00
Pankaj Raghav 59da9bd0bd Kernel: Overload DMA helper without Physical Page output parameter
Not all drivers need the PhysicalPage output parameter while creating
a DMA buffer. This overload will avoid creating a temporary variable
for the caller
2022-01-09 00:45:38 +01:00
Pankaj Raghav e79f94f998 Kernel: Set Cacheable parameter to NO explicitly in DMA helpers
The cacheable parameter to allocate_kernel_region should be explicitly
set to No as this region is used to do physical memory transfers. Even
though most architectures ignore this even if it is set, it is better
to make this explicit.
2022-01-09 00:45:38 +01:00
creator1creeper1 3c05261611 AK+Everywhere: Make FixedArray OOM-safe
FixedArray now doesn't expose any infallible constructors anymore.
Rather, it exposes fallible methods. Therefore, it can be used for
OOM-safe code.
This commit also converts the rest of the system to use the new API.
However, as an example, VMObject can't take advantage of this yet,
as we would have to endow VMObject with a fallible static
construction method, which would require a very fundamental change
to VMObject's whole inheritance hierarchy.
2022-01-08 22:54:05 +01:00
Liav A ca254699ec Kernel: Implement read functionality for MemoryDevice
So far we only had mmap(2) functionality on the /dev/mem device, but now
we can also do read(2) on it.

The test unit was updated to check we are doing it safely.
2022-01-08 13:21:16 +02:00
Liav A 876559d283 Kernel: Change method name to clarify physical memory mmap validation 2022-01-08 13:21:16 +02:00
Liav A 3e066d380d Kernel/Memory: Remove needless VERIFY in /dev/mem mmap validation method
As it was pointed by Idan Horowitz, the rest of the method doesn't
assume we have any reserved ranges to allow mmap(2) to work on them, so
the VERIFY is not needed at all.
2022-01-07 19:13:27 +02:00
Tom 10efbfb09e Kernel: Scan ACPI memory ranges for the RSDP table
On some systems the ACPI RSDP table may be located in ACPI reserved
memory ranges rather than in the EBDA or BIOS areas.
2022-01-04 17:46:36 +00:00
Tom 190572b714 Kernel: Fix possible buffer overrun when scanning a MappedROM
If the length of the prefix was less than the chunk_size argument
we were potentionally reading past the mapped memory region.
2022-01-04 17:46:36 +00:00
Pankaj Raghav 602b35aa62 Kernel: Add DMA allocate functions that are TRY-able
Add DMA allocate buffer helper functions in MemoryManager.
2022-01-01 14:55:58 +01:00
Idan Horowitz be91b4fe3e Kernel: Support Mutex Protected lists in ListedRefCounted
This will allow us to support Mutex Protected lists like the custodies
list as well.
2021-12-29 12:04:15 +01:00
Guilherme Goncalves 33b78915d3 Kernel: Propagate overflow errors from Memory::page_round_up
Fixes #11402.
2021-12-28 23:08:50 +01:00
Andreas Kling ac7ce12123 Kernel: Remove the kmalloc_eternal heap :^)
This was a premature optimization from the early days of SerenityOS.
The eternal heap was a simple bump pointer allocator over a static
byte array. My original idea was to avoid heap fragmentation and improve
data locality, but both ideas were rooted in cargo culting, not data.

We would reserve 4 MiB at boot and only ended up using ~256 KiB, wasting
the rest.

This patch replaces all kmalloc_eternal() usage by regular kmalloc().
2021-12-28 21:02:38 +01:00
Andreas Kling 3399b6c57f Kernel: Remove old SlabAllocator :^)
This is no longer useful since kmalloc() does automatic slab allocation
without any of the limitations of the old SlabAllocator. :^)
2021-12-26 21:22:59 +01:00
Andreas Kling 43099fb387 Kernel: Remove all uses of MAKE_SLAB_ALLOCATED()
Objects that were previously allocated via slab_alloc()/slab_dealloc()
now go through kmalloc()/kfree_sized() instead.
2021-12-26 21:22:59 +01:00
Andreas Kling f7a4c34929 Kernel: Make kmalloc heap expansion kmalloc-free
Previously, the heap expansion logic could end up calling kmalloc
recursively, which was quite messy and hard to reason about.

This patch redesigns heap expansion so that it's kmalloc-free:

- We make a single large virtual range allocation at startup
- When expanding, we bump allocate VM from that region
- When expanding, we populate page tables directly ourselves,
  instead of going via MemoryManager.

This makes heap expansion a great deal simpler. However, do note that it
introduces two new flaws that we'll need to deal with eventually:

- The single virtual range allocation is limited to 64 MiB and once
  exhausted, kmalloc() will fail. (Actually, it will PANIC for now..)

- The kmalloc heap can no longer shrink once expanded. Subheaps stay
  in place once constructed.
2021-12-25 22:07:59 +01:00
Brian Gianforcaro 1c950773fb Kernel: Make MemoryManager::protect_ksyms_after_init UNMAP_AFTER_INIT
The function to protect ksyms after initialization, is only used during
boot of the system, so it can be UNMAP_AFTER_INIT as well.

This requires we switch the order of the init sequence, so we now call
`MM.protect_ksyms_after_init()` before `MM.unmap_text_after_init()`.
2021-12-24 14:28:59 -08:00
Guilherme Gonçalves da6aef9fff Kernel: Make msync return EINVAL when regions are too large
As a small cleanup, this also makes `page_round_up` verify its
precondition with `page_round_up_would_wrap` (which callers are expected
to call), rather than having its own logic.

Fixes #11297.
2021-12-23 17:43:12 -08:00
Daniel Bertalan 4195a7ef4b Kernel: Return EEXIST in VirtualRangeAllocator::try_allocate_specific()
This error only ever gets propagated to the userspace if
MAP_FIXED_NOREPLACE is requested, as MAP_FIXED unmaps intersecting
ranges beforehand, and non-fixed mmap() calls will just fall back to
allocating anywhere.

Linux specifies MAP_FIXED_NOREPLACE to return EEXIST when it can't
allocate, we now match that behavior.
2021-12-23 23:08:10 +01:00
Brian Gianforcaro b8e210deea Kernel: Initialize PhysicalRegion::m_large_zones, remove m_small_zones
Found by PVS Studio Static Analysis.
2021-12-22 13:29:31 -08:00
Idan Horowitz 5f4a67434c Kernel: Move userspace virtual address range base to 0x10000
Now that the shared bottom 2 MiB virtual address mappings are gone
userspace can use lower virtual addresses.
2021-12-22 00:02:36 -08:00
Idan Horowitz fccd0432a1 Kernel: Don't share the bottom 2 MiB of kernel mappings with processes
Now that the last 2 users of these mappings (the Prekernel and the APIC
ap boot environment) were removed, these are no longer used.
2021-12-22 00:02:36 -08:00