Commit graph

124 commits

Author SHA1 Message Date
Andreas Kling 2d1bcce34a Kernel: Fix triple-fault when clicking on SystemServer in SystemMonitor
The fault was happening when retrieving a current backtrace for the
SystemServer process.

To generate a backtrace, we go into the paging scope of the process,
meaning we temporarily switch to using its page directory as our own.

Because kernel VM is allocated on demand, it's possible for a process's
mappings above the 3GB mark to be out-of-date. Normally this just gets
fixed up transparently by the page fault handler (which simply copies
the PDE from the canonical MM.kernel_page_directory() into the current
process.)

However, if the current kernel *stack* is in a piece of memory that
the backtraced process lacks up-to-date PDE's for, we still get a page
fault, but are unable to handle it, since the CPU wants to push to the
stack as part of calling the page fault handler. So we're screwed and
it's a triple-fault.

Fix this by always updating the kernel VM mappings before switching
into a paging scope. In practical terms, this is a 1KB memcpy() that
happens when generating a backtrace, or doing exec().
2019-11-27 12:40:42 +01:00
Andreas Kling 5b8cf2ee23 Kernel: Make syscall counters and page fault counters per-thread
Now that we show individual threads in SystemMonitor and "top",
it's also very nice to have individual counters for the threads. :^)
2019-11-26 21:37:38 +01:00
Andreas Kling 3dc87be891 Kernel: Mark mmap()-created regions with a special bit
Then only allow regions with that bit to be manipulated via munmap()
and mprotect(). This prevents messing with non-mmap()ed regions in
a process's address space (stacks, shared buffers, ...)
2019-11-24 12:26:21 +01:00
Andreas Kling 9a157b5e81 Revert "Kernel: Move Kernel mapping to 0xc0000000"
This reverts commit bd33c66273.

This broke the network card drivers, since they depended on kmalloc
addresses being identity-mapped.
2019-11-23 17:27:09 +01:00
Jesse Buhagiar bd33c66273 Kernel: Move Kernel mapping to 0xc0000000
The kernel is now no longer identity mapped to the bottom 8MiB of
memory, and is now mapped at the higher address of `0xc0000000`.

The lower ~1MiB of memory (from GRUB's mmap), however is still
identity mapped to provide an easy way for the kernel to get
physical pages for things such as DMA etc. These could later be
mapped to the higher address too, as I'm not too sure how to
go about doing this elegantly without a lot of address subtractions.
2019-11-22 16:23:23 +01:00
Andreas Kling 794758df3a Kernel: Implement some basic stack pointer validation
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.

If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.

Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
2019-11-17 12:15:43 +01:00
Liav A bce510bf6f Kernel: Fix the search method of free userspace physical pages (#742)
Now the userspace page allocator will search through physical regions,
and stop the search as it finds an available page.

Also remove an "address of" sign since we don't need that when
counting size of physical regions
2019-11-08 22:39:29 +01:00
supercomputer7 c3c905aa6c Kernel: Removing hardcoded offsets from Memory Manager
Now the kernel page directory and the page tables are located at a
safe address, to prevent from paging data colliding with garbage.
2019-11-08 17:38:23 +01:00
Andreas Kling 19398cd7d5 Kernel: Reorganize memory layout a bit
Move the kernel image to the 1 MB physical mark. This prevents it from
colliding with stuff like the VGA memory. This was causing us to end
up with the BIOS screen contents sneaking into kernel memory sometimes.

This patch also bumps the kmalloc heap size from 1 MB to 3 MB. It's not
the perfect permanent solution (obviously) but it should get the OOM
monkey off our backs for a while.
2019-11-04 12:04:35 +01:00
Andreas Kling a6e9119537 Kernel: Tweak some outdated kprintfs in Region 2019-11-04 00:48:45 +01:00
Andreas Kling d67c6a92db Kernel: Move page fault handling from MemoryManager to Region
After the page fault handler has found the region in which the fault
occurred, do the rest of the work in the region itself.

This patch also makes all fault types consistently crash the process
if a new page is needed but we're all out of pages.
2019-11-04 00:47:03 +01:00
Andreas Kling 0e8f1d7cb6 Kernel: Don't expose a region's page directory to the outside world
Now that region manages its own mapping/unmapping, there's no need for
the outside world to be able to grab at its page directory.
2019-11-04 00:26:00 +01:00
Andreas Kling 6ed9cc4717 Kernel: Remove Region API's for setting/unsetting the page directory
This is done implicitly by mapping or unmapping the region.
2019-11-04 00:24:20 +01:00
Andreas Kling e3dda4e87b Kernel: Fix weird Region constructor that took nullable RefPtr<Inode>
It's never valid to construct a Region with a null Inode pointer using
this constructor, so just take a NonnullRefPtr<Inode> instead.
2019-11-04 00:21:08 +01:00
Andreas Kling 9b2dc36229 Kernel: Merge MemoryManager::map_region_at_address() into Region::map() 2019-11-04 00:05:57 +01:00
Andreas Kling 98b328754e Kernel: Fix bad setup of CoW faults for offset regions
Regions with an offset into their VMObject were incorrectly adding the
page offset when indexing into the CoW bitmap.
2019-11-03 23:54:35 +01:00
Andreas Kling 5b7f8634e3 Kernel: Set the G (global) bit for kernel page tables
Since the kernel page tables are shared between all processes, there's
no need to (implicitly) flush the TLB for them on every context switch.

Setting the G bit on kernel page tables allows the CPU to keep the
translation caches around.
2019-11-03 23:51:55 +01:00
Andreas Kling 4bf1a72d21 Kernel: Teach Region how to remap itself
Now remapping (i.e flushing kernel metadata to the CPU page tables)
is done by simply calling Region::remap().
2019-11-03 21:11:08 +01:00
Andreas Kling 3dce0f23f4 Kernel: Regions should be mapped into a PageDirectory, not a Process
This patch changes the parameter to Region::map() to be a PageDirectory
since that matches how we think about the memory model:

Regions are views onto VMObjects, and are mapped into PageDirectories.
Each Process has a PageDirectory. The kernel also has a PageDirectory.
2019-11-03 21:11:08 +01:00
Andreas Kling 2cfc43c982 Kernel: Move region map/unmap operations into the Region class
The more Region can take care of itself, the better.
2019-11-03 21:11:08 +01:00
Andreas Kling a221cddeec Kernel: Clean up a bunch of wrong-looking Region/VMObject code
Since a Region is merely a "window" onto a VMObject, it can both begin
and end at a distance from the VMObject's boundaries.
Therefore, we should always be computing indices into a VMObject's
physical page array by adding the Region's "first_page_index()".

There was a whole bunch of code that forgot to do that. This fixes
many wrong behaviors for Regions that start part-way into a VMObject.
2019-11-03 15:44:13 +01:00
Andreas Kling fe455c5ac4 Kernel: Move page remapping into Region::remap_page(index)
Let Region deal with this, instead of everyone calling MemoryManager.
2019-11-03 15:32:11 +01:00
Andreas Kling b0321bf290 Kernel: Zero-fill faults should not temporarily enable interrupts
We were doing a temporary STI/CLI in MemoryManager::zero_page() to be
able to acquire the VMObject's lock before zeroing out a page.

This logic was inherited from the inode fault handler, where we need
to enable interrupts anyway, since we might need to interact with the
underlying storage device.

Zero-fill faults don't actually need to lock the VMObject, since they
are already guaranteed exclusivity by interrupts being disabled when
entering the fault handler.

This is different from inode faults, where a second thread can often
get an inode fault for the same exact page in the same VMObject before
the first fault handler has received a response from the disk.
This is why the lock exists in the first place, to prevent this race.

This fixes an intermittent crash in sys$execve() that was made much
more visible after I made userspace stacks lazily allocated.
2019-11-01 17:59:47 +01:00
Tom 00a7c48d6e APIC: Enable APIC and start APs 2019-10-16 19:14:02 +02:00
Andreas Kling 35138437ef Kernel+SystemMonitor: Add fault counters
This patch adds three separate per-process fault counters:

- Inode faults

    An inode fault happens when we've memory-mapped a file from disk
    and we end up having to load 1 page (4KB) of the file into memory.

- Zero faults

    Memory returned by mmap() is lazily zeroed out. Every time we have
    to zero out 1 page, we count a zero fault.

- CoW faults

    VM objects can be shared by multiple mappings that make their own
    unique copy iff they want to modify it. The typical reason here is
    memory shared between a parent and child process.
2019-10-02 14:13:49 +02:00
Andreas Kling d481ae95b5 Kernel: Defer creation of Region CoW bitmaps until they're needed
Instead of allocating and populating a Copy-on-Write bitmap for each
Region up front, wait until we actually clone the Region for sharing
with another process.

In most cases, we never need any CoW bits and we save ourselves a lot
of kmalloc() memory and time.
2019-10-01 19:58:41 +02:00
Andreas Kling c58d1868cb Kernel: Fix munmap() bad splitting of already-split Regions
When splitting an Region that's already the result of an earlier split,
we have to take the Region's offset-in-VMObject into account since it
may be non-zero.
2019-10-01 11:40:40 +02:00
Andreas Kling ac20919b13 Kernel: Make it possible to turn off VM guard pages at compile time
This might be useful for debugging since guard pages introduce a fair
amount of noise in the virtual address space.
2019-09-30 17:22:16 +02:00
Conrad Pankoff fa20a447a9 Kernel: Repair unaligned regions supplied by the boot loader
We were just blindly trusting that the bootloader would only give us
page-aligned memory regions. This is apparently not always the case,
so now we can try to repair those regions.

Fixes #601
2019-09-28 09:23:52 +02:00
Andreas Kling 2584636d19 Kernel: Fix partial munmap() deallocating still-in-use VM
We were always returning the full VM range of the partially-unmapped
Region to the range allocator. This caused us to re-use those addresses
for subsequent VM allocations.

This patch also skips creating a new VMObject in partial munmap().
Instead we just make split regions that point into the same VMObject.

This fixes the mysterious GCC ICE on large C++ programs.
2019-09-27 20:21:52 +02:00
Andreas Kling 7f9a33dba1 Kernel: Make Region single-owner instead of ref-counted
This simplifies the ownership model and makes Region easier to reason
about. Userspace Regions are now primarily kept by Process::m_regions.

Kernel Regions are kept in various OwnPtr<Regions>'s.

Regions now only ever get unmapped when they are destroyed.
2019-09-27 14:25:42 +02:00
Andreas Kling 9c549c178a Kernel: Pad virtual address space allocations with guard pages
Put one unused page on each side of VM allocations to make invalid
accesses more likely to generate crashes.

Note that we will not add this guard padding for mmap() at a specific
memory address, only to "mmap it anywhere" requests.
2019-09-22 15:12:29 +02:00
Conrad Pankoff 224fbb7910 Kernel: Fix returning pages to regions >= 2GB 2019-09-17 09:27:23 +02:00
Conrad Pankoff 9c5e3cd818 Kernel: Ignore memory the bootloader gives us above 2^32 2019-09-17 09:27:23 +02:00
Andreas Kling a34f3a3729 Kernel: Fix some bitrot in MemoryManager debug logging code 2019-09-16 14:45:44 +02:00
Andreas Kling 5d491fa1cd Kernel: Add a simple slab allocator for small allocations
This is a freelist allocator with static size classes that works as a
complement to the generic kmalloc(). It's a lot faster than kmalloc()
since allocation just means popping from the freelist.

It's also significantly more compact when there are a lot of objects
smaller than the minimum kmalloc chunk size (32 bytes.)

This patch enables it for the Region and PhysicalPage classes.
In the PhysicalPage (8 bytes) case, it's a huge improvement since we
no longer waste 75% of the storage allocated.

There are also a number of ways this can be improved, so let's keep
working on it going forward.
2019-09-16 10:33:27 +02:00
Andreas Kling 1c692e87a6 Kernel: Move kmalloc() into a Kernel/Heap/ directory 2019-09-16 09:01:44 +02:00
Andreas Kling e60bbadbbc Kernel: Add LogStream operator<< for PhysicalAddress 2019-09-15 20:47:49 +02:00
Andreas Kling a40afc4562 Kernel: Get rid of MemoryManager::allocate_page_table()
We can just use the physical page allocator directly, there's no need
for a dedicated function for page tables.
2019-09-15 20:34:03 +02:00
Andreas Kling 73fdbba59c AK: Rename <AK/AKString.h> to <AK/String.h>
This was a workaround to be able to build on case-insensitive file
systems where it might get confused about <string.h> vs <String.h>.

Let's just not support building that way, so String.h can have an
objectively nicer name. :^)
2019-09-06 15:36:54 +02:00
Andreas Kling bf43d94a2f Kernel: Disable interrupts throughout ~Region()
We don't want an interrupt handler to access the VM data structures
while their internal consistency is broken.
2019-09-05 11:15:05 +02:00
Andreas Kling e25ade7579 Kernel: Rename "vmo" to "vmobject" everywhere 2019-09-04 11:27:14 +02:00
Andreas Kling 0e53b1d1ad Kernel: Add some convenient getters to Region
Add getters for the underlying Range, the access bits, and also add
contains(Range) which just wraps m_range.contains().
2019-08-29 20:55:40 +02:00
Andreas Kling 10e0e13bf3 Kernel: Add LogStream operator<< for Range 2019-08-29 20:54:50 +02:00
Andreas Kling dde10f534f Revert "Kernel: Avoid a memcpy() of the whole block when paging in from inode"
This reverts commit 11896d0e26.

This caused a race where other processes using the same InodeVMObject
could end up accessing the newly-mapped physical page before we've
actually filled it with bytes from disk.

It would be nice to avoid these copies without breaking anything.
2019-08-26 13:50:55 +02:00
Andreas Kling f5d779f47e Kernel: Never forcibly page in entire executables
We were doing this for the initial kernel-spawned userspace process(es)
to work around instability in the page fault handler. Now that the page
fault handler is more robust, we can stop worrying about this.

Specifically, the page fault handler was previous not able to handle
getting a page fault in anything but the currently executing task's
page directory.
2019-08-26 13:20:01 +02:00
Andreas Kling e29fd3cd20 Kernel: Display virtual addresses as V%p instead of L%x
The L was a leftover from when these were called linear addresses.
2019-08-26 11:31:58 +02:00
Andreas Kling 11896d0e26 Kernel: Avoid a memcpy() of the whole block when paging in from inode 2019-08-25 14:34:53 +02:00
Andreas Kling b018cd653f Kernel: Fix oversized InodeVMObject after inode size changes 2019-08-24 19:35:47 +02:00
Andreas Kling 179158bc97 Kernel: Put debug spam about already-paged-in inode pages behind #ifdef 2019-08-19 17:30:36 +02:00