A couple of things were changed:
1. Semantic changes - PCI segments are now called PCI domains, to better
match what they really are. It's also the name that Linux uses, and it
seems that Wikipedia also uses this name.
We also remove PCI::ChangeableAddress, because it is no longer used
anywhere.
2. There are no WindowedMMIOAccess or MMIOAccess classes anymore, as
they added a lot of unnecessary complexity. Windowed access is removed
entirely (it was tested, but never benchmarked), so we are left with
IO access and memory access options. The memory access option
essentially maps the PCI bus (from the chosen PCI domain) to virtual
memory as-is. This means that at any time only one PCI bus is mapped,
and the mapping is changed when access to another PCI bus in the same
PCI domain is needed. For now, we don't support mapping PCI buses from
different PCI domains at the same time, because it's still a non-issue
for most machines out there.
3. OOM-safety is increased, especially when constructing the Access
object. We now pre-allocate any needed resources, and we try to find
PCI domains (if requested to initialize memory access) after we
attempt to construct the Access object, so it's possible to fail at
this point "gracefully".
4. All PCI API functions are now separated into a different header file,
which means only "clients" of the PCI subsystem API will need to include
that header file.
5. Functional changes - we now only allow enumerating the bus after
a hardware scan. This means that the old method "enumerate_hardware"
is removed, so when initializing an Access object, the initializing
function must call rescan on it to force it to find devices. This makes
it possible for rescan to fail, and also to defer it until after
construction, which helps both OOM-safety and hotplug support.
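As a rough illustration of the memory access option in item 2, here is
a sketch of the address arithmetic for memory-mapped PCI configuration
space, where each bus covers a 1 MiB window. The helper names are
hypothetical and not the kernel's actual API, and the sketch assumes
the given base address corresponds to bus 0 of the domain.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.
// Each bus covers 1 MiB: 32 devices x 8 functions x 4 KiB per function.
constexpr uint64_t bus_window_size = 1u << 20;

// Physical address of the 1 MiB window of `bus`.
// Assumption: `bus0_base` is the address of the window of bus 0.
constexpr uint64_t bus_window_address(uint64_t bus0_base, uint8_t bus)
{
    return bus0_base + static_cast<uint64_t>(bus) * bus_window_size;
}

// Offset of a configuration register inside the currently mapped window.
constexpr uint32_t offset_in_bus_window(uint8_t device, uint8_t function, uint16_t field)
{
    return (static_cast<uint32_t>(device & 0x1f) << 15)
        | (static_cast<uint32_t>(function & 0x7) << 12)
        | (field & 0xfffu);
}
```

With only one window mapped at a time, accessing a device on another
bus means remapping the window to bus_window_address() of that bus
first; the offset calculation stays the same.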
This expands the reach of error propagation greatly throughout the
kernel. Sadly, it also exposes the fact that we're allocating (and
doing other fallible things) in constructors all over the place.
This patch doesn't attempt to address that of course. That's work for
our future selves.
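To illustrate the constructor problem mentioned above, here is a
minimal sketch of moving a fallible allocation out of a constructor and
into a factory function. The Buffer class is hypothetical, and
std::optional merely stands in for whatever error type the kernel
actually propagates.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <optional>
#include <utility>

// Hypothetical class, for illustration only.
class Buffer {
public:
    // Fallible factory: allocation failure is reported to the caller
    // instead of happening inside a constructor that cannot return errors.
    static std::optional<Buffer> try_create(std::size_t size)
    {
        std::unique_ptr<unsigned char[]> data(new (std::nothrow) unsigned char[size]);
        if (!data)
            return std::nullopt;
        return Buffer(std::move(data), size);
    }

    std::size_t size() const { return m_size; }

private:
    // The constructor only takes ownership, so it cannot fail.
    Buffer(std::unique_ptr<unsigned char[]> data, std::size_t size)
        : m_data(std::move(data))
        , m_size(size)
    {
    }

    std::unique_ptr<unsigned char[]> m_data;
    std::size_t m_size { 0 };
};
```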
The default template argument is only used in one place, and it
looks like it was probably just an oversight. The rest of the Kernel
code all uses u8 as the type. So let's make that the default and remove
the unused template argument, as there doesn't seem to be a reason to
allow the size to be customizable.
Now that the old PCI::Device was removed, we can complete the PCI
changes by renaming PCI::DeviceController to PCI::Device.
The only real distinction between the two was about interrupts, but
since this is no longer a problem, just rename it to simplify things
further.
...and also RangeAllocator => VirtualRangeAllocator.
This clarifies that the ranges we're dealing with are *virtual* memory
ranges and not anything else.
Problem:
- New `all_of` implementation takes the entire container so the user
  does not need to pass explicit begin/end iterators. It is currently
  unused except in tests.
Solution:
- Make use of the new and more user-friendly version where possible.
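For illustration, a minimal sketch of what a container-taking `all_of`
can look like next to the iterator-pair version. This is not the
project's actual implementation; the `sketch` namespace and the use of
std::begin/std::end are assumptions made to keep the example
self-contained.

```cpp
#include <iterator>
#include <vector>

namespace sketch {

// Iterator-pair version: the caller passes begin/end explicitly.
template<typename Iterator, typename Predicate>
bool all_of(Iterator begin, Iterator end, Predicate const& predicate)
{
    for (; begin != end; ++begin) {
        if (!predicate(*begin))
            return false;
    }
    return true;
}

// Container version: forwards to the iterator-pair version.
template<typename Container, typename Predicate>
bool all_of(Container const& container, Predicate const& predicate)
{
    return sketch::all_of(std::begin(container), std::end(container), predicate);
}

}
```

A call site then shrinks from all_of(v.begin(), v.end(), pred) to
all_of(v, pred).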
We don't need to have a dedicated API for creating a VMObject with a
single page, the multi-page API option works in all cases.
Also make the API take a Span<NonnullRefPtr<PhysicalPage>> instead of
a NonnullRefPtrVector<PhysicalPage>.
The `#pragma GCC diagnostic` part is needed because the class has
virtual methods with the same name but different arguments, and Clang
tries to warn us that we are not actually overriding anything with
these.
Weirdly enough, GCC does not seem to care.
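A minimal sketch of the pattern, using hypothetical classes. The exact
warning being silenced is an assumption here: `-Woverloaded-virtual`
is the Clang diagnostic for a virtual method that hides a base class
virtual of the same name.

```cpp
// Hypothetical classes, for illustration only.
class Base {
public:
    virtual ~Base() = default;
    virtual int read(int offset) { return offset; }
};

#pragma GCC diagnostic push
// Assumption: this is the warning in question.
#pragma GCC diagnostic ignored "-Woverloaded-virtual"
class Derived : public Base {
public:
    // Same name, different arguments: this hides Base::read(int)
    // instead of overriding it.
    virtual int read(int offset, int length) { return offset + length; }
};
#pragma GCC diagnostic pop
```

Clang honors `#pragma GCC diagnostic`, so the same pragma works for
both compilers.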
This fixes a bug that occurs when the controller's ports are not
(internally) numbered sequentially.
This is done by checking the bits set in PI.
This bug was found on bare-metal, on a laptop with 1 Port that
was reported as port 4.
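A minimal sketch of the idea, with a hypothetical helper name: rather
than assuming ports 0..N-1 exist, walk the bits of the PI (Ports
Implemented) register and use the index of each set bit.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the index of every port whose bit is
// set in the PI register.
std::vector<unsigned> implemented_ports(uint32_t pi)
{
    std::vector<unsigned> ports;
    for (unsigned index = 0; index < 32; ++index) {
        if (pi & (1u << index))
            ports.push_back(index);
    }
    return ports;
}
```

For the laptop mentioned above, a PI value with only bit 4 set yields
a single port with index 4.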
If we are in a shared interrupt handler, the called handlers might
indicate it was not their interrupt, so we should not increment the
call counter of these handlers.
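A rough sketch of the intended behavior, with hypothetical types that
are not the kernel's actual interrupt handler classes: each handler
reports whether the interrupt was its own, and only then is its
counter incremented.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical handler record, for illustration only.
struct Handler {
    bool (*handle)();       // returns true only if the interrupt was ours
    std::size_t call_count; // number of interrupts this handler claimed
};

// Calls every handler sharing the line and returns whether any of them
// claimed the interrupt.
bool dispatch_shared_interrupt(std::vector<Handler>& handlers)
{
    bool handled = false;
    for (auto& handler : handlers) {
        if (handler.handle()) {
            ++handler.call_count;
            handled = true;
        }
    }
    return handled;
}
```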
This has a quirk with the AMD Hudson-2 SATA controller. [1022:7801]
Having this flag set makes the controller become stuck in a busy loop.
I decided to remove the flag instead of making it a quirk as it still
works with Qemu, VirtualBox, VMware Player and the Intel Wildcat
Point-LP SATA Controller [8086:9c83] without it, thus making it simpler
to just remove it.
Partial fix for #7738 (as it still does not work in IDE mode)
This change allows the controller to utilize interrupts even if no
device was connected to a port when we initialize it, so we can support
hotplug events now.
This proved to be a problematic option. I tested it on a bare metal
AHCI controller, and if we didn't reset the controller, the firmware
(SeaBIOS) could leave the controller in an unclean state, so a plugged
device signature was in place although the specific port had no
plugged device after rebooting.
Therefore, we need to ensure we always use the controller in a clean
state.
In addition to that, the Complete option was renamed to Aggressive, as
it better represents the consequences of choosing this option.
Fixes off-by-one caused by reading the register directly
without adding a 1 to it, because AHCI reports 1 less port than
the actual number of ports supported.
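A minimal sketch of the corrected read, with a hypothetical function
name. The NP field lives in bits 4:0 of the CAP register and is
zero-based, so the port count is the field value plus one.

```cpp
#include <cstdint>

// Hypothetical helper: CAP.NP (bits 4:0) is a zero-based value, so a
// field value of 0 means one port and 31 means 32 ports.
constexpr unsigned supported_port_count(uint32_t cap)
{
    return (cap & 0x1fu) + 1;
}
```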
On my machine, it only sets PRC and not PCC.
Confirmed to happen on:
- 8086:9ca2 (Intel Corporation Wildcat Point-LP SATA Controller
[AHCI Mode] (rev 03))
On my bare metal machine, enabling it at this point causes it to
instantly send an interrupt, and we're too early in the process
to be able to handle AHCI interrupts. The interrupts were being
enabled in the initialize function anyway.
Confirmed to happen on:
- 8086:9ca2 (Intel Corporation Wildcat Point-LP SATA Controller
[AHCI Mode] (rev 03))
- 8086:3b22 (Intel Corporation 5 Series/3400 Series Chipset
6 port SATA AHCI Controller (rev 06))
AnonymousVMObject::create_with_physical_page(s) can't be NonnullRefPtr
as it allocates internally. Fixing the API then surfaced an issue in
ScatterGatherList, where the code was attempting to create an
AnonymousVMObject in the constructor which will not be observable
during OOM.
Fix all of these issues and start propagating errors at the callers
of the AnonymousVMObject and ScatterGatherList APIs.
This code was unlocking the lock directly, but the Locker is still
attached, causing the lock to be unlocked an extra time, hence
corrupting the internal lock state.
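The bug pattern can be sketched with two hypothetical classes. These
are not the kernel's Lock and Locker, just a counter standing in for
the lock state and a scoped guard that unlocks in its destructor.

```cpp
// Hypothetical stand-in for a lock: only tracks how often it is held.
class CountingLock {
public:
    void lock() { ++m_depth; }
    void unlock() { --m_depth; }
    int depth() const { return m_depth; }

private:
    int m_depth { 0 };
};

// Hypothetical scoped guard: locks on construction, unlocks on destruction.
class ScopedLocker {
public:
    explicit ScopedLocker(CountingLock& lock)
        : m_lock(lock)
    {
        m_lock.lock();
    }
    ~ScopedLocker() { m_lock.unlock(); }

private:
    CountingLock& m_lock;
};

int depth_after_buggy_path(CountingLock& lock)
{
    {
        ScopedLocker locker(lock);
        lock.unlock(); // Bug: the guard's destructor unlocks a second time.
    }
    return lock.depth();
}

int depth_after_fixed_path(CountingLock& lock)
{
    {
        ScopedLocker locker(lock);
        // Fixed: let the guard do the only unlock when it goes out of scope.
    }
    return lock.depth();
}
```

In the buggy path the depth ends up at -1, which is the corrupted
state described above.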
This is extra confusing though, as complete_current_request() runs
without a lock which also looks like a bug. But that's a task for
another day.
We want to move this out of the AHCI subsystem into the VM system,
since other parts of the kernel may need to perform scatter-gather IO.
We rename the current VM::ScatterGatherList impl that's used in the
virtio subsystem to VM::ScatterGatherRefList, since its distinguishing
feature from the AHCI scatter-gather list is that it doesn't own its
buffers.
We had some inconsistencies before:
- Sometimes "The", sometimes "the"
- Sometimes trailing ".", sometimes no trailing "."
I picked the most common one (lowercase "the", trailing ".") and applied
it to all copyright headers.
By using the exact same string everywhere we can ensure nothing gets
missed during a global search (and replace), and that these
inconsistencies are not spread any further (as copyright headers are
commonly copied to new files).
The overall design is the same, but we change a few things,
like reducing the number of loops that can block forever. The goal
is to ensure the kernel won't hang forever when dealing with
buggy hardware.
Also, we reset the channel when initializing it, just in case the
hardware was in a bad state before we start using it.
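One way to avoid a loop that blocks forever is to bound the number of
polls and report failure to the caller. The sketch below is a generic
illustration with a hypothetical helper, not the code from this
change; a real driver would typically also delay between polls.

```cpp
#include <cstdint>

// Hypothetical helper: polls `read_status` until all bits in `mask`
// are clear, giving up after `max_attempts` reads.
// Returns false on timeout so the caller can handle a stuck device.
template<typename ReadStatus>
bool wait_until_clear(ReadStatus read_status, uint8_t mask, unsigned max_attempts)
{
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        if ((read_status() & mask) == 0)
            return true;
    }
    return false;
}
```

For an ATA channel, `mask` could be 0x80, the BSY bit of the status
register.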
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
The first one is for disabling the PS2 controller, the other one is for
disabling physical storage enumeration.
We can't be sure any machine will work with our implementation,
therefore this will help us to test more machines.
We need to do it to let real hardware put the correct voltages
on the wire.
Apparently my ICH7 machine refused to boot, and was reading lots of
garbage from an unconnected IDE channel. It was fixed after I added a
delay of 20 microseconds. It probably can be reduced, I just took a safe
value and it seems to work correctly without any problems :)
Also handle native and compatibility channel modes together: if only
one IDE channel was set to work in PCI native mode, we need to handle it
separately, so the other channel continues to operate with the legacy IO
ports and interrupt line.
This is a "quirk" I've observed on an Intel ICH7 test machine. Apparently
we need to select the device (master or slave) before starting to work
with the bus master register.
It's very possible that other machines require this step to happen
before the DMA transfer can occur correctly.
Also, when reading with DMA, we should set the transfer direction before
clearing the interrupt status.
For the sake of completeness, I added a few lines that clear the
interrupt status in places where I deemed it reasonable to do so.
Although unlikely to happen, a user can have an IDE controller that
doesn't support bus master capability. If that's the case, we need to
check for this, and create an IDEChannel (not BMIDEChannel) to allow
IO operations with the controller.
If the user requests to force PIO mode, we just create IDEChannel
objects which are capable of sending PIO commands only.
However, if the user doesn't force PIO mode, we create BMIDEChannel
objects, which send DMA commands.
This change somewhat simplifies the code, as each class supports
its own type of operation - PIO or DMA. The PATADiskDevice
should not care if DMA is enabled or not.
Later on, we could write an IDEChannel class for UDMA modes,
which are available and documented in Intel specifications for their
IDE controllers.
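The selection logic can be summarized in a small sketch. The enum and
the function are hypothetical; the real code creates IDEChannel or
BMIDEChannel objects rather than returning a value.

```cpp
// Hypothetical summary of which channel class gets created.
enum class ChannelKind {
    PIO,          // plain IDEChannel, PIO commands only
    BusMasterDMA, // BMIDEChannel, DMA commands
};

// Both parameters are assumed inputs: the user's boot-time choice and
// whether the controller advertises bus master capability.
constexpr ChannelKind choose_channel_kind(bool force_pio, bool controller_supports_bus_master)
{
    if (force_pio || !controller_supports_bus_master)
        return ChannelKind::PIO;
    return ChannelKind::BusMasterDMA;
}
```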
Although technically not supported by the original ATA specification,
IDE hot swapping is still possible in practice, so the only sane way
to start supporting it is by ref-counting the IDEChannel object, so
that if we remove a PATADiskDevice, the channel is not gone with it.
An article about IDE limits states that:
"Hard drives over 8.4 GB are supposed to report their geometry as
16383/16/63. This in effect means that the `geometry' is obsolete, and
the total disk size can no longer be computed from the geometry, but is
found in the LBA capacity field returned by the IDENTIFY command.
Hard drives over 137.4 GB are supposed to report an LBA capacity of
0xfffffff = 268435455 sectors (137438952960 bytes). Now the actual disk
size is found in the new 48-capacity field."
(https://tldp.org/HOWTO/Large-Disk-HOWTO-4.html) which is the main
reason to not support CHS, as hard drives with less than 8.4 GB capacity
are completely obsolete.
Another good reason is that virtually any hard drive from the last 20
years or so supports LBA mode. Therefore, it's probably OK to just
ignore CHS, as it's unlikely we'll encounter a hard drive that doesn't
support LBA.
This somewhat simplifies the IDE initialization and access code.
Also, we should use the ATAIdentifyBlock structure if possible,
so now we do it instead of using macros to calculate offsets.
With the usage of the ATAIdentifyBlock structure, we now use the
48-bit LBA max count if the drive indicates it supports 48-bit LBA mode.
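A sketch of the capacity selection, reading raw IDENTIFY DEVICE words
instead of the ATAIdentifyBlock structure (whose field names are not
shown here). The function name is hypothetical. Word 83 bit 10 signals
48-bit LBA support, words 100-103 hold the 48-bit sector count, and
words 60-61 hold the 28-bit one.

```cpp
#include <cstdint>

// Hypothetical helper: `identify` points at the 256 16-bit words
// returned by the IDENTIFY DEVICE command.
uint64_t max_addressable_sectors(uint16_t const* identify)
{
    bool supports_lba48 = (identify[83] & (1u << 10)) != 0;
    if (supports_lba48) {
        // Words 100-103: sector count for the 48-bit address feature set.
        return static_cast<uint64_t>(identify[100])
            | (static_cast<uint64_t>(identify[101]) << 16)
            | (static_cast<uint64_t>(identify[102]) << 32)
            | (static_cast<uint64_t>(identify[103]) << 48);
    }
    // Words 60-61: sector count for 28-bit LBA.
    return static_cast<uint64_t>(identify[60])
        | (static_cast<uint64_t>(identify[61]) << 16);
}
```

The 28-bit field tops out at 0xfffffff sectors, which at 512 bytes per
sector is the 137438952960 byte limit quoted above.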
This reverts commit cfc2f33dcb.
We can't actually change the IRQ line value and expect the device
to work with it (this was my mistake).
That register is R/W so the firmware can figure out IRQ routing and put
the correct value and write it to the Interrupt line register.
As a compromise, if the firmware decided to set the IRQ line to be 7,
or something else we can't deal with, the user can simply force the code
to work with IRQ 11, with the boot argument "force_ahci_irq_11" being
set to "on".
Instead of polling to see if the device ended the operation, we can just
use interrupts to signal the end of an IO operation.
In a similar way, we use interrupts during device detection.
Also, we use the new Work Queue mechanism introduced by @tomuta to allow
better performance and stability :)
We can't use deferred functions for anything that may require preemption,
such as copying from/to user or accessing the disk. For those purposes
we should use a work queue, which is essentially a kernel thread that
may be preempted or blocked.
These errors are classed as fatal, so we need to recover from them.
Found while trying to debug AHCI boot on VMware Player,
where I got TFES.
From the spec: "Fatal errors (signified by the setting of PxIS.HBFS,
PxIS.HBDS, PxIS.IFS, or PxIS.TFES) will cause the HBA to enter
the ERR:Fatal state"
We were already recovering from IFS.
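A minimal sketch of checking all four fatal bits at once. The constant
and function names are hypothetical; the bit positions (IFS 27,
HBDS 28, HBFS 29, TFES 30) follow the PxIS register layout in the AHCI
specification.

```cpp
#include <cstdint>

// Fatal error bits of the port interrupt status register (PxIS).
constexpr uint32_t PxIS_IFS = 1u << 27;  // Interface Fatal Error Status
constexpr uint32_t PxIS_HBDS = 1u << 28; // Host Bus Data Error Status
constexpr uint32_t PxIS_HBFS = 1u << 29; // Host Bus Fatal Error Status
constexpr uint32_t PxIS_TFES = 1u << 30; // Task File Error Status

// Hypothetical helper: true if any bit that puts the HBA into the
// ERR:Fatal state is set.
constexpr bool has_fatal_error(uint32_t pxis)
{
    return (pxis & (PxIS_IFS | PxIS_HBDS | PxIS_HBFS | PxIS_TFES)) != 0;
}
```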