Commit graph

96 commits

Author SHA1 Message Date
Matt Macy ab3059a8e7 Back pcpu zone with domain correct pages
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
  (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)

- Allocate page from the correct domain for a given cpu.

- Don't initialize pc_domain to non-zero value if NUMA is not defined
  There are some misconceptions surrounding this field. It is the
  _VM_ NUMA domain and should only ever correspond to valid domain
  values as understood by the VM.

The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.

Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D15933
2018-07-06 02:06:03 +00:00
Andriy Gapon ec6faf94c4 add support for console resuming, implement it for uart, use on x86
This change adds a new optional console method cn_resume and a kernel
console interface cnresume.  Consoles that may need to re-initialize
their hardware after suspend (e.g., because firmware does not care to do
it) will implement cn_resume.  Note that it is called in rather early
environment not unlike early boot, so the same restrictions apply.
Platform specific code, for platforms that support hardware suspend,
should call cnresume early after resume, before any console output is
expected.

This change fixes a problem with a system of mine failing to resume when
a serial console is used.  I found that the serial port was in a strange
configuration and an attempt to write to it likely resulted in an
infinite loop.

To avoid adding cn_resume method to every console driver, CONSOLE_DRIVER
macro has been extended to support optional methods.

Reviewed by:	imp, mav
MFC after:	3 weeks
Differential Revision: https://reviews.freebsd.org/D15552
2018-05-29 16:16:24 +00:00
Konstantin Belousov 3621ba1ede Add Intel Spec Store Bypass Disable control.
Speculative Store Bypass (SSB) is a speculative execution side channel
vulnerability identified by Jann Horn of Google Project Zero (GPZ) and
Ken Johnson of the Microsoft Security Response Center (MSRC)
https://bugs.chromium.org/p/project-zero/issues/detail?id=1528.
Updated Intel microcode introduces a MSR bit to disable SSB as a
mitigation for the vulnerability.

Introduce a sysctl hw.spec_store_bypass_disable to provide global
control over the SSBD bit, akin to the existing sysctl that controls
IBRS. The sysctl can be set to one of three values:
0: off
1: on
2: auto

Future work will enable applications to control SSBD on a per-process
basis (when it is not enabled globally).

SSBD bit detection and control was verified with prerelease microcode.

Security:	CVE-2018-3639
Tested by:	emaste (previous version, without updated microcode)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-05-21 21:08:19 +00:00
Jung-uk Kim e787342e25 Redo r332918 with the ACPICA API and remove debug.acpi.suspend_deep_bounce.
AcpiOsEnterSleep() was meant to implement this feature.

Reviewed by:	avg
2018-05-03 19:00:50 +00:00
Konstantin Belousov 986c4ca387 Turn off IBRS on suspend.
Resume starts CPU from the init state, which clears any loaded
microcode updates.  As result, IBRS MSRs are no longer available,
until the microcode is reloaded.

I have to forcibly clear cpu_stdext_feature3, which assumes that CPUID
leaf 7 reg %ebx does not report anything except Meltdown/Spectre bugs
bits.  If future CPUs add new bits there, hw_ibrs_recalculate() and
identify_cpu1()/identify_cpu2() need to be adjusted for that.

Submitted and tested by:	Michael Danilov <mike.d.ft402@gmail.com>
PR:	227866
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15236
2018-04-30 20:18:32 +00:00
Andriy Gapon e673a4ec4c add a new ACPI suspend debugging knob, debug.acpi.suspend_deep_bounce
This sysctl allows a deeper dive into the sleep abyss comparing to
debug.acpi.suspend_bounce.  When the new sysctl is set the system will
execute the suspend sequence up to the call to AcpiEnterSleepState().
That includes saving processor contexts and parking APs.  Then, instead
of actually entering the sleep state, the BSP will call resumectx() to
emulate the wakeup.  The APs should get restarted by the sequence of
Init and Startup IPIs that BSP sends to them.

MFC after:	8 days
2018-04-24 09:42:58 +00:00
Konstantin Belousov d86c1f0dc1 i386 4/4G split.
The change makes the user and kernel address spaces on i386
independent, giving each almost the full 4G of usable virtual addresses
except for one PDE at top used for trampoline and per-CPU trampoline
stacks, and system structures that must be always mapped, namely IDT,
GDT, common TSS and LDT, and process-private TSS and LDT if allocated.

By using 1:1 mapping for the kernel text and data, it appeared
possible to eliminate assembler part of the locore.S which bootstraps
initial page table and KPTmap.  The code is rewritten in C and moved
into the pmap_cold(). The comment in vmparam.h explains the KVA
layout.

There is no PCID mechanism available in protected mode, so each
kernel/user switch forth and back completely flushes the TLB, except
for the trampoline PTD region. The TLB invalidations for userspace
becomes trivial, because IPI handlers switch page tables. On the other
hand, context switches no longer need to reload %cr3.

copyout(9) was rewritten to use vm_fault_quick_hold().  An issue for
new copyout(9) is compatibility with wiring user buffers around sysctl
handlers. This explains two kind of locks for copyout ptes and
accounting of the vslock() calls.  The vm_fault_quick_hold() AKA slow
path, is only tried after the 'fast path' failed, which temporary
changes mapping to the userspace and copies the data to/from small
per-cpu buffer in the trampoline.  If a page fault occurs during the
copy, it is short-circuit by exception.s to not even reach C code.

The change was motivated by the need to implement the Meltdown
mitigation, but instead of KPTI the full split is done.  The i386
architecture already shows the sizing problems, in particular, it is
impossible to link clang and lld with debugging.  I expect that the
issues due to the virtual address space limits would only exaggerate
and the split gives more liveness to the platform.

Tested by: pho
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D14633
2018-04-13 20:30:49 +00:00
Jeff Roberson b6715dab8f Move VM_NUMA_ALLOC and DEVICE_NUMA under the single global config option NUMA.
Sponsored by:	Netflix, Dell/EMC Isilon
Discussed with:	jhb
2018-01-14 03:36:03 +00:00
Jeff Roberson 3f289c3fcf Implement 'domainset', a cpuset based NUMA policy mechanism. This allows
userspace to control NUMA policy administratively and programmatically.

Implement domainset based iterators in the page layer.

Remove the now legacy numa_* syscalls.

Cleanup some header polution created by having seq.h in proc.h.

Reviewed by:	markj, kib
Discussed with:	alc
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13403
2018-01-12 22:48:23 +00:00
Bruce Evans da9fba5447 Use resume_cpus() instead of restart_cpus() to resume from ACPI suspension.
restart_cpus() worked well enough by accident.  Before this set of fixes,
resume_cpus() used the same cpuset (started_cpus, meaning CPUs directed to
restart) as restart_cpus().  resume_cpus() waited for the wrong cpuset
(stopped_cpus) to become empty, but since mixtures of stopped and suspended
CPUs are not close to working, stopped_cpus must be empty when resuming so
the wait is null -- restart_cpus just allows the other CPUs to restart and
returns without waiting.

Fix resume_cpus() to wait on a non-wrong cpuset for the ACPI case, and
add further kludges to try to keep it working for the XEN case.  It
was only used for XEN.  It waited on suspended_cpus.  This works for
XEN.  However, for ACPI, resuming is a 2-step process.  ACPI has already
woken up the other CPUs and removed them from suspended_cpus.  This
fix records the move by putting them in a new cpuset resuming_cpus.
Waiting on suspended_cpus would give the same null wait as waiting on
stopped_cpus.  Wait on resuming_cpus instead.

Add a cpuset toresume_cpus to map the CPUs being told to resume to keep
this separate from the cpuset started_cpus for mapping the CPUs being told
to restart.  Mixtures of stopped and suspended/resuming CPUs are still far
from working.  Describe new and some old cpusets in comments.

Add further kludges to cpususpend_handler() to try to avoid breaking it
for XEN.  XEN doesn't use resumectx(), so it doesn't use the second
return path for savectx(), and it goes from the suspended state directly
to the restarted state, while ACPI resume goes through the resuming state.
Enter the resuming state early for all cases so that resume_cpus can test
for being in this state and not have to worry about the intermediate
!suspended state for ACPI only.

Reviewed by:	kib
2017-12-21 09:17:48 +00:00
Bruce Evans 2ba6fe0009 Remove the permanent double mapping of low physical memory and replace
it by a transient double mapping for the one instruction in ACPI wakeup
where it is needed (and for many surrounding instructions in ACPI resume).
Invalidate the TLB as soon as convenient after undoing the transient
mapping.  ACPI resume already has the strict ordering needed for this.

This fixes the non-trapping of null pointers and other garbage pointers
below NBPDR (except transiently).  NBPDR is quite large (4MB, or 2MB for
PAE).

This fixes spurious traps at the first instruction in VM86 bioscalls.
The traps are for transiently missing read permission in the first
VM86 page (physical page 0) which was just written to at KERNBASE in
the kernel.  The mechanism is unknown (it is not simply PG_G).

locore uses a similar but larger transient double mapping and needs
it for 2 instructions instead of 1.  Unmap the first PDE in it after
the 2 instructions to detect most garbage pointers while bootstrapping.
pmap_bootstrap() finishes the unmapping.

Remove the avoidance of the double mapping for a recently fixed special
case.  ACPI resume could use this avoidance (made non-special) to avoid
any problems with the transient double mapping, but no such problems
are known.

Update comments in locore.  Many were for old versions of FreeBSD which
tried to map low memory r/o except for special cases, or might have
allowed access to low memory via physical offsets.  Now all kernel
maps are r/w, and removal of of the double map disallows use of physical
offsets again.
2017-12-18 13:53:22 +00:00
Pedro F. Giffuni ebf5747bdb sys/x86: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:11:47 +00:00
Roger Pau Monné 45ff071d6e acpi/srat: zero the SRAT cpu array
Fix from fallout introduced in r322348 that moved the cpus array to a
dynamic allocation without zeroing the area.

Reported by:		mjg
MFC with:		r322348
Reviewed by:		mjg
Differential revision:	https://reviews.freebsd.org/D12220
2017-09-04 10:08:42 +00:00
Alexander Motin ffc7e53a65 Fix off-by-one error when parsing SRAT table.
Reviewed by:	jhb
MFC after:	1 week
2017-08-22 19:56:30 +00:00
Roger Pau Monné 72446721e4 srat: use pmap_unmapbios
To match the pmap_mapbios.

Reported by:	jhb
MFC with:	r322403
2017-08-13 14:50:38 +00:00
Roger Pau Monné c642d2f5b5 acpi/srat: fix build without DMAP
Use pmap_mapbios to map memory used to store the cpus array.

Reported by:	lwhsu
X-MFC-with:	r322348
2017-08-11 14:19:55 +00:00
Roger Pau Monné 3f0a9fe06c mptable: fix i386 build failure
Reported by:	emaste
X-MFC-with:	r322347
2017-08-10 17:46:57 +00:00
Roger Pau Monné a74bb29ada x86: bump MAX_APIC_ID to 512
Introduce a new define to take int account the xAPIC ID limit, for
systems where x2APIC is not available/reliable.

Also change some of the usages of the APIC ID to use an unsigned int
(which is the correct storage type to deal with x2APIC IDs as found in
x2APIC MADT entries).

This allows booting FreeBSD on a box with 256 CPUs and APIC IDs up to
295:

FreeBSD/SMP: Multiprocessor System Detected: 256 CPUs
FreeBSD/SMP: 1 package(s) x 64 core(s) x 4 hardware threads
Package HW ID = 0
	Core HW ID = 0
		CPU0 (BSP): APIC ID: 0
		CPU1 (AP/HT): APIC ID: 1
		CPU2 (AP/HT): APIC ID: 2
		CPU3 (AP/HT): APIC ID: 3
[...]
	Core HW ID = 73
		CPU252 (AP): APIC ID: 292
		CPU253 (AP/HT): APIC ID: 293
		CPU254 (AP/HT): APIC ID: 294
		CPU255 (AP/HT): APIC ID: 295

Submitted by:		kib (previous version)
Relnotes:		yes
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11913
2017-08-10 09:16:40 +00:00
Roger Pau Monné 84525e55c1 x86: make the arrays that depend on MAX_APIC_ID dynamic
So that MAX_APIC_ID can be bumped without wasting memory.

Note that the usage of MAX_APIC_ID in the SRAT parsing forces the
parser to allocate memory directly from the phys_avail physical memory
array, which is not the best approach probably, but I haven't found
any other way to allocate memory so early in boot. This memory is not
returned to the system afterwards, but at least it's sized according
to the maximum APIC ID found in the MADT table.

Sponsored by:		Citrix Systems R&D
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11912
2017-08-10 09:16:03 +00:00
Roger Pau Monné fd1f83fb45 apic_enumerator: only set mp_ncpus and mp_maxid at probe cpus phase
Populate the lapics arrays and call cpu_add/lapic_create in the setup
phase instead. Also store the max APIC ID found in the newly
introduced max_apic_id global variable.

This is a requirement in order to make the static arrays currently
using MAX_LAPIC_ID dynamic.

Sponsored by:		Citrix Systems R&D
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11911
2017-08-10 09:15:18 +00:00
Konstantin Belousov fc8929cb29 More accurately handle early EFER restoration on resume.
Do not try to set LMA bit while CPU is still in legacy mode.
Apparently Intel CPUs ignore non-id writes to LMA, while AMD's
(over-)react with #GP.

Reported and tested by:	danfe
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-06-11 14:39:08 +00:00
Konstantin Belousov bd101a6648 Ensure that resume path on amd64 only accesses page tables for normal
operation after processor is configured to allow all required
features.

In particular, NX must be enabled in EFER, otherwise load of page
table element with nx bit set causes reserved bit page fault.  Since
malloc uses direct mapping for small allocations, in particular for
the suspension pcbs, and DMAP is nx after r316767, this commit tripped
fault on resume path.

Restore complete state of EFER while wakeup code is still executing
with custom page table, before calling resumectx, instead of trying to
guess which features might be needed before resumectx restored EFER on
its own.

Bisected and tested by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-05-15 20:52:43 +00:00
Gleb Smirnoff 83c9dea1ba - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter
in place.  To do per-cpu stats, convert all fields that previously were
  maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
  before we have set up UMA and we can do counter_u64_alloc(), provide an
  early counter mechanism:
  o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
  o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
    so that at early stages of boot, before counters are allocated we already
    point to a counter that can be safely written to.
  o For sparc64 that required a whole dummy pcpu[MAXCPU] array.

Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
  to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.

This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html

Reviewed by:	kib, gallatin, marius, lidl
Differential Revision:	https://reviews.freebsd.org/D10156
2017-04-17 17:34:47 +00:00
Roger Pau Monné 6e2ab0aef5 x86/srat: fix parsing of APIC IDs > MAX_APIC_ID
Ignore them like it's done in the MADT parser. This allows booting on a box
with SRAT and APIC IDs > 255.

Reported by:	Wei Liu <wei.liu2@citrix.com>
Tested by:	Wei Liu <wei.liu2@citrix.com>
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Citrix Systems R&D
2017-03-16 09:33:36 +00:00
Konstantin Belousov 57f6622f92 For i386, remove config options CPU_DISABLE_CMPXCHG, CPU_DISABLE_SSE
and device npx.

This means that FPU is always initialized and handled when available,
and SSE+ register file and exception are handled when available.  This
makes the kernel FPU code much easier to maintain by the cost of
slight bloat for CPUs older than 25 years.

CPU_DISABLE_CMPXCHG outlived its usefulness, see the removed comment
explaining the original purpose.

Suggested by and discussed with:	bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2017-02-03 12:51:40 +00:00
Konstantin Belousov 5ab0f0c3f0 Prefix hex memory addresses with 0x in diagnostic messages from the
SRAT parser.

Submitted by:	Oliver Pinter
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8750
2016-12-11 19:01:27 +00:00
Jung-uk Kim 493deb390b Merge ACPICA 20160930. 2016-10-04 20:27:15 +00:00
Konstantin Belousov 36596c2a29 Detect x2APIC mode on boot and obey it.
If BIOS performed hand-off to OS with BSP LAPIC in the x2APIC mode,
system usually consumes such configuration without a notice, since
x2APIC is turned on by OS if possible (nop).  But if BIOS
simultaneously requested OS to not use x2APIC, code assumption that
that xAPIC is active breaks.

In my opinion, we cannot safely turn off x2APIC if control is passed
in this mode.  Make madt.c ignore user or BIOS requests to turn x2APIC
off, and do not check the x2APIC black list.  Just trust the config
and try to continue, giving a warning in dmesg.

Reported and tested by:	Slawa Olhovchenkov <slw@zxy.spb.ru> (previous version)
Diagnosed by and discussed with:	avg
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-09-19 15:58:45 +00:00
Mark Johnston f4d0e9c95f Allow ACPI wakeup code and page tables to be stored in non-contiguous pages.
Since these pages are allocated from a narrow range of memory, this makes
the allocation more likely to succeed.

Suggested by:	kib
Reviewed by:	jkim, kib
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D7154
2016-07-14 00:38:04 +00:00
Mark Johnston c722a89a63 Use M_NOWAIT when allocating memory for the ACPI wakeup handler.
If the allocation attempt fails, we may otherwise VM_WAIT after a failed
attempt to reclaim contiguous memory in the requested range. After r297466,
this results in the thread going to sleep, causing a hang during boot.

Reviewed by:	jkim, kib
Approved by:	re (gjb)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D6945
2016-06-23 19:24:38 +00:00
Eric van Gyzen 2db0699d88 Work around (ignore) broken SRAT tables
Instead of panicking when parsing an invalid ACPI SRAT table,
just ignore it, effectively disabling NUMA.

https://lists.freebsd.org/pipermail/freebsd-current/2016-May/060984.html

Reported and tested by:	 Bill O'Hanlon (bill.ohanlon at gmail.com)
Reviewed by:	jhb
MFC after:	1 week
Relnotes:	If dmesg shows "SRAT: Duplicate local APIC ID",
                try updating your BIOS to fix NUMA support.
Sponsored by:	Dell Inc.
2016-05-03 20:14:04 +00:00
John Baldwin 8a08b7d36b Revert bus_get_cpus() for now.
I really thought I had run this through the tinderbox before committing,
but many places need <sys/types.h> -> <sys/param.h> for <sys/bus.h> now.
2016-05-03 01:17:40 +00:00
John Baldwin bc153c692f Add a new bus method to fetch device-specific CPU sets.
bus_get_cpus() returns a specified set of CPUs for a device.  It accepts
an enum for the second parameter that indicates the type of cpuset to
request.  Currently two valus are supported:

 - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
   the device when DEVICE_NUMA is enabled)
 - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL.  INTR_CPUS is mapped to 'all_cpus'
by default.  The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from the
the x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled.  They also and
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.

Reviewed by:	wblock (manpage)
Differential Revision:	https://reviews.freebsd.org/D5519
2016-05-02 18:00:38 +00:00
Conrad Meyer 3765b80993 SRAT: Don't overflow domain_pxm table
If we reached MAXMEMDOM, we would previously try to insert an additional
element and only detect overflow after causing (probably trivial) memory
overflow.  Instead, detect the ndomain > MAXMEMDOM case before we write past
the end.

Reported by:	Coverity
CID:		1354783
Sponsored by:	EMC / Isilon Storage Division
2016-04-20 01:10:07 +00:00
Warner Losh bd3bce41db Deprecate using hints.acpi.0.rsdp to communicate the RSDP to the
system. This uses the hints mechnanism. This mostly works today
because when there's no static hints (the default), this value can be
fetched from the hint. When there is a static hints file, the hint
passed from the boot loader to the kernel is ignored, but for the BIOS
case we're able to find it anyway. However, with UEFI, the fallback
doesn't work, so we get a panic instead.

Switch to acpi.rsdp and use TUNABLE_ULONG_FETCH instead. Continue to
generate the old values to allow for transitions. In addition, fall
back to the old method if the new method isn't present.

Add comments about all this.

Differential Revision: https://reviews.freebsd.org/D5866
2016-04-14 04:59:51 +00:00
John Baldwin 62d70a8174 Add more fine-grained kernel options for NUMA support.
VM_NUMA_ALLOC is used to enable use of domain-aware memory allocation in
the virtual memory system.  DEVICE_NUMA is used to enable affinity
reporting for devices such as bus_get_domain().

MAXMEMDOM must still be set to a value greater than for any NUMA support
to be effective.  Note that 'cpuset -gd' always works if MAXMEMDOM is
enabled and the system supports NUMA.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D5782
2016-04-09 13:58:04 +00:00
Svatopluk Kraus a1e1814d76 As <machine/pmap.h> is included from <vm/pmap.h>, there is no need to
include it explicitly when <vm/pmap.h> is already included.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D5373
2016-02-22 09:02:20 +00:00
Konstantin Belousov 906430e4f0 In the SandyBridge x2APIC workaround detection code, only fetch the
environment variable when SandyBridge CPU is detected.  Reduce code
duplication.

Sponsored by:	The FreeBSD Foundation
2015-12-03 10:59:10 +00:00
Adrian Chadd a14bc739d5 Add ASUS Sandybridge laptops to the similar x2apic disable logic
that was recently added for Lenovo laptops.

This is a prime candidate for conversion into a table and also
checking other fields like "product".

Tested:

* ASUS UX31E
2015-09-16 01:44:11 +00:00
Konstantin Belousov 8c48615974 Automatically disable x2APIC mode on SandyBridge Lenovo machines. I
believe that the bug only affects mobile CPUs, at least I did not see
other reports, but it is impossible to detect it in madt_setup_local().

While there, reduce duplication in the information strings printed
when x2APIC is auto-disabled, and do not print the line when user
manually override the setting.

Tested and reviewed by:	  royger (previous version)
Sponsored by:	The FreeBSD Foundation
2015-08-21 15:13:25 +00:00
Jung-uk Kim 5ef5072350 Merge ACPICA 20150619. 2015-06-18 23:14:45 +00:00
John Baldwin 125954c873 Handle X2APIC entries in the MADT for APICs with an ID < 255. At least one
BIOS has been seen to include such entries even though the relevant specs
require that X2APIC entries only be used for CPUs with an APIC ID >= 255.

This was tested on a system with "plain" local APIC entries in the MADT
to ensure no regressions, but it has not yet been tested on a system with
X2APIC entries in the MADT.  Currently such systems do not boot at all,
and with this change they might now boot correctly.

Differential Revision:	https://reviews.freebsd.org/D2521
Reviewed by:	kib
MFC after:	2 weeks
2015-06-09 10:49:40 +00:00
Adrian Chadd 4f5e270a93 Update the comments to match what the code ended up becoming.
-1 is now "no locality information available".

Sponsored by:	Norse Corp, Inc.
2015-05-15 21:33:19 +00:00
Adrian Chadd 415d7ccab2 Add initial memory locality cost awareness to the VM, and include
a basic ACPI SLIT table parser.

For now this just exports the map via sysctl; it'll eventually be useful
to userland when there's more useful NUMA support in -HEAD.

* Add an optional mem_locality map;
* add a mapping function taking from/to domain and returning the
  relative cost, or -1 if it's not available;
* Add a very basic SLIT parser to x86 ACPI.

Differential Revision:	https://reviews.freebsd.org/D2460
Reviewed by:	rpaulo, stas, jhb
Sponsored by:	Norse Corp, Inc (hardware, coding); Dell (hardware)
2015-05-08 00:56:56 +00:00
John Baldwin 179fa75e6e Reassign copyright statements on several files from Advanced
Computing Technologies LLC to Hudson River Trading LLC.

Approved by:	Hudson River Trading LLC (who owns ACT LLC)
MFC after:	1 week
2015-04-23 14:22:20 +00:00
Konstantin Belousov 34c15db9cd Add config option PAE_TABLES for the i386 kernel. It switches pmap to
use PAE format for the page tables, but does not incur other
consequences of the full PAE config.  In particular, vm_paddr_t and
bus_addr_t are left 32bit, and max supported memory is still limited
by 4GB.

The option allows to have nx permissions for memory mappings on i386
kernel, while keeping the usual i386 KBI and avoiding the kernel data
sizing problems typical for the PAE config.

Intel documented that the PAE format for page tables is available
starting with the Pentium Pro, but it is possible that the plain
Pentium CPUs have the required support (Appendix H).  The goal is to
enable the option and non-exec mappings on i386 for the GENERIC
kernel.  Anybody wanting a useful system on 486, have to reconfigure
the modern i386 kernel anyway.

Discussed with:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-04-13 15:22:45 +00:00
Jung-uk Kim 9e222cd613 Fix build on i386.
Reported by:	bz
2015-04-12 22:40:27 +00:00
Konstantin Belousov 45bc78eb45 For now, disable x2APIC mode when Xen is detected, even if CPU
declares support for it.  Newer versions of Xen works fine with x2APIC
code, but e.g. Xen 4.2 delivers GPF on the LAPIC MSR write, despite
x2APIC mode being known to hypervisor.

Discussed with:	royger
Sponsored by:	The FreeBSD Foundation
2015-02-25 16:44:07 +00:00
Tijl Coosemans f140f6d14b Fix build on i386 without "device apic"
Reviewed by:	kib
2015-02-20 19:42:26 +00:00
Konstantin Belousov 117c6e7cf2 Fix UP build.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
2015-02-18 10:51:48 +00:00