Merge 6.9-rc5 into char-misc-next

We need the char/misc fixes in here as well to work off of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit is contained in:
Greg Kroah-Hartman 2024-04-23 13:16:03 +02:00
commit df1aa5b0d1
626 changed files with 9479 additions and 4401 deletions

View File

@ -446,7 +446,8 @@ Mythri P K <mythripk@ti.com>
Nadav Amit <nadav.amit@gmail.com> <namit@vmware.com>
Nadav Amit <nadav.amit@gmail.com> <namit@cs.technion.ac.il>
Nadia Yvette Chambers <nyc@holomorphy.com> William Lee Irwin III <wli@holomorphy.com>
Naoya Horiguchi <naoya.horiguchi@nec.com> <n-horiguchi@ah.jp.nec.com>
Naoya Horiguchi <nao.horiguchi@gmail.com> <n-horiguchi@ah.jp.nec.com>
Naoya Horiguchi <nao.horiguchi@gmail.com> <naoya.horiguchi@nec.com>
Nathan Chancellor <nathan@kernel.org> <natechancellor@gmail.com>
Neeraj Upadhyay <quic_neeraju@quicinc.com> <neeraju@codeaurora.org>
Neil Armstrong <neil.armstrong@linaro.org> <narmstrong@baylibre.com>
@ -524,6 +525,7 @@ Rémi Denis-Courmont <rdenis@simphalempin.com>
Ricardo Ribalda <ribalda@kernel.org> <ricardo@ribalda.com>
Ricardo Ribalda <ribalda@kernel.org> Ricardo Ribalda Delgado <ribalda@kernel.org>
Ricardo Ribalda <ribalda@kernel.org> <ricardo.ribalda@gmail.com>
Richard Genoud <richard.genoud@bootlin.com> <richard.genoud@gmail.com>
Richard Leitner <richard.leitner@linux.dev> <dev@g0hl1n.net>
Richard Leitner <richard.leitner@linux.dev> <me@g0hl1n.net>
Richard Leitner <richard.leitner@linux.dev> <richard.leitner@skidata.com>

View File

@ -3146,6 +3146,10 @@ S: Triftstra=DFe 55
S: 13353 Berlin
S: Germany
N: Gustavo Pimental
E: gustavo.pimentel@synopsys.com
D: PCI driver for Synopsys DesignWare
N: Emanuel Pirker
E: epirker@edu.uni-klu.ac.at
D: AIC5800 IEEE 1394, RAW I/O on 1394

View File

@ -138,11 +138,10 @@ associated with the source address of the indirect branch. Specifically,
the BHB might be shared across privilege levels even in the presence of
Enhanced IBRS.
Currently the only known real-world BHB attack vector is via
unprivileged eBPF. Therefore, it's highly recommended to not enable
unprivileged eBPF, especially when eIBRS is used (without retpolines).
For a full mitigation against BHB attacks, it's recommended to use
retpolines (or eIBRS combined with retpolines).
Previously the only known real-world BHB attack vector was via unprivileged
eBPF. Further research has found attacks that don't require unprivileged eBPF.
For a full mitigation against BHB attacks it is recommended to set BHI_DIS_S or
use the BHB clearing sequence.
Attack scenarios
----------------
@ -430,6 +429,23 @@ The possible values in this file are:
'PBRSB-eIBRS: Not affected' CPU is not affected by PBRSB
=========================== =======================================================
- Branch History Injection (BHI) protection status:
.. list-table::
* - BHI: Not affected
- System is not affected
* - BHI: Retpoline
- System is protected by retpoline
* - BHI: BHI_DIS_S
- System is protected by BHI_DIS_S
* - BHI: SW loop, KVM SW loop
- System is protected by software clearing sequence
* - BHI: Vulnerable
- System is vulnerable to BHI
* - BHI: Vulnerable, KVM: SW loop
- System is vulnerable; KVM is protected by software clearing sequence
Full mitigation might require a microcode update from the CPU
vendor. When the necessary microcode is not available, the kernel will
report vulnerability.
@ -484,7 +500,11 @@ Spectre variant 2
Systems which support enhanced IBRS (eIBRS) enable IBRS protection once at
boot, by setting the IBRS bit, and they're automatically protected against
Spectre v2 variant attacks.
some Spectre v2 variant attacks. The BHB can still influence the choice of
indirect branch predictor entry, and although branch predictor entries are
isolated between modes when eIBRS is enabled, the BHB itself is not isolated
between modes. Systems which support BHI_DIS_S will set it to protect against
BHI attacks.
On Intel's enhanced IBRS systems, this includes cross-thread branch target
injections on SMT systems (STIBP). In other words, Intel eIBRS enables
@ -638,6 +658,18 @@ kernel command line.
spectre_v2=off. Spectre variant 1 mitigations
cannot be disabled.
spectre_bhi=
[X86] Control mitigation of Branch History Injection
(BHI) vulnerability. This setting affects the deployment
of the HW BHI control and the SW BHB clearing sequence.
on
(default) Enable the HW or SW mitigation as
needed.
off
Disable the mitigation.
For spectre_v2_user see Documentation/admin-guide/kernel-parameters.txt
Mitigation selection guide

View File

@ -3444,6 +3444,7 @@
retbleed=off [X86]
spec_rstack_overflow=off [X86]
spec_store_bypass_disable=off [X86,PPC]
spectre_bhi=off [X86]
spectre_v2_user=off [X86]
srbds=off [X86,INTEL]
ssbd=force-off [ARM64]
@ -6063,6 +6064,15 @@
sonypi.*= [HW] Sony Programmable I/O Control Device driver
See Documentation/admin-guide/laptops/sonypi.rst
spectre_bhi= [X86] Control mitigation of Branch History Injection
(BHI) vulnerability. This setting affects the
deployment of the HW BHI control and the SW BHB
clearing sequence.
on - (default) Enable the HW or SW mitigation
as needed.
off - Disable the mitigation.
spectre_v2= [X86,EARLY] Control mitigation of Spectre variant 2
(indirect branch speculation) vulnerability.
The default operation protects the kernel from

View File

@ -53,6 +53,15 @@ patternProperties:
compatible:
const: qcom,sm8150-dpu
"^displayport-controller@[0-9a-f]+$":
type: object
additionalProperties: true
properties:
compatible:
contains:
const: qcom,sm8150-dp
"^dsi@[0-9a-f]+$":
type: object
additionalProperties: true

View File

@ -52,6 +52,9 @@ properties:
- const: main
- const: mm
power-domains:
maxItems: 1
required:
- compatible
- reg

View File

@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Atmel Universal Synchronous Asynchronous Receiver/Transmitter (USART)
maintainers:
- Richard Genoud <richard.genoud@gmail.com>
- Richard Genoud <richard.genoud@bootlin.com>
properties:
compatible:

View File

@ -97,7 +97,6 @@ like this::
static struct virtio_driver virtio_dummy_driver = {
.driver.name = KBUILD_MODNAME,
.driver.owner = THIS_MODULE,
.id_table = id_table,
.probe = virtio_dummy_probe,
.remove = virtio_dummy_remove,

View File

@ -0,0 +1,11 @@
.. SPDX-License-Identifier: GPL-2.0
======================
bcachefs Documentation
======================
.. toctree::
:maxdepth: 2
:numbered:
errorcodes

View File

@ -69,6 +69,7 @@ Documentation for filesystem implementations.
afs
autofs
autofs-mount-control
bcachefs/index
befs
bfs
btrfs

View File

@ -24,10 +24,10 @@ fragmentation statistics can be obtained through gfp flag information of
each page. It is already implemented and activated if page owner is
enabled. Other usages are more than welcome.
It can also be used to show all the stacks and their outstanding
allocations, which gives us a quick overview of where the memory is going
without the need to screen through all the pages and match the allocation
and free operation.
It can also be used to show all the stacks and their current number of
allocated base pages, which gives us a quick overview of where the memory
is going without the need to screen through all the pages and match the
allocation and free operation.
page owner is disabled by default. So, if you'd like to use it, you need
to add "page_owner=on" to your boot cmdline. If the kernel is built
@ -75,42 +75,45 @@ Usage
cat /sys/kernel/debug/page_owner_stacks/show_stacks > stacks.txt
cat stacks.txt
prep_new_page+0xa9/0x120
get_page_from_freelist+0x7e6/0x2140
__alloc_pages+0x18a/0x370
new_slab+0xc8/0x580
___slab_alloc+0x1f2/0xaf0
__slab_alloc.isra.86+0x22/0x40
kmem_cache_alloc+0x31b/0x350
__khugepaged_enter+0x39/0x100
dup_mmap+0x1c7/0x5ce
copy_process+0x1afe/0x1c90
kernel_clone+0x9a/0x3c0
__do_sys_clone+0x66/0x90
do_syscall_64+0x7f/0x160
entry_SYSCALL_64_after_hwframe+0x6c/0x74
stack_count: 234
post_alloc_hook+0x177/0x1a0
get_page_from_freelist+0xd01/0xd80
__alloc_pages+0x39e/0x7e0
allocate_slab+0xbc/0x3f0
___slab_alloc+0x528/0x8a0
kmem_cache_alloc+0x224/0x3b0
sk_prot_alloc+0x58/0x1a0
sk_alloc+0x32/0x4f0
inet_create+0x427/0xb50
__sock_create+0x2e4/0x650
inet_ctl_sock_create+0x30/0x180
igmp_net_init+0xc1/0x130
ops_init+0x167/0x410
setup_net+0x304/0xa60
copy_net_ns+0x29b/0x4a0
create_new_namespaces+0x4a1/0x820
nr_base_pages: 16
...
...
echo 7000 > /sys/kernel/debug/page_owner_stacks/count_threshold
cat /sys/kernel/debug/page_owner_stacks/show_stacks> stacks_7000.txt
cat stacks_7000.txt
prep_new_page+0xa9/0x120
get_page_from_freelist+0x7e6/0x2140
__alloc_pages+0x18a/0x370
alloc_pages_mpol+0xdf/0x1e0
folio_alloc+0x14/0x50
filemap_alloc_folio+0xb0/0x100
page_cache_ra_unbounded+0x97/0x180
filemap_fault+0x4b4/0x1200
__do_fault+0x2d/0x110
do_pte_missing+0x4b0/0xa30
__handle_mm_fault+0x7fa/0xb70
handle_mm_fault+0x125/0x300
do_user_addr_fault+0x3c9/0x840
exc_page_fault+0x68/0x150
asm_exc_page_fault+0x22/0x30
stack_count: 8248
post_alloc_hook+0x177/0x1a0
get_page_from_freelist+0xd01/0xd80
__alloc_pages+0x39e/0x7e0
alloc_pages_mpol+0x22e/0x490
folio_alloc+0xd5/0x110
filemap_alloc_folio+0x78/0x230
page_cache_ra_order+0x287/0x6f0
filemap_get_pages+0x517/0x1160
filemap_read+0x304/0x9f0
xfs_file_buffered_read+0xe6/0x1d0 [xfs]
xfs_file_read_iter+0x1f0/0x380 [xfs]
__kernel_read+0x3b9/0x730
kernel_read_file+0x309/0x4d0
__do_sys_finit_module+0x381/0x730
do_syscall_64+0x8d/0x150
entry_SYSCALL_64_after_hwframe+0x62/0x6a
nr_base_pages: 20824
...
cat /sys/kernel/debug/page_owner > page_owner_full.txt

View File

@ -252,7 +252,7 @@ an involved disclosed party. The current ambassadors list:
AMD Tom Lendacky <thomas.lendacky@amd.com>
Ampere Darren Hart <darren@os.amperecomputing.com>
ARM Catalin Marinas <catalin.marinas@arm.com>
IBM Power Anton Blanchard <anton@linux.ibm.com>
IBM Power Michael Ellerman <ellerman@au.ibm.com>
IBM Z Christian Borntraeger <borntraeger@de.ibm.com>
Intel Tony Luck <tony.luck@intel.com>
Qualcomm Trilok Soni <quic_tsoni@quicinc.com>

View File

@ -2191,7 +2191,6 @@ N: mxs
ARM/FREESCALE LAYERSCAPE ARM ARCHITECTURE
M: Shawn Guo <shawnguo@kernel.org>
M: Li Yang <leoyang.li@nxp.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git
@ -2708,7 +2707,7 @@ F: sound/soc/rockchip/
N: rockchip
ARM/SAMSUNG S3C, S5P AND EXYNOS ARM ARCHITECTURES
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
R: Alim Akhtar <alim.akhtar@samsung.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: linux-samsung-soc@vger.kernel.org
@ -3573,6 +3572,7 @@ S: Supported
C: irc://irc.oftc.net/bcache
T: git https://evilpiepirate.org/git/bcachefs.git
F: fs/bcachefs/
F: Documentation/filesystems/bcachefs/
BDISP ST MEDIA DRIVER
M: Fabien Dessenne <fabien.dessenne@foss.st.com>
@ -4869,7 +4869,6 @@ F: drivers/power/supply/cw2015_battery.c
CEPH COMMON CODE (LIBCEPH)
M: Ilya Dryomov <idryomov@gmail.com>
M: Xiubo Li <xiubli@redhat.com>
R: Jeff Layton <jlayton@kernel.org>
L: ceph-devel@vger.kernel.org
S: Supported
W: http://ceph.com/
@ -4881,7 +4880,6 @@ F: net/ceph/
CEPH DISTRIBUTED FILE SYSTEM CLIENT (CEPH)
M: Xiubo Li <xiubli@redhat.com>
M: Ilya Dryomov <idryomov@gmail.com>
R: Jeff Layton <jlayton@kernel.org>
L: ceph-devel@vger.kernel.org
S: Supported
W: http://ceph.com/
@ -5557,7 +5555,7 @@ F: drivers/cpuidle/cpuidle-big_little.c
CPUIDLE DRIVER - ARM EXYNOS
M: Daniel Lezcano <daniel.lezcano@linaro.org>
M: Kukjin Kim <kgene@kernel.org>
R: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
R: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-pm@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
S: Maintained
@ -8523,7 +8521,6 @@ S: Maintained
F: drivers/video/fbdev/fsl-diu-fb.*
FREESCALE DMA DRIVER
M: Li Yang <leoyang.li@nxp.com>
M: Zhang Wei <zw@zh-kernel.org>
L: linuxppc-dev@lists.ozlabs.org
S: Maintained
@ -8688,10 +8685,9 @@ F: drivers/soc/fsl/qe/tsa.h
F: include/dt-bindings/soc/cpm1-fsl,tsa.h
FREESCALE QUICC ENGINE UCC ETHERNET DRIVER
M: Li Yang <leoyang.li@nxp.com>
L: netdev@vger.kernel.org
L: linuxppc-dev@lists.ozlabs.org
S: Maintained
S: Orphan
F: drivers/net/ethernet/freescale/ucc_geth*
FREESCALE QUICC ENGINE UCC HDLC DRIVER
@ -8708,10 +8704,9 @@ S: Maintained
F: drivers/tty/serial/ucc_uart.c
FREESCALE SOC DRIVERS
M: Li Yang <leoyang.li@nxp.com>
L: linuxppc-dev@lists.ozlabs.org
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
S: Orphan
F: Documentation/devicetree/bindings/misc/fsl,dpaa2-console.yaml
F: Documentation/devicetree/bindings/soc/fsl/
F: drivers/soc/fsl/
@ -8745,17 +8740,15 @@ F: Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
F: sound/soc/fsl/fsl_qmc_audio.c
FREESCALE USB PERIPHERAL DRIVERS
M: Li Yang <leoyang.li@nxp.com>
L: linux-usb@vger.kernel.org
L: linuxppc-dev@lists.ozlabs.org
S: Maintained
S: Orphan
F: drivers/usb/gadget/udc/fsl*
FREESCALE USB PHY DRIVER
M: Ran Wang <ran.wang_1@nxp.com>
L: linux-usb@vger.kernel.org
L: linuxppc-dev@lists.ozlabs.org
S: Maintained
S: Orphan
F: drivers/usb/phy/phy-fsl-usb*
FREEVXFS FILESYSTEM
@ -9000,7 +8993,7 @@ F: drivers/i2c/muxes/i2c-mux-gpio.c
F: include/linux/platform_data/i2c-mux-gpio.h
GENERIC GPIO RESET DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
S: Maintained
F: drivers/reset/reset-gpio.c
@ -10030,7 +10023,7 @@ F: drivers/media/platform/st/sti/hva
HWPOISON MEMORY FAILURE HANDLING
M: Miaohe Lin <linmiaohe@huawei.com>
R: Naoya Horiguchi <naoya.horiguchi@nec.com>
R: Naoya Horiguchi <nao.horiguchi@gmail.com>
L: linux-mm@kvack.org
S: Maintained
F: mm/hwpoison-inject.c
@ -12001,7 +11994,7 @@ F: include/keys/encrypted-type.h
F: security/keys/encrypted-keys/
KEYS-TRUSTED
M: James Bottomley <jejb@linux.ibm.com>
M: James Bottomley <James.Bottomley@HansenPartnership.com>
M: Jarkko Sakkinen <jarkko@kernel.org>
M: Mimi Zohar <zohar@linux.ibm.com>
L: linux-integrity@vger.kernel.org
@ -13295,7 +13288,7 @@ F: drivers/iio/adc/max11205.c
MAXIM MAX17040 FAMILY FUEL GAUGE DRIVERS
R: Iskren Chernev <iskren.chernev@gmail.com>
R: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
R: Krzysztof Kozlowski <krzk@kernel.org>
R: Marek Szyprowski <m.szyprowski@samsung.com>
R: Matheus Castello <matheus@castello.eng.br>
L: linux-pm@vger.kernel.org
@ -13305,7 +13298,7 @@ F: drivers/power/supply/max17040_battery.c
MAXIM MAX17042 FAMILY FUEL GAUGE DRIVERS
R: Hans de Goede <hdegoede@redhat.com>
R: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
R: Krzysztof Kozlowski <krzk@kernel.org>
R: Marek Szyprowski <m.szyprowski@samsung.com>
R: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
R: Purism Kernel Team <kernel@puri.sm>
@ -13363,7 +13356,7 @@ F: Documentation/devicetree/bindings/power/supply/maxim,max77976.yaml
F: drivers/power/supply/max77976_charger.c
MAXIM MUIC CHARGER DRIVERS FOR EXYNOS BASED BOARDS
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-pm@vger.kernel.org
S: Maintained
B: mailto:linux-samsung-soc@vger.kernel.org
@ -13374,7 +13367,7 @@ F: drivers/power/supply/max77693_charger.c
MAXIM PMIC AND MUIC DRIVERS FOR EXYNOS BASED BOARDS
M: Chanwoo Choi <cw00.choi@samsung.com>
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
B: mailto:linux-samsung-soc@vger.kernel.org
@ -14158,7 +14151,7 @@ F: mm/mm_init.c
F: tools/testing/memblock/
MEMORY CONTROLLER DRIVERS
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
B: mailto:krzysztof.kozlowski@linaro.org
@ -14363,7 +14356,7 @@ F: drivers/dma/at_xdmac.c
F: include/dt-bindings/dma/at91.h
MICROCHIP AT91 SERIAL DRIVER
M: Richard Genoud <richard.genoud@gmail.com>
M: Richard Genoud <richard.genoud@bootlin.com>
S: Maintained
F: Documentation/devicetree/bindings/serial/atmel,at91-usart.yaml
F: drivers/tty/serial/atmel_serial.c
@ -15539,7 +15532,7 @@ F: include/uapi/linux/nexthop.h
F: net/ipv4/nexthop.c
NFC SUBSYSTEM
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: netdev@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/net/nfc/
@ -15916,7 +15909,7 @@ F: Documentation/devicetree/bindings/regulator/nxp,pf8x00-regulator.yaml
F: drivers/regulator/pf8x00-regulator.c
NXP PTN5150A CC LOGIC AND EXTCON DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/extcon/extcon-ptn5150.yaml
@ -16527,7 +16520,7 @@ K: of_overlay_remove
OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS
M: Rob Herring <robh@kernel.org>
M: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
M: Krzysztof Kozlowski <krzk+dt@kernel.org>
M: Conor Dooley <conor+dt@kernel.org>
L: devicetree@vger.kernel.org
S: Maintained
@ -16974,7 +16967,6 @@ F: drivers/pci/controller/dwc/pci-exynos.c
PCI DRIVER FOR SYNOPSYS DESIGNWARE
M: Jingoo Han <jingoohan1@gmail.com>
M: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
M: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
L: linux-pci@vger.kernel.org
S: Maintained
@ -17485,7 +17477,7 @@ F: Documentation/devicetree/bindings/pinctrl/renesas,*
F: drivers/pinctrl/renesas/
PIN CONTROLLER - SAMSUNG
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Sylwester Nawrocki <s.nawrocki@samsung.com>
R: Alim Akhtar <alim.akhtar@samsung.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
@ -19453,7 +19445,7 @@ F: Documentation/devicetree/bindings/sound/samsung*
F: sound/soc/samsung/
SAMSUNG EXYNOS PSEUDO RANDOM NUMBER GENERATOR (RNG) DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-crypto@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
S: Maintained
@ -19488,7 +19480,7 @@ S: Maintained
F: drivers/platform/x86/samsung-laptop.c
SAMSUNG MULTIFUNCTION PMIC DEVICE DRIVERS
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
S: Maintained
@ -19514,7 +19506,7 @@ F: drivers/media/platform/samsung/s3c-camif/
F: include/media/drv-intf/s3c_camif.h
SAMSUNG S3FWRN5 NFC DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
S: Maintained
F: Documentation/devicetree/bindings/net/nfc/samsung,s3fwrn5.yaml
F: drivers/nfc/s3fwrn5
@ -19535,7 +19527,7 @@ S: Supported
F: drivers/media/i2c/s5k5baf.c
SAMSUNG S5P Security SubSystem (SSS) DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Vladimir Zapolskiy <vz@mleia.com>
L: linux-crypto@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
@ -19557,7 +19549,7 @@ F: Documentation/devicetree/bindings/media/samsung,fimc.yaml
F: drivers/media/platform/samsung/exynos4-is/
SAMSUNG SOC CLOCK DRIVERS
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
M: Sylwester Nawrocki <s.nawrocki@samsung.com>
M: Chanwoo Choi <cw00.choi@samsung.com>
R: Alim Akhtar <alim.akhtar@samsung.com>
@ -19589,7 +19581,7 @@ F: drivers/net/ethernet/samsung/sxgbe/
SAMSUNG THERMAL DRIVER
M: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-pm@vger.kernel.org
L: linux-samsung-soc@vger.kernel.org
S: Maintained
@ -19676,7 +19668,7 @@ F: drivers/scsi/sg.c
F: include/scsi/sg.h
SCSI SUBSYSTEM
M: "James E.J. Bottomley" <jejb@linux.ibm.com>
M: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
M: "Martin K. Petersen" <martin.petersen@oracle.com>
L: linux-scsi@vger.kernel.org
S: Maintained
@ -22577,6 +22569,7 @@ Q: https://patchwork.kernel.org/project/linux-pm/list/
B: https://bugzilla.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git turbostat
F: tools/power/x86/turbostat/
F: tools/testing/selftests/turbostat/
TW5864 VIDEO4LINUX DRIVER
M: Bluecherry Maintainers <maintainers@bluecherrydvr.com>
@ -23785,7 +23778,7 @@ S: Orphan
F: drivers/mmc/host/vub300.c
W1 DALLAS'S 1-WIRE BUS
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
S: Maintained
F: Documentation/devicetree/bindings/w1/
F: Documentation/w1/

View File

@ -2,7 +2,7 @@
VERSION = 6
PATCHLEVEL = 9
SUBLEVEL = 0
EXTRAVERSION = -rc3
EXTRAVERSION = -rc5
NAME = Hurr durr I'ma ninja sloth
# *DOCUMENTATION*

View File

@ -1172,12 +1172,12 @@ config PAGE_SIZE_LESS_THAN_256KB
config PAGE_SHIFT
int
default 12 if PAGE_SIZE_4KB
default 13 if PAGE_SIZE_8KB
default 14 if PAGE_SIZE_16KB
default 15 if PAGE_SIZE_32KB
default 16 if PAGE_SIZE_64KB
default 18 if PAGE_SIZE_256KB
default 12 if PAGE_SIZE_4KB
default 13 if PAGE_SIZE_8KB
default 14 if PAGE_SIZE_16KB
default 15 if PAGE_SIZE_32KB
default 16 if PAGE_SIZE_64KB
default 18 if PAGE_SIZE_256KB
# This allows to use a set of generic functions to determine mmap base
# address by giving priority to top-down scheme only if the process

View File

@ -666,7 +666,7 @@ &usdhc1 {
bus-width = <4>;
no-1-8-v;
no-sdio;
no-emmc;
no-mmc;
status = "okay";
};

View File

@ -210,6 +210,7 @@ ov2680_to_mipi: endpoint {
remote-endpoint = <&mipi_from_sensor>;
clock-lanes = <0>;
data-lanes = <1>;
link-frequencies = /bits/ 64 <330000000>;
};
};
};

View File

@ -79,10 +79,8 @@ static struct musb_hdrc_platform_data tusb_data = {
static struct gpiod_lookup_table tusb_gpio_table = {
.dev_id = "musb-tusb",
.table = {
GPIO_LOOKUP("gpio-0-15", 0, "enable",
GPIO_ACTIVE_HIGH),
GPIO_LOOKUP("gpio-48-63", 10, "int",
GPIO_ACTIVE_HIGH),
GPIO_LOOKUP("gpio-0-31", 0, "enable", GPIO_ACTIVE_HIGH),
GPIO_LOOKUP("gpio-32-63", 26, "int", GPIO_ACTIVE_HIGH),
{ }
},
};
@ -140,12 +138,11 @@ static int slot1_cover_open;
static int slot2_cover_open;
static struct device *mmc_device;
static struct gpiod_lookup_table nokia8xx_mmc_gpio_table = {
static struct gpiod_lookup_table nokia800_mmc_gpio_table = {
.dev_id = "mmci-omap.0",
.table = {
/* Slot switch, GPIO 96 */
GPIO_LOOKUP("gpio-80-111", 16,
"switch", GPIO_ACTIVE_HIGH),
GPIO_LOOKUP("gpio-96-127", 0, "switch", GPIO_ACTIVE_HIGH),
{ }
},
};
@ -153,12 +150,12 @@ static struct gpiod_lookup_table nokia8xx_mmc_gpio_table = {
static struct gpiod_lookup_table nokia810_mmc_gpio_table = {
.dev_id = "mmci-omap.0",
.table = {
/* Slot switch, GPIO 96 */
GPIO_LOOKUP("gpio-96-127", 0, "switch", GPIO_ACTIVE_HIGH),
/* Slot index 1, VSD power, GPIO 23 */
GPIO_LOOKUP_IDX("gpio-16-31", 7,
"vsd", 1, GPIO_ACTIVE_HIGH),
GPIO_LOOKUP_IDX("gpio-0-31", 23, "vsd", 1, GPIO_ACTIVE_HIGH),
/* Slot index 1, VIO power, GPIO 9 */
GPIO_LOOKUP_IDX("gpio-0-15", 9,
"vio", 1, GPIO_ACTIVE_HIGH),
GPIO_LOOKUP_IDX("gpio-0-31", 9, "vio", 1, GPIO_ACTIVE_HIGH),
{ }
},
};
@ -415,8 +412,6 @@ static struct omap_mmc_platform_data *mmc_data[OMAP24XX_NR_MMC];
static void __init n8x0_mmc_init(void)
{
gpiod_add_lookup_table(&nokia8xx_mmc_gpio_table);
if (board_is_n810()) {
mmc1_data.slots[0].name = "external";
@ -429,6 +424,8 @@ static void __init n8x0_mmc_init(void)
mmc1_data.slots[1].name = "internal";
mmc1_data.slots[1].ban_openended = 1;
gpiod_add_lookup_table(&nokia810_mmc_gpio_table);
} else {
gpiod_add_lookup_table(&nokia800_mmc_gpio_table);
}
mmc1_data.nr_slots = 2;

View File

@ -41,7 +41,7 @@ usbotg1: usb@5b0d0000 {
interrupts = <GIC_SPI 267 IRQ_TYPE_LEVEL_HIGH>;
fsl,usbphy = <&usbphy1>;
fsl,usbmisc = <&usbmisc1 0>;
clocks = <&usb2_lpcg 0>;
clocks = <&usb2_lpcg IMX_LPCG_CLK_6>;
ahb-burst-config = <0x0>;
tx-burst-size-dword = <0x10>;
rx-burst-size-dword = <0x10>;
@ -58,7 +58,7 @@ usbmisc1: usbmisc@5b0d0200 {
usbphy1: usbphy@5b100000 {
compatible = "fsl,imx7ulp-usbphy";
reg = <0x5b100000 0x1000>;
clocks = <&usb2_lpcg 1>;
clocks = <&usb2_lpcg IMX_LPCG_CLK_7>;
power-domains = <&pd IMX_SC_R_USB_0_PHY>;
status = "disabled";
};
@ -67,8 +67,8 @@ usdhc1: mmc@5b010000 {
interrupts = <GIC_SPI 232 IRQ_TYPE_LEVEL_HIGH>;
reg = <0x5b010000 0x10000>;
clocks = <&sdhc0_lpcg IMX_LPCG_CLK_4>,
<&sdhc0_lpcg IMX_LPCG_CLK_0>,
<&sdhc0_lpcg IMX_LPCG_CLK_5>;
<&sdhc0_lpcg IMX_LPCG_CLK_5>,
<&sdhc0_lpcg IMX_LPCG_CLK_0>;
clock-names = "ipg", "ahb", "per";
power-domains = <&pd IMX_SC_R_SDHC_0>;
status = "disabled";
@ -78,8 +78,8 @@ usdhc2: mmc@5b020000 {
interrupts = <GIC_SPI 233 IRQ_TYPE_LEVEL_HIGH>;
reg = <0x5b020000 0x10000>;
clocks = <&sdhc1_lpcg IMX_LPCG_CLK_4>,
<&sdhc1_lpcg IMX_LPCG_CLK_0>,
<&sdhc1_lpcg IMX_LPCG_CLK_5>;
<&sdhc1_lpcg IMX_LPCG_CLK_5>,
<&sdhc1_lpcg IMX_LPCG_CLK_0>;
clock-names = "ipg", "ahb", "per";
power-domains = <&pd IMX_SC_R_SDHC_1>;
fsl,tuning-start-tap = <20>;
@ -91,8 +91,8 @@ usdhc3: mmc@5b030000 {
interrupts = <GIC_SPI 234 IRQ_TYPE_LEVEL_HIGH>;
reg = <0x5b030000 0x10000>;
clocks = <&sdhc2_lpcg IMX_LPCG_CLK_4>,
<&sdhc2_lpcg IMX_LPCG_CLK_0>,
<&sdhc2_lpcg IMX_LPCG_CLK_5>;
<&sdhc2_lpcg IMX_LPCG_CLK_5>,
<&sdhc2_lpcg IMX_LPCG_CLK_0>;
clock-names = "ipg", "ahb", "per";
power-domains = <&pd IMX_SC_R_SDHC_2>;
status = "disabled";

View File

@ -28,8 +28,8 @@ lpspi0: spi@5a000000 {
#size-cells = <0>;
interrupts = <GIC_SPI 336 IRQ_TYPE_LEVEL_HIGH>;
interrupt-parent = <&gic>;
clocks = <&spi0_lpcg 0>,
<&spi0_lpcg 1>;
clocks = <&spi0_lpcg IMX_LPCG_CLK_0>,
<&spi0_lpcg IMX_LPCG_CLK_4>;
clock-names = "per", "ipg";
assigned-clocks = <&clk IMX_SC_R_SPI_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <60000000>;
@ -44,8 +44,8 @@ lpspi1: spi@5a010000 {
#size-cells = <0>;
interrupts = <GIC_SPI 337 IRQ_TYPE_LEVEL_HIGH>;
interrupt-parent = <&gic>;
clocks = <&spi1_lpcg 0>,
<&spi1_lpcg 1>;
clocks = <&spi1_lpcg IMX_LPCG_CLK_0>,
<&spi1_lpcg IMX_LPCG_CLK_4>;
clock-names = "per", "ipg";
assigned-clocks = <&clk IMX_SC_R_SPI_1 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <60000000>;
@ -60,8 +60,8 @@ lpspi2: spi@5a020000 {
#size-cells = <0>;
interrupts = <GIC_SPI 338 IRQ_TYPE_LEVEL_HIGH>;
interrupt-parent = <&gic>;
clocks = <&spi2_lpcg 0>,
<&spi2_lpcg 1>;
clocks = <&spi2_lpcg IMX_LPCG_CLK_0>,
<&spi2_lpcg IMX_LPCG_CLK_4>;
clock-names = "per", "ipg";
assigned-clocks = <&clk IMX_SC_R_SPI_2 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <60000000>;
@ -76,8 +76,8 @@ lpspi3: spi@5a030000 {
#size-cells = <0>;
interrupts = <GIC_SPI 339 IRQ_TYPE_LEVEL_HIGH>;
interrupt-parent = <&gic>;
clocks = <&spi3_lpcg 0>,
<&spi3_lpcg 1>;
clocks = <&spi3_lpcg IMX_LPCG_CLK_0>,
<&spi3_lpcg IMX_LPCG_CLK_4>;
clock-names = "per", "ipg";
assigned-clocks = <&clk IMX_SC_R_SPI_3 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <60000000>;
@ -145,8 +145,8 @@ adma_pwm: pwm@5a190000 {
compatible = "fsl,imx8qxp-pwm", "fsl,imx27-pwm";
reg = <0x5a190000 0x1000>;
interrupts = <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&adma_pwm_lpcg 1>,
<&adma_pwm_lpcg 0>;
clocks = <&adma_pwm_lpcg IMX_LPCG_CLK_4>,
<&adma_pwm_lpcg IMX_LPCG_CLK_0>;
clock-names = "ipg", "per";
assigned-clocks = <&clk IMX_SC_R_LCD_0_PWM_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
@ -355,8 +355,8 @@ adc0: adc@5a880000 {
reg = <0x5a880000 0x10000>;
interrupts = <GIC_SPI 240 IRQ_TYPE_LEVEL_HIGH>;
interrupt-parent = <&gic>;
clocks = <&adc0_lpcg 0>,
<&adc0_lpcg 1>;
clocks = <&adc0_lpcg IMX_LPCG_CLK_0>,
<&adc0_lpcg IMX_LPCG_CLK_4>;
clock-names = "per", "ipg";
assigned-clocks = <&clk IMX_SC_R_ADC_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
@ -370,8 +370,8 @@ adc1: adc@5a890000 {
reg = <0x5a890000 0x10000>;
interrupts = <GIC_SPI 241 IRQ_TYPE_LEVEL_HIGH>;
interrupt-parent = <&gic>;
clocks = <&adc1_lpcg 0>,
<&adc1_lpcg 1>;
clocks = <&adc1_lpcg IMX_LPCG_CLK_0>,
<&adc1_lpcg IMX_LPCG_CLK_4>;
clock-names = "per", "ipg";
assigned-clocks = <&clk IMX_SC_R_ADC_1 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
@ -384,8 +384,8 @@ flexcan1: can@5a8d0000 {
reg = <0x5a8d0000 0x10000>;
interrupts = <GIC_SPI 235 IRQ_TYPE_LEVEL_HIGH>;
interrupt-parent = <&gic>;
clocks = <&can0_lpcg 1>,
<&can0_lpcg 0>;
clocks = <&can0_lpcg IMX_LPCG_CLK_4>,
<&can0_lpcg IMX_LPCG_CLK_0>;
clock-names = "ipg", "per";
assigned-clocks = <&clk IMX_SC_R_CAN_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <40000000>;
@ -405,8 +405,8 @@ flexcan2: can@5a8e0000 {
* CAN1 shares CAN0's clock and to enable CAN0's clock it
* has to be powered on.
*/
clocks = <&can0_lpcg 1>,
<&can0_lpcg 0>;
clocks = <&can0_lpcg IMX_LPCG_CLK_4>,
<&can0_lpcg IMX_LPCG_CLK_0>;
clock-names = "ipg", "per";
assigned-clocks = <&clk IMX_SC_R_CAN_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <40000000>;
@ -426,8 +426,8 @@ flexcan3: can@5a8f0000 {
* CAN2 shares CAN0's clock and to enable CAN0's clock it
* has to be powered on.
*/
clocks = <&can0_lpcg 1>,
<&can0_lpcg 0>;
clocks = <&can0_lpcg IMX_LPCG_CLK_4>,
<&can0_lpcg IMX_LPCG_CLK_0>;
clock-names = "ipg", "per";
assigned-clocks = <&clk IMX_SC_R_CAN_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <40000000>;

View File

@ -25,8 +25,8 @@ lsio_pwm0: pwm@5d000000 {
compatible = "fsl,imx27-pwm";
reg = <0x5d000000 0x10000>;
clock-names = "ipg", "per";
clocks = <&pwm0_lpcg 4>,
<&pwm0_lpcg 1>;
clocks = <&pwm0_lpcg IMX_LPCG_CLK_6>,
<&pwm0_lpcg IMX_LPCG_CLK_1>;
assigned-clocks = <&clk IMX_SC_R_PWM_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
#pwm-cells = <3>;
@ -38,8 +38,8 @@ lsio_pwm1: pwm@5d010000 {
compatible = "fsl,imx27-pwm";
reg = <0x5d010000 0x10000>;
clock-names = "ipg", "per";
clocks = <&pwm1_lpcg 4>,
<&pwm1_lpcg 1>;
clocks = <&pwm1_lpcg IMX_LPCG_CLK_6>,
<&pwm1_lpcg IMX_LPCG_CLK_1>;
assigned-clocks = <&clk IMX_SC_R_PWM_1 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
#pwm-cells = <3>;
@ -51,8 +51,8 @@ lsio_pwm2: pwm@5d020000 {
compatible = "fsl,imx27-pwm";
reg = <0x5d020000 0x10000>;
clock-names = "ipg", "per";
clocks = <&pwm2_lpcg 4>,
<&pwm2_lpcg 1>;
clocks = <&pwm2_lpcg IMX_LPCG_CLK_6>,
<&pwm2_lpcg IMX_LPCG_CLK_1>;
assigned-clocks = <&clk IMX_SC_R_PWM_2 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
#pwm-cells = <3>;
@ -64,8 +64,8 @@ lsio_pwm3: pwm@5d030000 {
compatible = "fsl,imx27-pwm";
reg = <0x5d030000 0x10000>;
clock-names = "ipg", "per";
clocks = <&pwm3_lpcg 4>,
<&pwm3_lpcg 1>;
clocks = <&pwm3_lpcg IMX_LPCG_CLK_6>,
<&pwm3_lpcg IMX_LPCG_CLK_1>;
assigned-clocks = <&clk IMX_SC_R_PWM_3 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
#pwm-cells = <3>;

View File

@ -14,6 +14,7 @@ connector {
pinctrl-0 = <&pinctrl_usbcon1>;
type = "micro";
label = "otg";
vbus-supply = <&reg_usb1_vbus>;
id-gpios = <&gpio3 21 GPIO_ACTIVE_HIGH>;
port {
@ -183,7 +184,6 @@ &usb3_0 {
};
&usb3_phy0 {
vbus-supply = <&reg_usb1_vbus>;
status = "okay";
};

View File

@ -14,6 +14,7 @@ connector {
pinctrl-0 = <&pinctrl_usbcon1>;
type = "micro";
label = "otg";
vbus-supply = <&reg_usb1_vbus>;
id-gpios = <&gpio3 21 GPIO_ACTIVE_HIGH>;
port {
@ -202,7 +203,6 @@ &usb3_0 {
};
&usb3_phy0 {
vbus-supply = <&reg_usb1_vbus>;
status = "okay";
};

View File

@ -153,15 +153,15 @@ &flexcan1 {
};
&flexcan2 {
clocks = <&can1_lpcg 1>,
<&can1_lpcg 0>;
clocks = <&can1_lpcg IMX_LPCG_CLK_4>,
<&can1_lpcg IMX_LPCG_CLK_0>;
assigned-clocks = <&clk IMX_SC_R_CAN_1 IMX_SC_PM_CLK_PER>;
fsl,clk-source = /bits/ 8 <1>;
};
&flexcan3 {
clocks = <&can2_lpcg 1>,
<&can2_lpcg 0>;
clocks = <&can2_lpcg IMX_LPCG_CLK_4>,
<&can2_lpcg IMX_LPCG_CLK_0>;
assigned-clocks = <&clk IMX_SC_R_CAN_2 IMX_SC_PM_CLK_PER>;
fsl,clk-source = /bits/ 8 <1>;
};

View File

@ -161,12 +161,18 @@ static inline unsigned long get_trans_granule(void)
#define MAX_TLBI_RANGE_PAGES __TLBI_RANGE_PAGES(31, 3)
/*
* Generate 'num' values from -1 to 30 with -1 rejected by the
* __flush_tlb_range() loop below.
* Generate 'num' values from -1 to 31 with -1 rejected by the
* __flush_tlb_range() loop below. Its return value is only
* significant for a maximum of MAX_TLBI_RANGE_PAGES pages. If
* 'pages' is more than that, you must iterate over the overall
* range.
*/
#define TLBI_RANGE_MASK GENMASK_ULL(4, 0)
#define __TLBI_RANGE_NUM(pages, scale) \
((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1)
#define __TLBI_RANGE_NUM(pages, scale) \
({ \
int __pages = min((pages), \
__TLBI_RANGE_PAGES(31, (scale))); \
(__pages >> (5 * (scale) + 1)) - 1; \
})
/*
* TLB Invalidation
@ -379,10 +385,6 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* 3. If there is 1 page remaining, flush it through non-range operations. Range
* operations can only span an even number of pages. We save this for last to
* ensure 64KB start alignment is maintained for the LPA2 case.
*
* Note that certain ranges can be represented by either num = 31 and
* scale or num = 0 and scale + 1. The loop below favours the latter
* since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
*/
#define __flush_tlb_range_op(op, start, pages, stride, \
asid, tlb_level, tlbi_user, lpa2) \

View File

@ -289,6 +289,11 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
adr_l x1, __hyp_text_end
adr_l x2, dcache_clean_poc
blr x2
mov_q x0, INIT_SCTLR_EL2_MMU_OFF
pre_disable_mmu_workaround
msr sctlr_el2, x0
isb
0:
mov_q x0, HCR_HOST_NVHE_FLAGS
@ -323,13 +328,11 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
cbz x0, 2f
/* Set a sane SCTLR_EL1, the VHE way */
pre_disable_mmu_workaround
msr_s SYS_SCTLR_EL12, x1
mov x2, #BOOT_CPU_FLAG_E2H
b 3f
2:
pre_disable_mmu_workaround
msr sctlr_el1, x1
mov x2, xzr
3:

View File

@ -276,7 +276,10 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
pte_t *ptep = NULL;
pgdp = pgd_offset(mm, addr);
p4dp = p4d_offset(pgdp, addr);
p4dp = p4d_alloc(mm, pgdp, addr);
if (!p4dp)
return NULL;
pudp = pud_alloc(mm, p4dp, addr);
if (!pudp)
return NULL;

View File

@ -219,9 +219,6 @@ bool kernel_page_present(struct page *page)
pte_t *ptep;
unsigned long addr = (unsigned long)page_address(page);
if (!can_set_direct_map())
return true;
pgdp = pgd_offset_k(addr);
if (pgd_none(READ_ONCE(*pgdp)))
return false;

View File

@ -100,6 +100,13 @@ bus@10000000 {
#size-cells = <2>;
dma-coherent;
isa@18000000 {
compatible = "isa";
#size-cells = <1>;
#address-cells = <2>;
ranges = <1 0x0 0x0 0x18000000 0x4000>;
};
liointc0: interrupt-controller@1fe01400 {
compatible = "loongson,liointc-2.0";
reg = <0x0 0x1fe01400 0x0 0x40>,

View File

@ -61,12 +61,45 @@ &xhci1 {
&gmac0 {
status = "okay";
phy-mode = "gmii";
phy-handle = <&phy0>;
mdio {
compatible = "snps,dwmac-mdio";
#address-cells = <1>;
#size-cells = <0>;
phy0: ethernet-phy@0 {
reg = <2>;
};
};
};
&gmac1 {
status = "okay";
phy-mode = "gmii";
phy-handle = <&phy1>;
mdio {
compatible = "snps,dwmac-mdio";
#address-cells = <1>;
#size-cells = <0>;
phy1: ethernet-phy@1 {
reg = <2>;
};
};
};
&gmac2 {
status = "okay";
phy-mode = "rgmii";
phy-handle = <&phy2>;
mdio {
compatible = "snps,dwmac-mdio";
#address-cells = <1>;
#size-cells = <0>;
phy2: ethernet-phy@2 {
reg = <0>;
};
};
};

View File

@ -51,6 +51,13 @@ bus@10000000 {
#address-cells = <2>;
#size-cells = <2>;
isa@18400000 {
compatible = "isa";
#size-cells = <1>;
#address-cells = <2>;
ranges = <1 0x0 0x0 0x18400000 0x4000>;
};
pmc: power-management@100d0000 {
compatible = "loongson,ls2k2000-pmc", "loongson,ls2k0500-pmc", "syscon";
reg = <0x0 0x100d0000 0x0 0x58>;
@ -109,6 +116,8 @@ pic: interrupt-controller@10000000 {
msi: msi-controller@1fe01140 {
compatible = "loongson,pch-msi-1.0";
reg = <0x0 0x1fe01140 0x0 0x8>;
interrupt-controller;
#interrupt-cells = <1>;
msi-controller;
loongson,msi-base-vec = <64>;
loongson,msi-num-vecs = <192>;
@ -140,27 +149,34 @@ pcie@1a000000 {
#address-cells = <3>;
#size-cells = <2>;
device_type = "pci";
msi-parent = <&msi>;
bus-range = <0x0 0xff>;
ranges = <0x01000000 0x0 0x00008000 0x0 0x18400000 0x0 0x00008000>,
ranges = <0x01000000 0x0 0x00008000 0x0 0x18408000 0x0 0x00008000>,
<0x02000000 0x0 0x60000000 0x0 0x60000000 0x0 0x20000000>;
gmac0: ethernet@3,0 {
reg = <0x1800 0x0 0x0 0x0 0x0>;
interrupts = <12 IRQ_TYPE_LEVEL_HIGH>;
interrupts = <12 IRQ_TYPE_LEVEL_HIGH>,
<13 IRQ_TYPE_LEVEL_HIGH>;
interrupt-names = "macirq", "eth_lpi";
interrupt-parent = <&pic>;
status = "disabled";
};
gmac1: ethernet@3,1 {
reg = <0x1900 0x0 0x0 0x0 0x0>;
interrupts = <14 IRQ_TYPE_LEVEL_HIGH>;
interrupts = <14 IRQ_TYPE_LEVEL_HIGH>,
<15 IRQ_TYPE_LEVEL_HIGH>;
interrupt-names = "macirq", "eth_lpi";
interrupt-parent = <&pic>;
status = "disabled";
};
gmac2: ethernet@3,2 {
reg = <0x1a00 0x0 0x0 0x0 0x0>;
interrupts = <17 IRQ_TYPE_LEVEL_HIGH>;
interrupts = <17 IRQ_TYPE_LEVEL_HIGH>,
<18 IRQ_TYPE_LEVEL_HIGH>;
interrupt-names = "macirq", "eth_lpi";
interrupt-parent = <&pic>;
status = "disabled";
};

View File

@ -11,6 +11,7 @@
#define _ASM_ADDRSPACE_H
#include <linux/const.h>
#include <linux/sizes.h>
#include <asm/loongarch.h>

View File

@ -14,11 +14,6 @@
#include <asm/pgtable-bits.h>
#include <asm/string.h>
/*
* Change "struct page" to physical address.
*/
#define page_to_phys(page) ((phys_addr_t)page_to_pfn(page) << PAGE_SHIFT)
extern void __init __iomem *early_ioremap(u64 phys_addr, unsigned long size);
extern void __init early_iounmap(void __iomem *addr, unsigned long size);
@ -73,6 +68,21 @@ extern void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t
#define __io_aw() mmiowb()
#ifdef CONFIG_KFENCE
#define virt_to_phys(kaddr) \
({ \
(likely((unsigned long)kaddr < vm_map_base)) ? __pa((unsigned long)kaddr) : \
page_to_phys(tlb_virt_to_page((unsigned long)kaddr)) + offset_in_page((unsigned long)kaddr);\
})
#define phys_to_virt(paddr) \
({ \
extern char *__kfence_pool; \
(unlikely(__kfence_pool == NULL)) ? __va((unsigned long)paddr) : \
page_address(phys_to_page((unsigned long)paddr)) + offset_in_page((unsigned long)paddr);\
})
#endif
#include <asm-generic/io.h>
#define ARCH_HAS_VALID_PHYS_ADDR_RANGE

View File

@ -16,6 +16,7 @@
static inline bool arch_kfence_init_pool(void)
{
int err;
char *kaddr, *vaddr;
char *kfence_pool = __kfence_pool;
struct vm_struct *area;
@ -35,6 +36,14 @@ static inline bool arch_kfence_init_pool(void)
return false;
}
kaddr = kfence_pool;
vaddr = __kfence_pool;
while (kaddr < kfence_pool + KFENCE_POOL_SIZE) {
set_page_address(virt_to_page(kaddr), vaddr);
kaddr += PAGE_SIZE;
vaddr += PAGE_SIZE;
}
return true;
}

View File

@ -78,7 +78,26 @@ typedef struct { unsigned long pgprot; } pgprot_t;
struct page *dmw_virt_to_page(unsigned long kaddr);
struct page *tlb_virt_to_page(unsigned long kaddr);
#define virt_to_pfn(kaddr) PFN_DOWN(PHYSADDR(kaddr))
#define pfn_to_phys(pfn) __pfn_to_phys(pfn)
#define phys_to_pfn(paddr) __phys_to_pfn(paddr)
#define page_to_phys(page) pfn_to_phys(page_to_pfn(page))
#define phys_to_page(paddr) pfn_to_page(phys_to_pfn(paddr))
#ifndef CONFIG_KFENCE
#define page_to_virt(page) __va(page_to_phys(page))
#define virt_to_page(kaddr) phys_to_page(__pa(kaddr))
#else
#define WANT_PAGE_VIRTUAL
#define page_to_virt(page) \
({ \
extern char *__kfence_pool; \
(__kfence_pool == NULL) ? __va(page_to_phys(page)) : page_address(page); \
})
#define virt_to_page(kaddr) \
({ \
@ -86,6 +105,11 @@ struct page *tlb_virt_to_page(unsigned long kaddr);
dmw_virt_to_page((unsigned long)kaddr) : tlb_virt_to_page((unsigned long)kaddr);\
})
#endif
#define pfn_to_virt(pfn) page_to_virt(pfn_to_page(pfn))
#define virt_to_pfn(kaddr) page_to_pfn(virt_to_page(kaddr))
extern int __virt_addr_valid(volatile void *kaddr);
#define virt_addr_valid(kaddr) __virt_addr_valid((volatile void *)(kaddr))

View File

@ -4,6 +4,7 @@
*/
#include <linux/export.h>
#include <linux/io.h>
#include <linux/kfence.h>
#include <linux/memblock.h>
#include <linux/mm.h>
#include <linux/mman.h>
@ -111,6 +112,9 @@ int __virt_addr_valid(volatile void *kaddr)
{
unsigned long vaddr = (unsigned long)kaddr;
if (is_kfence_address((void *)kaddr))
return 1;
if ((vaddr < PAGE_OFFSET) || (vaddr >= vm_map_base))
return 0;

View File

@ -11,13 +11,13 @@
struct page *dmw_virt_to_page(unsigned long kaddr)
{
return pfn_to_page(virt_to_pfn(kaddr));
return phys_to_page(__pa(kaddr));
}
EXPORT_SYMBOL(dmw_virt_to_page);
struct page *tlb_virt_to_page(unsigned long kaddr)
{
return pfn_to_page(pte_pfn(*virt_to_kpte(kaddr)));
return phys_to_page(pfn_to_phys(pte_pfn(*virt_to_kpte(kaddr))));
}
EXPORT_SYMBOL(tlb_virt_to_page);

View File

@ -159,7 +159,7 @@ extern unsigned long exception_ip(struct pt_regs *regs);
#define exception_ip(regs) exception_ip(regs)
#define profile_pc(regs) instruction_pointer(regs)
extern asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall);
extern asmlinkage long syscall_trace_enter(struct pt_regs *regs);
extern asmlinkage void syscall_trace_leave(struct pt_regs *regs);
extern void die(const char *, struct pt_regs *) __noreturn;

View File

@ -101,6 +101,7 @@ void output_thread_info_defines(void)
OFFSET(TI_CPU, thread_info, cpu);
OFFSET(TI_PRE_COUNT, thread_info, preempt_count);
OFFSET(TI_REGS, thread_info, regs);
OFFSET(TI_SYSCALL, thread_info, syscall);
DEFINE(_THREAD_SIZE, THREAD_SIZE);
DEFINE(_THREAD_MASK, THREAD_MASK);
DEFINE(_IRQ_STACK_SIZE, IRQ_STACK_SIZE);

View File

@ -1317,16 +1317,13 @@ long arch_ptrace(struct task_struct *child, long request,
* Notification of system call entry/exit
* - triggered by current->work.syscall_trace
*/
asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall)
asmlinkage long syscall_trace_enter(struct pt_regs *regs)
{
user_exit();
current_thread_info()->syscall = syscall;
if (test_thread_flag(TIF_SYSCALL_TRACE)) {
if (ptrace_report_syscall_entry(regs))
return -1;
syscall = current_thread_info()->syscall;
}
#ifdef CONFIG_SECCOMP
@ -1335,7 +1332,7 @@ asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall)
struct seccomp_data sd;
unsigned long args[6];
sd.nr = syscall;
sd.nr = current_thread_info()->syscall;
sd.arch = syscall_get_arch(current);
syscall_get_arguments(current, regs, args);
for (i = 0; i < 6; i++)
@ -1345,23 +1342,23 @@ asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall)
ret = __secure_computing(&sd);
if (ret == -1)
return ret;
syscall = current_thread_info()->syscall;
}
#endif
if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
trace_sys_enter(regs, regs->regs[2]);
audit_syscall_entry(syscall, regs->regs[4], regs->regs[5],
audit_syscall_entry(current_thread_info()->syscall,
regs->regs[4], regs->regs[5],
regs->regs[6], regs->regs[7]);
/*
* Negative syscall numbers are mistaken for rejected syscalls, but
* won't have had the return value set appropriately, so we do so now.
*/
if (syscall < 0)
if (current_thread_info()->syscall < 0)
syscall_set_return_value(current, regs, -ENOSYS, 0);
return syscall;
return current_thread_info()->syscall;
}
/*

View File

@ -77,6 +77,18 @@ loads_done:
PTR_WD load_a7, bad_stack_a7
.previous
/*
* syscall number is in v0 unless we called syscall(__NR_###)
* where the real syscall number is in a0
*/
subu t2, v0, __NR_O32_Linux
bnez t2, 1f /* __NR_syscall at offset 0 */
LONG_S a0, TI_SYSCALL($28) # Save a0 as syscall number
b 2f
1:
LONG_S v0, TI_SYSCALL($28) # Save v0 as syscall number
2:
lw t0, TI_FLAGS($28) # syscall tracing enabled?
li t1, _TIF_WORK_SYSCALL_ENTRY
and t0, t1
@ -114,16 +126,7 @@ syscall_trace_entry:
SAVE_STATIC
move a0, sp
/*
* syscall number is in v0 unless we called syscall(__NR_###)
* where the real syscall number is in a0
*/
move a1, v0
subu t2, v0, __NR_O32_Linux
bnez t2, 1f /* __NR_syscall at offset 0 */
lw a1, PT_R4(sp)
1: jal syscall_trace_enter
jal syscall_trace_enter
bltz v0, 1f # seccomp failed? Skip syscall

View File

@ -44,6 +44,8 @@ NESTED(handle_sysn32, PT_SIZE, sp)
sd a3, PT_R26(sp) # save a3 for syscall restarting
LONG_S v0, TI_SYSCALL($28) # Store syscall number
li t1, _TIF_WORK_SYSCALL_ENTRY
LONG_L t0, TI_FLAGS($28) # syscall tracing enabled?
and t0, t1, t0
@ -72,7 +74,6 @@ syscall_common:
n32_syscall_trace_entry:
SAVE_STATIC
move a0, sp
move a1, v0
jal syscall_trace_enter
bltz v0, 1f # seccomp failed? Skip syscall

View File

@ -46,6 +46,8 @@ NESTED(handle_sys64, PT_SIZE, sp)
sd a3, PT_R26(sp) # save a3 for syscall restarting
LONG_S v0, TI_SYSCALL($28) # Store syscall number
li t1, _TIF_WORK_SYSCALL_ENTRY
LONG_L t0, TI_FLAGS($28) # syscall tracing enabled?
and t0, t1, t0
@ -82,7 +84,6 @@ n64_syscall_exit:
syscall_trace_entry:
SAVE_STATIC
move a0, sp
move a1, v0
jal syscall_trace_enter
bltz v0, 1f # seccomp failed? Skip syscall

View File

@ -79,6 +79,22 @@ loads_done:
PTR_WD load_a7, bad_stack_a7
.previous
/*
* absolute syscall number is in v0 unless we called syscall(__NR_###)
* where the real syscall number is in a0
* note: NR_syscall is the first O32 syscall but the macro is
* only defined when compiling with -mabi=32 (CONFIG_32BIT)
* therefore __NR_O32_Linux is used (4000)
*/
subu t2, v0, __NR_O32_Linux
bnez t2, 1f /* __NR_syscall at offset 0 */
LONG_S a0, TI_SYSCALL($28) # Save a0 as syscall number
b 2f
1:
LONG_S v0, TI_SYSCALL($28) # Save v0 as syscall number
2:
li t1, _TIF_WORK_SYSCALL_ENTRY
LONG_L t0, TI_FLAGS($28) # syscall tracing enabled?
and t0, t1, t0
@ -113,22 +129,7 @@ trace_a_syscall:
sd a7, PT_R11(sp) # For indirect syscalls
move a0, sp
/*
* absolute syscall number is in v0 unless we called syscall(__NR_###)
* where the real syscall number is in a0
* note: NR_syscall is the first O32 syscall but the macro is
* only defined when compiling with -mabi=32 (CONFIG_32BIT)
* therefore __NR_O32_Linux is used (4000)
*/
.set push
.set reorder
subu t1, v0, __NR_O32_Linux
move a1, v0
bnez t1, 1f /* __NR_syscall at offset 0 */
ld a1, PT_R4(sp) /* Arg1 for __NR_syscall case */
.set pop
1: jal syscall_trace_enter
jal syscall_trace_enter
bltz v0, 1f # seccomp failed? Skip syscall

View File

@ -197,6 +197,9 @@ static struct skcipher_alg algs[] = {
static int __init chacha_p10_init(void)
{
if (!cpu_has_feature(CPU_FTR_ARCH_31))
return 0;
static_branch_enable(&have_p10);
return crypto_register_skciphers(algs, ARRAY_SIZE(algs));
@ -204,10 +207,13 @@ static int __init chacha_p10_init(void)
static void __exit chacha_p10_exit(void)
{
if (!static_branch_likely(&have_p10))
return;
crypto_unregister_skciphers(algs, ARRAY_SIZE(algs));
}
module_cpu_feature_match(PPC_MODULE_FEATURE_P10, chacha_p10_init);
module_init(chacha_p10_init);
module_exit(chacha_p10_exit);
MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (P10 accelerated)");

View File

@ -1285,15 +1285,14 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
struct device *dev)
{
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_group *grp = iommu_group_get(dev);
struct iommu_table_group *table_group;
struct iommu_group *grp;
/* At first attach the ownership is already set */
if (!domain) {
iommu_group_put(grp);
if (!domain)
return 0;
}
grp = iommu_group_get(dev);
table_group = iommu_group_get_iommudata(grp);
/*
* The domain being set to PLATFORM from earlier

View File

@ -340,7 +340,8 @@ SYM_CODE_START(pgm_check_handler)
mvc __PT_LAST_BREAK(8,%r11),__LC_PGM_LAST_BREAK
stctg %c1,%c1,__PT_CR1(%r11)
#if IS_ENABLED(CONFIG_KVM)
lg %r12,__LC_GMAP
ltg %r12,__LC_GMAP
jz 5f
clc __GMAP_ASCE(8,%r12), __PT_CR1(%r11)
jne 5f
BPENTER __SF_SIE_FLAGS(%r10),_TIF_ISOLATE_BP_GUEST

View File

@ -2633,6 +2633,16 @@ config MITIGATION_RFDS
stored in floating point, vector and integer registers.
See also <file:Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst>
config MITIGATION_SPECTRE_BHI
bool "Mitigate Spectre-BHB (Branch History Injection)"
depends on CPU_SUP_INTEL
default y
help
Enable BHI mitigations. BHI attacks are a form of Spectre V2 attacks
where the branch history buffer is poisoned to speculatively steer
indirect branches.
See <file:Documentation/admin-guide/hw-vuln/spectre.rst>
endif
config ARCH_HAS_ADD_PAGES

View File

@ -49,7 +49,7 @@ static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
if (likely(unr < NR_syscalls)) {
unr = array_index_nospec(unr, NR_syscalls);
regs->ax = sys_call_table[unr](regs);
regs->ax = x64_sys_call(regs, unr);
return true;
}
return false;
@ -66,7 +66,7 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
if (IS_ENABLED(CONFIG_X86_X32_ABI) && likely(xnr < X32_NR_syscalls)) {
xnr = array_index_nospec(xnr, X32_NR_syscalls);
regs->ax = x32_sys_call_table[xnr](regs);
regs->ax = x32_sys_call(regs, xnr);
return true;
}
return false;
@ -162,7 +162,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr)
if (likely(unr < IA32_NR_syscalls)) {
unr = array_index_nospec(unr, IA32_NR_syscalls);
regs->ax = ia32_sys_call_table[unr](regs);
regs->ax = ia32_sys_call(regs, unr);
} else if (nr != -1) {
regs->ax = __ia32_sys_ni_syscall(regs);
}
@ -189,7 +189,7 @@ static __always_inline bool int80_is_external(void)
}
/**
* int80_emulation - 32-bit legacy syscall entry
* do_int80_emulation - 32-bit legacy syscall C entry from asm
*
* This entry point can be used by 32-bit and 64-bit programs to perform
* 32-bit system calls. Instances of INT $0x80 can be found inline in
@ -207,7 +207,7 @@ static __always_inline bool int80_is_external(void)
* eax: system call number
* ebx, ecx, edx, esi, edi, ebp: arg1 - arg 6
*/
DEFINE_IDTENTRY_RAW(int80_emulation)
__visible noinstr void do_int80_emulation(struct pt_regs *regs)
{
int nr;
@ -255,6 +255,71 @@ DEFINE_IDTENTRY_RAW(int80_emulation)
instrumentation_end();
syscall_exit_to_user_mode(regs);
}
#ifdef CONFIG_X86_FRED
/*
* A FRED-specific INT80 handler is warranted for the follwing reasons:
*
* 1) As INT instructions and hardware interrupts are separate event
* types, FRED does not preclude the use of vector 0x80 for external
* interrupts. As a result, the FRED setup code does not reserve
* vector 0x80 and calling int80_is_external() is not merely
* suboptimal but actively incorrect: it could cause a system call
* to be incorrectly ignored.
*
* 2) It is called only for handling vector 0x80 of event type
* EVENT_TYPE_SWINT and will never be called to handle any external
* interrupt (event type EVENT_TYPE_EXTINT).
*
* 3) FRED has separate entry flows depending on if the event came from
* user space or kernel space, and because the kernel does not use
* INT insns, the FRED kernel entry handler fred_entry_from_kernel()
* falls through to fred_bad_type() if the event type is
* EVENT_TYPE_SWINT, i.e., INT insns. So if the kernel is handling
* an INT insn, it can only be from a user level.
*
* 4) int80_emulation() does a CLEAR_BRANCH_HISTORY. While FRED will
* likely take a different approach if it is ever needed: it
* probably belongs in either fred_intx()/ fred_other() or
* asm_fred_entrypoint_user(), depending on if this ought to be done
* for all entries from userspace or only system
* calls.
*
* 5) INT $0x80 is the fast path for 32-bit system calls under FRED.
*/
DEFINE_FREDENTRY_RAW(int80_emulation)
{
int nr;
enter_from_user_mode(regs);
instrumentation_begin();
add_random_kstack_offset();
/*
* FRED pushed 0 into regs::orig_ax and regs::ax contains the
* syscall number.
*
* User tracing code (ptrace or signal handlers) might assume
* that the regs::orig_ax contains a 32-bit number on invoking
* a 32-bit syscall.
*
* Establish the syscall convention by saving the 32bit truncated
* syscall number in regs::orig_ax and by invalidating regs::ax.
*/
regs->orig_ax = regs->ax & GENMASK(31, 0);
regs->ax = -ENOSYS;
nr = syscall_32_enter(regs);
local_irq_enable();
nr = syscall_enter_from_user_mode_work(regs, nr);
do_syscall_32_irqs_on(regs, nr);
instrumentation_end();
syscall_exit_to_user_mode(regs);
}
#endif
#else /* CONFIG_IA32_EMULATION */
/* Handles int $0x80 on a 32bit kernel */

View File

@ -116,6 +116,7 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
/* clobbers %rax, make sure it is after saving the syscall nr */
IBRS_ENTER
UNTRAIN_RET
CLEAR_BRANCH_HISTORY
call do_syscall_64 /* returns with IRQs disabled */
@ -1491,3 +1492,63 @@ SYM_CODE_START_NOALIGN(rewind_stack_and_make_dead)
call make_task_dead
SYM_CODE_END(rewind_stack_and_make_dead)
.popsection
/*
* This sequence executes branches in order to remove user branch information
* from the branch history tracker in the Branch Predictor, therefore removing
* user influence on subsequent BTB lookups.
*
* It should be used on parts prior to Alder Lake. Newer parts should use the
* BHI_DIS_S hardware control instead. If a pre-Alder Lake part is being
* virtualized on newer hardware the VMM should protect against BHI attacks by
* setting BHI_DIS_S for the guests.
*
* CALLs/RETs are necessary to prevent Loop Stream Detector(LSD) from engaging
* and not clearing the branch history. The call tree looks like:
*
* call 1
* call 2
* call 2
* call 2
* call 2
* call 2
* ret
* ret
* ret
* ret
* ret
* ret
*
* This means that the stack is non-constant and ORC can't unwind it with %rsp
* alone. Therefore we unconditionally set up the frame pointer, which allows
* ORC to unwind properly.
*
* The alignment is for performance and not for safety, and may be safely
* refactored in the future if needed.
*/
SYM_FUNC_START(clear_bhb_loop)
push %rbp
mov %rsp, %rbp
movl $5, %ecx
ANNOTATE_INTRA_FUNCTION_CALL
call 1f
jmp 5f
.align 64, 0xcc
ANNOTATE_INTRA_FUNCTION_CALL
1: call 2f
RET
.align 64, 0xcc
2: movl $5, %eax
3: jmp 4f
nop
4: sub $1, %eax
jnz 3b
sub $1, %ecx
jnz 1b
RET
5: lfence
pop %rbp
RET
SYM_FUNC_END(clear_bhb_loop)
EXPORT_SYMBOL_GPL(clear_bhb_loop)
STACK_FRAME_NON_STANDARD(clear_bhb_loop)

View File

@ -92,6 +92,7 @@ SYM_INNER_LABEL(entry_SYSENTER_compat_after_hwframe, SYM_L_GLOBAL)
IBRS_ENTER
UNTRAIN_RET
CLEAR_BRANCH_HISTORY
/*
* SYSENTER doesn't filter flags, so we need to clear NT and AC
@ -206,6 +207,7 @@ SYM_INNER_LABEL(entry_SYSCALL_compat_after_hwframe, SYM_L_GLOBAL)
IBRS_ENTER
UNTRAIN_RET
CLEAR_BRANCH_HISTORY
movq %rsp, %rdi
call do_fast_syscall_32
@ -276,3 +278,17 @@ SYM_INNER_LABEL(entry_SYSRETL_compat_end, SYM_L_GLOBAL)
ANNOTATE_NOENDBR
int3
SYM_CODE_END(entry_SYSCALL_compat)
/*
* int 0x80 is used by 32 bit mode as a system call entry. Normally idt entries
* point to C routines, however since this is a system call interface the branch
* history needs to be scrubbed to protect against BHI attacks, and that
* scrubbing needs to take place in assembly code prior to entering any C
* routines.
*/
SYM_CODE_START(int80_emulation)
ANNOTATE_NOENDBR
UNWIND_HINT_FUNC
CLEAR_BRANCH_HISTORY
jmp do_int80_emulation
SYM_CODE_END(int80_emulation)

View File

@ -28,9 +28,9 @@ static noinstr void fred_bad_type(struct pt_regs *regs, unsigned long error_code
if (regs->fred_cs.sl > 0) {
pr_emerg("PANIC: invalid or fatal FRED event; event type %u "
"vector %u error 0x%lx aux 0x%lx at %04x:%016lx\n",
regs->fred_ss.type, regs->fred_ss.vector, regs->orig_ax,
regs->fred_ss.type, regs->fred_ss.vector, error_code,
fred_event_data(regs), regs->cs, regs->ip);
die("invalid or fatal FRED event", regs, regs->orig_ax);
die("invalid or fatal FRED event", regs, error_code);
panic("invalid or fatal FRED event");
} else {
unsigned long flags = oops_begin();
@ -38,10 +38,10 @@ static noinstr void fred_bad_type(struct pt_regs *regs, unsigned long error_code
pr_alert("BUG: invalid or fatal FRED event; event type %u "
"vector %u error 0x%lx aux 0x%lx at %04x:%016lx\n",
regs->fred_ss.type, regs->fred_ss.vector, regs->orig_ax,
regs->fred_ss.type, regs->fred_ss.vector, error_code,
fred_event_data(regs), regs->cs, regs->ip);
if (__die("Invalid or fatal FRED event", regs, regs->orig_ax))
if (__die("Invalid or fatal FRED event", regs, error_code))
sig = 0;
oops_end(flags, regs, sig);
@ -66,7 +66,7 @@ static noinstr void fred_intx(struct pt_regs *regs)
/* INT80 */
case IA32_SYSCALL_VECTOR:
if (ia32_enabled())
return int80_emulation(regs);
return fred_int80_emulation(regs);
fallthrough;
#endif

View File

@ -18,8 +18,25 @@
#include <asm/syscalls_32.h>
#undef __SYSCALL
/*
* The sys_call_table[] is no longer used for system calls, but
* kernel/trace/trace_syscalls.c still wants to know the system
* call address.
*/
#ifdef CONFIG_X86_32
#define __SYSCALL(nr, sym) __ia32_##sym,
__visible const sys_call_ptr_t ia32_sys_call_table[] = {
const sys_call_ptr_t sys_call_table[] = {
#include <asm/syscalls_32.h>
};
#undef __SYSCALL
#endif
#define __SYSCALL(nr, sym) case nr: return __ia32_##sym(regs);
long ia32_sys_call(const struct pt_regs *regs, unsigned int nr)
{
switch (nr) {
#include <asm/syscalls_32.h>
default: return __ia32_sys_ni_syscall(regs);
}
};

View File

@ -11,8 +11,23 @@
#include <asm/syscalls_64.h>
#undef __SYSCALL
/*
* The sys_call_table[] is no longer used for system calls, but
* kernel/trace/trace_syscalls.c still wants to know the system
* call address.
*/
#define __SYSCALL(nr, sym) __x64_##sym,
asmlinkage const sys_call_ptr_t sys_call_table[] = {
const sys_call_ptr_t sys_call_table[] = {
#include <asm/syscalls_64.h>
};
#undef __SYSCALL
#define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
long x64_sys_call(const struct pt_regs *regs, unsigned int nr)
{
switch (nr) {
#include <asm/syscalls_64.h>
default: return __x64_sys_ni_syscall(regs);
}
};

View File

@ -11,8 +11,12 @@
#include <asm/syscalls_x32.h>
#undef __SYSCALL
#define __SYSCALL(nr, sym) __x64_##sym,
#define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
asmlinkage const sys_call_ptr_t x32_sys_call_table[] = {
#include <asm/syscalls_x32.h>
long x32_sys_call(const struct pt_regs *regs, unsigned int nr)
{
switch (nr) {
#include <asm/syscalls_x32.h>
default: return __x64_sys_ni_syscall(regs);
}
};

View File

@ -1644,6 +1644,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
while (++i < cpuc->n_events) {
cpuc->event_list[i-1] = cpuc->event_list[i];
cpuc->event_constraint[i-1] = cpuc->event_constraint[i];
cpuc->assign[i-1] = cpuc->assign[i];
}
cpuc->event_constraint[i-1] = NULL;
--cpuc->n_events;

View File

@ -1693,6 +1693,7 @@ void x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
lbr->from = x86_pmu.lbr_from;
lbr->to = x86_pmu.lbr_to;
lbr->info = x86_pmu.lbr_info;
lbr->has_callstack = x86_pmu_has_lbr_callstack();
}
EXPORT_SYMBOL_GPL(x86_perf_get_lbr);

View File

@ -105,7 +105,7 @@ static bool cpu_is_self(int cpu)
* IPI implementation on Hyper-V.
*/
static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
bool exclude_self)
bool exclude_self)
{
struct hv_send_ipi_ex *ipi_arg;
unsigned long flags;
@ -132,8 +132,8 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
if (!cpumask_equal(mask, cpu_present_mask) || exclude_self) {
ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K;
nr_bank = cpumask_to_vpset_skip(&(ipi_arg->vp_set), mask,
exclude_self ? cpu_is_self : NULL);
nr_bank = cpumask_to_vpset_skip(&ipi_arg->vp_set, mask,
exclude_self ? cpu_is_self : NULL);
/*
* 'nr_bank <= 0' means some CPUs in cpumask can't be
@ -147,7 +147,7 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
}
status = hv_do_rep_hypercall(HVCALL_SEND_IPI_EX, 0, nr_bank,
ipi_arg, NULL);
ipi_arg, NULL);
ipi_mask_ex_done:
local_irq_restore(flags);
@ -155,7 +155,7 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
}
static bool __send_ipi_mask(const struct cpumask *mask, int vector,
bool exclude_self)
bool exclude_self)
{
int cur_cpu, vcpu, this_cpu = smp_processor_id();
struct hv_send_ipi ipi_arg;
@ -181,7 +181,7 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector,
return false;
}
if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR))
if (vector < HV_IPI_LOW_VECTOR || vector > HV_IPI_HIGH_VECTOR)
return false;
/*
@ -218,7 +218,7 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector,
}
status = hv_do_fast_hypercall16(HVCALL_SEND_IPI, ipi_arg.vector,
ipi_arg.cpu_mask);
ipi_arg.cpu_mask);
return hv_result_success(status);
do_ex_hypercall:
@ -241,7 +241,7 @@ static bool __send_ipi_one(int cpu, int vector)
return false;
}
if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR))
if (vector < HV_IPI_LOW_VECTOR || vector > HV_IPI_HIGH_VECTOR)
return false;
if (vp >= 64)

View File

@ -3,7 +3,6 @@
#include <linux/vmalloc.h>
#include <linux/mm.h>
#include <linux/clockchips.h>
#include <linux/acpi.h>
#include <linux/hyperv.h>
#include <linux/slab.h>
#include <linux/cpuhotplug.h>
@ -116,12 +115,11 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
{
struct hv_add_logical_processor_in *input;
struct hv_add_logical_processor_out *output;
struct hv_input_add_logical_processor *input;
struct hv_output_add_logical_processor *output;
u64 status;
unsigned long flags;
int ret = HV_STATUS_SUCCESS;
int pxm = node_to_pxm(node);
/*
* When adding a logical processor, the hypervisor may return
@ -137,11 +135,7 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
input->lp_index = lp_index;
input->apic_id = apic_id;
input->flags = 0;
input->proximity_domain_info.domain_id = pxm;
input->proximity_domain_info.flags.reserved = 0;
input->proximity_domain_info.flags.proximity_info_valid = 1;
input->proximity_domain_info.flags.proximity_preferred = 1;
input->proximity_domain_info = hv_numa_node_to_pxm_info(node);
status = hv_do_hypercall(HVCALL_ADD_LOGICAL_PROCESSOR,
input, output);
local_irq_restore(flags);
@ -166,7 +160,6 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
u64 status;
unsigned long irq_flags;
int ret = HV_STATUS_SUCCESS;
int pxm = node_to_pxm(node);
/* Root VPs don't seem to need pages deposited */
if (partition_id != hv_current_partition_id) {
@ -185,14 +178,7 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
input->vp_index = vp_index;
input->flags = flags;
input->subnode_type = HvSubnodeAny;
if (node != NUMA_NO_NODE) {
input->proximity_domain_info.domain_id = pxm;
input->proximity_domain_info.flags.reserved = 0;
input->proximity_domain_info.flags.proximity_info_valid = 1;
input->proximity_domain_info.flags.proximity_preferred = 1;
} else {
input->proximity_domain_info.as_uint64 = 0;
}
input->proximity_domain_info = hv_numa_node_to_pxm_info(node);
status = hv_do_hypercall(HVCALL_CREATE_VP, input, NULL);
local_irq_restore(irq_flags);

View File

@ -13,6 +13,7 @@
#include <asm/mpspec.h>
#include <asm/msr.h>
#include <asm/hardirq.h>
#include <asm/io.h>
#define ARCH_APICTIMER_STOPS_ON_C3 1
@ -98,7 +99,7 @@ static inline void native_apic_mem_write(u32 reg, u32 v)
static inline u32 native_apic_mem_read(u32 reg)
{
return *((volatile u32 *)(APIC_BASE + reg));
return readl((void __iomem *)(APIC_BASE + reg));
}
static inline void native_apic_mem_eoi(void)

View File

@ -79,6 +79,9 @@ do { \
#define __smp_mb__before_atomic() do { } while (0)
#define __smp_mb__after_atomic() do { } while (0)
/* Writing to CR3 provides a full memory barrier in switch_mm(). */
#define smp_mb__after_switch_mm() do { } while (0)
#include <asm-generic/barrier.h>
#endif /* _ASM_X86_BARRIER_H */

View File

@ -461,11 +461,15 @@
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
* CPUID levels like 0x80000022, etc.
* CPUID levels like 0x80000022, etc and Linux defined features.
*
* Reuse free bits when adding new feature flags!
*/
#define X86_FEATURE_AMD_LBR_PMC_FREEZE (21*32+ 0) /* AMD LBR and PMC Freeze */
#define X86_FEATURE_CLEAR_BHB_LOOP (21*32+ 1) /* "" Clear branch history at syscall entry using SW loop */
#define X86_FEATURE_BHI_CTRL (21*32+ 2) /* "" BHI_DIS_S HW control available */
#define X86_FEATURE_CLEAR_BHB_HW (21*32+ 3) /* "" BHI_DIS_S HW control enabled */
#define X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT (21*32+ 4) /* "" Clear branch history at vmexit using SW loop */
/*
* BUG word(s)
@ -515,4 +519,5 @@
#define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */
#define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* AMD DIV0 speculation bug */
#define X86_BUG_RFDS X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */
#define X86_BUG_BHI X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */
#endif /* _ASM_X86_CPUFEATURES_H */

View File

@ -855,6 +855,7 @@ struct kvm_vcpu_arch {
int cpuid_nent;
struct kvm_cpuid_entry2 *cpuid_entries;
struct kvm_hypervisor_cpuid kvm_cpuid;
bool is_amd_compatible;
/*
* FIXME: Drop this macro and use KVM_NR_GOVERNED_FEATURES directly

View File

@ -61,10 +61,13 @@
#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */
#define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable Branch History Injection behavior */
#define SPEC_CTRL_BHI_DIS_S BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
/* A mask for bits which the kernel toggles when controlling mitigations */
#define SPEC_CTRL_MITIGATIONS_MASK (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \
| SPEC_CTRL_RRSBA_DIS_S)
| SPEC_CTRL_RRSBA_DIS_S \
| SPEC_CTRL_BHI_DIS_S)
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
#define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */
@ -163,6 +166,10 @@
* are restricted to targets in
* kernel.
*/
#define ARCH_CAP_BHI_NO BIT(20) /*
* CPU is not affected by Branch
* History Injection.
*/
#define ARCH_CAP_PBRSB_NO BIT(24) /*
* Not susceptible to Post-Barrier
* Return Stack Buffer Predictions.

View File

@ -326,6 +326,19 @@
ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
.endm
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_LOOP
.endm
.macro CLEAR_BRANCH_HISTORY_VMEXIT
ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT
.endm
#else
#define CLEAR_BRANCH_HISTORY
#define CLEAR_BRANCH_HISTORY_VMEXIT
#endif
#else /* __ASSEMBLY__ */
#define ANNOTATE_RETPOLINE_SAFE \
@ -368,6 +381,10 @@ extern void srso_alias_return_thunk(void);
extern void entry_untrain_ret(void);
extern void entry_ibpb(void);
#ifdef CONFIG_X86_64
extern void clear_bhb_loop(void);
#endif
extern void (*x86_return_thunk)(void);
extern void __warn_thunk(void);

View File

@ -555,6 +555,7 @@ struct x86_pmu_lbr {
unsigned int from;
unsigned int to;
unsigned int info;
bool has_callstack;
};
extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);

View File

@ -16,19 +16,17 @@
#include <asm/thread_info.h> /* for TS_COMPAT */
#include <asm/unistd.h>
/* This is used purely for kernel/trace/trace_syscalls.c */
typedef long (*sys_call_ptr_t)(const struct pt_regs *);
extern const sys_call_ptr_t sys_call_table[];
#if defined(CONFIG_X86_32)
#define ia32_sys_call_table sys_call_table
#else
/*
* These may not exist, but still put the prototypes in so we
* can use IS_ENABLED().
*/
extern const sys_call_ptr_t ia32_sys_call_table[];
extern const sys_call_ptr_t x32_sys_call_table[];
#endif
extern long ia32_sys_call(const struct pt_regs *, unsigned int nr);
extern long x32_sys_call(const struct pt_regs *, unsigned int nr);
extern long x64_sys_call(const struct pt_regs *, unsigned int nr);
/*
* Only the low 32 bits of orig_ax are meaningful, so we return int.
@ -127,6 +125,7 @@ static inline int syscall_get_arch(struct task_struct *task)
}
bool do_syscall_64(struct pt_regs *regs, int nr);
void do_int80_emulation(struct pt_regs *regs);
#endif /* CONFIG_X86_32 */

View File

@ -1687,11 +1687,11 @@ static int x2apic_state;
static bool x2apic_hw_locked(void)
{
u64 ia32_cap;
u64 x86_arch_cap_msr;
u64 msr;
ia32_cap = x86_read_arch_cap_msr();
if (ia32_cap & ARCH_CAP_XAPIC_DISABLE) {
x86_arch_cap_msr = x86_read_arch_cap_msr();
if (x86_arch_cap_msr & ARCH_CAP_XAPIC_DISABLE) {
rdmsrl(MSR_IA32_XAPIC_DISABLE_STATUS, msr);
return (msr & LEGACY_XAPIC_DISABLED);
}

View File

@ -535,7 +535,6 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
static void early_init_amd(struct cpuinfo_x86 *c)
{
u64 value;
u32 dummy;
if (c->x86 >= 0xf)
@ -603,20 +602,6 @@ static void early_init_amd(struct cpuinfo_x86 *c)
early_detect_mem_encrypt(c);
/* Re-enable TopologyExtensions if switched off by BIOS */
if (c->x86 == 0x15 &&
(c->x86_model >= 0x10 && c->x86_model <= 0x6f) &&
!cpu_has(c, X86_FEATURE_TOPOEXT)) {
if (msr_set_bit(0xc0011005, 54) > 0) {
rdmsrl(0xc0011005, value);
if (value & BIT_64(54)) {
set_cpu_cap(c, X86_FEATURE_TOPOEXT);
pr_info_once(FW_INFO "CPU: Re-enabling disabled Topology Extensions Support.\n");
}
}
}
if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !cpu_has(c, X86_FEATURE_IBPB_BRTYPE)) {
if (c->x86 == 0x17 && boot_cpu_has(X86_FEATURE_AMD_IBPB))
setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);

View File

@ -61,6 +61,8 @@ EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
EXPORT_SYMBOL_GPL(x86_pred_cmd);
static u64 __ro_after_init x86_arch_cap_msr;
static DEFINE_MUTEX(spec_ctrl_mutex);
void (*x86_return_thunk)(void) __ro_after_init = __x86_return_thunk;
@ -144,6 +146,8 @@ void __init cpu_select_mitigations(void)
x86_spec_ctrl_base &= ~SPEC_CTRL_MITIGATIONS_MASK;
}
x86_arch_cap_msr = x86_read_arch_cap_msr();
/* Select the proper CPU mitigations before patching alternatives: */
spectre_v1_select_mitigation();
spectre_v2_select_mitigation();
@ -301,8 +305,6 @@ static const char * const taa_strings[] = {
static void __init taa_select_mitigation(void)
{
u64 ia32_cap;
if (!boot_cpu_has_bug(X86_BUG_TAA)) {
taa_mitigation = TAA_MITIGATION_OFF;
return;
@ -341,9 +343,8 @@ static void __init taa_select_mitigation(void)
* On MDS_NO=1 CPUs if ARCH_CAP_TSX_CTRL_MSR is not set, microcode
* update is required.
*/
ia32_cap = x86_read_arch_cap_msr();
if ( (ia32_cap & ARCH_CAP_MDS_NO) &&
!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR))
if ( (x86_arch_cap_msr & ARCH_CAP_MDS_NO) &&
!(x86_arch_cap_msr & ARCH_CAP_TSX_CTRL_MSR))
taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
/*
@ -401,8 +402,6 @@ static const char * const mmio_strings[] = {
static void __init mmio_select_mitigation(void)
{
u64 ia32_cap;
if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) ||
boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN) ||
cpu_mitigations_off()) {
@ -413,8 +412,6 @@ static void __init mmio_select_mitigation(void)
if (mmio_mitigation == MMIO_MITIGATION_OFF)
return;
ia32_cap = x86_read_arch_cap_msr();
/*
* Enable CPU buffer clear mitigation for host and VMM, if also affected
* by MDS or TAA. Otherwise, enable mitigation for VMM only.
@ -437,7 +434,7 @@ static void __init mmio_select_mitigation(void)
* be propagated to uncore buffers, clearing the Fill buffers on idle
* is required irrespective of SMT state.
*/
if (!(ia32_cap & ARCH_CAP_FBSDP_NO))
if (!(x86_arch_cap_msr & ARCH_CAP_FBSDP_NO))
static_branch_enable(&mds_idle_clear);
/*
@ -447,10 +444,10 @@ static void __init mmio_select_mitigation(void)
* FB_CLEAR or by the presence of both MD_CLEAR and L1D_FLUSH on MDS
* affected systems.
*/
if ((ia32_cap & ARCH_CAP_FB_CLEAR) ||
if ((x86_arch_cap_msr & ARCH_CAP_FB_CLEAR) ||
(boot_cpu_has(X86_FEATURE_MD_CLEAR) &&
boot_cpu_has(X86_FEATURE_FLUSH_L1D) &&
!(ia32_cap & ARCH_CAP_MDS_NO)))
!(x86_arch_cap_msr & ARCH_CAP_MDS_NO)))
mmio_mitigation = MMIO_MITIGATION_VERW;
else
mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED;
@ -508,7 +505,7 @@ static void __init rfds_select_mitigation(void)
if (rfds_mitigation == RFDS_MITIGATION_OFF)
return;
if (x86_read_arch_cap_msr() & ARCH_CAP_RFDS_CLEAR)
if (x86_arch_cap_msr & ARCH_CAP_RFDS_CLEAR)
setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
else
rfds_mitigation = RFDS_MITIGATION_UCODE_NEEDED;
@ -659,8 +656,6 @@ void update_srbds_msr(void)
static void __init srbds_select_mitigation(void)
{
u64 ia32_cap;
if (!boot_cpu_has_bug(X86_BUG_SRBDS))
return;
@ -669,8 +664,7 @@ static void __init srbds_select_mitigation(void)
* are only exposed to SRBDS when TSX is enabled or when CPU is affected
* by Processor MMIO Stale Data vulnerability.
*/
ia32_cap = x86_read_arch_cap_msr();
if ((ia32_cap & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM) &&
if ((x86_arch_cap_msr & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM) &&
!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
srbds_mitigation = SRBDS_MITIGATION_TSX_OFF;
else if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
@ -813,7 +807,7 @@ static void __init gds_select_mitigation(void)
/* Will verify below that mitigation _can_ be disabled */
/* No microcode */
if (!(x86_read_arch_cap_msr() & ARCH_CAP_GDS_CTRL)) {
if (!(x86_arch_cap_msr & ARCH_CAP_GDS_CTRL)) {
if (gds_mitigation == GDS_MITIGATION_FORCE) {
/*
* This only needs to be done on the boot CPU so do it
@ -1544,20 +1538,25 @@ static enum spectre_v2_mitigation __init spectre_v2_select_retpoline(void)
return SPECTRE_V2_RETPOLINE;
}
static bool __ro_after_init rrsba_disabled;
/* Disable in-kernel use of non-RSB RET predictors */
static void __init spec_ctrl_disable_kernel_rrsba(void)
{
u64 ia32_cap;
if (rrsba_disabled)
return;
if (!(x86_arch_cap_msr & ARCH_CAP_RRSBA)) {
rrsba_disabled = true;
return;
}
if (!boot_cpu_has(X86_FEATURE_RRSBA_CTRL))
return;
ia32_cap = x86_read_arch_cap_msr();
if (ia32_cap & ARCH_CAP_RRSBA) {
x86_spec_ctrl_base |= SPEC_CTRL_RRSBA_DIS_S;
update_spec_ctrl(x86_spec_ctrl_base);
}
x86_spec_ctrl_base |= SPEC_CTRL_RRSBA_DIS_S;
update_spec_ctrl(x86_spec_ctrl_base);
rrsba_disabled = true;
}
static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
@ -1607,6 +1606,74 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
dump_stack();
}
/*
* Set BHI_DIS_S to prevent indirect branches in kernel to be influenced by
* branch history in userspace. Not needed if BHI_NO is set.
*/
static bool __init spec_ctrl_bhi_dis(void)
{
if (!boot_cpu_has(X86_FEATURE_BHI_CTRL))
return false;
x86_spec_ctrl_base |= SPEC_CTRL_BHI_DIS_S;
update_spec_ctrl(x86_spec_ctrl_base);
setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_HW);
return true;
}
enum bhi_mitigations {
BHI_MITIGATION_OFF,
BHI_MITIGATION_ON,
};
static enum bhi_mitigations bhi_mitigation __ro_after_init =
IS_ENABLED(CONFIG_MITIGATION_SPECTRE_BHI) ? BHI_MITIGATION_ON : BHI_MITIGATION_OFF;
static int __init spectre_bhi_parse_cmdline(char *str)
{
if (!str)
return -EINVAL;
if (!strcmp(str, "off"))
bhi_mitigation = BHI_MITIGATION_OFF;
else if (!strcmp(str, "on"))
bhi_mitigation = BHI_MITIGATION_ON;
else
pr_err("Ignoring unknown spectre_bhi option (%s)", str);
return 0;
}
early_param("spectre_bhi", spectre_bhi_parse_cmdline);
static void __init bhi_select_mitigation(void)
{
if (bhi_mitigation == BHI_MITIGATION_OFF)
return;
/* Retpoline mitigates against BHI unless the CPU has RRSBA behavior */
if (boot_cpu_has(X86_FEATURE_RETPOLINE) &&
!boot_cpu_has(X86_FEATURE_RETPOLINE_LFENCE)) {
spec_ctrl_disable_kernel_rrsba();
if (rrsba_disabled)
return;
}
if (spec_ctrl_bhi_dis())
return;
if (!IS_ENABLED(CONFIG_X86_64))
return;
/* Mitigate KVM by default */
setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT);
pr_info("Spectre BHI mitigation: SW BHB clearing on vm exit\n");
/* Mitigate syscalls when the mitigation is forced =on */
setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_LOOP);
pr_info("Spectre BHI mitigation: SW BHB clearing on syscall\n");
}
static void __init spectre_v2_select_mitigation(void)
{
enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@ -1718,6 +1785,9 @@ static void __init spectre_v2_select_mitigation(void)
mode == SPECTRE_V2_RETPOLINE)
spec_ctrl_disable_kernel_rrsba();
if (boot_cpu_has(X86_BUG_BHI))
bhi_select_mitigation();
spectre_v2_enabled = mode;
pr_info("%s\n", spectre_v2_strings[mode]);
@ -1832,8 +1902,6 @@ static void update_indir_branch_cond(void)
/* Update the static key controlling the MDS CPU buffer clear in idle */
static void update_mds_branch_idle(void)
{
u64 ia32_cap = x86_read_arch_cap_msr();
/*
* Enable the idle clearing if SMT is active on CPUs which are
* affected only by MSBDS and not any other MDS variant.
@ -1848,7 +1916,7 @@ static void update_mds_branch_idle(void)
if (sched_smt_active()) {
static_branch_enable(&mds_idle_clear);
} else if (mmio_mitigation == MMIO_MITIGATION_OFF ||
(ia32_cap & ARCH_CAP_FBSDP_NO)) {
(x86_arch_cap_msr & ARCH_CAP_FBSDP_NO)) {
static_branch_disable(&mds_idle_clear);
}
}
@ -2695,15 +2763,15 @@ static char *stibp_state(void)
switch (spectre_v2_user_stibp) {
case SPECTRE_V2_USER_NONE:
return ", STIBP: disabled";
return "; STIBP: disabled";
case SPECTRE_V2_USER_STRICT:
return ", STIBP: forced";
return "; STIBP: forced";
case SPECTRE_V2_USER_STRICT_PREFERRED:
return ", STIBP: always-on";
return "; STIBP: always-on";
case SPECTRE_V2_USER_PRCTL:
case SPECTRE_V2_USER_SECCOMP:
if (static_key_enabled(&switch_to_cond_stibp))
return ", STIBP: conditional";
return "; STIBP: conditional";
}
return "";
}
@ -2712,10 +2780,10 @@ static char *ibpb_state(void)
{
if (boot_cpu_has(X86_FEATURE_IBPB)) {
if (static_key_enabled(&switch_mm_always_ibpb))
return ", IBPB: always-on";
return "; IBPB: always-on";
if (static_key_enabled(&switch_mm_cond_ibpb))
return ", IBPB: conditional";
return ", IBPB: disabled";
return "; IBPB: conditional";
return "; IBPB: disabled";
}
return "";
}
@ -2725,14 +2793,32 @@ static char *pbrsb_eibrs_state(void)
if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
if (boot_cpu_has(X86_FEATURE_RSB_VMEXIT_LITE) ||
boot_cpu_has(X86_FEATURE_RSB_VMEXIT))
return ", PBRSB-eIBRS: SW sequence";
return "; PBRSB-eIBRS: SW sequence";
else
return ", PBRSB-eIBRS: Vulnerable";
return "; PBRSB-eIBRS: Vulnerable";
} else {
return ", PBRSB-eIBRS: Not affected";
return "; PBRSB-eIBRS: Not affected";
}
}
static const char *spectre_bhi_state(void)
{
if (!boot_cpu_has_bug(X86_BUG_BHI))
return "; BHI: Not affected";
else if (boot_cpu_has(X86_FEATURE_CLEAR_BHB_HW))
return "; BHI: BHI_DIS_S";
else if (boot_cpu_has(X86_FEATURE_CLEAR_BHB_LOOP))
return "; BHI: SW loop, KVM: SW loop";
else if (boot_cpu_has(X86_FEATURE_RETPOLINE) &&
!boot_cpu_has(X86_FEATURE_RETPOLINE_LFENCE) &&
rrsba_disabled)
return "; BHI: Retpoline";
else if (boot_cpu_has(X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT))
return "; BHI: Vulnerable, KVM: SW loop";
return "; BHI: Vulnerable";
}
static ssize_t spectre_v2_show_state(char *buf)
{
if (spectre_v2_enabled == SPECTRE_V2_LFENCE)
@ -2745,13 +2831,15 @@ static ssize_t spectre_v2_show_state(char *buf)
spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
return sysfs_emit(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
return sysfs_emit(buf, "%s%s%s%s%s%s%s\n",
return sysfs_emit(buf, "%s%s%s%s%s%s%s%s\n",
spectre_v2_strings[spectre_v2_enabled],
ibpb_state(),
boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? "; IBRS_FW" : "",
stibp_state(),
boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? "; RSB filling" : "",
pbrsb_eibrs_state(),
spectre_bhi_state(),
/* this should always be at the end */
spectre_v2_module_string());
}

View File

@ -1120,6 +1120,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#define NO_SPECTRE_V2 BIT(8)
#define NO_MMIO BIT(9)
#define NO_EIBRS_PBRSB BIT(10)
#define NO_BHI BIT(11)
#define VULNWL(vendor, family, model, whitelist) \
X86_MATCH_VENDOR_FAM_MODEL(vendor, family, model, whitelist)
@ -1182,18 +1183,18 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
VULNWL_INTEL(ATOM_TREMONT_D, NO_ITLB_MULTIHIT | NO_EIBRS_PBRSB),
/* AMD Family 0xf - 0x12 */
VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB),
VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB),
VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
/* Zhaoxin Family 7 */
VULNWL(CENTAUR, 7, X86_MODEL_ANY, NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO),
VULNWL(ZHAOXIN, 7, X86_MODEL_ANY, NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO),
VULNWL(CENTAUR, 7, X86_MODEL_ANY, NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO | NO_BHI),
VULNWL(ZHAOXIN, 7, X86_MODEL_ANY, NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO | NO_BHI),
{}
};
@ -1283,25 +1284,25 @@ static bool __init cpu_matches(const struct x86_cpu_id *table, unsigned long whi
u64 x86_read_arch_cap_msr(void)
{
u64 ia32_cap = 0;
u64 x86_arch_cap_msr = 0;
if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, x86_arch_cap_msr);
return ia32_cap;
return x86_arch_cap_msr;
}
static bool arch_cap_mmio_immune(u64 ia32_cap)
static bool arch_cap_mmio_immune(u64 x86_arch_cap_msr)
{
return (ia32_cap & ARCH_CAP_FBSDP_NO &&
ia32_cap & ARCH_CAP_PSDP_NO &&
ia32_cap & ARCH_CAP_SBDR_SSDP_NO);
return (x86_arch_cap_msr & ARCH_CAP_FBSDP_NO &&
x86_arch_cap_msr & ARCH_CAP_PSDP_NO &&
x86_arch_cap_msr & ARCH_CAP_SBDR_SSDP_NO);
}
static bool __init vulnerable_to_rfds(u64 ia32_cap)
static bool __init vulnerable_to_rfds(u64 x86_arch_cap_msr)
{
/* The "immunity" bit trumps everything else: */
if (ia32_cap & ARCH_CAP_RFDS_NO)
if (x86_arch_cap_msr & ARCH_CAP_RFDS_NO)
return false;
/*
@ -1309,7 +1310,7 @@ static bool __init vulnerable_to_rfds(u64 ia32_cap)
* indicate that mitigation is needed because guest is running on a
* vulnerable hardware or may migrate to such hardware:
*/
if (ia32_cap & ARCH_CAP_RFDS_CLEAR)
if (x86_arch_cap_msr & ARCH_CAP_RFDS_CLEAR)
return true;
/* Only consult the blacklist when there is no enumeration: */
@ -1318,11 +1319,11 @@ static bool __init vulnerable_to_rfds(u64 ia32_cap)
static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
{
u64 ia32_cap = x86_read_arch_cap_msr();
u64 x86_arch_cap_msr = x86_read_arch_cap_msr();
/* Set ITLB_MULTIHIT bug if cpu is not in the whitelist and not mitigated */
if (!cpu_matches(cpu_vuln_whitelist, NO_ITLB_MULTIHIT) &&
!(ia32_cap & ARCH_CAP_PSCHANGE_MC_NO))
!(x86_arch_cap_msr & ARCH_CAP_PSCHANGE_MC_NO))
setup_force_cpu_bug(X86_BUG_ITLB_MULTIHIT);
if (cpu_matches(cpu_vuln_whitelist, NO_SPECULATION))
@ -1334,7 +1335,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
if (!cpu_matches(cpu_vuln_whitelist, NO_SSB) &&
!(ia32_cap & ARCH_CAP_SSB_NO) &&
!(x86_arch_cap_msr & ARCH_CAP_SSB_NO) &&
!cpu_has(c, X86_FEATURE_AMD_SSB_NO))
setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
@ -1345,17 +1346,17 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
* Don't use AutoIBRS when SNP is enabled because it degrades host
* userspace indirect branch performance.
*/
if ((ia32_cap & ARCH_CAP_IBRS_ALL) ||
if ((x86_arch_cap_msr & ARCH_CAP_IBRS_ALL) ||
(cpu_has(c, X86_FEATURE_AUTOIBRS) &&
!cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
!(ia32_cap & ARCH_CAP_PBRSB_NO))
!(x86_arch_cap_msr & ARCH_CAP_PBRSB_NO))
setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
}
if (!cpu_matches(cpu_vuln_whitelist, NO_MDS) &&
!(ia32_cap & ARCH_CAP_MDS_NO)) {
!(x86_arch_cap_msr & ARCH_CAP_MDS_NO)) {
setup_force_cpu_bug(X86_BUG_MDS);
if (cpu_matches(cpu_vuln_whitelist, MSBDS_ONLY))
setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
@ -1374,9 +1375,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
* TSX_CTRL check alone is not sufficient for cases when the microcode
* update is not present or running as guest that don't get TSX_CTRL.
*/
if (!(ia32_cap & ARCH_CAP_TAA_NO) &&
if (!(x86_arch_cap_msr & ARCH_CAP_TAA_NO) &&
(cpu_has(c, X86_FEATURE_RTM) ||
(ia32_cap & ARCH_CAP_TSX_CTRL_MSR)))
(x86_arch_cap_msr & ARCH_CAP_TSX_CTRL_MSR)))
setup_force_cpu_bug(X86_BUG_TAA);
/*
@ -1402,7 +1403,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
* Set X86_BUG_MMIO_UNKNOWN for CPUs that are neither in the blacklist,
* nor in the whitelist and also don't enumerate MSR ARCH_CAP MMIO bits.
*/
if (!arch_cap_mmio_immune(ia32_cap)) {
if (!arch_cap_mmio_immune(x86_arch_cap_msr)) {
if (cpu_matches(cpu_vuln_blacklist, MMIO))
setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA);
else if (!cpu_matches(cpu_vuln_whitelist, NO_MMIO))
@ -1410,7 +1411,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
}
if (!cpu_has(c, X86_FEATURE_BTC_NO)) {
if (cpu_matches(cpu_vuln_blacklist, RETBLEED) || (ia32_cap & ARCH_CAP_RSBA))
if (cpu_matches(cpu_vuln_blacklist, RETBLEED) || (x86_arch_cap_msr & ARCH_CAP_RSBA))
setup_force_cpu_bug(X86_BUG_RETBLEED);
}
@ -1428,18 +1429,25 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
* disabling AVX2. The only way to do this in HW is to clear XCR0[2],
* which means that AVX will be disabled.
*/
if (cpu_matches(cpu_vuln_blacklist, GDS) && !(ia32_cap & ARCH_CAP_GDS_NO) &&
if (cpu_matches(cpu_vuln_blacklist, GDS) && !(x86_arch_cap_msr & ARCH_CAP_GDS_NO) &&
boot_cpu_has(X86_FEATURE_AVX))
setup_force_cpu_bug(X86_BUG_GDS);
if (vulnerable_to_rfds(ia32_cap))
if (vulnerable_to_rfds(x86_arch_cap_msr))
setup_force_cpu_bug(X86_BUG_RFDS);
/* When virtualized, eIBRS could be hidden, assume vulnerable */
if (!(x86_arch_cap_msr & ARCH_CAP_BHI_NO) &&
!cpu_matches(cpu_vuln_whitelist, NO_BHI) &&
(boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) ||
boot_cpu_has(X86_FEATURE_HYPERVISOR)))
setup_force_cpu_bug(X86_BUG_BHI);
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;
/* Rogue Data Cache Load? No! */
if (ia32_cap & ARCH_CAP_RDCL_NO)
if (x86_arch_cap_msr & ARCH_CAP_RDCL_NO)
return;
setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);

View File

@ -44,7 +44,10 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_F16C, X86_FEATURE_XMM2, },
{ X86_FEATURE_AES, X86_FEATURE_XMM2 },
{ X86_FEATURE_SHA_NI, X86_FEATURE_XMM2 },
{ X86_FEATURE_GFNI, X86_FEATURE_XMM2 },
{ X86_FEATURE_FMA, X86_FEATURE_AVX },
{ X86_FEATURE_VAES, X86_FEATURE_AVX },
{ X86_FEATURE_VPCLMULQDQ, X86_FEATURE_AVX },
{ X86_FEATURE_AVX2, X86_FEATURE_AVX, },
{ X86_FEATURE_AVX512F, X86_FEATURE_AVX, },
{ X86_FEATURE_AVX512IFMA, X86_FEATURE_AVX512F },
@ -56,9 +59,6 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_AVX512VL, X86_FEATURE_AVX512F },
{ X86_FEATURE_AVX512VBMI, X86_FEATURE_AVX512F },
{ X86_FEATURE_AVX512_VBMI2, X86_FEATURE_AVX512VL },
{ X86_FEATURE_GFNI, X86_FEATURE_AVX512VL },
{ X86_FEATURE_VAES, X86_FEATURE_AVX512VL },
{ X86_FEATURE_VPCLMULQDQ, X86_FEATURE_AVX512VL },
{ X86_FEATURE_AVX512_VNNI, X86_FEATURE_AVX512VL },
{ X86_FEATURE_AVX512_BITALG, X86_FEATURE_AVX512VL },
{ X86_FEATURE_AVX512_4VNNIW, X86_FEATURE_AVX512F },

View File

@ -28,6 +28,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
{ X86_FEATURE_INTEL_PPIN, CPUID_EBX, 0, 0x00000007, 1 },
{ X86_FEATURE_RRSBA_CTRL, CPUID_EDX, 2, 0x00000007, 2 },
{ X86_FEATURE_BHI_CTRL, CPUID_EDX, 4, 0x00000007, 2 },
{ X86_FEATURE_CQM_LLC, CPUID_EDX, 1, 0x0000000f, 0 },
{ X86_FEATURE_CQM_OCCUP_LLC, CPUID_EDX, 0, 0x0000000f, 1 },
{ X86_FEATURE_CQM_MBM_TOTAL, CPUID_EDX, 1, 0x0000000f, 1 },

View File

@ -123,7 +123,6 @@ static void topo_set_cpuids(unsigned int cpu, u32 apic_id, u32 acpi_id)
early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
#endif
set_cpu_possible(cpu, true);
set_cpu_present(cpu, true);
}
@ -210,7 +209,11 @@ static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
topo_info.nr_disabled_cpus++;
}
/* Register present and possible CPUs in the domain maps */
/*
* Register present and possible CPUs in the domain
* maps. cpu_possible_map will be updated in
* topology_init_possible_cpus() after enumeration is done.
*/
for (dom = TOPO_SMT_DOMAIN; dom < TOPO_MAX_DOMAIN; dom++)
set_bit(topo_apicid(apic_id, dom), apic_maps[dom].map);
}

View File

@ -29,11 +29,21 @@ static bool parse_8000_0008(struct topo_scan *tscan)
if (!sft)
sft = get_count_order(ecx.cpu_nthreads + 1);
topology_set_dom(tscan, TOPO_SMT_DOMAIN, sft, ecx.cpu_nthreads + 1);
/*
* cpu_nthreads describes the number of threads in the package
* sft is the number of APIC ID bits per package
*
* As the number of actual threads per core is not described in
* this leaf, just set the CORE domain shift and let the later
* parsers set SMT shift. Assume one thread per core by default
* which is correct if there are no other CPUID leafs to parse.
*/
topology_update_dom(tscan, TOPO_SMT_DOMAIN, 0, 1);
topology_set_dom(tscan, TOPO_CORE_DOMAIN, sft, ecx.cpu_nthreads + 1);
return true;
}
static void store_node(struct topo_scan *tscan, unsigned int nr_nodes, u16 node_id)
static void store_node(struct topo_scan *tscan, u16 nr_nodes, u16 node_id)
{
/*
* Starting with Fam 17h the DIE domain could probably be used to
@ -73,12 +83,14 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_0xb)
tscan->c->topo.initial_apicid = leaf.ext_apic_id;
/*
* If leaf 0xb is available, then SMT shift is set already. If not
* take it from ecx.threads_per_core and use topo_update_dom() -
* topology_set_dom() would propagate and overwrite the already
* propagated CORE level.
* If leaf 0xb is available, then the domain shifts are set
* already and nothing to do here.
*/
if (!has_0xb) {
/*
* Leaf 0x80000008 set the CORE domain shift already.
* Update the SMT domain, but do not propagate it.
*/
unsigned int nthreads = leaf.core_nthreads + 1;
topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(nthreads), nthreads);
@ -109,13 +121,13 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_0xb)
static bool parse_fam10h_node_id(struct topo_scan *tscan)
{
struct {
union {
union {
struct {
u64 node_id : 3,
nodes_per_pkg : 3,
unused : 58;
u64 msr;
};
u64 msr;
} nid;
if (!boot_cpu_has(X86_FEATURE_NODEID_MSR))
@ -135,6 +147,26 @@ static void legacy_set_llc(struct topo_scan *tscan)
tscan->c->topo.llc_id = apicid >> tscan->dom_shifts[TOPO_CORE_DOMAIN];
}
static void topoext_fixup(struct topo_scan *tscan)
{
struct cpuinfo_x86 *c = tscan->c;
u64 msrval;
/* Try to re-enable TopologyExtensions if switched off by BIOS */
if (cpu_has(c, X86_FEATURE_TOPOEXT) || c->x86_vendor != X86_VENDOR_AMD ||
c->x86 != 0x15 || c->x86_model < 0x10 || c->x86_model > 0x6f)
return;
if (msr_set_bit(0xc0011005, 54) <= 0)
return;
rdmsrl(0xc0011005, msrval);
if (msrval & BIT_64(54)) {
set_cpu_cap(c, X86_FEATURE_TOPOEXT);
pr_info_once(FW_INFO "CPU: Re-enabling disabled Topology Extensions Support.\n");
}
}
static void parse_topology_amd(struct topo_scan *tscan)
{
bool has_0xb = false;
@ -164,6 +196,7 @@ static void parse_topology_amd(struct topo_scan *tscan)
void cpu_parse_topology_amd(struct topo_scan *tscan)
{
tscan->amd_nodes_per_pkg = 1;
topoext_fixup(tscan);
parse_topology_amd(tscan);
if (tscan->amd_nodes_per_pkg > 1)

View File

@ -3,11 +3,6 @@
ccflags-y += -I $(srctree)/arch/x86/kvm
ccflags-$(CONFIG_KVM_WERROR) += -Werror
ifeq ($(CONFIG_FRAME_POINTER),y)
OBJECT_FILES_NON_STANDARD_vmx/vmenter.o := y
OBJECT_FILES_NON_STANDARD_svm/vmenter.o := y
endif
include $(srctree)/virt/kvm/Makefile.kvm
kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \

View File

@ -376,6 +376,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
kvm_update_pv_runtime(vcpu);
vcpu->arch.is_amd_compatible = guest_cpuid_is_amd_or_hygon(vcpu);
vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);

View File

@ -120,6 +120,16 @@ static inline bool guest_cpuid_is_intel(struct kvm_vcpu *vcpu)
return best && is_guest_vendor_intel(best->ebx, best->ecx, best->edx);
}
static inline bool guest_cpuid_is_amd_compatible(struct kvm_vcpu *vcpu)
{
return vcpu->arch.is_amd_compatible;
}
static inline bool guest_cpuid_is_intel_compatible(struct kvm_vcpu *vcpu)
{
return !guest_cpuid_is_amd_compatible(vcpu);
}
static inline int guest_cpuid_family(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *best;

View File

@ -2776,7 +2776,8 @@ int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type)
trig_mode = reg & APIC_LVT_LEVEL_TRIGGER;
r = __apic_accept_irq(apic, mode, vector, 1, trig_mode, NULL);
if (r && lvt_type == APIC_LVTPC)
if (r && lvt_type == APIC_LVTPC &&
guest_cpuid_is_intel_compatible(apic->vcpu))
kvm_lapic_set_reg(apic, APIC_LVTPC, reg | APIC_LVT_MASKED);
return r;
}

View File

@ -4935,7 +4935,7 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
context->cpu_role.base.level, is_efer_nx(context),
guest_can_use(vcpu, X86_FEATURE_GBPAGES),
is_cr4_pse(context),
guest_cpuid_is_amd_or_hygon(vcpu));
guest_cpuid_is_amd_compatible(vcpu));
}
static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
@ -5576,9 +5576,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
* that problem is swept under the rug; KVM's CPUID API is horrific and
* it's all but impossible to solve it without introducing a new API.
*/
vcpu->arch.root_mmu.root_role.word = 0;
vcpu->arch.guest_mmu.root_role.word = 0;
vcpu->arch.nested_mmu.root_role.word = 0;
vcpu->arch.root_mmu.root_role.invalid = 1;
vcpu->arch.guest_mmu.root_role.invalid = 1;
vcpu->arch.nested_mmu.root_role.invalid = 1;
vcpu->arch.root_mmu.cpu_role.ext.valid = 0;
vcpu->arch.guest_mmu.cpu_role.ext.valid = 0;
vcpu->arch.nested_mmu.cpu_role.ext.valid = 0;
@ -7399,7 +7399,8 @@ bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
* by the memslot, KVM can't use a hugepage due to the
* misaligned address regardless of memory attributes.
*/
if (gfn >= slot->base_gfn) {
if (gfn >= slot->base_gfn &&
gfn + nr_pages <= slot->base_gfn + slot->npages) {
if (hugepage_has_attrs(kvm, slot, gfn, level, attrs))
hugepage_clear_mixed(slot, gfn, level);
else

View File

@ -1548,17 +1548,21 @@ void kvm_tdp_mmu_try_split_huge_pages(struct kvm *kvm,
}
}
/*
* Clear the dirty status of all the SPTEs mapping GFNs in the memslot. If
* AD bits are enabled, this will involve clearing the dirty bit on each SPTE.
* If AD bits are not enabled, this will require clearing the writable bit on
* each SPTE. Returns true if an SPTE has been changed and the TLBs need to
* be flushed.
*/
static bool tdp_mmu_need_write_protect(struct kvm_mmu_page *sp)
{
/*
* All TDP MMU shadow pages share the same role as their root, aside
* from level, so it is valid to key off any shadow page to determine if
* write protection is needed for an entire tree.
*/
return kvm_mmu_page_ad_need_write_protect(sp) || !kvm_ad_enabled();
}
static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
gfn_t start, gfn_t end)
{
u64 dbit = kvm_ad_enabled() ? shadow_dirty_mask : PT_WRITABLE_MASK;
const u64 dbit = tdp_mmu_need_write_protect(root) ? PT_WRITABLE_MASK :
shadow_dirty_mask;
struct tdp_iter iter;
bool spte_set = false;
@ -1573,7 +1577,7 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
continue;
KVM_MMU_WARN_ON(kvm_ad_enabled() &&
KVM_MMU_WARN_ON(dbit == shadow_dirty_mask &&
spte_ad_need_write_protect(iter.old_spte));
if (!(iter.old_spte & dbit))
@ -1590,11 +1594,9 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
}
/*
* Clear the dirty status of all the SPTEs mapping GFNs in the memslot. If
* AD bits are enabled, this will involve clearing the dirty bit on each SPTE.
* If AD bits are not enabled, this will require clearing the writable bit on
* each SPTE. Returns true if an SPTE has been changed and the TLBs need to
* be flushed.
* Clear the dirty status (D-bit or W-bit) of all the SPTEs mapping GFNs in the
* memslot. Returns true if an SPTE has been changed and the TLBs need to be
* flushed.
*/
bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
const struct kvm_memory_slot *slot)
@ -1610,18 +1612,11 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
return spte_set;
}
/*
* Clears the dirty status of all the 4k SPTEs mapping GFNs for which a bit is
* set in mask, starting at gfn. The given memslot is expected to contain all
* the GFNs represented by set bits in the mask. If AD bits are enabled,
* clearing the dirty status will involve clearing the dirty bit on each SPTE
* or, if AD bits are not enabled, clearing the writable bit on each SPTE.
*/
static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
gfn_t gfn, unsigned long mask, bool wrprot)
{
u64 dbit = (wrprot || !kvm_ad_enabled()) ? PT_WRITABLE_MASK :
shadow_dirty_mask;
const u64 dbit = (wrprot || tdp_mmu_need_write_protect(root)) ? PT_WRITABLE_MASK :
shadow_dirty_mask;
struct tdp_iter iter;
lockdep_assert_held_write(&kvm->mmu_lock);
@ -1633,7 +1628,7 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
if (!mask)
break;
KVM_MMU_WARN_ON(kvm_ad_enabled() &&
KVM_MMU_WARN_ON(dbit == shadow_dirty_mask &&
spte_ad_need_write_protect(iter.old_spte));
if (iter.level > PG_LEVEL_4K ||
@ -1659,11 +1654,9 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
}
/*
* Clears the dirty status of all the 4k SPTEs mapping GFNs for which a bit is
* set in mask, starting at gfn. The given memslot is expected to contain all
* the GFNs represented by set bits in the mask. If AD bits are enabled,
* clearing the dirty status will involve clearing the dirty bit on each SPTE
* or, if AD bits are not enabled, clearing the writable bit on each SPTE.
* Clear the dirty status (D-bit or W-bit) of all the 4k SPTEs mapping GFNs for
* which a bit is set in mask, starting at gfn. The given memslot is expected to
* contain all the GFNs represented by set bits in the mask.
*/
void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm,
struct kvm_memory_slot *slot,

View File

@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
if (vcpu->kvm->arch.enable_pmu)
static_call(kvm_x86_pmu_refresh)(vcpu);
if (!vcpu->kvm->arch.enable_pmu)
return;
static_call(kvm_x86_pmu_refresh)(vcpu);
/*
* At RESET, both Intel and AMD CPUs set all enable bits for general
* purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
* was written for v1 PMUs don't unknowingly leave GP counters disabled
* in the global controls). Emulate that behavior when refreshing the
* PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
*/
if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)

View File

@ -52,7 +52,7 @@ enum kvm_only_cpuid_leafs {
#define X86_FEATURE_IPRED_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 1)
#define KVM_X86_FEATURE_RRSBA_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 2)
#define X86_FEATURE_DDPD_U KVM_X86_FEATURE(CPUID_7_2_EDX, 3)
#define X86_FEATURE_BHI_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 4)
#define KVM_X86_FEATURE_BHI_CTRL KVM_X86_FEATURE(CPUID_7_2_EDX, 4)
#define X86_FEATURE_MCDT_NO KVM_X86_FEATURE(CPUID_7_2_EDX, 5)
/* CPUID level 0x80000007 (EDX). */
@ -128,6 +128,7 @@ static __always_inline u32 __feature_translate(int x86_feature)
KVM_X86_TRANSLATE_FEATURE(CONSTANT_TSC);
KVM_X86_TRANSLATE_FEATURE(PERFMON_V2);
KVM_X86_TRANSLATE_FEATURE(RRSBA_CTRL);
KVM_X86_TRANSLATE_FEATURE(BHI_CTRL);
default:
return x86_feature;
}

View File

@ -434,7 +434,7 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
/* Avoid using vmalloc for smaller buffers. */
size = npages * sizeof(struct page *);
if (size > PAGE_SIZE)
pages = __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
pages = __vmalloc(size, GFP_KERNEL_ACCOUNT);
else
pages = kmalloc(size, GFP_KERNEL_ACCOUNT);

View File

@ -1503,6 +1503,11 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
__free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE));
}
static struct sev_es_save_area *sev_es_host_save_area(struct svm_cpu_data *sd)
{
return page_address(sd->save_area) + 0x400;
}
static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@ -1519,12 +1524,8 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
* or subsequent vmload of host save area.
*/
vmsave(sd->save_area_pa);
if (sev_es_guest(vcpu->kvm)) {
struct sev_es_save_area *hostsa;
hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
sev_es_prepare_switch_to_guest(hostsa);
}
if (sev_es_guest(vcpu->kvm))
sev_es_prepare_switch_to_guest(sev_es_host_save_area(sd));
if (tsc_scaling)
__svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
@ -4101,6 +4102,7 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
{
struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, vcpu->cpu);
struct vcpu_svm *svm = to_svm(vcpu);
guest_state_enter_irqoff();
@ -4108,7 +4110,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
amd_clear_divider();
if (sev_es_guest(vcpu->kvm))
__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted,
sev_es_host_save_area(sd));
else
__svm_vcpu_run(svm, spec_ctrl_intercepted);

View File

@ -698,7 +698,8 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
/* vmenter.S */
void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted,
struct sev_es_save_area *hostsa);
void __svm_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
#define DEFINE_KVM_GHCB_ACCESSORS(field) \

View File

@ -3,6 +3,7 @@
#include <asm/asm.h>
#include <asm/asm-offsets.h>
#include <asm/bitsperlong.h>
#include <asm/frame.h>
#include <asm/kvm_vcpu_regs.h>
#include <asm/nospec-branch.h>
#include "kvm-asm-offsets.h"
@ -67,7 +68,7 @@
"", X86_FEATURE_V_SPEC_CTRL
901:
.endm
.macro RESTORE_HOST_SPEC_CTRL_BODY
.macro RESTORE_HOST_SPEC_CTRL_BODY spec_ctrl_intercepted:req
900:
/* Same for after vmexit. */
mov $MSR_IA32_SPEC_CTRL, %ecx
@ -76,7 +77,7 @@
* Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
* if it was not intercepted during guest execution.
*/
cmpb $0, (%_ASM_SP)
cmpb $0, \spec_ctrl_intercepted
jnz 998f
rdmsr
movl %eax, SVM_spec_ctrl(%_ASM_DI)
@ -99,6 +100,7 @@
*/
SYM_FUNC_START(__svm_vcpu_run)
push %_ASM_BP
mov %_ASM_SP, %_ASM_BP
#ifdef CONFIG_X86_64
push %r15
push %r14
@ -268,7 +270,7 @@ SYM_FUNC_START(__svm_vcpu_run)
RET
RESTORE_GUEST_SPEC_CTRL_BODY
RESTORE_HOST_SPEC_CTRL_BODY
RESTORE_HOST_SPEC_CTRL_BODY (%_ASM_SP)
10: cmpb $0, _ASM_RIP(kvm_rebooting)
jne 2b
@ -290,66 +292,68 @@ SYM_FUNC_START(__svm_vcpu_run)
SYM_FUNC_END(__svm_vcpu_run)
#ifdef CONFIG_KVM_AMD_SEV
#ifdef CONFIG_X86_64
#define SEV_ES_GPRS_BASE 0x300
#define SEV_ES_RBX (SEV_ES_GPRS_BASE + __VCPU_REGS_RBX * WORD_SIZE)
#define SEV_ES_RBP (SEV_ES_GPRS_BASE + __VCPU_REGS_RBP * WORD_SIZE)
#define SEV_ES_RSI (SEV_ES_GPRS_BASE + __VCPU_REGS_RSI * WORD_SIZE)
#define SEV_ES_RDI (SEV_ES_GPRS_BASE + __VCPU_REGS_RDI * WORD_SIZE)
#define SEV_ES_R12 (SEV_ES_GPRS_BASE + __VCPU_REGS_R12 * WORD_SIZE)
#define SEV_ES_R13 (SEV_ES_GPRS_BASE + __VCPU_REGS_R13 * WORD_SIZE)
#define SEV_ES_R14 (SEV_ES_GPRS_BASE + __VCPU_REGS_R14 * WORD_SIZE)
#define SEV_ES_R15 (SEV_ES_GPRS_BASE + __VCPU_REGS_R15 * WORD_SIZE)
#endif
/**
* __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
* @svm: struct vcpu_svm *
* @spec_ctrl_intercepted: bool
*/
SYM_FUNC_START(__svm_sev_es_vcpu_run)
push %_ASM_BP
#ifdef CONFIG_X86_64
push %r15
push %r14
push %r13
push %r12
#else
push %edi
push %esi
#endif
push %_ASM_BX
FRAME_BEGIN
/*
* Save variables needed after vmexit on the stack, in inverse
* order compared to when they are needed.
* Save non-volatile (callee-saved) registers to the host save area.
* Except for RAX and RSP, all GPRs are restored on #VMEXIT, but not
* saved on VMRUN.
*/
mov %rbp, SEV_ES_RBP (%rdx)
mov %r15, SEV_ES_R15 (%rdx)
mov %r14, SEV_ES_R14 (%rdx)
mov %r13, SEV_ES_R13 (%rdx)
mov %r12, SEV_ES_R12 (%rdx)
mov %rbx, SEV_ES_RBX (%rdx)
/* Accessed directly from the stack in RESTORE_HOST_SPEC_CTRL. */
push %_ASM_ARG2
/* Save @svm. */
push %_ASM_ARG1
.ifnc _ASM_ARG1, _ASM_DI
/*
* Stash @svm in RDI early. On 32-bit, arguments are in RAX, RCX
* and RDX which are clobbered by RESTORE_GUEST_SPEC_CTRL.
* Save volatile registers that hold arguments that are needed after
* #VMEXIT (RDI=@svm and RSI=@spec_ctrl_intercepted).
*/
mov %_ASM_ARG1, %_ASM_DI
.endif
mov %rdi, SEV_ES_RDI (%rdx)
mov %rsi, SEV_ES_RSI (%rdx)
/* Clobbers RAX, RCX, RDX. */
/* Clobbers RAX, RCX, RDX (@hostsa). */
RESTORE_GUEST_SPEC_CTRL
/* Get svm->current_vmcb->pa into RAX. */
mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
mov SVM_current_vmcb(%rdi), %rax
mov KVM_VMCB_pa(%rax), %rax
/* Enter guest mode */
sti
1: vmrun %_ASM_AX
1: vmrun %rax
2: cli
/* Pop @svm to RDI, guest registers have been saved already. */
pop %_ASM_DI
#ifdef CONFIG_MITIGATION_RETPOLINE
/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
/* Clobbers RAX, RCX, RDX. */
/* Clobbers RAX, RCX, RDX, consumes RDI (@svm) and RSI (@spec_ctrl_intercepted). */
RESTORE_HOST_SPEC_CTRL
/*
@ -361,30 +365,17 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
*/
UNTRAIN_RET_VM
/* "Pop" @spec_ctrl_intercepted. */
pop %_ASM_BX
pop %_ASM_BX
#ifdef CONFIG_X86_64
pop %r12
pop %r13
pop %r14
pop %r15
#else
pop %esi
pop %edi
#endif
pop %_ASM_BP
FRAME_END
RET
RESTORE_GUEST_SPEC_CTRL_BODY
RESTORE_HOST_SPEC_CTRL_BODY
RESTORE_HOST_SPEC_CTRL_BODY %sil
3: cmpb $0, _ASM_RIP(kvm_rebooting)
3: cmpb $0, kvm_rebooting(%rip)
jne 2b
ud2
_ASM_EXTABLE(1b, 3b)
SYM_FUNC_END(__svm_sev_es_vcpu_run)
#endif /* CONFIG_KVM_AMD_SEV */

View File

@ -535,7 +535,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
perf_capabilities = vcpu_get_perf_capabilities(vcpu);
if (cpuid_model_is_consistent(vcpu) &&
(perf_capabilities & PMU_CAP_LBR_FMT))
x86_perf_get_lbr(&lbr_desc->records);
memcpy(&lbr_desc->records, &vmx_lbr_caps, sizeof(vmx_lbr_caps));
else
lbr_desc->records.nr = 0;

View File

@ -275,6 +275,8 @@ SYM_INNER_LABEL_ALIGN(vmx_vmexit, SYM_L_GLOBAL)
call vmx_spec_ctrl_restore_host
CLEAR_BRANCH_HISTORY_VMEXIT
/* Put return value in AX */
mov %_ASM_BX, %_ASM_AX

View File

@ -218,6 +218,8 @@ module_param(ple_window_max, uint, 0444);
int __read_mostly pt_mode = PT_MODE_SYSTEM;
module_param(pt_mode, int, S_IRUGO);
struct x86_pmu_lbr __ro_after_init vmx_lbr_caps;
static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush);
static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond);
static DEFINE_MUTEX(vmx_l1d_flush_mutex);
@ -7862,10 +7864,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
vmx_update_exception_bitmap(vcpu);
}
static u64 vmx_get_perf_capabilities(void)
static __init u64 vmx_get_perf_capabilities(void)
{
u64 perf_cap = PMU_CAP_FW_WRITES;
struct x86_pmu_lbr lbr;
u64 host_perf_cap = 0;
if (!enable_pmu)
@ -7875,15 +7876,43 @@ static u64 vmx_get_perf_capabilities(void)
rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR)) {
x86_perf_get_lbr(&lbr);
if (lbr.nr)
x86_perf_get_lbr(&vmx_lbr_caps);
/*
* KVM requires LBR callstack support, as the overhead due to
* context switching LBRs without said support is too high.
* See intel_pmu_create_guest_lbr_event() for more info.
*/
if (!vmx_lbr_caps.has_callstack)
memset(&vmx_lbr_caps, 0, sizeof(vmx_lbr_caps));
else if (vmx_lbr_caps.nr)
perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
}
if (vmx_pebs_supported()) {
perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
if ((perf_cap & PERF_CAP_PEBS_FORMAT) < 4)
perf_cap &= ~PERF_CAP_PEBS_BASELINE;
/*
* Disallow adaptive PEBS as it is functionally broken, can be
* used by the guest to read *host* LBRs, and can be used to
* bypass userspace event filters. To correctly and safely
* support adaptive PEBS, KVM needs to:
*
* 1. Account for the ADAPTIVE flag when (re)programming fixed
* counters.
*
* 2. Gain support from perf (or take direct control of counter
* programming) to support events without adaptive PEBS
* enabled for the hardware counter.
*
* 3. Ensure LBR MSRs cannot hold host data on VM-Entry with
* adaptive PEBS enabled and MSR_PEBS_DATA_CFG.LBRS=1.
*
* 4. Document which PMU events are effectively exposed to the
* guest via adaptive PEBS, and make adaptive PEBS mutually
* exclusive with KVM_SET_PMU_EVENT_FILTER if necessary.
*/
perf_cap &= ~PERF_CAP_PEBS_BASELINE;
}
return perf_cap;

View File

@ -15,6 +15,7 @@
#include "vmx_ops.h"
#include "../cpuid.h"
#include "run_flags.h"
#include "../mmu.h"
#define MSR_TYPE_R 1
#define MSR_TYPE_W 2
@ -109,6 +110,8 @@ struct lbr_desc {
bool msr_passthrough;
};
extern struct x86_pmu_lbr vmx_lbr_caps;
/*
* The nested_vmx structure is part of vcpu_vmx, and holds information we need
* for correct emulation of VMX (i.e., nested VMX) on this vcpu.
@ -719,7 +722,8 @@ static inline bool vmx_need_pf_intercept(struct kvm_vcpu *vcpu)
if (!enable_ept)
return true;
return allow_smaller_maxphyaddr && cpuid_maxphyaddr(vcpu) < boot_cpu_data.x86_phys_bits;
return allow_smaller_maxphyaddr &&
cpuid_maxphyaddr(vcpu) < kvm_get_shadow_phys_bits();
}
static inline bool is_unrestricted_guest(struct kvm_vcpu *vcpu)

View File

@ -1621,7 +1621,7 @@ static bool kvm_is_immutable_feature_msr(u32 msr)
ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \
ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR)
ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO)
static u64 kvm_get_arch_capabilities(void)
{
@ -3470,7 +3470,7 @@ static bool is_mci_status_msr(u32 msr)
static bool can_set_mci_status(struct kvm_vcpu *vcpu)
{
/* McStatusWrEn enabled? */
if (guest_cpuid_is_amd_or_hygon(vcpu))
if (guest_cpuid_is_amd_compatible(vcpu))
return !!(vcpu->arch.msr_hwcr & BIT_ULL(18));
return false;

View File

@ -382,8 +382,15 @@ SYM_FUNC_END(call_depth_return_thunk)
SYM_CODE_START(__x86_return_thunk)
UNWIND_HINT_FUNC
ANNOTATE_NOENDBR
#if defined(CONFIG_MITIGATION_UNRET_ENTRY) || \
defined(CONFIG_MITIGATION_SRSO) || \
defined(CONFIG_MITIGATION_CALL_DEPTH_TRACKING)
ALTERNATIVE __stringify(ANNOTATE_UNRET_SAFE; ret), \
"jmp warn_thunk_thunk", X86_FEATURE_ALWAYS
#else
ANNOTATE_UNRET_SAFE
ret
#endif
int3
SYM_CODE_END(__x86_return_thunk)
EXPORT_SYMBOL(__x86_return_thunk)

View File

@ -645,6 +645,14 @@ static void blkdev_flush_mapping(struct block_device *bdev)
bdev_write_inode(bdev);
}
static void blkdev_put_whole(struct block_device *bdev)
{
if (atomic_dec_and_test(&bdev->bd_openers))
blkdev_flush_mapping(bdev);
if (bdev->bd_disk->fops->release)
bdev->bd_disk->fops->release(bdev->bd_disk);
}
static int blkdev_get_whole(struct block_device *bdev, blk_mode_t mode)
{
struct gendisk *disk = bdev->bd_disk;
@ -663,20 +671,21 @@ static int blkdev_get_whole(struct block_device *bdev, blk_mode_t mode)
if (!atomic_read(&bdev->bd_openers))
set_init_blocksize(bdev);
if (test_bit(GD_NEED_PART_SCAN, &disk->state))
bdev_disk_changed(disk, false);
atomic_inc(&bdev->bd_openers);
if (test_bit(GD_NEED_PART_SCAN, &disk->state)) {
/*
* Only return scanning errors if we are called from contexts
* that explicitly want them, e.g. the BLKRRPART ioctl.
*/
ret = bdev_disk_changed(disk, false);
if (ret && (mode & BLK_OPEN_STRICT_SCAN)) {
blkdev_put_whole(bdev);
return ret;
}
}
return 0;
}
static void blkdev_put_whole(struct block_device *bdev)
{
if (atomic_dec_and_test(&bdev->bd_openers))
blkdev_flush_mapping(bdev);
if (bdev->bd_disk->fops->release)
bdev->bd_disk->fops->release(bdev->bd_disk);
}
static int blkdev_get_part(struct block_device *part, blk_mode_t mode)
{
struct gendisk *disk = part->bd_disk;

View File

@ -1409,6 +1409,12 @@ static int blkcg_css_online(struct cgroup_subsys_state *css)
return 0;
}
void blkg_init_queue(struct request_queue *q)
{
INIT_LIST_HEAD(&q->blkg_list);
mutex_init(&q->blkcg_mutex);
}
int blkcg_init_disk(struct gendisk *disk)
{
struct request_queue *q = disk->queue;
@ -1416,9 +1422,6 @@ int blkcg_init_disk(struct gendisk *disk)
bool preloaded;
int ret;
INIT_LIST_HEAD(&q->blkg_list);
mutex_init(&q->blkcg_mutex);
new_blkg = blkg_alloc(&blkcg_root, disk, GFP_KERNEL);
if (!new_blkg)
return -ENOMEM;

View File

@ -189,6 +189,7 @@ struct blkcg_policy {
extern struct blkcg blkcg_root;
extern bool blkcg_debug_stats;
void blkg_init_queue(struct request_queue *q);
int blkcg_init_disk(struct gendisk *disk);
void blkcg_exit_disk(struct gendisk *disk);
@ -482,6 +483,7 @@ struct blkcg {
};
static inline struct blkcg_gq *blkg_lookup(struct blkcg *blkcg, void *key) { return NULL; }
static inline void blkg_init_queue(struct request_queue *q) { }
static inline int blkcg_init_disk(struct gendisk *disk) { return 0; }
static inline void blkcg_exit_disk(struct gendisk *disk) { }
static inline int blkcg_policy_register(struct blkcg_policy *pol) { return 0; }

View File

@ -442,6 +442,8 @@ struct request_queue *blk_alloc_queue(struct queue_limits *lim, int node_id)
init_waitqueue_head(&q->mq_freeze_wq);
mutex_init(&q->mq_freeze_lock);
blkg_init_queue(q);
/*
* Init percpu_ref in atomic mode so that it's faster to shutdown.
* See blk_register_queue() for details.
@ -1195,6 +1197,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
if (unlikely(!rq_list_empty(plug->cached_rq)))
blk_mq_free_plug_rqs(plug);
plug->cur_ktime = 0;
current->flags &= ~PF_BLOCK_TS;
}

View File

@ -1347,7 +1347,7 @@ static bool iocg_kick_delay(struct ioc_gq *iocg, struct ioc_now *now)
{
struct ioc *ioc = iocg->ioc;
struct blkcg_gq *blkg = iocg_to_blkg(iocg);
u64 tdelta, delay, new_delay;
u64 tdelta, delay, new_delay, shift;
s64 vover, vover_pct;
u32 hwa;
@ -1362,8 +1362,9 @@ static bool iocg_kick_delay(struct ioc_gq *iocg, struct ioc_now *now)
/* calculate the current delay in effect - 1/2 every second */
tdelta = now->now - iocg->delay_at;
if (iocg->delay)
delay = iocg->delay >> div64_u64(tdelta, USEC_PER_SEC);
shift = div64_u64(tdelta, USEC_PER_SEC);
if (iocg->delay && shift < BITS_PER_LONG)
delay = iocg->delay >> shift;
else
delay = 0;
@ -1438,8 +1439,11 @@ static void iocg_pay_debt(struct ioc_gq *iocg, u64 abs_vpay,
lockdep_assert_held(&iocg->ioc->lock);
lockdep_assert_held(&iocg->waitq.lock);
/* make sure that nobody messed with @iocg */
WARN_ON_ONCE(list_empty(&iocg->active_list));
/*
* make sure that nobody messed with @iocg. Check iocg->pd.online
* to avoid warn when removing blkcg or disk.
*/
WARN_ON_ONCE(list_empty(&iocg->active_list) && iocg->pd.online);
WARN_ON_ONCE(iocg->inuse > 1);
iocg->abs_vdebt -= min(abs_vpay, iocg->abs_vdebt);

View File

@ -182,17 +182,13 @@ static int blk_validate_limits(struct queue_limits *lim)
return -EINVAL;
/*
* Devices that require a virtual boundary do not support scatter/gather
* I/O natively, but instead require a descriptor list entry for each
* page (which might not be identical to the Linux PAGE_SIZE). Because
* of that they are not limited by our notion of "segment size".
* Stacking device may have both virtual boundary and max segment
* size limit, so allow this setting now, and long-term the two
* might need to move out of stacking limits since we have immutable
* bvec and lower layer bio splitting is supposed to handle the two
* correctly.
*/
if (lim->virt_boundary_mask) {
if (WARN_ON_ONCE(lim->max_segment_size &&
lim->max_segment_size != UINT_MAX))
return -EINVAL;
lim->max_segment_size = UINT_MAX;
} else {
if (!lim->virt_boundary_mask) {
/*
* The maximum segment size has an odd historic 64k default that
* drivers probably should override. Just like the I/O size we

View File

@ -563,7 +563,8 @@ static int blkdev_common_ioctl(struct block_device *bdev, blk_mode_t mode,
return -EACCES;
if (bdev_is_partition(bdev))
return -EINVAL;
return disk_scan_partitions(bdev->bd_disk, mode);
return disk_scan_partitions(bdev->bd_disk,
mode | BLK_OPEN_STRICT_SCAN);
case BLKTRACESTART:
case BLKTRACESTOP:
case BLKTRACETEARDOWN:

Some files were not shown because too many files have changed in this diff Show More