arm64 updates for 6.6

CPU features and system registers:
 	* Advertise hinted conditional branch support (FEAT_HBC) to
 	  userspace
 
 	* Avoid false positive "SANITY CHECK" warning when xCR registers
 	  differ outside of the length field
 
 Documentation:
 	* Fix macro name typo in SME documentation
 
 Entry code:
 	* Unmask exceptions earlier on the system call entry path
 
 Memory management:
 	* Don't bother clearing PTE_RDONLY for dirty ptes in
 	  pte_wrprotect() and pte_modify()
 
 Perf and PMU drivers:
 	* Initial support for Coresight TRBE devices on ACPI systems (the
 	  coresight driver changes will come later)
 
 	* Fix hw_breakpoint single-stepping when called from bpf
 
 	* Fixes for DDR PMU on i.MX8MP SoC
 
 	* Add NUMA-awareness to Hisilicon PCIe PMU driver
 
 	* Fix locking dependency issue in Arm DMC620 PMU driver
 
 	* Work around Hisilicon erratum 162001900 in the SMMUv3 PMU driver
 
 	* Add support for Arm CMN-700 r3 parts to the CMN PMU driver
 
 	* Add support for recent Arm Cortex CPU PMUs
 
 	* Update Hisilicon PMU maintainers
 
 Selftests:
 	* Add a bunch of new features to the hwcap test (JSCVT, PMULL,
 	  AES, SHA1, etc)
 
 	* Fix SSVE test to leave streaming mode after grabbing the
 	  signal context
 
 	* Add new test for SVE vector-length changes with SME enabled
 
 Miscellaneous:
 	* Allow compiler to warn on suspicious looking system register
 	  expressions
 
 	* Work around SDEI firmware bug by aborting any running
 	  handlers on a kernel crash
 
 	* Fix some harmless warnings when building with W=1
 
 	* Remove some unused function declarations
 
 	* Other minor fixes and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmTon4QQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNG0nCAC9lTqppELnqXPA3FswONhtDBnKEufZHp0+
 4+Z6CPjAYZpd7ruiezvxeZA62tZl3eX+tYOx+6lf4xYxFA5W/RQdmxM7e0mGJd+n
 sgps85kxArApCgJR9zJiTCAIPXzKH5ObsFWWbcRljI9fiISVDTYn1JFAEx9UERI5
 5yr6blYF2H115oD8V2f/0vVObGOAuiqNnzqJIuKL1I8H9xBK0pssrKvuCCN8J2o4
 28+PeO7PzwWPiSfnO15bLd/bGuzbMCcexv4/DdjtLZaAanW7crJRVAzOon+URuVx
 JXmkzQvXkOgSKnEFwfVRYTsUbtOz2cBafjSujVmjwIBymhbBCZR/
 =WqmX
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "I think we have a bit less than usual on the architecture side, but
  that's somewhat balanced out by a large crop of perf/PMU driver
  updates and extensions to our selftests.

  CPU features and system registers:

   - Advertise hinted conditional branch support (FEAT_HBC) to userspace

   - Avoid false positive "SANITY CHECK" warning when xCR registers
     differ outside of the length field

  Documentation:

   - Fix macro name typo in SME documentation

  Entry code:

   - Unmask exceptions earlier on the system call entry path

  Memory management:

   - Don't bother clearing PTE_RDONLY for dirty ptes in pte_wrprotect()
     and pte_modify()

  Perf and PMU drivers:

   - Initial support for Coresight TRBE devices on ACPI systems (the
     coresight driver changes will come later)

   - Fix hw_breakpoint single-stepping when called from bpf

   - Fixes for DDR PMU on i.MX8MP SoC

   - Add NUMA-awareness to Hisilicon PCIe PMU driver

   - Fix locking dependency issue in Arm DMC620 PMU driver

   - Work around Hisilicon erratum 162001900 in the SMMUv3 PMU driver

   - Add support for Arm CMN-700 r3 parts to the CMN PMU driver

   - Add support for recent Arm Cortex CPU PMUs

   - Update Hisilicon PMU maintainers

  Selftests:

   - Add a bunch of new features to the hwcap test (JSCVT, PMULL, AES,
     SHA1, etc)

   - Fix SSVE test to leave streaming mode after grabbing the signal
     context

   - Add new test for SVE vector-length changes with SME enabled

  Miscellaneous:

   - Allow compiler to warn on suspicious looking system register
     expressions

   - Work around SDEI firmware bug by aborting any running handlers on a
     kernel crash

   - Fix some harmless warnings when building with W=1

   - Remove some unused function declarations

   - Other minor fixes and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (62 commits)
  drivers/perf: hisi: Update HiSilicon PMU maintainers
  arm_pmu: acpi: Add a representative platform device for TRBE
  arm_pmu: acpi: Refactor arm_spe_acpi_register_device()
  kselftest/arm64: Fix hwcaps selftest build
  hw_breakpoint: fix single-stepping when using bpf_overflow_handler
  arm64/sysreg: refactor deprecated strncpy
  kselftest/arm64: add jscvt feature to hwcap test
  kselftest/arm64: add pmull feature to hwcap test
  kselftest/arm64: add AES feature check to hwcap test
  kselftest/arm64: add SHA1 and related features to hwcap test
  arm64: sysreg: Generate C compiler warnings on {read,write}_sysreg_s arguments
  kselftest/arm64: build BTI tests in output directory
  perf/imx_ddr: don't enable counter0 if none of 4 counters are used
  perf/imx_ddr: speed up overflow frequency of cycle
  drivers/perf: hisi: Schedule perf session according to locality
  kselftest/arm64: fix a memleak in zt_regs_run()
  perf/arm-dmc620: Fix dmc620_pmu_irqs_lock/cpu_hotplug_lock circular lock dependency
  perf/smmuv3: Add MODULE_ALIAS for module auto loading
  perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09
  kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
  ...
Linus Torvalds 2023-08-28 17:34:54 -07:00
commit 542034175c
68 changed files with 1074 additions and 393 deletions


@ -63,6 +63,14 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #1902691 | ARM64_ERRATUM_1902691 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2051678 | ARM64_ERRATUM_2051678 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2077057 | ARM64_ERRATUM_2077057 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2441009 | ARM64_ERRATUM_2441009 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2658417 | ARM64_ERRATUM_2658417 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 |
@ -109,14 +117,6 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A77 | #1508412 | ARM64_ERRATUM_1508412 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2051678 | ARM64_ERRATUM_2051678 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2077057 | ARM64_ERRATUM_2077057 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2441009 | ARM64_ERRATUM_2441009 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2658417 | ARM64_ERRATUM_2658417 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A710 | #2119858 | ARM64_ERRATUM_2119858 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A710 | #2054223 | ARM64_ERRATUM_2054223 |
@ -198,6 +198,9 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip08 SMMU PMCG | #162001900 | N/A |
| | Hip09 SMMU PMCG | | |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 |
+----------------+-----------------+-----------------+-----------------------------+


@ -322,7 +322,7 @@ The regset data starts with struct user_za_header, containing:
VL is supported.
* The size and layout of the payload depends on the header fields. The
SME_PT_ZA_*() macros are provided to facilitate access to the data.
ZA_PT_ZA*() macros are provided to facilitate access to the data.
* In either case, for SETREGSET it is permissible to omit the payload, in which
case the vector length and flags are changed and PSTATE.ZA is set to 0


@ -49,9 +49,14 @@ properties:
- arm,cortex-a77-pmu
- arm,cortex-a78-pmu
- arm,cortex-a510-pmu
- arm,cortex-a520-pmu
- arm,cortex-a710-pmu
- arm,cortex-a715-pmu
- arm,cortex-a720-pmu
- arm,cortex-x1-pmu
- arm,cortex-x2-pmu
- arm,cortex-x3-pmu
- arm,cortex-x4-pmu
- arm,neoverse-e1-pmu
- arm,neoverse-n1-pmu
- arm,neoverse-n2-pmu


@ -9306,7 +9306,7 @@ F: drivers/crypto/hisilicon/hpre/hpre_crypto.c
F: drivers/crypto/hisilicon/hpre/hpre_main.c
HISILICON HNS3 PMU DRIVER
M: Guangbin Huang <huangguangbin2@huawei.com>
M: Jijie Shao <shaojijie@huawei.com>
S: Supported
F: Documentation/admin-guide/perf/hns3-pmu.rst
F: drivers/perf/hisilicon/hns3_pmu.c
@ -9344,7 +9344,7 @@ F: Documentation/devicetree/bindings/net/hisilicon*.txt
F: drivers/net/ethernet/hisilicon/
HISILICON PMU DRIVER
M: Shaokun Zhang <zhangshaokun@hisilicon.com>
M: Yicong Yang <yangyicong@hisilicon.com>
M: Jonathan Cameron <jonathan.cameron@huawei.com>
S: Supported
W: http://www.hisilicon.com


@ -626,7 +626,7 @@ int hw_breakpoint_arch_parse(struct perf_event *bp,
hw->address &= ~alignment_mask;
hw->ctrl.len <<= offset;
if (is_default_overflow_handler(bp)) {
if (uses_default_overflow_handler(bp)) {
/*
* Mismatch breakpoints are required for single-stepping
* breakpoints.
@ -798,7 +798,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr,
* Otherwise, insert a temporary mismatch breakpoint so that
* we can single-step over the watchpoint trigger.
*/
if (!is_default_overflow_handler(wp))
if (!uses_default_overflow_handler(wp))
continue;
step:
enable_single_step(wp, instruction_pointer(regs));
@ -811,7 +811,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr,
info->trigger = addr;
pr_debug("watchpoint fired: address = 0x%x\n", info->trigger);
perf_bp_event(wp, regs);
if (is_default_overflow_handler(wp))
if (uses_default_overflow_handler(wp))
enable_single_step(wp, instruction_pointer(regs));
}
@ -886,7 +886,7 @@ static void breakpoint_handler(unsigned long unknown, struct pt_regs *regs)
info->trigger = addr;
pr_debug("breakpoint fired: address = 0x%x\n", addr);
perf_bp_event(bp, regs);
if (is_default_overflow_handler(bp))
if (uses_default_overflow_handler(bp))
enable_single_step(bp, addr);
goto unlock;
}
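
For reference, the series swaps is_default_overflow_handler() for a
wrapper so that breakpoints whose overflow handler was overridden by BPF
(which chains back to the original handler) still get the single-step
treatment. A sketch of the helper as I understand it, simplified from
the perf_event.h side of the patch:

#ifdef CONFIG_BPF_SYSCALL
static inline bool uses_default_overflow_handler(struct perf_event *event)
{
	if (likely(is_default_overflow_handler(event)))
		return true;

	/* BPF wraps the original handler and invokes it afterwards */
	return __is_default_overflow_handler(event->orig_overflow_handler);
}
#else
#define uses_default_overflow_handler(event) \
	is_default_overflow_handler(event)
#endif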


@ -1793,9 +1793,6 @@ config ARM64_PAN
The feature is detected at runtime, and will remain as a 'nop'
instruction if the cpu does not implement the feature.
config AS_HAS_LDAPR
def_bool $(as-instr,.arch_extension rcpc)
config AS_HAS_LSE_ATOMICS
def_bool $(as-instr,.arch_extension lse)
@ -1933,6 +1930,9 @@ config AS_HAS_ARMV8_3
config AS_HAS_CFI_NEGATE_RA_STATE
def_bool $(as-instr,.cfi_startproc\n.cfi_negate_ra_state\n.cfi_endproc\n)
config AS_HAS_LDAPR
def_bool $(as-instr,.arch_extension rcpc)
endmenu # "ARMv8.3 architectural features"
menu "ARMv8.4 architectural features"


@ -42,6 +42,9 @@
#define ACPI_MADT_GICC_SPE (offsetof(struct acpi_madt_generic_interrupt, \
spe_interrupt) + sizeof(u16))
#define ACPI_MADT_GICC_TRBE (offsetof(struct acpi_madt_generic_interrupt, \
trbe_interrupt) + sizeof(u16))
/* Basic configuration for ACPI */
#ifdef CONFIG_ACPI
pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);


@ -138,6 +138,7 @@
#define KERNEL_HWCAP_SME_B16B16 __khwcap2_feature(SME_B16B16)
#define KERNEL_HWCAP_SME_F16F16 __khwcap2_feature(SME_F16F16)
#define KERNEL_HWCAP_MOPS __khwcap2_feature(MOPS)
#define KERNEL_HWCAP_HBC __khwcap2_feature(HBC)
/*
* This yields a mask that user programs can use to figure out what


@ -118,31 +118,4 @@
#define SWAPPER_RX_MMUFLAGS (SWAPPER_RW_MMUFLAGS | PTE_RDONLY)
#endif
/*
* To make optimal use of block mappings when laying out the linear
* mapping, round down the base of physical memory to a size that can
* be mapped efficiently, i.e., either PUD_SIZE (4k granule) or PMD_SIZE
* (64k granule), or a multiple that can be mapped using contiguous bits
* in the page tables: 32 * PMD_SIZE (16k granule)
*/
#if defined(CONFIG_ARM64_4K_PAGES)
#define ARM64_MEMSTART_SHIFT PUD_SHIFT
#elif defined(CONFIG_ARM64_16K_PAGES)
#define ARM64_MEMSTART_SHIFT CONT_PMD_SHIFT
#else
#define ARM64_MEMSTART_SHIFT PMD_SHIFT
#endif
/*
* sparsemem vmemmap imposes an additional requirement on the alignment of
* memstart_addr, due to the fact that the base of the vmemmap region
* has a direct correspondence, and needs to appear sufficiently aligned
* in the virtual address space.
*/
#if ARM64_MEMSTART_SHIFT < SECTION_SIZE_BITS
#define ARM64_MEMSTART_ALIGN (1UL << SECTION_SIZE_BITS)
#else
#define ARM64_MEMSTART_ALIGN (1UL << ARM64_MEMSTART_SHIFT)
#endif
#endif /* __ASM_KERNEL_PGTABLE_H */


@ -64,7 +64,6 @@ extern void arm64_memblock_init(void);
extern void paging_init(void);
extern void bootmem_init(void);
extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
extern void init_mem_pgprot(void);
extern void create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot);
extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,


@ -103,6 +103,7 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
#define pte_young(pte) (!!(pte_val(pte) & PTE_AF))
#define pte_special(pte) (!!(pte_val(pte) & PTE_SPECIAL))
#define pte_write(pte) (!!(pte_val(pte) & PTE_WRITE))
#define pte_rdonly(pte) (!!(pte_val(pte) & PTE_RDONLY))
#define pte_user(pte) (!!(pte_val(pte) & PTE_USER))
#define pte_user_exec(pte) (!(pte_val(pte) & PTE_UXN))
#define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT))
@ -120,7 +121,7 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
(__boundary - 1 < (end) - 1) ? __boundary : (end); \
})
#define pte_hw_dirty(pte) (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY))
#define pte_hw_dirty(pte) (pte_write(pte) && !pte_rdonly(pte))
#define pte_sw_dirty(pte) (!!(pte_val(pte) & PTE_DIRTY))
#define pte_dirty(pte) (pte_sw_dirty(pte) || pte_hw_dirty(pte))
@ -212,7 +213,7 @@ static inline pte_t pte_wrprotect(pte_t pte)
* clear), set the PTE_DIRTY bit.
*/
if (pte_hw_dirty(pte))
pte = pte_mkdirty(pte);
pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
pte = clear_pte_bit(pte, __pgprot(PTE_WRITE));
pte = set_pte_bit(pte, __pgprot(PTE_RDONLY));
@ -823,7 +824,8 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
PTE_ATTRINDX_MASK;
/* preserve the hardware dirty information */
if (pte_hw_dirty(pte))
pte = pte_mkdirty(pte);
pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
pte_val(pte) = (pte_val(pte) & ~mask) | (pgprot_val(newprot) & mask);
return pte;
}
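
Some context for the two hunks above (my summary, not patch text):
pte_mkdirty() also clears PTE_RDONLY for writable ptes, which is wasted
work here because pte_wrprotect() sets PTE_RDONLY again immediately and
pte_modify() rewrites the protection bits from newprot anyway. The
helper being avoided looks roughly like this in the same header:

static inline pte_t pte_mkdirty(pte_t pte)
{
	pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));

	/* a hardware-dirty pte is writable with PTE_RDONLY clear */
	if (pte_write(pte))
		pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));

	return pte;
}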


@ -17,6 +17,9 @@
#include <asm/virt.h>
DECLARE_PER_CPU(struct sdei_registered_event *, sdei_active_normal_event);
DECLARE_PER_CPU(struct sdei_registered_event *, sdei_active_critical_event);
extern unsigned long sdei_exit_mode;
/* Software Delegated Exception entry point from firmware*/
@ -29,6 +32,9 @@ asmlinkage void __sdei_asm_entry_trampoline(unsigned long event_num,
unsigned long pc,
unsigned long pstate);
/* Abort a running handler. Context is discarded. */
void __sdei_handler_abort(void);
/*
* The above entry point does the minimum to call C code. This function does
* anything else, before calling the driver.


@ -803,15 +803,21 @@
/*
* For registers without architectural names, or simply unsupported by
* GAS.
*
* __check_r forces warnings to be generated by the compiler when
* evaluating r which wouldn't normally happen due to being passed to
* the assembler via __stringify(r).
*/
#define read_sysreg_s(r) ({ \
u64 __val; \
u32 __maybe_unused __check_r = (u32)(r); \
asm volatile(__mrs_s("%0", r) : "=r" (__val)); \
__val; \
})
#define write_sysreg_s(v, r) do { \
u64 __val = (u64)(v); \
u32 __maybe_unused __check_r = (u32)(r); \
asm volatile(__msr_s(r, "%x0") : : "rZ" (__val)); \
} while (0)
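
A quick illustration of the effect (hypothetical typo, not from the
series): the register argument used to reach the assembler only via
__stringify(), so the C front end never evaluated it and a malformed
expression produced, at best, a cryptic assembler message. The dummy
__check_r evaluation puts it in front of the compiler:

u64 v = read_sysreg_s(SYS_ZCR_EL1);	/* fine, as before */
u64 w = read_sysreg_s(SYS_ZRC_EL1);	/* typo: the compiler now reports
					   the undeclared identifier instead
					   of GAS choking on the string */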


@ -103,5 +103,6 @@
#define HWCAP2_SME_B16B16 (1UL << 41)
#define HWCAP2_SME_F16F16 (1UL << 42)
#define HWCAP2_MOPS (1UL << 43)
#define HWCAP2_HBC (1UL << 44)
#endif /* _UAPI__ASM_HWCAP_H */
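
How userspace is expected to consume the new bit: a minimal sketch using
the standard getauxval(3) interface, with the constant taken from the
hunk above:

#include <stdio.h>
#include <sys/auxv.h>

#ifndef HWCAP2_HBC
#define HWCAP2_HBC	(1UL << 44)
#endif

int main(void)
{
	unsigned long hwcap2 = getauxval(AT_HWCAP2);

	printf("FEAT_HBC: %s\n", (hwcap2 & HWCAP2_HBC) ? "present" : "absent");
	return 0;
}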


@ -222,7 +222,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CSSC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRFM_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_MOPS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
FTR_STRICT, FTR_EXACT, ID_AA64ISAR2_EL1_APA3_SHIFT, 4, 0),
@ -2708,12 +2708,8 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.desc = "Enhanced Virtualization Traps",
.capability = ARM64_HAS_EVT,
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.sys_reg = SYS_ID_AA64MMFR2_EL1,
.sign = FTR_UNSIGNED,
.field_pos = ID_AA64MMFR2_EL1_EVT_SHIFT,
.field_width = 4,
.min_field_value = ID_AA64MMFR2_EL1_EVT_IMP,
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
},
{},
};
@ -2844,6 +2840,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64ISAR2_EL1, RPRES, IMP, CAP_HWCAP, KERNEL_HWCAP_RPRES),
HWCAP_CAP(ID_AA64ISAR2_EL1, WFxT, IMP, CAP_HWCAP, KERNEL_HWCAP_WFXT),
HWCAP_CAP(ID_AA64ISAR2_EL1, MOPS, IMP, CAP_HWCAP, KERNEL_HWCAP_MOPS),
HWCAP_CAP(ID_AA64ISAR2_EL1, BC, IMP, CAP_HWCAP, KERNEL_HWCAP_HBC),
#ifdef CONFIG_ARM64_SME
HWCAP_CAP(ID_AA64PFR1_EL1, SME, IMP, CAP_HWCAP, KERNEL_HWCAP_SME),
HWCAP_CAP(ID_AA64SMFR0_EL1, FA64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_FA64),
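
For readers unfamiliar with the helper: ARM64_CPUID_FIELDS() is an
existing initialiser macro (not introduced here) that expands to the
same five members the open-coded block set, keyed off the generated
sysreg constants. Approximately (simplified sketch; the real macro also
derives the field's signedness):

#define ARM64_CPUID_FIELDS(reg, field, min_value)		\
	.sys_reg = SYS_##reg,					\
	.field_pos = reg##_##field##_SHIFT,			\
	.field_width = reg##_##field##_WIDTH,			\
	.sign = FTR_UNSIGNED,					\
	.min_field_value = reg##_##field##_##min_value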


@ -9,8 +9,6 @@
#include <linux/acpi.h>
#include <linux/cpuidle.h>
#include <linux/cpu_pm.h>
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/psci.h>
#ifdef CONFIG_ACPI_PROCESSOR_IDLE


@ -126,6 +126,7 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_SME_B16B16] = "smeb16b16",
[KERNEL_HWCAP_SME_F16F16] = "smef16f16",
[KERNEL_HWCAP_MOPS] = "mops",
[KERNEL_HWCAP_HBC] = "hbc",
};
#ifdef CONFIG_COMPAT


@ -355,6 +355,35 @@ static bool cortex_a76_erratum_1463225_debug_handler(struct pt_regs *regs)
}
#endif /* CONFIG_ARM64_ERRATUM_1463225 */
/*
* As per the ABI exit SME streaming mode and clear the SVE state not
* shared with FPSIMD on syscall entry.
*/
static inline void fp_user_discard(void)
{
/*
* If SME is active then exit streaming mode. If ZA is active
* then flush the SVE registers but leave userspace access to
* both SVE and SME enabled, otherwise disable SME for the
* task and fall through to disabling SVE too. This means
* that after a syscall we never have any streaming mode
* register state to track, if this changes the KVM code will
* need updating.
*/
if (system_supports_sme())
sme_smstop_sm();
if (!system_supports_sve())
return;
if (test_thread_flag(TIF_SVE)) {
unsigned int sve_vq_minus_one;
sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
sve_flush_live(true, sve_vq_minus_one);
}
}
UNHANDLED(el1t, 64, sync)
UNHANDLED(el1t, 64, irq)
UNHANDLED(el1t, 64, fiq)
@ -644,6 +673,8 @@ static void noinstr el0_svc(struct pt_regs *regs)
{
enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
fp_user_discard();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc(regs);
exit_to_user_mode(regs);
}
@ -783,6 +814,7 @@ static void noinstr el0_svc_compat(struct pt_regs *regs)
{
enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc_compat(regs);
exit_to_user_mode(regs);
}


@ -986,9 +986,13 @@ SYM_CODE_START(__sdei_asm_handler)
mov x19, x1
#if defined(CONFIG_VMAP_STACK) || defined(CONFIG_SHADOW_CALL_STACK)
/* Store the registered-event for crash_smp_send_stop() */
ldrb w4, [x19, #SDEI_EVENT_PRIORITY]
#endif
cbnz w4, 1f
adr_this_cpu dst=x5, sym=sdei_active_normal_event, tmp=x6
b 2f
1: adr_this_cpu dst=x5, sym=sdei_active_critical_event, tmp=x6
2: str x19, [x5]
#ifdef CONFIG_VMAP_STACK
/*
@ -1055,6 +1059,14 @@ SYM_CODE_START(__sdei_asm_handler)
ldr_l x2, sdei_exit_mode
/* Clear the registered-event seen by crash_smp_send_stop() */
ldrb w3, [x4, #SDEI_EVENT_PRIORITY]
cbnz w3, 1f
adr_this_cpu dst=x5, sym=sdei_active_normal_event, tmp=x6
b 2f
1: adr_this_cpu dst=x5, sym=sdei_active_critical_event, tmp=x6
2: str xzr, [x5]
alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
sdei_handler_exit exit_mode=x2
alternative_else_nop_endif
@ -1065,4 +1077,15 @@ alternative_else_nop_endif
#endif
SYM_CODE_END(__sdei_asm_handler)
NOKPROBE(__sdei_asm_handler)
SYM_CODE_START(__sdei_handler_abort)
mov_q x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
adr x1, 1f
ldr_l x2, sdei_exit_mode
sdei_handler_exit exit_mode=x2
// exit the handler and jump to the next instruction.
// Exit will stomp x0-x17, PSTATE, ELR_ELx, and SPSR_ELx.
1: ret
SYM_CODE_END(__sdei_handler_abort)
NOKPROBE(__sdei_handler_abort)
#endif /* CONFIG_ARM_SDE_INTERFACE */


@ -1179,9 +1179,6 @@ void sve_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
*/
u64 read_zcr_features(void)
{
u64 zcr;
unsigned int vq_max;
/*
* Set the maximum possible VL, and write zeroes to all other
* bits to see if they stick.
@ -1189,12 +1186,8 @@ u64 read_zcr_features(void)
sve_kernel_enable(NULL);
write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
zcr = read_sysreg_s(SYS_ZCR_EL1);
zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
vq_max = sve_vq_from_vl(sve_get_vl());
zcr |= vq_max - 1; /* set LEN field to maximum effective value */
return zcr;
/* Return LEN value that would be written to get the maximum VL */
return sve_vq_from_vl(sve_get_vl()) - 1;
}
void __init sve_setup(void)
@ -1349,9 +1342,6 @@ void fa64_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
*/
u64 read_smcr_features(void)
{
u64 smcr;
unsigned int vq_max;
sme_kernel_enable(NULL);
/*
@ -1360,12 +1350,8 @@ u64 read_smcr_features(void)
write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_LEN_MASK,
SYS_SMCR_EL1);
smcr = read_sysreg_s(SYS_SMCR_EL1);
smcr &= ~(u64)SMCR_ELx_LEN_MASK; /* Only the LEN field */
vq_max = sve_vq_from_vl(sme_get_vl());
smcr |= vq_max - 1; /* set LEN field to maximum effective value */
return smcr;
/* Return LEN value that would be written to get the maximum VL */
return sve_vq_from_vl(sme_get_vl()) - 1;
}
void __init sme_setup(void)
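
The encoding relied on above is architectural rather than new to this
patch: for a vector length of VL bytes, ZCR_ELx.LEN and SMCR_ELx.LEN
hold VL/16 - 1, so "quadwords minus one" is exactly the LEN value that
selects the maximum vector length:

/* e.g. a part whose maximum SVE VL is 512 bits (64 bytes) */
unsigned int vl  = 64;				/* what sve_get_vl() returns */
unsigned int len = sve_vq_from_vl(vl) - 1;	/* 64/16 - 1 == 3 */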


@ -113,7 +113,7 @@ SYM_CODE_START(primary_entry)
*/
#if VA_BITS > 48
mrs_s x0, SYS_ID_AA64MMFR2_EL1
tst x0, #0xf << ID_AA64MMFR2_EL1_VARange_SHIFT
tst x0, ID_AA64MMFR2_EL1_VARange_MASK
mov x0, #VA_BITS
mov x25, #VA_BITS_MIN
csel x25, x25, x0, eq
@ -756,7 +756,7 @@ SYM_FUNC_START(__cpu_secondary_check52bitva)
b.ne 2f
mrs_s x0, SYS_ID_AA64MMFR2_EL1
and x0, x0, #(0xf << ID_AA64MMFR2_EL1_VARange_SHIFT)
and x0, x0, ID_AA64MMFR2_EL1_VARange_MASK
cbnz x0, 2f
update_early_cpu_boot_status \


@ -654,7 +654,7 @@ static int breakpoint_handler(unsigned long unused, unsigned long esr,
perf_bp_event(bp, regs);
/* Do we need to handle the stepping? */
if (is_default_overflow_handler(bp))
if (uses_default_overflow_handler(bp))
step = 1;
unlock:
rcu_read_unlock();
@ -733,7 +733,7 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
static int watchpoint_report(struct perf_event *wp, unsigned long addr,
struct pt_regs *regs)
{
int step = is_default_overflow_handler(wp);
int step = uses_default_overflow_handler(wp);
struct arch_hw_breakpoint *info = counter_arch_bp(wp);
info->trigger = addr;


@ -262,9 +262,9 @@ static __init void __parse_cmdline(const char *cmdline, bool parse_aliases)
if (!len)
return;
len = min(len, ARRAY_SIZE(buf) - 1);
strncpy(buf, cmdline, len);
buf[len] = 0;
len = strscpy(buf, cmdline, ARRAY_SIZE(buf));
if (len == -E2BIG)
len = ARRAY_SIZE(buf) - 1;
if (strcmp(buf, "--") == 0)
return;
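
The semantic difference being leaned on (standard kernel string API
behaviour; the example values are illustrative): strscpy() always
NUL-terminates and reports truncation, so the manual terminator write
goes away:

char buf[8];
ssize_t n = strscpy(buf, "kpti=off nokaslr", sizeof(buf));
/* n == -E2BIG: the source was truncated and buf holds "kpti=of\0";
 * on success n would be the number of characters copied, excluding
 * the NUL terminator */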


@ -11,8 +11,6 @@
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/of_pci.h>
#include <linux/of_platform.h>
#include <linux/pci.h>
#include <linux/pci-acpi.h>
#include <linux/pci-ecam.h>


@ -891,7 +891,8 @@ static int sve_set_common(struct task_struct *target,
break;
default:
WARN_ON_ONCE(1);
return -EINVAL;
ret = -EINVAL;
goto out;
}
/*


@ -47,6 +47,9 @@ DEFINE_PER_CPU(unsigned long *, sdei_shadow_call_stack_normal_ptr);
DEFINE_PER_CPU(unsigned long *, sdei_shadow_call_stack_critical_ptr);
#endif
DEFINE_PER_CPU(struct sdei_registered_event *, sdei_active_normal_event);
DEFINE_PER_CPU(struct sdei_registered_event *, sdei_active_critical_event);
static void _free_sdei_stack(unsigned long * __percpu *ptr, int cpu)
{
unsigned long *p;


@ -1044,10 +1044,8 @@ void crash_smp_send_stop(void)
* If this cpu is the only one alive at this point in time, online or
* not, there are no stop messages to be sent around, so just back out.
*/
if (num_other_online_cpus() == 0) {
sdei_mask_local_cpu();
return;
}
if (num_other_online_cpus() == 0)
goto skip_ipi;
cpumask_copy(&mask, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), &mask);
@ -1066,7 +1064,9 @@ void crash_smp_send_stop(void)
pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
cpumask_pr_args(&mask));
skip_ipi:
sdei_mask_local_cpu();
sdei_handler_abort();
}
bool smp_crash_stop_failed(void)


@ -8,7 +8,6 @@
#include <linux/randomize_kstack.h>
#include <linux/syscalls.h>
#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
#include <asm/exception.h>
#include <asm/fpsimd.h>
@ -101,8 +100,6 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
* (Similarly for HVC and SMC elsewhere.)
*/
local_daif_restore(DAIF_PROCCTX);
if (flags & _TIF_MTE_ASYNC_FAULT) {
/*
* Process the asynchronous tag check fault before the actual
@ -153,38 +150,8 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
syscall_trace_exit(regs);
}
/*
* As per the ABI exit SME streaming mode and clear the SVE state not
* shared with FPSIMD on syscall entry.
*/
static inline void fp_user_discard(void)
{
/*
* If SME is active then exit streaming mode. If ZA is active
* then flush the SVE registers but leave userspace access to
* both SVE and SME enabled, otherwise disable SME for the
* task and fall through to disabling SVE too. This means
* that after a syscall we never have any streaming mode
* register state to track, if this changes the KVM code will
* need updating.
*/
if (system_supports_sme())
sme_smstop_sm();
if (!system_supports_sve())
return;
if (test_thread_flag(TIF_SVE)) {
unsigned int sve_vq_minus_one;
sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
sve_flush_live(true, sve_vq_minus_one);
}
}
void do_el0_svc(struct pt_regs *regs)
{
fp_user_discard();
el0_svc_common(regs, regs->regs[8], __NR_syscalls, sys_call_table);
}


@ -50,9 +50,7 @@ SECTIONS
. = ALIGN(4);
.altinstructions : {
__alt_instructions = .;
*(.altinstructions)
__alt_instructions_end = .;
}
.dynamic : { *(.dynamic) } :text :dynamic


@ -73,6 +73,33 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
#define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20)
/*
* To make optimal use of block mappings when laying out the linear
* mapping, round down the base of physical memory to a size that can
* be mapped efficiently, i.e., either PUD_SIZE (4k granule) or PMD_SIZE
* (64k granule), or a multiple that can be mapped using contiguous bits
* in the page tables: 32 * PMD_SIZE (16k granule)
*/
#if defined(CONFIG_ARM64_4K_PAGES)
#define ARM64_MEMSTART_SHIFT PUD_SHIFT
#elif defined(CONFIG_ARM64_16K_PAGES)
#define ARM64_MEMSTART_SHIFT CONT_PMD_SHIFT
#else
#define ARM64_MEMSTART_SHIFT PMD_SHIFT
#endif
/*
* sparsemem vmemmap imposes an additional requirement on the alignment of
* memstart_addr, due to the fact that the base of the vmemmap region
* has a direct correspondence, and needs to appear sufficiently aligned
* in the virtual address space.
*/
#if ARM64_MEMSTART_SHIFT < SECTION_SIZE_BITS
#define ARM64_MEMSTART_ALIGN (1UL << SECTION_SIZE_BITS)
#else
#define ARM64_MEMSTART_ALIGN (1UL << ARM64_MEMSTART_SHIFT)
#endif
static int __init reserve_crashkernel_low(unsigned long long low_size)
{
unsigned long long low_base;


@ -447,7 +447,7 @@ SYM_FUNC_START(__cpu_setup)
* via capabilities.
*/
mrs x9, ID_AA64MMFR1_EL1
and x9, x9, #0xf
and x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
cbz x9, 1f
orr tcr, tcr, #TCR_HA // hardware Access flag update
1:


@ -1708,7 +1708,10 @@ static void __init arm_smmu_v3_pmcg_init_resources(struct resource *res,
static struct acpi_platform_list pmcg_plat_info[] __initdata = {
/* HiSilicon Hip08 Platform */
{"HISI ", "HIP08 ", 0, ACPI_SIG_IORT, greater_than_or_equal,
"Erratum #162001800", IORT_SMMU_V3_PMCG_HISI_HIP08},
"Erratum #162001800, Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP08},
/* HiSilicon Hip09 Platform */
{"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal,
"Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09},
{ }
};


@ -1095,3 +1095,22 @@ int sdei_event_handler(struct pt_regs *regs,
return err;
}
NOKPROBE_SYMBOL(sdei_event_handler);
void sdei_handler_abort(void)
{
/*
* If the crash happened in an SDEI event handler then we need to
* finish the handler with the firmware so that we can have working
* interrupts in the crash kernel.
*/
if (__this_cpu_read(sdei_active_critical_event)) {
pr_warn("still in SDEI critical event context, attempting to finish handler.\n");
__sdei_handler_abort();
__this_cpu_write(sdei_active_critical_event, NULL);
}
if (__this_cpu_read(sdei_active_normal_event)) {
pr_warn("still in SDEI normal event context, attempting to finish handler.\n");
__sdei_handler_abort();
__this_cpu_write(sdei_active_normal_event, NULL);
}
}


@ -92,7 +92,7 @@ config ARM_PMU_ACPI
config ARM_SMMU_V3_PMU
tristate "ARM SMMUv3 Performance Monitors Extension"
depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT)
depends on ARM64 || (COMPILE_TEST && 64BIT)
depends on GENERIC_MSI_IRQ
help
Provides support for the ARM SMMUv3 Performance Monitor Counter


@ -236,10 +236,37 @@ static const struct attribute_group ali_drw_pmu_cpumask_attr_group = {
.attrs = ali_drw_pmu_cpumask_attrs,
};
static ssize_t ali_drw_pmu_identifier_show(struct device *dev,
struct device_attribute *attr,
char *page)
{
return sysfs_emit(page, "%s\n", "ali_drw_pmu");
}
static umode_t ali_drw_pmu_identifier_attr_visible(struct kobject *kobj,
struct attribute *attr, int n)
{
return attr->mode;
}
static struct device_attribute ali_drw_pmu_identifier_attr =
__ATTR(identifier, 0444, ali_drw_pmu_identifier_show, NULL);
static struct attribute *ali_drw_pmu_identifier_attrs[] = {
&ali_drw_pmu_identifier_attr.attr,
NULL
};
static const struct attribute_group ali_drw_pmu_identifier_attr_group = {
.attrs = ali_drw_pmu_identifier_attrs,
.is_visible = ali_drw_pmu_identifier_attr_visible
};
static const struct attribute_group *ali_drw_pmu_attr_groups[] = {
&ali_drw_pmu_events_attr_group,
&ali_drw_pmu_cpumask_attr_group,
&ali_drw_pmu_format_group,
&ali_drw_pmu_identifier_attr_group,
NULL,
};


@ -9,8 +9,6 @@
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/of_irq.h>
#include <linux/perf_event.h>
#include <linux/platform_device.h>
#include <linux/printk.h>


@ -7,10 +7,7 @@
#include <linux/io.h>
#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of_irq.h>
#include <linux/of_platform.h>
#include <linux/of.h>
#include <linux/perf_event.h>
#include <linux/platform_device.h>
#include <linux/slab.h>


@ -72,6 +72,8 @@
/* For most nodes, this is all there is */
#define CMN_PMU_EVENT_SEL 0x000
#define CMN__PMU_CBUSY_SNTHROTTLE_SEL GENMASK_ULL(44, 42)
#define CMN__PMU_SN_HOME_SEL GENMASK_ULL(40, 39)
#define CMN__PMU_HBT_LBT_SEL GENMASK_ULL(38, 37)
#define CMN__PMU_CLASS_OCCUP_ID GENMASK_ULL(36, 35)
/* Technically this is 4 bits wide on DNs, but we only use 2 there anyway */
#define CMN__PMU_OCCUP1_ID GENMASK_ULL(34, 32)
@ -226,6 +228,7 @@ enum cmn_revision {
REV_CMN700_R0P0 = 0,
REV_CMN700_R1P0,
REV_CMN700_R2P0,
REV_CMN700_R3P0,
REV_CI700_R0P0 = 0,
REV_CI700_R1P0,
REV_CI700_R2P0,
@ -254,6 +257,9 @@ enum cmn_node_type {
CMN_TYPE_CCHA,
CMN_TYPE_CCLA,
CMN_TYPE_CCLA_RNI,
CMN_TYPE_HNS = 0x200,
CMN_TYPE_HNS_MPAM_S,
CMN_TYPE_HNS_MPAM_NS,
/* Not a real node type */
CMN_TYPE_WP = 0x7770
};
@ -263,6 +269,8 @@ enum cmn_filter_select {
SEL_OCCUP1ID,
SEL_CLASS_OCCUP_ID,
SEL_CBUSY_SNTHROTTLE_SEL,
SEL_HBT_LBT_SEL,
SEL_SN_HOME_SEL,
SEL_MAX
};
@ -742,8 +750,8 @@ static umode_t arm_cmn_event_attr_is_visible(struct kobject *kobj,
_CMN_EVENT_ATTR(_model, dn_##_name, CMN_TYPE_DVM, _event, _occup, _fsel)
#define CMN_EVENT_DTC(_name) \
CMN_EVENT_ATTR(CMN_ANY, dtc_##_name, CMN_TYPE_DTC, 0)
#define _CMN_EVENT_HNF(_model, _name, _event, _occup, _fsel) \
_CMN_EVENT_ATTR(_model, hnf_##_name, CMN_TYPE_HNF, _event, _occup, _fsel)
#define CMN_EVENT_HNF(_model, _name, _event) \
CMN_EVENT_ATTR(_model, hnf_##_name, CMN_TYPE_HNF, _event)
#define CMN_EVENT_HNI(_name, _event) \
CMN_EVENT_ATTR(CMN_ANY, hni_##_name, CMN_TYPE_HNI, _event)
#define CMN_EVENT_HNP(_name, _event) \
@ -768,6 +776,8 @@ static umode_t arm_cmn_event_attr_is_visible(struct kobject *kobj,
CMN_EVENT_ATTR(CMN_ANY, ccla_##_name, CMN_TYPE_CCLA, _event)
#define CMN_EVENT_CCLA_RNI(_name, _event) \
CMN_EVENT_ATTR(CMN_ANY, ccla_rni_##_name, CMN_TYPE_CCLA_RNI, _event)
#define CMN_EVENT_HNS(_name, _event) \
CMN_EVENT_ATTR(CMN_ANY, hns_##_name, CMN_TYPE_HNS, _event)
#define CMN_EVENT_DVM(_model, _name, _event) \
_CMN_EVENT_DVM(_model, _name, _event, 0, SEL_NONE)
@ -775,32 +785,68 @@ static umode_t arm_cmn_event_attr_is_visible(struct kobject *kobj,
_CMN_EVENT_DVM(_model, _name##_all, _event, 0, SEL_OCCUP1ID), \
_CMN_EVENT_DVM(_model, _name##_dvmop, _event, 1, SEL_OCCUP1ID), \
_CMN_EVENT_DVM(_model, _name##_dvmsync, _event, 2, SEL_OCCUP1ID)
#define CMN_EVENT_HNF(_model, _name, _event) \
_CMN_EVENT_HNF(_model, _name, _event, 0, SEL_NONE)
#define CMN_EVENT_HNF_CLS(_model, _name, _event) \
_CMN_EVENT_HNF(_model, _name##_class0, _event, 0, SEL_CLASS_OCCUP_ID), \
_CMN_EVENT_HNF(_model, _name##_class1, _event, 1, SEL_CLASS_OCCUP_ID), \
_CMN_EVENT_HNF(_model, _name##_class2, _event, 2, SEL_CLASS_OCCUP_ID), \
_CMN_EVENT_HNF(_model, _name##_class3, _event, 3, SEL_CLASS_OCCUP_ID)
#define CMN_EVENT_HNF_SNT(_model, _name, _event) \
_CMN_EVENT_HNF(_model, _name##_all, _event, 0, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_HNF(_model, _name##_group0_read, _event, 1, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_HNF(_model, _name##_group0_write, _event, 2, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_HNF(_model, _name##_group1_read, _event, 3, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_HNF(_model, _name##_group1_write, _event, 4, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_HNF(_model, _name##_read, _event, 5, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_HNF(_model, _name##_write, _event, 6, SEL_CBUSY_SNTHROTTLE_SEL)
#define _CMN_EVENT_XP(_name, _event) \
#define CMN_EVENT_HN_OCC(_model, _name, _type, _event) \
_CMN_EVENT_ATTR(_model, _name##_all, _type, _event, 0, SEL_OCCUP1ID), \
_CMN_EVENT_ATTR(_model, _name##_read, _type, _event, 1, SEL_OCCUP1ID), \
_CMN_EVENT_ATTR(_model, _name##_write, _type, _event, 2, SEL_OCCUP1ID), \
_CMN_EVENT_ATTR(_model, _name##_atomic, _type, _event, 3, SEL_OCCUP1ID), \
_CMN_EVENT_ATTR(_model, _name##_stash, _type, _event, 4, SEL_OCCUP1ID)
#define CMN_EVENT_HN_CLS(_model, _name, _type, _event) \
_CMN_EVENT_ATTR(_model, _name##_class0, _type, _event, 0, SEL_CLASS_OCCUP_ID), \
_CMN_EVENT_ATTR(_model, _name##_class1, _type, _event, 1, SEL_CLASS_OCCUP_ID), \
_CMN_EVENT_ATTR(_model, _name##_class2, _type, _event, 2, SEL_CLASS_OCCUP_ID), \
_CMN_EVENT_ATTR(_model, _name##_class3, _type, _event, 3, SEL_CLASS_OCCUP_ID)
#define CMN_EVENT_HN_SNT(_model, _name, _type, _event) \
_CMN_EVENT_ATTR(_model, _name##_all, _type, _event, 0, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_ATTR(_model, _name##_group0_read, _type, _event, 1, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_ATTR(_model, _name##_group0_write, _type, _event, 2, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_ATTR(_model, _name##_group1_read, _type, _event, 3, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_ATTR(_model, _name##_group1_write, _type, _event, 4, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_ATTR(_model, _name##_read, _type, _event, 5, SEL_CBUSY_SNTHROTTLE_SEL), \
_CMN_EVENT_ATTR(_model, _name##_write, _type, _event, 6, SEL_CBUSY_SNTHROTTLE_SEL)
#define CMN_EVENT_HNF_OCC(_model, _name, _event) \
CMN_EVENT_HN_OCC(_model, hnf_##_name, CMN_TYPE_HNF, _event)
#define CMN_EVENT_HNF_CLS(_model, _name, _event) \
CMN_EVENT_HN_CLS(_model, hnf_##_name, CMN_TYPE_HNS, _event)
#define CMN_EVENT_HNF_SNT(_model, _name, _event) \
CMN_EVENT_HN_SNT(_model, hnf_##_name, CMN_TYPE_HNF, _event)
#define CMN_EVENT_HNS_OCC(_name, _event) \
CMN_EVENT_HN_OCC(CMN_ANY, hns_##_name, CMN_TYPE_HNS, _event), \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_rxsnp, CMN_TYPE_HNS, _event, 5, SEL_OCCUP1ID), \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_lbt, CMN_TYPE_HNS, _event, 6, SEL_OCCUP1ID), \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_hbt, CMN_TYPE_HNS, _event, 7, SEL_OCCUP1ID)
#define CMN_EVENT_HNS_CLS( _name, _event) \
CMN_EVENT_HN_CLS(CMN_ANY, hns_##_name, CMN_TYPE_HNS, _event)
#define CMN_EVENT_HNS_SNT(_name, _event) \
CMN_EVENT_HN_SNT(CMN_ANY, hns_##_name, CMN_TYPE_HNS, _event)
#define CMN_EVENT_HNS_HBT(_name, _event) \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_all, CMN_TYPE_HNS, _event, 0, SEL_HBT_LBT_SEL), \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_hbt, CMN_TYPE_HNS, _event, 1, SEL_HBT_LBT_SEL), \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_lbt, CMN_TYPE_HNS, _event, 2, SEL_HBT_LBT_SEL)
#define CMN_EVENT_HNS_SNH(_name, _event) \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_all, CMN_TYPE_HNS, _event, 0, SEL_SN_HOME_SEL), \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_sn, CMN_TYPE_HNS, _event, 1, SEL_SN_HOME_SEL), \
_CMN_EVENT_ATTR(CMN_ANY, hns_##_name##_home, CMN_TYPE_HNS, _event, 2, SEL_SN_HOME_SEL)
#define _CMN_EVENT_XP_MESH(_name, _event) \
__CMN_EVENT_XP(e_##_name, (_event) | (0 << 2)), \
__CMN_EVENT_XP(w_##_name, (_event) | (1 << 2)), \
__CMN_EVENT_XP(n_##_name, (_event) | (2 << 2)), \
__CMN_EVENT_XP(s_##_name, (_event) | (3 << 2)), \
__CMN_EVENT_XP(s_##_name, (_event) | (3 << 2))
#define _CMN_EVENT_XP_PORT(_name, _event) \
__CMN_EVENT_XP(p0_##_name, (_event) | (4 << 2)), \
__CMN_EVENT_XP(p1_##_name, (_event) | (5 << 2)), \
__CMN_EVENT_XP(p2_##_name, (_event) | (6 << 2)), \
__CMN_EVENT_XP(p3_##_name, (_event) | (7 << 2))
#define _CMN_EVENT_XP(_name, _event) \
_CMN_EVENT_XP_MESH(_name, _event), \
_CMN_EVENT_XP_PORT(_name, _event)
/* Good thing there are only 3 fundamental XP events... */
#define CMN_EVENT_XP(_name, _event) \
_CMN_EVENT_XP(req_##_name, (_event) | (0 << 5)), \
@ -813,6 +859,10 @@ static umode_t arm_cmn_event_attr_is_visible(struct kobject *kobj,
_CMN_EVENT_XP(snp2_##_name, (_event) | (7 << 5)), \
_CMN_EVENT_XP(req2_##_name, (_event) | (8 << 5))
#define CMN_EVENT_XP_DAT(_name, _event) \
_CMN_EVENT_XP_PORT(dat_##_name, (_event) | (3 << 5)), \
_CMN_EVENT_XP_PORT(dat2_##_name, (_event) | (6 << 5))
static struct attribute *arm_cmn_event_attrs[] = {
CMN_EVENT_DTC(cycles),
@ -862,11 +912,7 @@ static struct attribute *arm_cmn_event_attrs[] = {
CMN_EVENT_HNF(CMN_ANY, mc_retries, 0x0c),
CMN_EVENT_HNF(CMN_ANY, mc_reqs, 0x0d),
CMN_EVENT_HNF(CMN_ANY, qos_hh_retry, 0x0e),
_CMN_EVENT_HNF(CMN_ANY, qos_pocq_occupancy_all, 0x0f, 0, SEL_OCCUP1ID),
_CMN_EVENT_HNF(CMN_ANY, qos_pocq_occupancy_read, 0x0f, 1, SEL_OCCUP1ID),
_CMN_EVENT_HNF(CMN_ANY, qos_pocq_occupancy_write, 0x0f, 2, SEL_OCCUP1ID),
_CMN_EVENT_HNF(CMN_ANY, qos_pocq_occupancy_atomic, 0x0f, 3, SEL_OCCUP1ID),
_CMN_EVENT_HNF(CMN_ANY, qos_pocq_occupancy_stash, 0x0f, 4, SEL_OCCUP1ID),
CMN_EVENT_HNF_OCC(CMN_ANY, qos_pocq_occupancy, 0x0f),
CMN_EVENT_HNF(CMN_ANY, pocq_addrhaz, 0x10),
CMN_EVENT_HNF(CMN_ANY, pocq_atomic_addrhaz, 0x11),
CMN_EVENT_HNF(CMN_ANY, ld_st_swp_adq_full, 0x12),
@ -943,7 +989,7 @@ static struct attribute *arm_cmn_event_attrs[] = {
CMN_EVENT_XP(txflit_valid, 0x01),
CMN_EVENT_XP(txflit_stall, 0x02),
CMN_EVENT_XP(partial_dat_flit, 0x03),
CMN_EVENT_XP_DAT(partial_dat_flit, 0x03),
/* We treat watchpoints as a special made-up class of XP events */
CMN_EVENT_ATTR(CMN_ANY, watchpoint_up, CMN_TYPE_WP, CMN_WP_UP),
CMN_EVENT_ATTR(CMN_ANY, watchpoint_down, CMN_TYPE_WP, CMN_WP_DOWN),
@ -1132,6 +1178,66 @@ static struct attribute *arm_cmn_event_attrs[] = {
CMN_EVENT_CCLA(pfwd_sndr_stalls_static_crd, 0x2a),
CMN_EVENT_CCLA(pfwd_sndr_stalls_dynmaic_crd, 0x2b),
CMN_EVENT_HNS_HBT(cache_miss, 0x01),
CMN_EVENT_HNS_HBT(slc_sf_cache_access, 0x02),
CMN_EVENT_HNS_HBT(cache_fill, 0x03),
CMN_EVENT_HNS_HBT(pocq_retry, 0x04),
CMN_EVENT_HNS_HBT(pocq_reqs_recvd, 0x05),
CMN_EVENT_HNS_HBT(sf_hit, 0x06),
CMN_EVENT_HNS_HBT(sf_evictions, 0x07),
CMN_EVENT_HNS(dir_snoops_sent, 0x08),
CMN_EVENT_HNS(brd_snoops_sent, 0x09),
CMN_EVENT_HNS_HBT(slc_eviction, 0x0a),
CMN_EVENT_HNS_HBT(slc_fill_invalid_way, 0x0b),
CMN_EVENT_HNS(mc_retries_local, 0x0c),
CMN_EVENT_HNS_SNH(mc_reqs_local, 0x0d),
CMN_EVENT_HNS(qos_hh_retry, 0x0e),
CMN_EVENT_HNS_OCC(qos_pocq_occupancy, 0x0f),
CMN_EVENT_HNS(pocq_addrhaz, 0x10),
CMN_EVENT_HNS(pocq_atomic_addrhaz, 0x11),
CMN_EVENT_HNS(ld_st_swp_adq_full, 0x12),
CMN_EVENT_HNS(cmp_adq_full, 0x13),
CMN_EVENT_HNS(txdat_stall, 0x14),
CMN_EVENT_HNS(txrsp_stall, 0x15),
CMN_EVENT_HNS(seq_full, 0x16),
CMN_EVENT_HNS(seq_hit, 0x17),
CMN_EVENT_HNS(snp_sent, 0x18),
CMN_EVENT_HNS(sfbi_dir_snp_sent, 0x19),
CMN_EVENT_HNS(sfbi_brd_snp_sent, 0x1a),
CMN_EVENT_HNS(intv_dirty, 0x1c),
CMN_EVENT_HNS(stash_snp_sent, 0x1d),
CMN_EVENT_HNS(stash_data_pull, 0x1e),
CMN_EVENT_HNS(snp_fwded, 0x1f),
CMN_EVENT_HNS(atomic_fwd, 0x20),
CMN_EVENT_HNS(mpam_hardlim, 0x21),
CMN_EVENT_HNS(mpam_softlim, 0x22),
CMN_EVENT_HNS(snp_sent_cluster, 0x23),
CMN_EVENT_HNS(sf_imprecise_evict, 0x24),
CMN_EVENT_HNS(sf_evict_shared_line, 0x25),
CMN_EVENT_HNS_CLS(pocq_class_occup, 0x26),
CMN_EVENT_HNS_CLS(pocq_class_retry, 0x27),
CMN_EVENT_HNS_CLS(class_mc_reqs_local, 0x28),
CMN_EVENT_HNS_CLS(class_cgnt_cmin, 0x29),
CMN_EVENT_HNS_SNT(sn_throttle, 0x2a),
CMN_EVENT_HNS_SNT(sn_throttle_min, 0x2b),
CMN_EVENT_HNS(sf_precise_to_imprecise, 0x2c),
CMN_EVENT_HNS(snp_intv_cln, 0x2d),
CMN_EVENT_HNS(nc_excl, 0x2e),
CMN_EVENT_HNS(excl_mon_ovfl, 0x2f),
CMN_EVENT_HNS(snp_req_recvd, 0x30),
CMN_EVENT_HNS(snp_req_byp_pocq, 0x31),
CMN_EVENT_HNS(dir_ccgha_snp_sent, 0x32),
CMN_EVENT_HNS(brd_ccgha_snp_sent, 0x33),
CMN_EVENT_HNS(ccgha_snp_stall, 0x34),
CMN_EVENT_HNS(lbt_req_hardlim, 0x35),
CMN_EVENT_HNS(hbt_req_hardlim, 0x36),
CMN_EVENT_HNS(sf_reupdate, 0x37),
CMN_EVENT_HNS(excl_sf_imprecise, 0x38),
CMN_EVENT_HNS(snp_pocq_addrhaz, 0x39),
CMN_EVENT_HNS(mc_retries_remote, 0x3a),
CMN_EVENT_HNS_SNH(mc_reqs_remote, 0x3b),
CMN_EVENT_HNS_CLS(class_mc_reqs_remote, 0x3c),
NULL
};
@ -1373,6 +1479,10 @@ static int arm_cmn_set_event_sel_hi(struct arm_cmn_node *dn,
dn->occupid[fsel].val = occupid;
reg = FIELD_PREP(CMN__PMU_CBUSY_SNTHROTTLE_SEL,
dn->occupid[SEL_CBUSY_SNTHROTTLE_SEL].val) |
FIELD_PREP(CMN__PMU_SN_HOME_SEL,
dn->occupid[SEL_SN_HOME_SEL].val) |
FIELD_PREP(CMN__PMU_HBT_LBT_SEL,
dn->occupid[SEL_HBT_LBT_SEL].val) |
FIELD_PREP(CMN__PMU_CLASS_OCCUP_ID,
dn->occupid[SEL_CLASS_OCCUP_ID].val) |
FIELD_PREP(CMN__PMU_OCCUP1_ID,
@ -2200,6 +2310,7 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
case CMN_TYPE_CCRA:
case CMN_TYPE_CCHA:
case CMN_TYPE_CCLA:
case CMN_TYPE_HNS:
dn++;
break;
/* Nothing to see here */
@ -2207,6 +2318,8 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
case CMN_TYPE_MPAM_NS:
case CMN_TYPE_RNSAM:
case CMN_TYPE_CXLA:
case CMN_TYPE_HNS_MPAM_S:
case CMN_TYPE_HNS_MPAM_NS:
break;
/*
* Split "optimised" combination nodes into separate


@ -66,8 +66,13 @@
#define DMC620_PMU_COUNTERn_OFFSET(n) \
(DMC620_PMU_COUNTERS_BASE + 0x28 * (n))
static LIST_HEAD(dmc620_pmu_irqs);
/*
* dmc620_pmu_irqs_lock: protects dmc620_pmu_irqs list
* dmc620_pmu_node_lock: protects pmus_node lists in all dmc620_pmu instances
*/
static DEFINE_MUTEX(dmc620_pmu_irqs_lock);
static DEFINE_MUTEX(dmc620_pmu_node_lock);
static LIST_HEAD(dmc620_pmu_irqs);
struct dmc620_pmu_irq {
struct hlist_node node;
@ -475,9 +480,9 @@ static int dmc620_pmu_get_irq(struct dmc620_pmu *dmc620_pmu, int irq_num)
return PTR_ERR(irq);
dmc620_pmu->irq = irq;
mutex_lock(&dmc620_pmu_irqs_lock);
mutex_lock(&dmc620_pmu_node_lock);
list_add_rcu(&dmc620_pmu->pmus_node, &irq->pmus_node);
mutex_unlock(&dmc620_pmu_irqs_lock);
mutex_unlock(&dmc620_pmu_node_lock);
return 0;
}
@ -486,9 +491,11 @@ static void dmc620_pmu_put_irq(struct dmc620_pmu *dmc620_pmu)
{
struct dmc620_pmu_irq *irq = dmc620_pmu->irq;
mutex_lock(&dmc620_pmu_irqs_lock);
mutex_lock(&dmc620_pmu_node_lock);
list_del_rcu(&dmc620_pmu->pmus_node);
mutex_unlock(&dmc620_pmu_node_lock);
mutex_lock(&dmc620_pmu_irqs_lock);
if (!refcount_dec_and_test(&irq->refcount)) {
mutex_unlock(&dmc620_pmu_irqs_lock);
return;
@ -638,10 +645,10 @@ static int dmc620_pmu_cpu_teardown(unsigned int cpu,
return 0;
/* We're only reading, but this isn't the place to be involving RCU */
mutex_lock(&dmc620_pmu_irqs_lock);
mutex_lock(&dmc620_pmu_node_lock);
list_for_each_entry(dmc620_pmu, &irq->pmus_node, pmus_node)
perf_pmu_migrate_context(&dmc620_pmu->pmu, irq->cpu, target);
mutex_unlock(&dmc620_pmu_irqs_lock);
mutex_unlock(&dmc620_pmu_node_lock);
WARN_ON(irq_set_affinity(irq->irq_num, cpumask_of(target)));
irq->cpu = target;
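
The circular dependency this breaks, as I read the lockdep report named
in the shortlog (pseudo-trace, not code from the patch):

/*
 * cpuhp teardown:	cpu_hotplug_lock -> dmc620_pmu_irqs_lock  (old code)
 * irq setup:		dmc620_pmu_irqs_lock -> cpu_hotplug_lock
 *			(cpuhp_state_add_instance() takes the hotplug lock
 *			 while __dmc620_pmu_get_irq() holds the irqs lock)
 *
 * With pmus_node under its own dmc620_pmu_node_lock, the hotplug
 * callback no longer touches dmc620_pmu_irqs_lock and the cycle is gone.
 */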


@ -20,7 +20,7 @@
#include <linux/interrupt.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/of_device.h>
#include <linux/of.h>
#include <linux/perf_event.h>
#include <linux/platform_device.h>
#include <linux/spinlock.h>


@ -69,6 +69,62 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
acpi_unregister_gsi(gsi);
}
static int __maybe_unused
arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
u16 (*parse_gsi)(struct acpi_madt_generic_interrupt *))
{
int cpu, this_hetid, hetid, irq, ret;
u16 this_gsi = 0, gsi = 0;
/*
* Ensure that the platform device has an IORESOURCE_IRQ
* resource to hold the GSI interrupt.
*/
if (pdev->num_resources != 1)
return -ENXIO;
if (pdev->resource[0].flags != IORESOURCE_IRQ)
return -ENXIO;
/*
* Sanity check all the GICC tables for the same interrupt
* number. For now, only support homogeneous ACPI machines.
*/
for_each_possible_cpu(cpu) {
struct acpi_madt_generic_interrupt *gicc;
gicc = acpi_cpu_get_madt_gicc(cpu);
if (gicc->header.length < len)
return gsi ? -ENXIO : 0;
this_gsi = parse_gsi(gicc);
this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
if (!gsi) {
hetid = this_hetid;
gsi = this_gsi;
} else if (hetid != this_hetid || gsi != this_gsi) {
pr_warn("ACPI: %s: must be homogeneous\n", pdev->name);
return -ENXIO;
}
}
if (!this_gsi)
return 0;
irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
if (irq < 0) {
pr_warn("ACPI: %s Unable to register interrupt: %d\n", pdev->name, gsi);
return -ENXIO;
}
pdev->resource[0].start = irq;
ret = platform_device_register(pdev);
if (ret)
acpi_unregister_gsi(gsi);
return ret;
}
#if IS_ENABLED(CONFIG_ARM_SPE_PMU)
static struct resource spe_resources[] = {
{
@ -84,6 +140,11 @@ static struct platform_device spe_dev = {
.num_resources = ARRAY_SIZE(spe_resources)
};
static u16 arm_spe_parse_gsi(struct acpi_madt_generic_interrupt *gicc)
{
return gicc->spe_interrupt;
}
/*
* For lack of a better place, hook the normal PMU MADT walk
* and create a SPE device if we detect a recent MADT with
@ -91,47 +152,10 @@ static struct platform_device spe_dev = {
*/
static void arm_spe_acpi_register_device(void)
{
int cpu, hetid, irq, ret;
bool first = true;
u16 gsi = 0;
/*
* Sanity check all the GICC tables for the same interrupt number.
* For now, we only support homogeneous ACPI/SPE machines.
*/
for_each_possible_cpu(cpu) {
struct acpi_madt_generic_interrupt *gicc;
gicc = acpi_cpu_get_madt_gicc(cpu);
if (gicc->header.length < ACPI_MADT_GICC_SPE)
return;
if (first) {
gsi = gicc->spe_interrupt;
if (!gsi)
return;
hetid = find_acpi_cpu_topology_hetero_id(cpu);
first = false;
} else if ((gsi != gicc->spe_interrupt) ||
(hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
pr_warn("ACPI: SPE must be homogeneous\n");
return;
}
}
irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE,
ACPI_ACTIVE_HIGH);
if (irq < 0) {
pr_warn("ACPI: SPE Unable to register interrupt: %d\n", gsi);
return;
}
spe_resources[0].start = irq;
ret = platform_device_register(&spe_dev);
if (ret < 0) {
int ret = arm_acpi_register_pmu_device(&spe_dev, ACPI_MADT_GICC_SPE,
arm_spe_parse_gsi);
if (ret)
pr_warn("ACPI: SPE: Unable to register device\n");
acpi_unregister_gsi(gsi);
}
}
#else
static inline void arm_spe_acpi_register_device(void)
@ -139,6 +163,40 @@ static inline void arm_spe_acpi_register_device(void)
}
#endif /* CONFIG_ARM_SPE_PMU */
#if IS_ENABLED(CONFIG_CORESIGHT_TRBE)
static struct resource trbe_resources[] = {
{
/* irq */
.flags = IORESOURCE_IRQ,
}
};
static struct platform_device trbe_dev = {
.name = ARMV8_TRBE_PDEV_NAME,
.id = -1,
.resource = trbe_resources,
.num_resources = ARRAY_SIZE(trbe_resources)
};
static u16 arm_trbe_parse_gsi(struct acpi_madt_generic_interrupt *gicc)
{
return gicc->trbe_interrupt;
}
static void arm_trbe_acpi_register_device(void)
{
int ret = arm_acpi_register_pmu_device(&trbe_dev, ACPI_MADT_GICC_TRBE,
arm_trbe_parse_gsi);
if (ret)
pr_warn("ACPI: TRBE: Unable to register device\n");
}
#else
static inline void arm_trbe_acpi_register_device(void)
{
}
#endif /* CONFIG_CORESIGHT_TRBE */
static int arm_pmu_acpi_parse_irqs(void)
{
int irq, cpu, irq_cpu, err;
@ -374,6 +432,7 @@ static int arm_pmu_acpi_init(void)
return 0;
arm_spe_acpi_register_device();
arm_trbe_acpi_register_device();
return 0;
}
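
What this enables, per the pull description ("the coresight driver
changes will come later"): the CoreSight TRBE driver can eventually bind
to this platform device by name on ACPI systems instead of requiring a
DT node. Roughly, with hypothetical probe callbacks invented for
illustration:

static struct platform_driver arm_trbe_driver = {
	.driver = {
		.name	= ARMV8_TRBE_PDEV_NAME,	/* from this series */
	},
	.probe	= arm_trbe_device_probe,	/* hypothetical */
	.remove	= arm_trbe_device_remove,	/* hypothetical */
};
module_platform_driver(arm_trbe_driver);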


@ -16,7 +16,6 @@
#include <linux/irqdesc.h>
#include <linux/kconfig.h>
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/percpu.h>
#include <linux/perf/arm_pmu.h>
#include <linux/platform_device.h>


@ -721,38 +721,15 @@ static void armv8pmu_enable_event(struct perf_event *event)
* Enable counter and interrupt, and set the counter to count
* the event that we're interested in.
*/
/*
* Disable counter
*/
armv8pmu_disable_event_counter(event);
/*
* Set event.
*/
armv8pmu_write_event_type(event);
/*
* Enable interrupt for this counter
*/
armv8pmu_enable_event_irq(event);
/*
* Enable counter
*/
armv8pmu_enable_event_counter(event);
}
static void armv8pmu_disable_event(struct perf_event *event)
{
/*
* Disable counter
*/
armv8pmu_disable_event_counter(event);
/*
* Disable interrupt for this counter
*/
armv8pmu_disable_event_irq(event);
}
@ -1266,9 +1243,14 @@ PMUV3_INIT_SIMPLE(armv8_cortex_a76)
PMUV3_INIT_SIMPLE(armv8_cortex_a77)
PMUV3_INIT_SIMPLE(armv8_cortex_a78)
PMUV3_INIT_SIMPLE(armv9_cortex_a510)
PMUV3_INIT_SIMPLE(armv9_cortex_a520)
PMUV3_INIT_SIMPLE(armv9_cortex_a710)
PMUV3_INIT_SIMPLE(armv9_cortex_a715)
PMUV3_INIT_SIMPLE(armv9_cortex_a720)
PMUV3_INIT_SIMPLE(armv8_cortex_x1)
PMUV3_INIT_SIMPLE(armv9_cortex_x2)
PMUV3_INIT_SIMPLE(armv9_cortex_x3)
PMUV3_INIT_SIMPLE(armv9_cortex_x4)
PMUV3_INIT_SIMPLE(armv8_neoverse_e1)
PMUV3_INIT_SIMPLE(armv8_neoverse_n1)
PMUV3_INIT_SIMPLE(armv9_neoverse_n2)
@ -1334,9 +1316,14 @@ static const struct of_device_id armv8_pmu_of_device_ids[] = {
{.compatible = "arm,cortex-a77-pmu", .data = armv8_cortex_a77_pmu_init},
{.compatible = "arm,cortex-a78-pmu", .data = armv8_cortex_a78_pmu_init},
{.compatible = "arm,cortex-a510-pmu", .data = armv9_cortex_a510_pmu_init},
{.compatible = "arm,cortex-a520-pmu", .data = armv9_cortex_a520_pmu_init},
{.compatible = "arm,cortex-a710-pmu", .data = armv9_cortex_a710_pmu_init},
{.compatible = "arm,cortex-a715-pmu", .data = armv9_cortex_a715_pmu_init},
{.compatible = "arm,cortex-a720-pmu", .data = armv9_cortex_a720_pmu_init},
{.compatible = "arm,cortex-x1-pmu", .data = armv8_cortex_x1_pmu_init},
{.compatible = "arm,cortex-x2-pmu", .data = armv9_cortex_x2_pmu_init},
{.compatible = "arm,cortex-x3-pmu", .data = armv9_cortex_x3_pmu_init},
{.compatible = "arm,cortex-x4-pmu", .data = armv9_cortex_x4_pmu_init},
{.compatible = "arm,neoverse-e1-pmu", .data = armv8_neoverse_e1_pmu_init},
{.compatible = "arm,neoverse-n1-pmu", .data = armv8_neoverse_n1_pmu_init},
{.compatible = "arm,neoverse-n2-pmu", .data = armv9_neoverse_n2_pmu_init},


@ -115,6 +115,7 @@
#define SMMU_PMCG_PA_SHIFT 12
#define SMMU_PMCG_EVCNTR_RDONLY BIT(0)
#define SMMU_PMCG_HARDEN_DISABLE BIT(1)
static int cpuhp_state_num;
@ -159,6 +160,20 @@ static inline void smmu_pmu_enable(struct pmu *pmu)
writel(SMMU_PMCG_CR_ENABLE, smmu_pmu->reg_base + SMMU_PMCG_CR);
}
static int smmu_pmu_apply_event_filter(struct smmu_pmu *smmu_pmu,
struct perf_event *event, int idx);
static inline void smmu_pmu_enable_quirk_hip08_09(struct pmu *pmu)
{
struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu);
unsigned int idx;
for_each_set_bit(idx, smmu_pmu->used_counters, smmu_pmu->num_counters)
smmu_pmu_apply_event_filter(smmu_pmu, smmu_pmu->events[idx], idx);
smmu_pmu_enable(pmu);
}
static inline void smmu_pmu_disable(struct pmu *pmu)
{
struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu);
@ -167,6 +182,22 @@ static inline void smmu_pmu_disable(struct pmu *pmu)
writel(0, smmu_pmu->reg_base + SMMU_PMCG_IRQ_CTRL);
}
static inline void smmu_pmu_disable_quirk_hip08_09(struct pmu *pmu)
{
struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu);
unsigned int idx;
/*
* The global disable of the PMU sometimes fails to stop counting.
* Harden this by writing an invalid event type to each used counter
* to forcibly stop counting.
*/
for_each_set_bit(idx, smmu_pmu->used_counters, smmu_pmu->num_counters)
writel(0xffff, smmu_pmu->reg_base + SMMU_PMCG_EVTYPER(idx));
smmu_pmu_disable(pmu);
}
static inline void smmu_pmu_counter_set_value(struct smmu_pmu *smmu_pmu,
u32 idx, u64 value)
{
@ -765,7 +796,10 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu *smmu_pmu)
switch (model) {
case IORT_SMMU_V3_PMCG_HISI_HIP08:
/* HiSilicon Erratum 162001800 */
smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY;
smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY | SMMU_PMCG_HARDEN_DISABLE;
break;
case IORT_SMMU_V3_PMCG_HISI_HIP09:
smmu_pmu->options |= SMMU_PMCG_HARDEN_DISABLE;
break;
}
@ -890,6 +924,16 @@ static int smmu_pmu_probe(struct platform_device *pdev)
if (!dev->of_node)
smmu_pmu_get_acpi_options(smmu_pmu);
/*
* For platforms that suffer from this quirk, the PMU disable sometimes
* fails to stop the counters, which leads to inaccurate or erroneous
* counting. Forcibly disable the counters with this quirk handler.
*/
if (smmu_pmu->options & SMMU_PMCG_HARDEN_DISABLE) {
smmu_pmu->pmu.pmu_enable = smmu_pmu_enable_quirk_hip08_09;
smmu_pmu->pmu.pmu_disable = smmu_pmu_disable_quirk_hip08_09;
}
/* Pick one CPU to be the preferred one to use */
smmu_pmu->on_cpu = raw_smp_processor_id();
WARN_ON(irq_set_affinity(smmu_pmu->irq, cpumask_of(smmu_pmu->on_cpu)));
@ -984,6 +1028,7 @@ static void __exit arm_smmu_pmu_exit(void)
module_exit(arm_smmu_pmu_exit);
MODULE_ALIAS("platform:arm-smmu-v3-pmcg");
MODULE_DESCRIPTION("PMU driver for ARM SMMUv3 Performance Monitors Extension");
MODULE_AUTHOR("Neil Leeder <nleeder@codeaurora.org>");
MODULE_AUTHOR("Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>");


@ -25,8 +25,7 @@
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of.h>
#include <linux/perf_event.h>
#include <linux/perf/arm_pmu.h>
#include <linux/platform_device.h>


@ -10,10 +10,9 @@
#include <linux/io.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of_irq.h>
#include <linux/perf_event.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
#define COUNTER_CNTL 0x0
@ -28,6 +27,8 @@
#define CNTL_CLEAR_MASK 0xFFFFFFFD
#define CNTL_OVER_MASK 0xFFFFFFFE
#define CNTL_CP_SHIFT 16
#define CNTL_CP_MASK (0xFF << CNTL_CP_SHIFT)
#define CNTL_CSV_SHIFT 24
#define CNTL_CSV_MASK (0xFFU << CNTL_CSV_SHIFT)
@ -35,6 +36,8 @@
#define EVENT_CYCLES_COUNTER 0
#define NUM_COUNTERS 4
/* For removing bias if cycle counter CNTL.CP is set to 0xf0 */
#define CYCLES_COUNTER_MASK 0x0FFFFFFF
#define AXI_MASKING_REVERT 0xffff0000 /* AXI_MASKING(MSB 16bits) + AXI_ID(LSB 16bits) */
#define to_ddr_pmu(p) container_of(p, struct ddr_pmu, pmu)
@@ -101,6 +104,7 @@ struct ddr_pmu {
const struct fsl_ddr_devtype_data *devtype_data;
int irq;
int id;
int active_counter;
};
static ssize_t ddr_perf_identifier_show(struct device *dev,
@@ -427,6 +431,17 @@ static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config,
writel(0, pmu->base + reg);
val = CNTL_EN | CNTL_CLEAR;
val |= FIELD_PREP(CNTL_CSV_MASK, config);
/*
* On i.MX8MP we need to bias the cycle counter to overflow more often.
* We do this by initializing bits [23:16] of the counter value via the
* COUNTER_CTRL Counter Parameter (CP) field.
*/
if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER_ENHANCED) {
if (counter == EVENT_CYCLES_COUNTER)
val |= FIELD_PREP(CNTL_CP_MASK, 0xf0);
}
writel(val, pmu->base + reg);
} else {
/* Disable counter */
@@ -466,6 +481,12 @@ static void ddr_perf_event_update(struct perf_event *event)
int ret;
new_raw_count = ddr_perf_read_counter(pmu, counter);
/* Remove the bias applied in ddr_perf_counter_enable(). */
if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER_ENHANCED) {
if (counter == EVENT_CYCLES_COUNTER)
new_raw_count &= CYCLES_COUNTER_MASK;
}
local64_add(new_raw_count, &event->count);
/*
@@ -495,6 +516,10 @@ static void ddr_perf_event_start(struct perf_event *event, int flags)
ddr_perf_counter_enable(pmu, event->attr.config, counter, true);
if (!pmu->active_counter++)
ddr_perf_counter_enable(pmu, EVENT_CYCLES_ID,
EVENT_CYCLES_COUNTER, true);
hwc->state = 0;
}
@@ -548,6 +573,10 @@ static void ddr_perf_event_stop(struct perf_event *event, int flags)
ddr_perf_counter_enable(pmu, event->attr.config, counter, false);
ddr_perf_event_update(event);
if (!--pmu->active_counter)
ddr_perf_counter_enable(pmu, EVENT_CYCLES_ID,
EVENT_CYCLES_COUNTER, false);
hwc->state |= PERF_HES_STOPPED;
}
@@ -565,25 +594,10 @@ static void ddr_perf_event_del(struct perf_event *event, int flags)
static void ddr_perf_pmu_enable(struct pmu *pmu)
{
struct ddr_pmu *ddr_pmu = to_ddr_pmu(pmu);
/* enable cycle counter if cycle is not active event list */
if (ddr_pmu->events[EVENT_CYCLES_COUNTER] == NULL)
ddr_perf_counter_enable(ddr_pmu,
EVENT_CYCLES_ID,
EVENT_CYCLES_COUNTER,
true);
}
static void ddr_perf_pmu_disable(struct pmu *pmu)
{
struct ddr_pmu *ddr_pmu = to_ddr_pmu(pmu);
if (ddr_pmu->events[EVENT_CYCLES_COUNTER] == NULL)
ddr_perf_counter_enable(ddr_pmu,
EVENT_CYCLES_ID,
EVENT_CYCLES_COUNTER,
false);
}
static int ddr_perf_init(struct ddr_pmu *pmu, void __iomem *base,

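The cycle counter handling above is easier to follow with the numbers written out: a CP value of 0xf0 seeds the counter's upper bits, so the 32-bit counter starts at 0xf0000000 and overflows after 2^28 counts instead of 2^32, and CYCLES_COUNTER_MASK strips the bias back out of raw reads. A standalone sketch of the arithmetic (the top-byte placement is an assumption inferred from the 0x0FFFFFFF mask):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define CYCLES_COUNTER_MASK 0x0FFFFFFF

int main(void)
{
	uint32_t start = (uint32_t)0xf0 << 24;	/* counter begins at 0xf0000000 */
	uint32_t to_overflow = 0u - start;	/* counts until wrap: 2^28 */
	uint32_t raw = start + 12345;		/* a later raw counter read */

	printf("bias %#" PRIx32 ", overflow after %" PRIu32 " counts\n",
	       start, to_overflow);
	printf("raw %#" PRIx32 " -> unbiased %" PRIu32 "\n",
	       raw, raw & CYCLES_COUNTER_MASK);
	return 0;
}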

@@ -7,9 +7,7 @@
#include <linux/io.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of_irq.h>
#include <linux/platform_device.h>
#include <linux/perf_event.h>
/* Performance monitor configuration */


@@ -665,8 +665,8 @@ static int hisi_pcie_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
if (pcie_pmu->on_cpu == -1) {
pcie_pmu->on_cpu = cpu;
WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(cpu)));
pcie_pmu->on_cpu = cpumask_local_spread(0, dev_to_node(&pcie_pmu->pdev->dev));
WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(pcie_pmu->on_cpu)));
}
return 0;
@@ -676,14 +676,23 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
{
struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
unsigned int target;
cpumask_t mask;
int numa_node;
/* Nothing to do if this CPU doesn't own the PMU */
if (pcie_pmu->on_cpu != cpu)
return 0;
pcie_pmu->on_cpu = -1;
/* Choose a new CPU from all online cpus. */
target = cpumask_any_but(cpu_online_mask, cpu);
/* Choose a local CPU from all online cpus. */
numa_node = dev_to_node(&pcie_pmu->pdev->dev);
if (cpumask_and(&mask, cpumask_of_node(numa_node), cpu_online_mask) &&
cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
target = cpumask_any(&mask);
else
target = cpumask_any_but(cpu_online_mask, cpu);
if (target >= nr_cpu_ids) {
pci_err(pcie_pmu->pdev, "There is no CPU to set\n");
return 0;

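The offline path above now prefers a replacement CPU from the PMU device's NUMA node and only falls back to any online CPU when the node has none left. A userspace sketch of that selection policy, with plain 64-bit masks standing in for the kernel's cpumask API (all names illustrative):

#include <stdint.h>
#include <stdio.h>

/* Pick a node-local online CPU if possible, else any other online CPU. */
static int pick_target(uint64_t node_cpus, uint64_t online, int leaving)
{
	uint64_t mask = node_cpus & online & ~(1ULL << leaving);

	if (!mask)				/* no local CPU left */
		mask = online & ~(1ULL << leaving);
	if (!mask)
		return -1;			/* nowhere to migrate to */
	return __builtin_ctzll(mask);		/* first set bit, like cpumask_any() */
}

int main(void)
{
	/* CPUs 0-3 on the device's node, CPUs 0-7 online, CPU 2 going down */
	printf("target: %d\n", pick_target(0x0f, 0xff, 2));
	return 0;
}

Keeping the interrupt and perf context node-local avoids cross-node traffic on counter accesses.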

@@ -8,11 +8,10 @@
#include <linux/io.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/perf_event.h>
#include <linux/hrtimer.h>
#include <linux/acpi.h>
#include <linux/platform_device.h>
/* Performance Counters Operating Mode Control Registers */
#define DDRC_PERF_CNT_OP_MODE_CTRL 0x8020


@@ -6,10 +6,9 @@
#define pr_fmt(fmt) "tad_pmu: " fmt
#include <linux/io.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/cpuhotplug.h>
#include <linux/perf_event.h>
#include <linux/platform_device.h>


@@ -1833,7 +1833,6 @@ static int xgene_pmu_probe(struct platform_device *pdev)
const struct xgene_pmu_data *dev_data;
const struct of_device_id *of_id;
struct xgene_pmu *xgene_pmu;
struct resource *res;
int irq, rc;
int version;
@@ -1883,8 +1882,7 @@ static int xgene_pmu_probe(struct platform_device *pdev)
xgene_pmu->version = version;
dev_info(&pdev->dev, "X-Gene PMU version %d\n", xgene_pmu->version);
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
xgene_pmu->pcppmu_csr = devm_ioremap_resource(&pdev->dev, res);
xgene_pmu->pcppmu_csr = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(xgene_pmu->pcppmu_csr)) {
dev_err(&pdev->dev, "ioremap failed for PCP PMU resource\n");
return PTR_ERR(xgene_pmu->pcppmu_csr);


@@ -21,6 +21,7 @@
*/
#define IORT_SMMU_V3_PMCG_GENERIC 0x00000000 /* Generic SMMUv3 PMCG */
#define IORT_SMMU_V3_PMCG_HISI_HIP08 0x00000001 /* HiSilicon HIP08 PMCG */
#define IORT_SMMU_V3_PMCG_HISI_HIP09 0x00000002 /* HiSilicon HIP09 PMCG */
int iort_register_domain_token(int trans_id, phys_addr_t base,
struct fwnode_handle *fw_node);


@@ -47,10 +47,12 @@ int sdei_unregister_ghes(struct ghes *ghes);
int sdei_mask_local_cpu(void);
int sdei_unmask_local_cpu(void);
void __init sdei_init(void);
void sdei_handler_abort(void);
#else
static inline int sdei_mask_local_cpu(void) { return 0; }
static inline int sdei_unmask_local_cpu(void) { return 0; }
static inline void sdei_init(void) { }
static inline void sdei_handler_abort(void) { }
#endif /* CONFIG_ARM_SDE_INTERFACE */


@@ -187,5 +187,6 @@ void armpmu_free_irq(int irq, int cpu);
#endif /* CONFIG_ARM_PMU */
#define ARMV8_SPE_PDEV_NAME "arm,spe-v1"
#define ARMV8_TRBE_PDEV_NAME "arm,trbe"
#endif /* __ARM_PMU_H__ */


@@ -1316,15 +1316,31 @@ extern int perf_event_output(struct perf_event *event,
struct pt_regs *regs);
static inline bool
is_default_overflow_handler(struct perf_event *event)
__is_default_overflow_handler(perf_overflow_handler_t overflow_handler)
{
if (likely(event->overflow_handler == perf_event_output_forward))
if (likely(overflow_handler == perf_event_output_forward))
return true;
if (unlikely(event->overflow_handler == perf_event_output_backward))
if (unlikely(overflow_handler == perf_event_output_backward))
return true;
return false;
}
#define is_default_overflow_handler(event) \
__is_default_overflow_handler((event)->overflow_handler)
#ifdef CONFIG_BPF_SYSCALL
static inline bool uses_default_overflow_handler(struct perf_event *event)
{
if (likely(is_default_overflow_handler(event)))
return true;
return __is_default_overflow_handler(event->orig_overflow_handler);
}
#else
#define uses_default_overflow_handler(event) \
is_default_overflow_handler(event)
#endif
extern void
perf_event_header__init_id(struct perf_event_header *header,
struct perf_sample_data *data,

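The refactoring above exists because, with CONFIG_BPF_SYSCALL, a bpf program may interpose on an event's overflow handler and stash the original in orig_overflow_handler, so the "is this the default handler?" question has to look through the wrapper. A compact userspace model of the check (handler names are stand-ins, not the kernel symbols):

#include <stdbool.h>
#include <stdio.h>

typedef void (*overflow_handler_t)(void);

static void output_forward(void)  { }
static void output_backward(void) { }
static void bpf_wrapper(void)     { }

struct event {
	overflow_handler_t overflow_handler;
	overflow_handler_t orig_overflow_handler;	/* saved when bpf hooks in */
};

static bool is_default(overflow_handler_t h)
{
	return h == output_forward || h == output_backward;
}

/* True if the event ultimately feeds the default output path, even
 * when a bpf program is interposed in front of it. */
static bool uses_default(const struct event *e)
{
	return is_default(e->overflow_handler) ||
	       is_default(e->orig_overflow_handler);
}

int main(void)
{
	struct event e = { bpf_wrapper, output_forward };

	printf("%d\n", uses_default(&e));	/* 1: bpf wraps the default */
	return 0;
}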

@@ -42,6 +42,18 @@
# define __always_inline inline __attribute__((always_inline))
#endif
#ifndef __always_unused
#define __always_unused __attribute__((__unused__))
#endif
#ifndef __noreturn
#define __noreturn __attribute__((__noreturn__))
#endif
#ifndef unreachable
#define unreachable() __builtin_unreachable()
#endif
#ifndef noinline
#define noinline
#endif
@@ -190,4 +202,10 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
#define ___PASTE(a, b) a##b
#define __PASTE(a, b) ___PASTE(a, b)
#ifndef OPTIMIZER_HIDE_VAR
/* Make the optimizer believe the variable can be manipulated arbitrarily. */
#define OPTIMIZER_HIDE_VAR(var) \
__asm__ ("" : "=r" (var) : "0" (var))
#endif
#endif /* _TOOLS_LINUX_COMPILER_H */
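
OPTIMIZER_HIDE_VAR() works by laundering the variable through an empty asm with matching output/input constraints: the value is unchanged, but the optimiser must assume it could have been rewritten. A small demonstration (the macro body is the one added above; the surrounding program is illustrative):

#include <stdio.h>

/* Make the optimizer believe the variable can be manipulated arbitrarily. */
#define OPTIMIZER_HIDE_VAR(var) \
	__asm__ ("" : "=r" (var) : "0" (var))

int main(void)
{
	int x = 42;

	OPTIMIZER_HIDE_VAR(x);	/* x still holds 42, but is now opaque */

	/* Without the barrier, this test would be folded away at -O2. */
	if (x == 42)
		printf("x = %d\n", x);
	return 0;
}

The arm64 signal selftests use the same barrier per byte store to keep a hand-rolled clearing loop from being pattern-matched back into a memset() call.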


@@ -19,6 +19,8 @@ CFLAGS += -I$(top_srcdir)/tools/testing/selftests/
CFLAGS += $(KHDR_INCLUDES)
CFLAGS += -I$(top_srcdir)/tools/include
export CFLAGS
export top_srcdir


@@ -19,19 +19,38 @@
#include "../../kselftest.h"
#define TESTS_PER_HWCAP 2
#define TESTS_PER_HWCAP 3
/*
* Function expected to generate SIGILL when the feature is not
* supported and return when it is supported. If SIGILL is generated
* then the handler must be able to skip over the instruction safely.
* Function expected to generate an exception when the feature is not
* supported and return when it is supported. If the specific exception
* is generated then the handler must be able to skip over the
* instruction safely.
*
* Note that many architecture extensions add no architectural
* state and hence no specific trap, so we may not fault when
* running on a kernel which doesn't know to add the hwcap.
*/
typedef void (*sigill_fn)(void);
typedef void (*sig_fn)(void);
static void aes_sigill(void)
{
/* AESE V0.16B, V0.16B */
asm volatile(".inst 0x4e284800" : : : );
}
static void atomics_sigill(void)
{
/* STADD W0, [SP] */
asm volatile(".inst 0xb82003ff" : : : );
}
static void crc32_sigill(void)
{
/* CRC32W W0, W0, W1 */
asm volatile(".inst 0x1ac14800" : : : );
}
static void cssc_sigill(void)
{
@@ -39,6 +58,29 @@ static void cssc_sigill(void)
asm volatile(".inst 0xdac01c00" : : : "x0");
}
static void fp_sigill(void)
{
asm volatile("fmov s0, #1");
}
static void ilrcpc_sigill(void)
{
/* LDAPUR W0, [SP, #8] */
asm volatile(".inst 0x994083e0" : : : );
}
static void jscvt_sigill(void)
{
/* FJCVTZS W0, D0 */
asm volatile(".inst 0x1e7e0000" : : : );
}
static void lrcpc_sigill(void)
{
/* LDAPR W0, [SP, #0] */
asm volatile(".inst 0xb8bfc3e0" : : : );
}
static void mops_sigill(void)
{
char dst[1], src[1];
@@ -53,11 +95,35 @@ static void mops_sigill(void)
: "cc", "memory");
}
static void pmull_sigill(void)
{
/* PMULL V0.1Q, V0.1D, V0.1D */
asm volatile(".inst 0x0ee0e000" : : : );
}
static void rng_sigill(void)
{
asm volatile("mrs x0, S3_3_C2_C4_0" : : : "x0");
}
static void sha1_sigill(void)
{
/* SHA1H S0, S0 */
asm volatile(".inst 0x5e280800" : : : );
}
static void sha2_sigill(void)
{
/* SHA256H Q0, Q0, V0.4S */
asm volatile(".inst 0x5e004000" : : : );
}
static void sha512_sigill(void)
{
/* SHA512H Q0, Q0, V0.2D */
asm volatile(".inst 0xce608000" : : : );
}
static void sme_sigill(void)
{
/* RDSVL x0, #0 */
@@ -208,14 +274,45 @@ static void svebf16_sigill(void)
asm volatile(".inst 0x658aa000" : : : "z0");
}
static void hbc_sigill(void)
{
/* BC.EQ +4 */
asm volatile("cmp xzr, xzr\n"
".inst 0x54000030" : : : "cc");
}
static void uscat_sigbus(void)
{
/* unaligned atomic access */
asm volatile("ADD x1, sp, #2" : : : );
/* STADD W0, [X1] */
asm volatile(".inst 0xb820003f" : : : );
}
static const struct hwcap_data {
const char *name;
unsigned long at_hwcap;
unsigned long hwcap_bit;
const char *cpuinfo;
sigill_fn sigill_fn;
sig_fn sigill_fn;
bool sigill_reliable;
sig_fn sigbus_fn;
bool sigbus_reliable;
} hwcaps[] = {
{
.name = "AES",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_AES,
.cpuinfo = "aes",
.sigill_fn = aes_sigill,
},
{
.name = "CRC32",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_CRC32,
.cpuinfo = "crc32",
.sigill_fn = crc32_sigill,
},
{
.name = "CSSC",
.at_hwcap = AT_HWCAP2,
@@ -223,6 +320,50 @@ static const struct hwcap_data {
.cpuinfo = "cssc",
.sigill_fn = cssc_sigill,
},
{
.name = "FP",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_FP,
.cpuinfo = "fp",
.sigill_fn = fp_sigill,
},
{
.name = "JSCVT",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_JSCVT,
.cpuinfo = "jscvt",
.sigill_fn = jscvt_sigill,
},
{
.name = "LRCPC",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_LRCPC,
.cpuinfo = "lrcpc",
.sigill_fn = lrcpc_sigill,
},
{
.name = "LRCPC2",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_ILRCPC,
.cpuinfo = "ilrcpc",
.sigill_fn = ilrcpc_sigill,
},
{
.name = "LSE",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_ATOMICS,
.cpuinfo = "atomics",
.sigill_fn = atomics_sigill,
},
{
.name = "LSE2",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_USCAT,
.cpuinfo = "uscat",
.sigill_fn = atomics_sigill,
.sigbus_fn = uscat_sigbus,
.sigbus_reliable = true,
},
{
.name = "MOPS",
.at_hwcap = AT_HWCAP2,
@@ -231,6 +372,13 @@ static const struct hwcap_data {
.sigill_fn = mops_sigill,
.sigill_reliable = true,
},
{
.name = "PMULL",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_PMULL,
.cpuinfo = "pmull",
.sigill_fn = pmull_sigill,
},
{
.name = "RNG",
.at_hwcap = AT_HWCAP2,
@@ -244,6 +392,27 @@ static const struct hwcap_data {
.hwcap_bit = HWCAP2_RPRFM,
.cpuinfo = "rprfm",
},
{
.name = "SHA1",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_SHA1,
.cpuinfo = "sha1",
.sigill_fn = sha1_sigill,
},
{
.name = "SHA2",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_SHA2,
.cpuinfo = "sha2",
.sigill_fn = sha2_sigill,
},
{
.name = "SHA512",
.at_hwcap = AT_HWCAP,
.hwcap_bit = HWCAP_SHA512,
.cpuinfo = "sha512",
.sigill_fn = sha512_sigill,
},
{
.name = "SME",
.at_hwcap = AT_HWCAP2,
@@ -386,20 +555,32 @@ static const struct hwcap_data {
.hwcap_bit = HWCAP2_SVE_EBF16,
.cpuinfo = "sveebf16",
},
{
.name = "HBC",
.at_hwcap = AT_HWCAP2,
.hwcap_bit = HWCAP2_HBC,
.cpuinfo = "hbc",
.sigill_fn = hbc_sigill,
.sigill_reliable = true,
},
};
static bool seen_sigill;
typedef void (*sighandler_fn)(int, siginfo_t *, void *);
static void handle_sigill(int sig, siginfo_t *info, void *context)
{
ucontext_t *uc = context;
seen_sigill = true;
/* Skip over the offending instruction */
uc->uc_mcontext.pc += 4;
#define DEF_SIGHANDLER_FUNC(SIG, NUM) \
static bool seen_##SIG; \
static void handle_##SIG(int sig, siginfo_t *info, void *context) \
{ \
ucontext_t *uc = context; \
\
seen_##SIG = true; \
/* Skip over the offending instruction */ \
uc->uc_mcontext.pc += 4; \
}
DEF_SIGHANDLER_FUNC(sigill, SIGILL);
DEF_SIGHANDLER_FUNC(sigbus, SIGBUS);
bool cpuinfo_present(const char *name)
{
FILE *f;
@@ -442,25 +623,78 @@ bool cpuinfo_present(const char *name)
return false;
}
static int install_sigaction(int signum, sighandler_fn handler)
{
int ret;
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_sigaction = handler;
sa.sa_flags = SA_RESTART | SA_SIGINFO;
sigemptyset(&sa.sa_mask);
ret = sigaction(signum, &sa, NULL);
if (ret < 0)
ksft_exit_fail_msg("Failed to install SIGNAL handler: %s (%d)\n",
strerror(errno), errno);
return ret;
}
static void uninstall_sigaction(int signum)
{
if (sigaction(signum, NULL, NULL) < 0)
ksft_exit_fail_msg("Failed to uninstall SIGNAL handler: %s (%d)\n",
strerror(errno), errno);
}
#define DEF_INST_RAISE_SIG(SIG, NUM) \
static bool inst_raise_##SIG(const struct hwcap_data *hwcap, \
bool have_hwcap) \
{ \
if (!hwcap->SIG##_fn) { \
ksft_test_result_skip(#SIG"_%s\n", hwcap->name); \
/* assume that it would raise an exception by default */ \
return true; \
} \
\
install_sigaction(NUM, handle_##SIG); \
\
seen_##SIG = false; \
hwcap->SIG##_fn(); \
\
if (have_hwcap) { \
/* Should be able to use the extension */ \
ksft_test_result(!seen_##SIG, \
#SIG"_%s\n", hwcap->name); \
} else if (hwcap->SIG##_reliable) { \
/* Guaranteed a SIGNAL */ \
ksft_test_result(seen_##SIG, \
#SIG"_%s\n", hwcap->name); \
} else { \
/* Missing SIGNAL might be fine */ \
ksft_print_msg(#SIG" %sreported for %s\n", \
seen_##SIG ? "" : "not ", \
hwcap->name); \
ksft_test_result_skip(#SIG"_%s\n", \
hwcap->name); \
} \
\
uninstall_sigaction(NUM); \
return seen_##SIG; \
}
DEF_INST_RAISE_SIG(sigill, SIGILL);
DEF_INST_RAISE_SIG(sigbus, SIGBUS);
int main(void)
{
int i;
const struct hwcap_data *hwcap;
int i, ret;
bool have_cpuinfo, have_hwcap;
struct sigaction sa;
bool have_cpuinfo, have_hwcap, raise_sigill;
ksft_print_header();
ksft_set_plan(ARRAY_SIZE(hwcaps) * TESTS_PER_HWCAP);
memset(&sa, 0, sizeof(sa));
sa.sa_sigaction = handle_sigill;
sa.sa_flags = SA_RESTART | SA_SIGINFO;
sigemptyset(&sa.sa_mask);
ret = sigaction(SIGILL, &sa, NULL);
if (ret < 0)
ksft_exit_fail_msg("Failed to install SIGILL handler: %s (%d)\n",
strerror(errno), errno);
for (i = 0; i < ARRAY_SIZE(hwcaps); i++) {
hwcap = &hwcaps[i];
@@ -473,30 +707,15 @@ int main(void)
ksft_test_result(have_hwcap == have_cpuinfo,
"cpuinfo_match_%s\n", hwcap->name);
if (hwcap->sigill_fn) {
seen_sigill = false;
hwcap->sigill_fn();
if (have_hwcap) {
/* Should be able to use the extension */
ksft_test_result(!seen_sigill, "sigill_%s\n",
hwcap->name);
} else if (hwcap->sigill_reliable) {
/* Guaranteed a SIGILL */
ksft_test_result(seen_sigill, "sigill_%s\n",
hwcap->name);
} else {
/* Missing SIGILL might be fine */
ksft_print_msg("SIGILL %sreported for %s\n",
seen_sigill ? "" : "not ",
hwcap->name);
ksft_test_result_skip("sigill_%s\n",
hwcap->name);
}
} else {
ksft_test_result_skip("sigill_%s\n",
hwcap->name);
}
/*
* Testing for SIGBUS only makes sense after making sure
* that the instruction does not cause a SIGILL signal.
*/
raise_sigill = inst_raise_sigill(hwcap, have_hwcap);
if (!raise_sigill)
inst_raise_sigbus(hwcap, have_hwcap);
else
ksft_test_result_skip("sigbus_%s\n", hwcap->name);
}
ksft_print_cnts();
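
The signal-handler pattern this test is built on (record that the signal fired, then advance the PC past the faulting instruction) works because every AArch64 instruction is exactly 4 bytes. A self-contained, arm64-only demonstration (0x00000000 decodes as UDF #0, which is permanently UNDEFINED and so always raises SIGILL):

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <ucontext.h>

static volatile sig_atomic_t seen_sigill;

static void handler(int sig, siginfo_t *info, void *context)
{
	ucontext_t *uc = context;

	(void)sig;
	(void)info;
	seen_sigill = 1;
	uc->uc_mcontext.pc += 4;	/* skip the offending instruction */
}

int main(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = handler;
	sa.sa_flags = SA_SIGINFO;
	sigaction(SIGILL, &sa, NULL);

	asm volatile(".inst 0x00000000");	/* UDF #0: always UNDEFINED */

	printf("SIGILL %sseen\n", seen_sigill ? "" : "not ");
	return 0;
}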


@@ -20,12 +20,20 @@
#include "syscall-abi.h"
/*
* The kernel defines a much larger SVE_VQ_MAX than is expressible in
* the architecture; this creates a *lot* of overhead filling the
* buffers (especially ZA) on emulated platforms, so use the actual
* architectural maximum instead.
*/
#define ARCH_SVE_VQ_MAX 16
static int default_sme_vl;
static int sve_vl_count;
static unsigned int sve_vls[SVE_VQ_MAX];
static unsigned int sve_vls[ARCH_SVE_VQ_MAX];
static int sme_vl_count;
static unsigned int sme_vls[SVE_VQ_MAX];
static unsigned int sme_vls[ARCH_SVE_VQ_MAX];
extern void do_syscall(int sve_vl, int sme_vl);
@@ -130,9 +138,9 @@ static int check_fpr(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
#define SVE_Z_SHARED_BYTES (128 / 8)
static uint8_t z_zero[__SVE_ZREG_SIZE(SVE_VQ_MAX)];
uint8_t z_in[SVE_NUM_ZREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)];
uint8_t z_out[SVE_NUM_ZREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)];
static uint8_t z_zero[__SVE_ZREG_SIZE(ARCH_SVE_VQ_MAX)];
uint8_t z_in[SVE_NUM_ZREGS * __SVE_ZREG_SIZE(ARCH_SVE_VQ_MAX)];
uint8_t z_out[SVE_NUM_ZREGS * __SVE_ZREG_SIZE(ARCH_SVE_VQ_MAX)];
static void setup_z(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
uint64_t svcr)
@@ -190,8 +198,8 @@ static int check_z(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
return errors;
}
uint8_t p_in[SVE_NUM_PREGS * __SVE_PREG_SIZE(SVE_VQ_MAX)];
uint8_t p_out[SVE_NUM_PREGS * __SVE_PREG_SIZE(SVE_VQ_MAX)];
uint8_t p_in[SVE_NUM_PREGS * __SVE_PREG_SIZE(ARCH_SVE_VQ_MAX)];
uint8_t p_out[SVE_NUM_PREGS * __SVE_PREG_SIZE(ARCH_SVE_VQ_MAX)];
static void setup_p(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
uint64_t svcr)
@@ -222,8 +230,8 @@ static int check_p(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
return errors;
}
uint8_t ffr_in[__SVE_PREG_SIZE(SVE_VQ_MAX)];
uint8_t ffr_out[__SVE_PREG_SIZE(SVE_VQ_MAX)];
uint8_t ffr_in[__SVE_PREG_SIZE(ARCH_SVE_VQ_MAX)];
uint8_t ffr_out[__SVE_PREG_SIZE(ARCH_SVE_VQ_MAX)];
static void setup_ffr(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
uint64_t svcr)
@@ -300,8 +308,8 @@ static int check_svcr(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
return errors;
}
uint8_t za_in[ZA_SIG_REGS_SIZE(SVE_VQ_MAX)];
uint8_t za_out[ZA_SIG_REGS_SIZE(SVE_VQ_MAX)];
uint8_t za_in[ZA_SIG_REGS_SIZE(ARCH_SVE_VQ_MAX)];
uint8_t za_out[ZA_SIG_REGS_SIZE(ARCH_SVE_VQ_MAX)];
static void setup_za(struct syscall_cfg *cfg, int sve_vl, int sme_vl,
uint64_t svcr)
@@ -470,9 +478,9 @@ void sve_count_vls(void)
return;
/*
* Enumerate up to SVE_VQ_MAX vector lengths
* Enumerate up to ARCH_SVE_VQ_MAX vector lengths
*/
for (vq = SVE_VQ_MAX; vq > 0; vq /= 2) {
for (vq = ARCH_SVE_VQ_MAX; vq > 0; vq /= 2) {
vl = prctl(PR_SVE_SET_VL, vq * 16);
if (vl == -1)
ksft_exit_fail_msg("PR_SVE_SET_VL failed: %s (%d)\n",
@@ -496,9 +504,9 @@ void sme_count_vls(void)
return;
/*
* Enumerate up to SVE_VQ_MAX vector lengths
* Enumerate up to ARCH_SVE_VQ_MAX vector lengths
*/
for (vq = SVE_VQ_MAX; vq > 0; vq /= 2) {
for (vq = ARCH_SVE_VQ_MAX; vq > 0; vq /= 2) {
vl = prctl(PR_SME_SET_VL, vq * 16);
if (vl == -1)
ksft_exit_fail_msg("PR_SME_SET_VL failed: %s (%d)\n",

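The saving from capping at ARCH_SVE_VQ_MAX is dominated by the ZA buffers, which scale with the square of the vector length. A worked sizing sketch (assuming the uapi SVE_VQ_MAX of 512 and the VL x VL byte layout of ZA that the array declarations above rely on):

#include <stdio.h>

#define __SVE_VQ_BYTES	16	/* bytes per vector quadword */

static unsigned long za_bytes(unsigned long vq)
{
	unsigned long vl = vq * __SVE_VQ_BYTES;	/* vector length in bytes */

	return vl * vl;				/* ZA is a VL x VL byte array */
}

int main(void)
{
	printf("ZA buffer at SVE_VQ_MAX=512:     %lu bytes\n", za_bytes(512));
	printf("ZA buffer at ARCH_SVE_VQ_MAX=16: %lu bytes\n", za_bytes(16));
	return 0;
}

That is 64 MiB against 64 KiB per buffer, which is the "*lot* of overhead" the comment refers to on emulated platforms.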

@@ -2,8 +2,6 @@
TEST_GEN_PROGS := btitest nobtitest
PROGS := $(patsubst %,gen/%,$(TEST_GEN_PROGS))
# These tests are built as freestanding binaries since otherwise BTI
# support in ld.so is required which is not currently widespread; when
# it is available it will still be useful to test this separately as the
@@ -18,44 +16,41 @@ CFLAGS_COMMON = -ffreestanding -Wall -Wextra $(CFLAGS)
BTI_CC_COMMAND = $(CC) $(CFLAGS_BTI) $(CFLAGS_COMMON) -c -o $@ $<
NOBTI_CC_COMMAND = $(CC) $(CFLAGS_NOBTI) $(CFLAGS_COMMON) -c -o $@ $<
%-bti.o: %.c
$(OUTPUT)/%-bti.o: %.c
$(BTI_CC_COMMAND)
%-bti.o: %.S
$(OUTPUT)/%-bti.o: %.S
$(BTI_CC_COMMAND)
%-nobti.o: %.c
$(OUTPUT)/%-nobti.o: %.c
$(NOBTI_CC_COMMAND)
%-nobti.o: %.S
$(OUTPUT)/%-nobti.o: %.S
$(NOBTI_CC_COMMAND)
BTI_OBJS = \
test-bti.o \
signal-bti.o \
start-bti.o \
syscall-bti.o \
system-bti.o \
teststubs-bti.o \
trampoline-bti.o
gen/btitest: $(BTI_OBJS)
$(OUTPUT)/test-bti.o \
$(OUTPUT)/signal-bti.o \
$(OUTPUT)/start-bti.o \
$(OUTPUT)/syscall-bti.o \
$(OUTPUT)/system-bti.o \
$(OUTPUT)/teststubs-bti.o \
$(OUTPUT)/trampoline-bti.o
$(OUTPUT)/btitest: $(BTI_OBJS)
$(CC) $(CFLAGS_BTI) $(CFLAGS_COMMON) -nostdlib -static -o $@ $^
NOBTI_OBJS = \
test-nobti.o \
signal-nobti.o \
start-nobti.o \
syscall-nobti.o \
system-nobti.o \
teststubs-nobti.o \
trampoline-nobti.o
gen/nobtitest: $(NOBTI_OBJS)
$(OUTPUT)/test-nobti.o \
$(OUTPUT)/signal-nobti.o \
$(OUTPUT)/start-nobti.o \
$(OUTPUT)/syscall-nobti.o \
$(OUTPUT)/system-nobti.o \
$(OUTPUT)/teststubs-nobti.o \
$(OUTPUT)/trampoline-nobti.o
$(OUTPUT)/nobtitest: $(NOBTI_OBJS)
$(CC) $(CFLAGS_BTI) $(CFLAGS_COMMON) -nostdlib -static -o $@ $^
# Including KSFT lib.mk here will also mangle the TEST_GEN_PROGS list
# to account for any OUTPUT target-dirs optionally provided by
# the toplevel makefile
include ../../lib.mk
$(TEST_GEN_PROGS): $(PROGS)
cp $(PROGS) $(OUTPUT)/
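
For readers unfamiliar with the freestanding setup the Makefile comment describes, a minimal -nostdlib arm64 program looks roughly like this; with no crt0 the entry point is _start, and system calls are made directly via svc (x8 holds the syscall number, x0 onwards the arguments). Sketch only, built with the same -ffreestanding -nostdlib -static flags:

#include <asm/unistd.h>		/* __NR_exit */

static void __attribute__((noreturn)) sys_exit(long n)
{
	register long x8 asm("x8") = __NR_exit;
	register long x0 asm("x0") = n;

	asm volatile("svc #0" : : "r"(x8), "r"(x0) : "memory");
	__builtin_unreachable();
}

/* With -nostdlib there is no crt0, so the entry point is _start. */
void _start(void)
{
	sys_exit(0);
}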


@@ -1,21 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 2019 Arm Limited
* Original author: Dave Martin <Dave.Martin@arm.com>
*/
#ifndef COMPILER_H
#define COMPILER_H
#define __always_unused __attribute__((__unused__))
#define __noreturn __attribute__((__noreturn__))
#define __unreachable() __builtin_unreachable()
/* curse(e) has value e, but the compiler cannot assume so */
#define curse(e) ({ \
__typeof__(e) __curse_e = (e); \
asm ("" : "+r" (__curse_e)); \
__curse_e; \
})
#endif /* ! COMPILER_H */


@@ -1,2 +0,0 @@
btitest
nobtitest


@@ -8,12 +8,10 @@
#include <asm/unistd.h>
#include "compiler.h"
void __noreturn exit(int n)
{
syscall(__NR_exit, n);
__unreachable();
unreachable();
}
ssize_t write(int fd, const void *buf, size_t size)


@@ -14,12 +14,12 @@ typedef __kernel_size_t size_t;
typedef __kernel_ssize_t ssize_t;
#include <linux/errno.h>
#include <linux/compiler.h>
#include <asm/hwcap.h>
#include <asm/ptrace.h>
#include <asm/unistd.h>
#include "compiler.h"
long syscall(int nr, ...);
void __noreturn exit(int n);


@@ -17,7 +17,6 @@
typedef struct ucontext ucontext_t;
#include "btitest.h"
#include "compiler.h"
#include "signal.h"
#define EXPECTED_TESTS 18


@@ -6,6 +6,7 @@
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
@@ -39,9 +40,11 @@ struct vec_data {
int max_vl;
};
#define VEC_SVE 0
#define VEC_SME 1
static struct vec_data vec_data[] = {
{
[VEC_SVE] = {
.name = "SVE",
.hwcap_type = AT_HWCAP,
.hwcap = HWCAP_SVE,
@@ -51,7 +54,7 @@ static struct vec_data vec_data[] = {
.prctl_set = PR_SVE_SET_VL,
.default_vl_file = "/proc/sys/abi/sve_default_vector_length",
},
{
[VEC_SME] = {
.name = "SME",
.hwcap_type = AT_HWCAP2,
.hwcap = HWCAP2_SME,
@@ -551,7 +554,8 @@ static void prctl_set_onexec(struct vec_data *data)
/* For each VQ verify that setting via prctl() does the right thing */
static void prctl_set_all_vqs(struct vec_data *data)
{
int ret, vq, vl, new_vl;
int ret, vq, vl, new_vl, i;
int orig_vls[ARRAY_SIZE(vec_data)];
int errors = 0;
if (!data->min_vl || !data->max_vl) {
@@ -560,6 +564,9 @@ static void prctl_set_all_vqs(struct vec_data *data)
return;
}
for (i = 0; i < ARRAY_SIZE(vec_data); i++)
orig_vls[i] = vec_data[i].rdvl();
for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; vq++) {
vl = sve_vl_from_vq(vq);
@@ -582,6 +589,22 @@ static void prctl_set_all_vqs(struct vec_data *data)
errors++;
}
/* Did any other VLs change? */
for (i = 0; i < ARRAY_SIZE(vec_data); i++) {
if (&vec_data[i] == data)
continue;
if (!(getauxval(vec_data[i].hwcap_type) & vec_data[i].hwcap))
continue;
if (vec_data[i].rdvl() != orig_vls[i]) {
ksft_print_msg("%s VL changed from %d to %d\n",
vec_data[i].name, orig_vls[i],
vec_data[i].rdvl());
errors++;
}
}
/* Was that the VL we asked for? */
if (new_vl == vl)
continue;
@@ -644,18 +667,107 @@ static const test_type tests[] = {
prctl_set_all_vqs,
};
static inline void smstart(void)
{
asm volatile("msr S0_3_C4_C7_3, xzr");
}
static inline void smstart_sm(void)
{
asm volatile("msr S0_3_C4_C3_3, xzr");
}
static inline void smstop(void)
{
asm volatile("msr S0_3_C4_C6_3, xzr");
}
/*
* Verify we can change the SVE vector length while SME is active and
* continue to use SME afterwards.
*/
static void change_sve_with_za(void)
{
struct vec_data *sve_data = &vec_data[VEC_SVE];
bool pass = true;
int ret, i;
if (sve_data->min_vl == sve_data->max_vl) {
ksft_print_msg("Only one SVE VL supported, can't change\n");
ksft_test_result_skip("change_sve_while_sme\n");
return;
}
/* Ensure we will trigger a change when we set the maximum */
ret = prctl(sve_data->prctl_set, sve_data->min_vl);
if (ret != sve_data->min_vl) {
ksft_print_msg("Failed to set SVE VL %d: %d\n",
sve_data->min_vl, ret);
pass = false;
}
/* Enable SM and ZA */
smstart();
/* Trigger another VL change */
ret = prctl(sve_data->prctl_set, sve_data->max_vl);
if (ret != sve_data->max_vl) {
ksft_print_msg("Failed to set SVE VL %d: %d\n",
sve_data->max_vl, ret);
pass = false;
}
/*
* Spin for a bit with SM enabled to try to trigger another
* save/restore. We can't use syscalls without exiting
* streaming mode.
*/
for (i = 0; i < 100000000; i++)
smstart_sm();
/*
* TODO: Verify that ZA was preserved over the VL change and
* spin.
*/
/* Clean up after ourselves */
smstop();
ret = prctl(sve_data->prctl_set, sve_data->default_vl);
if (ret != sve_data->default_vl) {
ksft_print_msg("Failed to restore SVE VL %d: %d\n",
sve_data->default_vl, ret);
pass = false;
}
ksft_test_result(pass, "change_sve_with_za\n");
}
typedef void (*test_all_type)(void);
static const struct {
const char *name;
test_all_type test;
} all_types_tests[] = {
{ "change_sve_with_za", change_sve_with_za },
};
int main(void)
{
bool all_supported = true;
int i, j;
ksft_print_header();
ksft_set_plan(ARRAY_SIZE(tests) * ARRAY_SIZE(vec_data));
ksft_set_plan(ARRAY_SIZE(tests) * ARRAY_SIZE(vec_data) +
ARRAY_SIZE(all_types_tests));
for (i = 0; i < ARRAY_SIZE(vec_data); i++) {
struct vec_data *data = &vec_data[i];
unsigned long supported;
supported = getauxval(data->hwcap_type) & data->hwcap;
if (!supported)
all_supported = false;
for (j = 0; j < ARRAY_SIZE(tests); j++) {
if (supported)
@@ -666,5 +778,12 @@ int main(void)
}
}
for (i = 0; i < ARRAY_SIZE(all_types_tests); i++) {
if (all_supported)
all_types_tests[i].test();
else
ksft_test_result_skip("%s\n", all_types_tests[i].name);
}
ksft_exit_pass();
}
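
Both the per-VQ walk and the new change_sve_with_za() test drive the same small prctl() interface. A minimal SVE example of it (arm64 with SVE assumed; the PR_SVE_* constants come from recent <sys/prctl.h>, with values given as a fallback):

#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL		50
#endif
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK	0xffff
#endif

int main(void)
{
	/* Request a 256-bit (32-byte) vector length; the kernel picks a
	 * supported length no larger than this and returns the setting. */
	int ret = prctl(PR_SVE_SET_VL, 32);

	if (ret < 0) {
		perror("PR_SVE_SET_VL");
		return 1;
	}
	printf("granted VL: %d bytes\n", ret & PR_SVE_VL_LEN_MASK);
	return 0;
}

The test's cross-check is then just: set one vector type's VL, read back every other type's VL with its rdvl() helper, and fail if anything moved.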


@@ -8,6 +8,8 @@
#include <stdio.h>
#include <string.h>
#include <linux/compiler.h>
#include "test_signals.h"
int test_init(struct tdescr *td);
@@ -60,13 +62,25 @@ static __always_inline bool get_current_context(struct tdescr *td,
size_t dest_sz)
{
static volatile bool seen_already;
int i;
char *uc = (char *)dest_uc;
assert(td && dest_uc);
/* it's a genuine invocation..reinit */
seen_already = 0;
td->live_uc_valid = 0;
td->live_sz = dest_sz;
memset(dest_uc, 0x00, td->live_sz);
/*
* This is a memset() but we don't want the compiler to
* optimise it into either instructions or a library call
* which might be incompatible with streaming mode.
*/
for (i = 0; i < td->live_sz; i++) {
uc[i] = 0;
OPTIMIZER_HIDE_VAR(uc[0]);
}
td->live_uc = dest_uc;
/*
* Grab ucontext_t triggering a SIGTRAP.
@@ -103,6 +117,17 @@
:
: "memory");
/*
* If we were grabbing a streaming mode context then we may
* have entered streaming mode behind the system's back, and
* libc or compiler generated code might do something invalid
* in streaming mode, or even corrupt the state of ZA. Issue an
* SMSTOP to exit both now that we have grabbed the state.
*/
if (td->feats_supported & FEAT_SME)
asm volatile("msr S0_3_C4_C6_3, xzr");
/*
* If we get here with seen_already==1 it implies the td->live_uc
* context has been used to get back here....this probably means

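The open-coded clearing loop above exists because a libc memset() (or compiler-generated SIMD stores) may use instructions that are invalid once SVCR.SM is set, and the compiler must also be stopped from recognising the loop and calling memset() anyway. A sketch of the same idea using volatile in place of OPTIMIZER_HIDE_VAR(), which achieves a similar effect (illustrative only):

#include <stdio.h>

/* Byte-at-a-time clear the compiler cannot fuse back into memset(). */
static void clear_bytes(volatile char *p, unsigned long n)
{
	unsigned long i;

	for (i = 0; i < n; i++)
		p[i] = 0;
}

int main(void)
{
	char buf[64];

	clear_bytes(buf, sizeof(buf));
	printf("buf[0] = %d\n", buf[0]);
	return 0;
}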

@@ -65,6 +65,7 @@ int zt_regs_run(struct tdescr *td, siginfo_t *si, ucontext_t *uc)
if (memcmp(zeros, (char *)zt + ZT_SIG_REGS_OFFSET,
ZT_SIG_REGS_SIZE(zt->nregs)) != 0) {
fprintf(stderr, "ZT data invalid\n");
free(zeros);
return 1;
}