Commit graph

6702 commits

Author SHA1 Message Date
Michael Ellerman 627e66f29a Merge changes from Paul Gortmaker
Merge the changes to retire the legacy WR sbc8548 and sbc8641 platforms
from Paul. These were sent as a pull request, but I rebased them onto
rc2 so as not to pull too many unrelated changes in to my next.

Description from Paul's pull request follows:

In v2.6.27 (2008, 917f0af9e5) the sbc8260 support was implicitly
retired by not being carried forward through the ppc --> powerpc
device tree transition.

Then, in v3.6 (2012, b048b4e17c) we retired the support for the
sbc8560 boards.

Next, in v4.18 (2017, 3bc6cf5a86) we retired the support for the
2006 vintage sbc834x boards.

The sbc8548 and sbc8641d boards were maybe 1-2 years newer than the
sbc834x boards, but it is also 3+ years later, so it makes sense to
now retire them as well - which is what is done here.

These two remaining WR boards were based on the Freescale MPC8548-CDS
and the MPC8641D-HPCN reference board implementations.  Having had the
chance to use these and many other Fsl ref boards, I know this:  The
Freescale reference boards were typically produced in limited quantity
and primarily available to BSP developers and hardware designers, and
not likely to have found a 2nd life with hobbyists and/or collectors.

It was good to have that BSP code subjected to mainline review and
hence also widely available back in the day. But given the above, we
should probably also be giving serious consideration to retiring
additional similar age/type reference board platforms as well.

I've always felt it is important for us to be proactive in retiring
old code, since it has a genuine non-zero carrying cost, as described
in the 930d52c012 merge log.  But for the here and now, we just
clean up the remaining BSP code that I had added for SBC platforms.

Link: https://lore.kernel.org/r/20210824174209.GB160508@windriver.com
2021-08-27 00:49:06 +10:00
Paul Gortmaker d7c1814f2f powerpc: retire sbc8641d board support
The support was for this was added to mainline over 12 years ago, in
v2.6.26 [4e8aae89a3] just around the ppc --> powerpc migration.

I believe the board was introduced shortly after the sbc8548 board,
making it roughly a 14 year old platform - with the CPU speed and
memory size typical for that era.

I haven't had one of these boards for several years, and availability
was discontinued several years before that.

Given that, there is no point in adding a burden to testing coverage
that builds all possible defconfigs, so it makes sense to remove it.

Of course it will remain in the git history forever, for anyone who
happens to find a functional board and wants to tinker with it.

Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2021-08-27 00:48:18 +10:00
Paul Gortmaker c12adb0678 powerpc: retire sbc8548 board support
The support was for this was mainlined 13 years ago, in v2.6.25
[0e0fffe887] just around the ppc --> powerpc migration.

I believe the board was introduced a year or two before that, so it
is roughly a 15 year old platform - with the CPU speed and memory size
that was typical for that era.

I haven't had one of these boards for several years, and availability
was discontinued several years before that.

Given that, there is no point in adding a burden to testing coverage
that builds all possible defconfigs, so it makes sense to remove it.

Of course it will remain in the git history forever, for anyone who
happens to find a functional board and wants to tinker with it.

Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2021-08-27 00:48:11 +10:00
Michael Ellerman 465e333e77 Merge branch 'topic/ppc-kvm' into next
Merge some KVM patches we are keeping in a topic branch in case there
are any merge conflicts that need resolving.
2021-08-26 21:21:11 +10:00
Christophe Leroy 806c0e6e7e powerpc: Refactor verification of MSR_RI
40x and BOOKE don't have MSR_RI therefore all tests involving
MSR_RI may be problematic on those plateforms.

Create helpers to check or set MSR_RI in regs, and use them
in common code.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c2fb93708196734f4176dda334aaa3055f213b89.1629707037.git.christophe.leroy@csgroup.eu
2021-08-26 21:21:07 +10:00
Xiongwei Song 4f8e78c075 powerpc: Add esr as a synonym for pt_regs.dsisr
Create an anonymous union for dsisr and esr regsiters, we can reference
esr to get the exception detail when CONFIG_4xx=y or CONFIG_BOOKE=y.
Otherwise, reference dsisr. This makes code more clear.

Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
[mpe: Reword commit title]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210807010239.416055-2-sxwjean@me.com
2021-08-26 21:21:06 +10:00
Nicholas Piggin 0c8fb653d4 powerpc/64s: Remove WORT SPR from POWER9/10
This register is not architected and not implemented in POWER9 or 10,
it just reads back zeroes for compatibility.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Link: https://lore.kernel.org/r/20210811160134.904987-11-npiggin@gmail.com
2021-08-25 16:37:18 +10:00
Cédric Le Goater 4cb266074a powerpc/pseries/vas: Declare pseries_vas_fault_thread_fn() as static
This fixes a compile error with W=1.

Fixes: 6d0aaf5e0d ("powerpc/pseries/vas: Setup IRQ and fault handling")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210819125656.14498-3-clg@kaod.org
2021-08-20 22:17:18 +10:00
Michael Ellerman 8b893ef190 powerpc/pseries: Fix build error when NUMA=n
As reported by lkp, if NUMA=n we see a build error:

   arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'pseries_cpu_hotplug_init':
   arch/powerpc/platforms/pseries/hotplug-cpu.c:1022:8: error: 'node_to_cpumask_map' undeclared
    1022 |        node_to_cpumask_map[node]);

Use cpumask_of_node() which has an empty stub for NUMA=n, and when
NUMA=y does a lookup from node_to_cpumask_map[].

Fixes: bd1dd4c5f5 ("powerpc/pseries: Prevent free CPU ids being reused on another node")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210816041032.2839343-1-mpe@ellerman.id.au
2021-08-16 14:11:00 +10:00
Aneesh Kumar K.V 1c6b5a7e74 powerpc/pseries: Add support for FORM2 associativity
PAPR interface currently supports two different ways of communicating resource
grouping details to the OS. These are referred to as Form 0 and Form 1
associativity grouping. Form 0 is the older format and is now considered
deprecated. This patch adds another resource grouping named FORM2.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-6-aneesh.kumar@linux.ibm.com
2021-08-13 22:04:27 +10:00
Aneesh Kumar K.V ef31cb83d1 powerpc/pseries: Add a helper for form1 cpu distance
This helper is only used with the dispatch trace log collection.
A later patch will add Form2 affinity support and this change helps
in keeping that simpler. Also add a comment explaining we don't expect
the code to be called with FORM0

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-5-aneesh.kumar@linux.ibm.com
2021-08-13 22:04:27 +10:00
Aneesh Kumar K.V 8ddc6448ec powerpc/pseries: Consolidate different NUMA distance update code paths
The associativity details of the newly added resourced are collected from
the hypervisor via "ibm,configure-connector" rtas call. Update the numa
distance details of the newly added numa node after the above call.

Instead of updating NUMA distance every time we lookup a node id
from the associativity property, add helpers that can be used
during boot which does this only once. Also remove the distance
update from node id lookup helpers.

Currently, we duplicate parsing code for ibm,associativity and
ibm,associativity-lookup-arrays in the kernel. The associativity array provided
by these device tree properties are very similar and hence can use
a helper to parse the node id and numa distance details.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-4-aneesh.kumar@linux.ibm.com
2021-08-13 22:04:26 +10:00
Aneesh Kumar K.V 0eacd06bb8 powerpc/pseries: Rename TYPE1_AFFINITY to FORM1_AFFINITY
Also make related code cleanup that will allow adding FORM2_AFFINITY in
later patches. No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-3-aneesh.kumar@linux.ibm.com
2021-08-13 22:04:26 +10:00
Aneesh Kumar K.V dbf77fed8b powerpc: rename powerpc_debugfs_root to arch_debugfs_dir
No functional change in this patch. arch_debugfs_dir is the generic kernel
name declared in linux/debugfs.h for arch-specific debugfs directory.
Architectures like x86/s390 already use the name. Rename powerpc
specific powerpc_debugfs_root to arch_debugfs_dir.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132831.233794-2-aneesh.kumar@linux.ibm.com
2021-08-13 22:04:26 +10:00
Marc Zyngier 1bce542500 powerpc: Bulk conversion to generic_handle_domain_irq()
Wherever possible, replace constructs that match either
generic_handle_irq(irq_find_mapping()) or
generic_handle_irq(irq_linear_revmap()) to a single call to
generic_handle_domain_irq().

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210802162630.2219813-13-maz@kernel.org
2021-08-10 23:15:02 +10:00
Cédric Le Goater c325712b5f powerpc/powernv/pci: Rework pnv_opal_pci_msi_eoi()
pnv_opal_pci_msi_eoi() is called from KVM to EOI passthrough interrupts
when in real mode. Adding MSI domain broke the hack using the
'ioda.irq_chip' field to deduce the owning PHB. Fix that by using the
IRQ chip data in the MSI domain.

The 'ioda.irq_chip' field is now unused and could be removed from the
pnv_phb struct.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-30-clg@kaod.org
2021-08-10 23:15:01 +10:00
Cédric Le Goater 5cd69651ce powerpc/powernv/pci: Set the IRQ chip data for P8/CXL devices
Before MSI domains, the default IRQ chip of PHB3 MSIs was patched by
pnv_set_msi_irq_chip() with the custom EOI handler pnv_ioda2_msi_eoi()
and the owning PHB was deduced from the 'ioda.irq_chip' field. This
path has been deprecated by the MSI domains but it is still in use by
the P8 CAPI 'cxl' driver.

Rewriting this driver to support MSI would be a waste of time.
Nevertheless, we can still remove the IRQ chip patch and set the IRQ
chip data instead. This is cleaner.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-29-clg@kaod.org
2021-08-10 23:15:01 +10:00
Cédric Le Goater f1a377f86f powerpc/powernv/pci: Adapt is_pnv_opal_msi() to detect passthrough interrupt
The pnv_ioda2_msi_eoi() chip handler is not used anymore for MSIs.
Simply use the check on the PSI-MSI chip.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-27-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater 6d9ba6121b powerpc/powernv/pci: Drop unused MSI code
MSIs should be fully managed by the PCI and IRQ subsystems now.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-26-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater 3005123eea powerpc/pseries/pci: Drop unused MSI code
MSIs should be fully managed by the PCI and IRQ subsystems now.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-25-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater 679e30b953 powerpc/pci: Drop XIVE restriction on MSI domains
The PowerNV and pSeries platforms now have support for both the XICS
and XIVE IRQ domains.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-23-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater bbb25af8fb powerpc/powernv/pci: Customize the MSI EOI handler to support PHB3
PHB3s need an extra OPAL call to EOI the interrupt. The call takes an
OPAL HW IRQ number but it is translated into a vector number in OPAL.
Here, we directly use the vector number of the in-the-middle "PNV-MSI"
domain instead of grabbing the OPAL HW IRQ number in the XICS parent
domain.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-22-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater ba418a0278 KVM: PPC: Book3S HV: Use the new IRQ chip to detect passthrough interrupts
Passthrough PCI MSI interrupts are detected in KVM with a check on a
specific EOI handler (P8) or on XIVE (P9). We can now check the
PCI-MSI IRQ chip which is cleaner.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-14-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater 0fcfe2247e powerpc/powernv/pci: Add MSI domains
This is very similar to the MSI domains of the pSeries platform. The
MSI allocator is directly handled under the Linux PHB in the
in-the-middle "PNV-MSI" domain.

Only the XIVE (P9/P10) parent domain is supported for now. Support for
XICS will come later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-13-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater 2c50d7e99e powerpc/powernv/pci: Introduce __pnv_pci_ioda_msi_setup()
It will be used as a 'compose_msg' handler of the MSI domain introduced
later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-12-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater 174db9e7f7 powerpc/pseries/pci: Add support of MSI domains to PHB hotplug
Simply allocate or release the MSI domains when a PHB is inserted in
or removed from the machine.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-11-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater 9a014f4568 powerpc/pseries/pci: Add a msi_free() handler to clear XIVE data
The MSI domain clears the IRQ with msi_domain_free(), which calls
irq_domain_free_irqs_top(), which clears the handler data. This is a
problem for the XIVE controller since we need to unmap MMIO pages and
free a specific XIVE structure.

The 'msi_free()' handler is called before irq_domain_free_irqs_top()
when the handler data is still available. Use that to clear the XIVE
controller data.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-10-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater 07817a578a powerpc/pseries/pci: Add a domain_free_irqs() handler
The RTAS firmware can not disable one MSI at a time. It's all or
nothing. We need a custom free IRQ handler for that.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-9-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater a5f3d2c17b powerpc/pseries/pci: Add MSI domains
Two IRQ domains are added on top of default machine IRQ domain.

First, the top level "pSeries-PCI-MSI" domain deals with the MSI
specificities. In this domain, the HW IRQ numbers are generated by the
PCI MSI layer, they compose a unique ID for an MSI source with the PCI
device identifier and the MSI vector number.

These numbers can be quite large on a pSeries machine running under
the IBM Hypervisor and /sys/kernel/irq/ and /proc/interrupts will
require small fixes to show them correctly.

Second domain is the in-the-middle "pSeries-MSI" domain which acts as
a proxy between the PCI MSI subsystem and the machine IRQ subsystem.
It usually allocate the MSI vector numbers but, on pSeries machines,
this is done by the RTAS FW and RTAS returns IRQ numbers in the IRQ
number space of the machine. This is why the in-the-middle "pSeries-MSI"
domain has the same HW IRQ numbers as its parent domain.

Only the XIVE (P9/P10) parent domain is supported for now. We still
need to add support for IRQ domain hierarchy under XICS.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-6-clg@kaod.org
2021-08-10 23:14:57 +10:00
Cédric Le Goater e812020073 powerpc/pseries/pci: Introduce rtas_prepare_msi_irqs()
This splits the routine setting the MSIs in two parts: allocation of
MSIs for the PCI device at the FW level (RTAS) and the actual mapping
and activation of the IRQs.

rtas_prepare_msi_irqs() will serve as a handler for the PCI MSI domain.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-3-clg@kaod.org
2021-08-10 23:14:57 +10:00
Cédric Le Goater 786e5b102a powerpc/pseries/pci: Introduce __find_pe_total_msi()
It will help to size the PCI MSI domain.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-2-clg@kaod.org
2021-08-10 23:14:56 +10:00
Sebastian Andrzej Siewior 5ae36401ca powerpc: Replace deprecated CPU-hotplug functions.
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210803141621.780504-4-bigeasy@linutronix.de
2021-08-10 23:14:56 +10:00
Laurent Dufour bd1dd4c5f5 powerpc/pseries: Prevent free CPU ids being reused on another node
When a CPU is hot added, the CPU ids are taken from the available mask
from the lower possible set. If that set of values was previously used
for a CPU attached to a different node, it appears to an application as
if these CPUs have migrated from one node to another node which is not
expected.

To prevent this, it is needed to record the CPU ids used for each node
and to not reuse them on another node. However, to prevent CPU hot plug
to fail, in the case the CPU ids is starved on a node, the capability to
reuse other nodes’ free CPU ids is kept. A warning is displayed in such
a case to warn the user.

A new CPU bit mask (node_recorded_ids_map) is introduced for each
possible node. It is populated with the CPU onlined at boot time, and
then when a CPU is hot plugged to a node. The bits in that mask remain
when the CPU is hot unplugged, to remind this CPU ids have been used for
this node.

If no id set was found, a retry is made without removing the ids used on
the other nodes to try reusing them. This is the way ids have been
allocated prior to this patch.

The effect of this patch can be seen by removing and adding CPUs using
the Qemu monitor. In the following case, the first CPU from the node 2
is removed, then the first one from the node 1 is removed too. Later,
the first CPU of the node 2 is added back. Without that patch, the
kernel will number these CPUs using the first CPU ids available which
are the ones freed when removing the second CPU of the node 0. This
leads to the CPU ids 16-23 to move from the node 1 to the node 2. With
the patch applied, the CPU ids 32-39 are used since they are the lowest
free ones which have not been used on another node.

At boot time:
  [root@vm40 ~]# numactl -H | grep cpus
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

Vanilla kernel, after the CPU hot unplug/plug operations:
  [root@vm40 ~]# numactl -H | grep cpus
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 1 cpus: 24 25 26 27 28 29 30 31
  node 2 cpus: 16 17 18 19 20 21 22 23 40 41 42 43 44 45 46 47

Patched kernel, after the CPU hot unplug/plug operations:
  [root@vm40 ~]# numactl -H | grep cpus
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 1 cpus: 24 25 26 27 28 29 30 31
  node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210429174908.16613-1-ldufour@linux.ibm.com
2021-08-10 23:14:55 +10:00
Laurent Dufour d144f4d5a8 pseries/drmem: update LMBs after LPM
After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.

This is handled by the kernel, but the memory's node is not updated because
there is no way to move a memory block between nodes from the Linux kernel
point of view.

If later a memory block is added or removed, drmem_update_dt() is called
and it is overwriting the DT node ibm,dynamic-reconfiguration-memory to
match the added or removed LMB. But the LMB's associativity node has not
been updated after the DT node update and thus the node is overwritten by
the Linux's topology instead of the hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity. However, ignore the
call to that hook when the update has been triggered by drmem_update_dt().
Because, in that case, the LMB tree has been used to set the DT property
and thus it doesn't need to be updated back. Since drmem_update_dt() is
called under the protection of the device_hotplug_lock and the hook is
called in the same context, use a simple boolean variable to detect that
call.

Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210517090606.56930-1-ldufour@linux.ibm.com
2021-08-10 23:14:55 +10:00
Hari Bathini 8119cefd9a powerpc/kexec: blacklist functions called in real mode for kprobe
As kprobe does not handle events happening in real mode, blacklist the
functions that only get called in real mode or in kexec sequence with
MMU turned off.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/162626687834.155313.4692863392927831843.stgit@hbathini-workstation.ibm.com
2021-07-26 20:38:51 +10:00
Gustavo A. R. Silva 104aba8dd7 powerpc/smp: Fix fall-through warning for Clang
Fix the following fallthrough warning:

arch/powerpc/platforms/powermac/smp.c:149:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/60ef0750.I8J+C6KAtb0xVOAa%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-14 11:10:40 -05:00
Linus Torvalds 1459718d7d powerpc fixes for 5.14 #2
Fix crashes on 64-bit Book3E due to use of Book3S only mtmsrd instruction.
 
 Fix "scheduling while atomic" warnings at boot due to preempt count underflow.
 
 Two commits fixing our handling of BPF atomic instructions.
 
 Fix error handling in xive when allocating an IPI.
 
 Fix lockup on kernel exec fault on 603.
 
 Thanks to: Bharata B Rao, Cédric Le Goater, Christian Zigotzky, Christophe Leroy, Guenter
 Roeck, Jiri Olsa, Naveen N. Rao, Nicholas Piggin, Valentin Schneider.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmDoT7cTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFbSEACa703VD0R/Zx7/MASoc0nfLkoyDvOS
 1dQ7isuNLLwTgFCymS36w5cyUxpQJUepUFLjZajJ5rHgGM60i78trZgrxKp5o+us
 r/F530QI2RZdko2rrtoSVUe6Oj4z0l2lYtBHa7UybKIRG7R/po/DmYYe1/Wmq7Bc
 fu/R3C7m5/HY63E2mzdCPFHfNDPZavScXyPUpfgEka7PGDJsTCZqIOzeGt9oqm6f
 lEysuIvBNvq5NVvUOaBiW4dlbkxckVANvj43kjeX+c0YnJ7MTW/xCYl4AuaAxL2T
 Kc23mWgTj2ONrThbhrB5Bq2uf9bMA+4cJKTRfWiGslxLyhXP59jBnJvn6H5oCPf/
 pr/iLjA8bBS7HnaeAEjC74IIs8SDnhkWjW9ec3Da2Z4ihl8W15Bkf0+g3GOMdj3J
 hrphnhAayr4NKuXgkI/SfoFrAiH+V00LabPA1IFm5Zgvs5DSX/Lygtcf/kNcCZeQ
 jI0jgy5HjmVhTyySw+htVicdW8xJ/rvXqJDguLkxiNEZN0PF4UQSPUX+ARBnMUDT
 Bn/RSGWrgMR5VI1+kpYephhv/PFmF6dKUFpSFrBqt/sZ7rVj4jJCZpNdLDmkaIYi
 v1H4eNN3p85KeApa4PJXRuXrxi5F5XRkGTlk2Sy51ClyFYT6WlzKUcOhMlG4HuzO
 oro2jl2LF9steg==
 =q3Ag
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Fix crashes on 64-bit Book3E due to use of Book3S only mtmsrd
  instruction.

  Fix "scheduling while atomic" warnings at boot due to preempt count
  underflow.

  Two commits fixing our handling of BPF atomic instructions.

  Fix error handling in xive when allocating an IPI.

  Fix lockup on kernel exec fault on 603.

  Thanks to Bharata B Rao, Cédric Le Goater, Christian Zigotzky,
  Christophe Leroy, Guenter Roeck, Jiri Olsa, Naveen N. Rao, Nicholas
  Piggin, and Valentin Schneider"

* tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/preempt: Don't touch the idle task's preempt_count during hotplug
  powerpc/64e: Fix system call illegal mtmsrd instruction
  powerpc/xive: Fix error handling when allocating an IPI
  powerpc/bpf: Reject atomic ops in ppc32 JIT
  powerpc/bpf: Fix detecting BPF atomic instructions
  powerpc/mm: Fix lockup on kernel exec fault
2021-07-09 10:26:52 -07:00
Aneesh Kumar K.V feac00aad1 powerpc/mm: enable HAVE_MOVE_PMD support
mremap HAVE_MOVE_PMD/PUD optimization time comparison for 1GB region:
1GB mremap - Source PTE-aligned, Destination PTE-aligned
  mremap time:      2292772ns
1GB mremap - Source PMD-aligned, Destination PMD-aligned
  mremap time:      1158928ns
1GB mremap - Source PUD-aligned, Destination PUD-aligned
  mremap time:        63886ns

Link: https://lkml.kernel.org/r/20210616045735.374532-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:23 -07:00
Valentin Schneider 2c669ef697 powerpc/preempt: Don't touch the idle task's preempt_count during hotplug
Powerpc currently resets a CPU's idle task preempt_count to 0 before
said task starts executing the secondary startup routine (and becomes an
idle task proper).

This conflicts with commit f1a0a376ca ("sched/core: Initialize the
idle task with preemption disabled").

which initializes all of the idle tasks' preempt_count to
PREEMPT_DISABLED during smp_init(). Note that this was superfluous
before said commit, as back then the hotplug machinery would invoke
init_idle() via idle_thread_get(), which would have already reset the
CPU's idle task's preempt_count to PREEMPT_ENABLED.

Get rid of this preempt_count write.

Fixes: f1a0a376ca ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210707183831.2106509-1-valentin.schneider@arm.com
2021-07-08 23:38:10 +10:00
Linus Torvalds 019b3fd94b powerpc updates for 5.14
- A big series refactoring parts of our KVM code, and converting some to C.
 
  - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs.
 
  - Support for the Microwatt soft-core.
 
  - Optimisations to our interrupt return path on 64-bit.
 
  - Support for userspace access to the NX GZIP accelerator on PowerVM on Power10.
 
  - Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
 
  - Other smaller features, fixes & cleanups.
 
 Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li,
 Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique
 Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley,
 Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
 Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun
 Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain,
 YueHaibing, Zhang Jianhua, Zhen Lei.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmDfFS4THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFxHEAC88NJ+Gz87LiTQFt6QjhziBaJUd+sY
 uqADPRROr4P50O8PjYZbMi2qbXzOlLkZO4wJWX7jpZ1F9KmbPNqY2shD8h4ahyge
 F/uqzBW1FXBJfnDEKdU2MzalkeTP+dwxLZyouUamjDCGNLFjOV4x/Fft5otOdXjO
 k9uO6yoGyOkWYzjC+Y/irNPlIDDByB/+bD92Cb52Y2mXMDDEnx4JzbtkeJW+8udT
 Sjn3bWzeL+dz5GehjMKwK4+SptNiyQGOgM8FwtnKUMvgzxv04DqCGjr9YC12L2Z7
 VoFZc4GzVgtf8DZg4fJ3KG5aG2nH3Tui7jc9lUckdrxixDAZw5wSG7CQ39gFb/5+
 7A4fEJk4Z3h5llibwxAZrC7wV8ZDDXn8oRFzRcOJjfxYaD+ohOOyWHIebwkdiXYx
 nfYI7sBcScDLXeBvHtDra2GJpbFSpVL3S/QNhhi1vKVNrFSyAgbAybcVL2xPLZ6+
 8Mh7A8xt+hf2bo9AXuYJDo9mwXWfg1093d0kT+AslcRhZioBk18c2AiZLIz0FzuL
 Ua/e5FPb99x9LSdcZHvaAXBoHT2iTgDyCyDa3gkIesyuRX6ggHoFcVQuvdDcbJ9d
 H8LK+Tahy1Y+E5b6KdtU8mDEGE+QG+CWLnwQ6YSCaL/MYgaFzNa32Jdj1fmztSBC
 cttP43kHZ7ljTw==
 =zo4d
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - A big series refactoring parts of our KVM code, and converting some
   to C.

 - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on
   some CPUs.

 - Support for the Microwatt soft-core.

 - Optimisations to our interrupt return path on 64-bit.

 - Support for userspace access to the NX GZIP accelerator on PowerVM on
   Power10.

 - Enable KUAP and KUEP by default on 32-bit Book3S CPUs.

 - Other smaller features, fixes & cleanups.

Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe
Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand,
Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe,
Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika
Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar
Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei.

* tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits)
  powerpc: Only build restart_table.c for 64s
  powerpc/64s: move ret_from_fork etc above __end_soft_masked
  powerpc/64s/interrupt: clean up interrupt return labels
  powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols
  powerpc/64: enable MSR[EE] in irq replay pt_regs
  powerpc/64s/interrupt: preserve regs->softe for NMI interrupts
  powerpc/64s: add a table of implicit soft-masked addresses
  powerpc/64e: remove implicit soft-masking and interrupt exit restart logic
  powerpc/64e: fix CONFIG_RELOCATABLE build warnings
  powerpc/64s: fix hash page fault interrupt handler
  powerpc/4xx: Fix setup_kuep() on SMP
  powerpc/32s: Fix setup_{kuap/kuep}() on SMP
  powerpc/interrupt: Use names in check_return_regs_valid()
  powerpc/interrupt: Also use exit_must_hard_disable() on PPC32
  powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE
  powerpc/ptrace: Refactor regs_set_return_{msr/ip}
  powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip}
  powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
  powerpc/pseries/vas: Include irqdomain.h
  powerpc: mark local variables around longjmp as volatile
  ...
2021-07-02 12:54:34 -07:00
Linus Torvalds 71bd934101 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "190 patches.

  Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
  vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
  migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
  zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
  core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
  signals, exec, kcov, selftests, compress/decompress, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
  ipc/util.c: use binary search for max_idx
  ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
  ipc: use kmalloc for msg_queue and shmid_kernel
  ipc sem: use kvmalloc for sem_undo allocation
  lib/decompressors: remove set but not used variabled 'level'
  selftests/vm/pkeys: exercise x86 XSAVE init state
  selftests/vm/pkeys: refill shadow register after implicit kernel write
  selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
  selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
  kcov: add __no_sanitize_coverage to fix noinstr for all architectures
  exec: remove checks in __register_bimfmt()
  x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
  hfsplus: report create_date to kstat.btime
  hfsplus: remove unnecessary oom message
  nilfs2: remove redundant continue statement in a while-loop
  kprobes: remove duplicated strong free_insn_page in x86 and s390
  init: print out unknown kernel parameters
  checkpatch: do not complain about positive return values starting with EPOLL
  checkpatch: improve the indented label test
  checkpatch: scripts/spdxcheck.py now requires python3
  ...
2021-07-02 12:08:10 -07:00
Kefeng Wang 63703f37aa mm: generalize ZONE_[DMA|DMA32]
ZONE_[DMA|DMA32] configs have duplicate definitions on platforms that
subscribe to them.  Instead, just make them generic options which can be
selected on applicable platforms.

Also only x86/arm64 architectures could enable both ZONE_DMA and
ZONE_DMA32 if EXPERT, add ARCH_HAS_ZONE_DMA_SET to make dma zone
configurable and visible on the two architectures.

Link: https://lkml.kernel.org/r/20210528074557.17768-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>	[RISC-V]
Acked-by: Michal Simek <michal.simek@xilinx.com>	[microblaze]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:30 -07:00
Linus Torvalds 21edf50948 Updates for the interrupt subsystem:
Core changes:
 
   - Cleanup and simplification of common code to invoke the low level
     interrupt flow handlers when this invocation requires irqdomain
     resolution. Add the necessary core infrastructure.
 
   - Provide a proper interface for modular PMU drivers to set the
     interrupt affinity.
 
   - Add a request flag which allows to exclude interrupts from spurious
     interrupt detection. Useful especially for IPI handlers which always
     return IRQ_HANDLED which turns the spurious interrupt detection into a
     pointless waste of CPU cycles.
 
 Driver changes:
 
   - Bulk convert interrupt chip drivers to the new irqdomain low level flow
     handler invocation mechanism.
 
   - Add device tree bindings for the Renesas R-Car M3-W+ SoC
 
   - Enable modular build of the Qualcomm PDC driver
 
   - The usual small fixes and improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmDbIg8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobZNEAC2wTq3Ishk026va7g5mbQVSvAQyf8G
 0msmgJ48lJWVL9a6JUogNcCO7sZCTcAy4CYbuHI6kz1fGZZnNWSCrtEz0rFNAdWE
 WVR2k8ExR2R73vJm+K50WUMMj8YsefRnIFXWlJdTp+pksr3TZ7Lo70taGUK/6tMo
 aL0dqvnf7Vb3LG0iIkaHWLF4HnyK/UGqB+121rlL4UhI1/g+3EUxNWNcY5eg/dmc
 Ym73U1uDsjydp3/3jm8v8NYNtwCDGpujZZc/88RFLjP6PMpF1S9JUvDEt+LHJi0a
 cdS3RreB78HYXpLg5NtDFJwIegRMLSitvCGPBjHvWBzbifkMsA2zWIb6Cs8VkYys
 vuPoEGZ0ol+SWvcnSh5Xy36nyr4iGIBhQql47UAaqelSxsYPjvCCSD4yJV3k8hnC
 ZuDscOekXUMn75qZR0quNdi1SkgKpGZxK73QFbuW3Apl5EgArVai6kq0rbl6zlx6
 ACy0SEcevhOcpU6WpqDgrmUBgFr+M8zina8edRELgiFEuWT6pYxKwrN3pT4U5djO
 e5V3YuNzzwzvtUoXN4AiTlT8gwRiGfgeiEvHpvZBXPNvk5ffS6XzPiV81ZMWiBkb
 ReoCbqME3PKoxj1VAHJvVXHbcjiPIJeCRdV+5vQSNh1SPSQOmEdWyJtNUDrSkoym
 QkKKY5jrOhPhlQ==
 =FIKh
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates for the interrupt subsystem:

  Core changes:

   - Cleanup and simplification of common code to invoke the low level
     interrupt flow handlers when this invocation requires irqdomain
     resolution. Add the necessary core infrastructure.

   - Provide a proper interface for modular PMU drivers to set the
     interrupt affinity.

   - Add a request flag which allows to exclude interrupts from spurious
     interrupt detection. Useful especially for IPI handlers which
     always return IRQ_HANDLED which turns the spurious interrupt
     detection into a pointless waste of CPU cycles.

  Driver changes:

   - Bulk convert interrupt chip drivers to the new irqdomain low level
     flow handler invocation mechanism.

   - Add device tree bindings for the Renesas R-Car M3-W+ SoC

   - Enable modular build of the Qualcomm PDC driver

   - The usual small fixes and improvements"

* tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  dt-bindings: interrupt-controller: arm,gic-v3: Describe GICv3 optional properties
  irqchip: gic-pm: Remove redundant error log of clock bulk
  irqchip/sun4i: Remove unnecessary oom message
  irqchip/irq-imx-gpcv2: Remove unnecessary oom message
  irqchip/imgpdc: Remove unnecessary oom message
  irqchip/gic-v3-its: Remove unnecessary oom message
  irqchip/gic-v2m: Remove unnecessary oom message
  irqchip/exynos-combiner: Remove unnecessary oom message
  irqchip: Bulk conversion to generic_handle_domain_irq()
  genirq: Move non-irqdomain handle_domain_irq() handling into ARM's handle_IRQ()
  genirq: Add generic_handle_domain_irq() helper
  irqchip/nvic: Convert from handle_IRQ() to handle_domain_irq()
  irqdesc: Fix __handle_domain_irq() comment
  genirq: Use irq_resolve_mapping() to implement __handle_domain_irq() and co
  irqdomain: Introduce irq_resolve_mapping()
  irqdomain: Protect the linear revmap with RCU
  irqdomain: Cache irq_data instead of a virq number in the revmap
  irqdomain: Use struct_size() helper when allocating irqdomain
  irqdomain: Make normal and nomap irqdomains exclusive
  powerpc: Move the use of irq_domain_add_nomap() behind a config option
  ...
2021-06-29 12:25:04 -07:00
Michael Ellerman c736fb9705 powerpc/pseries/vas: Include irqdomain.h
There are patches in flight to break the dependency between asm/irq.h
and linux/irqdomain.h, which would break compilation of vas.c because it
needs the declaration of irq_create_mapping() etc.

So add an explicit include of irqdomain.h to avoid that becoming a
problem in future.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210625045337.3197833-1-mpe@ellerman.id.au
2021-06-25 14:53:47 +10:00
Nathan Lynch bfb0c9fcf5 powerpc/pseries/dlpar: use rtas_get_sensor()
Instead of making bare calls to get-sensor-state, use
rtas_get_sensor(), which correctly handles busy and extended delay
statuses.

Fixes: ab519a011c ("powerpc/pseries: Kernel DLPAR Infrastructure")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210504025329.1713878-1-nathanl@linux.ibm.com
2021-06-25 14:47:20 +10:00
Kajol Jain d2827e5e2e powerpc/papr_scm: trivial: fix typo in a comment
There is a spelling mistake "byes" -> "bytes" in a comment of
function drc_pmem_query_stats(). Fix that typo.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210418074003.6651-1-kjain@linux.ibm.com
2021-06-25 14:47:19 +10:00
Michael Ellerman 9583922563 powerpc: Fix is_kvm_guest() / kvm_para_available()
Commit a21d1becaa ("powerpc: Reintroduce is_kvm_guest() as a fast-path
check") added is_kvm_guest() and changed kvm_para_available() to use it.

is_kvm_guest() checks a static key, kvm_guest, and that static key is
set in check_kvm_guest().

The problem is check_kvm_guest() is only called on pseries, and even
then only in some configurations. That means is_kvm_guest() always
returns false on all non-pseries and some pseries depending on
configuration. That's a bug.

For PR KVM guests this is noticable because they no longer do live
patching of themselves, which can be detected by the omission of a
message in dmesg such as:

  KVM: Live patching for a fast VM worked

To fix it make check_kvm_guest() an initcall, to ensure it's always
called at boot. It needs to be core so that it runs before
kvm_guest_init() which is postcore. To be an initcall it needs to return
int, where 0 means success, so update that.

We still call it manually in pSeries_smp_probe(), because that runs
before init calls are run.

Fixes: a21d1becaa ("powerpc: Reintroduce is_kvm_guest() as a fast-path check")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210623130514.2543232-1-mpe@ellerman.id.au
2021-06-25 14:47:19 +10:00
Michael Ellerman 24d33ac5b8 powerpc/64s: Make prom_init require RELOCATABLE
When we boot from open firmware (OF) using PPC_OF_BOOT_TRAMPOLINE, aka.
prom_init, we run parts of the kernel at an address other than the link
address. That happens because OF loads the kernel above zero (OF is at
zero) and we run prom_init before copying the kernel down to zero.

Currently that works even for non-relocatable kernels, because we do
various fixups to the prom_init code to make it run where it's loaded.

However those fixups are not sufficient if the kernel becomes large
enough. In that case prom_init()'s final call to __start() can end up
generating a plt branch:

bl      c000000002000018 <00000078.plt_branch.__start>

That results in the kernel jumping to the linked address of __start,
0xc000000000000000, when really it needs to jump to the
0xc000000000000000 + the runtime address because the kernel is still
running at the load address.

We could do further shenanigans to handle that, see Jordan's patch for
example:
  https://lore.kernel.org/linuxppc-dev/20210421021721.1539289-1-jniethe5@gmail.com

However it is much simpler to just require a kernel with prom_init() to
be built relocatable. The result works in all configurations without
further work, and requires less code.

This should have no effect on most people, as our defconfigs and
essentially all distro configs already have RELOCATABLE enabled.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210623130454.2542945-1-mpe@ellerman.id.au
2021-06-25 14:47:19 +10:00
Vaibhav Jain de21e1377c powerpc/papr_scm: Add support for reporting dirty-shutdown-count
Persistent memory devices like NVDIMMs can loose cached writes in case
something prevents flush on power-fail. Such situations are termed as
dirty shutdown and are exposed to applications as
last-shutdown-state (LSS) flag and a dirty-shutdown-counter(DSC) as
described at [1]. The latter being useful in conditions where multiple
applications want to detect a dirty shutdown event without racing with
one another.

PAPR-NVDIMMs have so far only exposed LSS style flags to indicate a
dirty-shutdown-state. This patch further adds support for DSC via the
"ibm,persistence-failed-count" device tree property of an NVDIMM. This
property is a monotonic increasing 64-bit counter thats an indication
of number of times an NVDIMM has encountered a dirty-shutdown event
causing persistence loss.

Since this value is not expected to change after system-boot hence
papr_scm reads & caches its value during NVDIMM probe and exposes it
as a PAPR sysfs attributed named 'dirty_shutdown' to match the name of
similarly named NFIT sysfs attribute. Also this value is available to
libnvdimm via PAPR_PDSM_HEALTH payload. 'struct nd_papr_pdsm_health'
has been extended to add a new member called 'dimm_dsc' presence of
which is indicated by the newly introduced PDSM_DIMM_DSC_VALID flag.

References:
[1] https://pmem.io/documents/Dirty_Shutdown_Handling-V1.0.pdf

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210624080621.252038-1-vaibhav@linux.ibm.com
2021-06-25 14:47:18 +10:00
Vaibhav Jain ed78f56e12 powerpc/papr_scm: Make 'perf_stats' invisible if perf-stats unavailable
In case performance stats for an nvdimm are not available, reading the
'perf_stats' sysfs file returns an -ENOENT error. A better approach is
to make the 'perf_stats' file entirely invisible to indicate that
performance stats for an nvdimm are unavailable.

So this patch updates 'papr_nd_attribute_group' to add a 'is_visible'
callback implemented as newly introduced 'papr_nd_attribute_visible()'
that returns an appropriate mode in case performance stats aren't
supported in a given nvdimm.

Also the initialization of 'papr_scm_priv.stat_buffer_len' is moved
from papr_scm_nvdimm_init() to papr_scm_probe() so that it value is
available when 'papr_nd_attribute_visible()' is called during nvdimm
initialization.

Even though 'perf_stats' attribute is available since v5.9, there are
no known user-space tools/scripts that are dependent on presence of its
sysfs file. Hence I dont expect any user-space breakage with this
patch.

Fixes: 2d02bf835e ("powerpc/papr_scm: Fetch nvdimm performance stats from PHYP")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210513092349.285021-1-vaibhav@linux.ibm.com
2021-06-25 14:47:18 +10:00