Commit graph

1248 commits

Author SHA1 Message Date
Dave Hansen
20f4d69243 EDAC, {sb,skx}_edac: Use Intel model macros instead of open-coding them
We now have symbolic names for a bunch of Intel CPU models via
asm/intel-family.h. The original conversion missed the EDAC drivers.
Convert them.

Signed-off-by: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160929204321.9FAE5F84@viggo.jf.intel.com
[ Remove comment, macro name is descriptive enough. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-10-19 12:32:40 +02:00
Linus Torvalds
19fe416532 * Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO
buffers (Thor Thayer)
 
 * Split the memory controller part out of mpc85xx and share it with a
 * new Freescale ARM Layerscape driver (York Sun)
 
 * amd64_edac fixes (Yazen Ghannam)
 
 * Misc cleanups, refactoring and fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX80EwAAoJEBLB8Bhh3lVKWwYP/1bbODQ7o+XhO8IaCDffYk30
 8y4WdSnI0/QcP8JbSvFA7y6Zn4L0BbrbYhKLRDAg9c34V2bMaqonCnkDtT6YatUb
 6l0H/hQ/Cah9AOm5PJLYg6O9s+ZBT8zA5b+F2Z9kUsuB6LSnVhp9skNrH6KPlm0U
 4pFaLnHQenQPZbuRCfRxPU49ZuKtBZtQDkJLJlHXwn7e1qZy2Q4tMnnEtsY6U2ea
 t3Hj+F8g+cdoiTQXOceCcOTR8GqDI6szgzn7vpXAGYvljBndszauAkxO7by79jg1
 I8AQfgwoBF5CYL2Q0pzT1maHmmG2sydeRAHIvhmGxiEfFz1abWhriXbS33c32q8a
 iFiVMAUIaSKpB/sB+5w5ymuBctI1mX5EQVW+8Xl2Gxt+olnhdJMocHnvQdYkfsYm
 Ka8LcbaiK6ZQTbs/cIMOc2paE0AFPu5uXKHCPeZlhQAxOBvSPuDAv0+qUB/of5Uq
 1SPidtsTmCI7X2hrdHAH9hLEkSjq68v3kqL5YnZL3H4gA3WohQEmX9ybjk097Kus
 WWEhdi/PSFX0qQKotMUUDuxfNcKI6PZH9p+i2dN6tNCkiTDdb0Eo5lCXN7RVVhvq
 qfE0Fcc4uDzh5MUS5jT58MWpA1cfdu9jbAf2BwFIU/poJcaeqy/SMyzCL+1D2/u6
 dmDAtQbKUUwiltB8QzQd
 =pcI8
 -----END PGP SIGNATURE-----

Merge tag 'edac_for_4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:
 "A lot of movement in the EDAC tree this time around, coarse summary
  below:

   - Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO
     buffers (Thor Thayer)

   - split the memory controller part out of mpc85xx and share it with a
     new Freescale ARM Layerscape driver (York Sun)

   - amd64_edac fixes (Yazen Ghannam)

   - misc cleanups, refactoring and fixes all over the place"

* tag 'edac_for_4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (37 commits)
  EDAC, altera: Add IRQ Flags to disable IRQ while handling
  EDAC, altera: Correct EDAC IRQ error message
  EDAC, amd64: Autoload module using x86_cpu_id
  EDAC, sb_edac: Remove NULL pointer check on array pci_tad
  EDAC: Remove NO_IRQ from powerpc-only drivers
  EDAC, fsl_ddr: Fix error return code in fsl_mc_err_probe()
  EDAC, fsl_ddr: Add entry to MAINTAINERS
  EDAC: Move Doug Thompson to CREDITS
  EDAC, I3000: Orphan driver
  EDAC, fsl_ddr: Replace simple_strtoul() with kstrtoul()
  EDAC, layerscape: Add Layerscape EDAC support
  EDAC, fsl_ddr: Fix IRQ dispose warning when module is removed
  EDAC, fsl_ddr: Add support for little endian
  EDAC, fsl_ddr: Add missing DDR DRAM types
  EDAC, fsl_ddr: Rename macros and names
  EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx
  EDAC, mpc85xx: Replace printk() with pr_* format
  EDAC, mpc85xx: Drop setting/clearing RFXE bit in HID1
  EDAC, altera: Rename MC trigger to common name
  EDAC, altera: Rename device trigger to common name
  ...
2016-10-04 12:06:26 -07:00
Thor Thayer
a29d64a45e EDAC, altera: Add IRQ Flags to disable IRQ while handling
Add the IRQF_ONESHOT and IRQF_TRIGGER_HIGH flags to disable the IRQ
while executing the IRQ handler. Remove the IRQF_SHARED because these
are not shared IRQs in the domain. Exposed when flooding IRQs.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1474582419-7053-2-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-23 12:03:34 +02:00
Thor Thayer
3763569f4c EDAC, altera: Correct EDAC IRQ error message
Correct the error message sent out in the case of a single bit error IRQ
allocation.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1474582419-7053-1-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-23 11:52:39 +02:00
Yazen Ghannam
d6efab74f6 EDAC, amd64: Autoload module using x86_cpu_id
Reinstate driver autoloading now that PCI dependency is gone.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1473984445-1726-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-21 12:48:15 +02:00
Yazen Ghannam
a884675b87 x86/MCE/AMD, EDAC: Handle reserved bank 4 on Fam17h properly
Bank 4 is reserved on family 0x17 and shouldn't generate any MCE
records. However, broken hardware and software is not something unheard
of so warn about bank 4 errors. They shouldn't be coming from bank 4
naturally but users can still use mce_amd_inj to simulate errors from it
for testing purposed.

Also, avoid special handling in the injector mce_amd_inj like it is
being done on the older families.

[ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ]

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Aravind Gopalakrishnan  <aravindksg.lkml@gmail.com>
Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com
Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:14 +02:00
Yazen Ghannam
4b711f92c9 x86/mce, EDAC/mce_amd: Print MCA_SYND and MCA_IPID during MCE on SMCA systems
The MCA_SYND and MCA_IPID registers contain valuable information and
should be included in MCE output. The MCA_SYND register contains
syndrome and other error information, and the MCA_IPID register will
uniquely identify the MCA bank's type without having to rely on system
software.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472680624-34221-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:13 +02:00
Yazen Ghannam
5896820e0a x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types
Scalable MCA defines a number of IP types. An MCA bank on an SMCA
system is defined as one of these IP types. A bank's type is uniquely
identified by the combination of the HWID and MCATYPE values read from
its MCA_IPID register.

Add the required tables in order to be able to lookup error descriptions
based on a bank's type and the error's extended error code.

[ bp: Align comments, simplify a bit. ]

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472741832-1690-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:10 +02:00
Yazen Ghannam
856095b179 EDAC/mce_amd: Use SMCA prefix for error descriptions arrays
The error descriptions defined for Fam17h can be reused for other SMCA
systems, so their names should reflect this.

Change f17h prefix to smca for error descriptions.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472673994-12235-4-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:09 +02:00
Yazen Ghannam
c019b951e1 EDAC/mce_amd: Add missing SMCA error descriptions
Add missing SMCA error descriptions to the error descriptions arrays.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472673994-12235-3-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:09 +02:00
Yazen Ghannam
b300e87300 EDAC/mce_amd: Print syndrome register value on SMCA systems
Print SyndV bit status and print the raw value of the MCA_SYND register.
Further decoding of the syndrome from struct mce.synd can be done in
other places where appropriate, e.g. DRAM ECC.

Boris: make the error stanza more compact by putting the error address
and syndrome on the same line:

  [Hardware Error]: Corrected error, no action required.
  [Hardware Error]: CPU:2 (17:0:0) MC4_STATUS[-|CE|-|PCC|AddrV|-|-|SyndV|CECC]: 0x96204100001e0117
  [Hardware Error]: Error Addr: 0x000000007f4c52e3, Syndrome: 0x0000000000000000
  [Hardware Error]: Invalid IP block specified.
  [Hardware Error]: cache level: L3/GEN, tx: DATA, mem-tx: RD

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1467633035-32080-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:07 +02:00
Colin Ian King
c7c35407cd EDAC, sb_edac: Remove NULL pointer check on array pci_tad
pvt->pci_tad is a NUM_CHANNELS array of struct pci_dev pointers and
hence cannot be NULL, so the NULL pointer check on pci_tad is redundant.
Remove it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160908083801.14766-1-colin.king@canonical.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-12 20:15:43 +02:00
Michael Ellerman
372095723a EDAC: Remove NO_IRQ from powerpc-only drivers
We'd like to eventually remove NO_IRQ on powerpc, so remove usages of it
from powerpc-only drivers.

The pdata structs are kzalloc'ed, so we don't need to initialise those
to 0, we can just drop the assignments entirely.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/1473674436-19467-1-git-send-email-mpe@ellerman.id.au
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-12 13:04:56 +02:00
Wei Yongjun
43fa9ba632 EDAC, fsl_ddr: Fix error return code in fsl_mc_err_probe()
Return negative error code from the edac_mc_add_mc() error handling case
instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1473350284-26482-1-git-send-email-weiyj.lk@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-09 19:27:22 +02:00
York Sun
f47ae798d8 EDAC, fsl_ddr: Replace simple_strtoul() with kstrtoul()
Replace obsolete simple_strtoul() with kstrtoul().

Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471990593-27536-1-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:28:03 +02:00
York Sun
eeb3d68b6c EDAC, layerscape: Add Layerscape EDAC support
Add DDR EDAC driver for ARM-based compatible controllers. Both
big-endian and little-endian are supported, as specified in device tree.

Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471990465-27443-1-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:28:03 +02:00
York Sun
55764ed37e EDAC, fsl_ddr: Fix IRQ dispose warning when module is removed
When compiled as a module, removing it causes kernel warnings
when irq_dispose_mapping() is called. Instead of calling
irq_of_parse_and_map(), use platform_get_irq() to acquire the IRQ
number.

Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: morbidrsa@gmail.com
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-8-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:28:02 +02:00
York Sun
339fdff14c EDAC, fsl_ddr: Add support for little endian
Get endianness from device tree. Both big endian and little endian are
supported. Default to big endian for backwards compatibility to MPC85xx.

Signed-off-by: York Sun <york.sun@nxp.com>
Acked-by: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: morbidrsa@gmail.com
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-7-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:28:02 +02:00
York Sun
4e2c3252d2 EDAC, fsl_ddr: Add missing DDR DRAM types
The compatible DDR controllers may support DDR, DDR2, DDR3, DDR4 DRAM.
An individual controller doesn't support all of them. The EDAC driver
reads SDRAM_CFG to determine which mode is configured.

Add DDR4 and drop the defines used only in the mtype assignment.

Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: morbidrsa@gmail.com
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-6-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:28:01 +02:00
York Sun
d43a9fb202 EDAC, fsl_ddr: Rename macros and names
Use FSL-specific prefix for macros, variables and functions.

Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-5-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:28:01 +02:00
York Sun
ea2eb9a8b6 EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx
The mpc85xx-compatible DDR controllers are used on ARM-based SoCs too.
Carve out the DDR part from the mpc85xx EDAC driver in preparation to
support both architectures.

Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470946525-3410-1-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:28:00 +02:00
York Sun
88857ebe71 EDAC, mpc85xx: Replace printk() with pr_* format
Replace printk() with pr_err/pr_warn/pr_info macros.

Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-3-git-send-email-york.sun@nxp.com
[ Boris: unbreak strings for easier greppability. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:27:59 +02:00
York Sun
9e6a03a044 EDAC, mpc85xx: Drop setting/clearing RFXE bit in HID1
On e500v1, read fault exception enable (RFXE) controls whether assertion
of core_fault_in causes a machine check interrupt. Assertion of
core_fault_in can result from uncorrectable data error, such as an L2
multi-bit ECC error. It can also occur from a system error if logic on
the integrated device signals a fault for nonfatal errors. RFXE bit is
cleared out of reset, and should be left clear for normal operation.
Assertion of core_fault_in does not cause a machine check.

RFXE is set specifically for RIO (Rapid IO) and PCI for book E to catch
the errors by machine check. With this bit set, the EDAC driver can't
get the interrupt in case of uncorrectable error. So this bit is cleared
in favor of EDAC. However, the benefit of catching such uncorrectable
error doesn't outweigh the other errors which may hang the system.
Besides, e500v2 has different errors masked by RFXE, and e500mc doesn't
support this bit. It is more reasonable to leave RFXE as is in the EDAC
driver, and leave the uncorrectable errors triggering machine check for
e500v1.

Suggested-by: Scott Wood <oss@buserror.net>
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-2-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 10:27:59 +02:00
Thor Thayer
b8978badc4 EDAC, altera: Rename MC trigger to common name
Rename the Memory Controller debug trigger to the same common name as
the EDAC devices.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471622666-15197-3-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 09:06:42 +02:00
Thor Thayer
f399f34bdb EDAC, altera: Rename device trigger to common name
The L2 and OCRAM devices have different ecc trigger names than the other
EDAC devices (FIFO peripherals). Make them all the same and remove the
character array from the device structure.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471622666-15197-2-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-09-01 08:55:24 +02:00
Tony Luck
4ec656bdf4 EDAC, skx_edac: Add EDAC driver for Skylake
This is an entirely new driver instead of yet another set of patches
to sb_edac.c because:

1) Mapping from PCI devices to socket/memory controller is significantly
   different. Skylake scatters devices on a socket across a number of
   PCI buses.
2) There is an extra level of interleaving via the "mcroute" register
   that would be a little messy to squeeze into the old driver.
3) Validation is getting too expensive. Changes to sb_edac need to
   be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and
   Knights Landing.

Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-21 10:58:34 -07:00
Tillmann Heidsieck
6fa06b0d9e EDAC, mpc85xx: Fix PCIe error capture
According to the reference manual of MPC8572 and T4240, bit 31 of
PEX_ERR_CAP_STAT is W1C (write 1 to clear).

Add the corresponding write to PEX_ERR_CAP_STAT in order to fix the PCIe
error capture.

Tested on a T4240 processor.

Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de>
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160815190849.29327-1-theidsieck@leenox.de
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-18 10:17:40 +02:00
Bhaktipriya Shridhar
7bb8b77779 EDAC, wq: Remove deprecated create_singlethread_workqueue()
Replace the deprecated create_singlethread_workqueue() with
alloc_ordered_workqueue() with WQ_MEM_RECLAIM. This is the identity
conversion.

It's not recommended to stall it from memory pressure. Hence,
WQ_MEM_RECLAIM has been set to ensure forward progress under memory
pressure.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160813164124.GA9077@Karyakshetra
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-15 07:21:29 +02:00
Wei Yongjun
9bcd919eb8 EDAC, altera: Make a10_eccmgr_ic_ops static
Fix the following sparse warning:

  drivers/edac/altera_edac.c:1649:23: warning:
   symbol 'a10_eccmgr_ic_ops' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Reviewed-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>
Link: http://lkml.kernel.org/r/1470836667-11822-1-git-send-email-weiyj.lk@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-10 18:37:06 +02:00
Thor Thayer
911049845d EDAC, altera: Add Arria10 SD-MMC EDAC support
Add Altera Arria10 SD-MMC FIFO memory EDAC support. The SD-MMC is a
dual port RAM implementation which is different than any of the other
peripherals and therefore requires additional code.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1470753653-23465-3-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-10 14:43:14 +02:00
Yazen Ghannam
dc0a50a841 EDAC, amd64: Fix channel decode on Fam15hMod60h systems
Fam15hMod60h systems are using the channel decode of Fam15hMod30h which
gives incorrect results. Fam15hMod60h systems should use the generic
channel decode method plus a couple more cases.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1470236355-30039-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-08 05:59:42 +02:00
Thor Thayer
485fe9e24e EDAC, altera: Add Arria10 QSPI support
Add Altera Arria10 QSPI FIFO memory support.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-9-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-08 05:59:39 +02:00
Thor Thayer
c609581d1f EDAC, altera: Add Arria10 USB support
Add Altera Arria10 USB FIFO memory support.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-8-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-08 05:59:37 +02:00
Thor Thayer
e8263793b7 EDAC, altera: Add Arria10 DMA support
Add Altera Arria10 DMA FIFO memory support.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-7-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-08 05:59:36 +02:00
Thor Thayer
c6882fb2e8 EDAC, altera: Add Arria10 NAND support
Add Altera Arria10 NAND FIFO memory support.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-6-git-send-email-tthayer@opensource.altera.com
[ Reformat loop in altr_edac_a10_probe() for better readability. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-08 05:59:35 +02:00
Lukasz Odzioba
c5b48fa7e2 EDAC, sb_edac: Fix channel reporting on Knights Landing
On Intel Xeon Phi Knights Landing processor family the channels of the
memory controller have untypical arrangement - MC0 is mapped to CH3,4,5
and MC1 is mapped to CH0,1,2. This causes the EDAC driver to report the
channel name incorrectly.

We missed this change earlier, so the code already contains similar
comment, but the translation function is incorrect.

Without this patch:
  errors in DIMM_A and DIMM_D were reported in DIMM_D
  errors in DIMM_B and DIMM_E were reported in DIMM_E
  errors in DIMM_C and DIMM_F were reported in DIMM_F

Correct this.

Hubert Chrzaniuk:
 - rebased to 4.8
 - comments and code cleanup

Fixes: d0cdf90031 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lukasz.anaczkowski@intel.com
Cc: lukasz.odzioba@intel.com
Cc: mchehab@kernel.org
Cc: <stable@vger.kernel.org> # v4.5..
Link: http://lkml.kernel.org/r/1469231089-22837-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
[ Boris: Simplify a bit by removing char mc. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-08-08 05:52:08 +02:00
Linus Torvalds
c79a14defb * Altera Arria10 ethernet FIFO buffer support (Thor Thayer)
* Minor cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXlbvGAAoJEBLB8Bhh3lVK9g0P/16Z5/WIqrrLjJegcZ3qKb4m
 u2b5FBwbTDQyj/smZmfZfhZUleO7HDayL188fHPKO11my+Mvjyws8IZCvHshy7+3
 b6Krno8NXZuNYtm32srSQQoLE4eN5pookOPMe2KJcRFghwnDqOo0ZfMZ8Lt6RVGg
 YtqiVZ3DwaqoHUbvxkHvdJuMbtVEZIEC6SzK9t4URmdncAx+3/BTLqDSYc4LaL+B
 tI6REgMy+QT1IUMImuUbOA5ktd78K+D26YKF9NuTSWNQvwzjSYy4y6WmDsjYpZF6
 h6YWgGwERIy6gws5IdvkBMKX0s1RV7/sY6xFBsEQrX+IFywyY5OcI1gPRttD6VAn
 ORJ4j7WNH8Phqld9YxjEX8yTwGxlxge88J/lb0i9s5jXPMX7ioxAqAjdpsJj6PlA
 sNDP0fTyCECqME73cWaAS9bdrzB2dgOxEDfcWKWR24vZeHfnh72owD1k+J0iWu9c
 0R2usPB1Yjbo6ODGpqG1R+u09LR+RpNe38oT5GcsFrj6/EaGz0kn2QulaNtylsDo
 J5qkm02X1h3+F8/UHrzKavOGk82IQzLCbcW7eHPmyzkH8PTe1semLk9jHnXN5sK3
 uL2nirZQVIF6QLbYxf0Uf2oSVs05xc7u9ZT+QOl6hqh/6/K7DWC45SoABwMXDaHH
 jXr+y4vtr5IR8GQfZ/nd
 =mdxx
 -----END PGP SIGNATURE-----

Merge tag 'edac_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:
 "This last cycle, Thor was busy adding Arria10 eth FIFO support to the
  altera_edac driver along with other improvements.  We have two
  cleanups/fixes too.

  Summary:

   - Altera Arria10 ethernet FIFO buffer support (Thor Thayer)

   - Minor cleanups"

* tag 'edac_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  ARM: dts: Add Arria10 Ethernet EDAC devicetree entry
  EDAC, altera: Add Arria10 Ethernet EDAC support
  EDAC, altera: Add Arria10 ECC memory init functions
  Documentation: dt: socfpga: Add Arria10 Ethernet binding
  EDAC, altera: Drop some ifdeffery
  EDAC, altera: Add panic flag check to A10 IRQ
  EDAC, altera: Check parent status for Arria10 EDAC block
  EDAC, altera: Make all private data structures static
  EDAC: Correct channel count limit
  EDAC, amd64_edac: Init opstate at the proper time during init
  EDAC, altera: Handle Arria10 SDRAM child node
  EDAC, altera: Add ECC Manager IRQ controller support
  Documentation: dt: socfpga: Add interrupt-controller to ecc-manager
2016-07-27 13:40:47 -07:00
Tony Luck
0ba169ac36 EDAC, sb_edac: Fix Knights Landing
In commit 2c1ea4c700 ("EDAC, sb_edac: Use cpu family/model in driver
detection") I broke Knights Landing because I failed to notice that it
called a wrapper macro "sbridge_get_all_devices_knl" instead of
"sbridge_get_all_devices" like all the other types.

Now that we include the processor type in the pci_id_table structure we
can skip the wrappers and just have the sbridge_get_all_devices() check
the type to decide whether to allow duplicate devices and controllers to
have registers spread across buses.

Fixes: 2c1ea4c700 ("EDAC, sb_edac: Use cpu family/model in driver detection")
Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-16 06:11:59 +09:00
Thor Thayer
ab8c1e0fb0 EDAC, altera: Add Arria10 Ethernet EDAC support
Add Altera Arria10 Ethernet FIFO memory EDAC support. Update to support
a common compatibility string for all Ethernet FIFOs in the DT.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-8-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-25 11:31:34 +02:00
Thor Thayer
1166fde93d EDAC, altera: Add Arria10 ECC memory init functions
In preparation for additional memory module ECCs, add the memory
initialization functions and helpers.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-7-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-24 21:16:38 +02:00
Thor Thayer
6b300fb953 EDAC, altera: Drop some ifdeffery
Make the IRQ and check_deps() functions available to all the memory
buffers by moving them outside of the OCRAM only area.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-5-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-24 14:18:33 +02:00
Thor Thayer
2b083d65ff EDAC, altera: Add panic flag check to A10 IRQ
In preparation for additional memory module ECCs, the IRQ function will
check a panic flag before doing a kernel panic on double bit errors.

OCRAM uncorrectable errors cause a panic because sleep/resume functions
and FPGA contents during sleep are stored in OCRAM.

ECCs on peripheral FIFO buffers will not cause a kernel panic on DBERRs
because the packet can be retried and therefore recovered.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-3-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-24 11:58:43 +02:00
Thor Thayer
44ec9b307e EDAC, altera: Check parent status for Arria10 EDAC block
In preparation for the Arria10 ECC modules, check the status of the
parent in the device tree to ensure the block is enabled. Skip if no
parent phandle is set in the device tree.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-2-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-24 11:58:42 +02:00
Thor Thayer
1cf7037724 EDAC, altera: Make all private data structures static
The device private data structures are used only here so make them
static.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-4-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-24 11:58:42 +02:00
Borislav Petkov
bba142957e EDAC: Correct channel count limit
c44696fff0 ("EDAC: Remove arbitrary limit on number of channels")
lifted the arbitrary limit on memory controller channels in EDAC.
However, the dynamic channel attributes dynamic_csrow_dimm_attr and
dynamic_csrow_ce_count_attr remained 6.

This wasn't a problem except channels 6 and 7 weren't visible in sysfs
on machines with more than 6 channels after the conversion to static
attr groups with

  2c1946b6d6 ("EDAC: Use static attribute groups for managing sysfs entries")

 [ without that, we're exploding in edac_create_sysfs_mci_device()
   because we're dereferencing out of the bounds of the
   dynamic_csrow_dimm_attr array. ]

Add attributes for channels 6 and 7 along with a guard for the
future, should more channels be required and/or to sanity check for
misconfigured machines.

We still need to check against the number of channels present on the MC
first, as Thor reported.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reported-by: Hironobu Ishii <ishii.hironobu@jp.fujitsu.com>
Tested-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: <stable@vger.kernel.org> # 4.2
2016-06-16 10:06:35 +02:00
Borislav Petkov
6ba92fea1b EDAC, amd64_edac: Init opstate at the proper time during init
It is useless to do it if we're loaded on unsupported hardware so do
that only after we have detected at least 1 supported AMD northbridge.

Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-16 01:13:18 +02:00
Thor Thayer
ab564cb51e EDAC, altera: Handle Arria10 SDRAM child node
Separate the device match arrays for each platform to prevent CycloneV
matches when calling of_platform_populate() on the Arria10 ECC manager
node.

If the SDRAM is a child node of ECC manager, call probe function via
of_platform_populate().

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1464193783-5071-4-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-08 13:23:09 +02:00
Thor Thayer
13ab8448d2 EDAC, altera: Add ECC Manager IRQ controller support
To better support child devices, the ECC manager needs to be
implemented as an IRQ controller.

Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1465331757-10227-1-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-08 13:23:01 +02:00
Tony Luck
665f05e0b8 EDAC, sb_edac: Readd accidentally dropped Broadwell-D support
In commit

  2c1ea4c700 ("EDAC, sb_edac: Use cpu family/model in driver detection")

we switched from using PCI ids to determine which platform we are
running on to using CPU model instead.

I forgot that Broadwell-DE has its own distinct model number different
from Broadwell-EP or -EX.

Fixing this isn't just adding a line to the array of cpuids - the
exising code assumed a 1:1 mapping between entries in that array and the
"enum type" values. Added the type to pci_id_table structure to remove
this dependency and allows two Broadwell cpu models.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: 2c1ea4c700 ("EDAC, sb_edac: Use cpu family/model in driver detection")
Link: http://lkml.kernel.org/r/b3cffe40dec6dfe0235a5d52a504f0ba86a07ce7.1464902605.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-03 17:28:21 +02:00
Nicholas Krause
fbedcaf43f EDAC: Fix workqueues poll period resetting
After the workqueue cleanup, we're registering workqueues based on
the presence of an ->edac_check function. When that is the case,
we're setting OP_RUNNING_POLL. But we forgot to check that in
edac_mc_reset_delay_period(), leading to:

  BUG: unable to handle kernel paging request at 0000000000015d10
  IP: [ .. ] queued_spin_lock_slowpath
  PGD 3ffcc8067 PUD 3ffc56067 PMD 0
  Oops: 0002 [#1] SMP
  Modules linked in: ...
  CPU: 1 PID: 2792 Comm: edactest Not tainted 4.6.0-dirty #1
  Hardware name: HP ProLiant MicroServer, BIOS O41     10/01/2013
  Stack:
  Call Trace:
    ? _raw_spin_lock_irqsave
    ? lock_timer_base.isra.34
    ? del_timer
    ? try_to_grab_pending
    ? mod_delayed_work_on
    ? edac_mc_reset_delay_period
    ? edac_set_poll_msec
    ? param_attr_store
    ? module_attr_store
    ? kernfs_fop_write
    ? __vfs_write
    ? __vfs_read
    ? __alloc_fd
    ? vfs_write
    ? SyS_write
    ? entry_SYSCALL_64_fastpath
  Code:
  RIP  [ .. ] queued_spin_lock_slowpath
   RSP <>
  CR2: 0000000000015d10
  ---[ end trace 3f286bc71cca15d1 ]---
  Kernel panic - not syncing: Fatal exception

Fix it.

Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Cc: <stable@vger.kernel.org> # 4.5
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1463697958-13406-1-git-send-email-xerofoify@gmail.com
[ Rewrite commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2016-06-03 11:14:27 +02:00