qemu/qapi/cxl.json
Jonathan Cameron 9547754f40 hw/cxl: QMP based poison injection support
Inject poison using QMP command cxl-inject-poison to add an entry to the
poison list.

For now, the poison is not returned CXL.mem reads, but only via the
mailbox command Get Poison List. So a normal memory read to an address
that is on the poison list will not yet result in a synchronous exception
(and similar for partial cacheline writes).
That is left for a future patch.

See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)

Kernel patches to use this interface here:
https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/

To inject poison using QMP (telnet to the QMP port)
{ "execute": "qmp_capabilities" }

{ "execute": "cxl-inject-poison",
    "arguments": {
         "path": "/machine/peripheral/cxl-pmem0",
         "start": 2048,
         "length": 256
    }
}

Adjusted to select a device on your machine.

Note that the poison list supported is kept short enough to avoid the
complexity of state machine that is needed to handle the MORE flag.

Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230526170010.574-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-22 18:55:14 -04:00

176 lines
4.3 KiB
Python

# -*- Mode: Python -*-
# vim: filetype=python
##
# = CXL devices
##
##
# @cxl-inject-poison:
#
# Poison records indicate that a CXL memory device knows that a
# particular memory region may be corrupted. This may be because of
# locally detected errors (e.g. ECC failure) or poisoned writes
# received from other components in the system. This injection
# mechanism enables testing of the OS handling of poison records which
# may be queried via the CXL mailbox.
#
# @path: CXL type 3 device canonical QOM path
#
# @start: Start address; must be 64 byte aligned.
#
# @length: Length of poison to inject; must be a multiple of 64 bytes.
#
# Since: 8.1
##
{ 'command': 'cxl-inject-poison',
'data': { 'path': 'str', 'start': 'uint64', 'length': 'size' }}
##
# @CxlUncorErrorType:
#
# Type of uncorrectable CXL error to inject. These errors are
# reported via an AER uncorrectable internal error with additional
# information logged at the CXL device.
#
# @cache-data-parity: Data error such as data parity or data ECC error
# CXL.cache
#
# @cache-address-parity: Address parity or other errors associated
# with the address field on CXL.cache
#
# @cache-be-parity: Byte enable parity or other byte enable errors on
# CXL.cache
#
# @cache-data-ecc: ECC error on CXL.cache
#
# @mem-data-parity: Data error such as data parity or data ECC error
# on CXL.mem
#
# @mem-address-parity: Address parity or other errors associated with
# the address field on CXL.mem
#
# @mem-be-parity: Byte enable parity or other byte enable errors on
# CXL.mem.
#
# @mem-data-ecc: Data ECC error on CXL.mem.
#
# @reinit-threshold: REINIT threshold hit.
#
# @rsvd-encoding: Received unrecognized encoding.
#
# @poison-received: Received poison from the peer.
#
# @receiver-overflow: Buffer overflows (first 3 bits of header log
# indicate which)
#
# @internal: Component specific error
#
# @cxl-ide-tx: Integrity and data encryption tx error.
#
# @cxl-ide-rx: Integrity and data encryption rx error.
#
# Since: 8.0
##
{ 'enum': 'CxlUncorErrorType',
'data': ['cache-data-parity',
'cache-address-parity',
'cache-be-parity',
'cache-data-ecc',
'mem-data-parity',
'mem-address-parity',
'mem-be-parity',
'mem-data-ecc',
'reinit-threshold',
'rsvd-encoding',
'poison-received',
'receiver-overflow',
'internal',
'cxl-ide-tx',
'cxl-ide-rx'
]
}
##
# @CXLUncorErrorRecord:
#
# Record of a single error including header log.
#
# @type: Type of error
#
# @header: 16 DWORD of header.
#
# Since: 8.0
##
{ 'struct': 'CXLUncorErrorRecord',
'data': {
'type': 'CxlUncorErrorType',
'header': [ 'uint32' ]
}
}
##
# @cxl-inject-uncorrectable-errors:
#
# Command to allow injection of multiple errors in one go. This
# allows testing of multiple header log handling in the OS.
#
# @path: CXL Type 3 device canonical QOM path
#
# @errors: Errors to inject
#
# Since: 8.0
##
{ 'command': 'cxl-inject-uncorrectable-errors',
'data': { 'path': 'str',
'errors': [ 'CXLUncorErrorRecord' ] }}
##
# @CxlCorErrorType:
#
# Type of CXL correctable error to inject
#
# @cache-data-ecc: Data ECC error on CXL.cache
#
# @mem-data-ecc: Data ECC error on CXL.mem
#
# @crc-threshold: Component specific and applicable to 68 byte Flit
# mode only.
#
# @cache-poison-received: Received poison from a peer on CXL.cache.
#
# @mem-poison-received: Received poison from a peer on CXL.mem
#
# @physical: Received error indication from the physical layer.
#
# Since: 8.0
##
{ 'enum': 'CxlCorErrorType',
'data': ['cache-data-ecc',
'mem-data-ecc',
'crc-threshold',
'retry-threshold',
'cache-poison-received',
'mem-poison-received',
'physical']
}
##
# @cxl-inject-correctable-error:
#
# Command to inject a single correctable error. Multiple error
# injection of this error type is not interesting as there is no
# associated header log. These errors are reported via AER as a
# correctable internal error, with additional detail available from
# the CXL device.
#
# @path: CXL Type 3 device canonical QOM path
#
# @type: Type of error.
#
# Since: 8.0
##
{'command': 'cxl-inject-correctable-error',
'data': {'path': 'str', 'type': 'CxlCorErrorType'}}