Merge branch 'for-6.3/cxl-ram-region' into cxl/next

Include the support for enumerating and provisioning ram regions for
v6.3. This also include a default policy change for ram / volatile
device-dax instances to assign them to the dax_kmem driver by default.
This commit is contained in:
Dan Williams 2023-02-10 18:11:01 -08:00
commit b8b9ffced0
29 changed files with 1491 additions and 320 deletions

View file

@ -198,7 +198,7 @@ Description:
What: /sys/bus/cxl/devices/endpointX/CDAT
Date: July, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RO) If this sysfs entry is not present no DOE mailbox was
@ -209,7 +209,7 @@ Description:
What: /sys/bus/cxl/devices/decoderX.Y/mode
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
@ -229,7 +229,7 @@ Description:
What: /sys/bus/cxl/devices/decoderX.Y/dpa_resource
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
@ -240,7 +240,7 @@ Description:
What: /sys/bus/cxl/devices/decoderX.Y/dpa_size
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
@ -260,7 +260,7 @@ Description:
What: /sys/bus/cxl/devices/decoderX.Y/interleave_ways
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RO) The number of targets across which this decoder's host
@ -275,7 +275,7 @@ Description:
What: /sys/bus/cxl/devices/decoderX.Y/interleave_granularity
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RO) The number of consecutive bytes of host physical address
@ -285,25 +285,25 @@ Description:
interleave_granularity).
What: /sys/bus/cxl/devices/decoderX.Y/create_pmem_region
Date: May, 2022
KernelVersion: v5.20
What: /sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region
Date: May, 2022, January, 2023
KernelVersion: v6.0 (pmem), v6.3 (ram)
Contact: linux-cxl@vger.kernel.org
Description:
(RW) Write a string in the form 'regionZ' to start the process
of defining a new persistent memory region (interleave-set)
within the decode range bounded by root decoder 'decoderX.Y'.
The value written must match the current value returned from
reading this attribute. An atomic compare exchange operation is
done on write to assign the requested id to a region and
allocate the region-id for the next creation attempt. EBUSY is
returned if the region name written does not match the current
cached value.
of defining a new persistent, or volatile memory region
(interleave-set) within the decode range bounded by root decoder
'decoderX.Y'. The value written must match the current value
returned from reading this attribute. An atomic compare exchange
operation is done on write to assign the requested id to a
region and allocate the region-id for the next creation attempt.
EBUSY is returned if the region name written does not match the
current cached value.
What: /sys/bus/cxl/devices/decoderX.Y/delete_region
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(WO) Write a string in the form 'regionZ' to delete that region,
@ -312,17 +312,18 @@ Description:
What: /sys/bus/cxl/devices/regionZ/uuid
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) Write a unique identifier for the region. This field must
be set for persistent regions and it must not conflict with the
UUID of another region.
UUID of another region. For volatile ram regions this
attribute is a read-only empty string.
What: /sys/bus/cxl/devices/regionZ/interleave_granularity
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) Set the number of consecutive bytes each device in the
@ -333,7 +334,7 @@ Description:
What: /sys/bus/cxl/devices/regionZ/interleave_ways
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) Configures the number of devices participating in the
@ -343,7 +344,7 @@ Description:
What: /sys/bus/cxl/devices/regionZ/size
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) System physical address space to be consumed by the region.
@ -358,9 +359,20 @@ Description:
results in the same address being allocated.
What: /sys/bus/cxl/devices/regionZ/mode
Date: January, 2023
KernelVersion: v6.3
Contact: linux-cxl@vger.kernel.org
Description:
(RO) The mode of a region is established at region creation time
and dictates the mode of the endpoint decoder that comprise the
region. For more details on the possible modes see
/sys/bus/cxl/devices/decoderX.Y/mode
What: /sys/bus/cxl/devices/regionZ/resource
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RO) A region is a contiguous partition of a CXL root decoder
@ -372,7 +384,7 @@ Description:
What: /sys/bus/cxl/devices/regionZ/target[0..N]
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) Write an endpoint decoder object name to 'targetX' where X
@ -391,7 +403,7 @@ Description:
What: /sys/bus/cxl/devices/regionZ/commit
Date: May, 2022
KernelVersion: v5.20
KernelVersion: v6.0
Contact: linux-cxl@vger.kernel.org
Description:
(RW) Write a boolean 'true' string value to this attribute to

View file

@ -6034,6 +6034,7 @@ M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: nvdimm@lists.linux.dev
L: linux-cxl@vger.kernel.org
S: Supported
F: drivers/dax/

View file

@ -718,7 +718,7 @@ static void hmat_register_target_devices(struct memory_target *target)
for (res = target->memregions.child; res; res = res->sibling) {
int target_nid = pxm_to_node(target->memory_pxm);
hmem_register_device(target_nid, res);
hmem_register_resource(target_nid, res);
}
}
@ -869,4 +869,4 @@ static __init int hmat_init(void)
acpi_put_table(tbl);
return 0;
}
device_initcall(hmat_init);
subsys_initcall(hmat_init);

View file

@ -104,12 +104,22 @@ config CXL_SUSPEND
depends on SUSPEND && CXL_MEM
config CXL_REGION
bool
bool "CXL: Region Support"
default CXL_BUS
# For MAX_PHYSMEM_BITS
depends on SPARSEMEM
select MEMREGION
select GET_FREE_REGION
help
Enable the CXL core to enumerate and provision CXL regions. A CXL
region is defined by one or more CXL expanders that decode a given
system-physical address range. For CXL regions established by
platform-firmware this option enables memory error handling to
identify the devices participating in a given interleaved memory
range. Otherwise, platform-firmware managed CXL is enabled by being
placed in the system address map and does not need a driver.
If unsure say 'y'
config CXL_REGION_INVALIDATION_TEST
bool "CXL: Region Cache Management Bypass (TEST)"

View file

@ -731,7 +731,8 @@ static void __exit cxl_acpi_exit(void)
cxl_bus_drain();
}
module_init(cxl_acpi_init);
/* load before dax_hmem sees 'Soft Reserved' CXL ranges */
subsys_initcall(cxl_acpi_init);
module_exit(cxl_acpi_exit);
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);

View file

@ -11,15 +11,18 @@ extern struct attribute_group cxl_base_attribute_group;
#ifdef CONFIG_CXL_REGION
extern struct device_attribute dev_attr_create_pmem_region;
extern struct device_attribute dev_attr_create_ram_region;
extern struct device_attribute dev_attr_delete_region;
extern struct device_attribute dev_attr_region;
extern const struct device_type cxl_pmem_region_type;
extern const struct device_type cxl_dax_region_type;
extern const struct device_type cxl_region_type;
void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled);
#define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
#define CXL_REGION_TYPE(x) (&cxl_region_type)
#define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
#define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type)
#define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
int cxl_region_init(void);
void cxl_region_exit(void);
#else
@ -37,6 +40,7 @@ static inline void cxl_region_exit(void)
#define CXL_REGION_TYPE(x) NULL
#define SET_CXL_REGION_ATTR(x)
#define CXL_PMEM_REGION_TYPE(x) NULL
#define CXL_DAX_REGION_TYPE(x) NULL
#endif
struct cxl_send_command;
@ -56,9 +60,6 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
extern struct rw_semaphore cxl_dpa_rwsem;
bool is_switch_decoder(struct device *dev);
struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
int cxl_memdev_init(void);
void cxl_memdev_exit(void);
void cxl_mbox_init(void);

View file

@ -279,7 +279,7 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
return 0;
}
static int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
resource_size_t base, resource_size_t len,
resource_size_t skipped)
{
@ -295,6 +295,7 @@ static int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
}
EXPORT_SYMBOL_NS_GPL(devm_cxl_dpa_reserve, CXL);
resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled)
{
@ -676,6 +677,14 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld)
port->commit_end--;
cxld->flags &= ~CXL_DECODER_F_ENABLE;
/* Userspace is now responsible for reconfiguring this decoder */
if (is_endpoint_decoder(&cxld->dev)) {
struct cxl_endpoint_decoder *cxled;
cxled = to_cxl_endpoint_decoder(&cxld->dev);
cxled->state = CXL_DECODER_STATE_MANUAL;
}
return 0;
}
@ -783,6 +792,9 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
return rc;
}
*dpa_base += dpa_size + skip;
cxled->state = CXL_DECODER_STATE_AUTO;
return 0;
}
@ -826,7 +838,8 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
cxled = cxl_endpoint_decoder_alloc(port);
if (IS_ERR(cxled)) {
dev_warn(&port->dev,
"Failed to allocate the decoder\n");
"Failed to allocate decoder%d.%d\n",
port->id, i);
return PTR_ERR(cxled);
}
cxld = &cxled->cxld;
@ -836,7 +849,8 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
cxlsd = cxl_switch_decoder_alloc(port, target_count);
if (IS_ERR(cxlsd)) {
dev_warn(&port->dev,
"Failed to allocate the decoder\n");
"Failed to allocate decoder%d.%d\n",
port->id, i);
return PTR_ERR(cxlsd);
}
cxld = &cxlsd->cxld;
@ -844,13 +858,16 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
rc = init_hdm_decoder(port, cxld, target_map, hdm, i, &dpa_base);
if (rc) {
dev_warn(&port->dev,
"Failed to initialize decoder%d.%d\n",
port->id, i);
put_device(&cxld->dev);
return rc;
}
rc = add_hdm_decoder(port, cxld, target_map);
if (rc) {
dev_warn(&port->dev,
"Failed to add decoder to port\n");
"Failed to add decoder%d.%d\n", port->id, i);
return rc;
}
}

View file

@ -246,6 +246,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
if (rc < 0)
goto err;
cxlmd->id = rc;
cxlmd->depth = -1;
dev = &cxlmd->dev;
device_initialize(dev);

View file

@ -214,11 +214,6 @@ static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
}
static bool range_contains(struct range *r1, struct range *r2)
{
return r1->start <= r2->start && r1->end >= r2->end;
}
/* require dvsec ranges to be covered by a locked platform window */
static int dvsec_range_allowed(struct device *dev, void *arg)
{

View file

@ -46,6 +46,8 @@ static int cxl_device_id(struct device *dev)
return CXL_DEVICE_NVDIMM;
if (dev->type == CXL_PMEM_REGION_TYPE())
return CXL_DEVICE_PMEM_REGION;
if (dev->type == CXL_DAX_REGION_TYPE())
return CXL_DEVICE_DAX_REGION;
if (is_cxl_port(dev)) {
if (is_cxl_root(to_cxl_port(dev)))
return CXL_DEVICE_ROOT;
@ -180,17 +182,7 @@ static ssize_t mode_show(struct device *dev, struct device_attribute *attr,
{
struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);
switch (cxled->mode) {
case CXL_DECODER_RAM:
return sysfs_emit(buf, "ram\n");
case CXL_DECODER_PMEM:
return sysfs_emit(buf, "pmem\n");
case CXL_DECODER_NONE:
return sysfs_emit(buf, "none\n");
case CXL_DECODER_MIXED:
default:
return sysfs_emit(buf, "mixed\n");
}
return sysfs_emit(buf, "%s\n", cxl_decoder_mode_name(cxled->mode));
}
static ssize_t mode_store(struct device *dev, struct device_attribute *attr,
@ -304,6 +296,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
&dev_attr_cap_type3.attr,
&dev_attr_target_list.attr,
SET_CXL_REGION_ATTR(create_pmem_region)
SET_CXL_REGION_ATTR(create_ram_region)
SET_CXL_REGION_ATTR(delete_region)
NULL,
};
@ -315,6 +308,13 @@ static bool can_create_pmem(struct cxl_root_decoder *cxlrd)
return (cxlrd->cxlsd.cxld.flags & flags) == flags;
}
static bool can_create_ram(struct cxl_root_decoder *cxlrd)
{
unsigned long flags = CXL_DECODER_F_TYPE3 | CXL_DECODER_F_RAM;
return (cxlrd->cxlsd.cxld.flags & flags) == flags;
}
static umode_t cxl_root_decoder_visible(struct kobject *kobj, struct attribute *a, int n)
{
struct device *dev = kobj_to_dev(kobj);
@ -323,7 +323,11 @@ static umode_t cxl_root_decoder_visible(struct kobject *kobj, struct attribute *
if (a == CXL_REGION_ATTR(create_pmem_region) && !can_create_pmem(cxlrd))
return 0;
if (a == CXL_REGION_ATTR(delete_region) && !can_create_pmem(cxlrd))
if (a == CXL_REGION_ATTR(create_ram_region) && !can_create_ram(cxlrd))
return 0;
if (a == CXL_REGION_ATTR(delete_region) &&
!(can_create_pmem(cxlrd) || can_create_ram(cxlrd)))
return 0;
return a->mode;
@ -444,6 +448,7 @@ bool is_endpoint_decoder(struct device *dev)
{
return dev->type == &cxl_decoder_endpoint_type;
}
EXPORT_SYMBOL_NS_GPL(is_endpoint_decoder, CXL);
bool is_root_decoder(struct device *dev)
{
@ -455,6 +460,7 @@ bool is_switch_decoder(struct device *dev)
{
return is_root_decoder(dev) || dev->type == &cxl_decoder_switch_type;
}
EXPORT_SYMBOL_NS_GPL(is_switch_decoder, CXL);
struct cxl_decoder *to_cxl_decoder(struct device *dev)
{
@ -482,6 +488,7 @@ struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev)
return NULL;
return container_of(dev, struct cxl_switch_decoder, cxld.dev);
}
EXPORT_SYMBOL_NS_GPL(to_cxl_switch_decoder, CXL);
static void cxl_ep_release(struct cxl_ep *ep)
{
@ -1207,6 +1214,7 @@ int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
get_device(&endpoint->dev);
dev_set_drvdata(dev, endpoint);
cxlmd->depth = endpoint->depth;
return devm_add_action_or_reset(dev, delete_endpoint, cxlmd);
}
EXPORT_SYMBOL_NS_GPL(cxl_endpoint_autoremove, CXL);
@ -1241,50 +1249,55 @@ static void reap_dports(struct cxl_port *port)
}
}
struct detach_ctx {
struct cxl_memdev *cxlmd;
int depth;
};
static int port_has_memdev(struct device *dev, const void *data)
{
const struct detach_ctx *ctx = data;
struct cxl_port *port;
if (!is_cxl_port(dev))
return 0;
port = to_cxl_port(dev);
if (port->depth != ctx->depth)
return 0;
return !!cxl_ep_load(port, ctx->cxlmd);
}
static void cxl_detach_ep(void *data)
{
struct cxl_memdev *cxlmd = data;
struct device *iter;
for (iter = &cxlmd->dev; iter; iter = grandparent(iter)) {
struct device *dport_dev = grandparent(iter);
for (int i = cxlmd->depth - 1; i >= 1; i--) {
struct cxl_port *port, *parent_port;
struct detach_ctx ctx = {
.cxlmd = cxlmd,
.depth = i,
};
struct device *dev;
struct cxl_ep *ep;
bool died = false;
if (!dport_dev)
break;
port = find_cxl_port(dport_dev, NULL);
if (!port)
dev = bus_find_device(&cxl_bus_type, NULL, &ctx,
port_has_memdev);
if (!dev)
continue;
if (is_cxl_root(port)) {
put_device(&port->dev);
continue;
}
port = to_cxl_port(dev);
parent_port = to_cxl_port(port->dev.parent);
device_lock(&parent_port->dev);
if (!parent_port->dev.driver) {
/*
* The bottom-up race to delete the port lost to a
* top-down port disable, give up here, because the
* parent_port ->remove() will have cleaned up all
* descendants.
*/
device_unlock(&parent_port->dev);
put_device(&port->dev);
continue;
}
device_lock(&port->dev);
ep = cxl_ep_load(port, cxlmd);
dev_dbg(&cxlmd->dev, "disconnect %s from %s\n",
ep ? dev_name(ep->ep) : "", dev_name(&port->dev));
cxl_ep_remove(port, ep);
if (ep && !port->dead && xa_empty(&port->endpoints) &&
!is_cxl_root(parent_port)) {
!is_cxl_root(parent_port) && parent_port->dev.driver) {
/*
* This was the last ep attached to a dynamically
* enumerated port. Block new cxl_add_ep() and garbage
@ -1620,6 +1633,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
}
cxlrd->calc_hb = calc_hb;
mutex_init(&cxlrd->range_lock);
cxld = &cxlsd->cxld;
cxld->dev.type = &cxl_decoder_root_type;
@ -2003,6 +2017,6 @@ static void cxl_core_exit(void)
debugfs_remove_recursive(cxl_debugfs);
}
module_init(cxl_core_init);
subsys_initcall(cxl_core_init);
module_exit(cxl_core_exit);
MODULE_LICENSE("GPL v2");

File diff suppressed because it is too large Load diff

View file

@ -277,6 +277,8 @@ resource_size_t cxl_rcrb_to_component(struct device *dev,
* cxl_decoder flags that define the type of memory / devices this
* decoder supports as well as configuration lock status See "CXL 2.0
* 8.2.5.12.7 CXL HDM Decoder 0 Control Register" for details.
* Additionally indicate whether decoder settings were autodetected,
* user customized.
*/
#define CXL_DECODER_F_RAM BIT(0)
#define CXL_DECODER_F_PMEM BIT(1)
@ -336,12 +338,36 @@ enum cxl_decoder_mode {
CXL_DECODER_DEAD,
};
static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
{
static const char * const names[] = {
[CXL_DECODER_NONE] = "none",
[CXL_DECODER_RAM] = "ram",
[CXL_DECODER_PMEM] = "pmem",
[CXL_DECODER_MIXED] = "mixed",
};
if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_MIXED)
return names[mode];
return "mixed";
}
/*
* Track whether this decoder is reserved for region autodiscovery, or
* free for userspace provisioning.
*/
enum cxl_decoder_state {
CXL_DECODER_STATE_MANUAL,
CXL_DECODER_STATE_AUTO,
};
/**
* struct cxl_endpoint_decoder - Endpoint / SPA to DPA decoder
* @cxld: base cxl_decoder_object
* @dpa_res: actively claimed DPA span of this decoder
* @skip: offset into @dpa_res where @cxld.hpa_range maps
* @mode: which memory type / access-mode-partition this decoder targets
* @state: autodiscovery state
* @pos: interleave position in @cxld.region
*/
struct cxl_endpoint_decoder {
@ -349,6 +375,7 @@ struct cxl_endpoint_decoder {
struct resource *dpa_res;
resource_size_t skip;
enum cxl_decoder_mode mode;
enum cxl_decoder_state state;
int pos;
};
@ -382,6 +409,7 @@ typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd,
* @region_id: region id for next region provisioning event
* @calc_hb: which host bridge covers the n'th position by granularity
* @platform_data: platform specific configuration data
* @range_lock: sync region autodiscovery by address range
* @cxlsd: base cxl switch decoder
*/
struct cxl_root_decoder {
@ -389,6 +417,7 @@ struct cxl_root_decoder {
atomic_t region_id;
cxl_calc_hb_fn calc_hb;
void *platform_data;
struct mutex range_lock;
struct cxl_switch_decoder cxlsd;
};
@ -438,6 +467,13 @@ struct cxl_region_params {
*/
#define CXL_REGION_F_INCOHERENT 0
/*
* Indicate whether this region has been assembled by autodetection or
* userspace assembly. Prevent endpoint decoders outside of automatic
* detection from being added to the region.
*/
#define CXL_REGION_F_AUTO 1
/**
* struct cxl_region - CXL region
* @dev: This region's device
@ -493,6 +529,12 @@ struct cxl_pmem_region {
struct cxl_pmem_region_mapping mapping[];
};
struct cxl_dax_region {
struct device dev;
struct cxl_region *cxlr;
struct range hpa_range;
};
/**
* struct cxl_port - logical collection of upstream port devices and
* downstream port devices to construct a CXL memory
@ -633,8 +675,10 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port,
struct cxl_decoder *to_cxl_decoder(struct device *dev);
struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
bool is_root_decoder(struct device *dev);
bool is_switch_decoder(struct device *dev);
bool is_endpoint_decoder(struct device *dev);
struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets,
@ -685,6 +729,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
#define CXL_DEVICE_MEMORY_EXPANDER 5
#define CXL_DEVICE_REGION 6
#define CXL_DEVICE_PMEM_REGION 7
#define CXL_DEVICE_DAX_REGION 8
#define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
#define CXL_MODALIAS_FMT "cxl:t%d"
@ -701,6 +746,9 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct device *dev);
#ifdef CONFIG_CXL_REGION
bool is_cxl_pmem_region(struct device *dev);
struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
int cxl_add_to_region(struct cxl_port *root,
struct cxl_endpoint_decoder *cxled);
struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
#else
static inline bool is_cxl_pmem_region(struct device *dev)
{
@ -710,6 +758,15 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
{
return NULL;
}
static inline int cxl_add_to_region(struct cxl_port *root,
struct cxl_endpoint_decoder *cxled)
{
return 0;
}
static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
{
return NULL;
}
#endif
/*

View file

@ -39,6 +39,7 @@
* @cxl_nvb: coordinate removal of @cxl_nvd if present
* @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
* @id: id number of this memdev instance.
* @depth: endpoint port depth
*/
struct cxl_memdev {
struct device dev;
@ -48,6 +49,7 @@ struct cxl_memdev {
struct cxl_nvdimm_bridge *cxl_nvb;
struct cxl_nvdimm *cxl_nvd;
int id;
int depth;
};
static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
@ -80,6 +82,9 @@ static inline bool is_cxl_endpoint(struct cxl_port *port)
}
struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
resource_size_t base, resource_size_t len,
resource_size_t skipped);
static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port,
struct cxl_memdev *cxlmd)

View file

@ -30,57 +30,111 @@ static void schedule_detach(void *cxlmd)
schedule_cxl_memdev_detach(cxlmd);
}
static int cxl_port_probe(struct device *dev)
static int discover_region(struct device *dev, void *root)
{
struct cxl_endpoint_decoder *cxled;
int rc;
if (!is_endpoint_decoder(dev))
return 0;
cxled = to_cxl_endpoint_decoder(dev);
if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
return 0;
if (cxled->state != CXL_DECODER_STATE_AUTO)
return 0;
/*
* Region enumeration is opportunistic, if this add-event fails,
* continue to the next endpoint decoder.
*/
rc = cxl_add_to_region(root, cxled);
if (rc)
dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
return 0;
}
static int cxl_switch_port_probe(struct cxl_port *port)
{
struct cxl_port *port = to_cxl_port(dev);
struct cxl_hdm *cxlhdm;
int rc;
rc = devm_cxl_port_enumerate_dports(port);
if (rc < 0)
return rc;
if (!is_cxl_endpoint(port)) {
rc = devm_cxl_port_enumerate_dports(port);
if (rc < 0)
return rc;
if (rc == 1)
return devm_cxl_add_passthrough_decoder(port);
}
if (rc == 1)
return devm_cxl_add_passthrough_decoder(port);
cxlhdm = devm_cxl_setup_hdm(port);
if (IS_ERR(cxlhdm))
return PTR_ERR(cxlhdm);
if (is_cxl_endpoint(port)) {
struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport);
struct cxl_dev_state *cxlds = cxlmd->cxlds;
return devm_cxl_enumerate_decoders(cxlhdm);
}
/* Cache the data early to ensure is_visible() works */
read_cdat_data(port);
static int cxl_endpoint_port_probe(struct cxl_port *port)
{
struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport);
struct cxl_dev_state *cxlds = cxlmd->cxlds;
struct cxl_hdm *cxlhdm;
struct cxl_port *root;
int rc;
get_device(&cxlmd->dev);
rc = devm_add_action_or_reset(dev, schedule_detach, cxlmd);
if (rc)
return rc;
cxlhdm = devm_cxl_setup_hdm(port);
if (IS_ERR(cxlhdm))
return PTR_ERR(cxlhdm);
rc = cxl_hdm_decode_init(cxlds, cxlhdm);
if (rc)
return rc;
/* Cache the data early to ensure is_visible() works */
read_cdat_data(port);
rc = cxl_await_media_ready(cxlds);
if (rc) {
dev_err(dev, "Media not active (%d)\n", rc);
return rc;
}
}
get_device(&cxlmd->dev);
rc = devm_add_action_or_reset(&port->dev, schedule_detach, cxlmd);
if (rc)
return rc;
rc = devm_cxl_enumerate_decoders(cxlhdm);
rc = cxl_hdm_decode_init(cxlds, cxlhdm);
if (rc)
return rc;
rc = cxl_await_media_ready(cxlds);
if (rc) {
dev_err(dev, "Couldn't enumerate decoders (%d)\n", rc);
dev_err(&port->dev, "Media not active (%d)\n", rc);
return rc;
}
rc = devm_cxl_enumerate_decoders(cxlhdm);
if (rc)
return rc;
/*
* This can't fail in practice as CXL root exit unregisters all
* descendant ports and that in turn synchronizes with cxl_port_probe()
*/
root = find_cxl_root(&cxlmd->dev);
/*
* Now that all endpoint decoders are successfully enumerated, try to
* assemble regions from committed decoders
*/
device_for_each_child(&port->dev, root, discover_region);
put_device(&root->dev);
return 0;
}
static int cxl_port_probe(struct device *dev)
{
struct cxl_port *port = to_cxl_port(dev);
if (is_cxl_endpoint(port))
return cxl_endpoint_port_probe(port);
return cxl_switch_port_probe(port);
}
static ssize_t CDAT_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr, char *buf,
loff_t offset, size_t count)

View file

@ -45,12 +45,25 @@ config DEV_DAX_HMEM
Say M if unsure.
config DEV_DAX_CXL
tristate "CXL DAX: direct access to CXL RAM regions"
depends on CXL_REGION && DEV_DAX
default CXL_REGION && DEV_DAX
help
CXL RAM regions are either mapped by platform-firmware
and published in the initial system-memory map as "System RAM", mapped
by platform-firmware as "Soft Reserved", or dynamically provisioned
after boot by the CXL driver. In the latter two cases a device-dax
instance is created to access that unmapped-by-default address range.
Per usual it can remain as dedicated access via a device interface, or
converted to "System RAM" via the dax_kmem facility.
config DEV_DAX_HMEM_DEVICES
depends on DEV_DAX_HMEM && DAX=y
depends on DEV_DAX_HMEM && DAX
def_bool y
config DEV_DAX_KMEM
tristate "KMEM DAX: volatile-use of persistent memory"
tristate "KMEM DAX: map dax-devices as System-RAM"
default DEV_DAX
depends on DEV_DAX
depends on MEMORY_HOTPLUG # for add_memory() and friends

View file

@ -3,10 +3,12 @@ obj-$(CONFIG_DAX) += dax.o
obj-$(CONFIG_DEV_DAX) += device_dax.o
obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
obj-$(CONFIG_DEV_DAX_CXL) += dax_cxl.o
dax-y := super.o
dax-y += bus.o
device_dax-y := device.o
dax_pmem-y := pmem.o
dax_cxl-y := cxl.o
obj-y += hmem/

View file

@ -56,6 +56,25 @@ static int dax_match_id(struct dax_device_driver *dax_drv, struct device *dev)
return match;
}
static int dax_match_type(struct dax_device_driver *dax_drv, struct device *dev)
{
enum dax_driver_type type = DAXDRV_DEVICE_TYPE;
struct dev_dax *dev_dax = to_dev_dax(dev);
if (dev_dax->region->res.flags & IORESOURCE_DAX_KMEM)
type = DAXDRV_KMEM_TYPE;
if (dax_drv->type == type)
return 1;
/* default to device mode if dax_kmem is disabled */
if (dax_drv->type == DAXDRV_DEVICE_TYPE &&
!IS_ENABLED(CONFIG_DEV_DAX_KMEM))
return 1;
return 0;
}
enum id_action {
ID_REMOVE,
ID_ADD,
@ -216,14 +235,9 @@ static int dax_bus_match(struct device *dev, struct device_driver *drv)
{
struct dax_device_driver *dax_drv = to_dax_drv(drv);
/*
* All but the 'device-dax' driver, which has 'match_always'
* set, requires an exact id match.
*/
if (dax_drv->match_always)
if (dax_match_id(dax_drv, dev))
return 1;
return dax_match_id(dax_drv, dev);
return dax_match_type(dax_drv, dev);
}
/*
@ -1413,13 +1427,10 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
}
EXPORT_SYMBOL_GPL(devm_create_dev_dax);
static int match_always_count;
int __dax_driver_register(struct dax_device_driver *dax_drv,
struct module *module, const char *mod_name)
{
struct device_driver *drv = &dax_drv->drv;
int rc = 0;
/*
* dax_bus_probe() calls dax_drv->probe() unconditionally.
@ -1434,26 +1445,7 @@ int __dax_driver_register(struct dax_device_driver *dax_drv,
drv->mod_name = mod_name;
drv->bus = &dax_bus_type;
/* there can only be one default driver */
mutex_lock(&dax_bus_lock);
match_always_count += dax_drv->match_always;
if (match_always_count > 1) {
match_always_count--;
WARN_ON(1);
rc = -EINVAL;
}
mutex_unlock(&dax_bus_lock);
if (rc)
return rc;
rc = driver_register(drv);
if (rc && dax_drv->match_always) {
mutex_lock(&dax_bus_lock);
match_always_count -= dax_drv->match_always;
mutex_unlock(&dax_bus_lock);
}
return rc;
return driver_register(drv);
}
EXPORT_SYMBOL_GPL(__dax_driver_register);
@ -1463,7 +1455,6 @@ void dax_driver_unregister(struct dax_device_driver *dax_drv)
struct dax_id *dax_id, *_id;
mutex_lock(&dax_bus_lock);
match_always_count -= dax_drv->match_always;
list_for_each_entry_safe(dax_id, _id, &dax_drv->ids, list) {
list_del(&dax_id->list);
kfree(dax_id);

View file

@ -11,7 +11,10 @@ struct dax_device;
struct dax_region;
void dax_region_put(struct dax_region *dax_region);
#define IORESOURCE_DAX_STATIC (1UL << 0)
/* dax bus specific ioresource flags */
#define IORESOURCE_DAX_STATIC BIT(0)
#define IORESOURCE_DAX_KMEM BIT(1)
struct dax_region *alloc_dax_region(struct device *parent, int region_id,
struct range *range, int target_node, unsigned int align,
unsigned long flags);
@ -25,10 +28,15 @@ struct dev_dax_data {
struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data);
enum dax_driver_type {
DAXDRV_KMEM_TYPE,
DAXDRV_DEVICE_TYPE,
};
struct dax_device_driver {
struct device_driver drv;
struct list_head ids;
int match_always;
enum dax_driver_type type;
int (*probe)(struct dev_dax *dev);
void (*remove)(struct dev_dax *dev);
};

53
drivers/dax/cxl.c Normal file
View file

@ -0,0 +1,53 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
#include <linux/module.h>
#include <linux/dax.h>
#include "../cxl/cxl.h"
#include "bus.h"
static int cxl_dax_region_probe(struct device *dev)
{
struct cxl_dax_region *cxlr_dax = to_cxl_dax_region(dev);
int nid = phys_to_target_node(cxlr_dax->hpa_range.start);
struct cxl_region *cxlr = cxlr_dax->cxlr;
struct dax_region *dax_region;
struct dev_dax_data data;
struct dev_dax *dev_dax;
if (nid == NUMA_NO_NODE)
nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start);
dax_region = alloc_dax_region(dev, cxlr->id, &cxlr_dax->hpa_range, nid,
PMD_SIZE, IORESOURCE_DAX_KMEM);
if (!dax_region)
return -ENOMEM;
data = (struct dev_dax_data) {
.dax_region = dax_region,
.id = -1,
.size = range_len(&cxlr_dax->hpa_range),
};
dev_dax = devm_create_dev_dax(&data);
if (IS_ERR(dev_dax))
return PTR_ERR(dev_dax);
/* child dev_dax instances now own the lifetime of the dax_region */
dax_region_put(dax_region);
return 0;
}
static struct cxl_driver cxl_dax_region_driver = {
.name = "cxl_dax_region",
.probe = cxl_dax_region_probe,
.id = CXL_DEVICE_DAX_REGION,
.drv = {
.suppress_bind_attrs = true,
},
};
module_cxl_driver(cxl_dax_region_driver);
MODULE_ALIAS_CXL(CXL_DEVICE_DAX_REGION);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Intel Corporation");
MODULE_IMPORT_NS(CXL);

View file

@ -475,8 +475,7 @@ EXPORT_SYMBOL_GPL(dev_dax_probe);
static struct dax_device_driver device_dax_driver = {
.probe = dev_dax_probe,
/* all probe actions are unwound by devm, so .remove isn't necessary */
.match_always = 1,
.type = DAXDRV_DEVICE_TYPE,
};
static int __init dax_init(void)

View file

@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o
# device_hmem.o deliberately precedes dax_hmem.o for initcall ordering
obj-$(CONFIG_DEV_DAX_HMEM_DEVICES) += device_hmem.o
obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o
device_hmem-y := device.o
dax_hmem-y := hmem.o

View file

@ -8,6 +8,8 @@
static bool nohmem;
module_param_named(disable, nohmem, bool, 0444);
static bool platform_initialized;
static DEFINE_MUTEX(hmem_resource_lock);
static struct resource hmem_active = {
.name = "HMEM devices",
.start = 0,
@ -15,80 +17,66 @@ static struct resource hmem_active = {
.flags = IORESOURCE_MEM,
};
void hmem_register_device(int target_nid, struct resource *r)
int walk_hmem_resources(struct device *host, walk_hmem_fn fn)
{
struct resource *res;
int rc = 0;
mutex_lock(&hmem_resource_lock);
for (res = hmem_active.child; res; res = res->sibling) {
rc = fn(host, (int) res->desc, res);
if (rc)
break;
}
mutex_unlock(&hmem_resource_lock);
return rc;
}
EXPORT_SYMBOL_GPL(walk_hmem_resources);
static void __hmem_register_resource(int target_nid, struct resource *res)
{
/* define a clean / non-busy resource for the platform device */
struct resource res = {
.start = r->start,
.end = r->end,
.flags = IORESOURCE_MEM,
.desc = IORES_DESC_SOFT_RESERVED,
};
struct platform_device *pdev;
struct memregion_info info;
int rc, id;
struct resource *new;
int rc;
if (nohmem)
return;
rc = region_intersects(res.start, resource_size(&res), IORESOURCE_MEM,
IORES_DESC_SOFT_RESERVED);
if (rc != REGION_INTERSECTS)
return;
id = memregion_alloc(GFP_KERNEL);
if (id < 0) {
pr_err("memregion allocation failure for %pr\n", &res);
new = __request_region(&hmem_active, res->start, resource_size(res), "",
0);
if (!new) {
pr_debug("hmem range %pr already active\n", res);
return;
}
pdev = platform_device_alloc("hmem", id);
new->desc = target_nid;
if (platform_initialized)
return;
pdev = platform_device_alloc("hmem_platform", 0);
if (!pdev) {
pr_err("hmem device allocation failure for %pr\n", &res);
goto out_pdev;
}
if (!__request_region(&hmem_active, res.start, resource_size(&res),
dev_name(&pdev->dev), 0)) {
dev_dbg(&pdev->dev, "hmem range %pr already active\n", &res);
goto out_active;
}
pdev->dev.numa_node = numa_map_to_online_node(target_nid);
info = (struct memregion_info) {
.target_node = target_nid,
};
rc = platform_device_add_data(pdev, &info, sizeof(info));
if (rc < 0) {
pr_err("hmem memregion_info allocation failure for %pr\n", &res);
goto out_resource;
}
rc = platform_device_add_resources(pdev, &res, 1);
if (rc < 0) {
pr_err("hmem resource allocation failure for %pr\n", &res);
goto out_resource;
pr_err_once("failed to register device-dax hmem_platform device\n");
return;
}
rc = platform_device_add(pdev);
if (rc < 0) {
dev_err(&pdev->dev, "device add failed for %pr\n", &res);
goto out_resource;
}
if (rc)
platform_device_put(pdev);
else
platform_initialized = true;
}
return;
void hmem_register_resource(int target_nid, struct resource *res)
{
if (nohmem)
return;
out_resource:
__release_region(&hmem_active, res.start, resource_size(&res));
out_active:
platform_device_put(pdev);
out_pdev:
memregion_free(id);
mutex_lock(&hmem_resource_lock);
__hmem_register_resource(target_nid, res);
mutex_unlock(&hmem_resource_lock);
}
static __init int hmem_register_one(struct resource *res, void *data)
{
hmem_register_device(phys_to_target_node(res->start), res);
hmem_register_resource(phys_to_target_node(res->start), res);
return 0;
}
@ -104,4 +92,4 @@ static __init int hmem_init(void)
* As this is a fallback for address ranges unclaimed by the ACPI HMAT
* parsing it must be at an initcall level greater than hmat_init().
*/
late_initcall(hmem_init);
device_initcall(hmem_init);

View file

@ -3,6 +3,7 @@
#include <linux/memregion.h>
#include <linux/module.h>
#include <linux/pfn_t.h>
#include <linux/dax.h>
#include "../bus.h"
static bool region_idle;
@ -10,30 +11,32 @@ module_param_named(region_idle, region_idle, bool, 0644);
static int dax_hmem_probe(struct platform_device *pdev)
{
unsigned long flags = IORESOURCE_DAX_KMEM;
struct device *dev = &pdev->dev;
struct dax_region *dax_region;
struct memregion_info *mri;
struct dev_dax_data data;
struct dev_dax *dev_dax;
struct resource *res;
struct range range;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res)
return -ENOMEM;
/*
* @region_idle == true indicates that an administrative agent
* wants to manipulate the range partitioning before the devices
* are created, so do not send them to the dax_kmem driver by
* default.
*/
if (region_idle)
flags = 0;
mri = dev->platform_data;
range.start = res->start;
range.end = res->end;
dax_region = alloc_dax_region(dev, pdev->id, &range, mri->target_node,
PMD_SIZE, 0);
dax_region = alloc_dax_region(dev, pdev->id, &mri->range,
mri->target_node, PMD_SIZE, flags);
if (!dax_region)
return -ENOMEM;
data = (struct dev_dax_data) {
.dax_region = dax_region,
.id = -1,
.size = region_idle ? 0 : resource_size(res),
.size = region_idle ? 0 : range_len(&mri->range),
};
dev_dax = devm_create_dev_dax(&data);
if (IS_ERR(dev_dax))
@ -44,22 +47,131 @@ static int dax_hmem_probe(struct platform_device *pdev)
return 0;
}
static int dax_hmem_remove(struct platform_device *pdev)
{
/* devm handles teardown */
return 0;
}
static struct platform_driver dax_hmem_driver = {
.probe = dax_hmem_probe,
.remove = dax_hmem_remove,
.driver = {
.name = "hmem",
},
};
module_platform_driver(dax_hmem_driver);
static void release_memregion(void *data)
{
memregion_free((long) data);
}
static void release_hmem(void *pdev)
{
platform_device_unregister(pdev);
}
static int hmem_register_device(struct device *host, int target_nid,
const struct resource *res)
{
struct platform_device *pdev;
struct memregion_info info;
long id;
int rc;
if (IS_ENABLED(CONFIG_CXL_REGION) &&
region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
IORES_DESC_CXL) != REGION_DISJOINT) {
dev_dbg(host, "deferring range to CXL: %pr\n", res);
return 0;
}
rc = region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
IORES_DESC_SOFT_RESERVED);
if (rc != REGION_INTERSECTS)
return 0;
id = memregion_alloc(GFP_KERNEL);
if (id < 0) {
dev_err(host, "memregion allocation failure for %pr\n", res);
return -ENOMEM;
}
rc = devm_add_action_or_reset(host, release_memregion, (void *) id);
if (rc)
return rc;
pdev = platform_device_alloc("hmem", id);
if (!pdev) {
dev_err(host, "device allocation failure for %pr\n", res);
return -ENOMEM;
}
pdev->dev.numa_node = numa_map_to_online_node(target_nid);
info = (struct memregion_info) {
.target_node = target_nid,
.range = {
.start = res->start,
.end = res->end,
},
};
rc = platform_device_add_data(pdev, &info, sizeof(info));
if (rc < 0) {
dev_err(host, "memregion_info allocation failure for %pr\n",
res);
goto out_put;
}
rc = platform_device_add(pdev);
if (rc < 0) {
dev_err(host, "%s add failed for %pr\n", dev_name(&pdev->dev),
res);
goto out_put;
}
return devm_add_action_or_reset(host, release_hmem, pdev);
out_put:
platform_device_put(pdev);
return rc;
}
static int dax_hmem_platform_probe(struct platform_device *pdev)
{
return walk_hmem_resources(&pdev->dev, hmem_register_device);
}
static struct platform_driver dax_hmem_platform_driver = {
.probe = dax_hmem_platform_probe,
.driver = {
.name = "hmem_platform",
},
};
static __init int dax_hmem_init(void)
{
int rc;
rc = platform_driver_register(&dax_hmem_platform_driver);
if (rc)
return rc;
rc = platform_driver_register(&dax_hmem_driver);
if (rc)
platform_driver_unregister(&dax_hmem_platform_driver);
return rc;
}
static __exit void dax_hmem_exit(void)
{
platform_driver_unregister(&dax_hmem_driver);
platform_driver_unregister(&dax_hmem_platform_driver);
}
module_init(dax_hmem_init);
module_exit(dax_hmem_exit);
/* Allow for CXL to define its own dax regions */
#if IS_ENABLED(CONFIG_CXL_REGION)
#if IS_MODULE(CONFIG_CXL_ACPI)
MODULE_SOFTDEP("pre: cxl_acpi");
#endif
#endif
MODULE_ALIAS("platform:hmem*");
MODULE_ALIAS("platform:hmem_platform*");
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Intel Corporation");

View file

@ -239,6 +239,7 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
static struct dax_device_driver device_dax_kmem_driver = {
.probe = dev_dax_kmem_probe,
.remove = dev_dax_kmem_remove,
.type = DAXDRV_KMEM_TYPE,
};
static int __init dax_kmem_init(void)

View file

@ -262,11 +262,14 @@ static inline bool dax_mapping(struct address_space *mapping)
}
#ifdef CONFIG_DEV_DAX_HMEM_DEVICES
void hmem_register_device(int target_nid, struct resource *r);
void hmem_register_resource(int target_nid, struct resource *r);
#else
static inline void hmem_register_device(int target_nid, struct resource *r)
static inline void hmem_register_resource(int target_nid, struct resource *r)
{
}
#endif
typedef int (*walk_hmem_fn)(struct device *dev, int target_nid,
const struct resource *res);
int walk_hmem_resources(struct device *dev, walk_hmem_fn fn);
#endif

View file

@ -3,10 +3,12 @@
#define _MEMREGION_H_
#include <linux/types.h>
#include <linux/errno.h>
#include <linux/range.h>
#include <linux/bug.h>
struct memregion_info {
int target_node;
struct range range;
};
#ifdef CONFIG_MEMREGION

View file

@ -13,6 +13,11 @@ static inline u64 range_len(const struct range *range)
return range->end - range->start + 1;
}
static inline bool range_contains(struct range *r1, struct range *r2)
{
return r1->start <= r2->start && r1->end >= r2->end;
}
int add_range(struct range *range, int az, int nr_range,
u64 start, u64 end);

View file

@ -31,8 +31,8 @@ static volatile u8 forced_mask = 0xff;
static void *fill_start, *target_start;
static size_t fill_size, target_size;
static bool range_contains(char *haystack_start, size_t haystack_size,
char *needle_start, size_t needle_size)
static bool stackinit_range_contains(char *haystack_start, size_t haystack_size,
char *needle_start, size_t needle_size)
{
if (needle_start >= haystack_start &&
needle_start + needle_size <= haystack_start + haystack_size)
@ -175,7 +175,7 @@ static noinline void test_ ## name (struct kunit *test) \
\
/* Validate that compiler lined up fill and target. */ \
KUNIT_ASSERT_TRUE_MSG(test, \
range_contains(fill_start, fill_size, \
stackinit_range_contains(fill_start, fill_size, \
target_start, target_size), \
"stack fill missed target!? " \
"(fill %zu wide, target offset by %d)\n", \

View file

@ -703,6 +703,142 @@ static int mock_decoder_reset(struct cxl_decoder *cxld)
return 0;
}
static void default_mock_decoder(struct cxl_decoder *cxld)
{
cxld->hpa_range = (struct range){
.start = 0,
.end = -1,
};
cxld->interleave_ways = 1;
cxld->interleave_granularity = 256;
cxld->target_type = CXL_DECODER_EXPANDER;
cxld->commit = mock_decoder_commit;
cxld->reset = mock_decoder_reset;
}
static int first_decoder(struct device *dev, void *data)
{
struct cxl_decoder *cxld;
if (!is_switch_decoder(dev))
return 0;
cxld = to_cxl_decoder(dev);
if (cxld->id == 0)
return 1;
return 0;
}
static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
{
struct acpi_cedt_cfmws *window = mock_cfmws[0];
struct platform_device *pdev = NULL;
struct cxl_endpoint_decoder *cxled;
struct cxl_switch_decoder *cxlsd;
struct cxl_port *port, *iter;
const int size = SZ_512M;
struct cxl_memdev *cxlmd;
struct cxl_dport *dport;
struct device *dev;
bool hb0 = false;
u64 base;
int i;
if (is_endpoint_decoder(&cxld->dev)) {
cxled = to_cxl_endpoint_decoder(&cxld->dev);
cxlmd = cxled_to_memdev(cxled);
WARN_ON(!dev_is_platform(cxlmd->dev.parent));
pdev = to_platform_device(cxlmd->dev.parent);
/* check is endpoint is attach to host-bridge0 */
port = cxled_to_port(cxled);
do {
if (port->uport == &cxl_host_bridge[0]->dev) {
hb0 = true;
break;
}
if (is_cxl_port(port->dev.parent))
port = to_cxl_port(port->dev.parent);
else
port = NULL;
} while (port);
port = cxled_to_port(cxled);
}
/*
* The first decoder on the first 2 devices on the first switch
* attached to host-bridge0 mock a fake / static RAM region. All
* other decoders are default disabled. Given the round robin
* assignment those devices are named cxl_mem.0, and cxl_mem.4.
*
* See 'cxl list -BMPu -m cxl_mem.0,cxl_mem.4'
*/
if (!hb0 || pdev->id % 4 || pdev->id > 4 || cxld->id > 0) {
default_mock_decoder(cxld);
return;
}
base = window->base_hpa;
cxld->hpa_range = (struct range) {
.start = base,
.end = base + size - 1,
};
cxld->interleave_ways = 2;
eig_to_granularity(window->granularity, &cxld->interleave_granularity);
cxld->target_type = CXL_DECODER_EXPANDER;
cxld->flags = CXL_DECODER_F_ENABLE;
cxled->state = CXL_DECODER_STATE_AUTO;
port->commit_end = cxld->id;
devm_cxl_dpa_reserve(cxled, 0, size / cxld->interleave_ways, 0);
cxld->commit = mock_decoder_commit;
cxld->reset = mock_decoder_reset;
/*
* Now that endpoint decoder is set up, walk up the hierarchy
* and setup the switch and root port decoders targeting @cxlmd.
*/
iter = port;
for (i = 0; i < 2; i++) {
dport = iter->parent_dport;
iter = dport->port;
dev = device_find_child(&iter->dev, NULL, first_decoder);
/*
* Ancestor ports are guaranteed to be enumerated before
* @port, and all ports have at least one decoder.
*/
if (WARN_ON(!dev))
continue;
cxlsd = to_cxl_switch_decoder(dev);
if (i == 0) {
/* put cxl_mem.4 second in the decode order */
if (pdev->id == 4)
cxlsd->target[1] = dport;
else
cxlsd->target[0] = dport;
} else
cxlsd->target[0] = dport;
cxld = &cxlsd->cxld;
cxld->target_type = CXL_DECODER_EXPANDER;
cxld->flags = CXL_DECODER_F_ENABLE;
iter->commit_end = 0;
/*
* Switch targets 2 endpoints, while host bridge targets
* one root port
*/
if (i == 0)
cxld->interleave_ways = 2;
else
cxld->interleave_ways = 1;
cxld->interleave_granularity = 256;
cxld->hpa_range = (struct range) {
.start = base,
.end = base + size - 1,
};
put_device(dev);
}
}
static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
{
struct cxl_port *port = cxlhdm->port;
@ -748,16 +884,7 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
cxld = &cxled->cxld;
}
cxld->hpa_range = (struct range) {
.start = 0,
.end = -1,
};
cxld->interleave_ways = min_not_zero(target_count, 1);
cxld->interleave_granularity = SZ_4K;
cxld->target_type = CXL_DECODER_EXPANDER;
cxld->commit = mock_decoder_commit;
cxld->reset = mock_decoder_reset;
mock_init_hdm_decoder(cxld);
if (target_count) {
rc = device_for_each_child(port->uport, &ctx,