system/qemu - HydraGit

mirror of https://gitlab.com/qemu-project/qemu synced 2024-11-05 20:35:44 +00:00

Author	SHA1	Message	Date
Haozhong Zhang	6388e18de9	qmp: distinguish PC-DIMM and NVDIMM in MemoryDeviceInfoList It may need to treat PC-DIMM and NVDIMM differently, e.g., when deciding the necessity of non-volatile flag bit in SRAT memory affinity structures. A new field 'nvdimm' is added to the union type MemoryDeviceInfo for such purpose. Its type is currently PCDIMMDeviceInfo and will be updated when necessary in the future. It also fixes "info memory-devices"/query-memory-devices which currently show nvdimm devices as dimm devices since object_dynamic_cast(obj, TYPE_PC_DIMM) happily cast nvdimm to TYPE_PC_DIMM which it's been inherited from. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-03-20 03:34:52 +02:00
Haozhong Zhang	52c95cae4e	pc-dimm: make qmp_pc_dimm_device_list() sort devices by address Make qmp_pc_dimm_device_list() return sorted by start address list of devices so that it could be reused in places that would need sorted list. Reuse existing pc_dimm_built_list() to get sorted list. While at it hide recursive callbacks from callers, so that: qmp_pc_dimm_device_list(qdev_get_machine(), &list); could be replaced with simpler: list = qmp_pc_dimm_device_list(); follow up patch will use it in build_srat() Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> for ppc part Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-03-20 03:34:52 +02:00
David Hildenbrand	81ce6aa51a	numa: we don't implement NUMA for s390x Right now it is possible to crash QEMU for s390x by providing e.g. -numa node,nodeid=0,cpus=0-1 Problem is, that numa.c uses mc->cpu_index_to_instance_props as an indicator whether NUMA is supported by a machine type. We don't implement NUMA for s390x ("topology") yet. However we need mc->cpu_index_to_instance_props for query-cpus. So let's fix this case by also checking for mc->get_default_cpu_node_id, which will be needed by machine_set_cpu_numa_node(). qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not supported by this machine-type While at it, make s390_cpu_index_to_props() look like on other architectures. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180227110255.20999-1-david@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2018-03-08 15:49:23 +01:00
Markus Armbruster	112ed241f5	qapi: Empty out qapi-schema.json The previous commit improved compile time by including less of the generated QAPI headers. This is impossible for stuff defined directly in qapi-schema.json, because that ends up in headers that that pull in everything. Move everything but include directives from qapi-schema.json to new sub-module qapi/misc.json, then include just the "misc" shard where possible. It's possible everywhere, except: * monitor.c needs qmp-command.h to get qmp_init_marshal() * monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need qapi-event.h to get enum QAPIEvent Perhaps we'll get rid of those some other day. Adding a type to qapi/migration.json now recompiles some 120 instead of 2300 out of 5100 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-25-armbru@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:45:50 -06:00
Markus Armbruster	e688df6bc4	Include qapi/error.h exactly where needed This cleanup makes the number of objects depending on qapi/error.h drop from 1910 (out of 4743) to 1612 in my "build everything" tree. While there, separate #include from file comment with a blank line, and drop a useless comment on why qemu/osdep.h is included first. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-5-armbru@redhat.com> [Semantic conflict with commit `34e304e975` resolved, OSX breakage fixed]	2018-02-09 13:50:17 +01:00
Marcelo Tosatti	e85687ffe2	qemu: improve hugepage allocation failure message Improve hugepage allocation failure message, indicating what is happening to the user. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20180115201700.GA4439@amt.cnet> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-02-05 13:54:38 +01:00
Haozhong Zhang	9837684316	hostmem-file: add "align" option When mmap(2) the backend files, QEMU uses the host page size (getpagesize(2)) by default as the alignment of mapping address. However, some backends may require alignments different than the page size. For example, mmap a device DAX (e.g., /dev/dax0.0) on Linux kernel 4.13 to an address, which is 4K-aligned but not 2M-aligned, fails with a kernel message like [617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff) Because there is no common approach to get such alignment requirement, we add the 'align' option to 'memory-backend-file', so that users or management utils, which have enough knowledge about the backend, can specify a proper alignment via this option. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [ehabkost: fixed typo, fixed error_setg() format string] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2018-01-19 11:18:51 -02:00
Philippe Mathieu-Daudé	1330f1e2b7	numa: remove unused #include Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-12-18 17:07:02 +03:00
Igor Mammedov	f47bd1c839	spapr: replace numa_get_node() with lookup in pc-dimm list SPAPR is the last user of numa_get_node() and a bunch of supporting code to maintain numa_info[x].addr list. Get LMB node id from pc-dimm list, which allows to remove ~80LOC maintaining dynamic address range lookup list. It also removes pc-dimm dependency on numa_[un]set_mem_node_id() and makes pc-dimms a sole source of information about which node it belongs to and removes duplicate data from global numa_info. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-12-15 09:49:24 +11:00
Dou Liyang	7b8be49d36	NUMA: Enable adding NUMA node implicitly Linux and Windows need ACPI SRAT table to make memory hotplug work properly, however currently QEMU doesn't create SRAT table if numa options aren't present on CLI. Which breaks both linux and windows guests in certain conditions: * Windows: won't enable memory hotplug without SRAT table at all * Linux: if QEMU is started with initial memory all below 4Gb and no SRAT table present, guest kernel will use nommu DMA ops, which breaks 32bit hw drivers when memory is hotplugged and guest tries to use it with that drivers. Fix above issues by automatically creating a numa node when QEMU is started with memory hotplug enabled but without '-numa' options on CLI. (PS: auto-create numa node only for new machine types so not to break migration). Which would provide SRAT table to guests without explicit -numa options on CLI and would allow: * Windows: to enable memory hotplug * Linux: switch to SWIOTLB DMA ops, to bounce DMA transfers to 32bit allocated buffers that legacy drivers/hw can handle. [Rewritten by Igor] Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Alistair Francis <alistair23@gmail.com> Cc: Takao Indoh <indou.takao@jp.fujitsu.com> Cc: Izumi Taku <izumi.taku@jp.fujitsu.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-11-16 17:46:53 +02:00
Igor Mammedov	cc001888b7	numa: fixup parsed NumaNodeOptions earlier numa 'mem' option with suffix or without one is possible only on CLI/HMP. Instead of fixing up special suffix less CLI case deep in parse_numa_node() do it earlier right after option is parsed into NumaNodeOptions with OptVisistor so that the rest of the code would use valid values in NumaNodeOptions and won't have to reparse QemuOpts. It will help to isolate CLI/HMP parts in parse_numa() and split out parsed NumaNodeOptions processing into separate function that could be reused by QMP handler where we have only NumaNodeOptions and don't need any fixups. While at it reuse qemu_strtosz_MiB() instead of manually checking for suffixes. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1507801198-98182-1-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-10-27 16:04:28 +02:00
Dou Liyang	f51878ba86	NUMA: Replace MAX_NODES with nb_numa_nodes in for loop In QEMU, the number of the NUMA nodes is determined by parse_numa_opts(). Then, QEMU uses it for iteration, for example: for (i = 0; i < nb_numa_nodes; i++) However, in memory_region_allocate_system_memory(), it uses MAX_NODES not nb_numa_nodes. So, replace MAX_NODES with nb_numa_nodes to keep code consistency and reduce the loop times. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Message-Id: <1503387936-3483-1-git-send-email-douly.fnst@cn.fujitsu.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-09-19 16:51:33 -03:00
Vadim Galitsyn	31959e82fb	hmp: extend "info numa" with hotplugged memory information Report amount of hotplugged memory in addition to total amount per NUMA node. Signed-off-by: Vadim Galitsyn <vadim.galitsyn@profitbricks.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: qemu-devel@nongnu.org Message-Id: <20170829153022.27004-2-vadim.galitsyn@profitbricks.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2017-09-14 15:52:10 +01:00
Peter Maydell	1cfe48c1ce	memory: Rename memory_region_init_ram() to memory_region_init_ram_nomigrate() Rename memory_region_init_ram() to memory_region_init_ram_nomigrate(). This leaves the way clear for us to provide a memory_region_init_ram() which does handle migration. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-4-git-send-email-peter.maydell@linaro.org	2017-07-14 17:59:42 +01:00
Marc-André Lureau	61d7c14437	numa: use get_uint() for "size" property "size" is a property of TYPE_MEMORY_BACKEND. host_memory_backend_get_size() and host_memory_backend_set_size() use visit_type_size(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-39-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:33 +02:00
Igor Mammedov	d41f3e750d	numa: make sure that all cpus have has_node_id set if numa is enabled It fixes/add missing _PXM object for non mapped CPU (x86) and missing fdt node (virt-arm). It ensures that possible_cpus contains complete mapping if numa is enabled by the time machine_init() is executed. As result non completely mapped CPUs: 1) appear in ACPI/fdt blobs 2) QMP query-hotpluggable-cpus command shows bound nodes for such CPUs 3) allows to drop checks for has_node_id in numa only code, reducing number of invariants incomplete mapping could produce 4) moves fixup/implicit node init from runtime numa_cpu_pre_plug() (when CPU object is created) to machine_numa_finish_init() which helps to fix [1, 2] and make possible_cpus complete source of numa mapping available even before CPUs are created. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1496161442-96665-4-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-06-05 14:59:08 -03:00
Igor Mammedov	60bed6a30a	numa: move default mapping init to machine there is no need use cpu_index_to_instance_props() for setting default cpu -> node mapping. Generic machine code can do it without cpu_index by just enabling already preset defaults in possible_cpus. PS: as bonus it makes one less user of cpu_index_to_instance_props() Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1496161442-96665-3-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-06-05 14:59:08 -03:00
Igor Mammedov	a0ceb640d0	numa: consolidate cpu_preplug fixups/checks for pc/arm/spapr Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1496161442-96665-2-git-send-email-imammedo@redhat.com> [ehabkost: Fix indentation] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-06-05 14:59:08 -03:00
Eduardo Habkost	f892291eee	numa: Fix format string for "Invalid node" message Some compilers complain about the PRIu16 format string with the MAX(src, dst) and MAX_NODES arguments. Example output from Apple LLVM version 7.3.0 (clang-703.0.31): numa.c:236:20: warning: format specifies type 'unsigned short' but the argument has type 'int' [-Wformat] MAX(src, dst), MAX_NODES); ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ include/qapi/error.h:163:35: note: expanded from macro 'error_setg' (fmt), ## __VA_ARGS__) ^~~~~~~~~~~ glib/2.52.2/include/glib-2.0/glib/gmacros.h:288:20: note: expanded from macro 'MAX' #define MAX(a, b) (((a) > (b)) ? (a) : (b)) ^~~~~~~~~~~~~~~~~~~~~~~~~ numa.c:236:35: warning: format specifies type 'unsigned short' but the argument has type 'int' [-Wformat] MAX(src, dst), MAX_NODES); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~ include/qapi/error.h:163:35: note: expanded from macro 'error_setg' (fmt), ## __VA_ARGS__) ^~~~~~~~~~~ include/sysemu/sysemu.h:165:19: note: expanded from macro 'MAX_NODES' #define MAX_NODES 128 ^~~ MAX(src, dst) promotes the src and dst arguments to int, and MAX_NODES is an int. Use %d to silence those warnings. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170530184013.31044-1-ehabkost@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-30 16:09:58 -03:00
Stefan Hajnoczi	ba9915e1f8	x86 and machine queue, 2017-05-11 Highlights: * New "-numa cpu" option * NUMA distance configuration * migration/i386 vmstatification -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJZFLh3AAoJECgHk2+YTcWmppQQAJe9Y5a3VwHXqvHbwBHX2ysn RZDAUPd9DWpbM+UUydyKOVIZ7u5RXbbVq4E0NeCD8VYYd+grZB5Wo1cAzy3b4U2j 2s+MDqaPMtZtGoqxTsyQOVoVxazT5Kf1zglK+iUEzik44J7LGdro+ty2Z7Ut2c11 q9rE/GNS78czBm7c4lxgkxXW4N95K/tEGlLtDQ7uct//3U/ZimF+mO6GcbVFlOWT 4iEbOz2sqvBVv22nLJRufiPgFNIW4hizAz5KBWxwGFCCKvT3N6yYNKKjzEpCw+jE lpjIRODU02yIZZZY841fLRtyrk7p4zORS8jRaHTdEJgb5bGc/YazxxVL8nzRQT1W VxFwAMd+UNrDkV24hpN++Ln2O+b3kwcGZ7uA/qu9d5WvSYUKXlHqcMJ35q6zuhAI /ecfYO7EZfVP86VjIt5IH04iV8RChA9Q6de+kQEFa6wHUxufeCOwCFqukGo8zj07 plX8NcjnzYmSXKnYjHOHao4rKT+DiJhRB60rFiMeKP/qvKbZPjtgsIeonhHm53qZ /QwkhowahHKkpAnetIl0QHm8KS4YudAofMi/Fl+he4gRkEbSQVAo6iQb2L4cjcLC LNSDDsIVWGem4gCR+vcsFqB3lggRDfltHXm15JKh92UMpOr6RI6s8pD55T7EdnPC CfdxWB5kYM6/lLbOHj94 =48wH -----END PGP SIGNATURE----- Merge remote-tracking branch 'ehabkost/tags/x86-and-machine-pull-request' into staging x86 and machine queue, 2017-05-11 Highlights: * New "-numa cpu" option * NUMA distance configuration * migration/i386 vmstatification # gpg: Signature made Thu 11 May 2017 08:16:07 PM BST # gpg: using RSA key 0x2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" # gpg: Note: This key has expired! # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * ehabkost/tags/x86-and-machine-pull-request: (29 commits) migration/i386: Remove support for pre-0.12 formats vmstatification: i386 FPReg migration/i386: Remove old non-softfloat 64bit FP support tests: check -numa node,cpu=props_list usecase numa: add '-numa cpu,...' option for property based node mapping numa: remove node_cpu bitmaps as they are no longer used numa: use possible_cpus for not mapped CPUs check machine: call machine init from wrapper numa: remove no longer need numa_post_machine_init() tests: numa: add case for QMP command query-cpus QMP: include CpuInstanceProperties into query_cpus output output virt-arm: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() numa: do default mapping based on possible_cpus instead of node_cpu bitmaps numa: mirror cpu to node mapping in MachineState::possible_cpus numa: add check that board supports cpu_index to node mapping virt-arm: add node-id property to CPU pc: add node-id property to CPU spapr: add node-id property to sPAPR core ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-05-15 14:12:03 +01:00
Igor Mammedov	419fcdec3c	numa: add '-numa cpu,...' option for property based node mapping legacy cpu to node mapping is using cpu index values to map VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]' option. However cpu index is internal concept and QEMU users have to guess /reimplement qemu's logic/ to map it to a concrete cpu socket/core/thread to make sane CPUs placement across numa nodes. This patch allows to map cpu objects to numa nodes using the same properties as used for cpus with -device/device_add (socket-id/core-id/thread-id/node-id). At present valid properties/values to address CPUs could be fetched using hotpluggable-cpus monitor/qmp command, it will require user to start qemu twice when creating domain to fetch possible CPUs for a machine type/-smp layout first and then the second time with numa explicit mapping for actual usage. The first step results could be saved and reused to set/change mapping later as far as machine type/-smp stays the same. Proposed impl. supports exact and wildcard matching to simplify CLI and allow to set mapping for a specific cpu or group of cpu objects specified by matched properties. For example: # exact mapping x86 -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n # exact mapping SPAPR -numa cpu,node-id=x,core-id=y # wildcard mapping, all cpu objects that match socket-id=y # are mapped to node-id=x -numa cpu,node-id=x,socket-id=y Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1494415802-227633-18-git-send-email-imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	1171ae9a5b	numa: remove node_cpu bitmaps as they are no longer used Postfactum "CPU(s) present in multiple NUMA nodes" check was the last user of node_cpu bitmaps, but it's not need as machine_set_cpu_numa_node() does the similar check at the time mapping is set for cpus (i.e. when -numa cpus= is parsed) and ensures that cpu can be mapped only to one node. Remove duplicate check based on node_cpu bitmaps and since the last user is gone remove node_cpu as well, which completes internal transition from legacy bitmap based mapping storage to possible_cpus storage. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-17-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	ec78f8114b	numa: use possible_cpus for not mapped CPUs check and remove corresponding part in numa.c that uses node_cpu bitmaps. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-16-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	3b8a8557f7	numa: remove no longer need numa_post_machine_init() CPUState::numa_node is still in use but now it's set by board when it creates CPU objects. So there isn't any need to set it again after all CPU's are created, since it's been already set. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-14-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	6accfb7823	tests: numa: add case for QMP command query-cpus Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1494415802-227633-13-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	af9b20e8d2	numa: do default mapping based on possible_cpus instead of node_cpu bitmaps Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-8-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	7c88e65d9e	numa: mirror cpu to node mapping in MachineState::possible_cpus Introduce machine_set_cpu_numa_node() helper that stores node mapping for CPU in MachineState::possible_cpus. CPU and node it belongs to is specified by 'props' argument. Patch doesn't remove old way of storing mapping in numa_info[X].node_cpu as removing it at the same time makes patch rather big. Instead it just mirrors mapping in possible_cpus and follow up per target patches will switch to possible_cpus and numa_info[X].node_cpu will be removed once there isn't any users left. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-7-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	64c2a8f6d3	numa: add check that board supports cpu_index to node mapping Default node mapping initialization already checks that board supports cpu_index to node mapping and refuses to start if it's not supported. Do the same for explicitly provided mapping "-numa node,cpus=..." Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1494415802-227633-6-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	ea089eebbd	numa: move source of default CPUs to NUMA node mapping into boards Originally CPU threads were by default assigned in round-robin fashion. However it was causing issues in guest since CPU threads from the same socket/core could be placed on different NUMA nodes. Commit `fb43b73b` (pc: fix default VCPU to NUMA node mapping) fixed it by grouping threads within a socket on the same node introducing cpu_index_to_socket_id() callback and commit `20bb648d` (spapr: Fix default NUMA node allocation for threads) reused callback to fix similar issues for SPAPR machine even though socket doesn't make much sense there. As result QEMU ended up having 3 default distribution rules used by 3 targets /virt-arm, spapr, pc/. In effort of moving NUMA mapping for CPUs into possible_cpus, generalize default mapping in numa.c by making boards decide on default mapping and let them explicitly tell generic numa code to which node a CPU thread belongs to by replacing cpu_index_to_socket_id() with @cpu_index_to_instance_props() which provides default node_id assigned by board to specified cpu_index. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1494415802-227633-2-git-send-email-imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Laurent Vivier	3bfe57165b	numa: equally distribute memory on nodes When there are more nodes than available memory to put the minimum allowed memory by node, all the memory is put on the last node. This is because we put (ram_size / nb_numa_nodes) & ~((1 << mc->numa_mem_align_shift) - 1); on each node, and in this case the value is 0. This is particularly true with pseries, as the memory must be aligned to 256MB. To avoid this problem, this patch uses an error diffusion algorithm [1] to distribute equally the memory on nodes. We introduce numa_auto_assign_ram() function in MachineClass to keep compatibility between machine type versions. The legacy function is used with pseries-2.9, pc-q35-2.9 and pc-i440fx-2.9 (and previous), the new one with all others. Example: qemu-system-ppc64 -S -nographic -nodefaults -monitor stdio -m 1G -smp 8 \ -numa node -numa node -numa node \ -numa node -numa node -numa node Before: (qemu) info numa 6 nodes node 0 cpus: 0 6 node 0 size: 0 MB node 1 cpus: 1 7 node 1 size: 0 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 0 MB node 4 cpus: 4 node 4 size: 0 MB node 5 cpus: 5 node 5 size: 1024 MB After: (qemu) info numa 6 nodes node 0 cpus: 0 6 node 0 size: 0 MB node 1 cpus: 1 7 node 1 size: 256 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 256 MB node 4 cpus: 4 node 4 size: 256 MB node 5 cpus: 5 node 5 size: 256 MB [1] https://en.wikipedia.org/wiki/Error_diffusion Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170502162955.1610-2-lvivier@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: s/ram_size/size/ at numa_default_auto_assign_ram()] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:47 -03:00
He Chen	0f203430dd	numa: Allow setting NUMA distance for different NUMA nodes This patch is going to add SLIT table support in QEMU, and provides additional option `dist` for command `-numa` to allow user set vNUMA distance by QEMU command. With this patch, when a user wants to create a guest that contains several vNUMA nodes and also wants to set distance among those nodes, the QEMU command would like: ``` -numa node,nodeid=0,cpus=0 \ -numa node,nodeid=1,cpus=1 \ -numa node,nodeid=2,cpus=2 \ -numa node,nodeid=3,cpus=3 \ -numa dist,src=0,dst=1,val=21 \ -numa dist,src=0,dst=2,val=31 \ -numa dist,src=0,dst=3,val=41 \ -numa dist,src=1,dst=2,val=21 \ -numa dist,src=1,dst=3,val=31 \ -numa dist,src=2,dst=3,val=21 \ ``` Signed-off-by: He Chen <he.chen@linux.intel.com> Message-Id: <1493260558-20728-1-git-send-email-he.chen@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:37 -03:00
Ishani Chugh	d0e31a105e	Remove reduntant qemu: from error functions This patch removes redundant "qemu:" from error functions. The link to the bitesized task is: http://wiki.qemu-project.org/Contribute/BiteSizedTasks#Error_checking Signed-off-by: Ishani Chugh <chugh.ishani@research.iiit.ac.in> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-05-07 09:57:51 +03:00
Laurent Vivier	55641213fc	numa,spapr: align default numa node memory size to 256MB Since commit `224245b` ("spapr: Add LMB DR connectors"), NUMA node memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE). But when "-numa" option is provided without "mem" parameter, the memory is equally divided between nodes, but 8MB aligned. This can be not valid for pseries. In that case we can have: $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB With this patch, we have: (qemu) info numa 3 nodes node 0 cpus: 0 node 0 size: 1280 MB node 1 cpus: node 1 size: 1280 MB node 2 cpus: node 2 size: 1536 MB Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-22 11:32:42 +11:00
Markus Armbruster	d081a49af8	numa: Flatten simple union NumaOptions Simple unions are simpler than flat unions in the schema, but more complicated in C and on the QMP wire: there's extra indirection in C and extra nesting on the wire, both pointless. They're best avoided in new code. NumaOptions isn't new, but it's only used internally, not in QMP. Convert it to a flat union. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487709988-14322-2-git-send-email-armbru@redhat.com>	2017-02-22 19:50:46 +01:00
Paolo Bonzini	0987d735a3	ramblock-notifier: new This adds a notify interface of ram block additions and removals. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-16 17:52:35 +01:00
Igor Mammedov	cdda2018e3	numa: make -numa parser dynamically allocate CPUs masks so it won't impose an additional limits on max_cpus limits supported by different targets. It removes global MAX_CPUMASK_BITS constant and need to bump it up whenever max_cpus is being increased for a target above MAX_CPUMASK_BITS value. Use runtime max_cpus value instead to allocate sufficiently sized node_cpu bitmasks in numa parser. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479466974-249781-1-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: Added asserts to ensure cpu_index < max_cpus] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-01-12 15:51:36 -02:00
Igor Mammedov	e1ff3c67e8	monitor: fix qmp/hmp query-memdev not reporting IDs of memory backends Considering 'id' is mandatory for user_creatable objects/backends and user_creatable_add_type() always has it as an argument regardless of where from it is called CLI/monitor or QMP, Fix issue by adding 'id' property to hostmem backends and set it in user_creatable_add_type() for every object that implements 'id' property. Then later at query-memdev time get 'id' from object directly. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1484052795-158195-4-git-send-email-imammedo@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-01-12 15:35:06 -02:00
Igor Mammedov	6bea1ddf8b	numa: reduce code duplication by adding helper numa_get_node_for_cpu() Replace repeated pattern for (i = 0; i < nb_numa_nodes; i++) { if (test_bit(idx, numa_info[i].node_cpu)) { ... break; with a helper function to lookup numa node index for cpu. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-10-10 01:16:57 +03:00
Marc-André Lureau	157e94e8a2	numa: do not leak NumaOptions In all cases, call qapi_free_NumaOptions(), by using a common ending block. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-08-07 23:59:59 +04:00
Greg Kurz	0b21757124	numa: set the memory backend "is_mapped" field Commit `2aece63` "hostmem: detect host backend memory is being used properly" added a way to know if a memory backend is busy or available for use. It caused a slight regression if we pass the same backend to a NUMA node and to a pc-dimm device: -m 1G,slots=2,maxmem=2G \ -object memory-backend-ram,size=1G,id=mem-mem1 \ -device pc-dimm,id=dimm-mem1,memdev=mem-mem1 \ -numa node,nodeid=0,memdev=mem-mem1 Before commit `2aece63`, this would cause QEMU to print an error message and to exit gracefully: qemu-system-ppc64: -device pc-dimm,id=dimm-mem1,memdev=mem-mem1: can't use already busy memdev: mem-mem1 Since commit `2aece63`, QEMU hits an assertion in the memory code: qemu-system-ppc64: memory.c:1934: memory_region_add_subregion_common: Assertion `!subregion->container' failed. Aborted This happens because pc-dimm devices don't use memory_region_is_mapped() anymore and cannot guess the backend is already used by a NUMA node. Let's revert to the previous behavior by turning the NUMA code to also call host_memory_backend_set_mapped() when it uses a backend. Fixes: `2aece63c8a` Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <146891691503.15642.9817215371777203794.stgit@bahia.lan> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-08-02 12:03:58 +02:00
Eric Blake	09204eac9b	opts-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need opts_visitor_cleanup(); which in turn means we no longer need to return a subtype from opts_visitor_new() nor a public upcast function. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-6-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Eric Blake	32bafa8fdd	qapi: Don't special-case simple union wrappers Simple unions were carrying a special case that hid their 'data' QMP member from the resulting C struct, via the hack method QAPISchemaObjectTypeVariant.simple_union_type(). But by using the work we started by unboxing flat union and alternate branches, coupled with the ability to visit the members of an implicit type, we can now expose the simple union's implicit type in qapi-types.h: \| struct q_obj_ImageInfoSpecificQCow2_wrapper { \| ImageInfoSpecificQCow2 data; \| }; \| \| struct q_obj_ImageInfoSpecificVmdk_wrapper { \| ImageInfoSpecificVmdk data; \| }; ... \| struct ImageInfoSpecific { \| ImageInfoSpecificKind type; \| union { /* union tag is @type / \| void data; \|- ImageInfoSpecificQCow2 qcow2; \|- ImageInfoSpecificVmdk vmdk; \|+ q_obj_ImageInfoSpecificQCow2_wrapper qcow2; \|+ q_obj_ImageInfoSpecificVmdk_wrapper vmdk; \| } u; \| }; Doing this removes asymmetry between QAPI's QMP side and its C side (both sides now expose 'data'), and means that the treatment of a simple union as sugar for a flat union is now equivalent in both languages (previously the two approaches used a different layer of dereferencing, where the simple union could be converted to a flat union with equivalent C layout but different {} on the wire, or to an equivalent QMP wire form but with different C representation). Using the implicit type also lets us get rid of the simple_union_type() hack. Of course, now all clients of simple unions have to adjust from using su->u.member to using su->u.member.data; while this touches a number of files in the tree, some earlier cleanup patches helped minimize the change to the initialization of a temporary variable rather than every single member access. The generated qapi-visit.c code is also affected by the layout change: \|@@ -7393,10 +7393,10 @@ void visit_type_ImageInfoSpecific_member \| } \| switch (obj->type) { \| case IMAGE_INFO_SPECIFIC_KIND_QCOW2: \|- visit_type_ImageInfoSpecificQCow2(v, "data", &obj->u.qcow2, &err); \|+ visit_type_q_obj_ImageInfoSpecificQCow2_wrapper_members(v, &obj->u.qcow2, &err); \| break; \| case IMAGE_INFO_SPECIFIC_KIND_VMDK: \|- visit_type_ImageInfoSpecificVmdk(v, "data", &obj->u.vmdk, &err); \|+ visit_type_q_obj_ImageInfoSpecificVmdk_wrapper_members(v, &obj->u.vmdk, &err); \| break; \| default: \| abort(); Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1458254921-17042-13-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-03-18 10:29:26 +01:00
Eric Blake	96a1616c85	qapi-dealloc: Reduce use outside of generated code No need to roll our own use of the dealloc visitors when we can just directly use the qapi_free_FOO() functions that do what we want in one line. In net.c, inline net_visit() into its remaining lone caller. After this patch, test-visitor-serialization.c is the only non-generated file that needs to use a dealloc visitor, because it is testing low level aspects of the visitor interface. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1456262075-3311-2-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-03-04 17:16:32 +01:00
Eric Blake	51e72bc1dd	qapi: Swap visit_* arguments for consistent 'name' placement JSON uses "name":value, but many of our visitor interfaces were called with visit_type_FOO(v, &value, name, errp). This can be a bit confusing to have to mentally swap the parameter order to match JSON order. It's particularly bad for visit_start_struct(), where the 'name' parameter is smack in the middle of the otherwise-related group of 'obj, kind, size' parameters! It's time to do a global swap of the parameter ordering, so that the 'name' parameter is always immediately after the Visitor argument. Additional reason in favor of the swap: the existing include/qjson.h prefers listing 'name' first in json_prop_(), and I have plans to unify that file with the qapi visitors; listing 'name' first in qapi will minimize churn to the (admittedly few) qjson.h clients. Later patches will then fix docs, object.h, visitor-impl.h, and those clients to match. Done by first patching scripts/qapi.py by hand to make generated files do what I want, then by running the following Coccinelle script to affect the rest of the code base: $ spatch --sp-file script `git grep -l '\bvisit_' -- '*/.[ch]'` I then had to apply some touchups (Coccinelle insisted on TAB indentation in visitor.h, and botched the signature of visit_type_enum() by rewriting 'const char const strings[]' to the syntactically invalid 'const charconst[] strings'). The movement of parameters is sufficient to provoke compiler errors if any callers were missed. // Part 1: Swap declaration order @@ type TV, TErr, TObj, T1, T2; identifier OBJ, ARG1, ARG2; @@ void visit_start_struct -(TV v, TObj OBJ, T1 ARG1, const char name, T2 ARG2, TErr errp) +(TV v, const char name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp) { ... } @@ type bool, TV, T1; identifier ARG1; @@ bool visit_optional -(TV v, T1 ARG1, const char name) +(TV v, const char name, T1 ARG1) { ... } @@ type TV, TErr, TObj, T1; identifier OBJ, ARG1; @@ void visit_get_next_type -(TV v, TObj OBJ, T1 ARG1, const char name, TErr errp) +(TV v, const char name, TObj OBJ, T1 ARG1, TErr errp) { ... } @@ type TV, TErr, TObj, T1, T2; identifier OBJ, ARG1, ARG2; @@ void visit_type_enum -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char name, TErr errp) +(TV v, const char name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp) { ... } @@ type TV, TErr, TObj; identifier OBJ; identifier VISIT_TYPE =~ "^visit_type_"; @@ void VISIT_TYPE -(TV v, TObj OBJ, const char name, TErr errp) +(TV v, const char name, TObj OBJ, TErr errp) { ... } // Part 2: swap caller order @@ expression V, NAME, OBJ, ARG1, ARG2, ERR; identifier VISIT_TYPE =~ "^visit_type_"; @@ ( -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR) +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR) \| -visit_optional(V, ARG1, NAME) +visit_optional(V, NAME, ARG1) \| -visit_get_next_type(V, OBJ, ARG1, NAME, ERR) +visit_get_next_type(V, NAME, OBJ, ARG1, ERR) \| -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR) +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR) \| -VISIT_TYPE(V, OBJ, NAME, ERR) +VISIT_TYPE(V, NAME, OBJ, ERR) ) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-02-08 17:29:56 +01:00
Peter Maydell	d38ea87ac5	all: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org	2016-02-04 17:41:30 +00:00
Luiz Capitulino	fae947b096	memory: exit when hugepage allocation fails if mem-prealloc When -mem-prealloc is passed on the command-line, the expected behavior is to exit if the hugepage allocation fails. However, this behavior is broken since commit `cc57501dee` which made hugepage allocation fall back to regular ram in case of faliure. This commit restores the expected behavior for -mem-prealloc. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Message-Id: <20160122091501.75bbd42a@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-01-26 15:58:14 +01:00
Markus Armbruster	007b06578a	Use error_fatal to simplify obvious fatal errors Done with this Coccinelle semantic patch: @@ type T; identifier FUN, RET; expression list ARGS; expression ERR, EC; @@ ( - T RET = FUN(ARGS, &ERR); + T RET = FUN(ARGS, &error_fatal); \| - RET = FUN(ARGS, &ERR); + RET = FUN(ARGS, &error_fatal); \| - FUN(ARGS, &ERR); + FUN(ARGS, &error_fatal); ) - if (ERR != NULL) { - error_report_err(ERR); - exit(EC); - } This is actually a more elegant version of my initial semantic patch by courtesy of Eduardo. It leaves dead Error * variables behind, cleaned up manually. Cc: qemu-arm@nongnu.org Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>	2016-01-13 11:58:58 +01:00
Markus Armbruster	2f6f826e03	numa: Clean up query-memdev error handling qmp_query_memdev() has two error paths: * When object_get_objects_root() returns null. It never does, so simply drop the useless error handling. * When query_memdev() fails. It leaks err then. But any failure there is actually a programming error. Switch it to &error_abort, and drop the useless error handling. Messed up in commit `76b5d85` "qmp: add query-memdev". Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2015-12-18 15:50:24 -02:00
Eric Blake	1fd5d4fea4	memory: Convert to new qapi union layout We have two issues with our qapi union layout: 1) Even though the QMP wire format spells the tag 'type', the C code spells it 'kind', requiring some hacks in the generator. 2) The C struct uses an anonymous union, which places all tag values in the same namespace as all non-variant members. This leads to spurious collisions if a tag value matches a non-variant member's name. Make the conversion to the new layout for memory-related code. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1445898903-12082-21-git-send-email-eblake@redhat.com> [Commit message tweaked slightly] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2015-11-02 08:30:28 +01:00
Markus Armbruster	f8ed85ac99	Fix bad error handling after memory_region_init_ram() Symptom: $ qemu-system-x86_64 -m 10000000 Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456: upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory Aborted (core dumped) Root cause: commit `ef701d7` screwed up handling of out-of-memory conditions. Before the commit, we report the error and exit(1), in one place, ram_block_add(). The commit lifts the error handling up the call chain some, to three places. Fine. Except it uses &error_abort in these places, changing the behavior from exit(1) to abort(), and thus undoing the work of commit `3922825` "exec: Don't abort when we can't allocate guest memory". The three places are: * memory_region_init_ram() Commit `4994653` (right after commit `ef701d7`) lifted the error handling further, through memory_region_init_ram(), multiplying the incorrect use of &error_abort. Later on, imitation of existing (bad) code may have created more. * memory_region_init_ram_ptr() The &error_abort is still there. * memory_region_init_rom_device() Doesn't need fixing, because commit `33e0eb5` (soon after commit `ef701d7`) lifted the error handling further, and in the process changed it from &error_abort to passing it up the call chain. Correct, because the callers are realize() methods. Fix the error handling after memory_region_init_ram() with a Coccinelle semantic patch: @r@ expression mr, owner, name, size, err; position p; @@ memory_region_init_ram(mr, owner, name, size, ( - &error_abort + &error_fatal \| err@p ) ); @script:python@ p << r.p; @@ print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column) When the last argument is &error_abort, it gets replaced by &error_fatal. This is the fix. If the last argument is anything else, its position is reported. This lets us check the fix is complete. Four positions get reported: * ram_backend_memory_alloc() Error is passed up the call chain, ultimately through user_creatable_complete(). As far as I can tell, it's callers all handle the error sanely. * fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize() DeviceClass.realize() methods, errors handled sanely further up the call chain. We're good. Test case again behaves: $ qemu-system-x86_64 -m 10000000 qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory [Exit 1 ] The next commits will repair the rest of commit ef701d7's damage. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>	2015-09-18 14:39:29 +02:00

1 2

96 commits