system/qemu - HydraGit

mirror of https://gitlab.com/qemu-project/qemu synced 2024-10-18 17:02:46 +00:00

Author	SHA1	Message	Date
Prasad Singamsetty	37f51384ae	intel-iommu: Extend address width to 48 bits The current implementation of Intel IOMMU code only supports 39 bits iova address width. This patch provides a new parameter (x-aw-bits) for intel-iommu to extend its address width to 48 bits but keeping the default the same (39 bits). The reason for not changing the default is to avoid potential compatibility problems with live migration of intel-iommu enabled QEMU guest. The only valid values for 'x-aw-bits' parameter are 39 and 48. After enabling larger address width (48), we should be able to map larger iova addresses in the guest. For example, a QEMU guest that is configured with large memory ( >=1TB ). To check whether 48 bits aw is enabled, we can grep in the guest dmesg output with line: "DMAR: Host address width 48". Signed-off-by: Prasad Singamsetty <prasad.singamsety@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-01-18 21:52:38 +02:00
Prasad Singamsetty	92e5d85e83	intel-iommu: Redefine macros to enable supporting 48 bit address width The current implementation of Intel IOMMU code only supports 39 bits host/iova address width so number of macros use hard coded values based on that. This patch is to redefine them so they can be used with variable address widths. This patch doesn't add any new functionality but enables adding support for 48 bit address width. Signed-off-by: Prasad Singamsetty <prasad.singamsety@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-01-18 21:52:38 +02:00
Haozhong Zhang	c68bcb3a99	target/i386: add clflushopt to "Skylake-Server" cpu model CPUID_7_0_EBX_CLFLUSHOPT is missed in current "Skylake-Server" cpu model. Add it to "Skylake-Server" cpu model on pc-i440fx-2.12 and pc-q35-2.12. Keep it disabled in "Skylake-Server" cpu model on older machine types. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20171219033730.12748-3-haozhong.zhang@intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2018-01-17 23:04:31 -02:00
Haozhong Zhang	df47ce8af4	pc: add 2.12 machine types Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20171219033730.12748-2-haozhong.zhang@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2018-01-17 23:04:31 -02:00
Michael S. Tsirkin	acc95bc850	Merge remote-tracking branch 'origin/master' into HEAD Resolve conflicts around apb. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-01-11 22:03:50 +02:00
Sergio Andres Gomez Del Real	2cb9f06e3d	apic: add function to apic that will be used by hvf This patch adds the function apic_get_highest_priority_irr to apic.c and exports it through the interface in apic.h for use by hvf. Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com> Message-Id: <20170913090522.4022-8-Sergio.G.DelReal@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-12-22 15:01:19 +01:00
Peter Xu	bf33cc75ad	intel_iommu: remove X86_IOMMU_PCI_DEVFN_MAX We have PCI_DEVFN_MAX now. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Liu, Yi L <yi.l.liu@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-12-22 01:42:03 +02:00
Philippe Mathieu-Daudé	0d5d8a3a90	hw/misc/pvpanic: extract public API from i386/pc to "hw/misc/pvpanic.h" and remove the old i386/pc dependency. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé	489983d6b4	hw/net/ne2000: extract ne2k-isa code from i386/pc to ne2000-isa.c - add "hw/net/ne2000-isa.h" - remove the old i386 dependency Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Hervé Poussineau <hpoussin@reactos.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> [PPC] Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé	866e2b3727	hw/display/vga: extract public API from i386/pc to "hw/display/vga.h" and remove the old i386/pc dependency. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé	b1c439d179	hw/acpi/ich9: extract ACPI_PM_PROP_TCO_ENABLED from i386/pc enable_tco is specific to i386/pc. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé	9dc047ce8f	hw/acpi: ACPI_PM_* defines are not restricted to i386 arch this allows to remove the old i386/pc dependency on acpi/core. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-12-18 17:07:02 +03:00
Chao Gao	861fec459b	i386/msi: Correct mask of destination ID in MSI address According to SDM 10.11.1, only [19:12] bits of MSI address are Destination ID, change the mask to avoid ambiguity for VT-d spec has used the bit 4 to indicate a remappable interrupt request. Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-12-01 18:28:15 +02:00
Marcel Apfelbaum	9fa99d2519	hw/pci-host: Fix x86 Host Bridges 64bit PCI hole Currently there is no MMIO range over 4G reserved for PCI hotplug. Since the 32bit PCI hole depends on the number of cold-plugged PCI devices and other factors, it is very possible is too small to hotplug PCI devices with large BARs. Fix it by reserving 2G for I4400FX chipset in order to comply with older Win32 Guest OSes and 32G for Q35 chipset. Even if the new defaults of pci-hole64-size will appear in "info qtree" also for older machines, the property was not implemented so no changes will be visible to guests. Note this is a regression since prev QEMU versions had some range reserved for 64bit PCI hotplug. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-11-16 17:46:53 +02:00
Gonglei	6c69dfb67e	i386/cpu/hyperv: support over 64 vcpus for windows guests Starting with Windows Server 2012 and Windows 8, if CPUID.40000005.EAX contains a value of -1, Windows assumes specific limit to the number of VPs. In this case, Windows Server 2012 guest VMs may use more than 64 VPs, up to the maximum supported number of processors applicable to the specific Windows version being used. https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs For compatibility, Let's introduce a new property for X86CPU, named "x-hv-max-vps" as Eduardo's suggestion, and set it to 0x40 before machine 2.10. (The "x-" prefix indicates that the property is not supposed to be a stable user interface.) Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-Id: <1505143227-14324-1-git-send-email-arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 16:20:49 +02:00
Marcel Apfelbaum	a6fd5b0e05	pc: add 2.11 machine types Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-09-08 16:15:17 +03:00
Peter Xu	07f7b73398	intel_iommu: use access_flags for iotlb It was cached by read/write separately. Let's merge them. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-08-02 00:13:25 +03:00
Daniel P. Berrange	1ce36bfe64	i386: expose "TCGTCGTCGTCG" in the 0x40000000 CPUID leaf Currently when running KVM, we expose "KVMKVMKVM\0\0\0" in the 0x40000000 CPUID leaf. Other hypervisors (VMWare, HyperV, Xen, BHyve) all do the same thing, which leaves TCG as the odd one out. The CPUID signature is used by software to detect which virtual environment they are running in and (potentially) change behaviour in certain ways. For example, systemd supports a ConditionVirtualization= setting in unit files. The virt-what command can also report the virt type it is running on Currently both these apps have to resort to custom hacks like looking for 'fw-cfg' entry in the /proc/device-tree file to identify TCG. This change thus proposes a signature "TCGTCGTCGTCG" to be reported when running under TCG. To hide this, the -cpu option tcg-cpuid=off can be used. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <20170509132736.10071-3-berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-07-17 15:41:30 -03:00
Alexey Kardashevskiy	1221a47467	memory/iommu: introduce IOMMUMemoryRegionClass This finishes QOM'fication of IOMMUMemoryRegion by introducing a IOMMUMemoryRegionClass. This also provides a fastpath analog for IOMMU_MEMORY_REGION_GET_CLASS(). This makes IOMMUMemoryRegion an abstract class. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170711035620.4232-3-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:41 +02:00
Alexey Kardashevskiy	3df9d74806	memory/iommu: QOM'fy IOMMU MemoryRegion This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion as a parent. This moves IOMMU-related fields from MR to IOMMU MR. However to avoid dymanic QOM casting in fast path (address_space_translate, etc), this adds an @is_iommu boolean flag to MR and provides new helper to do simple cast to IOMMU MR - memory_region_get_iommu. The flag is set in the instance init callback. This defines memory_region_is_iommu as memory_region_get_iommu()!=NULL. This switches MemoryRegion to IOMMUMemoryRegion in most places except the ones where MemoryRegion may be an alias. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20170711035620.4232-2-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:41 +02:00
Thomas Huth	2099935dbf	Move CONFIG_KVM related definitions to kvm_i386.h pc.h and sysemu/kvm.h are also included from common code (where CONFIG_KVM is not available), so the #defines that depend on CONFIG_KVM should not be declared here to avoid that anybody is using them in a wrong way. Since we're also going to poison CONFIG_KVM for common code, let's move them to kvm_i386.h instead. Most of the dummy definitions from sysemu/kvm.h are also unused since the code that uses them is only compiled for CONFIG_KVM (e.g. target/i386/kvm.c), so the unused defines are also simply dropped here instead of being moved. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1498454578-18709-3-git-send-email-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 14:30:03 +02:00
Laszlo Ersek	2f295167e0	q35/mch: implement extended TSEG sizes The q35 machine type currently lets the guest firmware select a 1MB, 2MB or 8MB TSEG (basically, SMRAM) size. In edk2/OVMF, we use 8MB, but even that is not enough when a lot of VCPUs (more than approx. 224) are configured -- SMRAM footprint scales largely proportionally with VCPU count. Introduce a new property for "mch" called "extended-tseg-mbytes", which expresses (in megabytes) the user's choice of TSEG (SMRAM) size. Invent a new, QEMU-specific register in the config space of the DRAM Controller, at offset 0x50, in order to allow guest firmware to query the TSEG (SMRAM) size. According to Intel Document Number 316966-002, Table 5-1 "DRAM Controller Register Address Map (D0:F0)": Warning: Address locations that are not listed are considered Intel Reserved registers locations. Reads to Reserved registers may return non-zero values. Writes to reserved locations may cause system failures. All registers that are defined in the PCI 2.3 specification, but are not necessary or implemented in this component are simply not included in this document. The reserved/unimplemented space in the PCI configuration header space is not documented as such in this summary. Offsets 0x50 and 0x51 are not listed in Table 5-1. They are also not part of the standard PCI config space header. And they precede the capability list as well, which starts at 0xe0 for this device. When the guest writes value 0xffff to this register, the value that can be read back is that of "mch.extended-tseg-mbytes" -- unless it remains 0xffff. The guest is required to write 0xffff first (as opposed to a read-only register) because PCI config space is generally not cleared on QEMU reset, and after S3 resume or reboot, new guest firmware running on old QEMU could read a guest OS-injected value from this register. After reading the available "extended" TSEG size, the guest firmware may actually request that TSEG size by writing pattern 11b to the ESMRAMC register's TSEG_SZ bit-field. (The Intel spec referenced above defines only patterns 00b (1MB), 01b (2MB) and 10b (8MB); 11b is reserved.) On the QEMU command line, the value can be set with -global mch.extended-tseg-mbytes=N The default value for 2.10+ q35 machine types is 16. The value is limited to 0xfff (4095) at the moment, purely so that the product (4095 MB) can be stored to the uint32_t variable "tseg_size" in mch_update_smram(). Users are responsible for choosing sensible TSEG sizes. On 2.9 and earlier q35 machine types, the default value is 0. This lets the 11b bit pattern in ESMRAMC.TSEG_SZ, and the register at offset 0x50, keep their original behavior. When "extended-tseg-mbytes" is nonzero, the new register at offset 0x50 is set to that value on reset, for completeness. PCI config space is migrated automatically, so no VMSD changes are necessary. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1447027 Ref: https://lists.01.org/pipermail/edk2-devel/2017-May/010456.html Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-06-16 18:07:08 +03:00
Eduardo Habkost	1f43571604	pc: Use "min-[x]level" on compat_props Since the automatic cpuid-level code was introduced in commit `c39c0edf9b` ("target-i386: Automatically set level/xlevel/xlevel2 when needed"), the CPU model tables just define the default CPUID level code (set using "min-level"). Setting "[x]level" forces CPUID level to a specific value and disable the automatic-level logic. But the PC compat code was not updated and the existing "[x]level" compat properties broke compatibility for people using features that triggered the auto-level code. To keep previous behavior, we should set "min-[x]level" instead of "[x]level" on compat_props. This was not a problem for most cases, because old machine-types don't have full-cpuid-auto-level enabled. The only common use case it broke was the CPUID[7] auto-level code, that was already enabled since the first CPUID[7] feature was introduced (in QEMU 1.4.0). This causes the regression reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1454641 Change the PC compat code to use "min-[x]level" instead of "[x]level" on compat_props, and add new test cases to ensure we don't break this again. Reported-by: "Guo, Zhiyi" <zhguo@redhat.com> Fixes: `c39c0edf9b` ("target-i386: Automatically set level/xlevel/xlevel2 when needed") Cc: qemu-stable@nongnu.org Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-06-05 14:59:08 -03:00
Peter Xu	dbaabb25f4	intel_iommu: support passthrough (PT) Hardware support for VT-d device passthrough. Although current Linux can live with iommu=pt even without this, but this is faster than when using software passthrough. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Liu, Yi L <yi.l.liu@linux.intel.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Peter Xu	465238d9f8	pc: add 2.10 machine type CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Richard Henderson <rth@twiddle.net> CC: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-05-10 22:04:23 +03:00
Igor Mammedov	98e753a6e5	pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot Since 2.7 commit (`b2a575a` Add optionrom compatible with fw_cfg DMA version) regressed migration during firmware exection time by abusing fwcfg.dma_enabled property to decide loading dma version of option rom AND by mistake disabling DMA for 2.6 and earlier globally instead of only for option rom. so 2.6 machine type guest is broken when it already runs firmware in DMA mode but migrated to qemu-2.7(pc-2.6) at that time; a) qemu-2.6:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport) b) qemu-2.7:pc2.6 (fwcfg.dma=off,firmware=ioport,oprom=ioport) to: a b from a OK FAIL b OK OK So we currently have broken forward migration from qemu-2.6 to qemu-2.[789] that however could be fixed for 2.10 by re-enabling DMA for 2.[56] machine types and allowing dma capable option rom only since 2.7. As result qemu should end up with: c) qemu-2.10:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport) to: a b c from a OK FAIL OK b OK OK OK c OK FAIL OK where forward migration from qemu-2.6 to qemu-2.10 should work again leaving only qemu-2.[789]:pc-2.6 broken. Reported-by: Eduardo Habkost <ehabkost@redhat.com> Analyzed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-05-10 22:04:23 +03:00
Phil Dennis-Jordan	6103451aeb	hw/i386: Build-time assertion on pc/q35 reset register being identical. This adds a clarifying comment and build time assert to the FADT reset register field initialisation: the reset register is the same on both machine types. Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu> Message-Id: <1489558827-28971-3-git-send-email-phil@philjordan.eu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-05-03 12:29:40 +02:00
Peter Xu	dd4d607e40	intel_iommu: enable remote IOTLB This patch is based on Aviv Ben-David (<bd.aviv@gmail.com>)'s patch upstream: "IOMMU: enable intel_iommu map and unmap notifiers" https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html However I removed/fixed some content, and added my own codes. Instead of translate() every page for iotlb invalidations (which is slower), we walk the pages when needed and notify in a hook function. This patch enables vfio devices for VT-d emulation. And, since we already have vhost DMAR support via device-iotlb, a natural benefit that this patch brings is that vt-d enabled vhost can live even without ATS capability now. Though more tests are needed. Signed-off-by: Aviv Ben-David <bdaviv@cs.technion.ac.il> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Peter Xu	558e0024a4	intel_iommu: allow dynamic switch of IOMMU region This is preparation work to finally enabled dynamic switching ON/OFF for VT-d protection. The old VT-d codes is using static IOMMU address space, and that won't satisfy vfio-pci device listeners. Let me explain. vfio-pci devices depend on the memory region listener and IOMMU replay mechanism to make sure the device mapping is coherent with the guest even if there are domain switches. And there are two kinds of domain switches: (1) switch from domain A -> B (2) switch from domain A -> no domain (e.g., turn DMAR off) Case (1) is handled by the context entry invalidation handling by the VT-d replay logic. What the replay function should do here is to replay the existing page mappings in domain B. However for case (2), we don't want to replay any domain mappings - we just need the default GPA->HPA mappings (the address_space_memory mapping). And this patch helps on case (2) to build up the mapping automatically by leveraging the vfio-pci memory listeners. Another important thing that this patch does is to seperate IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not depend on the DMAR region (like before this patch). It should be a standalone region, and it should be able to be activated without DMAR (which is a common behavior of Linux kernel - by default it enables IR while disabled DMAR). Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Paolo Bonzini	8c9f42f3cf	tco: do not generate an NMI This behavior is not indicated in the datasheet and can confuse the OS. The TCO can trap NMIs from SERR# or IOCHK# and convert them to SMIs; but any other TCO event is either delivered as an SMI or completely disabled. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-04-05 17:23:52 +02:00
Paolo Bonzini	5354edd286	Revert "apic: save apic_delivered flag" This reverts commit `07bfa35477`. The global variable is only read as part of a apic_reset_irq_delivered(); qemu_irq_raise(s->irq); if (!apic_get_irq_delivered()) { sequence, so the value never matters at migration time. Reported-by: Dr. David Alan Gilbert <dglibert@redhat.com> Cc: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-27 14:41:01 +02:00
Eduardo Habkost	ec56a4a7b0	i386: Change stepping of Haswell to non-blacklisted value glibc blacklists TSX on Haswell CPUs with model==60 and stepping < 4. To make the Haswell CPU model more useful, make those guests actually use TSX by changing CPU stepping to 4. References: * glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359 https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359 Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170309181212.18864-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-03-10 15:01:09 -03:00
Dr. David Alan Gilbert	fc3a1fd74f	x86: Work around SMI migration breakages Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU due to a disagreement about SM (System management) interrupts. 2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI and this gets into the migration stream, but on 2.3.0 it never got delivered. ~2.4.0 SMI interrupt support was added but was broken - so that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI but never actually caused an interrupt. The SMI delivery was recently fixed by `68c6efe07a`, but the effect now is that an incoming 2.3.0 stream takes the interrupt it had flagged but it's bios can't actually handle it(I think partly due to the original interrupt not being taken during boot?). The consequence is a triple(?) fault and a reboot. Tested from: 2.3.1 -M 2.3.0 2.7.0 -M 2.3.0 2.8.0 -M 2.3.0 2.8.0 -M 2.8.0 This corresponds to RH bugzilla entry 1420679. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170223133441.16010-1-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-03 16:40:03 +01:00
Igor Mammedov	38690a1ca7	machine: move possible_cpus to MachineState so that it would be possible to reuse it with spapr/virt-aarch64 targets. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Aviv Ben-David	3b40f0e53c	intel_iommu: add "caching-mode" option This capability asks the guest to invalidate cache before each map operation. We can use this invalidation to trap map operations in the hypervisor. Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com> [peterx: using "caching-mode" instead of "cache-mode" to align with spec] [peterx: re-write the subject to make it short and clear] Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-02-17 21:52:31 +02:00
Marc-André Lureau	0ec7b3e7f2	char: rename CharDriverState Chardev Pick a uniform chardev type name. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-27 18:07:59 +01:00
Phil Dennis-Jordan	0b564e6f53	pc: Enable vmware-cpuid-freq CPU option for 2.9+ machine types Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu> Message-Id: <1484921496-11257-4-git-send-email-phil@philjordan.eu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-27 18:07:58 +01:00
Laszlo Ersek	b8bab8eb69	hw/isa/lpc_ich9: negotiate SMI broadcast on pc-q35-2.9+ machine types Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20170126014416.11211-4-lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-27 18:07:31 +01:00
Laszlo Ersek	5ce45c7a2b	hw/isa/lpc_ich9: add broadcast SMI feature The generic edk2 SMM infrastructure prefers EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each processor. If Trigger() only brings the current processor into SMM, then edk2 handles it in the following ways: (1) If Trigger() is executed by the BSP (which is guaranteed before ExitBootServices(), but is not necessarily true at runtime), then: (a) If edk2 has been configured for "traditional" SMM synchronization, then the BSP sends directed SMIs to the APs with APIC delivery, bringing them into SMM individually. Then the BSP runs the SMI handler / dispatcher. (b) If edk2 has been configured for "relaxed" SMM synchronization, then the APs that are not already in SMM are not brought in, and the BSP runs the SMI handler / dispatcher. (2) If Trigger() is executed by an AP (which is possible after ExitBootServices(), and can be forced e.g. by "taskset -c 1 efibootmgr"), then the AP in question brings in the BSP with a directed SMI, and the BSP runs the SMI handler / dispatcher. The smaller problem with (1a) and (2) is that the BSP and AP synchronization is slow. For example, the "taskset -c 1 efibootmgr" command from (2) can take more than 3 seconds to complete, because efibootmgr accesses non-volatile UEFI variables intensively. The larger problem is that QEMU's current behavior diverges from the behavior usually seen on physical hardware, and that keeps exposing obscure corner cases, race conditions and other instabilities in edk2, which generally expects / prefers a software SMI to affect all CPUs at once. Therefore introduce the "broadcast SMI" feature that causes QEMU to inject the SMI on all VCPUs. While the original posting of this patch <http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html> only intended to speed up (2), based on our recent "stress testing" of SMM this patch actually provides functional improvements. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20170126014416.11211-3-lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-27 18:07:31 +01:00
Laszlo Ersek	50de920b37	hw/isa/lpc_ich9: add SMI feature negotiation via fw_cfg Introduce the following fw_cfg files: - "etc/smi/supported-features": a little endian uint64_t feature bitmap, presenting the features known by the host to the guest. Read-only for the guest. The content of this file will be determined via bit-granularity ICH9-LPC device properties, to be introduced later. For now, the bitmask is left zeroed. The bits will be set from machine type compat properties and on the QEMU command line, hence this file is not migrated. - "etc/smi/requested-features": a little endian uint64_t feature bitmap, representing the features the guest would like to request. Read-write for the guest. The guest can freely (re)write this file, it has no direct consequence. Initial value is zero. A nonzero value causes the SMI-related fw_cfg files and fields that are under guest influence to be migrated. - "etc/smi/features-ok": contains a uint8_t value, and it is read-only for the guest. When the guest selects the associated fw_cfg key, the guest features are validated against the host features. In case of error, the negotiation doesn't proceed, and the "features-ok" file remains zero. In case of success, the "features-ok" file becomes (uint8_t)1, and the negotiated features are locked down internally (to which no further changes are possible until reset). The initial value is zero. A nonzero value causes the SMI-related fw_cfg files and fields that are under guest influence to be migrated. The C-language fields backing the "supported-features" and "requested-features" files are uint8_t arrays. This is because they carry guest-side representation (our choice is little endian), while VMSTATE_UINT64() assumes / implies host-side endianness for any uint64_t fields. If we migrate a guest between hosts with different endiannesses (which is possible with TCG), then the host-side value is preserved, and the host-side representation is translated. This would be visible to the guest through fw_cfg, unless we used plain byte arrays. So we do. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20170126014416.11211-2-lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-27 18:07:31 +01:00
Pavel Dovgalyuk	07bfa35477	apic: save apic_delivered flag This patch implements saving/restoring of static apic_delivered variable. v8: saving static variable only for one of the APICs Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20170126123429.5412.94368.stgit@PASHA-ISP> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-27 18:07:30 +01:00
Igor Mammedov	80e5db303d	machine: Make possible_cpu_arch_ids() return const pointer make sure that external callers won't try to modify possible_cpus and owner of possible_cpus can access it directly when it modifies it. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1484759609-264075-5-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-01-23 21:25:37 -02:00
Marcelo Tosatti	abc62c89f3	pc.h: move x-mach-use-reliable-get-clock compat entry to PC_COMPAT_2_8 As noticed by David Gilbert, commit `6053a86` 'kvmclock: reduce kvmclock differences on migration' added 'x-mach-use-reliable-get-clock' and a compatibility entry that turns it off; however it got merged after 2.8.0 was released but the entry has gone into PC_COMPAT_2_7 where it should have gone into PC_COMPAT_2_8. Fix it by moving the entry to PC_COMPAT_2_8. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170118175343.GA26873@amt.cnet> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-20 13:22:57 +01:00
Jason Wang	554f5e1604	intel_iommu: support device iotlb descriptor This patch enables device IOTLB support for intel iommu. The major work is to implement QI device IOTLB descriptor processing and notify the device through iommu notifier. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2017-01-10 05:56:58 +02:00
Marcelo Tosatti	6053a86fe7	kvmclock: reduce kvmclock difference on migration Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which indicates that KVM_GET_CLOCK returns a value as seen by the guest at that moment. For new machine types, use this value rather than reading from guest memory. This reduces kvmclock difference on migration from 5s to 0.1s (when max_downtime == 5s). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20161121105052.598267440@redhat.com> [Add comment explaining what is going on. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-12-22 16:00:56 +01:00
Chao Peng	feddd2fd91	pc: make pit configurable Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <1478330391-74060-4-git-send-email-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-12-22 16:00:25 +01:00
Chao Peng	272f042877	pc: make sata configurable Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <1478330391-74060-3-git-send-email-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-12-22 16:00:25 +01:00
Chao Peng	be232eb076	pc: make smbus configurable Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <1478330391-74060-2-git-send-email-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-12-22 16:00:25 +01:00
Dr. David Alan Gilbert	f9f885b78a	migration/pcspk: Turn migration of pcspk off for 2.7 and older To keep backwards migration compatibility allow us to turn pcspk migration off. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20161128133201.16104-3-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-11-28 16:45:12 +01:00
Igor Mammedov	e3cadac073	pc: fix FW_CFG_NB_CPUS to account for -device added CPUs Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479301481-197333-1-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-11-16 12:10:00 -02:00

1 2 3 4 5 ...

354 commits