linux

mirror of https://github.com/torvalds/linux synced 2024-10-24 12:17:07 +00:00

Author	SHA1	Message	Date
Paolo Bonzini	f7b9ddb8a5	3 fixes - memory leak on certain SIGP conditions - wrong size for idle bitmap (always too big) - clear local interrupts on initial CPU reset 1 performance improvement - improve performance with many guests on certain workloads -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJTMXvEAAoJEBF7vIC1phx8ii8P/2XI/aXFlhkITD/79rghPbjo sE76o5Rlz4UzM8khgEW/4kajehww/1/hO68ojcy1xUE1eMHnjk4sF7c5ozbKX7Qw a411B/IrkDEz3aVFLx8KXu2qSdCs22WDgyYFUR4kOBjStP+TvN7KH1NYw84CO+pA 7Mf+DAxF26X83q6nNvfnEq4cjrQEGmB0aRLkiT4PmHjs4WL+iimJmXeVTewhCtKm rE3N9AjpaZpqUQANXvdTJc5Cap/RbVMJf05EbIg5LrsEN+9xQHHRpkkjSRw+7XLx iY1uxgA8a92dGWY1RCSZe3lLS0ibg6LWH05hlhYOOCe1DBU0KVQJ5Txixu/ekm5j gv2BPpnquF+R2uYptTmF8Xq7TeP15kc3JjahrE8tZ16RH5dpZ7fn6fK/f3590JYB 4rt0xt5diQwaFRDcgT+8zLGvIq8DZ4ZH6KNElXli8megdY1hiOIlkb1R7Hq/RIt1 eacX2mycZAlf2ZUp3lVTHYxPL43WH2Qf2s4Y4mHlEmH8LQGGiFOl8Ne0Wgoeha9O JVvoHxTEqvzhWTgUi6n8cTlUsYvq6ICXhwCOPM5HLgbxCgbPbEv5EMOZxGAQJ0FK fQXasKpjxzPYyJ6XS8xeNel4yFQ+j8G1rvN4Q8kMLY4fjAj5sfm0WRrFw/gCb2ds ISfe6UOnoV9scrXvAMyH =7smd -----END PGP SIGNATURE----- Merge tag 'kvm-s390-20140325' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next 3 fixes - memory leak on certain SIGP conditions - wrong size for idle bitmap (always too big) - clear local interrupts on initial CPU reset 1 performance improvement - improve performance with many guests on certain workloads	2014-03-25 15:44:06 +01:00
Jens Freimann	2ed10cc15e	KVM: s390: clear local interrupts at cpu initial reset Empty list of local interrupts when vcpu goes through initial reset to provide a clean state Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-03-25 13:27:12 +01:00
Thomas Huth	91880d07fc	KVM: s390: Fix possible memory leak in SIGP functions When kvm_get_vcpu() returned NULL for the destination CPU in __sigp_emergency() or __sigp_external_call(), the memory for the "inti" structure was not released anymore. This patch fixes this issue by moving the check for !dst_vcpu before the kzalloc() call. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-03-25 13:27:11 +01:00
Jens Freimann	609433fbed	KVM: s390: fix calculation of idle_mask array size We need BITS_TO_LONGS, not sizeof(long) to calculate the correct size. idle_mask is a bitmask, each bit representing the state of a cpu. The desired outcome is an array of unsigned long fields that can fit KVM_MAX_VCPUS bits. We should not use sizeof(long) which returnes the size in bytes, but BITS_TO_LONGS Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-03-25 13:27:11 +01:00
Christian Borntraeger	f6c137ff00	KVM: s390: randomize sca address We allocate a page for the 2k sca, so lets use the space to improve hit rate of some internal cpu caches. No need to change the freeing of the page, as this will shift away the page offset bits anyway. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>	2014-03-25 13:27:10 +01:00
Paolo Bonzini	ea2108c930	Merge branch 'kvms390-irqfd' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next	2014-03-24 11:49:13 +01:00
Paolo Bonzini	673f7b4257	KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP After the previous patches, an interrupt whose bit is set in the IRR register will never be in the LAPIC's IRR and has never been injected on the migration source. So inject it on the destination. This fixes migration of Windows guests without HPET (they use the RTC to trigger the scheduler tick, and lose it after migration). Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-21 15:10:43 +01:00
Cornelia Huck	f3f710bc64	KVM: Bump KVM_MAX_IRQ_ROUTES for s390 The maximum number for irq routes is currently 1024, which is a bit on the small size for s390: We support up to 4 x 64k virtual devices with up to 64 queues, and we need one route for each of the queues if we want to operate it via irqfd. Let's bump this to 4k on s390 for now, as this at least covers the saner setups. We need to find a more general solution, though, as we can't just grow the routing table indefinitly. Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-03-21 13:43:13 +01:00
Cornelia Huck	8422359877	KVM: s390: irq routing for adapter interrupts. Introduce a new interrupt class for s390 adapter interrupts and enable irqfds for s390. This is depending on a new s390 specific vm capability, KVM_CAP_S390_IRQCHIP, that needs to be enabled by userspace. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-03-21 13:43:00 +01:00
Cornelia Huck	841b91c584	KVM: s390: adapter interrupt sources Add a new interface to register/deregister sources of adapter interrupts identified by an unique id via the flic. Adapters may also be maskable and carry a list of pinned pages. These adapters will be used by irq routing later. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-03-21 13:42:49 +01:00
Cornelia Huck	d938dc5522	KVM: Add per-vm capability enablement. Allow KVM_ENABLE_CAP to act on a vm as well as on a vcpu. This makes more sense when the caller wants to enable a vm-related capability. s390 will be the first user; wire it up. Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-03-21 13:42:39 +01:00
Paolo Bonzini	44847dea79	KVM: ioapic: extract body of kvm_ioapic_set_irq We will reuse it to process a nonzero IRR that is passed to KVM_SET_IRQCHIP. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-21 10:20:16 +01:00
Paolo Bonzini	0bc830b05c	KVM: ioapic: clear IRR for edge-triggered interrupts at delivery This ensures that IRR bits are set in the KVM_GET_IRQCHIP result only if the interrupt is still sitting in the IOAPIC. After the next patches, it avoids spurious reinjection of the interrupt when KVM_SET_IRQCHIP is called. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-21 10:20:10 +01:00
Paolo Bonzini	0b10a1c87a	KVM: ioapic: merge ioapic_deliver into ioapic_service Commonize the handling of masking, which was absent for kvm_ioapic_set_irq. Setting remote_irr does not need a separate function either, and merging the two functions avoids confusion. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-21 10:19:48 +01:00
James Hogan	36c9549460	MIPS: KVM: Remove dead code in CP0 emulation The code to check whether rd > MIPS_CP0_DESAVE is dead code, since MIPS_CP0_DESAVE = 31 and rd is already masked with 0x1f. Remove it. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-19 17:01:50 +01:00
James Hogan	26f4f3b578	MIPS: KVM: Consult HWREna before emulating RDHWR The ability to read hardware registers from userland with the RDHWR instruction should depend upon the corresponding bit of the HWREna register being set, otherwise a reserved instruction exception should be generated. However KVM's current emulation ignores the guest's HWREna and always emulates RDHWR instructions even if the guest OS has disallowed them. Therefore rework the RDHWR emulation code to check for privilege or the corresponding bit in the guest HWREna bit. Also remove the #if 0 case for the UserLocal register. I presume it was there for debug purposes but it seems unnecessary now that the guest can control whether it causes a guest exception. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-19 17:01:43 +01:00
James Hogan	1550567936	MIPS: KVM: Pass reserved instruction exceptions to guest Previously a reserved instruction exception while in guest code would cause a KVM internal error if kvm_mips_handle_ri() didn't recognise the instruction (including a RDHWR from an unrecognised hardware register). However the guest OS should really have the opportunity to catch the exception so that it can take the appropriate actions such as sending a SIGILL to the guest user process or emulating the instruction itself. Therefore in these cases emulate a guest RI exception and only return EMULATE_FAIL if that fails, being careful to revert the PC first in case the exception occurred in a branch delay slot in which case the PC will already point to the branch target. Also turn the printk messages relating to these cases into kvm_debug messages so that they aren't usually visible. This allows crashme to run in the guest without killing the entire VM. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-19 17:01:34 +01:00
James Hogan	2202794548	MIPS: KVM: asm/kvm_host.h: Clean up whitespace The whitespace in asm/kvm_host.h is quite inconsistent in places. Clean up the whole file to use tabs more consistently. When you use the --ignore-space-change argument to git diff this patch only changes line wrapping in TLB_IS_GLOBAL and TLB_IS_VALID macros. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-19 17:01:15 +01:00
Cornelia Huck	684a0b719d	KVM: eventfd: Fix lock order inversion. When registering a new irqfd, we call its ->poll method to collect any event that might have previously been pending so that we can trigger it. This is done under the kvm->irqfds.lock, which means the eventfd's ctx lock is taken under it. However, if we get a POLLHUP in irqfd_wakeup, we will be called with the ctx lock held before getting the irqfds.lock to deactivate the irqfd, causing lockdep to complain. Calling the ->poll method does not really need the irqfds.lock, so let's just move it after we've given up the irqfds.lock in kvm_irqfd_assign(). Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-18 17:06:04 +01:00
Paolo Bonzini	93c4adc7af	KVM: x86: handle missing MPX in nested virtualization When doing nested virtualization, we may be able to read BNDCFGS but still not be allowed to write to GUEST_BNDCFGS in the VMCS. Guard writes to the field with vmx_mpx_supported(), and similarly hide the MSR from userspace if the processor does not support the field. We could work around this with the generic MSR save/load machinery, but there is only a limited number of MSR save/load slots and it is not really worthwhile to waste one for a scenario that should not happen except in the nested virtualization case. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-17 12:21:39 +01:00
Paolo Bonzini	36be0b9deb	KVM: x86: Add nested virtualization support for MPX This is simple to do, the "host" BNDCFGS is either 0 or the guest value. However, both controls have to be present. We cannot provide MPX if we only have one of the "load BNDCFGS" or "clear BNDCFGS" controls. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-17 12:21:39 +01:00
Paolo Bonzini	4ff417320c	KVM: x86: introduce kvm_supported_xcr0() XSAVE support for KVM is already using host_xcr0 & KVM_SUPPORTED_XCR0 as a "dynamic" version of KVM_SUPPORTED_XCR0. However, this is not enough because the MPX bits should not be presented to the guest unless kvm_x86_ops confirms the support. So, replace all instances of host_xcr0 & KVM_SUPPORTED_XCR0 with a new function kvm_supported_xcr0() that also has this check. Note that here: if (xstate_bv & ~KVM_SUPPORTED_XCR0) return -EINVAL; if (xstate_bv & ~host_cr0) return -EINVAL; the code is equivalent to if ((xstate_bv & ~KVM_SUPPORTED_XCR0) \|\| (xstate_bv & ~host_cr0) return -EINVAL; i.e. "xstate_bv & (~KVM_SUPPORTED_XCR0 \| ~host_cr0)" which is in turn equal to "xstate_bv & ~(KVM_SUPPORTED_XCR0 & host_cr0)". So we should also use the new function there. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-17 12:21:38 +01:00
Paolo Bonzini	94b3ffcd41	Two patches: - one regression fix for reducing the amount of ucontrol userspace exits - get rid of BUG_ONs in hot inner loops -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJTJtUZAAoJEBF7vIC1phx8K90P/1AViGwY1RQjB3gE8vggOm0S 0PWG36vQw7zXy1P+veqyQqbusyn1TS5WcJn+5xg665nUv4s6PHIFYj8BcNDdcW1P GxzctKdaSY5dzunvY6gRhMcwT02h31xiAlajG39Bn8pEI/n/n+12jRIiuWndBqiC 3mpsKTLIscRPslxNkOLKbP2yiLW0rxo4rAsad7iRZSOjkLtGXbCB2hEew+osKGmK Grz1R54+FyYdUrLhHcmWLcfp4XX/aKP0pBaYxUBcNl6BR6QWRQMhcyBAK8UT19BS mXRXBf5TCIg2VMQVs0Mdi9eNjjUevgJu8FRc/kkq29an6kyOvcVfEIp4Hk8dH+yy suG70D1QVLRh/dhjhaTo+dDBSql1YHHcxuEwJ89oXuXaxNC7Lis4SyxSvDI7rntF aQv5Tf5dW4j+SEue483g8ImYeLoa7pEPE19rRhR7/+m8/D8acdQ/9993Pi+3KX3L QgV8aIZHRo9YQbaPE/ajLC85Vg1lXQFRGCWjk80v9YBrcshUdfLw9INinoCZ5m2V X+0kAPgVjqr4XYG5+ud5fy5mFYAR8EdhRDvv7fHF10a9UbGO2+1gN3wcewr4y/7j i0kk6jCLAnryeLkRX+teStbVrLMyyQTJKtnCTTG6WYNgj3KqTpXsCs14mofJFaH6 WVEMPvwvRpY3Xcslrq6e =Fq/F -----END PGP SIGNATURE----- Merge tag 'kvm-s390-20140317' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD Two patches: - one regression fix for reducing the amount of ucontrol userspace exits - get rid of BUG_ONs in hot inner loops	2014-03-17 12:21:35 +01:00
Igor Mammedov	6fec27d80f	KVM: x86 emulator: emulate MOVAPD Add emulation for 0x66 prefixed instruction of 0f 28 opcode that has been added earlier. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-17 12:14:30 +01:00
Igor Mammedov	27ce825823	KVM: x86 emulator: emulate MOVAPS HCK memory driver test fails when testing 32-bit Windows 8.1 with baloon driver. tracing KVM shows error: reason EXIT_ERR rip 0x81c18326 info 0 0 x/10i 0x81c18326-20 0x0000000081c18312: add %al,(%eax) 0x0000000081c18314: add %cl,-0x7127711d(%esi) 0x0000000081c1831a: rolb $0x0,0x80ec(%ecx) 0x0000000081c18321: and $0xfffffff0,%esp 0x0000000081c18324: mov %esp,%esi 0x0000000081c18326: movaps %xmm0,(%esi) 0x0000000081c18329: movaps %xmm1,0x10(%esi) 0x0000000081c1832d: movaps %xmm2,0x20(%esi) 0x0000000081c18331: movaps %xmm3,0x30(%esi) 0x0000000081c18335: movaps %xmm4,0x40(%esi) which points to MOVAPS instruction currently no emulated by KVM. Fix it by adding appropriate entries to opcode table in KVM's emulator. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-17 12:14:24 +01:00
Christian Borntraeger	2955c83f72	KVM: s390: Optimize ucontrol path Since commit `7c470539c9` (s390/kvm: avoid automatic sie reentry) we will run through the C code of KVM on host interrupts instead of just reentering the guest. This will result in additional ucontrol exits (at least HZ per second). Let handle a 0 intercept in the kernel and dont return to userspace, even if in ucontrol mode. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> CC: stable@vger.kernel.org	2014-03-17 11:06:51 +01:00
Dominik Dingel	fed495d25e	KVM: s390: Removing untriggerable BUG_ONs The BUG_ON in kvm-s390.c is unreachable, as we get the vcpu per common code, which itself does this from the private_data field of the file descriptor, and there is no KVM_UNCREATE_VCPU. The __{set,unset}_cpu_idle BUG_ONs are not triggerable because the vcpu creation code already checks against KVM_MAX_VCPUS. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-03-17 11:06:45 +01:00
Paolo Bonzini	8fbb1daf3e	Merge branch 'kvm-ppc-fix' into HEAD	2014-03-14 16:06:30 +01:00
Gabriel L. Somlo	100943c54e	kvm: x86: ignore ioapic polarity Both QEMU and KVM have already accumulated a significant number of optimizations based on the hard-coded assumption that ioapic polarity will always use the ActiveHigh convention, where the logical and physical states of level-triggered irq lines always match (i.e., active(asserted) == high == 1, inactive == low == 0). QEMU guests are expected to follow directions given via ACPI and configure the ioapic with polarity 0 (ActiveHigh). However, even when misbehaving guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow), QEMU will still use the ActiveHigh signaling convention when interfacing with KVM. This patch modifies KVM to completely ignore ioapic polarity as set by the guest OS, enabling misbehaving guests to work alongside those which comply with the ActiveHigh polarity specified by QEMU's ACPI tables. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu> [Move documentation to KVM_IRQ_LINE, add ia64. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-13 11:58:21 +01:00
Paul Mackerras	e724f080f5	KVM: PPC: Book3S HV: Fix register usage when loading/saving VRSAVE Commit `595e4f7e69` ("KVM: PPC: Book3S HV: Use load/store_fp_state functions in HV guest entry/exit") changed the register usage in kvmppc_save_fp() and kvmppc_load_fp() but omitted changing the instructions that load and save VRSAVE. The result is that the VRSAVE value was loaded from a constant address, and saved to a location past the end of the vcpu struct, causing host kernel memory corruption and various kinds of host kernel crashes. This fixes the problem by using register r31, which contains the vcpu pointer, instead of r3 and r4. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-13 10:47:01 +01:00
Paul Mackerras	a5b0ccb0b5	KVM: PPC: Book3S HV: Remove bogus duplicate code Commit `7b490411c3` ("KVM: PPC: Book3S HV: Add new state for transactional memory") incorrectly added some duplicate code to the guest exit path because I didn't manage to clean up after a rebase correctly. This removes the extraneous material. The presence of this extraneous code causes host crashes whenever a guest is run. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-13 10:46:52 +01:00
Paolo Bonzini	facb013969	KVM: svm: Allow the guest to run with dirty debug registers When not running in guest-debug mode (i.e. the guest controls the debug registers, having to take an exit for each DR access is a waste of time. If the guest gets into a state where each context switch causes DR to be saved and restored, this can take away as much as 40% of the execution time from the guest. If the guest is running with vcpu->arch.db == vcpu->arch.eff_db, we can let it write freely to the debug registers and reload them on the next exit. We still need to exit on the first access, so that the KVM_DEBUGREG_WONT_EXIT flag is set in switch_db_regs; after that, further accesses to the debug registers will not cause a vmexit. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 10:46:04 +01:00
Paolo Bonzini	5315c716b6	KVM: svm: set/clear all DR intercepts in one swoop Unlike other intercepts, debug register intercepts will be modified in hot paths if the guest OS is bad or otherwise gets tricked into doing so. Avoid calling recalc_intercepts 16 times for debug registers. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 10:46:04 +01:00
Paolo Bonzini	d16c293e4e	KVM: nVMX: Allow nested guests to run with dirty debug registers When preparing the VMCS02, the CPU-based execution controls is computed by vmx_exec_control. Turn off DR access exits there, too, if the KVM_DEBUGREG_WONT_EXIT bit is set in switch_db_regs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 10:46:03 +01:00
Paolo Bonzini	81908bf443	KVM: vmx: Allow the guest to run with dirty debug registers When not running in guest-debug mode (i.e. the guest controls the debug registers, having to take an exit for each DR access is a waste of time. If the guest gets into a state where each context switch causes DR to be saved and restored, this can take away as much as 40% of the execution time from the guest. If the guest is running with vcpu->arch.db == vcpu->arch.eff_db, we can let it write freely to the debug registers and reload them on the next exit. We still need to exit on the first access, so that the KVM_DEBUGREG_WONT_EXIT flag is set in switch_db_regs; after that, further accesses to the debug registers will not cause a vmexit. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 10:46:03 +01:00
Paolo Bonzini	c77fb5fe6f	KVM: x86: Allow the guest to run with dirty debug registers When not running in guest-debug mode, the guest controls the debug registers and having to take an exit for each DR access is a waste of time. If the guest gets into a state where each context switch causes DR to be saved and restored, this can take away as much as 40% of the execution time from the guest. After this patch, VMX- and SVM-specific code can set a flag in switch_db_regs, telling vcpu_enter_guest that on the next exit the debug registers might be dirty and need to be reloaded (syncing will be taken care of by a new callback in kvm_x86_ops). This flag can be set on the first access to a debug registers, so that multiple accesses to the debug registers only cause one vmexit. Note that since the guest will be able to read debug registers and enable breakpoints in DR7, we need to ensure that they are synchronized on entry to the guest---including DR6 that was not synced before. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 10:46:02 +01:00
Paolo Bonzini	360b948d88	KVM: x86: change vcpu->arch.switch_db_regs to a bit mask The next patch will add another bit that we can test with the same "if". Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 10:46:02 +01:00
Paolo Bonzini	c845f9c646	KVM: vmx: we do rely on loading DR7 on entry Currently, this works even if the bit is not in "min", because the bit is always set in MSR_IA32_VMX_ENTRY_CTLS. Mention it for the sake of documentation, and to avoid surprises if we later switch to MSR_IA32_VMX_TRUE_ENTRY_CTLS. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 10:46:01 +01:00
Jan Kiszka	c9a7953f09	KVM: x86: Remove return code from enable_irq/nmi_window It's no longer possible to enter enable_irq_window in guest mode when L1 intercepts external interrupts and we are entering L2. This is now caught in vcpu_enter_guest. So we can remove the check from the VMX version of enable_irq_window, thus the need to return an error code from both enable_irq_window and enable_nmi_window. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 08:41:47 +01:00
Jan Kiszka	220c567297	KVM: nVMX: Do not inject NMI vmexits when L2 has a pending interrupt According to SDM 27.2.3, IDT vectoring information will not be valid on vmexits caused by external NMIs. So we have to avoid creating such scenarios by delaying EXIT_REASON_EXCEPTION_NMI injection as long as we have a pending interrupt because that one would be migrated to L1's IDT vectoring info on nested exit. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 08:41:46 +01:00
Jan Kiszka	f4124500c2	KVM: nVMX: Fully emulate preemption timer We cannot rely on the hardware-provided preemption timer support because we are holding L2 in HLT outside non-root mode. Furthermore, emulating the preemption will resolve tick rate errata on older Intel CPUs. The emulation is based on hrtimer which is started on L2 entry, stopped on L2 exit and evaluated via the new check_nested_events hook. As we no longer rely on hardware features, we can enable both the preemption timer support and value saving unconditionally. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 08:41:45 +01:00
Jan Kiszka	b6b8a1451f	KVM: nVMX: Rework interception of IRQs and NMIs Move the check for leaving L2 on pending and intercepted IRQs or NMIs from the *_allowed handler into a dedicated callback. Invoke this callback at the relevant points before KVM checks if IRQs/NMIs can be injected. The callback has the task to switch from L2 to L1 if needed and inject the proper vmexit events. The rework fixes L2 wakeups from HLT and provides the foundation for preemption timer emulation. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-11 08:41:45 +01:00
Paolo Bonzini	b010926dc8	One fix for virtio-ccw, fixing a problem introduced with "virtio_ccw: fix vcdev pointer handling issues" and noticed just after it went into git. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJTGD7oAAoJEN7Pa5PG8C+vlA8QAJ1tJ7o4wfqklh/NV0D24FQM eC2TPvUigSS4AbFAwfYqNHkuK+k80CuUHhX4hqErJ7vsWSI+pkrk7fzLs+H+zrHf eE+hILdKoVAPgAB03+tPVn2o+sVfTKTaFIbiRDcR2afJdLAtAJ8B8hrmZSLJWP/w iLhYdZhab9J62XQkieearhV0KZwQYMBkG0KM33TGFdrQw4Crsar32QgH/DFuD7bJ 4K2sppvt+pHCiIjk5baHkTt8Ngvslk2af73OgQCb2wTrk5SrMN03yFhm88SErtgl bTb2ybVnl1FpzFG3zZIYS8QhVTS8mj8W/3ie6mJ2G4JtUJFgHISLwKGPegLRYgdi e99ahKfKBleR60pAgplWU+7beTzRe92QhX5VGu20qNafuRQMaWIxqqctybT2v8ga NW6pXzYVorxTLipDnwgHrcEA2IhBtwTl2wCX6SXkoAZms1EkNkvzL4PMbkjZETVQ ++TMgGN5GA6Ak/yVSJfOO13qe465BHvff6PbYcmN/vTX5wy6AlnDu5fsRoUwl7OS vKR9JhQJXG3eJ/13Kex7To3qoceDNJjslmJnDyM8eKKfYCOawwEZPJ54jB9odset IHFMPQvkJZzled6ByqqR2yxcGW4DPdYac9mVxKO1Z3eGP7/QWkhwqza5qJfT2+MA zfdATG/Ym7uTFBIbR9XS =qLB9 -----END PGP SIGNATURE----- Merge tag 'kvm-s390-20140306' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next One fix for virtio-ccw, fixing a problem introduced with "virtio_ccw: fix vcdev pointer handling issues" and noticed just after it went into git.	2014-03-06 12:50:54 +01:00
Heinz Graalfs	79629b208f	virtio_ccw: fix hang in set offline processing During set offline processing virtio_grab_drvdata() incorrectly calls dev_set_drvdata() to remove the virtio_ccw_device from the parent ccw_device's driver data. This is wrong and ends up in a hang during virtio_ccw_reset(), as the interrupt handler still has need of the virtio_ccw_device. A new field 'going_away' is introduced in struct virtio_ccw_device to control the usage of the ccw_device's driver data pointer in virtio_grab_drvdata(). Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-03-06 10:22:40 +01:00
Paolo Bonzini	1c2af4968e	Merge tag 'kvm-for-3.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into kvm-next	2014-03-04 15:58:00 +01:00
Paolo Bonzini	a2fa301fdd	Merge tag 'kvm-s390-20140304' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next	2014-03-04 12:32:03 +01:00
Andrew Jones	332967a3ea	x86: kvm: introduce periodic global clock updates commit `0061d53daf` introduced a mechanism to execute a global clock update for a vm. We can apply this periodically in order to propagate host NTP corrections. Also, if all vcpus of a vm are pinned, then without an additional trigger, no guest NTP corrections can propagate either, as the current trigger is only vcpu cpu migration. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-04 11:50:54 +01:00
Andrew Jones	7e44e4495a	x86: kvm: rate-limit global clock updates When we update a vcpu's local clock it may pick up an NTP correction. We can't wait an indeterminate amount of time for other vcpus to pick up that correction, so commit `0061d53daf` introduced a global clock update. However, we can't request a global clock update on every vcpu load either (which is what happens if the tsc is marked as unstable). The solution is to rate-limit the global clock updates. Marcelo calculated that we should delay the global clock updates no more than 0.1s as follows: Assume an NTP correction c is applied to one vcpu, but not the other, then in n seconds the delta of the vcpu system_timestamps will be c * n. If we assume a correction of 500ppm (worst-case), then the two vcpus will diverge 50us in 0.1s, which is a considerable amount. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-03-04 11:50:47 +01:00
Cornelia Huck	96b14536d9	virtio-ccw: virtio-ccw adapter interrupt support. Implement the new CCW_CMD_SET_IND_ADAPTER command and try to enable adapter interrupts for every device on the first startup. If the host does not support adapter interrupts, fall back to normal I/O interrupts. virtio-ccw adapter interrupts use the same isc as normal I/O subchannels and share a summary indicator for all devices sharing the same indicator area. Indicator bits for the individual virtqueues may be contained in the same indicator area for different devices. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-03-04 10:41:04 +01:00
Martin Schwidefsky	84ec96a615	s390/airq: add support for irq ranges Add airq_iv_alloc and airq_iv_free to allocate and free consecutive ranges of irqs from the interrupt vector. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-03-04 10:41:04 +01:00

1 2 3 4 5 ...

426757 commits