MP initialization protocol differs between cpu families, and for P6 and
onward models it is up to CPU to decide if it will be BSP using this
protocol, so try to model this. However there is no point in implementing
MP initialization protocol in qemu. Thus first CPU is always marked as BSP.
This patch:
- moves decision to designate BSP from board into cpu, making cpu
self-sufficient in this regard. Later it will allow to cleanup hw/pc.c
and remove cpu_reset and wrappers from there.
- stores flag that CPU is BSP in IA32_APIC_BASE to model behavior
described in Inted SDM vol 3a part 1 chapter 8.4.1
- uses MSR_IA32_APICBASE_BSP flag in apic_base for checking if cpu is BSP
patch is based on Jan Kiszka's proposal:
http://thread.gmane.org/gmane.comp.emulators.qemu/100806
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently, it is split between hd_geometry_guess() and
pc_cmos_init_late(). Confusing. info qtree shows the result of the
former. Also confusing.
Fold the part done in pc_cmos_init_late() into hd_geometry_guess().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are two producers of these hints: drive_init() on behalf of
-drive, and hd_geometry_guess().
The only consumer of the hint is hd_geometry_guess().
The callers of hd_geometry_guess() call it only when drive_init()
didn't set the hints. Therefore, drive_init()'s hints are never used.
Thus, hd_geometry_guess() only ever sees hints it produced itself in a
prior call. Only the first call computes something, subsequent calls
just repeat the first call's results. However, hd_geometry_guess() is
never called more than once: the device models don't, and the block
device is destroyed on unplug. Thus, dropping the repeat feature
doesn't break anything now.
If a block device wasn't destroyed on unplug and could be reused with
a new device, then repeating old results would be wrong. Thus,
dropping the repeat feature prevents future breakage.
This renders the hints unused. Purge them from the block layer.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
PC BIOS setup needs IDE geometry information. Get it directly from
the device model rather than through the block layer. In preparation
of purging geometry from the block layer, which will happen later in
this series.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 5bbdbb46 moved it to block.c because "other geometry guessing
functions already reside in block.c". Device-specific functionality
should be kept in device code, not the block layer. Move it back.
Disk geometry guessing is still in block.c. To be moved out in a
later patch series.
Bonus: the floppy type used in pc_cmos_init() now obviously matches
the one in the FDrive. Before, we relied on
bdrv_get_floppy_geometry_hint() picking the same type both in
fd_revalidate() and in pc_cmos_init().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds two things. First it allows QEMU to distinguish between
regular powerdown and S4 powerdown. Later separate QMP notification will
be added for S4 powerdown. Second it allows S3/S4 states to be disabled
from QEMU command line. Some guests known to be broken with regards to
power management, but allow to use it anyway. Using new properties
management will be able to disable S3/S4 for such guests.
Supported system state are passed to a firmware using new fw_cfg file.
The file contains 6 byte array. Each byte represents one system
state. If byte at offset X has its MSB set it means that system state
X is supported and to enter it guest should use the value from lowest 3
bits.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Allows us to use cpu_reset() in place of cpu_state_reset().
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Needed for pc_cpu_reset().
Also change return type to X86CPU.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* qemu-kvm/uq/master:
virtio/vhost: Add support for KVM in-kernel MSI injection
msix: Add msix_nr_vectors_allocated
kvm: Enable use of kvm_irqchip_in_kernel in hwlib code
kvm: Introduce kvm_irqchip_add/remove_irqfd
kvm: Make kvm_irqchip_commit_routes an internal service
kvm: Publicize kvm_irqchip_release_virq
kvm: Introduce kvm_irqchip_add_msi_route
kvm: Rename kvm_irqchip_add_route to kvm_irqchip_add_irq_route
msix: Introduce vector notifiers
msix: Invoke msix_handle_mask_update on msix_mask_all
msix: Factor out msix_get_message
kvm: update vmxcap for EPT A/D, INVPCID, RDRAND, VMFUNC
kvm: Enable in-kernel irqchip support by default
kvm: Add support for direct MSI injections
kvm: Update kernel headers
kvm: x86: Wire up MSI support for in-kernel irqchip
pc: Enable MSI support at APIC level
kvm: Introduce basic MSI support for in-kernel irqchips
Introduce MSIMessage structure
kvm: Refactor KVMState::max_gsi to gsi_count
* sstabellini/for_1.1_rc3:
Call xc_domain_shutdown with the reboot flag when the guest requests a reboot.
xen: Fix PV-on-HVM
xen_disk: properly update stats in ioreq_release()
xen_disk: use bdrv_aio_flush instead of bdrv_flush
xen_disk: remove syncwrite option
xen: disable rtc_clock
xen: do not initialize the interval timer and PCSPK emulator
If you start guest with floppy drive but without media inserted, guest
still should see floppy drive pressent.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
PIT and PCSPK are emulated by the hypervisor so we don't need to emulate
them in Qemu: this patch prevents Qemu from waking up needlessly at
PIT_FREQ on Xen.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Push msi_supported enabling to the APIC implementations where we can
encapsulate the decision more cleanly, hiding the details from the
generic code.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
xc_hvm_inject_msi is only available on Xen >= 4.2: add a dummy
compatibility function for Xen < 4.2.
Also enable msi support only on Xen >= 4.2.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
The official spelling is QEMU.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
[blauwirbel@gmail.com: fixed comment style in hw/sun4m.c]
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Scripted conversion:
for file in hw/apic.h hw/kvm/apic.c hw/kvmvapic.c hw/pc.c hw/vmport.c hw/xen_machine_pv.c; do
sed -i "s/CPUState/CPUX86State/g" $file
done
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Frees the identifier cpu_reset for QOM CPUs (manual rename).
Don't hide the parameter type behind explicit casts, use static
functions with strongly typed argument to indirect.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
This provides the required user space stubs to enable the in-kernel
i8254 emulation of KVM.
The in-kernel model supports lost tick compensation according to the
"delay" policy. This is enabled by default and can be switched off via a
device property.
Depending on the feature set of the host kernel (before 2.6.32), we may
have to disable the HPET or lack sound output from the PC speaker.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Floppies must be read at a specific transfer rate, depending of its own format.
Update floppy description table to include required transfer rate.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch switches pc s3 suspend over to the new infrastructure.
The cmos_s3 qemu_irq is killed, the new notifier is used instead.
The xen hack goes away with that too, the hypercall can simply be
done in a notifier function now.
This patch also makes the guest actually stay suspended instead
of leaving suspend instantly, so it is useful for more than just
testing whenever the suspend/resume cycle actually works.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Convert the PC speaker device to a qdev ISA model. Move the public
interface to a dedicated header file at this chance.
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When the HPET enters legacy mode, the IRQ output of the PIT is
suppressed and replaced by the HPET timer 0. But the current code to
emulate this was broken in many ways. It reset the PIT state after
re-enabling, it worked against a stale static PIT structure, and it did
not properly saved/restored the IRQ output mask in the PIT vmstate.
This patch solves the PIT IRQ control in a different way. On x86, it
both redirects the PIT IRQ to the HPET, just like the RTC. But it also
keeps the control line from the HPET to the PIT. This allows to disable
the PIT QEMU timer when it is not needed. The PIT's view on the control
line state is now saved in the same format that qemu-kvm is already
using.
Note that, in contrast to the suppressed RTC IRQ line, we do not need to
save/restore the PIT line state in the HPET. As we trigger a PIT IRQ
update via the control line, the line state is reconstructed on mode
switch.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
HPET legacy emulation will require control over the PIT IRQ output. To
enable this, add support for an alternative IRQ output object to the PIT
factory function. If the isa_irq number is < 0, this object will be
used.
This also removes the IRQ number property from the PIT class as we now
use a generic GPIO output pin that is connected by the factory function.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move the public interface of the PIT into its own header file and update
all users.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Replace device_init() with generalized type_init().
While at it, unify naming convention: type_init([$prefix_]register_types)
Also, type_init() is a function, so add preceding blank line where
necessary and don't put a semicolon after the closing brace.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: malc <av1474@comtv.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To both avoid that kvm_irqchip_in_kernel always has to be paired with
kvm_enabled and that the former ends up in a function call, implement it
like the latter. This means keeping the state in a global variable and
defining kvm_irqchip_in_kernel as a preprocessor macro.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This was done in a mostly automated fashion. I did it in three steps and then
rebased it into a single step which avoids repeatedly touching every file in
the tree.
The first step was a sed-based addition of the parent type to the subclass
registration functions.
The second step was another sed-based removal of subclass registration functions
while also adding virtual functions from the base class into a class_init
function as appropriate.
Finally, a python script was used to convert the DeviceInfo structures and
qdev_register_subclass functions to TypeInfo structures, class_init functions,
and type_register_static calls.
We are almost fully converted to QOM after this commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This converts two devices at once because PIC subclasses ISA and converting
subclasses independently is extremely hard.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qemu-kvm/uq/master:
kvm: Activate in-kernel irqchip support
kvm: x86: Add user space part for in-kernel IOAPIC
kvm: x86: Add user space part for in-kernel i8259
kvm: x86: Add user space part for in-kernel APIC
kvm: x86: Establish IRQ0 override control
kvm: Introduce core services for in-kernel irqchip support
memory: Introduce memory_region_init_reservation
ioapic: Factor out base class for KVM reuse
ioapic: Drop post-load irr initialization
i8259: Factor out base class for KVM reuse
i8259: Completely privatize PicState
apic: Open-code timer save/restore
apic: Factor out base class for KVM reuse
apic: Introduce apic_report_irq_delivered
apic: Inject external NMI events via LINT1
apic: Stop timer on reset
kvm: Move kvmclock into hw/kvm folder
msi: Generalize msix_supported to msi_supported
hyper-v: initialize Hyper-V CPUID leaves.
hyper-v: introduce Hyper-V support infrastructure.
Conflicts:
Makefile.target
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Improve VGA selection logic, push check for device availabilty to vl.c.
Create the devices at board level unconditionally.
Remove now unused pci_try_create*() functions.
Make PCI VGA devices optional.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This introduces the alternative APIC device which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting a NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.
MSI is not yet supported, so we disable this when the in-kernel model is
in use.
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support of the kernel. Set the fwcfg
value correspondingly. This aligns us with qemu-kvm.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Rename msix_supported to msi_supported and control MSI and MSI-X
activation this way. That was likely to original intention for this
flag, but MSI support came after MSI-X.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Currently creating a memory region automatically registers it for
live migration. This differs from other state (which is enumerated
in a VMStateDescription structure) and ties the live migration code
into the memory core.
Decouple the two by introducing a separate API, vmstate_register_ram(),
for registering a RAM block for migration. Currently the same
implementation is reused, but later it can be moved into a separate list,
and registrations can be moved to VMStateDescription blocks.
Signed-off-by: Avi Kivity <avi@redhat.com>
qemu-kvm passes numa/SRAT topology information for smp_cpus to SeaBIOS. However
SeaBIOS always expects to setup max_cpus number of SRAT cpu entries
(MaxCountCPUs variable in build_srat function of Seabios). When qemu-kvm runs
with smp_cpus != max_cpus (e.g. -smp 2,maxcpus=4), Seabios will mistakenly use
memory SRAT info for setting up CPU SRAT entries for the offline CPUs. Wrong
SRAT memory entries are also created. This breaks NUMA in a guest.
Fix by setting up SRAT info for max_cpus in qemu-kvm.
Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
NULL is a valid bus/device, so there is no change in behaviour.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fix a use-while-uninitialized of the fd_type[] array (introduced
in commit 34d4260e1, noticed by Coverity). This is more theoretical
than practical, since it's quite hard to get here with floppy==NULL
(the qdev_try_create() of the isa-fdc device has to fail).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
sgabios hasn't gotten a lot of coverage since it was not shipped. For 1.0,
let's disable the automatic loading of the option ROM in -nographic
mode. We can put it back for 1.1.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
apic id returned to guest kernel in ebx for cpuid(function=1) depends on
CPUX86State->cpuid_apic_id which gets populated after the cpuid information
is cached in the host kernel. This results in broken CPU topology in guest.
Fix this by setting cpuid_apic_id before cpuid information is passed to
the host kernel. This is done by moving the setting of cpuid_apic_id
to cpu_x86_init() where it will work for both KVM as well as TCG modes.
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bharata B Rao <bharata.rao@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Commit 63ffb564 broke floppy devices specified on the command line like
-drive file=...,if=none,id=floppy -global isa-fdc.driveA=floppy because it
relies on drive_get() which works only with -fda/-drive if=floppy.
This patch resembles what we're already doing for IDE, i.e. remember the floppy
device that was created and use that to extract the BlockDriverStates where
needed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
The master PIC is connected to the LINTIN0 of the APICs. As the APIC
currently does not track the state of that line, we have to ask the PIC
to reinject its IRQ after the CPU picked up an event from the APIC.
This introduces pic_get_output to read the master PIC IRQ line state
without changing it. The APIC uses this function to decide if a PIC IRQ
should be reinjected on apic_update_irq. This reflects better how the
real hardware works.
The patch fixes some failures of the kvm unit tests apic and eventinj by
allowing to enable the proper CPU IRQ deassertion when the guest masks
some pending IRQs at PIC level.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The ISA bus IRQ range is 0..15. What isa_irq_handler and IsaIrqState are
actually dealing with are the Global System Interrupts. Refactor the
code to clarify this.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
IsaIrqState::ioapic is always non-NULL. Probably, the concrete
qemu_irq was supposed to be tested, but that's already done by
qemu_set_irq.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>