Commit graph

13668 commits

Author SHA1 Message Date
Philippe Mathieu-Daudé 4860af2c4f target/s390x: Use s390_skeys_get|set() helper
Commit c9274b6bf0 ("target/s390x: start moving TCG-only code
to tcg/") moved mem_helper.c, but the trace-events file is
still in the parent directory, so is the generated trace.h.

Call the s390_skeys_get|set() helper, removing the need
for the trace event shared with the tcg/ sub-directory,
fixing the following build failure:

 In file included from ../target/s390x/tcg/mem_helper.c:33:
 ../target/s390x/tcg/trace.h:1:10: fatal error: 'trace/trace-target_s390x_tcg.h' file not found
 #include "trace/trace-target_s390x_tcg.h"

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20240613104415.9643-3-philmd@linaro.org>
2024-06-19 12:42:03 +02:00
Philippe Mathieu-Daudé 8291239113 target/i386: Remove X86CPU::kvm_no_smi_migration field
X86CPU::kvm_no_smi_migration was only used by the
pc-i440fx-2.3 machine, which got removed. Remove it
and simplify kvm_put_vcpu_events().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240617071118.60464-23-philmd@linaro.org>
2024-06-19 12:40:49 +02:00
Philippe Mathieu-Daudé 63f16d97c6 target/i386/kvm: Remove x86_cpu_change_kvm_default() and 'kvm-cpu.h'
x86_cpu_change_kvm_default() was only used out of kvm-cpu.c by
the pc-i440fx-2.1 machine, which got removed. Make it static,
and remove its declaration. "kvm-cpu.h" is now empty, remove it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240617071118.60464-10-philmd@linaro.org>
2024-06-19 12:40:49 +02:00
Paolo Bonzini 109238a8d9 target/i386: SEV: do not assume machine->cgs is SEV
There can be other confidential computing classes that are not derived
from sev-common.  Avoid aborting when encountering them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini 0c4da54883 target/i386: convert CMPXCHG to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini 7b1f25ac3a target/i386: convert XADD to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini 11ffaf8c73 target/i386: convert LZCNT/TZCNT/BSF/BSR/POPCNT to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini 6476902740 target/i386: convert SHLD/SHRD to new decoder
Use the same flag generation code as SHL and SHR, but use
the existing gen_shiftd_rm_T1 function to compute the result
as well as CC_SRC.

Decoding-wise, SHLD/SHRD by immediate count as a 4 operand
instruction because s->T0 and s->T1 actually occupy three op
slots.  The infrastructure used by opcodes in the 0F 3A table
works fine.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini e4e5981daf target/i386: adapt gen_shift_count for SHLD/SHRD
SHLD/SHRD can have 3 register operands - s->T0, s->T1 and either
1 or CL - and therefore decode->op[2] is taken by the low part
of the register being shifted.  Pass X86_OP_* to gen_shift_count
from its current callers and hardcode cpu_regs[R_ECX] as the
shift count.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini 87b2037b65 target/i386: pull load/writeback out of gen_shiftd_rm_T1
Use gen_ld_modrm/gen_st_modrm, moving them and gen_shift_flags to the
caller.  This way, gen_shiftd_rm_T1 becomes something that the new
decoder can call.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini ae541c0eb4 target/i386: convert non-grouped, helper-based 2-byte opcodes
These have very simple generators and no need for complex group
decoding.  Apart from LAR/LSL which are simplified to use
gen_op_deposit_reg_v and movcond, the code is generally lifted
from translate.c into the generators.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini 556c4c5cc4 target/i386: split X86_CHECK_prot into PE and VM86 checks
SYSENTER is allowed in VM86 mode, but not in real mode.  Split the check
so that PE and !VM86 are covered by separate bits.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini ea89aa895e target/i386: finish converting 0F AE to the new decoder
This is already partly implemented due to VLDMXCSR and VSTMXCSR; finish
the job.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini 10340080cd target/i386: fix bad sorting of entries in the 0F table
Aesthetic change only.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini e0448caebf target/i386: replace read_crN helper with read_cr8
All other control registers are stored plainly in CPUX86State.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:39 +02:00
Paolo Bonzini a1af7fba5a target/i386: convert MOV from/to CR and DR to new decoder
Complete implementation of C and D operand types, then the operations
are just MOVs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-17 09:47:35 +02:00
Paolo Bonzini 024538287e target/i386: fix processing of intercept 0 (read CR0)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini c0df9563a3 target/i386: replace NoSeg special with NoLoadEA
This is a bit more generic, as it can be applied to MPX as well.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini 4e2dc59cf9 target/i386: change X86_ENTRYwr to use T0, use it for moves
Just like X86_ENTRYr, X86_ENTRYwr is easily changed to use only T0.
In this case, the motivation is to use it for the MOV instruction
family.  The case when you need to preserve the input value is the
odd one, as it is used basically only for BLS* instructions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini c2b6b6a65a target/i386: change X86_ENTRYr to use T0
I am not sure why I made it use T1.  It is a bit more symmetric with
respect to X86_ENTRYwr (which uses T0 for the "w"ritten operand
and T1 for the "r"ead operand), but it is also less flexible because it
does not let you apply zextT0/sextT0.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini e628387cf9 target/i386: put BLS* input in T1, use generic flag writeback
This makes for easier cpu_cc_* setup, and not using set_cc_op()
should come in handy if QEMU ever implements APX.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini cc155f1971 target/i386: rewrite flags writeback for ADCX/ADOX
Avoid using set_cc_op() in preparation for implementing APX; treat
CC_OP_EFLAGS similar to the case where we have the "opposite" cc_op
(CC_OP_ADOX for ADCX and CC_OP_ADCX for ADOX), except the resulting
cc_op is not CC_OP_ADCOX. This is written easily as two "if"s, whose
conditions are both false for CC_OP_EFLAGS, both true for CC_OP_ADCOX,
and one each true for CC_OP_ADCX/ADOX.

The new logic also makes it easy to drop usage of tmp0.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Paolo Bonzini 4228eb8cc6 target/i386: remove CPUX86State argument from generator functions
CPUX86State argument would only be used to fetch bytes, but that has to be
done before the generator function is called.  So remove it, and all
temptation together with it.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:22 +02:00
Pankaj Gupta cd7093a7a1 i386/sev: Return when sev_common is null
Fixes Coverity CID 1546885.

Fixes: 16dcf200dc ("i386/sev: Introduce "sev-common" type to encapsulate common SEV state")
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240607183611.1111100-4-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:14 +02:00
Pankaj Gupta 48779faef3 i386/sev: Move SEV_COMMON null check before dereferencing
Fixes Coverity CID 1546886.

Fixes: 9861405a8f ("i386/sev: Invoke launch_updata_data() for SEV class")
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240607183611.1111100-3-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:29:01 +02:00
Pankaj Gupta c94eb5db8e i386/sev: fix unreachable code coverity issue
Set 'finish->id_block_en' early, so that it is properly reset.

Fixes coverity CID 1546887.

Fixes: 7b34df4426 ("i386/sev: Introduce 'sev-snp-guest' object")
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240607183611.1111100-2-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:28:34 +02:00
Chuang Xu 903916f0a0 i386/cpu: fixup number of addressable IDs for processor cores in the physical package
When QEMU is started with:
-cpu host,host-cache-info=on,l3-cache=off \
-smp 2,sockets=1,dies=1,cores=1,threads=2
Guest can't acquire maximum number of addressable IDs for processor cores in
the physical package from CPUID[04H].

When creating a CPU topology of 1 core per package, host-cache-info only
uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]),
resulting in a conflict (on the multicore Host) between the Guest core
topology information in this field and the Guest's actual cores number.

Fix it by removing the unnecessary condition to cover 1 core per package
case. This is safe because cores_per_pkg will not be 0 and will be at
least 1.

Fixes: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240611032314.64076-1-xuchuangxclwt@bytedance.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-11 14:25:22 +02:00
Anton Johansson 1967a1ea98 target/hexagon: idef-parser simplify predicate init
Only predicate instruction arguments need to be initialized by
idef-parser. This commit removes registers from the init_list and
simplifies gen_inst_init_args() slightly.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-5-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
2024-06-08 17:49:36 -07:00
Anton Johansson 95408ad8e2 target/hexagon: idef-parser fix leak of init_list
gen_inst_init_args() is called for instructions using a predicate as an
rvalue. Upon first call, the list of arguments which might need
initialization init_list is freed to indicate that they have been
processed. For instructions without an rvalue predicate,
gen_inst_init_args() isn't called and init_list will never be freed.

Free init_list from free_instruction() if it hasn't already been freed.
A comment in free_instruction is also updated.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-4-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
2024-06-08 17:49:27 -07:00
Anton Johansson 348fec2afe target/hexagon: idef-parser remove undefined functions
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-3-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
2024-06-08 17:49:23 -07:00
Anton Johansson 49c1f7a472 target/hexagon: idef-parser remove unused defines
Before switching to GArray/g_string_printf we used fixed size arrays for
output buffers and instructions arguments among other things.

Macros defining the sizes of these buffers were left behind, remove
them.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-2-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>
2024-06-08 17:49:16 -07:00
Matheus Tavares Bernardino e1b526f1d8 Hexagon: add PC alignment check and exception
The Hexagon Programmer's Reference Manual says that the exception 0x1e
should be raised upon an unaligned program counter. Let's implement that
and also add some tests.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <277b7aeda2c717a96d4dde936b3ac77707cb6517.1714755107.git.quic_mathbern@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
2024-06-08 17:48:50 -07:00
Matheus Tavares Bernardino a1852002c7 Hexagon: fix HVX store new
At 09a7e7db0f (Hexagon (target/hexagon) Remove uses of
op_regs_generated.h.inc, 2024-03-06), we've changed the logic of
check_new_value() to use the new pre-calculated
packet->insn[...].dest_idx instead of calculating the index on the fly
using opcode_reginfo[...]. The dest_idx index is calculated roughly like
the following:

    for reg in iset[tag]["syntax"]:
        if reg.is_written():
            dest_idx = regno
            break

Thus, we take the first register that is writtable. Before that,
however, we also used to follow an alphabetical order on the register
type: 'd', 'e', 'x', and 'y'. No longer following that makes us select
the wrong register index and the HVX store new instruction does not
update the memory like expected.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Message-Id: <f548dc1c240819c724245e887f29f918441e9125.1716220379.git.quic_mathbern@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
2024-06-08 17:48:50 -07:00
Richard Henderson 3e246da2c3 * scsi-disk: Don't silently truncate serial number
* backends/hostmem: Report error on unavailable qemu_madvise() features or unaligned memory sizes
 * target/i386: fixes and documentation for INHIBIT_IRQ/TF/RF and debugging
 * i386/hvf: Adds support for INVTSC cpuid bit
 * i386/hvf: Fixes for dirty memory tracking
 * i386/hvf: Use hv_vcpu_interrupt() and hv_vcpu_run_until()
 * hvf: Cleanups
 * stubs: fixes for --disable-system build
 * i386/kvm: support for FRED
 * i386/kvm: fix MCE handling on AMD hosts
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZkF2oUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPNlQf+N9y6Eh0nMEEQ69twtV8ytglTY+uX
 FsogvnsXHNMVubOWmmeItM6kFXTAkR9cmFaL8dqI1Gs03xEQdQXbF1KejJZOAZVl
 RQMOW8Fg2Afr+0lwqCXHvhsmZ4hr5yUkRndyucA/E9AO2uGrtgwsWGDBGaHJOZIA
 lAsEMOZgKjXHZnefXjhMrvpk/QNovjEV6f1RHX3oKZjKSI5/G4IqGSmwNYToot8p
 2fgs4Qti4+1gNyM2oBLq7cCMjMS61tSxOMH4uqVoIisjyckPlAFRvc+DXtKsUAAs
 9AgM++pNgpB0IXv67czRUNdRoK7OI8I0ULhI4qHXi6Yg2QYAHqpQ6WL4Lg==
 =RP7U
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* scsi-disk: Don't silently truncate serial number
* backends/hostmem: Report error on unavailable qemu_madvise() features or unaligned memory sizes
* target/i386: fixes and documentation for INHIBIT_IRQ/TF/RF and debugging
* i386/hvf: Adds support for INVTSC cpuid bit
* i386/hvf: Fixes for dirty memory tracking
* i386/hvf: Use hv_vcpu_interrupt() and hv_vcpu_run_until()
* hvf: Cleanups
* stubs: fixes for --disable-system build
* i386/kvm: support for FRED
* i386/kvm: fix MCE handling on AMD hosts

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZkF2oUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPNlQf+N9y6Eh0nMEEQ69twtV8ytglTY+uX
# FsogvnsXHNMVubOWmmeItM6kFXTAkR9cmFaL8dqI1Gs03xEQdQXbF1KejJZOAZVl
# RQMOW8Fg2Afr+0lwqCXHvhsmZ4hr5yUkRndyucA/E9AO2uGrtgwsWGDBGaHJOZIA
# lAsEMOZgKjXHZnefXjhMrvpk/QNovjEV6f1RHX3oKZjKSI5/G4IqGSmwNYToot8p
# 2fgs4Qti4+1gNyM2oBLq7cCMjMS61tSxOMH4uqVoIisjyckPlAFRvc+DXtKsUAAs
# 9AgM++pNgpB0IXv67czRUNdRoK7OI8I0ULhI4qHXi6Yg2QYAHqpQ6WL4Lg==
# =RP7U
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Jun 2024 01:33:46 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (42 commits)
  python: mkvenv: remove ensure command
  Revert "python: use vendored tomli"
  i386: Add support for overflow recovery
  i386: Add support for SUCCOR feature
  i386: Fix MCE support for AMD hosts
  docs: i386: pc: Avoid mentioning limit of maximum vCPUs
  target/i386: Add get/set/migrate support for FRED MSRs
  target/i386: enumerate VMX nested-exception support
  vmxcap: add support for VMX FRED controls
  target/i386: mark CR4.FRED not reserved
  target/i386: add support for FRED in CPUID enumeration
  hvf: Makes assert_hvf_ok report failed expression
  i386/hvf: Updates API usage to use modern vCPU run function
  i386/hvf: In kick_vcpu use hv_vcpu_interrupt to force exit
  i386/hvf: Fixes dirty memory tracking by page granularity RX->RWX change
  hvf: Consistent types for vCPU handles
  i386/hvf: Fixes some compilation warnings
  i386/hvf: Adds support for INVTSC cpuid bit
  stubs/meson: Fix qemuutil build when --disable-system
  scsi-disk: Don't silently truncate serial number
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-06-08 07:40:08 -07:00
John Allen 1ea1432199 i386: Add support for overflow recovery
Add cpuid bit definition for overflow recovery. This is needed in the case
where a deferred error has been sent to the guest, a guest process accesses the
poisoned memory, but the machine_check_poll function has not yet handled the
original deferred error. If overflow recovery is not set in this case, when we
handle the uncorrected error from the poisoned memory access, the overflow bit
will be set and will result in the guest being shut down.

By the time the MCE reaches the guest, the overflow has been handled
by the host and has not caused a shutdown, so include the bit unconditionally.

Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-4-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:39 +02:00
John Allen 2ba8b7ee63 i386: Add support for SUCCOR feature
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.

----
v2:
  - Add "succor" feature word.
  - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.

Reported-by: William Roche <william.roche@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-3-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:39 +02:00
John Allen 4b77512b27 i386: Fix MCE support for AMD hosts
For the most part, AMD hosts can use the same MCE injection code as Intel, but
there are instances where the qemu implementation is Intel specific. First, MCE
delivery works differently on AMD and does not support broadcast. Second,
kvm_mce_inject generates MCEs that include a number of Intel specific status
bits. Modify kvm_mce_inject to properly generate MCEs on AMD platforms.

Reported-by: William Roche <william.roche@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-2-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Xin Li 4ebd98eb3a target/i386: Add get/set/migrate support for FRED MSRs
FRED CPU states are managed in 9 new FRED MSRs, in addtion to a few
existing CPU registers and MSRs, e.g., CR4.FRED and MSR_IA32_PL0_SSP.

Save/restore/migrate FRED MSRs if FRED is exposed to the guest.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-7-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Xin Li ef202d64c3 target/i386: enumerate VMX nested-exception support
Allow VMX nested-exception support to be exposed in KVM guests, thus
nested KVM guests can enumerate it.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-6-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Xin Li f88ddc40c6 target/i386: mark CR4.FRED not reserved
The CR4.FRED bit, i.e., CR4[32], is no longer a reserved bit when FRED
is exposed to guests, otherwise it is still a reserved bit.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20231109072012.8078-3-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Xin Li c1acad9f72 target/i386: add support for FRED in CPUID enumeration
FRED, i.e., the Intel flexible return and event delivery architecture,
defines simple new transitions that change privilege level (ring
transitions).

The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0.  One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0.  Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.

In addition to these transitions, the FRED architecture defines a new
instruction (LKGS) for managing the state of the GS segment register.
The LKGS instruction can be used by 64-bit operating systems that do
not use the new FRED transitions.

WRMSRNS is an instruction that behaves exactly like WRMSR, with the
only difference being that it is not a serializing instruction by
default.  Under certain conditions, WRMSRNS may replace WRMSR to improve
performance.  FRED uses it to switch RSP0 in a faster manner.

Search for the latest FRED spec in most search engines with this search
pattern:

  site:intel.com FRED (flexible return and event delivery) specification

The CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[17] enumerates FRED, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[18] enumerates LKGS, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[19] enumerates WRMSRNS.

Add CPUID definitions for FRED/LKGS/WRMSRNS, and expose them to KVM guests.

Because FRED relies on LKGS and WRMSRNS, add that to feature dependency
map.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-2-xin3.li@intel.com>
[Fix order of dependencies, add dependencies from LM to FRED. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Phil Dennis-Jordan a59f5b2f83 i386/hvf: Updates API usage to use modern vCPU run function
macOS 10.15 introduced the more efficient hv_vcpu_run_until() function
to supersede hv_vcpu_run(). According to the documentation, there is no
longer any reason to use the latter on modern host OS versions, especially
after 11.0 added support for an indefinite deadline.

Observed behaviour of the newer function is that as documented, it exits
much less frequently - and most of the original function’s exits seem to
have been effectively pointless.

Another reason to use the new function is that it is a prerequisite for
using newer features such as in-kernel APIC support. (Not covered by
this patch.)

This change implements the upgrade by selecting one of three code paths
at compile time: two static code paths for the new and old functions
respectively, when building for targets where the new function is either
not available, or where the built executable won’t run on older
platforms lacking the new function anyway. The third code path selects
dynamically based on runtime detected availability of the weakly-linked
symbol.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-7-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Phil Dennis-Jordan bf9bf2306c i386/hvf: In kick_vcpu use hv_vcpu_interrupt to force exit
When interrupting a vCPU thread, this patch actually tells the hypervisor to
stop running guest code on that vCPU.

Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
hv_vcpus_exit on aarch64. Alternatively, if the vCPU thread
is not
running the VM, it will immediately cause an exit when it attempts
to do so.

Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
frequently, including many spurious exits, which made it less of a problem that
nothing was actively done to stop the vCPU thread running guest code.
The newer, more efficient hv_vcpu_run_until exits much more rarely, so a true
"kick" is needed before switching to that.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-6-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Phil Dennis-Jordan 3e2c6727cb i386/hvf: Fixes dirty memory tracking by page granularity RX->RWX change
When using x86 macOS Hypervisor.framework as accelerator, detection of
dirty memory regions is implemented by marking logged memory region
slots as read-only in the EPT, then setting the dirty flag when a
guest write causes a fault. The area marked dirty should then be marked
writable in order for subsequent writes to succeed without a VM exit.

However, dirty bits are tracked on a per-page basis, whereas the fault
handler was marking the whole logged memory region as writable. This
change fixes the fault handler so only the protection of the single
faulting page is marked as dirty.

(Note: the dirty page tracking appeared to work despite this error
because HVF’s hv_vcpu_run() function generated unnecessary EPT fault
exits, which ended up causing the dirty marking handler to run even
when the memory region had been marked RW. When using
hv_vcpu_run_until(), a change planned for a subsequent commit, these
spurious exits no longer occur, so dirty memory tracking malfunctions.)

Additionally, the dirty page is set to permit code execution, the same
as all other guest memory; changing memory protection from RX to RW not
RWX appears to have been an oversight.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-5-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Phil Dennis-Jordan 0e4e622e32 i386/hvf: Fixes some compilation warnings
A bunch of function definitions used empty parentheses instead of (void) syntax, yielding the following warning when building with clang on macOS:

warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]

In addition to fixing these function headers, it also fixes what appears to be a typo causing a variable to be unused after initialisation.

warning: variable 'entry_ctls' set but not used [-Wunused-but-set-variable]

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-3-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Phil Dennis-Jordan 9c267239c7 i386/hvf: Adds support for INVTSC cpuid bit
This patch adds the INVTSC bit to the Hypervisor.framework accelerator's
CPUID bit passthrough allow-list. Previously, specifying +invtsc in the CPU
configuration would fail with the following warning despite the host CPU
advertising the feature:

qemu-system-x86_64: warning: host doesn't support requested feature:
CPUID.80000007H:EDX.invtsc [bit 8]

x86 macOS itself relies on a fixed rate TSC for its own Mach absolute time
timestamp mechanism, so there's no reason we can't enable this bit for guests.
When the feature is enabled, a migration blocker is installed.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-2-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Mark Cave-Ayland 3973615e7f target/i386: fix size of EBP writeback in gen_enter()
The calculation of FrameTemp is done using the size indicated by mo_pushpop()
before being written back to EBP, but the final writeback to EBP is done using
the size indicated by mo_stacksize().

In the case where mo_pushpop() is MO_32 and mo_stacksize() is MO_16 then the
final writeback to EBP is done using MO_16 which can leave junk in the top
16-bits of EBP after executing ENTER.

Change the writeback of EBP to use the same size indicated by mo_pushpop() to
ensure that the full value is written back.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-5-mark.cave-ayland@ilande.co.uk>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Mark Cave-Ayland f1b8613da3 target/i386: fix SP when taking a memory fault during POP
When OS/2 Warp configures its segment descriptors, many of them are configured with
the P flag clear to allow for a fault-on-demand implementation. In the case where
the stack value is POPped into the segment registers, the SP is incremented before
calling gen_helper_load_seg() to validate the segment descriptor:

IN:
0xffef2c0c:  66 07                    popl     %es

OP:
 ld_i32 loc9,env,$0xfffffffffffffff8
 sub_i32 loc9,loc9,$0x1
 brcond_i32 loc9,$0x0,lt,$L0
 st16_i32 loc9,env,$0xfffffffffffffff8
 st8_i32 $0x1,env,$0xfffffffffffffffc

 ---- 0000000000000c0c 0000000000000000
 ext16u_i64 loc0,rsp
 add_i64 loc0,loc0,ss_base
 ext32u_i64 loc0,loc0
 qemu_ld_a64_i64 loc0,loc0,noat+un+leul,5
 add_i64 loc3,rsp,$0x4
 deposit_i64 rsp,rsp,loc3,$0x0,$0x10
 extrl_i64_i32 loc5,loc0
 call load_seg,$0x0,$0,env,$0x0,loc5
 add_i64 rip,rip,$0x2
 ext16u_i64 rip,rip
 exit_tb $0x0
 set_label $L0
 exit_tb $0x7fff58000043

If helper_load_seg() generates a fault when validating the segment descriptor then as
the SP has already been incremented, the topmost word of the stack is overwritten by
the arguments pushed onto the stack by the CPU before taking the fault handler. As a
consequence things rapidly go wrong upon return from the fault handler due to the
corrupted stack.

Update the logic for the existing writeback condition so that a POP into the segment
registers also calls helper_load_seg() first before incrementing the SP, so that if a
fault occurs the SP remains unaltered.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-4-mark.cave-ayland@ilande.co.uk>
Fixes: cc1d28bdbe ("target/i386: move 00-5F opcodes to new decoder", 2024-05-07)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Mark Cave-Ayland aea49fbb01 target/i386: use gen_writeback() within gen_POP()
Instead of directly implementing the writeback using gen_op_st_v(), use the
existing gen_writeback() function.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240606095319.229650-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Mark Cave-Ayland f41990f552 target/i386: use local X86DecodedOp in gen_POP()
This will make subsequent changes a little easier to read.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240606095319.229650-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00