Merge pull request #32359 from poettering/vmspawn-hyperv-enlight

some hyperv related enhancement in detect-virt + vmspawn
Luca Boccassi 2024-04-20 14:40:14 +02:00 committed by GitHub
commit 6e6deacc61
GPG key ID: B5690EEEBB952194
10 changed files with 128 additions and 14 deletions

10
NEWS

@ -603,6 +603,12 @@ CHANGES WITH 256-rc1:
--ssh-key-type= to optionally set up transient SSH keys to pass to the
invoked VMs in order to be able to SSH into them once booted.
* systemd-vmspawn will now enable various "HyperV enlightenments" and
the "VM Generation ID" on the VMs.
* A new environment variable $SYSTEMD_VMSPAWN_QEMU_EXTRA may carry
additional qemu command line options to pass to qemu.
systemd-repart:
* systemd-repart gained new options --generate-fstab= and
@ -638,6 +644,10 @@ CHANGES WITH 256-rc1:
sd_journal_stream_fd() but creates a log stream targeted at a
specific log namespace.
* The sd-id128 API gained a new API call
sd_id128_get_invocation_app_specific() for acquiring an app-specific
ID that is derived from the service invocation ID.
systemd-cryptsetup/systemd-cryptenroll:
* systemd-cryptenroll can now enroll directly with a PKCS11 public key

3
TODO

@ -329,10 +329,7 @@ Features:
PCRs.
* vmspawn:
- enable hyperv extension by default (https://www.qemu.org/docs/master/system/i386/hyperv.html)
- register with machined
- run in scope unit when invoked from command line, and machined registration is off
- support --directory= via virtiofs
- sd_notify support
- --ephemeral support
- --read-only support


@ -191,6 +191,9 @@ All tools:
expected format is six groups of two hexadecimal digits separated by colons,
e.g. `SYSTEMD_VMSPAWN_NETWORK_MAC=12:34:56:78:90:AB`
* `$SYSTEMD_VMSPAWN_QEMU_EXTRA=…` may contain additional command line
arguments to append to the qemu command line.
`systemd-logind`:
* `$SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1` — if set, report that


@ -63,6 +63,12 @@
<paramdef>sd_id128_t *<parameter>ret</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_id128_get_invocation_app_specific</function></funcdef>
<paramdef>sd_id128_t <parameter>app_id</parameter></paramdef>
<paramdef>sd_id128_t *<parameter>ret</parameter></paramdef>
</funcprototype>
</funcsynopsis>
</refsynopsisdiv>
@ -126,12 +132,16 @@
for details. The ID is cached internally. In future a different mechanism to determine the invocation ID
may be added.</para>
<para><function>sd_id128_get_invocation_app_specific()</function> derives an application-specific ID from
the invocation ID.</para>
<para>Note that <function>sd_id128_get_machine_app_specific()</function>,
<function>sd_id128_get_boot()</function>, <function>sd_id128_get_boot_app_specific()</function>, and
<function>sd_id128_get_invocation()</function> always return UUID Variant 1 Version 4 compatible IDs.
<function>sd_id128_get_machine()</function> will also return a UUID Variant 1 Version 4 compatible ID on
new installations but might not on older. It is possible to convert the machine ID non-reversibly into a
UUID Variant 1 Version 4 compatible one. For more information, see
<function>sd_id128_get_boot()</function>, <function>sd_id128_get_boot_app_specific()</function>,
<function>sd_id128_get_invocation()</function> and
<function>sd_id128_get_invocation_app_specific()</function> always return UUID Variant 1 Version 4
compatible IDs. <function>sd_id128_get_machine()</function> will also return a UUID Variant 1 Version 4
compatible ID on new installations but might not on older. It is possible to convert the machine ID
non-reversibly into a UUID Variant 1 Version 4 compatible one. For more information, see
<citerefentry><refentrytitle>machine-id</refentrytitle><manvolnum>5</manvolnum></citerefentry>. It is
hence guaranteed that these functions will never return the ID consisting of all zero or all one bits
(<constant>SD_ID128_NULL</constant>, <constant>SD_ID128_ALLF</constant>) — with the possible exception of
@ -262,6 +272,7 @@ As man:sd-id128(3) macro:
<para><function>sd_id128_get_machine_app_specific()</function> was added in version 233.</para>
<para><function>sd_id128_get_boot_app_specific()</function> was added in version 240.</para>
<para><function>sd_id128_get_app_specific()</function> was added in version 255.</para>
<para><function>sd_id128_get_invocation_app_specific()</function> was added in version 256.</para>
</refsect1>
<refsect1>
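The man page text above says the app-specific getters always return "UUID Variant 1 Version 4 compatible" IDs and can therefore never be all zero or all one bits. A minimal sketch of what that means at the byte level follows. The helper names `sketch_make_v4_uuid()` and `sketch_is_v4_uuid()` are hypothetical and are not part of sd-id128; the bit layout is the one defined by RFC 4122.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an sd-id128 API): force the version and variant
 * fields of a 16-byte ID to the values RFC 4122 defines for a v4 UUID. */
void sketch_make_v4_uuid(uint8_t id[16]) {
        assert(id);

        id[6] = (id[6] & 0x0F) | 0x40; /* high nibble of byte 6: version 4 */
        id[8] = (id[8] & 0x3F) | 0x80; /* top two bits of byte 8: variant 1 (binary 10) */
}

/* Hypothetical helper: check whether an ID carries those field values. */
bool sketch_is_v4_uuid(const uint8_t id[16]) {
        assert(id);

        return (id[6] & 0xF0) == 0x40 && (id[8] & 0xC0) == 0x80;
}
```

Because byte 6 always has a 0 bit and a 1 bit in its high nibble, and byte 8 always starts with the bits 10, an ID that passes `sketch_is_v4_uuid()` cannot equal `SD_ID128_NULL` or `SD_ID128_ALLF`.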


@ -447,7 +447,7 @@ static Virtualization detect_vm_zvm(void) {
/* Returns a short identifier for the various VM implementations */
Virtualization detect_vm(void) {
static thread_local Virtualization cached_found = _VIRTUALIZATION_INVALID;
bool other = false;
bool other = false, hyperv = false;
int xen_dom0 = 0;
Virtualization v, dmi;
@ -504,7 +504,12 @@ Virtualization detect_vm(void) {
v = detect_vm_cpuid();
if (v < 0)
return v;
if (v == VIRTUALIZATION_VM_OTHER)
if (v == VIRTUALIZATION_MICROSOFT)
/* QEMU sets the CPUID string to hyperv's, in case it provides hyperv enlightenments. Let's
* hence not return Microsoft here but just use the other mechanisms first to make a better
* decision. */
hyperv = true;
else if (v == VIRTUALIZATION_VM_OTHER)
other = true;
else if (v != VIRTUALIZATION_NONE)
goto finish;
@ -545,8 +550,15 @@ Virtualization detect_vm(void) {
return v;
finish:
if (v == VIRTUALIZATION_NONE && other)
v = VIRTUALIZATION_VM_OTHER;
/* None of the checks above gave us a clear answer, hence let's now use fallback logic: if hyperv
* enlightenments are available but the VMM wasn't recognized as anything yet, it's probably
* Microsoft. */
if (v == VIRTUALIZATION_NONE) {
if (hyperv)
v = VIRTUALIZATION_MICROSOFT;
else if (other)
v = VIRTUALIZATION_VM_OTHER;
}
cached_found = v;
log_debug("Found VM virtualization %s", virtualization_to_string(v));
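The fallback order introduced by this hunk can be sketched in isolation. This is a simplified model under stated assumptions, not the code from the patch: the real `detect_vm()` runs several probes (DMI, CPUID, Xen, hypervisor files and others), while the sketch collapses them into one CPUID result and one combined "later checks" result. The enum `SketchVirt` and the function `sketch_resolve()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced result set; the real Virtualization enum has many more members. */
typedef enum {
        SKETCH_VIRT_NONE,
        SKETCH_VIRT_KVM,
        SKETCH_VIRT_MICROSOFT,
        SKETCH_VIRT_VM_OTHER,
} SketchVirt;

/* cpuid: result of the CPUID probe.
 * later: combined result of all checks that run after it (NONE if they found nothing). */
SketchVirt sketch_resolve(SketchVirt cpuid, SketchVirt later) {
        bool hyperv = false, other = false;
        SketchVirt v;

        if (cpuid == SKETCH_VIRT_MICROSOFT)
                hyperv = true;          /* may be QEMU with enlightenments: keep looking */
        else if (cpuid == SKETCH_VIRT_VM_OTHER)
                other = true;           /* some VM, vendor unknown: keep looking */
        else if (cpuid != SKETCH_VIRT_NONE)
                return cpuid;           /* clear answer, stop here */

        v = later;

        /* Fallback: only when nothing more specific was found. */
        if (v == SKETCH_VIRT_NONE) {
                if (hyperv)
                        v = SKETCH_VIRT_MICROSOFT;
                else if (other)
                        v = SKETCH_VIRT_VM_OTHER;
        }

        return v;
}
```

With this ordering, a guest whose CPUID reports the Hyper-V signature but whose later checks identify KVM resolves to KVM, while a guest with no more specific match still resolves to Microsoft.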


@ -839,6 +839,7 @@ LIBSYSTEMD_256 {
global:
sd_bus_creds_get_pidfd_dup;
sd_bus_creds_new_from_pidfd;
sd_id128_get_invocation_app_specific;
sd_journal_stream_fd_with_namespace;
sd_event_source_get_inotify_path;
} LIBSYSTEMD_255;


@ -390,3 +390,16 @@ _public_ int sd_id128_get_boot_app_specific(sd_id128_t app_id, sd_id128_t *ret)
return sd_id128_get_app_specific(id, app_id, ret);
}
_public_ int sd_id128_get_invocation_app_specific(sd_id128_t app_id, sd_id128_t *ret) {
sd_id128_t id;
int r;
assert_return(ret, -EINVAL);
r = sd_id128_get_invocation(&id);
if (r < 0)
return r;
return sd_id128_get_app_specific(id, app_id, ret);
}


@ -53,6 +53,7 @@ int sd_id128_get_invocation(sd_id128_t *ret);
int sd_id128_get_app_specific(sd_id128_t base, sd_id128_t app_id, sd_id128_t *ret);
int sd_id128_get_machine_app_specific(sd_id128_t app_id, sd_id128_t *ret);
int sd_id128_get_boot_app_specific(sd_id128_t app_id, sd_id128_t *ret);
int sd_id128_get_invocation_app_specific(sd_id128_t app_id, sd_id128_t *ret);
#define SD_ID128_ARRAY(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14, v15) \
{ .bytes = { 0x##v0, 0x##v1, 0x##v2, 0x##v3, 0x##v4, 0x##v5, 0x##v6, 0x##v7, \


@ -199,7 +199,7 @@ TEST(id128) {
}
TEST(sd_id128_get_invocation) {
sd_id128_t id;
sd_id128_t id = SD_ID128_NULL;
int r;
/* Query the invocation ID */
@ -208,6 +208,36 @@ TEST(sd_id128_get_invocation) {
log_warning_errno(r, "Failed to get invocation ID, ignoring: %m");
else
log_info("Invocation ID: " SD_ID128_FORMAT_STR, SD_ID128_FORMAT_VAL(id));
sd_id128_t appid = SD_ID128_NULL;
r = sd_id128_get_invocation_app_specific(SD_ID128_MAKE(59,36,e9,92,fd,11,42,fe,87,c9,e9,b5,6c,9e,4f,04), &appid);
if (r < 0)
log_warning_errno(r, "Failed to get app-specific invocation ID, ignoring: %m");
else {
assert(!sd_id128_equal(id, appid));
log_info("Per-App Invocation ID: " SD_ID128_FORMAT_STR, SD_ID128_FORMAT_VAL(appid));
}
sd_id128_t appid2 = SD_ID128_NULL;
r = sd_id128_get_invocation_app_specific(SD_ID128_MAKE(59,36,e9,92,fd,11,42,fe,87,c9,e9,b5,6c,9e,4f,05), &appid2); /* slightly different appid */
if (r < 0)
log_warning_errno(r, "Failed to get app-specific invocation ID, ignoring: %m");
else {
assert(!sd_id128_equal(id, appid2));
assert(!sd_id128_equal(appid, appid2));
log_info("Per-App Invocation ID 2: " SD_ID128_FORMAT_STR, SD_ID128_FORMAT_VAL(appid2));
}
sd_id128_t appid3 = SD_ID128_NULL;
r = sd_id128_get_invocation_app_specific(SD_ID128_MAKE(59,36,e9,92,fd,11,42,fe,87,c9,e9,b5,6c,9e,4f,04), &appid3); /* same appid as before */
if (r < 0)
log_warning_errno(r, "Failed to get app-specific invocation ID, ignoring: %m");
else {
assert(!sd_id128_equal(id, appid3));
assert(sd_id128_equal(appid, appid3));
assert(!sd_id128_equal(appid2, appid3));
log_info("Per-App Invocation ID 3: " SD_ID128_FORMAT_STR, SD_ID128_FORMAT_VAL(appid3));
}
}
TEST(benchmark_sd_id128_get_machine_app_specific) {


@ -1294,6 +1294,24 @@ static int run_virtual_machine(int kvm_device_fd, int vhost_device_fd) {
if (strv_extend_many(&cmdline, "-uuid", SD_ID128_TO_UUID_STRING(arg_uuid)) < 0)
return log_oom();
/* Derive a vmgenid automatically from the invocation ID, in a deterministic way. */
sd_id128_t vmgenid;
r = sd_id128_get_invocation_app_specific(SD_ID128_MAKE(bd,84,6d,e3,e4,7d,4b,6c,a6,85,4a,87,0f,3c,a3,a0), &vmgenid);
if (r < 0) {
log_debug_errno(r, "Failed to get invocation ID, making up randomized vmgenid: %m");
r = sd_id128_randomize(&vmgenid);
if (r < 0)
return log_error_errno(r, "Failed to make up randomized vmgenid: %m");
}
_cleanup_free_ char *vmgenid_device = NULL;
if (asprintf(&vmgenid_device, "vmgenid,guid=" SD_ID128_UUID_FORMAT_STR, SD_ID128_FORMAT_VAL(vmgenid)) < 0)
return log_oom();
if (strv_extend_many(&cmdline, "-device", vmgenid_device) < 0)
return log_oom();
/* if we are going to be starting any units with state then create our runtime dir */
if (arg_tpm != 0 || arg_directory || arg_runtime_mounts.n_mounts != 0) {
r = runtime_directory(&arg_runtime_directory, arg_privileged ? RUNTIME_SCOPE_SYSTEM : RUNTIME_SCOPE_USER, "systemd/vmspawn");
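The vmgenid hunk above turns a 128-bit ID into a `-device vmgenid,guid=…` argument. As a rough illustration of the string that results, here is a standalone sketch that formats 16 raw bytes in the usual 8-4-4-4-12 UUID grouping. The function name `sketch_format_vmgenid()` is hypothetical, and the sketch assumes that `SD_ID128_UUID_FORMAT_STR` prints the bytes in index order as lowercase hex; it does not use libsystemd.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write "vmgenid,guid=<uuid>" for the given 16 bytes into buf.
 * Returns the string length on success, or -1 if buf is too small. */
int sketch_format_vmgenid(const uint8_t id[16], char *buf, size_t size) {
        int n;

        assert(id);
        assert(buf);

        n = snprintf(buf, size,
                     "vmgenid,guid="
                     "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
                     "%02x%02x%02x%02x%02x%02x",
                     id[0], id[1], id[2], id[3], id[4], id[5], id[6], id[7],
                     id[8], id[9], id[10], id[11], id[12], id[13], id[14], id[15]);
        if (n < 0 || (size_t) n >= size)
                return -1; /* encoding error or truncated output */

        return n;
}
```

A caller would pass the resulting string as the value following `-device` on the qemu command line, which is what the `strv_extend_many()` call in the hunk does with `vmgenid_device`.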
@ -1421,7 +1439,13 @@ static int run_virtual_machine(int kvm_device_fd, int vhost_device_fd) {
pass_fds[n_pass_fds++] = device_fd;
}
r = strv_extend_many(&cmdline, "-cpu", "max");
r = strv_extend_many(&cmdline, "-cpu",
#ifdef __x86_64__
"max,hv-relaxed,hv-vapic,hv-time"
#else
"max"
#endif
);
if (r < 0)
return log_oom();
@ -1875,6 +1899,18 @@ static int run_virtual_machine(int kvm_device_fd, int vhost_device_fd) {
return log_error_errno(r, "Failed to call getsockname on VSOCK: %m");
}
const char *e = secure_getenv("SYSTEMD_VMSPAWN_QEMU_EXTRA");
if (e) {
_cleanup_strv_free_ char **extra = NULL;
r = strv_split_full(&extra, e, /* separator= */ NULL, EXTRACT_CUNESCAPE|EXTRACT_UNQUOTE);
if (r < 0)
return log_error_errno(r, "Failed to split $SYSTEMD_VMSPAWN_QEMU_EXTRA environment variable: %m");
if (strv_extend_strv(&cmdline, extra, /* filter_duplicates= */ false) < 0)
return log_oom();
}
if (DEBUG_LOGGING) {
_cleanup_free_ char *joined = quote_command_line(cmdline, SHELL_ESCAPE_EMPTY);
if (!joined)