systemd/docs/ELF_PACKAGE_METADATA.md
Zbigniew Jędrzejewski-Szmek 9a2b54d9f7 docs/ELF_PACKAGE_METADATA: add detailed example
When the spec was initially written, we didn't add good documentation of how to
display the notes, also because there was no good way to display the data
except manually extracting the section to a file and running 'jq' on that. But
the tools have improved, so let's show the users how easy it is to use this
data.
2024-09-13 14:51:44 +02:00


| title | category | layout | SPDX-License-Identifier |
|-------|----------|--------|-------------------------|
| Package Metadata for ELF Files | Interfaces | default | LGPL-2.1-or-later |

Package Metadata for Core Files

Intended audience: hackers working on userspace subsystems that create ELF binaries or parse ELF core files.

Motivation

ELF binaries get stamped with a unique, build-time generated hex string identifier called build-id, which gets embedded as an ELF note called .note.gnu.build-id. In most cases, this allows a stripped binary to be associated with its debugging information. It is used, for example, to dynamically fetch DWARF symbols from a debuginfo server, or to query the local package manager and find out the package metadata or, again, the DWARF symbols or program sources.

However, this usage of the build-id requires either local metadata, usually set up by the package manager, or access to a remote server over the network. Both of those might be unavailable or forbidden.

Thus it becomes desirable to add additional metadata to a binary at build time, so that systemd-coredump and other services analyzing core files are able to extract said metadata simply from the core file itself, without external dependencies.

Implementation

This document defines a common metadata format specification, so that multiple implementers can use it when building packages, writing core file analyzers, and so on.

The metadata will be embedded in a single, new, 4-byte-aligned, allocated, 0-padded, read-only ELF header section, in a name-value JSON object format. Implementers working on parsing core files should not assume a specific list of names, but parse anything that is included in the section, and should look for the note using the note type. Implementers working on build tools should strive to use the same names, for consistency. The most common names are listed in this document. When a value corresponds to a field of os-release, it should match that field, again for consistency.

If available, the metadata should also include the debuginfod server URL that can provide the original executable, debuginfo and sources, to further facilitate debugging.

  • Section header
Section: `.note.package`
Note type: `0xcafe1a7e`
Owner: `FDO` (FreeDesktop.org)
Value: a single JSON object encoded as a zero-terminated UTF-8 string
  • JSON payload
{
     "type":"rpm",          # this provides a namespace for the package+package-version fields
     "os":"fedora",
     "osVersion":"33",
     "name":"coreutils",
     "version":"4711.0815.fc13",
     "architecture":"arm32",
     "osCpe": "cpe:/o:fedoraproject:fedora:33",          # A CPE name for the operating system, `CPE_NAME` from os-release is a good default
     "debugInfoUrl": "https://debuginfod.fedoraproject.org/"
}

The format is a single JSON object, encoded as a zero-terminated UTF-8 string. Each name in the object shall be unique as per recommendations of RFC 8259. Strings shall not contain any control character, nor use \uXXXX escaping.
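
The encoding rules above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `encode_package_metadata` is hypothetical, and treating "control character" as Unicode category `Cc` is an assumption made here, since the text does not define the term more precisely.

```python
import json
import unicodedata


def encode_package_metadata(fields: dict) -> bytes:
    """Serialize fields as a single zero-terminated UTF-8 JSON object."""
    for key, value in fields.items():
        # Check the name, and the value too when it is a string.
        texts = (key, value) if isinstance(value, str) else (key,)
        for text in texts:
            # Assumption: "control character" means Unicode category Cc.
            if any(unicodedata.category(ch) == "Cc" for ch in text):
                raise ValueError(f"control character in {text!r}")
    # A Python dict cannot hold duplicate keys, so names are unique here.
    # ensure_ascii=False writes non-ASCII text as raw UTF-8 rather than
    # as \uXXXX escapes.
    payload = json.dumps(fields, ensure_ascii=False, separators=(",", ":"))
    return payload.encode("utf-8") + b"\0"
```

Calling `encode_package_metadata({"type": "rpm", "name": "coreutils"})` yields the compact JSON text followed by one zero byte. Padding to the 4-byte alignment of the section is a separate step that this sketch does not perform.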

When it comes to JSON numbers, this specification assumes that JSON parsers processing this information are capable of reproducing the full signed 53-bit integer range (i.e. -2⁵³+1…+2⁵³-1) as well as the full 64-bit IEEE floating point number range losslessly (with the exception of NaN/-inf/+inf, since JSON cannot encode those), as per recommendations of RFC 8259. Fields in these JSON objects are thus permitted to encode numeric values from these ranges as JSON numbers, and should not use numeric values not covered by these types and ranges.
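
A build tool could check numeric values against these ranges before emitting them. The following Python sketch shows one way to do so; `is_portable_json_number` and `MAX_SAFE_INT` are hypothetical names introduced for illustration only.

```python
import math

# Largest magnitude in the signed 53-bit integer range described above.
MAX_SAFE_INT = 2**53 - 1


def is_portable_json_number(value) -> bool:
    """Return True if value fits the integer or floating point range above."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, but it is not a JSON number.
        return False
    if isinstance(value, int):
        return -MAX_SAFE_INT <= value <= MAX_SAFE_INT
    if isinstance(value, float):
        # NaN and the infinities cannot be encoded in JSON.
        return math.isfinite(value)
    return False
```

Python integers have arbitrary precision, so the explicit bound matters: `2**53` is a valid Python `int` but falls outside the range that the text asks parsers to reproduce.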

Reference implementations of packaging tools for .deb and .rpm are available, and provide macros/helpers to include the note in binaries built by the package build system. They make use of the --package-metadata flag that is available in the bfd and gold linkers (binutils 2.39), mold (1.3.0) and lld (15.0). This linker flag takes a JSON payload as its parameter.
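
As a rough sketch of how a build script might assemble this flag, the Python snippet below builds the argument for a direct linker invocation. The helper name `package_metadata_flag` and the file names `hello.o` and `hello` are hypothetical, and the command is only constructed, not executed.

```python
import json


def package_metadata_flag(fields: dict) -> str:
    """Build the --package-metadata=JSON linker argument from a dict."""
    payload = json.dumps(fields, ensure_ascii=False, separators=(",", ":"))
    return f"--package-metadata={payload}"


# Hypothetical direct linker invocation. Passing the arguments as a list
# (for example to subprocess.run) avoids any shell quoting of the JSON.
argv = [
    "ld",
    package_metadata_flag({"type": "rpm", "name": "coreutils"}),
    "-o", "hello",
    "hello.o",
]
```

When linking through a compiler driver instead, keep in mind that GCC's `-Wl,` option splits its argument at commas, so a JSON payload containing commas cannot be passed through it unmodified. The reference packaging macros mentioned above handle this detail for their respective build systems.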

Well-known keys

The metadata format is intentionally left open, so that vendors can add their own information. A set of well-known keys is defined here, and hopefully shared among all vendors.

| Key name | Key description | Example value |
|----------|-----------------|---------------|
| `type` | The packaging type | `rpm` |
| `os` | The OS name, typically corresponding to `ID` in os-release | `fedora` |
| `osVersion` | The OS version, typically corresponding to `VERSION_ID` in os-release | `33` |
| `name` | The source package name | `coreutils` |
| `version` | The source package version | `4711.0815.fc13` |
| `architecture` | The binary package architecture | `arm32` |
| `osCpe` | A CPE name for the OS, typically corresponding to `CPE_NAME` in os-release | `cpe:/o:fedoraproject:fedora:33` |
| `debugInfoUrl` | The debuginfod server URL, if available | `https://debuginfod.fedoraproject.org/` |
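
Three of these keys are typically derived from os-release. The Python sketch below shows one possible mapping; the function names and the `OS_RELEASE_TO_KEY` table are hypothetical, and the parser is deliberately simplified (it strips one pair of surrounding quotes and does not handle shell-style escapes).

```python
# Maps os-release fields to the well-known keys listed above.
OS_RELEASE_TO_KEY = {"ID": "os", "VERSION_ID": "osVersion", "CPE_NAME": "osCpe"}


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines, ignoring blank lines and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Simplification: remove one pair of matching quotes, nothing more.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key] = value
    return values


def metadata_from_os_release(text: str) -> dict:
    """Return the os, osVersion and osCpe keys that os-release provides."""
    values = parse_os_release(text)
    return {key: values[src] for src, key in OS_RELEASE_TO_KEY.items() if src in values}
```

Fields missing from os-release are simply left out of the result, which fits the open-ended format. On Python 3.10 and later, `platform.freedesktop_os_release()` offers a standard-library parser that could replace `parse_os_release`.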

Displaying package notes

The raw ELF section can be extracted using objdump:

$ objdump -j .note.package -s /usr/bin/ls

/usr/bin/ls:     file format elf64-x86-64

Contents of section .note.package:
 03cc 04000000 7c000000 7e1afeca 46444f00  ....|...~...FDO.
 03dc 7b227479 7065223a 2272706d 222c226e  {"type":"rpm","n
 03ec 616d6522 3a22636f 72657574 696c7322  ame":"coreutils"
 03fc 2c227665 7273696f 6e223a22 392e342d  ,"version":"9.4-
 040c 372e6663 3430222c 22617263 68697465  7.fc40","archite
 041c 63747572 65223a22 7838365f 3634222c  cture":"x86_64",
 042c 226f7343 7065223a 22637065 3a2f6f3a  "osCpe":"cpe:/o:
 043c 6665646f 72617072 6f6a6563 743a6665  fedoraproject:fe
 044c 646f7261 3a343022 7d000000           dora:40"}...
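
The dump above follows the usual ELF note layout: three 32-bit words (name size, descriptor size, note type), then the owner name and the descriptor, each padded to 4 bytes. The Python sketch below decodes exactly the bytes shown in the listing. The function names are hypothetical, and little-endian byte order is assumed because the sample binary is x86-64.

```python
import json
import struct

# Bytes copied from the objdump listing (offset and ASCII columns removed).
NOTE_BYTES = bytes.fromhex("""
    04000000 7c000000 7e1afeca 46444f00
    7b227479 7065223a 2272706d 222c226e
    616d6522 3a22636f 72657574 696c7322
    2c227665 7273696f 6e223a22 392e342d
    372e6663 3430222c 22617263 68697465
    63747572 65223a22 7838365f 3634222c
    226f7343 7065223a 22637065 3a2f6f3a
    6665646f 72617072 6f6a6563 743a6665
    646f7261 3a343022 7d000000
""")

FDO_PACKAGING_METADATA = 0xCAFE1A7E


def parse_note(data: bytes, byteorder: str = "<"):
    """Split one ELF note into (owner, note type, descriptor bytes)."""
    namesz, descsz, note_type = struct.unpack_from(byteorder + "III", data, 0)
    name_start = 12
    # The owner name is padded so that the descriptor starts 4-byte aligned.
    desc_start = name_start + ((namesz + 3) & ~3)
    owner = data[name_start:name_start + namesz].rstrip(b"\0").decode("ascii")
    return owner, note_type, data[desc_start:desc_start + descsz]


def parse_package_metadata(data: bytes, byteorder: str = "<") -> dict:
    """Decode the JSON payload of a package metadata note."""
    owner, note_type, desc = parse_note(data, byteorder)
    if owner != "FDO" or note_type != FDO_PACKAGING_METADATA:
        raise ValueError("not a package metadata note")
    # Drop the zero terminator and padding before decoding the JSON text.
    return json.loads(desc.rstrip(b"\0").decode("utf-8"))


metadata = parse_package_metadata(NOTE_BYTES)
```

Here the first word `04000000` is the owner length (`FDO` plus its terminator), `7c000000` is the descriptor length of 0x7c bytes, and `7e1afeca` is the note type `0xcafe1a7e` in little-endian order.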

It is more convenient to use a higher-level tool:

$ readelf --notes /usr/bin/ls
...
Displaying notes found in: .note.gnu.build-id
  Owner                Data size 	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 40e5a1570a9d97fc48f5c61cfb7690fec0f872b2

Displaying notes found in: .note.ABI-tag
  Owner                Data size 	Description
  GNU                  0x00000010	NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 3.2.0

Displaying notes found in: .note.package
  Owner                Data size 	Description
  FDO                  0x0000007c	FDO_PACKAGING_METADATA
    Packaging Metadata: {"type":"rpm","name":"coreutils","version":"9.4-7.fc40","architecture":"x86_64","osCpe":"cpe:/o:fedoraproject:fedora:40"}
...

$ systemd-analyze inspect-elf /usr/bin/ls
           path: /usr/bin/ls
        elfType: executable
elfArchitecture: AMD x86-64

           type: rpm
           name: coreutils
        version: 9.4-7.fc40
   architecture: x86_64
          osCpe: cpe:/o:fedoraproject:fedora:40
        buildId: 40e5a1570a9d97fc48f5c61cfb7690fec0f872b2
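
For scripting, the `Packaging Metadata:` line printed by readelf can be picked out and parsed as JSON. The Python sketch below assumes the JSON is printed on a single line, as in the sample output above; the function names are hypothetical.

```python
import json
import re
import subprocess

# Assumption: readelf prints the whole JSON object on one line, as in the sample.
PACKAGING_LINE = re.compile(
    r"^[ \t]*Packaging Metadata:[ \t]*(\{.*\})[ \t]*$", re.MULTILINE
)


def metadata_from_readelf_output(output: str) -> list:
    """Return one dict per 'Packaging Metadata' line in readelf output."""
    return [json.loads(match.group(1)) for match in PACKAGING_LINE.finditer(output)]


def metadata_from_binary(path: str) -> list:
    """Run 'readelf --notes' on path and parse any package metadata found."""
    result = subprocess.run(
        ["readelf", "--notes", path], capture_output=True, text=True, check=True
    )
    return metadata_from_readelf_output(result.stdout)
```

A binary without a `.note.package` section simply produces an empty list. Since this relies on human-readable output, parsing the note directly from the ELF file is the more robust choice for long-lived tooling.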

If the binary crashes, systemd-coredump will display the combined information from the crashing binary and any shared libraries it links to:

$ coredumpctl info
           PID: 3987823 (ls)
        Signal: 11 (SEGV)
  Command Line: ls --color=tty -lR /
    Executable: /usr/bin/ls
...
       Storage: /var/lib/systemd/coredump/core.ls.1000.88dea1b9831c420dbb398f9d2ad9b41e.3987823.1726230641000000.zst (present)
  Size on Disk: 194.4K
       Package: coreutils/9.4-7.fc40
      build-id: 40e5a1570a9d97fc48f5c61cfb7690fec0f872b2
       Message: Process 3987823 (ls) of user 1000 dumped core.

                Module /usr/bin/ls from rpm coreutils-9.4-7.fc40.x86_64
                Module libz.so.1 from rpm zlib-ng-2.1.7-1.fc40.x86_64
                Module libcrypto.so.3 from rpm openssl-3.2.2-3.fc40.x86_64
                Module libmount.so.1 from rpm util-linux-2.40.1-1.fc40.x86_64
                Module libcrypt.so.2 from rpm libxcrypt-4.4.36-5.fc40.x86_64
                Module libblkid.so.1 from rpm util-linux-2.40.1-1.fc40.x86_64
                Module libnss_sss.so.2 from rpm sssd-2.9.5-1.fc40.x86_64
                Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc40.x86_64
                Module libcap.so.2 from rpm libcap-2.69-8.fc40.x86_64
                Module libselinux.so.1 from rpm libselinux-3.6-4.fc40.x86_64
                Stack trace of thread 3987823:
                #0  0x00007f19331c3f7e lgetxattr (libc.so.6 + 0x116f7e)
                #1  0x00007f19332be4c0 lgetfilecon_raw (libselinux.so.1 + 0x134c0)
                #2  0x00007f19332c3bd9 lgetfilecon (libselinux.so.1 + 0x18bd9)
                #3  0x000056038273ad55 gobble_file.constprop.0 (/usr/bin/ls + 0x17d55)
                #4  0x0000560382733c55 print_dir (/usr/bin/ls + 0x10c55)
                #5  0x0000560382727c35 main (/usr/bin/ls + 0x4c35)
                #6  0x00007f19330d7088 __libc_start_call_main (libc.so.6 + 0x2a088)
                #7  0x00007f19330d714b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2a14b)
                #8  0x0000560382728f15 _start (/usr/bin/ls + 0x5f15)
                ELF object binary architecture: AMD x86-64

(This is just a simulation. ls is not prone to crashing with a segmentation violation.)
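
The `Module … from …` lines in this output can be collected with a small script, for instance to list the packages involved in a crash. The Python sketch below infers the line format purely from the sample above, so treat it as an assumption rather than a stable interface; the names are hypothetical.

```python
import re

# Assumed line shape, taken from the sample: "Module <path> from <type> <package>"
MODULE_LINE = re.compile(
    r"^[ \t]*Module (\S+) from (\S+) (\S+)[ \t]*$", re.MULTILINE
)


def modules_from_coredump_info(text: str) -> list:
    """Return a dict with module, type and package for each Module line."""
    return [
        {"module": m.group(1), "type": m.group(2), "package": m.group(3)}
        for m in MODULE_LINE.finditer(text)
    ]
```

Applied to the listing above, this would return one entry per module, starting with `/usr/bin/ls` and its package `coreutils-9.4-7.fc40.x86_64`. Lines such as the stack trace frames do not match and are ignored.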