Commit graph

2420 commits

Author SHA1 Message Date
Matthew Heon b53cb57680 Initial implementation of volume plugins
This implements support for mounting and unmounting volumes
backed by volume plugins. Support for actually retrieving
plugins requires a pull request to land in containers.conf and
then that to be vendored, and as such is not yet ready. Given
this, this code is only compile tested. However, the code for
everything past retrieving the plugin has been written - there is
support for creating, removing, mounting, and unmounting volumes,
which should allow full functionality once the c/common PR is
merged.

A major change is the signature of the MountPoint function for
volumes, which now, by necessity, returns an error. Named volumes
managed by a plugin do not have a mountpoint we control; instead,
it is managed entirely by the plugin. As such, we need to cache
the path in the DB, and calls to retrieve it now need to access
the DB (and may fail as such).

Notably absent is support for SELinux relabelling and chowning
these volumes. Given that we don't manage the mountpoint for
these volumes, I am extremely reluctant to try and modify it - we
could easily break the plugin trying to chown or relabel it.

Also, we had no less than *5* separate implementations of
inspecting a volume floating around in pkg/infra/abi and
pkg/api/handlers/libpod. And none of them used volume.Inspect(),
the only correct way of inspecting volumes. Remove them all and
consolidate to using the correct way. Compat API is likely still
doing things the wrong way, but that is an issue for another day.

Fixes #4304

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2021-01-14 15:35:33 -05:00
OpenShift Merge Robot a1b49749af
Merge pull request #8906 from vrothberg/fix-8501
container stop: release lock before calling the runtime
2021-01-14 13:37:16 -05:00
Valentin Rothberg d54478d8ea container stop: release lock before calling the runtime
Podman defers stopping the container to the runtime, which can take some
time.  Keeping the lock while waiting for the runtime to complete the
stop procedure, prevents other commands from acquiring the lock as shown
in #8501.

To improve the user experience, release the lock before invoking the
runtime, and re-acquire the lock when the runtime is finished.  Also
introduce an intermediate "stopping" to properly distinguish from
"stopped" containers etc.

Fixes: #8501
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-01-14 17:45:30 +01:00
OpenShift Merge Robot 99c5746150
Merge pull request #8958 from zhangguanzhang/duplicated-hosts
Fixes /etc/hosts duplicated every time after container restarted in a pod
2021-01-13 09:58:09 -05:00
zhangguanzhang 0cff5ad0a3 Fxes /etc/hosts duplicated every time after container restarted in a pod
Signed-off-by: zhangguanzhang <zhangguanzhang@qq.com>
2021-01-13 19:03:35 +08:00
Daniel J Walsh a6046dceef
Remove the ability to use [name:tag] in podman load command
Docker does not support this, and it is confusing what to do if
the image has more then one tag.  We are dropping support for this
in podman 3.0

Fixes: https://github.com/containers/podman/issues/7387

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-01-12 17:38:32 -05:00
OpenShift Merge Robot 265ec914d3
Merge pull request #8950 from mheon/exorcise_driver
Exorcise Driver code from libpod/define
2021-01-12 14:02:32 -05:00
OpenShift Merge Robot db52828621
Merge pull request #8946 from JAORMX/sec-errors
Expose security attribute errors with their own messages
2021-01-12 13:46:29 -05:00
OpenShift Merge Robot db5e7ec4c4
Merge pull request #8947 from Luap99/cleanup-code
Fix problems reported by staticcheck
2021-01-12 13:15:35 -05:00
Matthew Heon befd40b57d Exorcise Driver code from libpod/define
The libpod/define code should not import any large dependencies,
as it is intended to be structures and definitions only. It
included the libpod/driver package for information on the storage
driver, though, which brought in all of c/storage. Split the
driver package so that define has the struct, and thus does not
need to import Driver. And simplify the driver code while we're
at it.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-01-12 11:48:53 -05:00
Paul Holzinger 8452b768ec Fix problems reported by staticcheck
`staticcheck` is a golang code analysis tool. https://staticcheck.io/

This commit fixes a lot of problems found in our code. Common problems are:
- unnecessary use of fmt.Sprintf
- duplicated imports with different names
- unnecessary check that a key exists before a delete call

There are still a lot of reported problems in the test files but I have
not looked at those.

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-01-12 16:11:09 +01:00
Juan Antonio Osorio Robles 020abbfeab Expose security attribute errors with their own messages
This creates error objects for runtime errors that might come from the
runtime. Thus, indicating to users that the place to debug should be in
the security attributes of the container.

When creating a container with a SELinux label that doesn't exist, we
get a fairly cryptic error message:

```
$ podman run --security-opt label=type:my_container.process -it fedora bash
Error: OCI runtime error: write file `/proc/thread-self/attr/exec`: Invalid argument
```

This instead handles any errors coming from LSM's `/proc` API and
enhances the error message with a relevant indicator that it's related
to the container's security attributes.

A sample run looks as follows:

```
$ bin/podman run --security-opt label=type:my_container.process -it fedora bash
Error: `/proc/thread-self/attr/exec`: OCI runtime error: unable to assign security attribute
```

With `debug` log level enabled it would be:

```
Error: write file `/proc/thread-self/attr/exec`: Invalid argument: OCI runtime error: unable to assign security attribute
```

Note that these errors wrap ErrOCIRuntime, so it's still possible to to
compare these errors with `errors.Is/errors.As`.

One advantage of this approach is that we could start handling these
errors in a more efficient manner in the future.

e.g. If a SELinux label doesn't exist (yet), we could retry until it
becomes available.

Signed-off-by: Juan Antonio Osorio Robles <jaosorior@redhat.com>
2021-01-12 16:10:17 +02:00
OpenShift Merge Robot 5575c7be20
Merge pull request #8819 from chen-zhuohan/add-pre-checkpoint
Add pre-checkpoint and restore with previous
2021-01-12 07:57:05 -05:00
OpenShift Merge Robot 1955eee89f
Merge pull request #8933 from giuseppe/use-O_PATH-for-unix-sock
oci: use /proc/self/fd/FD to open unix socket
2021-01-12 07:26:37 -05:00
Giuseppe Scrivano fdbc278868
oci: use /proc/self/fd/FD to open unix socket
instead of opening directly the UNIX socket path, grab a reference to
it through a O_PATH file descriptor and use the fixed size string
"/proc/self/fd/%d" to open the UNIX socket.  In this way it won't hit
the 108 chars length limit.

Closes: https://github.com/containers/podman/issues/8798

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-12 10:38:32 +01:00
Giuseppe Scrivano ae9dab9445
oci: keep LC_ env variables to conmon
it is necessary for conmon to deal with the correct locale, otherwise
it uses C as a fallback.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1893567
Requires: https://github.com/containers/conmon/pull/215

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-11 13:48:04 +01:00
unknown 2aa381f2d0 add pre checkpoint
Signed-off-by: Zhuohan Chen <chen_zhuohan@163.com>
2021-01-10 21:38:28 +08:00
OpenShift Merge Robot 49db79e735
Merge pull request #8781 from rst0git/cr-volumes
Add support for checkpoint/restore of containers with volumes
2021-01-08 10:41:05 -05:00
OpenShift Merge Robot 6c132b78f1
Merge pull request #8771 from rhatdan/run
Switch references of /var/run -> /run
2021-01-07 15:06:17 -05:00
OpenShift Merge Robot 3cf41c4a73
Merge pull request #8821 from rhatdan/caps
Containers should not get inheritable caps by default
2021-01-07 09:44:37 -05:00
OpenShift Merge Robot 74af9254b9
Merge pull request #8816 from giuseppe/automatically-split-userns-mappings
rootless: automatically split userns ranges
2021-01-07 09:35:01 -05:00
Daniel J Walsh db71759b1a
Handle podman exec capabilities correctly
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-01-07 05:53:50 -05:00
Daniel J Walsh d9ebbbfe5b
Switch references of /var/run -> /run
Systemd is now complaining or mentioning /var/run as a legacy directory.
It has been many years where /var/run is a symlink to /run on all
most distributions, make the change to the default.

Partial fix for https://github.com/containers/podman/issues/8369

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-01-07 05:37:24 -05:00
Giuseppe Scrivano ecedda63a6
rootless: automatically split userns ranges
writing to the id map fails when an extent overlaps multiple mappings
in the parent user namespace:

$ cat /proc/self/uid_map
         0       1000          1
         1     100000      65536
$ unshare -U sleep 100 &
[1] 1029703
$ printf "0 0 100\n" | tee /proc/$!/uid_map
0 0 100
tee: /proc/1029703/uid_map: Operation not permitted

This limitation is particularly annoying when working with rootless
containers as each container runs in the rootless user namespace, so a
command like:

$ podman run --uidmap 0:0:2 --rm fedora echo hi
Error: writing file `/proc/664087/gid_map`: Operation not permitted: OCI permission denied

would fail since the specified mapping overlaps the first
mapping (where the user id is mapped to root) and the second extent
with the additional IDs available.

Detect such cases and automatically split the specified mapping with
the equivalent of:

$ podman run --uidmap 0:0:1 --uidmap 1:1:1 --rm fedora echo hi
hi

A fix has already been proposed for the kernel[1], but even if it
accepted it will take time until it is available in a released kernel,
so fix it also in pkg/rootless.

[1] https://lkml.kernel.org/lkml/20201203150252.1229077-1-gscrivan@redhat.com/

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-07 09:42:27 +01:00
Radostin Stoyanov 288ccc4c84 Include named volumes in container migration
When migrating a container with associated volumes, the content of
these volumes should be made available on the destination machine.

This patch enables container checkpoint/restore with named volumes
by including the content of volumes in checkpoint file. On restore,
volumes associated with container are created and their content is
restored.

The --ignore-volumes option is introduced to disable this feature.

Example:

 # podman container checkpoint --export checkpoint.tar.gz <container>

The content of all volumes associated with the container are included
in `checkpoint.tar.gz`

 # podman container checkpoint --export checkpoint.tar.gz --ignore-volumes <container>

The content of volumes is not included in `checkpoint.tar.gz`. This is
useful, for example, when the checkpoint/restore is performed on the
same machine.

 # podman container restore --import checkpoint.tar.gz

The associated volumes will be created and their content will be
restored. Podman will exit with an error if volumes with the same
name already exist on the system or the content of volumes is not
included in checkpoint.tar.gz

 # podman container restore --ignore-volumes --import checkpoint.tar.gz

Volumes associated with container must already exist. Podman will not
create them or restore their content.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-01-07 07:51:19 +00:00
Radostin Stoyanov 17f50fb4bf Use Options as exportCheckpoint() argument
Instead of individual values from ContainerCheckpointOptions,
provide the options object.

This is a preparation for the next patch where one more value
of the options object is required in exportCheckpoint().

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-01-07 07:48:41 +00:00
Paul Holzinger b7f699c199 Fix podman logs read partial log lines
If a partial log line has the length 1 it was ignored by podman logs.

Fixes #8879

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2021-01-07 00:04:38 +01:00
OpenShift Merge Robot bb82c37b73
Merge pull request #8805 from giuseppe/single-user-mapped-root
libpod: handle single user mapped as root
2021-01-06 15:41:36 -05:00
OpenShift Merge Robot 8e4613ab0a
Merge pull request #8892 from mheon/fix_8886
Ensure that user-specified HOSTNAME is honored
2021-01-06 15:26:55 -05:00
Matthew Heon 8f844a66d5 Ensure that user-specified HOSTNAME is honored
When adding the HOSTNAME environment variable, only do so if it
is not already present in the spec. If it is already present, it
was likely added by the user, and we should honor their requested
value.

Fixes #8886

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-01-06 09:46:21 -05:00
OpenShift Merge Robot ffe2b1e95a
Merge pull request #8685 from mheon/ignore_containersconf_sysctls_shared_net
Ignore containers.conf sysctls when sharing namespaces
2021-01-05 17:08:31 -05:00
OpenShift Merge Robot b84b7c89bb
Merge pull request #8831 from bblenard/issue-8658-system-prune-reclaimed-space
Rework pruning to report reclaimed space
2021-01-05 11:35:18 -05:00
OpenShift Merge Robot 1b9366d650
Merge pull request #8873 from baude/issue8864
close journald when reading
2021-01-05 04:34:24 -05:00
OpenShift Merge Robot 618c35570d
Merge pull request #8878 from mheon/no_edit_config
Ensure we do not edit container config in Exec
2021-01-04 21:11:27 -05:00
OpenShift Merge Robot ced7c0ab7f
Merge pull request #8875 from rhatdan/image
Allow image errors to bubble up from lower level functions.
2021-01-04 17:30:22 -05:00
Matthew Heon 864592c746 Add default sysctls for pod infra containers
Ensure that infra containers for pods will grab default sysctls
from containers.conf, to match how other containers are created.
This mostly affects the other containers in the pod, which will
inherit those sysctls when they join the pod's namespaces.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-01-04 15:29:18 -05:00
Matthew Heon 960607a4cd Ensure we do not edit container config in Exec
The existing code grabs the base container's process, and then
modifies it for use with the exec session. This could cause
errors in `podman inspect` or similar on the container, as the
definition of its OCI spec has been changed by the exec session.
The change never propagates to the DB, so it's limited to a
single process, but we should still avoid it when possible - so
deep-copy it before use.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-01-04 14:36:41 -05:00
baude 002d0d6ee6 close journald when reading
when reading from journald, we need to close the journal handler for
events and logging.

Fixes: #8864

Signed-off-by: baude <bbaude@redhat.com>
2021-01-04 13:27:38 -06:00
Daniel J Walsh d0093026a2
Allow image errors to bubble up from lower level functions.
Currently we ignore ErrMultipleImages being returned from findImageInRepoTags.

Fixes: https://github.com/containers/podman/issues/8868

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-01-04 10:51:54 -05:00
Giuseppe Scrivano 898f57c4c1
systemd: make rundir always accessible
so that the PIDFile can be accessed also without being in the rootless
user namespace.

Closes: https://github.com/containers/podman/issues/8506

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-04 14:19:58 +01:00
OpenShift Merge Robot 23f25b8261
Merge pull request #8823 from giuseppe/exec-honor-privileged
exec: honor --privileged
2021-01-04 10:53:44 +01:00
Baron Lenardson b90f7f9095 Rework pruning to report reclaimed space
This change adds code to report the reclaimed space after a prune.
Reclaimed space from volumes, images, and containers is recorded
during the prune call in a PruneReport struct. These structs are
collected into a slice during a system prune and processed afterwards
to calculate the total reclaimed space.

Closes #8658

Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
2020-12-30 19:57:35 -06:00
OpenShift Merge Robot c6c9b45985
Merge pull request #8852 from afbjorklund/slirp_sandbox-no_pivot_root
The slirp4netns sandbox requires pivot_root
2020-12-30 16:03:28 +01:00
OpenShift Merge Robot a84383297c
Merge pull request #8853 from jubalh/gentoo
Add support for Gentoo file to package query
2020-12-30 15:57:55 +01:00
Michael Vetter 904dec2164 Add support for Gentoo file to package query
On Gentoo systems where `app-portage/gentoolkit` is installed the binary
`equery` is used to query for information on which package a file
belongs to.

Signed-off-by: Michael Vetter <jubalh@iodoru.org>
2020-12-29 20:33:27 +01:00
Anders F Björklund 25b7198441 The slirp4netns sandbox requires pivot_root
Disable the sandbox, when running on rootfs

Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
2020-12-29 18:03:49 +01:00
Giuseppe Scrivano 2a39a6195a
exec: honor --privileged
write the capabilities to the configuration passed to the OCI
runtime.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-12-24 22:11:14 +01:00
Giuseppe Scrivano 2a97639263
libpod: change function to accept ExecOptions
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-12-24 22:01:38 +01:00
Baron Lenardson 76afb50f3a Consolidate filter logic to pkg subdirectory
Per the conversation on pull/8724 I am consolidating filter logic
and helper functions under the pkg/domain/filters dir.

Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
2020-12-24 20:27:41 +00:00
Giuseppe Scrivano 64571ea0a4
libpod: handle single user mapped as root
if a single user is mapped in the user namespace, handle it as root.

It is needed for running unprivileged containers with a single user
available without being forced to run with euid and egid set to 0.

Needs: https://github.com/containers/storage/pull/794

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-12-24 13:39:15 +01:00