Cloud Native Security - Chris Binnie

Docker Rootless Mode

Docker, beginning with v19.03 (docs.docker.com/engine/release-notes/#19030), offers a clever feature it calls rootless mode, in which Docker Engine doesn't require superuser privileges to spawn containers. Rootless mode appears to be an extension of a stable feature called user namespaces, which helped harden a container. The premise of that functionality was to let a process inside a container believe that it was running under an ordinary user ID (UID) and group ID (GID), root included, while from the host's perspective those IDs were mapped to ones without any privileges and so were of much less consequence to the host's security.
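As a rough illustration of the remapping idea, the sketch below computes the host UID that a container UID would translate to under a single remapped range. The function name map_container_uid is invented for this example and is not part of Docker, and the base of 100000 and the size of 65536 are assumed values chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: translate a container UID into the host UID it would
# map to under one remapped range. The base (100000) and size (65536)
# defaults are assumptions for illustration only.
map_container_uid() {
    container_uid=$1
    base=${2:-100000}
    size=${3:-65536}
    # IDs outside the range have no counterpart on the host.
    if [ "$container_uid" -lt 0 ] || [ "$container_uid" -ge "$size" ]; then
        echo "unmapped"
        return 1
    fi
    echo $((base + container_uid))
}

map_container_uid 0      # root inside the container
map_container_uid 1000   # an ordinary user inside the container
```

With these assumed values, root inside the container corresponds to an unprivileged, high-numbered UID on the host, which is the point of the feature.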

Rootless mode has some prerequisites, which have to do with mapping unprivileged users into kernel user namespaces. On Debian derivatives, the package we need is called uidmap, but we will start (as the root user) by removing Docker Engine and its associated packages with this command (for obvious reasons, be careful to do this only on systems that are used for development):

$ apt purge docker

Then, continuing as the superuser, we will install the package noted earlier with this command:

$ apt install uidmap

Next, we need to check the following two files to make sure that a less-privileged user (named chris in this instance) has 65,536 UIDs and GIDs available for re-mapping:

$ cat /etc/subuid
chris:100000:65536
$ cat /etc/subgid
chris:100000:65536
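The same check can be scripted. The sketch below is one possible approach, not something Docker ships: has_subid_range is a hypothetical helper that reads a file in the name:start:count format used by /etc/subuid and /etc/subgid and succeeds when the named user has at least 65,536 IDs. It assumes that multiple entries for one user may be added together.

```shell
#!/bin/sh
# Hypothetical helper: succeed if USER has at least MIN subordinate IDs in FILE.
# Each line of /etc/subuid and /etc/subgid has the form name:start:count.
# Assumption: several entries for the same user are summed.
has_subid_range() {
    user=$1
    file=$2
    min=${3:-65536}
    awk -F: -v u="$user" -v min="$min" '
        $1 == u { total += $3 }
        END { exit (total >= min) ? 0 : 1 }
    ' "$file"
}

# Demonstration against a throwaway file rather than the real /etc/subuid.
sample=$(mktemp)
printf 'chris:100000:65536\n' > "$sample"
if has_subid_range chris "$sample"; then
    echo "chris has enough subordinate IDs"
fi
rm -f "$sample"
```

On a real machine the second argument would be /etc/subuid or /etc/subgid.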

The output is what is expected, so we can continue. One caveat with this experimental functionality is that Docker Inc. encourages you to use an Ubuntu kernel. We will test this setup on a Linux Mint machine with Ubuntu 18.04 LTS under the hood.

If you want to try this on Debian Linux, Arch Linux, openSUSE, Fedora (v31+), or CentOS, then you will need to prepare your machine a little more. For example, although Ubuntu is derived from Debian, there are notable differences between the two OSs; to try this feature on Debian, you would first need to adjust the kernel settings. The required kernel tweak, which relates to user namespaces, would be as follows:

kernel.unprivileged_userns_clone=1 # add me to /etc/sysctl.conf to persist after a reboot
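Before changing the setting, it can help to see its current state. The following sketch assumes the sysctl key is exposed at /proc/sys/kernel/unprivileged_userns_clone, which is the usual mapping from key to path; userns_clone_state is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report the state of kernel.unprivileged_userns_clone.
# The optional argument overrides the /proc/sys path, which is useful for testing.
userns_clone_state() {
    knob=${1:-/proc/sys/kernel/unprivileged_userns_clone}
    if [ ! -r "$knob" ]; then
        echo "absent"      # this kernel does not expose the setting
    elif [ "$(cat "$knob")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

userns_clone_state
```

A result of "absent" simply means that the running kernel has no such switch, in which case the tweak above does not apply.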

You would also be wise to use the overlay2 storage driver with this command:

$ modprobe overlay permit_mounts_in_userns=1 # add me to /etc/modprobe.d to survive a reboot
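To make the module option survive a reboot, files under /etc/modprobe.d use an "options" line rather than the modprobe command itself. The sketch below writes such a line; write_overlay_option and the file name overlay-userns.conf are hypothetical choices, and the demonstration uses a scratch directory so that nothing on the host is changed.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe.d fragment that sets the overlay
# module option from the text. On a real system DIR would be /etc/modprobe.d
# (root required); the file name is an arbitrary choice for this example.
write_overlay_option() {
    dir=$1
    printf 'options overlay permit_mounts_in_userns=1\n' > "$dir/overlay-userns.conf"
}

# Demonstration in a scratch directory, so the host is left untouched.
scratch=$(mktemp -d)
write_overlay_option "$scratch"
cat "$scratch/overlay-userns.conf"
rm -rf "$scratch"
```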

There are a few limitations that we will need to look at before continuing. The earlier user namespace feature had some trade-offs that meant the functionality was not suited to every application. For example, the --net=host feature was not compatible. That is not a great loss, however, because the option is itself a security hole: it opens up the host's network stack to a container for abuse and is not recommended. Similarly, we saw that the same applied when we tried to share the process table with the --pid switch in Chapter 1. It was also impossible to use --read-only containers to prevent data being saved to the container's internal filesystem, which disabled a welcome security control. And the avoid-at-all-costs Privileged mode was not possible in this setup either.

For rootless mode, however, the limitations are subtly different. In Table 2.1 we can see some of the limitations.

Table 2.1: Rootless Mode Limitations and Restrictions

| RESTRICTED FUNCTIONALITY | DESCRIPTION/WORKAROUND |
| --- | --- |
| Control groups | Known as cgroups, these are used to throttle containers to quotas for host resources such as CPU, I/O, and RAM but are not available in rootless mode. |
| AppArmor | On Ubuntu derivatives or other OSs that use AppArmor, it is not possible to use the mandatory access controls in AppArmor. |
| Checkpoint | An experimental feature for snapshotting containers; checkpoints will not work in rootless mode: docs.docker.com/engine/reference/commandline/checkpoint. |
| Overlay v1 | It appears that the original overlay storage driver is not compatible. Use overlay2 instead: docs.docker.com/storage/storagedriver/overlayfs-driver. |
| Privileged ports | Sometimes known as root ports, privileged ports are any network ports below 1024 and for security reasons can only be exposed to the network by the root user. It is, however, apparently possible to work around this with the setcap command, but you should research the potentially unintended consequences: $ setcap cap_net_bind_service=ep $HOME/bin/rootlesskit. |
| Ping command | On some Linux distributions it may not be possible to use the ping command without adding net.ipv4.ping_group_range = 0 2147483647 to the file /etc/sysctl.conf. |
| Networking | To see the IP address of a container from the host, you need to enter the correct namespace using nsenter (man7.org/linux/man-pages/man1/nsenter.1.html), and the same restriction on the host's networking applies as with user namespaces. The --net=host option won't work without extra effort or conceding security trade-offs. |
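The ping entry in Table 2.1 comes down to a range check on group IDs. As a hedged sketch of that logic, the hypothetical helper gid_may_ping below takes a GID and a ping_group_range value, written as two whitespace-separated numbers, and succeeds when the GID lies inside the range.

```shell
#!/bin/sh
# Hypothetical helper: succeed if GID lies within a ping_group_range value,
# given as the two whitespace-separated numbers "LOW HIGH" that the sysctl holds.
gid_may_ping() {
    gid=$1
    set -- $2            # split "LOW HIGH" into $1 and $2
    [ "$gid" -ge "$1" ] && [ "$gid" -le "$2" ]
}

if gid_may_ping 1000 "0 2147483647"; then
    echo "GID 1000 may open ICMP echo sockets"
fi
```

An empty range such as "1 0", where the low bound exceeds the high bound, admits no group at all, whereas the value from the table covers every GID.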

The contents of Table 2.1 are not intended to put you off using rootless mode but instead to give an insight into the lengths that the developers at Docker have had to go to in order to make this functionality a reality. There are unquestionably trade-offs, but that is almost always the case when security controls are introduced. You might have only one lock on the front door of your house, for example, but to be fully insurable your door probably needs two locks, which means paying for a second lock, fitting it, and carrying a second key with you whenever you leave the house.
