Jailer: Incorrect handling of bind mounts within the rootfs #1089
Note: Disabling seccomp does not help "--seccomp-level", "0" |
Indeed, seccomp has nothing to do with this. I think it's because the jailer bind mounts the jail over itself non-recursively. It's easy to reproduce the behavior without Firecracker:
# Create a jail and a mountpoint
mkdir jail
touch jail/device
file jail/device
jail/device: empty
# Mount a device in the jail
sudo mount --bind /dev/random jail/device
file jail/device
jail/device: character special (1/8)
# Mount the jail over itself
sudo mount --bind jail jail
file jail/device
jail/device: empty
The device is no longer properly mounted, which looks like your use case. With a recursive bind mount instead, the device stays visible:
sudo mount --rbind jail jail
file jail/device
jail/device: character special (1/8) |
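To see which mounts sit below a directory before binding it over itself (the ones that `--bind` leaves behind and `--rbind` carries along in the reproduction above), one can scan `/proc/self/mountinfo`. A minimal Rust sketch, not taken from the jailer; `submounts_under` and `current_submounts` are names made up for illustration:

```rust
use std::fs;
use std::io;

/// Return the mount points listed in `mountinfo` that live strictly below `dir`.
/// `mountinfo` is expected to have the layout of /proc/self/mountinfo, where the
/// fifth whitespace-separated field of each line is the mount point.
fn submounts_under(mountinfo: &str, dir: &str) -> Vec<String> {
    let prefix = format!("{}/", dir.trim_end_matches('/'));
    mountinfo
        .lines()
        .filter_map(|line| line.split_whitespace().nth(4))
        .filter(|mount_point| {
            mount_point.len() > prefix.len() && mount_point.starts_with(&prefix)
        })
        .map(str::to_string)
        .collect()
}

/// Convenience wrapper (hypothetical) that reads the live mount table.
fn current_submounts(dir: &str) -> io::Result<Vec<String>> {
    let table = fs::read_to_string("/proc/self/mountinfo")?;
    Ok(submounts_under(&table, dir))
}
```

If `current_submounts` returns a non-empty list for the jail directory, a plain `mount --bind` of that directory will not carry those entries along.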
Nope, I'm wrong, the jailer keeps the mount intact. @mcastelino can you help with the sequence of operations that create the mounts inside the jail so I can reproduce exactly what you're seeing?
mkdir /srv/jailer/demo_jail/
touch /srv/jailer/demo_jail/mountpoint
file /srv/jailer/demo_jail/mountpoint
/srv/jailer/demo_jail/mountpoint: empty
sudo mount --bind /dev/dm-0 /srv/jailer/demo_jail/mountpoint
file /srv/jailer/demo_jail/mountpoint
/srv/jailer/demo_jail/mountpoint: block special (253/0)
sudo target/x86_64-unknown-linux-musl/debug/jailer --id demo --exec-file $PWD/target/x86_64-unknown-linux-musl/debug/firecracker --node 0 --uid 1234 --gid 1234 --chroot-base-dir /srv/jailer/demo_jail
file /srv/jailer/demo_jail/mountpoint
/srv/jailer/demo_jail/mountpoint: block special (253/0) |
@aghecenco here is a simple sequence of operations. Run the jailer two ways:
curl -fsSL -o hello-vmlinux.bin https://s3.amazonaws.com/spec.ccfc.min/img/hello/kernel/hello-vmlinux.bin
curl -fsSL -o hello-rootfs.ext4 https://s3.amazonaws.com/spec.ccfc.min/img/hello/fsfiles/hello-rootfs.ext4
sudo rm -rf /var/lib/jailer
# Holds the original content
sudo mkdir -p /var/lib/jailer/testjail
# Holds the jailed content
sudo mkdir -p /var/lib/jailer/firecracker/testhardlink/root
sudo mkdir -p /var/lib/jailer/firecracker/testbindmount/root
sudo -E cp ./hello-vmlinux.bin /var/lib/jailer/testjail
sudo -E cp ./hello-rootfs.ext4 /var/lib/jailer/testjail
#Using hardlinks
sudo -E ln /var/lib/jailer/testjail/hello-vmlinux.bin /var/lib/jailer/firecracker/testhardlink/root/hello-vmlinux.bin
sudo -E ln /var/lib/jailer/testjail/hello-rootfs.ext4 /var/lib/jailer/firecracker/testhardlink/root/hello-rootfs.ext4
#Using bindmount
sudo -E touch /var/lib/jailer/firecracker/testbindmount/root/hello-vmlinux.bin
sudo -E touch /var/lib/jailer/firecracker/testbindmount/root/hello-rootfs.ext4
sudo -E mount --bind /var/lib/jailer/testjail/hello-vmlinux.bin /var/lib/jailer/firecracker/testbindmount/root/hello-vmlinux.bin
sudo -E mount --bind /var/lib/jailer/testjail/hello-rootfs.ext4 /var/lib/jailer/firecracker/testbindmount/root/hello-rootfs.ext4
sudo -E $HOME/firecracker/build/debug/jailer \
--id testhardlink \
--node 0 \
--exec-file $HOME/firecracker/build/debug/firecracker \
--uid 0 \
--gid 0 \
--chroot-base-dir /var/lib/jailer \
--seccomp-level 0 \
--daemonize
sudo -E $HOME/firecracker/build/debug/jailer \
--id testbindmount \
--node 0 \
--exec-file $HOME/firecracker/build/debug/firecracker \
--uid 0 \
--gid 0 \
--chroot-base-dir /var/lib/jailer \
--seccomp-level 0 \
--daemonize
Use the following script to launch the VM:
curl --unix-socket /var/lib/jailer/firecracker/$1/root/api.socket -i \
-X PUT 'http://localhost/boot-source' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"kernel_image_path": "./hello-vmlinux.bin",
"boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
}'
curl --unix-socket /var/lib/jailer/firecracker/$1/root/api.socket -i \
-X PUT 'http://localhost/drives/rootfs' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"drive_id": "rootfs",
"path_on_host": "./hello-rootfs.ext4",
"is_root_device": true,
"is_read_only": false
}'
curl --unix-socket /var/lib/jailer/firecracker/$1/root/api.socket -i \
-X PUT 'http://localhost/actions' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"action_type": "InstanceStart"
}'
Launch VM through jailer using hardlinks (it works):
sudo ./launch testhardlink
Launch with bindmount which will fail
+ curl --unix-socket /var/lib/jailer/firecracker/testbindmount/root/api.socket -i -X PUT http://localhost/boot-source -H Accept: application/json -H Content-Type: application/json -d {
"kernel_image_path": "./hello-vmlinux.bin",
"boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
}
HTTP/1.1 204 No Content
Date: Mon, 13 May 2019 19:19:55 GMT
+ curl --unix-socket /var/lib/jailer/firecracker/testbindmount/root/api.socket -i -X PUT http://localhost/drives/rootfs -H Accept: application/json -H Content-Type: application/json -d {
"drive_id": "rootfs",
"path_on_host": "./hello-rootfs.ext4",
"is_root_device": true,
"is_read_only": false
}
HTTP/1.1 204 No Content
Date: Mon, 13 May 2019 19:19:55 GMT
+ curl --unix-socket /var/lib/jailer/firecracker/testbindmount/root/api.socket -i -X PUT http://localhost/actions -H Accept: application/json -H Content-Type: application/json -d {
"action_type": "InstanceStart"
}
HTTP/1.1 400 Bad Request
Content-Type: application/json
Transfer-Encoding: chunked
Date: Mon, 13 May 2019 19:19:55 GMT
{
"fault_message": "Cannot load kernel due to invalid memory configuration or invalid kernel image. Failed to read ELF header" |
See the last error, which comes from Firecracker being unable to read the backing file:
"fault_message": "Cannot load kernel due to invalid memory configuration or invalid kernel image. Failed to read ELF header"
Further, the jailed locations are set up correctly, as seen below.
|
Another thing to note is that if you change the bind mount to shared for the pivot root, this test will pass if the file already exists in the root. However, if the files are dropped into the root after Firecracker has been launched, we will still see the same issue. In Kata we drop in the files after the Firecracker launch (but before instance start). We also patch drives after instance start. So in both cases the bind mount happens after jail creation.
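To check which propagation type a mount actually ended up with (shared, slave, or neither), the optional fields of `/proc/self/mountinfo` carry `shared:N` and `master:N` tags. A rough Rust sketch of reading them, assuming the standard mountinfo layout; the `Propagation` enum and `propagation_of` are hypothetical names and are not part of Firecracker:

```rust
/// Propagation type as far as it can be read from mountinfo tags.
#[derive(Debug, PartialEq)]
enum Propagation {
    Shared,
    Slave,
    SharedAndSlave,
    /// No peer-group tags: the mount is private (or unbindable).
    PrivateOrUnbindable,
}

/// Look up `mount_point` in mountinfo-formatted text and classify it.
/// If several entries share the mount point, the last one (the one
/// stacked on top) is used. Returns None when the path is not a mount point.
fn propagation_of(mountinfo: &str, mount_point: &str) -> Option<Propagation> {
    let line = mountinfo
        .lines()
        .filter(|l| l.split_whitespace().nth(4) == Some(mount_point))
        .last()?;
    // Optional fields start at index 6 and end at the "-" separator.
    let optional: Vec<&str> = line
        .split_whitespace()
        .skip(6)
        .take_while(|field| *field != "-")
        .collect();
    let shared = optional.iter().any(|f| f.starts_with("shared:"));
    let slave = optional.iter().any(|f| f.starts_with("master:"));
    Some(match (shared, slave) {
        (true, true) => Propagation::SharedAndSlave,
        (true, false) => Propagation::Shared,
        (false, true) => Propagation::Slave,
        (false, false) => Propagation::PrivateOrUnbindable,
    })
}
```

Feeding it the contents of `/proc/<pid>/mountinfo` for the jailed process would show whether later host-side bind mounts can be expected to propagate into the jail.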
|
Users can bind mount into the chroot location. This is needed as hard links cannot cross file system boundaries. Copying is not always feasible (e.g. block devices). Change the bind mount to be slave, such that host-to-jail bind mounts are properly propagated. However, we do not want jail-to-host events to propagate back. Fixes: firecracker-microvm#1089 Signed-off-by: Manohar Castelino <[email protected]>
@aghecenco @andreeaflorescu making the jailed mount a slave solves this issue. Please let me know if this still meets the security criterion. This protects the host from the container while allowing the host the freedom to update the jail from the outside. |
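For anyone who wants to try the propagation change by hand, here is a small Rust sketch that marks an existing mount point as slave by shelling out to `mount --make-slave` from util-linux (`--make-rslave` is the recursive form). This is an illustration only: the jailer calls `libc::mount` directly, the exact flags used in the PR are not shown in this thread, and `make_slave` is a hypothetical helper name.

```rust
use std::io;
use std::process::Command;

/// Mark the mount at `mount_point` as a slave of its current peer group,
/// so mount events flow from the host into it but not back out.
/// Requires sufficient privileges (normally root) and an existing mount point.
fn make_slave(mount_point: &str) -> io::Result<()> {
    let status = Command::new("mount")
        .arg("--make-slave")
        .arg(mount_point)
        .status()?;
    if status.success() {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("mount --make-slave {} failed: {}", mount_point, status),
        ))
    }
}
```

Run against the jail root after it has been bind mounted, this should give the behavior described above, assuming the parent mount is shared so that there is a peer group to be a slave of.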
@mcastelino that's what I missed while testing. Thank you for the detailed investigation and the PR! 👍 |
Hi @mcastelino, We stared a bit longer at this, and it turns out there are two things which have to be fixed:
// Bind mount the jail root directory over itself, so we can go around a restriction
// imposed by pivot_root, which states that the new root and the old root should not
// be on the same filesystem. Safe because we provide valid parameters.
SyscallReturnCode(unsafe {
libc::mount(
chroot_dir.as_ptr(),
chroot_dir.as_ptr(),
null(),
libc::MS_BIND,
null(),
)
})
.into_empty_result()
.map_err(Error::MountBind)?;
we don't also supply the
Can you please update your PR to also add the |
@alexandruag yes, I missed the third case. I fixed it as you suggested and also did the formatting. |
…ating mounts for the guest kernel and rootfs and mounting them to the jailer root as mentioned in firecracker-microvm#1089 Signed-off-by: Anthony Corletti <[email protected]>
In Kata containers we bind mount device mapper devices into the chroot location.
This is needed as
The jailer does not seem to be able to handle this correctly.
The same bind mount is handled properly when running without the jailer (i.e. just Firecracker).
Below are the file hierarchies with and without the jailer.
Here the drive_0 entries are bind mounted to device mapper device nodes, which are then passed as drives to Firecracker.
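The commit message in this thread gives the reason for bind mounting at all: hard links cannot cross file system boundaries. A small Rust sketch of how a caller could detect that case and fall back to a bind mount, assuming Linux (where `EXDEV` is errno 18); `try_hard_link` is a hypothetical helper, not something Kata or Firecracker provides:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// errno value for EXDEV ("Invalid cross-device link") on Linux.
const EXDEV: i32 = 18;

/// Try to hard link `src` to `dst`.
/// Returns Ok(true) if the link was created, Ok(false) if the two paths are
/// on different mounted file systems (the caller then needs a bind mount or
/// a copy), and Err for any other failure.
fn try_hard_link(src: &Path, dst: &Path) -> io::Result<bool> {
    match fs::hard_link(src, dst) {
        Ok(()) => Ok(true),
        Err(e) if e.raw_os_error() == Some(EXDEV) => Ok(false),
        Err(e) => Err(e),
    }
}
```

Note that Linux also returns `EXDEV` when both paths are on the same file system but reached through different mounts, so an `Ok(false)` here does not always mean two distinct devices.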