README.md
+4 -4 (4 additions & 4 deletions)
@@ -30,13 +30,13 @@ We aimed to achieve container image's *block-level de-duplication on store*, *on
### Converting images.
Based on an existing image, you can generate a new image that is stored in a de-duplicated manner and runs with lazy pull.
We developed an image converter that generates the following data.
-
- __Boot image__: The generated Docker image. This image includes the __boot__ program, which is responsible for setting up the execution environment on boot using casync and desync (both are also included in the image), and then execs the original ENTRYPOINT app in the container. We use casync to provision the original image's rootfs with FUSE, based on the included metadata (aka [caibx or caidx](https://github.com/systemd/casync#file-suffixes)). Through the desync process, most of the original rootfs data is *pulled lazily* from the __remote chunk store__ on access and cached locally. We use desync's [cache functionality](https://github.com/folbricht/desync#caching), so if some blobs are already on the node, desync uses them without pulling them remotely, which leads to *block-level de-duplication on transfer*. If you use a container volume as the __local cache__, it can be shared by several containers on the node, which achieves *block-level inter-container de-duplication on the node*. This boot image follows the [Docker image spec](https://github.com/moby/moby/blob/master/image/spec/v1.2.md), so you can pull it from a container registry and run it in the usual way *without modification on the container runtime or registry*.
+
- __Boot image__: The generated Docker image. This image includes the __boot__ program, which is responsible for setting up the execution environment on boot using casync and desync (both are also included in the image), and then execs the original ENTRYPOINT app in the container. We use casync to provision the original image's rootfs with FUSE, based on the included metadata (aka [caibx or caidx](https://github.com/systemd/casync#file-suffixes)). Through the desync process, most of the original rootfs data is *pulled lazily* from the __remote chunk store__ on access and cached locally. We use desync's [cache functionality](https://github.com/folbricht/desync#caching), so if some blobs are already on the node, desync uses them without pulling them remotely, which leads to *block-level de-duplication on transfer*. If you use a container volume as the __local cache__, it can be shared by several containers on the node, which achieves *block-level inter-container de-duplication on the node*. This boot image follows the [Docker image spec](https://github.com/moby/moby/blob/master/image/spec/v1.2.md), so you can pull it from a container registry and run it in the usual way *without modification on the container runtime or registry*. Recently, we have been trying several archive formats other than catar, for example ISO9660, which has an index header at the top of the archive so that we can pull arbitrary files lazily without parsing the entire archive (which tar and catar require).
- __Rootfs blobs__: The original image's block-level CDC-chunked rootfs blobs. We use casync for chunking. Put these blobs somewhere like cluster-global storage (we call it the __remote chunk store__). If you store sets of blobs generated from several containers in the same store, you can achieve *block-level de-duplication on the store*.
-
At runtime, the boot program sets up the execution environment, casync provisions the original rootfs using FUSE, and desync pulls the rootfs blobs lazily from the remote chunk store, as mentioned above.
+
At runtime, the boot program sets up the execution environment, casync (for catar) or the `mount` command (for ISO9660) provisions the original rootfs using FUSE, and desync pulls the rootfs blobs lazily from the remote chunk store, as mentioned above.
The remote chunk store can be anything desync supports.
As we mention in the TODO list later, by extending desync to talk the container registry API, we believe we can combine the boot image and the blobs into one OCI-compatible image, in a way similar to what the FILEgrain project does, and pull it the way container runtimes do, which means we would no longer need dedicated remote chunk stores.
In our example, we use an SSH server with casync installed.
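
Note on the hunk above: the interplay of CDC chunking, the remote chunk store and the shared local cache can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the rolling sum below is a toy boundary function, not casync's actual chunker; SHA-256 chunk IDs, the constants, and the names `chunk`, `ChunkStore`, `convert` and `LazyReader` are all hypothetical; and in-memory dicts stand in for both the remote chunk store and the container volume used as a local cache.

```python
import hashlib

WINDOW = 16     # bytes covered by the rolling sum (illustrative value)
MASK = 0x3F     # cut when the low bits of the rolling sum are all set
MIN_SIZE = 32   # avoid degenerate tiny chunks
MAX_SIZE = 256  # force a cut so chunks stay bounded


def chunk(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries using a toy rolling sum."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= WINDOW:
            rolling -= data[i - WINDOW]  # slide the window forward
        size = i - start + 1
        if (size >= MIN_SIZE and (rolling & MASK) == MASK) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


class ChunkStore:
    """In-memory stand-in for the remote chunk store (hypothetical)."""

    def __init__(self):
        self.blobs = {}
        self.fetches = 0

    def put(self, blob: bytes) -> str:
        chunk_id = hashlib.sha256(blob).hexdigest()
        self.blobs.setdefault(chunk_id, blob)  # identical chunks stored once
        return chunk_id

    def get(self, chunk_id: str) -> bytes:
        self.fetches += 1  # counts simulated remote pulls
        return self.blobs[chunk_id]


def convert(rootfs: bytes, store: ChunkStore) -> list[str]:
    """Upload the chunks and return the index (ordered chunk IDs)."""
    return [store.put(c) for c in chunk(rootfs)]


class LazyReader:
    """Fetches a chunk from the remote store only on first access."""

    def __init__(self, index: list[str], remote: ChunkStore, cache: dict):
        self.index, self.remote, self.cache = index, remote, cache

    def read_chunk(self, n: int) -> bytes:
        chunk_id = self.index[n]
        if chunk_id not in self.cache:  # cache miss: pull remotely
            self.cache[chunk_id] = self.remote.get(chunk_id)
        return self.cache[chunk_id]

    def read_all(self) -> bytes:
        return b"".join(self.read_chunk(n) for n in range(len(self.index)))
```

Passing the same `cache` dict to two `LazyReader` instances mimics several containers sharing one volume: chunks already pulled for one image are not pulled again for another image that contains them.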
@@ -68,11 +68,11 @@ So currently, this is not perfect.
Some of the TODOs are listed below.

- [ ] We need to evaluate bootfs in quantitative ways (__critical !!!__).
-
- [ ] We use move mount and casync's FUSE mount functionality to provision the rootfs. So we need to use the insecure runtime options `--cap-add SYS_ADMIN`, `--security-opt apparmor:unconfined` and `--device /dev/fuse`. The same kind of FUSE-related issue is reported on the [Docker repo](https://github.com/docker/for-linux/issues/321).
+
- [ ] We use move mount and desync's FUSE mount functionality to provision the rootfs. So we need to use the insecure runtime options `--cap-add SYS_ADMIN`, `--security-opt apparmor:unconfined` and `--device /dev/fuse`. The same kind of FUSE-related issue is reported on the [Docker repo](https://github.com/docker/for-linux/issues/321).
- [ ] We cannot pull blobs from a container registry, which means we cannot combine the boot image and the blobs into one container image and put it on a container registry. This is because we rely on desync for pulling blobs, and desync doesn't talk the registry API. First we need to extend desync to support the registry API, and then combine the boot image and blobs in the way the [FILEgrain project](https://github.com/AkihiroSuda/filegrain) proposes. With that, we won't need dedicated remote chunk stores anymore.
- [ ] We cannot use the container's volume functionality unless we create mountpoint placeholders (dummy files or directories) on the original rootfs in advance, because the provisioned rootfs is read-only and we cannot create the placeholders at runtime.
- [ ] The SSH client implementation is very ad hoc. desync relies on the system's ssh client, and we use [Dropbear](http://matt.ucc.asn.au/dropbear/dropbear.html), which may be fine. But we inherit the original rootfs's user information configured in `/etc` (including the `/etc/passwd` file, etc.) without creating bootfs-specific information. We also need to consider authentication; currently we don't use any certificates and also skip Dropbear's known_hosts checking.
-
- [ ] The boot image is heavy. But it is shared among all containers on a node, thanks to the container runtime's native layer-level de-duplication. Maybe we need to create a lighter binary that combines the boot program's setup functionality, casync's rootfs-provisioning functionality and desync's lazy-pull functionality.
+
- [ ] The boot image is heavy. But it is shared among all containers on a node, thanks to the container runtime's native layer-level de-duplication. Maybe we need to create a lighter binary that combines the boot program's setup functionality, rootfs-provisioning functionality and desync's lazy-pull functionality.
- [ ] We make blobs only from the original view of the *rootfs*, not from *layers*. This means we throw away the layer information of the original image.
- [ ] The casync and desync processes are visible from any app, which can therefore break the world by reaping them. We need to `unshare(2)` the PID namespace between setup-related processes and the others.
- [ ] Currently, we haven't had any trouble from move mounting the new rootfs to `/`, which could affect the casync and desync processes that rely on directories or files in the original rootfs (like pulled blobs). But move mounting could possibly affect them in dangerous ways. Maybe we need to `unshare(2)` mount namespaces between setup-related processes and the others.
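
Note on the FUSE item in the TODO list above: the three runtime options it names can be assembled into a `docker run` invocation roughly as follows. This is a minimal sketch, not the project's own tooling; the helper name `docker_run_command` and the image name `example/boot-image:latest` are placeholders, and only the three flags themselves come from the README text.

```python
import shlex

# Runtime options quoted from the TODO item; they weaken container isolation.
FUSE_RUN_OPTIONS = [
    "--cap-add", "SYS_ADMIN",
    "--security-opt", "apparmor:unconfined",
    "--device", "/dev/fuse",
]


def docker_run_command(image: str, *app_args: str) -> list[str]:
    """Build the argv for running a boot image with the FUSE-related options."""
    return ["docker", "run", *FUSE_RUN_OPTIONS, image, *app_args]


if __name__ == "__main__":
    # The image name is a placeholder, not one published by the project.
    print(shlex.join(docker_run_command("example/boot-image:latest")))
```

Keeping the options in one list makes it easy to see (and later remove) exactly which privileges the current FUSE-based provisioning requires.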