Commit fc6149a

Merge pull request systemd#4962 from poettering/root-directory-2
Add new MountAPIVFS= boolean unit file setting + RootImage=
2 parents 52a4aaf + ef3116b commit fc6149a

17 files changed: +405 -139 lines changed

TODO

Lines changed: 0 additions & 4 deletions
@@ -122,8 +122,6 @@ Features:

 * switch to ProtectSystem=strict for all our long-running services where that's possible

-* If RootDirectory= is used, mount /proc, /sys, /dev into it, if not mounted yet
-
 * Permit masking specific netlink APIs with RestrictAddressFamily=

 * nspawn: start UID allocation loop from hash of container name
@@ -153,8 +151,6 @@ Features:
 * Add DataDirectory=, CacheDirectory= and LogDirectory= to match
 RuntimeDirectory=, and create it as necessary when starting a service, owned by the right user.

-* Add RootImage= for mounting a disk image or file as root directory
-
 * make sure the ratelimit object can deal with USEC_INFINITY as way to turn off things

 * journalctl: make sure -f ends when the container indicated by -M terminates

man/systemd-nspawn.xml

Lines changed: 8 additions & 4 deletions
@@ -256,10 +256,14 @@

 <listitem><para>Takes a data integrity (dm-verity) root hash specified in hexadecimal. This option enables data
 integrity checks using dm-verity, if the used image contains the appropriate integrity data (see above). The
-specified hash must match the root hash of integrity data, and is usually at least 256bits (and hence 64
-hexadecimal characters) long (in case of SHA256 for example). If this option is not specified, but a file with
-the <filename>.roothash</filename> suffix is found next to the image file, bearing otherwise the same name the
-root hash is read from it and automatically used.</para></listitem>
+specified hash must match the root hash of integrity data, and is usually at least 256 bits (and hence 64
+formatted hexadecimal characters) long (in case of SHA256 for example). If this option is not specified, but
+the image file carries the <literal>user.verity.roothash</literal> extended file attribute (see <citerefentry
+project='man-pages'><refentrytitle>xattr</refentrytitle><manvolnum>7</manvolnum></citerefentry>), then the root
+hash is read from it, also as formatted hexadecimal characters. If the extended file attribute is not found (or
+is not supported by the underlying file system), but a file with the <filename>.roothash</filename> suffix is
+found next to the image file, bearing otherwise the same name, the root hash is read from it and automatically
+used, also as formatted hexadecimal characters.</para></listitem>
 </varlistentry>

 <varlistentry>

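Note on the root hash format described in the man/systemd-nspawn.xml hunk above: the text says the hash is given as hexadecimal characters and is usually at least 256 bits, hence 64 characters for SHA256. The following is a minimal sketch of how such a string could be validated and decoded. It is not the parser systemd uses; the names `hex_digit` and `roothash_decode` and the buffer convention are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from systemd): value of one hex digit, or -1. */
static int hex_digit(char c) {
        if (c >= '0' && c <= '9')
                return c - '0';
        if (c >= 'a' && c <= 'f')
                return c - 'a' + 10;
        if (c >= 'A' && c <= 'F')
                return c - 'A' + 10;
        return -1;
}

/* Hypothetical helper (not from systemd): decodes a hexadecimal root hash
 * string into 'out'. Returns the number of bytes written, or -1 if the
 * string is empty, has an odd length, contains a non-hex character, or
 * does not fit into 'out_size' bytes. */
int roothash_decode(const char *hex, unsigned char *out, size_t out_size) {
        size_t n = strlen(hex);

        if (n == 0 || n % 2 != 0 || n / 2 > out_size)
                return -1;

        for (size_t i = 0; i < n; i += 2) {
                int hi = hex_digit(hex[i]);
                int lo = hex_digit(hex[i + 1]);

                if (hi < 0 || lo < 0)
                        return -1;

                out[i / 2] = (unsigned char) ((hi << 4) | lo);
        }

        return (int) (n / 2);
}
```

With a 32-byte buffer, a 64-character SHA256 root hash decodes to 32 bytes, while a string of odd length or one containing a non-hex character is rejected.
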
man/systemd.exec.xml

Lines changed: 47 additions & 22 deletions
@@ -86,12 +86,10 @@
 <para>A few execution parameters result in additional, automatic
 dependencies to be added.</para>

-<para>Units with <varname>WorkingDirectory=</varname> or
-<varname>RootDirectory=</varname> set automatically gain
-dependencies of type <varname>Requires=</varname> and
-<varname>After=</varname> on all mount units required to access
-the specified paths. This is equivalent to having them listed
-explicitly in <varname>RequiresMountsFor=</varname>.</para>
+<para>Units with <varname>WorkingDirectory=</varname>, <varname>RootDirectory=</varname> or
+<varname>RootImage=</varname> set automatically gain dependencies of type <varname>Requires=</varname> and
+<varname>After=</varname> on all mount units required to access the specified paths. This is equivalent to having
+them listed explicitly in <varname>RequiresMountsFor=</varname>.</para>

 <para>Similar, units with <varname>PrivateTmp=</varname> enabled automatically get mount unit dependencies for all
 mounts required to access <filename>/tmp</filename> and <filename>/var/tmp</filename>. They will also gain an
@@ -117,9 +115,10 @@
 <varname>User=</varname> is used. If not set, defaults to the root directory when systemd is running as a
 system instance and the respective user's home directory if run as user. If the setting is prefixed with the
 <literal>-</literal> character, a missing working directory is not considered fatal. If
-<varname>RootDirectory=</varname> is not set, then <varname>WorkingDirectory=</varname> is relative to the root
-of the system running the service manager. Note that setting this parameter might result in additional
-dependencies to be added to the unit (see above).</para></listitem>
+<varname>RootDirectory=</varname>/<varname>RootImage=</varname> is not set, then
+<varname>WorkingDirectory=</varname> is relative to the root of the system running the service manager. Note
+that setting this parameter might result in additional dependencies to be added to the unit (see
+above).</para></listitem>
 </varlistentry>

 <varlistentry>
@@ -132,8 +131,33 @@
 the <function>chroot()</function> jail. Note that setting this parameter might result in additional
 dependencies to be added to the unit (see above).</para>

-<para>The <varname>PrivateUsers=</varname> setting is particularly useful in conjunction with
-<varname>RootDirectory=</varname>. For details, see below.</para></listitem>
+<para>The <varname>MountAPIVFS=</varname> and <varname>PrivateUsers=</varname> settings are particularly useful
+in conjunction with <varname>RootDirectory=</varname>. For details, see below.</para></listitem>
+</varlistentry>
+
+<varlistentry>
+<term><varname>RootImage=</varname></term>
+<listitem><para>Takes a path to a block device node or regular file as argument. This call is similar to
+<varname>RootDirectory=</varname> however mounts a file system hierarchy from a block device node or loopack
+file instead of a directory. The device node or file system image file needs to contain a file system without a
+partition table, or a file system within an MBR/MS-DOS or GPT partition table with only a single
+Linux-compatible partition, or a set of file systems within a GPT partition table that follows the <ulink
+url="http://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/">Discoverable Partitions
+Specification</ulink>.</para></listitem>
+</varlistentry>
+
+<varlistentry>
+<term><varname>MountAPIVFS=</varname></term>
+
+<listitem><para>Takes a boolean argument. If on, a private mount namespace for the unit's processes is created
+and the API file systems <filename>/proc</filename>, <filename>/sys</filename>, and <filename>/dev</filename>
+are mounted inside of it, unless they are already mounted. Note that this option has no effect unless used in
+conjunction with <varname>RootDirectory=</varname>/<varname>RootImage=</varname> as these three mounts are
+generally mounted in the host anyway, and unless the root directory is changed, the private mount namespace
+will be a 1:1 copy of the host's, and include these three mounts. Note that the <filename>/dev</filename> file
+system of the host is bind mounted if this option is used without <varname>PrivateDevices=</varname>. To run
+the service with a private, minimal version of <filename>/dev/</filename>, combine this option with
+<varname>PrivateDevices=</varname>.</para></listitem>
 </varlistentry>

 <varlistentry>
@@ -938,7 +962,7 @@
 access a process might have to the file system hierarchy. Each setting takes a space-separated list of paths
 relative to the host's root directory (i.e. the system running the service manager). Note that if paths
 contain symlinks, they are resolved relative to the root directory set with
-<varname>RootDirectory=</varname>.</para>
+<varname>RootDirectory=</varname>/<varname>RootImage=</varname>.</para>

 <para>Paths listed in <varname>ReadWritePaths=</varname> are accessible from within the namespace with the same
 access modes as from outside of it. Paths listed in <varname>ReadOnlyPaths=</varname> are accessible for
@@ -957,9 +981,10 @@
 <para>Paths in <varname>ReadWritePaths=</varname>, <varname>ReadOnlyPaths=</varname> and
 <varname>InaccessiblePaths=</varname> may be prefixed with <literal>-</literal>, in which case they will be
 ignored when they do not exist. If prefixed with <literal>+</literal> the paths are taken relative to the root
-directory of the unit, as configured with <varname>RootDirectory=</varname>, instead of relative to the root
-directory of the host (see above). When combining <literal>-</literal> and <literal>+</literal> on the same
-path make sure to specify <literal>-</literal> first, and <literal>+</literal> second.</para>
+directory of the unit, as configured with <varname>RootDirectory=</varname>/<varname>RootImage=</varname>,
+instead of relative to the root directory of the host (see above). When combining <literal>-</literal> and
+<literal>+</literal> on the same path make sure to specify <literal>-</literal> first, and <literal>+</literal>
+second.</para>

 <para>Note that using this setting will disconnect propagation of mounts from the service to the host
 (propagation in the opposite direction continues to work). This means that this setting may not be used for
@@ -990,9 +1015,9 @@
 that in this case both read-only and regular bind mounts are reset, regardless which of the two settings is
 used.</para>

-<para>This option is particularly useful when <varname>RootDirectory=</varname> is used. In this case the
-source path refers to a path on the host file system, while the destination path refers to a path below the
-root directory of the unit.</para></listitem>
+<para>This option is particularly useful when <varname>RootDirectory=</varname>/<varname>RootImage=</varname>
+is used. In this case the source path refers to a path on the host file system, while the destination path
+refers to a path below the root directory of the unit.</para></listitem>
 </varlistentry>

 <varlistentry>
@@ -1080,10 +1105,10 @@
 such as <varname>CapabilityBoundingSet=</varname> will affect only the latter, and there's no way to acquire
 additional capabilities in the host's user namespace. Defaults to off.</para>

-<para>This setting is particularly useful in conjunction with <varname>RootDirectory=</varname>, as the need to
-synchronize the user and group databases in the root directory and on the host is reduced, as the only users
-and groups who need to be matched are <literal>root</literal>, <literal>nobody</literal> and the unit's own
-user and group.</para></listitem>
+<para>This setting is particularly useful in conjunction with
+<varname>RootDirectory=</varname>/<varname>RootImage=</varname>, as the need to synchronize the user and group
+databases in the root directory and on the host is reduced, as the only users and groups who need to be matched
+are <literal>root</literal>, <literal>nobody</literal> and the unit's own user and group.</para></listitem>
 </varlistentry>

 <varlistentry>

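Note on the MountAPIVFS= description added in man/systemd.exec.xml above: the paragraph states that the setting only has an effect together with RootDirectory=/RootImage=, and that /dev is bind mounted from the host unless PrivateDevices= is also set. The sketch below models that decision table in C. It is an illustration of the documented behaviour only, not the code path in systemd; the struct, enum, and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant unit settings (not systemd's ExecContext). */
struct exec_settings {
        const char *root_directory; /* RootDirectory=, NULL if unset */
        const char *root_image;     /* RootImage=, NULL if unset */
        bool mount_apivfs;          /* MountAPIVFS= */
        bool private_devices;       /* PrivateDevices= */
};

/* How /dev ends up inside the unit's root, per the man page text. */
enum dev_mode {
        DEV_UNCHANGED, /* nothing is mounted on /dev on behalf of these settings */
        DEV_BIND_HOST, /* the host's /dev is bind mounted */
        DEV_PRIVATE,   /* a private, minimal /dev is set up */
};

/* MountAPIVFS= is documented to have no effect unless the root is changed. */
bool apivfs_has_effect(const struct exec_settings *s) {
        return s->mount_apivfs && (s->root_directory != NULL || s->root_image != NULL);
}

/* PrivateDevices= takes precedence; otherwise an effective MountAPIVFS=
 * results in the host's /dev being bind mounted. */
enum dev_mode dev_mode_for(const struct exec_settings *s) {
        if (s->private_devices)
                return DEV_PRIVATE;
        if (apivfs_has_effect(s))
                return DEV_BIND_HOST;
        return DEV_UNCHANGED;
}
```

For example, a unit with RootImage= and MountAPIVFS=yes but without PrivateDevices= maps to `DEV_BIND_HOST`, while MountAPIVFS=yes alone maps to `DEV_UNCHANGED`.
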
src/core/dbus-execute.c

Lines changed: 15 additions & 7 deletions
@@ -758,6 +758,7 @@ const sd_bus_vtable bus_exec_vtable[] = {
         SD_BUS_PROPERTY("LimitRTTIMESoft", "t", bus_property_get_rlimit, offsetof(ExecContext, rlimit[RLIMIT_RTTIME]), SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("WorkingDirectory", "s", property_get_working_directory, 0, SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("RootDirectory", "s", NULL, offsetof(ExecContext, root_directory), SD_BUS_VTABLE_PROPERTY_CONST),
+        SD_BUS_PROPERTY("RootImage", "s", NULL, offsetof(ExecContext, root_image), SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("OOMScoreAdjust", "i", property_get_oom_score_adjust, 0, SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("Nice", "i", property_get_nice, 0, SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("IOScheduling", "i", property_get_ioprio, 0, SD_BUS_VTABLE_PROPERTY_CONST),
@@ -828,6 +829,7 @@ const sd_bus_vtable bus_exec_vtable[] = {
         SD_BUS_PROPERTY("RestrictNamespaces", "t", bus_property_get_ulong, offsetof(ExecContext, restrict_namespaces), SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("BindPaths", "a(ssbt)", property_get_bind_paths, 0, SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("BindReadOnlyPaths", "a(ssbt)", property_get_bind_paths, 0, SD_BUS_VTABLE_PROPERTY_CONST),
+        SD_BUS_PROPERTY("MountAPIVFS", "b", bus_property_get_bool, offsetof(ExecContext, mount_apivfs), SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_VTABLE_END
 };

@@ -1047,7 +1049,7 @@ int bus_exec_context_set_transient_property(

                 return 1;

-        } else if (STR_IN_SET(name, "TTYPath", "RootDirectory")) {
+        } else if (STR_IN_SET(name, "TTYPath", "RootDirectory", "RootImage")) {
                 const char *s;

                 r = sd_bus_message_read(message, "s", &s);
@@ -1060,6 +1062,8 @@ int bus_exec_context_set_transient_property(
                 if (mode != UNIT_CHECK) {
                         if (streq(name, "TTYPath"))
                                 r = free_and_strdup(&c->tty_path, s);
+                        else if (streq(name, "RootImage"))
+                                r = free_and_strdup(&c->root_image, s);
                         else {
                                 assert(streq(name, "RootDirectory"));
                                 r = free_and_strdup(&c->root_directory, s);
@@ -1207,7 +1211,7 @@ int bus_exec_context_set_transient_property(
                               "PrivateTmp", "PrivateDevices", "PrivateNetwork", "PrivateUsers",
                               "NoNewPrivileges", "SyslogLevelPrefix", "MemoryDenyWriteExecute",
                               "RestrictRealtime", "DynamicUser", "RemoveIPC", "ProtectKernelTunables",
-                              "ProtectKernelModules", "ProtectControlGroups")) {
+                              "ProtectKernelModules", "ProtectControlGroups", "MountAPIVFS")) {
                 int b;

                 r = sd_bus_message_read(message, "b", &b);
@@ -1247,6 +1251,8 @@ int bus_exec_context_set_transient_property(
                                 c->protect_kernel_modules = b;
                         else if (streq(name, "ProtectControlGroups"))
                                 c->protect_control_groups = b;
+                        else if (streq(name, "MountAPIVFS"))
+                                c->mount_apivfs = b;

                         unit_write_drop_in_private_format(u, mode, name, "%s=%s", name, yes_no(b));
                 }
@@ -1495,12 +1501,15 @@ int bus_exec_context_set_transient_property(
                         return r;

                 STRV_FOREACH(p, l) {
-                        int offset;
-                        if (!utf8_is_valid(*p))
+                        const char *i = *p;
+                        size_t offset;
+
+                        if (!utf8_is_valid(i))
                                 return sd_bus_error_setf(error, SD_BUS_ERROR_INVALID_ARGS, "Invalid %s", name);

-                        offset = **p == '-';
-                        if (!path_is_absolute(*p + offset))
+                        offset = i[0] == '-';
+                        offset += i[offset] == '+';
+                        if (!path_is_absolute(i + offset))
                                 return sd_bus_error_setf(error, SD_BUS_ERROR_INVALID_ARGS, "Invalid %s", name);
                 }

@@ -1519,7 +1528,6 @@ int bus_exec_context_set_transient_property(
                                 unit_write_drop_in_private_format(u, mode, name, "%s=", name);
                         } else {
                                 r = strv_extend_strv(dirs, l, true);
-
                                 if (r < 0)
                                         return -ENOMEM;

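Note on the prefix handling in the `STRV_FOREACH` hunk above: the new code skips an optional `-` and then an optional `+` before checking that the remainder is an absolute path, which matches the man page requirement to specify `-` first and `+` second. The sketch below isolates that offset arithmetic so it can be tried on its own. The function names are hypothetical, and the leading-slash check is only a stand-in for systemd's `path_is_absolute()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: number of prefix characters before the path,
 * i.e. an optional '-' followed by an optional '+'. This mirrors the
 * two offset assignments in the hunk; reading s[offset] is safe because
 * offset is at most 1 and is only 1 when s[0] is '-', so s[1] is at
 * worst the terminating NUL. */
size_t path_prefix_length(const char *s) {
        size_t offset;

        offset = s[0] == '-';
        offset += s[offset] == '+';
        return offset;
}

/* Hypothetical stand-in for path_is_absolute(): true if the text after
 * the prefixes starts with '/'. */
bool prefixed_path_is_absolute(const char *s) {
        return s[path_prefix_length(s)] == '/';
}
```

Under this logic `-+/var/lib/foo` is accepted with a prefix length of 2, while `+-/var/lib/foo` is rejected, because after skipping `+` the next character is `-` rather than `/`.
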