sd-daemon: introduce sd_watchdog_enabled() for parsing $WATCHDOG_USEC

Also, introduce a new environment variable named $WATCHDOG_PID which
contains the PID of the process that is supposed to send the keep-alive
events. This is similar to how $LISTEN_FDS and $LISTEN_PID work together,
and protects against confusing processes further down the process tree
due to inherited environment.
Lennart Poettering 2013-12-22 22:14:05 +01:00
parent 565a9388f2
commit 09812eb764
14 changed files with 326 additions and 26 deletions

View file

@ -41,6 +41,7 @@ MANPAGES += \
man/sd_journal_stream_fd.3 \
man/sd_listen_fds.3 \
man/sd_notify.3 \
man/sd_watchdog_enabled.3 \
man/shutdown.8 \
man/sysctl.d.5 \
man/systemctl.1 \
@ -1133,6 +1134,7 @@ EXTRA_DIST += \
man/sd_seat_get_active.xml \
man/sd_session_is_active.xml \
man/sd_uid_get_state.xml \
man/sd_watchdog_enabled.xml \
man/shutdown.xml \
man/sysctl.d.xml \
man/systemctl.xml \

View file

@ -167,6 +167,7 @@
<citerefentry><refentrytitle>sd_notify</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_booted</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_is_fifo</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_watchdog_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>daemon</refentrytitle><manvolnum>7</manvolnum></citerefentry>,
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>,
<citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry>,

View file

@ -164,11 +164,15 @@
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>
for details. It is recommended to send
this message if the
<varname>WATCHDOG_USEC=</varname>
environment variable has been set for
the service process, in every half the
time interval that is specified in the
variable.</para></listitem>
<varname>$WATCHDOG_PID</varname>
environment variable has been set to
the PID of the service process, in
every half the time interval that is
specified in the
<varname>$WATCHDOG_USEC</varname>
environment variable. See
<citerefentry><refentrytitle>sd_watchdog_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>
for details.</para></listitem>
</varlistentry>
</variablelist>
@ -311,7 +315,8 @@
<citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd-daemon</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>daemon</refentrytitle><manvolnum>7</manvolnum></citerefentry>,
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_watchdog_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>
</para>
</refsect1>

198
man/sd_watchdog_enabled.xml Normal file
View file

@ -0,0 +1,198 @@
<?xml version='1.0'?> <!--*-nxml-*-->
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<!--
This file is part of systemd.
Copyright 2013 Lennart Poettering
systemd is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or
(at your option) any later version.
systemd is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with systemd; If not, see <http://www.gnu.org/licenses/>.
-->
<refentry id="sd_watchdog_enabled">
<refentryinfo>
<title>sd_watchdog_enabled</title>
<productname>systemd</productname>
<authorgroup>
<author>
<contrib>Developer</contrib>
<firstname>Lennart</firstname>
<surname>Poettering</surname>
<email>lennart@poettering.net</email>
</author>
</authorgroup>
</refentryinfo>
<refmeta>
<refentrytitle>sd_watchdog_enabled</refentrytitle>
<manvolnum>3</manvolnum>
</refmeta>
<refnamediv>
<refname>sd_watchdog_enabled</refname>
<refpurpose>Check whether the service manager expects watchdog keep-alive notifications from a service</refpurpose>
</refnamediv>
<refsynopsisdiv>
<funcsynopsis>
<funcsynopsisinfo>#include &lt;systemd/sd-daemon.h&gt;</funcsynopsisinfo>
<funcprototype>
<funcdef>int <function>sd_watchdog_enabled</function></funcdef>
<paramdef>int <parameter>unset_environment</parameter></paramdef>
<paramdef>uint64_t *<parameter>usec</parameter></paramdef>
</funcprototype>
</funcsynopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para><function>sd_watchdog_enabled()</function> may
be called by a service to detect whether the service
manager expects regular keep-alive watchdog
notification events from it, and the timeout after
which the manager will act on the service if it did
not get such a notification.</para>
<para>If the <parameter>unset_environment</parameter>
parameter is non-zero,
<function>sd_watchdog_enabled()</function> will unset
the <varname>$WATCHDOG_USEC</varname> and
<varname>$WATCHDOG_PID</varname> environment variables
before returning (regardless of whether the function call
itself succeeded or not). Further calls to
<function>sd_watchdog_enabled()</function> will then
return with zero, but the variables are no longer
inherited by child processes.</para>
<para>If the <parameter>usec</parameter> parameter is
non-NULL, <function>sd_watchdog_enabled()</function>
will return the timeout in µs for the watchdog
logic. The service manager will usually terminate a
service when it does not get a notification message
within the specified time after startup and after each
previous message. It is recommended that a daemon
sends a keep-alive notification message to the service
manager every half of the time returned
here. Notification messages may be sent with
<citerefentry><refentrytitle>sd_notify</refentrytitle><manvolnum>3</manvolnum></citerefentry>
with a message string of
<literal>WATCHDOG=1</literal>.</para>
<para>To enable service supervision with the watchdog
logic use <varname>WatchdogSec=</varname> in service
files. See
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>
for details.</para>
</refsect1>
<refsect1>
<title>Return Value</title>
<para>On failure, this call returns a negative
errno-style error code. If the service manager expects
watchdog keep-alive notification messages to be sent,
&gt; 0 is returned, otherwise 0 is returned. Only if
the return value is &gt; 0 the
<parameter>usec</parameter> parameter is valid after
the call.</para>
</refsect1>
<refsect1>
<title>Notes</title>
<para>This function is provided by the reference
implementation of APIs for new-style daemons and
distributed with the systemd package. The algorithm
it implements is simple, and can easily be
reimplemented in daemons if it is important to support
this interface without using the reference
implementation.</para>
<para>Internally, this function parses the
<varname>$WATCHDOG_PID</varname> and
<varname>$WATCHDOG_USEC</varname> environment
variables. The call will ignore these variables if
<varname>$WATCHDOG_PID</varname> does not contain the PID
of the current process, under the assumption that in
that case the variables were set for a different
process further up the process tree.</para>
<para>For details about the algorithm check the
liberally licensed reference implementation sources:
<ulink url="http://cgit.freedesktop.org/systemd/systemd/plain/src/libsystemd-daemon/sd-daemon.c"/>
and <ulink
url="http://cgit.freedesktop.org/systemd/systemd/plain/src/systemd/sd-daemon.h"/></para>
<para><function>sd_watchdog_enabled()</function> is
implemented in the reference implementation's
<filename>sd-daemon.c</filename> and
<filename>sd-daemon.h</filename> files. These
interfaces are available as a shared library, which can
be compiled and linked to with the
<constant>libsystemd-daemon</constant> <citerefentry><refentrytitle>pkg-config</refentrytitle><manvolnum>1</manvolnum></citerefentry>
file. Alternatively, applications consuming these APIs
may copy the implementation into their source
tree. For more details about the reference
implementation see
<citerefentry><refentrytitle>sd-daemon</refentrytitle><manvolnum>3</manvolnum></citerefentry>.</para>
<para>If the reference implementation is used as
drop-in files and -DDISABLE_SYSTEMD is set during
compilation, this function will always return 0 and
otherwise become a NOP.</para>
</refsect1>
<refsect1>
<title>Environment</title>
<variablelist class='environment-variables'>
<varlistentry>
<term><varname>$WATCHDOG_PID</varname></term>
<listitem><para>Set by the system
manager for supervised processes for
which watchdog support is enabled, and
contains the PID of that process. See
above for details.</para></listitem>
</varlistentry>
<varlistentry>
<term><varname>$WATCHDOG_USEC</varname></term>
<listitem><para>Set by the system
manager for supervised processes for
which watchdog support is enabled, and
contains the watchdog timeout in µs.
See above for
details.</para></listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See Also</title>
<para>
<citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd-daemon</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>daemon</refentrytitle><manvolnum>7</manvolnum></citerefentry>,
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_notify</refentrytitle><manvolnum>3</manvolnum></citerefentry>
</para>
</refsect1>
</refentry>
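
The man page above recommends sending a keep-alive at half the returned timeout. A minimal sketch of that arithmetic follows; it is not part of this commit. `watchdog_ping_interval()` is a hypothetical helper name, and the caller is assumed to have obtained `timeout_usec` from a successful `sd_watchdog_enabled()` call.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not part of sd-daemon): convert the watchdog
 * timeout reported in microseconds into the recommended keep-alive
 * interval, i.e. half of the timeout, as a struct timespec. */
static struct timespec watchdog_ping_interval(uint64_t timeout_usec) {
        struct timespec ts;
        uint64_t half;

        /* A timeout of 0 would mean the watchdog is not enabled. */
        assert(timeout_usec > 0);

        half = timeout_usec / 2;
        ts.tv_sec = (time_t) (half / 1000000ULL);
        ts.tv_nsec = (long) ((half % 1000000ULL) * 1000ULL);
        return ts;
}
```

A daemon could sleep for this interval, for example with `nanosleep()`, between calls to `sd_notify(0, "WATCHDOG=1")`.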

View file

@ -1020,6 +1020,7 @@ static void do_idle_pipe_dance(int idle_pipe[4]) {
static int build_environment(
ExecContext *c,
unsigned n_fds,
usec_t watchdog_usec,
const char *home,
const char *username,
const char *shell,
@ -1032,7 +1033,7 @@ static int build_environment(
assert(c);
assert(ret);
our_env = new(char*, 8);
our_env = new0(char*, 10);
if (!our_env)
return -ENOMEM;
@ -1046,6 +1047,16 @@ static int build_environment(
our_env[n_env++] = x;
}
if (watchdog_usec > 0) {
if (asprintf(&x, "WATCHDOG_PID=%lu", (unsigned long) getpid()) < 0)
return -ENOMEM;
our_env[n_env++] = x;
if (asprintf(&x, "WATCHDOG_USEC=%llu", (unsigned long long) watchdog_usec) < 0)
return -ENOMEM;
our_env[n_env++] = x;
}
if (home) {
x = strappend("HOME=", home);
if (!x)
@ -1084,7 +1095,7 @@ static int build_environment(
}
our_env[n_env++] = NULL;
assert(n_env <= 8);
assert(n_env <= 10);
*ret = our_env;
our_env = NULL;
@ -1104,6 +1115,7 @@ int exec_spawn(ExecCommand *command,
CGroupControllerMask cgroup_supported,
const char *cgroup_path,
const char *unit_id,
usec_t watchdog_usec,
int idle_pipe[4],
ExecRuntime *runtime,
pid_t *ret) {
@ -1560,7 +1572,7 @@ int exec_spawn(ExecCommand *command,
}
}
err = build_environment(context, n_fds, home, username, shell, &our_env);
err = build_environment(context, n_fds, watchdog_usec, home, username, shell, &our_env);
if (err < 0) {
r = EXIT_MEMORY;
goto fail_child;
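
The environment setup in the hunks above can be illustrated in isolation. The sketch below is a stand-in under stated assumptions, not the code from this file: `format_env()` and `watchdog_env()` are hypothetical names, and `snprintf()` plus `malloc()` replace `asprintf()` so that the example does not depend on GNU extensions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return a newly allocated "NAME=value" string,
 * or NULL on allocation or formatting failure. */
static char *format_env(const char *name, unsigned long long value) {
        char buf[128];
        char *s;
        int n;

        assert(name);

        n = snprintf(buf, sizeof(buf), "%s=%llu", name, value);
        if (n < 0 || (size_t) n >= sizeof(buf))
                return NULL;

        s = malloc((size_t) n + 1);
        if (!s)
                return NULL;

        memcpy(s, buf, (size_t) n + 1);
        return s;
}

/* Hypothetical helper: fill env[0..1] with the two watchdog variables
 * for the process `pid`. Returns the number of entries written (0 if
 * the watchdog is disabled, i.e. usec == 0) or -1 on failure. */
static int watchdog_env(pid_t pid, uint64_t usec, char *env[2]) {
        assert(env);

        if (usec == 0)
                return 0;

        env[0] = format_env("WATCHDOG_PID", (unsigned long long) pid);
        if (!env[0])
                return -1;

        env[1] = format_env("WATCHDOG_USEC", (unsigned long long) usec);
        if (!env[1]) {
                free(env[0]);
                env[0] = NULL;
                return -1;
        }

        return 2;
}
```

Both variables are emitted together or not at all, which mirrors the `watchdog_usec > 0` guard in the hunk: a process never sees `WATCHDOG_USEC` without the matching `WATCHDOG_PID`.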

View file

@ -181,6 +181,7 @@ int exec_spawn(ExecCommand *command,
CGroupControllerMask cgroup_mask,
const char *cgroup_path,
const char *unit_id,
usec_t watchdog_usec,
int pipe_fd[2],
ExecRuntime *runtime,
pid_t *ret);

View file

@ -787,6 +787,7 @@ static int mount_spawn(Mount *m, ExecCommand *c, pid_t *_pid) {
UNIT(m)->manager->cgroup_supported,
UNIT(m)->cgroup_path,
UNIT(m)->id,
0,
NULL,
m->exec_runtime,
&pid);

View file

@ -1750,7 +1750,7 @@ static int service_spawn(
if (r < 0)
goto fail;
our_env = new0(char*, 5);
our_env = new0(char*, 4);
if (!our_env) {
r = -ENOMEM;
goto fail;
@ -1768,12 +1768,6 @@ static int service_spawn(
goto fail;
}
if (s->watchdog_usec > 0)
if (asprintf(our_env + n_env++, "WATCHDOG_USEC=%llu", (unsigned long long) s->watchdog_usec) < 0) {
r = -ENOMEM;
goto fail;
}
if (UNIT(s)->manager->running_as != SYSTEMD_SYSTEM)
if (asprintf(our_env + n_env++, "MANAGERPID=%lu", (unsigned long) getpid()) < 0) {
r = -ENOMEM;
@ -1804,6 +1798,7 @@ static int service_spawn(
UNIT(s)->manager->cgroup_supported,
path,
UNIT(s)->id,
s->watchdog_usec,
s->type == SERVICE_IDLE ? UNIT(s)->manager->idle_pipe : NULL,
s->exec_runtime,
&pid);

View file

@ -1254,6 +1254,7 @@ static int socket_spawn(Socket *s, ExecCommand *c, pid_t *_pid) {
UNIT(s)->manager->cgroup_supported,
UNIT(s)->cgroup_path,
UNIT(s)->id,
0,
NULL,
s->exec_runtime,
&pid);

View file

@ -645,6 +645,7 @@ static int swap_spawn(Swap *s, ExecCommand *c, pid_t *_pid) {
UNIT(s)->manager->cgroup_supported,
UNIT(s)->cgroup_path,
UNIT(s)->id,
0,
NULL,
s->exec_runtime,
&pid);

View file

@ -2164,17 +2164,10 @@ _public_ int sd_event_set_watchdog(sd_event *e, int b) {
if (b) {
struct epoll_event ev = {};
const char *env;
env = getenv("WATCHDOG_USEC");
if (!env)
return false;
r = safe_atou64(env, &e->watchdog_period);
if (r < 0)
r = sd_watchdog_enabled(false, &e->watchdog_period);
if (r <= 0)
return r;
if (e->watchdog_period <= 0)
return -EIO;
/* Issue first ping immediately */
sd_notify(false, "WATCHDOG=1");

View file

@ -25,3 +25,8 @@ global:
local:
*;
};
LIBSYSTEMD_DAEMON_209 {
global:
sd_watchdog_enabled;
} LIBSYSTEMD_DAEMON_31;

View file

@ -518,3 +518,69 @@ _sd_export_ int sd_booted(void) {
return !!S_ISDIR(st.st_mode);
#endif
}
_sd_export_ int sd_watchdog_enabled(int unset_environment, uint64_t *usec) {
#if defined(DISABLE_SYSTEMD) || !defined(__linux__)
return 0;
#else
unsigned long long ll;
unsigned long l;
const char *e;
char *p = NULL;
int r;
e = getenv("WATCHDOG_PID");
if (!e) {
r = 0;
goto finish;
}
errno = 0;
l = strtoul(e, &p, 10);
if (errno > 0) {
r = -errno;
goto finish;
}
if (!p || p == e || *p || l <= 0) {
r = -EINVAL;
goto finish;
}
/* Is this for us? */
if (getpid() != (pid_t) l) {
r = 0;
goto finish;
}
e = getenv("WATCHDOG_USEC");
if (!e) {
r = -EINVAL;
goto finish;
}
errno = 0;
ll = strtoull(e, &p, 10);
if (errno > 0) {
r = -errno;
goto finish;
}
if (!p || p == e || *p || ll <= 0) {
r = -EINVAL;
goto finish;
}
if (usec)
*usec = ll;
r = 1;
finish:
if (unset_environment) {
unsetenv("WATCHDOG_PID");
unsetenv("WATCHDOG_USEC");
}
return r;
#endif
}
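
Because the function reads the process environment directly, its PID check is awkward to exercise in isolation. The following sketch restates the same parsing steps as a pure function. `watchdog_check()` and its parameters are hypothetical and exist only for illustration: the two environment values and the caller's PID are passed in explicitly instead of coming from `getenv()` and `getpid()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical restatement of the parsing logic. pid_str and usec_str
 * stand in for getenv("WATCHDOG_PID") and getenv("WATCHDOG_USEC");
 * self stands in for getpid(). Returns > 0 if the watchdog is enabled
 * for `self`, 0 if not, and a negative errno-style code on malformed
 * input. */
static int watchdog_check(const char *pid_str, const char *usec_str,
                          pid_t self, uint64_t *usec) {
        unsigned long long ll;
        unsigned long l;
        char *p = NULL;

        assert(self > 0);

        if (!pid_str)
                return 0;

        errno = 0;
        l = strtoul(pid_str, &p, 10);
        if (errno > 0)
                return -errno;
        if (!p || p == pid_str || *p || l == 0)
                return -EINVAL;

        /* Variables were set for a different process: ignore them. */
        if (self != (pid_t) l)
                return 0;

        if (!usec_str)
                return -EINVAL;

        errno = 0;
        ll = strtoull(usec_str, &p, 10);
        if (errno > 0)
                return -errno;
        if (!p || p == usec_str || *p || ll == 0)
                return -EINVAL;

        if (usec)
                *usec = (uint64_t) ll;

        return 1;
}
```

With this shape, the inherited-environment case is a single call: passing a `pid_str` that names another process yields 0 regardless of what `usec_str` contains.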

View file

@ -186,6 +186,8 @@ int sd_is_socket_unix(int fd, int type, int listening, const char *path, size_t
the file descriptor is a POSIX Message Queue of the specified name,
0 otherwise. If path is NULL a message queue name check is not
done. Returns a negative errno style error code on failure.
See sd_is_mq(3) for more information.
*/
int sd_is_mq(int fd, const char *path);
@ -220,7 +222,8 @@ int sd_is_mq(int fd, const char *path);
WATCHDOG=1 Tells systemd to update the watchdog timestamp.
Services using this feature should do this in
regular intervals. A watchdog framework can use the
timestamps to detect failed services.
timestamps to detect failed services. Also see
sd_watchdog_enabled() below.
Daemons can choose to send additional variables. However, it is
recommended to prefix variable names not listed above with X_.
@ -275,6 +278,22 @@ int sd_notifyf(int unset_environment, const char *format, ...) _sd_printf_attr_(
*/
int sd_booted(void);
/*
Returns > 0 if the service manager expects watchdog keep-alive
events to be sent regularly via sd_notify(0, "WATCHDOG=1"). Returns
0 if it does not expect this. If the usec argument is non-NULL
returns the watchdog timeout in µs after which the service manager
will act on a process that has not sent a watchdog keep alive
message. This function is useful to implement services that
recognize automatically if they are being run under supervision of
systemd with WatchdogSec= set. It is recommended for clients to
generate keep-alive pings via sd_notify(0, "WATCHDOG=1") every half
of the returned time.
See sd_watchdog_enabled(3) for more information.
*/
int sd_watchdog_enabled(int unset_environment, uint64_t *usec);
#ifdef __cplusplus
}
#endif