parse_timestamp: accept RFC3339-style timezone and %FT%R[:%S[.%N]]

We already parsed what is essentially the RFC3339 format, except with a
space as the separator, which the RFC anticipates:
      NOTE: ISO 8601 defines date and time separated by "T".
      Applications using this syntax may choose, for the sake of
      readability, to specify a full-date and full-time separated by
      (say) a space character.
so now we handle both
  2012-11-23 11:12:13.456
  2012-11-23T11:12:13.456
as equivalent.
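
As a rough illustration of treating the two separators as equivalent, here
is a minimal standalone sketch. It is not the systemd code (the commit
itself tries several strptime() formats in turn); parse_date_time() and
digits() are hypothetical helpers, and the sketch only handles the full
"date, separator, time with seconds" shape without any range checks.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Convert n ASCII digits starting at p into an int. */
static int digits(const char *p, size_t n) {
        int v = 0;

        for (size_t i = 0; i < n; i++)
                v = v * 10 + (p[i] - '0');
        return v;
}

/* Hypothetical helper, not systemd's parser: accepts exactly
 * "YYYY-MM-DD HH:MM:SS" or "YYYY-MM-DDTHH:MM:SS" and fills in the
 * broken-down fields. Values are not range-checked (month 13 passes). */
static bool parse_date_time(const char *s, struct tm *ret) {
        /* 'd' = any digit, '?' = separator, anything else = literal */
        static const char pattern[] = "dddd-dd-dd?dd:dd:dd";

        assert(s);
        assert(ret);

        if (strlen(s) != sizeof(pattern) - 1)
                return false;

        for (size_t i = 0; pattern[i] != '\0'; i++) {
                if (pattern[i] == 'd') {
                        if (!isdigit((unsigned char) s[i]))
                                return false;
                } else if (pattern[i] == '?') {
                        /* Space or upper-case 'T' only; 't' is rejected. */
                        if (s[i] != ' ' && s[i] != 'T')
                                return false;
                } else if (s[i] != pattern[i])
                        return false;
        }

        memset(ret, 0, sizeof(*ret));
        ret->tm_year = digits(s, 4) - 1900;
        ret->tm_mon = digits(s + 5, 2) - 1;
        ret->tm_mday = digits(s + 8, 2);
        ret->tm_hour = digits(s + 11, 2);
        ret->tm_min = digits(s + 14, 2);
        ret->tm_sec = digits(s + 17, 2);
        ret->tm_isdst = -1; /* let a later mktime() decide about DST */
        return true;
}
```

With this sketch, "2012-11-23 11:12:13" and "2012-11-23T11:12:13" fill in
identical fields, while "2012-11-23t11:12:13" is refused.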

Parse directly-suffixed Z and +05:30 timezones as well:
  2012-11-23T11:12:13.456Z
  2012-11-23T11:12:13.456+02:00
as they're both defined by RFC3339.
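
The suffix detection can be sketched roughly as follows. This is an
illustration under assumptions, not the code from this commit:
split_welded_tz() is a hypothetical name, and where the commit hands the
last six characters to strptime() with "%z", the sketch checks the digits
by hand so that it stays within standard C.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: looks for a "Z" or "+hh:mm" / "-hh:mm" suffix that
 * directly follows the timestamp, with no space in between. On a match,
 * *ret_len is the length of the timestamp without the suffix and
 * *ret_gmtoff is the offset east of UTC in seconds. */
static bool split_welded_tz(const char *t, size_t *ret_len, long *ret_gmtoff) {
        size_t n;

        assert(t);
        assert(ret_len);
        assert(ret_gmtoff);

        n = strlen(t);

        /* Upper-case 'Z' only, and not preceded by a space. */
        if (n > 2 && t[n - 1] == 'Z' && t[n - 2] != ' ') {
                *ret_len = n - 1;
                *ret_gmtoff = 0;
                return true;
        }

        if (n > 7 && (t[n - 6] == '+' || t[n - 6] == '-') && t[n - 7] != ' ') {
                const char *z = t + n - 6; /* candidate "+hh:mm" */
                long hh, mm, off;

                if (!isdigit((unsigned char) z[1]) || !isdigit((unsigned char) z[2]) ||
                    z[3] != ':' ||
                    !isdigit((unsigned char) z[4]) || !isdigit((unsigned char) z[5]))
                        return false;

                hh = (z[1] - '0') * 10 + (z[2] - '0');
                mm = (z[4] - '0') * 10 + (z[5] - '0');
                if (hh > 23 || mm > 59)
                        return false;

                off = hh * 3600 + mm * 60;
                *ret_gmtoff = z[0] == '-' ? -off : off;
                *ret_len = n - 6;
                return true;
        }

        return false;
}
```

For "2012-11-23T11:12:13.456+02:00" the sketch reports a 23-character
timestamp and an offset of 7200 seconds; a plain date such as "2012-11-23"
is not mistaken for an offset because its tail lacks the ":".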

We do /not/ allow z or t; the RFC says
      NOTE: Per [ABNF] and ISO8601, the "T" and "Z" characters in this
      syntax may alternatively be lower case "t" or "z" respectively.

      This date/time format may be used in some environments or contexts
      that distinguish between the upper- and lower-case letters 'A'-'Z'
      and 'a'-'z' (e.g. XML).  Specifications that use this format in
      such environments MAY further limit the date/time syntax so that
      the letters 'T' and 'Z' used in the date/time syntax must always
      be upper case.  Applications that generate this format SHOULD use
      upper case letters.
We /are/ in a case-sensitive environment, neither is in widespread
use, and "z" poses an issue of whether "todayz" should be the same
as "todayZ" ("today UTC") or an error (it should be an error).

Fractional seconds are limited to six digits, although the RFC nominally
allows any number of them:
   time-secfrac    = "." 1*DIGIT
We only support 1µs-resolution timestamps, and our other sub-second
formats are limited to six digits as well.
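
To make the six-digit limit concrete, here is a small sketch of turning a
fractional part into microseconds. parse_secfrac() is a hypothetical
helper, and rejecting a seventh digit outright is an assumption of the
sketch about how the limit is enforced, not a statement about the systemd
implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: parses ".d[d[d[d[d[d]]]]]" into microseconds.
 * Returns false if there are no digits, more than six digits, or
 * anything after the digits. */
static bool parse_secfrac(const char *s, unsigned *ret_usec) {
        unsigned usec = 0;
        size_t n = 0;

        assert(s);
        assert(ret_usec);

        if (*s != '.')
                return false;
        s++;

        while (isdigit((unsigned char) s[n])) {
                if (n == 6) /* a seventh digit would be below 1µs */
                        return false;
                usec = usec * 10 + (unsigned) (s[n] - '0');
                n++;
        }

        if (n == 0 || s[n] != '\0')
                return false;

        /* Scale short fractions: ".52" is 520000µs, not 52µs. */
        for (; n < 6; n++)
                usec *= 10;

        *ret_usec = usec;
        return true;
}
```

In this sketch ".52" yields 520000, ".284029" yields 284029, and
".1234567" is refused.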

Parsing
  2012-11-23T11:12
is an extension in two ways (no seconds, no timezone),
mirroring our "canonical" format.

Fixes #5194
наб 2023-09-03 22:34:41 +02:00 committed by Zbigniew Jędrzejewski-Szmek
parent 82b7bf8c1c
commit ef658a63f8
3 changed files with 115 additions and 17 deletions

@ -110,7 +110,10 @@
<para>When parsing, systemd will accept a similar syntax, but expects no timezone specification, unless
it is given as the literal string <literal>UTC</literal> (for the UTC timezone), or is specified to be
the locally configured timezone, or the timezone name in the IANA timezone database format. The complete
the locally configured timezone, the timezone name in the IANA timezone database format,
or in the <ulink url="https://tools.ietf.org/html/rfc3339">RFC 3339</ulink> profile of ISO 8601, as
<literal>Z</literal> or <literal>+<replaceable>05</replaceable>:<replaceable>30</replaceable></literal>
appended directly after the timestamp. The complete
list of timezones supported on your system can be obtained using the <literal>timedatectl
list-timezones</literal> (see
<citerefentry><refentrytitle>timedatectl</refentrytitle><manvolnum>1</manvolnum></citerefentry>). Using
@ -154,6 +157,8 @@
<programlisting> Fri 2012-11-23 11:12:13 → Fri 2012-11-23 11:12:13
2012-11-23 11:12:13 → Fri 2012-11-23 11:12:13
2012-11-23 11:12:13 UTC → Fri 2012-11-23 19:12:13
2012-11-23T11:12:13Z → Fri 2012-11-23 19:12:13
2012-11-23T11:12+02:00 → Fri 2012-11-23 17:12:00
2012-11-23 → Fri 2012-11-23 00:00:00
12-11-23 → Fri 2012-11-23 00:00:00
11:12:13 → Fri 2012-11-23 11:12:13

@ -632,7 +632,7 @@ char* format_timespan(char *buf, size_t l, usec_t t, usec_t accuracy) {
static int parse_timestamp_impl(
const char *t,
size_t tz_offset,
size_t max_len,
bool utc,
int isdst,
long gmtoff,
@ -669,8 +669,12 @@ static int parse_timestamp_impl(
/* Allowed syntaxes:
*
* 2012-09-22 16:34:22
* 2012-09-22 16:34:22.1[2[3[4[5[6]]]]]
* 2012-09-22 16:34:22 (µsec will be set to 0)
* 2012-09-22 16:34 (seconds will be set to 0)
* 2012-09-22T16:34:22.1[2[3[4[5[6]]]]]
* 2012-09-22T16:34:22 (µsec will be set to 0)
* 2012-09-22T16:34 (seconds will be set to 0)
* 2012-09-22 (time will be set to 00:00:00)
* 16:34:22 (date will be set to today)
* 16:34 (date will be set to today, seconds to 0)
@ -684,17 +688,26 @@ static int parse_timestamp_impl(
*
* Note, on DST change, 00:00:00 may not exist and in that case the time part may be shifted.
* E.g. "Sun 2023-03-13 America/Havana" is parsed as "Sun 2023-03-13 01:00:00 CDT".
*
* A simplified strptime-spelled RFC3339 ABNF looks like
* "%Y-%m-%d" "T" "%H" ":" "%M" ":" "%S" [".%N"] ("Z" / (("+" / "-") "%H:%M"))
* We additionally allow no seconds and inherited timezone
* for symmetry with our other syntaxes and improved interactive usability:
* "%Y-%m-%d" "T" "%H" ":" "%M" [":" "%S" [".%N"]] ["Z" / (("+" / "-") "%H:%M")]
* RFC3339 defines time-secfrac as "." 1*DIGIT, but we limit to 6 digits,
* since we're limited to 1µs resolution.
* We also accept "Sat 2012-09-22T16:34:22", although RFC3339 warns against it.
*/
assert(t);
if (tz_offset != SIZE_MAX) {
if (max_len != SIZE_MAX) {
/* If the input string contains timezone, then cut it here. */
if (tz_offset <= 1) /* timezone must be after a space. */
if (max_len == 0) /* Can't be the only field */
return -EINVAL;
t_alloc = strndup(t, tz_offset - 1);
t_alloc = strndup(t, max_len);
if (!t_alloc)
return -ENOMEM;
@ -806,6 +819,7 @@ static int parse_timestamp_impl(
goto from_tm;
}
/* Our "canonical" RFC3339 syntax variant */
tm = copy;
k = strptime(t, "%Y-%m-%d %H:%M:%S", &tm);
if (k) {
@ -815,6 +829,16 @@ static int parse_timestamp_impl(
goto from_tm;
}
/* RFC3339 syntax */
tm = copy;
k = strptime(t, "%Y-%m-%dT%H:%M:%S", &tm);
if (k) {
if (*k == '.')
goto parse_usec;
else if (*k == 0)
goto from_tm;
}
/* Support OUTPUT_SHORT and OUTPUT_SHORT_PRECISE formats */
tm = copy;
k = strptime(t, "%b %d %H:%M:%S", &tm);
@ -832,6 +856,7 @@ static int parse_timestamp_impl(
goto from_tm;
}
/* Our "canonical" RFC3339 syntax variant without seconds */
tm = copy;
k = strptime(t, "%Y-%m-%d %H:%M", &tm);
if (k && *k == 0) {
@ -839,6 +864,14 @@ static int parse_timestamp_impl(
goto from_tm;
}
/* RFC3339 syntax without seconds */
tm = copy;
k = strptime(t, "%Y-%m-%dT%H:%M", &tm);
if (k && *k == 0) {
tm.tm_sec = 0;
goto from_tm;
}
tm = copy;
k = strptime(t, "%y-%m-%d", &tm);
if (k && *k == 0) {
@ -941,13 +974,13 @@ static int parse_timestamp_maybe_with_tz(const char *t, size_t tz_offset, bool v
continue;
/* The specified timezone matches tzname[] of the local timezone. */
return parse_timestamp_impl(t, tz_offset, /* utc = */ false, /* isdst = */ j, /* gmtoff = */ 0, ret);
return parse_timestamp_impl(t, tz_offset - 1, /* utc = */ false, /* isdst = */ j, /* gmtoff = */ 0, ret);
}
/* If we know that the last word is a valid timezone (e.g. Asia/Tokyo), then simply drop the timezone
* and parse the remaining string as a local time. If we know that the last word is not a timezone,
* then assume that it is a part of the time and try to parse the whole string as a local time. */
return parse_timestamp_impl(t, valid_tz ? tz_offset : SIZE_MAX,
return parse_timestamp_impl(t, valid_tz ? tz_offset - 1 : SIZE_MAX,
/* utc = */ false, /* isdst = */ -1, /* gmtoff = */ 0, ret);
}
@ -959,40 +992,50 @@ typedef struct ParseTimestampResult {
int parse_timestamp(const char *t, usec_t *ret) {
ParseTimestampResult *shared, tmp;
const char *k, *tz, *current_tz;
size_t tz_offset;
size_t max_len, t_len;
struct tm tm;
int r;
assert(t);
t_len = strlen(t);
if (t_len > 2 && t[t_len - 1] == 'Z' && t[t_len - 2] != ' ') /* RFC3339-style welded UTC: "1985-04-12T23:20:50.52Z" */
return parse_timestamp_impl(t, t_len - 1, /* utc = */ true, /* isdst = */ -1, /* gmtoff = */ 0, ret);
if (t_len > 7 && IN_SET(t[t_len - 6], '+', '-') && t[t_len - 7] != ' ') { /* RFC3339-style welded offset: "1990-12-31T15:59:60-08:00" */
k = strptime(&t[t_len - 6], "%z", &tm);
if (k && *k == '\0')
return parse_timestamp_impl(t, t_len - 6, /* utc = */ true, /* isdst = */ -1, /* gmtoff = */ tm.tm_gmtoff, ret);
}
tz = strrchr(t, ' ');
if (!tz)
return parse_timestamp_impl(t, /* tz_offset = */ SIZE_MAX, /* utc = */ false, /* isdst = */ -1, /* gmtoff = */ 0, ret);
return parse_timestamp_impl(t, /* max_len = */ SIZE_MAX, /* utc = */ false, /* isdst = */ -1, /* gmtoff = */ 0, ret);
max_len = tz - t;
tz++;
tz_offset = tz - t;
/* Shortcut, parse the string as UTC. */
if (streq(tz, "UTC"))
return parse_timestamp_impl(t, tz_offset, /* utc = */ true, /* isdst = */ -1, /* gmtoff = */ 0, ret);
return parse_timestamp_impl(t, max_len, /* utc = */ true, /* isdst = */ -1, /* gmtoff = */ 0, ret);
/* If the timezone is compatible with RFC-822/ISO 8601 (e.g. +06, or -03:00) then parse the string as
* UTC and shift the result. Note, this must be earlier than the timezone check with tzname[], as
* tzname[] may be in the same format. */
k = strptime(tz, "%z", &tm);
if (k && *k == '\0')
return parse_timestamp_impl(t, tz_offset, /* utc = */ true, /* isdst = */ -1, /* gmtoff = */ tm.tm_gmtoff, ret);
return parse_timestamp_impl(t, max_len, /* utc = */ true, /* isdst = */ -1, /* gmtoff = */ tm.tm_gmtoff, ret);
/* If the last word is not a timezone file (e.g. Asia/Tokyo), then let's check if it matches
* tzname[] of the local timezone, e.g. JST or CEST. */
if (!timezone_is_valid(tz, LOG_DEBUG))
return parse_timestamp_maybe_with_tz(t, tz_offset, /* valid_tz = */ false, ret);
return parse_timestamp_maybe_with_tz(t, tz - t, /* valid_tz = */ false, ret);
/* Shortcut. If the current $TZ is equivalent to the specified timezone, it is not necessary to fork
* the process. */
current_tz = getenv("TZ");
if (current_tz && *current_tz == ':' && streq(current_tz + 1, tz))
return parse_timestamp_maybe_with_tz(t, tz_offset, /* valid_tz = */ true, ret);
return parse_timestamp_maybe_with_tz(t, tz - t, /* valid_tz = */ true, ret);
/* Otherwise, to avoid polluting the current environment variables, let's fork the process and set
* the specified timezone in the child process. */
@ -1017,7 +1060,7 @@ int parse_timestamp(const char *t, usec_t *ret) {
_exit(EXIT_FAILURE);
}
shared->return_value = parse_timestamp_maybe_with_tz(t, tz_offset, /* valid_tz = */ true, &shared->usec);
shared->return_value = parse_timestamp_maybe_with_tz(t, tz - t, /* valid_tz = */ true, &shared->usec);
_exit(EXIT_SUCCESS);
}

@ -673,7 +673,7 @@ static bool timezone_equal(usec_t today, usec_t target) {
}
static void test_parse_timestamp_impl(const char *tz) {
usec_t today, now_usec;
usec_t today, today2, now_usec;
/* Invalid: Ensure that systemctl reboot --when=show and --when=cancel
* will not result in ambiguities */
@ -701,6 +701,56 @@ static void test_parse_timestamp_impl(const char *tz) {
test_parse_timestamp_one("70-01-01 00:00:01.001 UTC", 0, USEC_PER_SEC + 1000);
test_parse_timestamp_one("70-01-01 00:00:01.0010 UTC", 0, USEC_PER_SEC + 1000);
/* Examples from RFC3339 */
test_parse_timestamp_one("1985-04-12T23:20:50.52Z", 0, 482196050 * USEC_PER_SEC + 520000);
test_parse_timestamp_one("1996-12-19T16:39:57-08:00", 0, 851042397 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("1996-12-20T00:39:57Z", 0, 851042397 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("1990-12-31T23:59:60Z", 0, 662688000 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("1990-12-31T15:59:60-08:00", 0, 662688000 * USEC_PER_SEC + 000000);
assert_se(parse_timestamp("1937-01-01T12:00:27.87+00:20", NULL) == -EINVAL); /* we don't support pre-epoch timestamps */
/* We accept timestamps without seconds as well */
test_parse_timestamp_one("1996-12-20T00:39Z", 0, (851042397 - 57) * USEC_PER_SEC + 000000);
test_parse_timestamp_one("1990-12-31T15:59-08:00", 0, (662688000-60) * USEC_PER_SEC + 000000);
/* We drop day-of-week before parsing the timestamp */
test_parse_timestamp_one("Thu 1970-01-01T00:01 UTC", 0, USEC_PER_MINUTE);
test_parse_timestamp_one("Thu 1970-01-01T00:00:01 UTC", 0, USEC_PER_SEC);
test_parse_timestamp_one("Thu 1970-01-01T00:01Z", 0, USEC_PER_MINUTE);
test_parse_timestamp_one("Thu 1970-01-01T00:00:01Z", 0, USEC_PER_SEC);
/* RFC3339-style timezones can be welded to all formats */
assert_se(parse_timestamp("today UTC", &today) == 0);
assert_se(parse_timestamp("todayZ", &today2) == 0);
assert_se(today == today2);
assert_se(parse_timestamp("today +0200", &today) == 0);
assert_se(parse_timestamp("today+02:00", &today2) == 0);
assert_se(today == today2);
/* https://ijmacd.github.io/rfc3339-iso8601/ */
test_parse_timestamp_one("2023-09-06 12:49:27-00:00", 0, 1694004567 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("2023-09-06 12:49:27.284-00:00", 0, 1694004567 * USEC_PER_SEC + 284000);
test_parse_timestamp_one("2023-09-06 12:49:27.284029Z", 0, 1694004567 * USEC_PER_SEC + 284029);
test_parse_timestamp_one("2023-09-06 12:49:27.284Z", 0, 1694004567 * USEC_PER_SEC + 284000);
test_parse_timestamp_one("2023-09-06 12:49:27.28Z", 0, 1694004567 * USEC_PER_SEC + 280000);
test_parse_timestamp_one("2023-09-06 12:49:27.2Z", 0, 1694004567 * USEC_PER_SEC + 200000);
test_parse_timestamp_one("2023-09-06 12:49:27Z", 0, 1694004567 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("2023-09-06 14:49:27+02:00", 0, 1694004567 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("2023-09-06 14:49:27.2+02:00", 0, 1694004567 * USEC_PER_SEC + 200000);
test_parse_timestamp_one("2023-09-06 14:49:27.28+02:00", 0, 1694004567 * USEC_PER_SEC + 280000);
test_parse_timestamp_one("2023-09-06 14:49:27.284+02:00", 0, 1694004567 * USEC_PER_SEC + 284000);
test_parse_timestamp_one("2023-09-06 14:49:27.284029+02:00", 0, 1694004567 * USEC_PER_SEC + 284029);
test_parse_timestamp_one("2023-09-06T12:49:27+00:00", 0, 1694004567 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("2023-09-06T12:49:27-00:00", 0, 1694004567 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("2023-09-06T12:49:27.284+00:00", 0, 1694004567 * USEC_PER_SEC + 284000);
test_parse_timestamp_one("2023-09-06T12:49:27.284-00:00", 0, 1694004567 * USEC_PER_SEC + 284000);
test_parse_timestamp_one("2023-09-06T12:49:27.284029Z", 0, 1694004567 * USEC_PER_SEC + 284029);
test_parse_timestamp_one("2023-09-06T12:49:27.284Z", 0, 1694004567 * USEC_PER_SEC + 284000);
test_parse_timestamp_one("2023-09-06T12:49:27.28Z", 0, 1694004567 * USEC_PER_SEC + 280000);
test_parse_timestamp_one("2023-09-06T12:49:27.2Z", 0, 1694004567 * USEC_PER_SEC + 200000);
test_parse_timestamp_one("2023-09-06T12:49:27Z", 0, 1694004567 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("2023-09-06T14:49:27+02:00", 0, 1694004567 * USEC_PER_SEC + 000000);
test_parse_timestamp_one("2023-09-06T14:49:27.284+02:00", 0, 1694004567 * USEC_PER_SEC + 284000);
test_parse_timestamp_one("2023-09-06T14:49:27.284029+02:00", 0, 1694004567 * USEC_PER_SEC + 284029);
test_parse_timestamp_one("2023-09-06T21:34:27+08:45", 0, 1694004567 * USEC_PER_SEC + 000000);
if (timezone_is_valid("Asia/Tokyo", LOG_DEBUG)) {
/* Asia/Tokyo (+0900) */
test_parse_timestamp_one("Thu 1970-01-01 09:01 Asia/Tokyo", 0, USEC_PER_MINUTE);