@citrus-it citrus-it commented Feb 27, 2025

For consistency of time within the rack we must guard against there ever
being two authoritative sources of time. We currently have two (admittedly
edge) cases where this can occur.

  1. When, some time after everything is synchronised, one or both of the
    boundary NTP servers loses upstream connectivity, but continues to
    advertise at the same stratum as its clock begins to drift.
  2. If both boundary NTP servers lose connectivity and fall back to their
    local clocks, advertising them with stratum 10, they will both be
    authoritative but with potentially different times.

This change addresses both of these by updating the NTP server configuration
in a number of ways.

  1. Adding each boundary server as a source to the other;
  2. Configuring the boundary "local" sources with the "orphan" flag that
    causes selection of just one as authoritative if both are in that mode;
  3. Configuring RSS sources with a new "failfast" flag that causes them to
    be discounted quickly (marked as "unselectable") once they are considered
    unreachable, instead of waiting for their "distance" to degrade over time;
  4. Adjusting the root dispersion decay rate from chrony's default of 1µs/s
    (versus the RFC-recommended default of 15µs/s) up to 60µs/s to achieve
    faster source reselection.
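
To give a feel for why the dispersion rate in item 4 matters, and why it is paired with the failfast flag, here is a rough sketch of how long a source would take to exceed a distance threshold through dispersion growth alone. It assumes linear growth at the configured rate and an illustrative 3-second threshold; the function name and the threshold are assumptions made for this example, and a real root distance also includes delay and skew terms that are ignored here.

```python
def seconds_until_threshold(threshold_us: int, rate_us_per_s: int) -> float:
    """Seconds for dispersion to grow by threshold_us at rate_us_per_s.

    Simplified model: linear growth only, no delay or skew contribution.
    """
    return threshold_us / rate_us_per_s


THRESHOLD_US = 3_000_000  # illustrative 3 s threshold (assumption)

# Rates taken from the description above, in microseconds per second.
RATES = [("chrony default", 1), ("RFC default", 15), ("this change", 60)]

for label, rate in RATES:
    secs = seconds_until_threshold(THRESHOLD_US, rate)
    print(f"{label:>14}: {rate:>2} us/s -> {secs / 3600:.1f} hours")
```

Under these assumptions the three rates work out to roughly 833, 56 and 14 hours respectively, so a faster rate helps but would not by itself give a quick switch. That fits the description of failfast as discounting an unreachable source quickly instead of waiting for its distance to degrade.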

This is partially in response to the problems encountered in #7534.

@citrus-it
Contributor Author

The other part of this change is oxidecomputer/helios-omnios-build#49, which updates chrony and adds the new failfast source option.

@citrus-it
Contributor Author

I've tested this by deploying to the london environment and everything works as expected.

On a boundary NTP zone, /etc/resolv.conf contains the expected five nameserver entries:

root@oxz_ntp_9575937a:~# ls
root@oxz_ntp_9575937a:~# cat /etc/resolv.conf
nameserver fd00:1122:3344:1::1
nameserver fd00:1122:3344:2::1
nameserver fd00:1122:3344:3::1
nameserver 1.1.1.1
nameserver 9.9.9.9

and both the external NTP server and the peer boundary are being used:

chronyc> sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 172.20.0.5                    3   0   377     0   +124ns[ +102ns] +/-   16ms
^+ fd00:1122:3344:102::f         4   6   377    48  +8628ns[-7787ns] +/-   16ms
^? oxz_ntp_9575937a-8c8f-4c>     0   9   377     -     +0ns[   +0ns] +/-    0ns

chronyc> selectdata -v
  . State: N - noselect, s - unsynchronised, M - missing samples,
 /         d/D - large distance, ~ - jittery, w/W - waits for others,
|          S - stale, O - orphan, T - not trusted, P - not preferred,
|          U - waits for update,, x - falseticker, + - combined, * - best.
|   Effective options   ---------.  (N - noselect, P - prefer
|   Configured options  ----.     \  T - trust, R - require)
|   Auth. enabled (Y/N) -.   \     \     Offset interval --.
|                        |    |     |                       |
S Name/IP Address        Auth COpts EOpts Last Score     Interval  Leap
=======================================================================
* 172.20.0.5                N ----- -----    0   1.0   -16ms   +16ms  N
+ fd00:1122:3344:102::f     N ----- -----   24   1.0   -16ms   +16ms  N
M oxz_ntp_9575937a-8c8f-4c> N ----- -----    0   1.0    +0ns    +0ns  N

As expected, chrony sees itself among its sources and has marked that entry as unusable (the ? against it in the sources output, and M for missing samples in selectdata).
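
For readers less familiar with the sources table, the Reach and Poll columns can be decoded with a small helper. This is only a sketch: the function names are made up for this example, and it relies on the legend above (Reach is an octal reachability register, Poll is log2 of the polling interval in seconds) plus the register being 8 bits wide.

```python
def reach_successes(reach_octal: str) -> int:
    """Count successful polls recorded in the 8-bit reachability register."""
    register = int(reach_octal, 8) & 0xFF  # Reach is printed in octal
    return bin(register).count("1")


def poll_interval_seconds(poll_log2: int) -> float:
    """Convert the Poll column (log2 of the interval) to seconds."""
    return 2.0 ** poll_log2


# Values taken from the sources output above.
print(reach_successes("377"), "of the last 8 polls answered")
print(poll_interval_seconds(6), "seconds between polls")
```

Read this way, a Reach of 377 means all of the last eight polls were answered, and a Reach of 0 means none were.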

Compare this with an internal NTP zone:

root@oxz_ntp_c254a759:~# cat /etc/resolv.conf
nameserver fd00:1122:3344:1::1
nameserver fd00:1122:3344:2::1
nameserver fd00:1122:3344:3::1

root@oxz_ntp_c254a759:~# chronyc
chrony version 4.6.1
Copyright (C) 1997-2003, 2007, 2009-2024 Richard P. Curnow and others
chrony comes with ABSOLUTELY NO WARRANTY.  This is free software, and
you are welcome to redistribute it under certain conditions.  See the
GNU General Public License version 2 for details.

chronyc> sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* fd00:1122:3344:102::f         4   7   377    46  -4260ns[-5105ns] +/-   16ms
^+ fd00:1122:3344:101::10        4   6   377    59  -1170ns[-1994ns] +/-   16ms

chronyc> selectdata -v
  . State: N - noselect, s - unsynchronised, M - missing samples,
 /         d/D - large distance, ~ - jittery, w/W - waits for others,
|          S - stale, O - orphan, T - not trusted, P - not preferred,
|          U - waits for update,, x - falseticker, + - combined, * - best.
|   Effective options   ---------.  (N - noselect, P - prefer
|   Configured options  ----.     \  T - trust, R - require)
|   Auth. enabled (Y/N) -.   \     \     Offset interval --.
|                        |    |     |                       |
S Name/IP Address        Auth COpts EOpts Last Score     Interval  Leap
=======================================================================
* fd00:1122:3344:102::f     N ----- -----    0   1.0   -16ms   +16ms  N
+ fd00:1122:3344:101::10    N ----- -----   12   1.0   -16ms   +16ms  N

Disabling access to the boundary's upstream results in a rapid switch to using the other boundary as the source:

root@oxz_ntp_9575937a:~# route add -host 172.20.0.5 127.0.0.1 -blackhole
add host 172.20.0.5: gateway 127.0.0.1

chronyc> tracking
Reference ID    : 9F7807A6 (fd00:1122:3344:102::f)
Stratum         : 5
Ref time (UTC)  : Fri Mar 28 21:21:29 2025
System time     : 0.000007926 seconds slow of NTP time
Last offset     : -0.000008382 seconds
RMS offset      : 0.000004284 seconds
Frequency       : 47.575 ppm slow
Residual freq   : -0.002 ppm
Skew            : 0.561 ppm
Root delay      : 0.029122135 seconds
Root dispersion : 0.002157714 seconds
Update interval : 54.9 seconds
Leap status     : Normal
chronyc>
chronyc> sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? 172.20.0.5                    3   4     0   111  +9244ns[-1571ns] +/-   16ms
^* fd00:1122:3344:102::f         4   6   377    56  +6699ns[-1683ns] +/-   16ms
^? oxz_ntp_9575937a-8c8f-4c>     0   9   377     -     +0ns[   +0ns] +/-    0ns

chronyc> selectdata -v
  . State: N - noselect, s - unsynchronised, M - missing samples,
 /         d/D - large distance, ~ - jittery, w/W - waits for others,
|          S - stale, O - orphan, T - not trusted, P - not preferred,
|          U - waits for update,, x - falseticker, + - combined, * - best.
|   Effective options   ---------.  (N - noselect, P - prefer
|   Configured options  ----.     \  T - trust, R - require)
|   Auth. enabled (Y/N) -.   \     \     Offset interval --.
|                        |    |     |                       |
S Name/IP Address        Auth COpts EOpts Last Score     Interval  Leap
=======================================================================
N 172.20.0.5                N ----- -----   99   1.0   -23ms   +22ms  N
* fd00:1122:3344:102::f     N ----- -----   44   1.0   -16ms   +16ms  N
M oxz_ntp_9575937a-8c8f-4c> N ----- -----    0   1.0    +0ns    +0ns  N
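
The state column of the selectdata output above can also be read mechanically. The following is a minimal sketch, not part of the change itself: the helper name is hypothetical, the state meanings are copied from the legend chronyc prints, and it assumes the whitespace-separated row layout shown above.

```python
# State letters and meanings, copied from the selectdata -v legend.
STATE_MEANINGS = {
    "N": "noselect", "s": "unsynchronised", "M": "missing samples",
    "d": "large distance", "D": "large distance", "~": "jittery",
    "w": "waits for others", "W": "waits for others", "S": "stale",
    "O": "orphan", "T": "not trusted", "P": "not preferred",
    "U": "waits for update", "x": "falseticker", "+": "combined",
    "*": "best",
}


def parse_selectdata_row(line: str) -> tuple[str, str]:
    """Return (source name, state meaning) for one selectdata row."""
    fields = line.split()
    state, name = fields[0], fields[1]
    return name, STATE_MEANINGS.get(state, "unknown")


rows = [
    "N 172.20.0.5                N ----- -----   99   1.0   -23ms   +22ms  N",
    "* fd00:1122:3344:102::f     N ----- -----   44   1.0   -16ms   +16ms  N",
]
for row in rows:
    print(parse_selectdata_row(row))
```

Applied to the rows above, the blackholed upstream reads as noselect and the peer boundary as the best source.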

I also tested removing upstream connectivity from both boundary zones and confirmed that one of them became authoritative, with the other taking time from it.
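
As a rough illustration of that outcome: chrony's documentation describes orphan mode as leaving the local reference active only on the server with the smallest reference ID, with the others synchronising to it. The sketch below models just that rule. The server names and reference IDs are hypothetical and were not taken from the test environment.

```python
def orphan_authority(reference_ids: dict[str, int]) -> str:
    """Pick the server that keeps its local reference active in orphan mode.

    Models only the documented "smallest reference ID wins" rule.
    """
    return min(reference_ids, key=lambda name: reference_ids[name])


# Hypothetical boundary servers and reference IDs, for illustration only.
boundaries = {"boundary-a": 0x9F7807A6, "boundary-b": 0x2B1C44D0}

authority = orphan_authority(boundaries)
followers = sorted(name for name in boundaries if name != authority)
print(f"authoritative: {authority}; following: {followers}")
```

Because every server evaluates the same rule over the same set of IDs, they agree on a single authority without any extra coordination, which is the property the "orphan" flag is relied on for here.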

@citrus-it citrus-it marked this pull request as ready for review March 28, 2025 21:30
@citrus-it
Contributor Author

This is ready for review, and for integration once the buildomat images are updated to include the new chrony version.

@citrus-it citrus-it merged commit acdcbd9 into main Apr 9, 2025
16 checks passed
@citrus-it citrus-it deleted the andy/ntp branch April 9, 2025 21:29