Linux时间服务

4 minute read

systemd-timesyncd

首先查看systemd-timesyncd服务,出现异常,通过search发现systemd-timesyncd和ntpd有互斥关系,在systemd-timesyncd服务启动的时候,如果发现有ntpd服务,就不启动了。

root@ip:~# systemctl status systemd-timesyncd
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/systemd-timesyncd.service.d
           └─disable-with-time-daemon.conf
   Active: inactive (dead)
Condition: start condition failed at Sat 2020-03-21 14:11:07 UTC; 29s ago
           └─ ConditionFileIsExecutable=!/usr/sbin/ntpd was not met
     Docs: man:systemd-timesyncd.service(8)

ntp

接着排查ntpd服务

# systemctl发现ntp服务是异常状态
root@ip:~# systemctl status ntp
● ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp; generated; vendor preset: enabled)
   Active: active (running) since Sat 2020-03-14 10:20:04 UTC; 1 weeks 0 days ago
     Docs: man:systemd-sysv-generator(8)
    Tasks: 2 (limit: 5529)
   Memory: 2.4M
      CPU: 28.491s
   CGroup: /system.slice/ntp.service
           └─2785 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 105:109

Mar 21 12:20:51 ip ntpd[2785]: Deleting interface #55 cali00c42313d69, fe80::ecee:eeff:feee:eeee%39#123, interface stats: received=0, sent=0, dropped=0, active_time=3
1589 secs
Mar 21 12:22:55 ip ntpd[2785]: Listen normally on 60 cali932a6d7b8c8 [fe80::ecee:eeff:feee:eeee%42]:123
Mar 21 12:27:16 ip ntpd[2785]: bind(37) AF_INET6 fe80::ecee:eeff:feee:eeee%43#123 flags 0x11 failed: Cannot assign requested address
Mar 21 12:27:16 ip ntpd[2785]: unable to create socket on calic18accdd6e8 (61) for fe80::ecee:eeff:feee:eeee%43#123
Mar 21 12:27:16 ip ntpd[2785]: failed to init interface for address fe80::ecee:eeff:feee:eeee%43
Mar 21 12:27:18 ip ntpd[2785]: Listen normally on 62 calic18accdd6e8 [fe80::ecee:eeff:feee:eeee%43]:123
Mar 21 12:28:04 ip ntpd[2785]: bind(40) AF_INET6 fe80::ecee:eeff:feee:eeee%44#123 flags 0x11 failed: Cannot assign requested address
Mar 21 12:28:04 ip ntpd[2785]: unable to create socket on cali775db44cfbe (63) for fe80::ecee:eeff:feee:eeee%44#123
Mar 21 12:28:04 ip ntpd[2785]: failed to init interface for address fe80::ecee:eeff:feee:eeee%44
Mar 21 12:28:06 ip ntpd[2785]: Listen normally on 64 cali775db44cfbe [fe80::ecee:eeff:feee:eeee%44]:123

# 通过网上的线索,发现ntp服务的ipv6问题,修改/etc/default/ntp为
# NTPD_OPTS='-4 -g'
# the option -4 says to the ntpd to only listen to ipv4 .

root@ip:~# systemctl status ntp
● ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp; generated; vendor preset: enabled)
   Active: active (running) since Sat 2020-03-21 14:21:12 UTC; 9s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 26121 ExecStop=/etc/init.d/ntp stop (code=exited, status=0/SUCCESS)
  Process: 26161 ExecStart=/etc/init.d/ntp start (code=exited, status=0/SUCCESS)
    Tasks: 2 (limit: 5529)
   Memory: 2.4M
      CPU: 22ms
   CGroup: /system.slice/ntp.service
           └─26182 /usr/sbin/ntpd -p /var/run/ntpd.pid -4 -g -u 105:109

Mar 21 14:21:15 ip ntpd[26182]: Soliciting pool server 129.250.35.251
Mar 21 14:21:15 ip ntpd[26182]: Soliciting pool server 13.230.38.136
Mar 21 14:21:16 ip ntpd[26182]: Soliciting pool server 162.159.200.1
Mar 21 14:21:16 ip ntpd[26182]: Soliciting pool server 153.127.161.248
Mar 21 14:21:16 ip ntpd[26182]: Soliciting pool server 202.181.103.212
Mar 21 14:21:16 ip ntpd[26182]: Soliciting pool server 23.152.160.126
Mar 21 14:21:17 ip ntpd[26182]: Soliciting pool server 45.76.244.202
Mar 21 14:21:17 ip ntpd[26182]: Soliciting pool server 129.250.35.250
Mar 21 14:21:18 ip ntpd[26182]: Soliciting pool server 162.251.160.61
Mar 21 14:21:18 ip ntpd[26182]: Soliciting pool server 2401:c080:1400:4d1b:5400:2ff:fe81:9d6b

  RTC(Real-Time Clock)或CMOS时间,一般在主板上靠电池供电,服务器断电后也会继续运行。仅保存日期时间数值,无法保存时区和夏令时设置。

chrony

NTP服务搞定之后,服务器时间还是无法保持一致;怀疑是因为连接的不同的ntp服务器导致的,因此继续调查,发现在aws上已经不建议使用ntp,而建议使用Amazon Time Sync Service with chrony,详细配置文档blog

⚠️,一定要注意centos/Amazon Linux AMI配置文件是在/etc/chrony.conf,而debian/ubuntu是在/etc/chrony/chrony.conf,安装的过程中一定要使用chronyc trackingchronyc sources -v检查utc时间源是不是169.254.169.123

## cat init_machine.sh

# stop ntp
sudo systemctl stop ntp
sudo apt-get remove ntp -y

# start chrony
sudo apt-get install chrony -y
sudo sed -i "/pool 2.debian.pool.ntp.org iburst/i\server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4" /etc/chrony/chrony.conf
sudo systemctl restart chronyd
sudo systemctl enable chrony

# stop systemd-timesyncd
sudo systemctl stop systemd-timesyncd
sudo systemctl disable systemd-timesyncd

sudo apt-get install dbus -y
sudo timedatectl
sudo chronyc tracking

## 使用kubectl执行所有的命令
kubectl get nodes|grep northeast-1|cut -d " " -f 1|while read host;do scp init_machine.sh admin@$host:~;ssh admin@$host sh init_machine.sh</dev/null;done

## 通过date检查时间的准确性
kubectl get nodes|grep northeast-1|cut -d " " -f 1|while read host;do echo "$host ";ssh admin@$host date +%s.%N</dev/null;date  +%s.%N;done

## 通过chronyc tracking检查时间的准确性
kubectl get nodes|grep northeast-1|cut -d " " -f 1|while read host;do echo "$host ";ssh admin@$host sudo chronyc tracking </dev/null;done

使用chrony的过程中遇到一个问题,k8s debian服务器在重启之后,ntp会自动装上,然后chrony会被unmask,调查了很多地方,没有找到线索。

apt-get remove ntp

apt-get install chrony
# 检查chrony配置第一行是不是
/etc/chrony/chrony.conf
server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4

systemctl unmask chrony.service
systemctl enable chrony.service
systemctl start chrony.service
systemctl status chrony.service

自建时间服务器

如果完成上面这些动作之后,时间的一致性还是满足不了需求的话,可能需要从自己的时间服务器上进行同步,local同步应该要好很多,具体操作方法如下:

Given this situation using the link local was not a viable method to sync each EC2’s time as that references the underlying host which is different for some of the EC2s. We explored looking into using an external NTP source to pull data from however, I did find a method in my environment that appeared to work.

The method I had was to use one instance in my VPC as the “master” server to pull time and serve up the time for the other nodes with the following changes in the /etc/chrony/chrony.conf file with:

server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4 allow 172.31.0.0/16

This would pull from link local as you are aware but you would also need to add in a line for your VPC, in my case it was 172.31.0.0/16 for my VPC, your’s may be different. I also had to allow a new UDP rule for port 123 inbound for 172.31.0.0/16 in my VPC security group. Depending on the rules in your environment outbound 123 for VPC traffic may also need to be enabled.

On the “client” servers I sourced the main machine by changing the /etc/chrony/chrony.conf with the following:

server 172.31.44.125 prefer iburst minpoll 4 maxpoll 4

This would pull from the “master” server so that the time would be in sync every 16.2 seconds interval, you may change this by editing the minpoll/maxpoll for your use case[1].

As mentioned on the chat, this is normally a system administration task[2], but I have given a best effort to come up with a workaround to find an option to have time synced for your environment.

Tags: ,

Categories: , ,

Updated: