Wazo-consul flooding log after restore (Wazo 22.01)

Hi,

We had an issue with our production VM, so we installed a new one and restored a backup (same hostname, same IP address).

Everything is working, but while checking the logs today I saw this in the syslog:


Jan 26 13:04:06 DV2Fv-Wazo consul[14249]:     2022/01/26 13:04:06 [ERR] agent: failed to sync changes: No cluster leader
Jan 26 13:04:06 DV2Fv-Wazo consul[14249]:     2022/01/26 13:04:06 [ERR] agent: failed to sync changes: No cluster leader
Jan 26 13:04:06 DV2Fv-Wazo consul[14249]:     2022/01/26 13:04:06 [ERR] agent: failed to sync remote state: No cluster leader
Jan 26 13:04:06 DV2Fv-Wazo consul[14249]:     2022/01/26 13:04:06 [ERR] agent: failed to sync changes: No cluster leader
Jan 26 13:04:06 DV2Fv-Wazo consul[14249]:     2022/01/26 13:04:06 [ERR] agent: failed to sync changes: No cluster leader
Jan 26 13:04:06 DV2Fv-Wazo consul[14249]:     2022/01/26 13:04:06 [ERR] agent: failed to sync changes: No cluster leader
Jan 26 13:04:06 DV2Fv-Wazo consul[14249]:     2022/01/26 13:04:06 [ERR] agent: failed to sync changes: No cluster leader
Jan 26 13:04:07 DV2Fv-Wazo systemd[1]: consul.service: Main process exited, code=exited, status=1/FAILURE
Jan 26 13:04:07 DV2Fv-Wazo systemd[1]: consul.service: Failed with result 'exit-code'.
Jan 26 13:04:07 DV2Fv-Wazo systemd[1]: Failed to start Consul agent.
Jan 26 13:04:07 DV2Fv-Wazo systemd[1]: consul.service: Service RestartSec=100ms expired, scheduling restart.
Jan 26 13:04:07 DV2Fv-Wazo systemd[1]: consul.service: Scheduled restart job, restart counter is at 9.
Jan 26 13:04:07 DV2Fv-Wazo systemd[1]: Stopped Consul agent.
Jan 26 13:04:07 DV2Fv-Wazo systemd[1]: Starting Consul agent...
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]: bootstrap = true: do not enable unless necessary
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]: ==> Starting Consul agent...
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]: ==> Consul agent running!
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:            Version: '1.0.7-dev'
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:            Node ID: '00ee1e78-3a23-91c3-af18-8e80f5dfe08e'
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:          Node name: 'DV2Fv-Wazo'
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:         Datacenter: 'wazo-platform' (Segment: '<all>')
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:             Server: true (Bootstrap: true)
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:        Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: 8501, DNS: -1)
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:       Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]:            Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
Jan 26 13:04:07 DV2Fv-Wazo consul[16524]: ==> Log data will now stream in as it occurs:
Jan 26 13:04:14 DV2Fv-Wazo consul[16524]: 2022/01/26 13:04:14 [WARN] raft: Unable to get address for server id 318d4d09-4019-17d1-2c03-d538cc667048, using fallback address 127.0.0.1:8300: Could not find address for server id 318d4d09-4019-17d1-2c03-d538cc667048
Jan 26 13:04:14 DV2Fv-Wazo consul[16524]: 2022/01/26 13:04:14 [DEBUG] raft-net: 127.0.0.1:8300 accepted connection from: 127.0.0.1:47447
Jan 26 13:04:14 DV2Fv-Wazo consul[16524]: 2022/01/26 13:04:14 [WARN] raft: Unable to get address for server id 318d4d09-4019-17d1-2c03-d538cc667048, using fallback address 127.0.0.1:8300: Could not find address for server id 318d4d09-4019-17d1-2c03-d538cc667048
Jan 26 13:07:39 DV2Fv-Wazo consul[17910]:     2022/01/26 13:07:39 [ERR] consul: failed to wait for barrier: node is not the leader
Jan 26 13:07:42 DV2Fv-Wazo consul[17910]:     2022/01/26 13:07:42 [ERR] agent: failed to sync remote state: No cluster leader
Jan 26 13:07:42 DV2Fv-Wazo consul[17910]:     2022/01/26 13:07:42 [ERR] agent: failed to sync changes: No cluster leader
Jan 26 13:07:42 DV2Fv-Wazo consul[17910]:     2022/01/26 13:07:42 [ERR] raft: Failed to take snapshot: nothing new to snapshot

Is there something wrong with our configuration?

Hello,

It looks like Consul’s database is corrupted. It can be reset.
Wazo Platform does not use any of the data stored in it, so don’t be afraid ;-).

Stop consul

rm -rf /var/lib/consul/raft/
rm -rf /var/lib/consul/serf/
rm -rf /var/lib/consul/services/
rm -rf /var/lib/consul/tmp/
rm -rf /var/lib/consul/checks/

Start consul
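
For convenience, the steps above can be sketched as a small shell script. This is only a sketch under assumptions: the function names `reset_consul` and `leader_elected` and their parameters are made up for illustration, the service name `consul` is taken from the `consul.service` unit shown in the syslog, and the leader check assumes the standard Consul HTTP endpoint `/v1/status/leader` on the 127.0.0.1:8500 address shown in the log.

```shell
#!/bin/sh
# Sketch of the reset steps from this thread. Function names and parameters
# are illustrative; they are not part of Wazo or Consul.

# reset_consul [DATA_DIR] [SERVICE_CMD]
#   DATA_DIR    defaults to /var/lib/consul (the path used in this thread)
#   SERVICE_CMD defaults to systemctl; overridable so the steps can be dry-run
reset_consul() {
    data_dir="${1:-/var/lib/consul}"
    service_cmd="${2:-systemctl}"

    # Stop the agent first so it cannot rewrite its state while we delete it.
    "$service_cmd" stop consul || return 1

    # Remove only the state subdirectories listed in the reply above.
    for sub in raft serf services tmp checks; do
        rm -rf -- "${data_dir:?}/${sub}"
    done

    "$service_cmd" start consul
}

# leader_elected RESPONSE
#   Succeeds when RESPONSE (the body of GET /v1/status/leader) names a leader.
#   Consul answers with a JSON string such as "127.0.0.1:8300", or "" if none.
leader_elected() {
    case "$1" in
        ''|'""') return 1 ;;
        *) return 0 ;;
    esac
}
```

Run as root, `reset_consul` with no arguments follows the steps exactly as written. A few seconds after the restart, `leader_elected "$(curl -s http://127.0.0.1:8500/v1/status/leader)" && echo "leader OK"` should print its message if the single node has elected itself again.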

Sylvain

Thanks, it looks better, but now I’ve got this looping:


Jan 27 13:26:52 Wazo wazo-setupd[425]: 2022-01-27 13:26:52,126 [425] (INFO) (service_discovery): ttl pass failed
Jan 27 13:26:54 Wazo consul[17300]:     2022/01/27 13:26:54 [ERR] agent: failed to sync changes: ACL not found
Jan 27 13:26:54 Wazo consul[17300]:     2022/01/27 13:26:54 [ERR] http: Request PUT /v1/agent/check/pass/service:e7aa47f6-872b-4fc6-a3bd-6bc2c7673128, error: CheckID "service:e7aa47f6-872b-4fc6-a3bd-6bc2c7673128" does not have associated TTL from=127.0.0.1:57814
Jan 27 13:26:54 Wazo wazo-setupd[425]: 2022-01-27 13:26:54,128 [425] (INFO) (service_discovery): 500 CheckID "service:e7aa47f6-872b-4fc6-a3bd-6bc2c7673128" does not have associated TT

And this, just after restarting the services:

Jan 27 13:29:12 Wazo wazo-dird[20140]: 2022-01-27 13:29:12,131 [20140] (ERROR) (xivo.rest_api_helpers): Unauthorized: {'invalid_token': '', 'required_access': 'dird.status.read'}
Jan 27 13:29:12 Wazo wazo-dird[20140]: 2022-01-27 13:29:12,131 [20140] (INFO) (flask.app): response: (127.0.0.1) GET http://localhost:9489/0.1/status 401
Jan 27 13:29:12 Wazo wazo-dird[20140]: 2022-01-27 13:29:12,131 [20140] (DEBUG) (flask.app): response body: {"message": "Unauthorized", "error_id": "unauthorized", "details": {"invalid_token": "", "required_access": "dird.status.read"}, "timestamp": 1643286552.1312199}
Jan 27 13:29:12 Wazo wazo-dird[20140]: 2022-01-27 13:29:12,132 [20140] (INFO) (service_discovery): Registering wazo-dird on Consul as 32f481e4-60e0-40e4-af2b-2338e34cd645 with 10.0.229.49:9489

And also this:

2022-01-27 13:33:17,348 [24280] (ERROR) (xivo.pubsub): 401 Client Error: ('', 'confd.trunks.read') for url: http://localhost:9486/1.1/trunks?recurse=True
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/xivo/pubsub.py", line 39, in publish_one
    callback(message)
  File "/usr/lib/python3/dist-packages/wazo_calld/plugins/endpoints/bus.py", line 66, in on_peer_status
    endpoint.registered = False
  File "/usr/lib/python3.7/contextlib.py", line 119, in __exit__
    next(self.gen)
  File "/usr/lib/python3/dist-packages/wazo_calld/plugins/endpoints/services.py", line 107, in update
    self._notify_fn(endpoint)
  File "/usr/lib/python3/dist-packages/wazo_calld/plugins/endpoints/notifier.py", line 28, in endpoint_updated
    trunk = self._confd_cache.get_trunk(endpoint.techno, endpoint.name)
  File "/usr/lib/python3/dist-packages/wazo_calld/plugins/endpoints/services.py", line 169, in get_trunk
    return self._get_endpoint_by_index(techno, name, self._trunks, index='name')
  File "/usr/lib/python3/dist-packages/wazo_calld/plugins/endpoints/services.py", line 208, in _get_endpoint_by_index
    self._initialize()
  File "/usr/lib/python3/dist-packages/wazo_calld/plugins/endpoints/services.py", line 219, in _initialize
    trunks = self._confd.trunks.list(recurse=True)['items']
  File "/usr/lib/python3/dist-packages/wazo_confd_client/crud.py", line 70, in list
    response = self.session.get(url, headers=headers, params=kwargs)
  File "/usr/lib/python3/dist-packages/wazo_confd_client/session.py", line 55, in get
    self.check_response(response, check_response)
  File "/usr/lib/python3/dist-packages/wazo_confd_client/session.py", line 33, in check_response
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: ('', 'confd.trunks.read') for url: http://localhost:9486/1.1/trunks?recurse=True

The trunks are displayed correctly in the GUI, though.