- A Platform9 managed host shows as offline in the Clarity UI.
- Connectivity errors are observed as shown below.
pf9_app.py INFO - Setting the desired service state/var/log/pf9/muster.log
pf9_app.py INFO - Setting service state pf9-comms.3.9.0-663.baf294f. Command: sudo systemctl start pf9-comms
session.py INFO - Converge succeeded
amqp.py WARNING - Connection closed due to Not specified, retrying in 10 seconds
session.py ERROR - Connection closed unexpectedly.
slave.py ERROR - Connection error. Retrying in 10 seconds.
Traceback (most recent call last):
File "/opt/pf9/hostagent/lib/python2.7/site-packages/bbslave/slave.py", line 83, in reconnect_loop
File "/opt/pf9/hostagent/lib/python2.7/site-packages/bbslave/session.py", line 716, in start
AMQPConnectionError upstream: failed to connect to RabbitMQ: Exception (501) Reason: "read tcp 127.0.0.1:41514->127.0.0.1:5672: i/o timeout"
- The enic driver logs "devcmd timed out" and "wq is full" messages in /var/log/messages on the host.
kernel: [21479642.626977] enic 0000:06:00.0 enp6s0: devcmd 4 timed out
kernel: [21479642.727633] enic 0000:07:00.0 enp7s0: devcmd 4 timed out
kernel: [21479650.951488] enic 0000:07:00.0 enp7s0: devcmd2 4: wq is full. fetch index: 17, posted index:16
kernel: [21479650.952179] enic 0000:0c:00.0 enp12s0: devcmd2 4: wq is full. fetch index: 8, posted index: 7
kernel: [21489854.868177] enic 0000:0c:00.0 enp12s0: devcmd2 4: wq is full. fetch index: 8, posted index: 7
kernel: [21489854.869338] enic 0000:07:00.0 enp7s0: devcmd2 4: wq is full. fetch index: 17, posted index:16
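
To gauge how widespread the problem is, it can help to count these kernel messages per interface. The following is a minimal sketch, assuming the messages keep the two shapes quoted above; the function names and the regular expression are illustrative and not part of any Platform9 tooling.

```python
import re
from collections import Counter

# Matches the two enic message shapes quoted in this article:
#   "enic <pci> <iface>: devcmd N timed out"
#   "enic <pci> <iface>: devcmd2 N: wq is full. ..."
ENIC_RE = re.compile(
    r"enic (?P<pci>[0-9a-f:.]+) (?P<iface>\S+): "
    r"(?P<kind>devcmd \d+ timed out|devcmd2 \d+: wq is full)"
)


def summarize_enic_errors(lines):
    """Count enic devcmd failures per (interface, kind) pair."""
    counts = Counter()
    for line in lines:
        match = ENIC_RE.search(line)
        if match:
            kind = "timed out" if "timed out" in match.group("kind") else "wq is full"
            counts[(match.group("iface"), kind)] += 1
    return counts


def summarize_enic_log(path="/var/log/messages"):
    """Summarize a log file; the default path is the one named in this article."""
    with open(path, errors="replace") as handle:
        return summarize_enic_errors(handle)
```

Calling `summarize_enic_log()` on the affected host (reading /var/log/messages usually requires root) returns a counter keyed by interface and message type. Several interfaces reporting failures at once points toward the adapter firmware rather than a single link.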
- Platform9 Managed OpenStack - All Versions
- Platform9 Managed Kubernetes - All Versions
- CentOS Linux 7.4
- enic driver - v22.214.171.124
The enic module sends a command to the NIC firmware and does not receive a timely response. This indicates that the NIC firmware has hung or is in a bad state.
Work with the hardware vendor to investigate the issue further, and upgrade the NIC firmware to the latest version the vendor recommends.
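
Before opening a case with the vendor, it is useful to record the driver and firmware versions currently in use on each affected interface. The sketch below assumes `ethtool` is installed on the host and relies on the `key: value` layout of `ethtool -i` output; the helper names are hypothetical.

```python
import subprocess


def parse_ethtool_info(output):
    """Turn `ethtool -i` output ("key: value" lines) into a dict."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as PCI addresses stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def nic_driver_info(iface):
    """Run `ethtool -i IFACE` and return the driver, version and firmware-version fields."""
    result = subprocess.run(
        ["ethtool", "-i", iface],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ethtool fails
    )
    info = parse_ethtool_info(result.stdout)
    return {key: info.get(key) for key in ("driver", "version", "firmware-version")}
```

For example, `nic_driver_info("enp6s0")` reports the values for one of the interfaces seen in the kernel messages; comparing `firmware-version` before and after the upgrade confirms that the new firmware is active.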