• 0 Posts
  • 1 Comment
Joined 1 year ago
cake
Cake day: July 3rd, 2023

help-circle
  • I seem to recall a VMware complaint similar to this too, and there was a ring buffer tuning to do to fix it… But that error message doesn’t seem quite right to match it.

    TX queue timeouts can be caused by several things, but I wonder if you’re not seeing an end result of a spammy Ethernet flow control implementation where the device can’t transmit because the link is continuously paused.

    If so, there may be rx_xoff counters viewable from within proxmox, or “ethtool -s enp1s0f0” would tell you where the device is seeing pause frames from the switch on a regular Linux host.

    The link down tends to be a reaction by the driver to recover from a hung queue, so if it’s not flow control, there could be a driver/firmware upgrade possible, or a series of tunables if there’s a bug somewhere in packet handling land resulting in the NIC itself hanging.