Description
Today if a node-to-node connection drops we log this message:
The "if unexpected" bit is tricky: it's actually pretty hard to tell from the logs whether a disconnect was expected (e.g. the node shut down) or not (e.g. network disruption). Yet we should be able to work out for ourselves whether a disconnect was unexpected, and log a message that says so unambiguously.
In particular, if the org.elasticsearch.cluster.NodeConnectionsService finds it is disconnected from a peer and then successfully reconnects to that same peer again (its DiscoveryNode#ephemeralId did not change) then that's definitely not due to the node shutting down. We should be emitting a WARN log in this case. Moreover, it'd be incredibly useful to capture the exception (if any) that TcpTransport reported as causing the disconnect so we can repeat it in such a log message.
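
To make the proposal concrete, here is a rough sketch of the bookkeeping it implies, written as standalone Java rather than against the real Elasticsearch classes. DisconnectTracker, its method names and the message wording are all hypothetical. The sketch assumes the caller can supply the peer's persistent node ID, its ephemeral ID, and whatever exception the transport layer reported (possibly none).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not an existing Elasticsearch class.
final class DisconnectTracker {

    // What we remember about a dropped connection until a later connect succeeds.
    private static final class Dropped {
        final String ephemeralId;
        final Exception cause; // may be null if the transport reported no exception

        Dropped(String ephemeralId, Exception cause) {
            this.ephemeralId = ephemeralId;
            this.cause = cause;
        }
    }

    // Keyed by persistent node ID; the ephemeral ID changes when the node process restarts.
    private final Map<String, Dropped> dropped = new HashMap<>();

    // Record a disconnect together with the exception (if any) that caused it.
    void onDisconnect(String nodeId, String ephemeralId, Exception cause) {
        dropped.put(nodeId, new Dropped(ephemeralId, cause));
    }

    // Call after a successful connection. Returns a message to log at WARN if the peer
    // still has the ephemeral ID it had before the drop, meaning it did not restart.
    Optional<String> onConnect(String nodeId, String ephemeralId) {
        Dropped previous = dropped.remove(nodeId);
        if (previous == null || !previous.ephemeralId.equals(ephemeralId)) {
            return Optional.empty(); // first connection, or the node restarted in between
        }
        String reason = previous.cause == null ? "no exception recorded" : previous.cause.toString();
        // Illustrative wording only, not the message Elasticsearch emits.
        return Optional.of("reconnected to node [" + nodeId + "] after unexpected disconnect: " + reason);
    }
}
```

A reconnect with a different ephemeral ID returns nothing, since that is consistent with an ordinary restart. Only the same-ephemeral-ID case produces a warning, and it carries the saved exception with it.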