ISC readers report a significant increase of "odd" error messages in their named/bind logs.
server named: dispatch 0x8face08: shutting down due to TCP receive error: [IP REMOVED]#53: connection reset.
named: dispatch 0x81eb2b0: shutting down due to TCP receive error: <unknown address, family 48830>: connection reset
Update 18:30 UTC:
It looks like we have the solution, or at least part of it:
- Some DNS servers of "secureserver.net" are apparently broken and sometimes return incomplete records. Two DNS servers in particular, 184.108.40.206 and 220.127.116.11, are implicated in the majority of the "TCP receive error" packet traces that we have received.
- What happens is that "named" sends a UDP DNS query to one of the broken servers and receives a truncated UDP response. As the DNS protocol requires, "named" then retries the same query over TCP, which the broken servers answer with a rude TCP reset packet, which in turn triggers "named" to write the above log line. This behaviour can be reproduced with "dig" as shown below:
daniel@debian:$ dig whatever.net @18.104.22.168
;; Truncated, retrying in TCP mode.
;; communications error to 22.214.171.124#53: connection reset
- Lookups against ISIPP's IADB spam / sender database seem to have ended up on the broken servers listed above from time to time, which explains the "link" some readers reported between receiving email and seeing the named log entries.
- The IP address in the named log does not seem to have anything to do with the IP that causes the problem. I have no idea where this logged IP comes from, but seeing that some versions print "unknown address" instead of an IP, I suspect that this error print statement is broken in several (older?) BIND releases.
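For readers who would rather script the check than run "dig" by hand, the UDP-then-TCP sequence described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not what "named" or "dig" do internally. The function names (build_query, is_truncated, probe) are made up for this example, and the server address and query name are whatever you pass in.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query: one question, class IN, recursion desired."""
    # Header: id, flags (RD bit only), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(response: bytes) -> bool:
    """Return True if the TC (truncation) bit is set in the DNS header flags."""
    if len(response) < 4:
        raise ValueError("response shorter than the DNS header flags")
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & 0x0200)


def probe(server: str, name: str, timeout: float = 3.0) -> str:
    """Query over UDP; if the answer is truncated, retry the query over TCP."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, 53))
        response, _ = udp.recvfrom(4096)
    if not is_truncated(response):
        return "udp answer complete"
    try:
        with socket.create_connection((server, 53), timeout=timeout) as tcp:
            # DNS over TCP prefixes each message with a two-byte length.
            tcp.sendall(struct.pack("!H", len(query)) + query)
            if not tcp.recv(2):
                return "tcp closed without answer"
            return "tcp answered"
    except ConnectionResetError:
        # The behaviour reported for the broken servers.
        return "tcp connection reset"
```

Calling probe() with a server address and a name that yields a large answer should return "tcp connection reset" against a server that behaves as described above. Timeouts are deliberately not caught here, so an unreachable server raises an exception instead of being mistaken for a reset.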
A big thank you to all the readers who have volunteered their packet traces and time to help with this analysis!