Bug #6210
ping can misreport ICMP latency
100%
Description
ping(1M) by default attempts to perform a DNS reverse lookup for a given IP address. When this succeeds, it's of some value. However, when it fails, things go off the rails. First off, we end up delaying the reporting of packets. The much more criminal thing is that we end up including the failed DNS resolution time in the time the ping took!
This often causes something to look like:
[root@36e435c7-266c-cf78-d972-a39e33a2c7c2 ~]# ping -s 10.248.88.69 PING 10.248.88.69: 56 data bytes 64 bytes from 10.248.88.69: icmp_seq=0. time=63.842 ms 64 bytes from 10.248.88.69: icmp_seq=1. time=22008.527 ms 64 bytes from 10.248.88.69: icmp_seq=2. time=21008.593 ms 64 bytes from 10.248.88.69: icmp_seq=3. time=20008.640 ms 64 bytes from 10.248.88.69: icmp_seq=4. time=19008.682 ms 64 bytes from 10.248.88.69: icmp_seq=5. time=18008.748 ms 64 bytes from 10.248.88.69: icmp_seq=6. time=17008.776 ms 64 bytes from 10.248.88.69: icmp_seq=7. time=16008.845 ms 64 bytes from 10.248.88.69: icmp_seq=8. time=15008.875 ms 64 bytes from 10.248.88.69: icmp_seq=9. time=14008.931 ms 64 bytes from 10.248.88.69: icmp_seq=10. time=13008.967 ms 64 bytes from 10.248.88.69: icmp_seq=11. time=12009.047 ms 64 bytes from 10.248.88.69: icmp_seq=12. time=11009.086 ms 64 bytes from 10.248.88.69: icmp_seq=13. time=10009.111 ms 64 bytes from 10.248.88.69: icmp_seq=14. time=9009.158 ms 64 bytes from 10.248.88.69: icmp_seq=15. time=8009.203 ms 64 bytes from 10.248.88.69: icmp_seq=16. time=7009.286 ms 64 bytes from 10.248.88.69: icmp_seq=17. time=6009.334 ms 64 bytes from 10.248.88.69: icmp_seq=18. time=5009.374 ms 64 bytes from 10.248.88.69: icmp_seq=19. time=4009.424 ms 64 bytes from 10.248.88.69: icmp_seq=20. time=3009.451 ms 64 bytes from 10.248.88.69: icmp_seq=21. time=2009.517 ms 64 bytes from 10.248.88.69: icmp_seq=22. time=1009.545 ms 64 bytes from 10.248.88.69: icmp_seq=23. time=115.486 ms 64 bytes from 10.248.88.69: icmp_seq=24. time=20.667 ms 64 bytes from 10.248.88.69: icmp_seq=25. time=44.564 ms 64 bytes from 10.248.88.69: icmp_seq=26. time=91.477 ms
Note the long series of delays in terms of seconds until about 20 seconds have elapsed and we've timed out all of the DNS resolutions.
I'm not 100% sure on what would be the best behavior here. It feels to me like with an ICMP ping, the most important thing is to show the user the behavior that the network is functioning from an ICMP perspective, even if it's not from a DNS perspective and to notify them of such.
Another option would be to delay the showing of any packets until DNS returns, but to show the original ICMP latency. That would give the feeling that things didn't work, but then a lot of icmp packets came back with low latency, and I feel like that would still cause the user to doubt the validity of the results, I know I'd be immediately skeptical.
Updated by Electric Monk over 5 years ago
- Status changed from New to Closed
git commit b08923d6c9c63a4f4b647b84d9454d8124fcedd7
commit b08923d6c9c63a4f4b647b84d9454d8124fcedd7 Author: Robert Mustacchi <rm@joyent.com> Date: 2015-09-14T20:47:46.000Z 6210 ping can misreport ICMP latency 6211 want warnings in the face of long running name lookups for ping 6212 Want sub-second ping interval support 6213 clean up warnings in ping Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com> Reviewed by: Joshua M. Clulow <jmc@joyent.com> Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com> Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Approved by: Dan McDonald <danmcd@omniti.com>