Bug #5026
intra-node/inter-zone networking doesn't always deliver SIGPOLL
Start date:
2014-07-18
Due date:
% Done:
100%
Estimated time:
Difficulty:
Hard
Tags:
Gerrit CR:
Description
NOTE: This was seen with OmniOS r151006, which is about a year old. I will update this if I can reproduce it on more recent illumos code.
Consider a global-zone process using SIGPOLL handling to detect IO on a network socket. ntpd is a good example.
Consider a sending process in a non-global zone on the same machine. It is possible, at least when both the global and non-global zones have VNICs over the same aggregation, that SIGPOLL is never delivered to the receiving process in the global zone.
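For reference, here is a minimal sketch of the receiving side: a process that asks for a signal when a datagram arrives on a UDP socket. It is an assumption-laden illustration, not what ntpd actually does. It uses the BSD-style F_SETOWN plus O_ASYNC interface and the name SIGIO (the same signal as SIGPOLL on illumos and Linux), and it sends to itself over loopback, so it does not cross zones and will not reproduce this bug. The helper name sigpoll_loopback_demo is hypothetical.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <assert.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Written only by the signal handler. */
static volatile sig_atomic_t sigpoll_seen = 0;

static void
on_sigpoll(int signo)
{
	(void) signo;
	sigpoll_seen = 1;
}

/*
 * Hypothetical helper: bind a UDP socket on loopback, request a signal on
 * input, send one datagram to ourselves, and report what happened.
 * Returns 1 if the signal arrived, 0 if it did not, -1 on setup failure.
 */
int
sigpoll_loopback_demo(void)
{
	struct sigaction sa;
	struct sockaddr_in addr;
	socklen_t len = sizeof (addr);
	int fd, flags, i;

	memset(&sa, 0, sizeof (sa));
	sa.sa_handler = on_sigpoll;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGIO, &sa, NULL) != 0)
		return (-1);

	if ((fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
		return (-1);
	memset(&addr, 0, sizeof (addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel choose a port */
	if (bind(fd, (struct sockaddr *)&addr, sizeof (addr)) != 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
		goto fail;
	assert(addr.sin_port != 0);	/* the kernel picked a real port */

	/* Name this process as the signal target, then enable async I/O. */
	if (fcntl(fd, F_SETOWN, getpid()) == -1)
		goto fail;
	if ((flags = fcntl(fd, F_GETFL)) == -1 ||
	    fcntl(fd, F_SETFL, flags | O_ASYNC) == -1)
		goto fail;

	sigpoll_seen = 0;
	if (sendto(fd, "x", 1, 0, (struct sockaddr *)&addr,
	    sizeof (addr)) != 1)
		goto fail;

	/* Give the handler up to about one second to run. */
	for (i = 0; i < 100 && !sigpoll_seen; i++) {
		struct timespec ts = { 0, 10 * 1000 * 1000 };
		(void) nanosleep(&ts, NULL);
	}
	(void) close(fd);
	return (sigpoll_seen ? 1 : 0);
fail:
	(void) close(fd);
	return (-1);
}
```

Calling sigpoll_loopback_demo() from a small main() and printing the result is enough to try it. A return of 0 (datagram queued, no signal) is the symptom this bug describes, although here sender and receiver share a zone, so a healthy system should not return 0.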
Some verbose annotated DTrace output:
    ip`ire_recv_local_v4+0x132
    ip`ill_input_short_v4+0x4d6
    ip`ip_input_common_v4+0x372
    ip`ip_input+0x2b
    dls`i_dls_link_rx+0x1cd
    mac`mac_rx_deliver+0x37
    mac`mac_rx_soft_ring_process+0x19a
    mac`mac_rx_srs_proto_fanout+0x29a
    mac`mac_rx_srs_drain+0x363
    mac`mac_rx_srs_process+0x3ce
    /* RIGHT HERE we leave a non-global zone and enter a global zone! */
    mac`mac_tx_send+0x431
    mac`mac_tx_soft_ring_process+0x79
    mac`mac_tx_aggr_mode+0x7c
    mac`mac_tx+0xda
    dld`str_mdata_fastpath_put+0x53
    ip`ip_xmit+0x94f
    ip`ire_send_wire_v4+0x3e9
    ip`conn_ip_output+0x190
    ip`udp_output_connected+0x139
    ip`udp_send+0x7b2

  4  | ip_input_local_v4:entry
  4    -> ip_csum_hdr
  4    <- ip_csum_hdr Returns 0x0
  4    -> ip_fanout_v4
  4      -> ip_input_cksum_v4
  4        -> ip_input_sw_cksum_v4
  4          -> ip_input_cksum_pseudo_v4
  4          <- ip_input_cksum_pseudo_v4 Returns 0x145d8
  4          -> ip_cksum
  4            -> ip_ocsum
  4            <- ip_ocsum Returns 0xffff
  4          <- ip_cksum Returns 0xffff
  4        <- ip_input_sw_cksum_v4 Returns 0x1
  4      <- ip_input_cksum_v4 Returns 0x1
  4      -> ipcl_classify_v4
  4      <- ipcl_classify_v4 Returns 0xffffff32cde34540
  4      -> udp_input
  4        -> ip_find_hdr_v4
  4        <- ip_find_hdr_v4 Returns 0x0
  4        -> conn_recvancillary_size
  4        <- conn_recvancillary_size Returns 0x28
  4        -> allocb
  4          -> kmem_cache_alloc
  4          <- kmem_cache_alloc Returns 0xffffff7530dd1440
  4        <- allocb Returns 0xffffffb7db752280
  4        -> conn_recvancillary_add
  4          -> gethrestime
  4            -> pc_gethrestime
  4              -> gethrtime
  4                -> tsc_gethrtime
  4                <- tsc_gethrtime Returns 0x6f4ee913f76957
  4              <- gethrtime Returns 0x6f4ee913f76957
  4            <- pc_gethrestime Returns 0x75609f00
  4          <- gethrestime Returns 0x75609f00
  4        <- conn_recvancillary_add Returns 0x28
  4        -> udp_ulp_recv
  4          -> so_queue_msg
  4            -> so_queue_msg_impl
  /* Message is enqueued, but ... */
  4              -> so_enqueue_msg
  4              <- so_enqueue_msg Returns 0xffffffdef28b2060
  4              -> so_notify_data
  4                -> socket_sendsig
  4                  -> prfind
  4                    -> getzoneid
  4                    <- getzoneid Returns 0x14
  /* This function... */
  4                    -> prfind_zone
  4                      -> pid_lookup
  4                      <- pid_lookup Returns 0xffffff32a5c98420
  4                    <- prfind_zone Returns 0x0
  /* FAILED!!! Because the receiver's pid's zone isn't the sender's. */
  4                  <- prfind Returns 0x0
  4                <- socket_sendsig Returns 0x0
  4              <- so_notify_data Returns 0x0
  4            <- so_queue_msg_impl Returns 0xdfb8
  4          <- so_queue_msg Returns 0xdfb8
  4        <- udp_ulp_recv Returns 0xdfb8
  4      <- udp_input Returns 0xdfb8
  4      -> cv_broadcast
  4      <- cv_broadcast Returns 0x0
  4    <- ip_fanout_v4 Returns 0x0
  4  <- ip_input_local_v4 Returns 0x0
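The failure in the trace comes down to a zone-scoped pid lookup: getzoneid returns 0x14 (the sender's zone), pid_lookup finds the receiver's pid, and prfind_zone still returns 0. The following userland model is a rough sketch of that logic as inferred from the trace, not the kernel source. All model_* names, the pids, and the two-entry process table are hypothetical, and the rule that a global-zone caller searches all zones is an assumption about prfind that the trace itself does not show.

```c
#include <assert.h>
#include <stddef.h>

#define	GLOBAL_ZONEID	0
#define	ALL_ZONES	(-1)

/* Toy stand-in for proc_t: only the fields the lookup cares about. */
struct model_proc {
	int pid;
	int zoneid;
};

/* Toy process table with made-up pids. */
static struct model_proc model_procs[] = {
	{ 1001, GLOBAL_ZONEID },	/* receiver in the global zone */
	{ 2002, 0x14 },			/* sender in non-global zone 0x14 */
};

/* Find pid, but only admit it if it lives in zoneid (or any zone). */
struct model_proc *
model_prfind_zone(int pid, int zoneid)
{
	size_t i;

	for (i = 0; i < sizeof (model_procs) / sizeof (model_procs[0]); i++) {
		struct model_proc *p = &model_procs[i];

		if (p->pid != pid)
			continue;
		if (zoneid == ALL_ZONES || p->zoneid == zoneid)
			return (p);
		return (NULL);	/* the pid exists, but in another zone */
	}
	return (NULL);
}

/* Lookup as seen by a caller running in caller_zoneid. */
struct model_proc *
model_prfind(int pid, int caller_zoneid)
{
	/* Assumption: only a global-zone caller may see every zone. */
	int zoneid = (caller_zoneid == GLOBAL_ZONEID) ?
	    ALL_ZONES : caller_zoneid;

	return (model_prfind_zone(pid, zoneid));
}

/* Walk through the three interesting cases. */
void
model_demo(void)
{
	/* Sender's context (zone 0x14) looking for the global receiver. */
	assert(model_prfind(1001, 0x14) == NULL);
	/* The same lookup from global-zone context succeeds. */
	assert(model_prfind(1001, GLOBAL_ZONEID) != NULL);
	/* A zone can always find its own processes. */
	assert(model_prfind(2002, 0x14) != NULL);
}
```

In this model the first case mirrors the trace: the receive path runs in the sender's context, so the lookup is scoped to zone 0x14 and cannot see the global-zone receiver, and no signal is posted.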
Files