Bug #12084

Updated by Mike Gerdts almost 2 years ago

While testing a fix for #4454, I hit the following panic:

 <pre> 
 > ::status 
 debugging crash dump vmcore.12 (64-bit) from omni-2 
 operating system: 5.11 omni-physio-0-g59a9ce0e70 (i86pc) 
 build version: gfx-drm - heads/master-0-gbdc58b1-dirty 

 image uuid: 6c62e915-bc81-4cd5-cc07-ecb85608e869 
 panic message: assertion failed: 0, file: ../../common/io/idm/idm_conn_sm.c, line: 424 
 dump content: kernel pages only 
 > $C 
 ffffff001ef78ac0 vpanic() 
 ffffff001ef78b10 0xfffffffffbe2a425() 
 ffffff001ef78b50 idm_conn_event_handler+0x139(ffffff076697e4d8) 
 ffffff001ef78c00 taskq_thread+0x315(ffffff0766989628) 
 ffffff001ef78c10 thread_start+0xb() 
 </pre> 

 The test uses a zpool with a mirror vdev and a spare, all backed by three iSCSI disks. A dd loop writes to a file in that pool while a second loop offlines an LU, sleeps 15 seconds, onlines it, then sleeps 45 seconds before repeating. The scripts are in #12076.

 Near line 424 we have: 

 <pre> 
  405           if (event_ctx->iec_pdu_event_type != CT_NONE) { 
  406                   switch (action) { 
  407                   case CA_TX_PROTOCOL_ERROR: 
  408                           idm_pdu_tx_protocol_error(ic, pdu); 
  409                           break; 
  410                   case CA_RX_PROTOCOL_ERROR: 
  411                           idm_pdu_rx_protocol_error(ic, pdu); 
  412                           break; 
  413                   case CA_FORWARD: 
  414                           if (!event_ctx->iec_pdu_forwarded) { 
  415                                   if (event_ctx->iec_pdu_event_type == 
  416                                       CT_RX_PDU) { 
  417                                           idm_pdu_rx_forward(ic, pdu); 
  418                                   } else { 
  419                                           idm_pdu_tx_forward(ic, pdu); 
  420                                   } 
  421                           } 
  422                           break; 
  423                   default: 
  424                           ASSERT(0); 
  425                           break; 
  426                   } 
  427           } 
 </pre> 

 The `if` test and the `case` dispatch compile to:

 <pre> 
 idm_conn_event_handler+0x100:     movl     0x18(%rbx),%eax 
 idm_conn_event_handler+0x103:     testl    %eax,%eax 
 idm_conn_event_handler+0x105:     je       +0x5c      <idm_conn_event_handler+0x163> 
 idm_conn_event_handler+0x107:     cmpl     $0x1,%r14d 
 idm_conn_event_handler+0x10b:     je       +0x2af     <idm_conn_event_handler+0x3c0> 
 idm_conn_event_handler+0x111:     jb       +0x291     <idm_conn_event_handler+0x3a8> 
 idm_conn_event_handler+0x117:     cmpl     $0x2,%r14d 
 idm_conn_event_handler+0x11b:     je       +0x15f     <idm_conn_event_handler+0x280> 
 idm_conn_event_handler+0x121:     movl     $0x1a8,%edx 
 idm_conn_event_handler+0x126:     movq     $0xfffffffff82479b0,%rsi 
 idm_conn_event_handler+0x12d:     movq     $0xfffffffff824692b,%rdi 
 idm_conn_event_handler+0x134:     call     +0x3be9d97         <assfail> 
 idm_conn_event_handler+0x139:     movl     0x18(%rbx),%eax 
 </pre> 

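 The `movl $0x1a8,%edx` just before the call to `assfail` loads 424 (0x1a8), the line number from the panic message, so the fall-through path is the `default` arm. The comparisons against `%r14d` suggest that the value of `action` is in r14: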

 <pre> 
 > <r14=D 
                 3 
 </pre> 

 That is CA_DROP, the fourth enumerator (value 3) of idm_pdu_event_action_t:

 <pre> 
 203 typedef enum { 
 204           CA_TX_PROTOCOL_ERROR,     /* Send "protocol error" to state machine */ 
 205           CA_RX_PROTOCOL_ERROR,     /* Send "protocol error" to state machine */ 
 206           CA_FORWARD,               /* State machine event and forward to client */ 
 207           CA_DROP                   /* Drop PDU */ 
 208 } idm_pdu_event_action_t; 
 </pre> 
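
 The mapping from the register value to the enumerator can be checked with a small standalone sketch. The enum is copied from the excerpt above; `action_name()` is a hypothetical helper written for this note and does not exist in idm.

```c
#include <assert.h>

/* Same declaration order as idm_pdu_event_action_t in the excerpt above. */
typedef enum {
	CA_TX_PROTOCOL_ERROR,	/* 0 */
	CA_RX_PROTOCOL_ERROR,	/* 1 */
	CA_FORWARD,		/* 2 */
	CA_DROP			/* 3 */
} idm_pdu_event_action_t;

/* With no explicit initializers, enumerators count up from zero. */
static_assert(CA_DROP == 3, "CA_DROP is the fourth enumerator");

/*
 * Hypothetical helper (not part of idm): translate a raw value, such as
 * the one read from r14, back to the enumerator name.
 */
const char *
action_name(int raw)
{
	switch (raw) {
	case CA_TX_PROTOCOL_ERROR:
		return ("CA_TX_PROTOCOL_ERROR");
	case CA_RX_PROTOCOL_ERROR:
		return ("CA_RX_PROTOCOL_ERROR");
	case CA_FORWARD:
		return ("CA_FORWARD");
	case CA_DROP:
		return ("CA_DROP");
	default:
		return ("unknown");
	}
}
```

 This also matches the disassembly: 0 and 1 are caught by the compare against `$0x1` (`jb` and `je`), 2 by the compare against `$0x2`, and anything else, including 3, falls through to `assfail`.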

 CA_DROP is handled earlier in the same function, in a separate switch:

 <pre> 
  338                   case CA_DROP: 
  339                           /* 
  340                            * It never even happened 
  341                            */ 
  342                           IDM_SM_LOG(CE_NOTE, "*** drop PDU %p", (void *) pdu); 
  343                           idm_pdu_complete(pdu, IDM_STATUS_FAIL); 
  344                           break; 
 </pre> 

 This is consistent with the PDU having already been freed by idm_pdu_complete(), called from idm_conn_event_handler+0xc2:

 <pre> 
 > <event_ctx::print idm_conn_event_ctx_t iec_info | ::whatis 
 ffffff0729887030 is freed from idm_rx_pdu_cache: 
             ADDR            BUFADDR          TIMESTAMP             THREAD 
                             CACHE            LASTLOG           CONTENTS 
 ffffff072efccea8 ffffff0729887030        22e24f71bc7 ffffff001ef78c20 
                  ffffff0726beb008 ffffff06e402e800 ffffff06fdecf768 
                  kmem_cache_free_debug+0xfa 
                  kmem_cache_free+0x86 
                  idm_sorx_cache_pdu_cb+0x2a 
                  idm_sorx_addl_pdu_cb+0x68 
                  idm_pdu_complete+0x17 
                  idm_conn_event_handler+0xc2 
                  taskq_thread+0x315 
                  thread_start+0xb 
 </pre> 
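
 To summarize the control flow, here is a simplified standalone model of the switch at lines 405-427. It is not the idm code: the function names are hypothetical, the `iec_pdu_event_type != CT_NONE` guard is left out, and the real PDU handling is reduced to a boolean meaning "the `default` arm (ASSERT(0)) would be reached". The second function adds an explicit CA_DROP arm only to illustrate one way the default arm could be avoided. I am not claiming that this is the right fix.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
	CA_TX_PROTOCOL_ERROR,
	CA_RX_PROTOCOL_ERROR,
	CA_FORWARD,
	CA_DROP
} idm_pdu_event_action_t;

/*
 * Model of the switch at lines 405-427 as it stands: only three of the
 * four actions have a case, so CA_DROP lands in the default arm.
 * Returns true when the default arm (ASSERT(0)) would be reached.
 */
bool
second_switch_asserts(idm_pdu_event_action_t action)
{
	switch (action) {
	case CA_TX_PROTOCOL_ERROR:
	case CA_RX_PROTOCOL_ERROR:
	case CA_FORWARD:
		return (false);
	default:
		return (true);	/* ASSERT(0) in the real code */
	}
}

/*
 * Hypothetical variant with an explicit CA_DROP arm. The earlier switch
 * has already completed the PDU, so this arm does nothing.
 */
bool
second_switch_with_drop_case(idm_pdu_event_action_t action)
{
	switch (action) {
	case CA_TX_PROTOCOL_ERROR:
	case CA_RX_PROTOCOL_ERROR:
	case CA_FORWARD:
		return (false);
	case CA_DROP:
		return (false);	/* nothing left to do */
	default:
		return (true);
	}
}

/* Usage: exercise both models with the action seen in the dump. */
void
check_model(void)
{
	assert(second_switch_asserts(CA_DROP));
	assert(!second_switch_asserts(CA_FORWARD));
	assert(!second_switch_with_drop_case(CA_DROP));
}
```

 In the model, CA_DROP is the only defined action that reaches the default arm, which matches the value found in r14.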
