Bug #3426

closed

assertion failed: irq < 16 on VMware hardware version 9 (apix related)

Added by Yuri Pankov almost 11 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Normal
Category: kernel
Start date: 2012-12-22
Due date:
% Done: 90%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:
External Bug:

Description

This was already discussed on developer@; it is documented here as well.

apic_record_rdt_entry: intr_index=32767 irq = 0x9 dip = 0x0 vector = 0x20
apic_record_rdt_entry: intr_index=-3 irq = 0x9 dip = 0x0 vector = 0x20
apic_record_rdt_entry: intr_index=32767 irq = 0x14 dip = 0x0 vector = 0x21

panic[cpu0]/thread=fffffffffbc3d220: assertion failed: irq < 16, file: ../../i86pc/io/mp_platform_common.c, line: 1669

Warning - stack not written to the dump buffer
fffffffffbc7f500 genunix:assfail+73 ()
fffffffffbc7f560 apix:apic_record_rdt_entry+2a6 ()
fffffffffbc7f5a0 apix:apix_intx_set_vector+47 ()
fffffffffbc7f5f0 apix:apix_alloc_intx+134 ()
fffffffffbc7f660 apix:ioapix_setup_intr+c3 ()
fffffffffbc7f6a0 apix:ioapix_init_intr+11c ()
fffffffffbc7f6e0 apix:apix_picinit+165 ()
fffffffffbc7f700 unix:mach_picinit+30 ()
fffffffffbc7f740 unix:startup_end+fa ()
fffffffffbc7f750 unix:startup+4c ()
fffffffffbc7f790 genunix:main+87 ()
fffffffffbc7f7a0 unix:_locore_start+90 ()

panic: entering debugger (no dump device, continue to reboot)
Loaded modules: [ scsi_vhci mac uppc apix specfs pcplusmp cpu.generic ]
kmdb: target stopped at:
kmdb_enter+0xb: movq   %rax,%rdi
[0]> fffffffffbc3d220::findstack -v
stack pointer for thread fffffffffbc3d220: fffffffffbc7f2a0
  fffffffffbc7f4c0 0x47()
  fffffffffbc7f500 assfail+0x73(fffffffff78454e0, fffffffff7845830, 685)
  fffffffffbc7f560 apix`apic_record_rdt_entry+0x2a6(ffffff014ee2c380, 14)
  fffffffffbc7f5a0 apix`apix_intx_set_vector+0x47(14, 0, 21)
  fffffffffbc7f5f0 apix`apix_alloc_intx+0x134(0, 0, 14)
  fffffffffbc7f660 apix`ioapix_setup_intr+0xc3(14, fffffffffc06d849)
  fffffffffbc7f6a0 apix`ioapix_init_intr+0x11c(1)
  fffffffffbc7f6e0 apix`apix_picinit+0x165()
  fffffffffbc7f700 mach_picinit+0x30()
  fffffffffbc7f740 startup_end+0xfa()
  fffffffffbc7f750 startup+0x4c()
  fffffffffbc7f790 main+0x87()
  fffffffffbc7f7a0 _locore_start+0x90()
[0]> ffffff014ee2c380::print apic_irq_t
{
    airq_mps_intr_index = 0x7fff
    airq_intin_no = 0
    airq_ioapicindex = 0
    airq_dip = 0
    airq_major = 0
    airq_rdt_entry = 0
    airq_cpu = 0
    airq_temp_cpu = 0
    airq_vector = 0x21
    airq_share = 0
    airq_share_id = 0
    airq_ipl = 0
    airq_iflag = {
        intr_po = 0
        intr_el = 0
        bustype = 0
    }
    airq_origirq = 0x14
    airq_busy = 0
    airq_next = 0
    airq_intrmap_private = 0
}
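
A note on reading the output above: the arguments printed by ::findstack -v are hexadecimal. The following is a minimal sketch, written for this report rather than taken from the illumos source, that decodes the values quoted above. The helper names kmdb_hex() and irq_passes_assertion() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse a bare hex value as kmdb prints it ("14", "685"). */
long
kmdb_hex(const char *s)
{
	assert(s != NULL);
	return (strtol(s, NULL, 16));
}

/* The condition quoted in the panic message, applied to a decoded irq. */
int
irq_passes_assertion(long irq)
{
	return (irq < 16);
}
```

With these, kmdb_hex("14") is 20, which fails irq < 16, while kmdb_hex("9") is 9, which passes. kmdb_hex("685") is 1669, the line number in the panic message, and kmdb_hex("7fff") is 32767, the intr_index value in the log lines.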
Actions #1

Updated by Ilya Usvyatsky over 10 years ago

100% reproducible on VMware 9; does not occur on VMware 8.0 and earlier.

Actions #2

Updated by Hans Rosenfeld over 10 years ago

  • Status changed from New to In Progress
  • Assignee set to Hans Rosenfeld
  • % Done changed from 0 to 90

This problem is triggered by the HPET interrupt being at irq 0x14. The HPET and SCI interrupts are somewhat special: they are registered with add_avintr() before the apic or apix modules are loaded. Because of this, the apic and apix modules complete the remaining interrupt setup tasks for them when the apic or apix is initialized.

So at apix initialization, ioapix_init_intr() calls ioapix_setup_intr() for each of these two interrupts. ioapix_setup_intr() finds no entry in apic_irq_table for the interrupt, so it calls apix_alloc_intx(), which, among other things, allocates and initializes the apic_irq_table entry. This initialization unconditionally sets airq_mps_intr_index to DEFAULT_INDEX for free (and new) entries. Execution then reaches apic_record_rdt_entry() via apix_intx_set_vector(), where the assertion fires for the HPET because its irq (0x14) is not below 16. If the assertion is not triggered (as for irq 0x9 in the log above), control returns to ioapix_setup_intr(), which finishes setting up the newly allocated apic_irq_table entry and calls apic_record_rdt_entry() again. Because of the new settings in the apic_irq_table entry, this second call takes a different code path and writes a different airq_rdt_entry to the irq table entry than the first call did.
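
The sequence described above can be illustrated with a small user-space model. This is only a sketch and not the kernel code: the model_* names and the struct are hypothetical, the real apic_record_rdt_entry() does considerably more, and the value of ACPI_INDEX is an assumption inferred from the "intr_index=-3" line in the log.

```c
#include <assert.h>
#include <stddef.h>

#define	DEFAULT_INDEX	((short)0x7FFF)	/* matches airq_mps_intr_index = 0x7fff */
#define	ACPI_INDEX	((short)-3)	/* assumption: inferred from "intr_index=-3" */

enum model_path { PATH_DEFAULT, PATH_ACPI, PATH_ASSERT_FAIL };

typedef struct model_irq {
	short	mps_intr_index;
	int	origirq;
} model_irq_t;

/* Toy stand-in for apic_record_rdt_entry(): reports which path was taken. */
enum model_path
model_record_rdt_entry(const model_irq_t *irqp, int irq)
{
	assert(irqp != NULL);
	if (irqp->mps_intr_index == ACPI_INDEX)
		return (PATH_ACPI);
	/* Default path: only valid for irqs below 16. */
	return (irq < 16 ? PATH_DEFAULT : PATH_ASSERT_FAIL);
}

/*
 * Toy stand-in for the pre-fix ordering in ioapix_setup_intr().
 * Returns 0 on completion, -1 where the real kernel would panic.
 */
int
model_setup_intr_old(int irq, enum model_path *first, enum model_path *second)
{
	model_irq_t entry;

	assert(first != NULL && second != NULL);

	/* apix_alloc_intx(): a new entry always starts as DEFAULT_INDEX ... */
	entry.mps_intr_index = DEFAULT_INDEX;
	entry.origirq = irq;
	/* ... and reaches the record step via apix_intx_set_vector(). */
	*first = model_record_rdt_entry(&entry, irq);
	if (*first == PATH_ASSERT_FAIL)
		return (-1);

	/* Only now is the entry finished, then recorded a second time. */
	entry.mps_intr_index = ACPI_INDEX;
	*second = model_record_rdt_entry(&entry, irq);
	return (0);
}
```

In this model, irq 0x9 completes with two different paths (default first, ACPI second), mirroring the two log lines for irq 0x9, while irq 0x14 stops at the first call, mirroring the panic.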

To resolve this, we will allocate the apic_irq_table entry manually and move the call to apix_alloc_intx() down to after the initialization of the new entry. apix_alloc_intx() will then find an existing irq table entry with airq_mps_intr_index=ACPI_INDEX. It will select a CPU to bind this interrupt to, allocate a vector, and call apix_intx_set_vector(), which will call apic_record_rdt_entry(). The second call to apic_record_rdt_entry() in ioapix_setup_intr() can be removed, as the first call will already take the correct code path.
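
The reordering can be sketched in the same spirit. Again, this is a hypothetical user-space model and not the committed change: the model_* names, the table size, and the ACPI_INDEX value (inferred from the "intr_index=-3" log line) are assumptions, and CPU selection and vector allocation are left out.

```c
#include <assert.h>
#include <stddef.h>

#define	DEFAULT_INDEX	((short)0x7FFF)	/* matches airq_mps_intr_index = 0x7fff */
#define	ACPI_INDEX	((short)-3)	/* assumption: inferred from "intr_index=-3" */
#define	MODEL_NIRQ	32		/* arbitrary size for this model */

enum model_path { PATH_DEFAULT, PATH_ACPI, PATH_ASSERT_FAIL };

typedef struct model_irq {
	int	in_use;
	short	mps_intr_index;
} model_irq_t;

static model_irq_t model_table[MODEL_NIRQ];

/* Toy stand-in for apic_record_rdt_entry(): reports which path was taken. */
static enum model_path
model_record_rdt_entry(const model_irq_t *irqp, int irq)
{
	assert(irqp != NULL);
	if (irqp->mps_intr_index == ACPI_INDEX)
		return (PATH_ACPI);
	return (irq < 16 ? PATH_DEFAULT : PATH_ASSERT_FAIL);
}

/* Toy stand-in for apix_alloc_intx(): initializes only entries not yet in use. */
enum model_path
model_alloc_intx(int irq)
{
	model_irq_t *irqp;

	assert(irq >= 0 && irq < MODEL_NIRQ);
	irqp = &model_table[irq];
	if (!irqp->in_use) {
		irqp->in_use = 1;
		irqp->mps_intr_index = DEFAULT_INDEX;
	}
	/* CPU selection and vector allocation omitted. */
	return (model_record_rdt_entry(irqp, irq));
}

/* Fixed ordering: set up the entry first, then allocate; one record call. */
enum model_path
model_setup_intr_fixed(int irq)
{
	model_irq_t *irqp;

	assert(irq >= 0 && irq < MODEL_NIRQ);
	irqp = &model_table[irq];
	irqp->in_use = 1;
	irqp->mps_intr_index = ACPI_INDEX;
	return (model_alloc_intx(irq));
}
```

In this model, model_setup_intr_fixed(0x14) takes the ACPI path on its only record call, so the irq < 16 check is never reached for the HPET. Calling model_alloc_intx() directly on an unused entry at irq 16 or above still hits the check, which is the pre-fix behaviour.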

A webrev can be found here: http://cr.illumos.org/~webrev/hans/illumos-3426-webrev/

Actions #3

Updated by Gordon Ross over 10 years ago

  • Status changed from In Progress to Resolved

commit 584d084a45d320c86a541cf9072cccd91b4da17b
Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Date:   Wed Feb 27 21:13:42 2013 +0100

    3426 assertion failed: irq < 16 on VMware hardware version 9 (apix related)
    Reviewed by: Albert Lee <trisk@nexenta.com>
    Reviewed by: Dan McDonald <danmcd@nexenta.com>
    Reviewed by: Boris Protopopov <boris.protopopov@nexenta.com>
    Reviewed by: Ilya Usvyatsky <ilya.usvyatsky@nexenta.com>
    Reviewed by: Marcel Telka <marcel.telka@nexenta.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Rich Lowe <richlowe@richlowe.net>
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Approved by: Gordon Ross <gwr@nexenta.com>

:100644 100644 3342306... 04e49a6... M  usr/src/uts/i86pc/io/apix/apix_utils.c