Bug #1606

nwam problem after powerfail (with UPS)...

Added by Richard PALO about 8 years ago. Updated about 8 years ago.

Status: New
Priority: Normal
Assignee:
Category: OS/Net (Kernel and Userland)
Target version: -
Start date: 2011-10-06
Due date:
% Done: 0%
Estimated time:
Difficulty: Hard
Tags: Nwam

Description

This is a strange problem that I've seen before, and it persists in oi_151a.
After a network power failure (the computers themselves stay up on their UPS), nwam is totally confused when the network comes back. There is network access, but the icon in the top-right bar flips every 2 seconds or so between "Wired (nge0) connected" and "Wired (nge0) disconnected".

Screenshots are attached.

The ifconfig -a output looks strange as well (see the attached ifconfig.txt).

A reboot seems to be required, as 'pfexec svcadm restart nwam' doesn't stop it.


Files

Capture-18.png (18.8 KB), Richard PALO, 2011-10-06 03:11 PM
Capture-19.png (11.5 KB), Richard PALO, 2011-10-06 03:11 PM
ifconfig.txt (23 KB), Richard PALO, 2011-10-06 03:11 PM
messages.txt (3.25 MB), Richard PALO, 2011-10-09 05:28 AM

History

#1

Updated by Chris Jordan about 8 years ago

  • Category set to OS/Net (Kernel and Userland)
  • Status changed from New to Feedback
  • Tags changed from needs-triage to Nwam

What kind of NIC is this, and what is logged in /var/adm/messages when this occurs?

#2

Updated by Richard PALO about 8 years ago

Chris Jordan wrote:

What kind of NIC is this, and what is logged in /var/adm/messages when this occurs?

Hi, I've attached an extract of the messages file, including a snippet from the last boot prior to the problem.

The system is an Acer X3200, an Nvidia-based machine. The NIC is:
pci1025,157 (pciex10de,760), instance #0 (driver name: nge)

richard@x3200:~$ prtconf -dD
System Configuration:  Project OpenIndiana  i86pc
Memory size: 7935 Megabytes
System Peripherals (Software Nodes):

i86pc (driver name: rootnex)
    scsi_vhci, instance #0 (driver name: scsi_vhci)
    pci, instance #0 (driver name: npe)
        pci10de,cb84 (pciex10de,754)
        isa (pciex10de,75c), instance #1 (driver name: isa)
            motherboard
            pit_beep, instance #1 (driver name: pit_beep)
        pci1025,157 (pciex10de,752)
        pci10de,cb84 (pciex10de,751)
        pci1025,157 (pciex10de,753)
        pci10de,568 (pciex10de,568)
        pci10de,cb84 (pciex10de,77b), instance #2 (driver name: ohci)
            hub, instance #2 (driver name: hubd)
                device, instance #6 (driver name: usb_mid)
                    keyboard, instance #6 (driver name: hid)
                    mouse, instance #7 (driver name: hid)
        pci10de,cb84 (pciex10de,77c), instance #1 (driver name: ehci)
        pci10de,cb84 (pciex10de,77d), instance #3 (driver name: ohci)
        pci10de,cb84 (pciex10de,77e), instance #2 (driver name: ehci)
            storage, instance #1 (driver name: scsa2usb)
                disk, instance #5 (driver name: sd)
        pci-ide (pciex10de,759), instance #2 (driver name: pci-ide)
            ide (driver name: ata)
            ide (driver name: ata)
        pci1025,157 (pciex10de,774), instance #1 (driver name: audiohd)
        pci10de,75a (pciex10de,75a), instance #2 (driver name: pci_pci)
        pci1025,157 (pciex10de,ad4), instance #0 (driver name: ahci)
            disk, instance #3 (driver name: sd)
            cdrom, instance #4 (driver name: sd)
            disk, instance #6 (driver name: sd)
        pci1025,157 (pciex10de,760), instance #0 (driver name: nge)
        pci10de,569 (pciex10de,569), instance #3 (driver name: pci_pci)
            display (pci10de,84b), instance #0 (driver name: nvidia)
        pci10de,778 (pciex10de,778), instance #1 (driver name: pcieb)
            display (pciex10de,6e0), instance #1 (driver name: nvidia)
        pci10de,75b (pciex10de,75b) (driver name: pcieb)
        pci10de,77a (pciex10de,77a), instance #2 (driver name: pcieb)
            pci1025,157 (pciex1106,3403), instance #1 (driver name: hci1394)
        pci1022,1200 (pciex1022,1200)
        pci1022,1201 (pciex1022,1201)
        pci1022,1202 (pciex1022,1202)
        pci1022,1203 (pciex1022,1203)
        pci1022,1204 (pciex1022,1204)
    fw, instance #0 (driver name: acpinex)
        cpu, instance #0 (driver name: cpudrv)
        cpu, instance #1 (driver name: cpudrv)
        cpu, instance #2 (driver name: cpudrv)
        cpu, instance #3 (driver name: cpudrv)
        sb, instance #1 (driver name: acpinex)
    used-resources
    iscsi, instance #0 (driver name: iscsi)
    options, instance #0 (driver name: options)
    pseudo, instance #0 (driver name: pseudo)
    agpgart, instance #0 (driver name: agpgart)
    xsvc, instance #0 (driver name: xsvc)

#3

Updated by Chris Jordan about 8 years ago

  • Status changed from Feedback to New
  • Assignee set to OI illumos
  • Difficulty changed from Medium to Hard

Thanks. Looking at that messages file it looks like nwam sees the link come back up and tries to unplumb the nge0 interface, which for some reason fails. Whether nwam is somehow doing that wrong, or whether it's a problem with nge0 isn't clear to me, but after that nwam just keeps retrying and failing till you reboot. I'm assigning this to "OI illumos" since it appears to be a problem with nwam or with the nge driver.

#4

Updated by Richard PALO about 8 years ago

Great. By the way, I thought nge was supposed to use MSI, but here nge#0 is on a Fixed interrupt (IRQ 21, shared with ehci#1):

richard@x3200:~# echo ::interrupts -d | mdb -k
IRQ  Vect IPL Bus    Trg Type   CPU Share APIC/INT# Driver Name(s) 
1    0x41 5   ISA    Edg Fixed  3   1     0x0/0x1   i8042#1
9    0x80 9   PCI    Lvl Fixed  1   1     0x0/0x9   acpi_wrapper_isr
12   0x42 5   ISA    Edg Fixed  0   1     0x0/0xc   i8042#1
16   0x86 9   PCI    Lvl Fixed  3   2     0x0/0x10  nvidia#1, hci1394#1
20   0x82 9   PCI    Lvl Fixed  3   2     0x0/0x14  ohci#3, nvidia#0
21   0x83 9   PCI    Lvl Fixed  0   2     0x0/0x15  ehci#1, nge#0
22   0x84 9   PCI    Lvl Fixed  1   1     0x0/0x16  ehci#2
23   0x85 9   PCI    Lvl Fixed  2   1     0x0/0x17  ohci#2
24   0x40 5   PCI    Edg MSI    2   1     -         ahci#0
25   0x81 7   PCI    Edg MSI    1   1     -         pcieb#1
26   0x30 4   PCI    Edg MSI    2   1     -         pcieb#2
160  0xa0 0          Edg IPI    all 0     -         poke_cpu
208  0xd0 14         Edg IPI    all 1     -         kcpc_hw_overflow_intr
209  0xd1 14         Edg IPI    all 1     -         cbe_fire
210  0xd3 14         Edg IPI    all 1     -         cbe_fire
240  0xe0 15         Edg IPI    all 1     -         xc_serv
241  0xe1 15         Edg IPI    all 1     -         apic_error_intr
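
For anyone comparing against their own machine, here is a minimal Python sketch that pulls the interrupt type for a given driver out of `::interrupts -d` output. It assumes the column layout shown above (a Trg column of Edg/Lvl followed by Type, CPU, Share, APIC/INT# and the driver names); the function name `interrupt_types` and the `SAMPLE` constant are made up for illustration, and the sample rows are copied from the table above.

```python
# Rows copied from the ::interrupts -d output above (header plus three entries).
SAMPLE = """\
IRQ  Vect IPL Bus    Trg Type   CPU Share APIC/INT# Driver Name(s)
21   0x83 9   PCI    Lvl Fixed  0   2     0x0/0x15  ehci#1, nge#0
24   0x40 5   PCI    Edg MSI    2   1     -         ahci#0
160  0xa0 0          Edg IPI    all 0     -         poke_cpu
"""


def interrupt_types(mdb_output, driver):
    """Map each instance of `driver` (e.g. 'nge#0') to its interrupt Type column.

    Assumption: the Trg column is always 'Edg' or 'Lvl'. It is used as an
    anchor because the Bus column is empty on IPI rows, which shifts the
    field positions when splitting on whitespace.
    """
    found = {}
    for line in mdb_output.splitlines():
        fields = line.split()
        trg = next((i for i, f in enumerate(fields) if f in ("Edg", "Lvl")), None)
        if trg is None or len(fields) < trg + 6:
            continue  # header line or a row in an unexpected layout
        intr_type = fields[trg + 1]
        # Driver names start five columns after Trg and may be comma-separated.
        for name in (n.rstrip(",") for n in fields[trg + 5:]):
            if name.split("#")[0] == driver:
                found[name] = intr_type
    return found


print(interrupt_types(SAMPLE, "nge"))   # {'nge#0': 'Fixed'}
print(interrupt_types(SAMPLE, "ahci"))  # {'ahci#0': 'MSI'}
```

On a live system the text would come from saving the output of the mdb command above to a file and reading it in; the sketch only reports what the table says and does not explain why nge ended up on a Fixed interrupt.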

