Bug #1606
nwam problem after powerfail (with UPS)...
Description
A strange problem that I've seen before, and that persists in oi_151a.
After a network power failure (the computers stay up on their UPS), nwam is totally confused when the network comes back.
There is network access, but the icons in the top-right bar flip every 2 seconds or so between "Wired (nge0) connected" and "Wired (nge0) disconnected".
Screenshots supplied.
The ifconfig -a output looks strange as well.
A reboot seems to be required, as 'pfexec svcadm restart nwam' doesn't stop it.
Updated by Chris Jordan over 9 years ago
- Category set to OS/Net (Kernel and Userland)
- Status changed from New to Feedback
- Tags changed from needs-triage to Nwam
What kind of NIC is this, and what is logged in /var/adm/messages when this occurs?
Updated by Richard PALO over 9 years ago
- File messages.txt added
Chris Jordan wrote:
What kind of NIC is this, and what is logged in /var/adm/messages when this occurs?
Hi, I've attached an extract of the messages, including a snippet from the last boot prior to the problem.
The system is an Acer X3200 Nvidia based machine.
pci1025,157 (pciex10de,760), instance #0 (driver name: nge)
richard@x3200:~$ prtconf -dD
System Configuration: Project OpenIndiana i86pc
Memory size: 7935 Megabytes
System Peripherals (Software Nodes):

i86pc (driver name: rootnex)
scsi_vhci, instance #0 (driver name: scsi_vhci)
pci, instance #0 (driver name: npe)
pci10de,cb84 (pciex10de,754)
isa (pciex10de,75c), instance #1 (driver name: isa)
motherboard
pit_beep, instance #1 (driver name: pit_beep)
pci1025,157 (pciex10de,752)
pci10de,cb84 (pciex10de,751)
pci1025,157 (pciex10de,753)
pci10de,568 (pciex10de,568)
pci10de,cb84 (pciex10de,77b), instance #2 (driver name: ohci)
hub, instance #2 (driver name: hubd)
device, instance #6 (driver name: usb_mid)
keyboard, instance #6 (driver name: hid)
mouse, instance #7 (driver name: hid)
pci10de,cb84 (pciex10de,77c), instance #1 (driver name: ehci)
pci10de,cb84 (pciex10de,77d), instance #3 (driver name: ohci)
pci10de,cb84 (pciex10de,77e), instance #2 (driver name: ehci)
storage, instance #1 (driver name: scsa2usb)
disk, instance #5 (driver name: sd)
pci-ide (pciex10de,759), instance #2 (driver name: pci-ide)
ide (driver name: ata)
ide (driver name: ata)
pci1025,157 (pciex10de,774), instance #1 (driver name: audiohd)
pci10de,75a (pciex10de,75a), instance #2 (driver name: pci_pci)
pci1025,157 (pciex10de,ad4), instance #0 (driver name: ahci)
disk, instance #3 (driver name: sd)
cdrom, instance #4 (driver name: sd)
disk, instance #6 (driver name: sd)
pci1025,157 (pciex10de,760), instance #0 (driver name: nge)
pci10de,569 (pciex10de,569), instance #3 (driver name: pci_pci)
display (pci10de,84b), instance #0 (driver name: nvidia)
pci10de,778 (pciex10de,778), instance #1 (driver name: pcieb)
display (pciex10de,6e0), instance #1 (driver name: nvidia)
pci10de,75b (pciex10de,75b) (driver name: pcieb)
pci10de,77a (pciex10de,77a), instance #2 (driver name: pcieb)
pci1025,157 (pciex1106,3403), instance #1 (driver name: hci1394)
pci1022,1200 (pciex1022,1200)
pci1022,1201 (pciex1022,1201)
pci1022,1202 (pciex1022,1202)
pci1022,1203 (pciex1022,1203)
pci1022,1204 (pciex1022,1204)
fw, instance #0 (driver name: acpinex)
cpu, instance #0 (driver name: cpudrv)
cpu, instance #1 (driver name: cpudrv)
cpu, instance #2 (driver name: cpudrv)
cpu, instance #3 (driver name: cpudrv)
sb, instance #1 (driver name: acpinex)
used-resources
iscsi, instance #0 (driver name: iscsi)
options, instance #0 (driver name: options)
pseudo, instance #0 (driver name: pseudo)
agpgart, instance #0 (driver name: agpgart)
xsvc, instance #0 (driver name: xsvc)
Updated by Chris Jordan over 9 years ago
- Status changed from Feedback to New
- Assignee set to OI illumos
- Difficulty changed from Medium to Hard
Thanks. Looking at that messages file it looks like nwam sees the link come back up and tries to unplumb the nge0 interface, which for some reason fails. Whether nwam is somehow doing that wrong, or whether it's a problem with nge0 isn't clear to me, but after that nwam just keeps retrying and failing till you reboot. I'm assigning this to "OI illumos" since it appears to be a problem with nwam or with the nge driver.
Updated by Richard PALO over 9 years ago
Great. By the way, I thought nge was supposed to use MSI, but here it sits on a shared fixed interrupt (IRQ 21, together with ehci#1):
richard@x3200:~# echo ::interrupts -d | mdb -k
IRQ  Vect IPL Bus  Trg Type   CPU Share APIC/INT# Driver Name(s)
1    0x41 5   ISA  Edg Fixed  3   1     0x0/0x1   i8042#1
9    0x80 9   PCI  Lvl Fixed  1   1     0x0/0x9   acpi_wrapper_isr
12   0x42 5   ISA  Edg Fixed  0   1     0x0/0xc   i8042#1
16   0x86 9   PCI  Lvl Fixed  3   2     0x0/0x10  nvidia#1, hci1394#1
20   0x82 9   PCI  Lvl Fixed  3   2     0x0/0x14  ohci#3, nvidia#0
21   0x83 9   PCI  Lvl Fixed  0   2     0x0/0x15  ehci#1, nge#0
22   0x84 9   PCI  Lvl Fixed  1   1     0x0/0x16  ehci#2
23   0x85 9   PCI  Lvl Fixed  2   1     0x0/0x17  ohci#2
24   0x40 5   PCI  Edg MSI    2   1     -         ahci#0
25   0x81 7   PCI  Edg MSI    1   1     -         pcieb#1
26   0x30 4   PCI  Edg MSI    2   1     -         pcieb#2
160  0xa0 0        Edg IPI    all 0     -         poke_cpu
208  0xd0 14       Edg IPI    all 1     -         kcpc_hw_overflow_intr
209  0xd1 14       Edg IPI    all 1     -         cbe_fire
210  0xd3 14       Edg IPI    all 1     -         cbe_fire
240  0xe0 15       Edg IPI    all 1     -         xc_serv
241  0xe1 15       Edg IPI    all 1     -         apic_error_intr
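For anyone wanting to check the same thing on another machine, here is a minimal sketch of a filter that picks one driver's rows out of the ::interrupts listing and prints its interrupt type. The function name intr_type_for is hypothetical (it is not part of mdb), and the sketch assumes the column layout shown above, where the Type field is one of Fixed, MSI, MSI-X or IPI and the driver names follow it.

```shell
#!/bin/sh
# Sketch only: intr_type_for is a hypothetical helper, not an mdb feature.
# Reads `::interrupts -d` style output on stdin and prints one line per
# interrupt that the given driver is attached to.
intr_type_for() {
    drv="$1"
    awk -v drv="$drv" '
    {
        # Locate the Type column; lines without one (header, prompt) are skipped.
        type = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "Fixed" || $i == "MSI" || $i == "MSI-X" || $i == "IPI") {
                type = $i
                break
            }
        }
        if (type == "") next

        # Fields after Type include the driver list, e.g. "ehci#1, nge#0".
        for (j = i + 1; j <= NF; j++) {
            name = $j
            sub(/,$/, "", name)              # drop the list separator
            if (index(name, drv "#") == 1)   # match "<drv>#<instance>" only
                print "IRQ " $1 ": " name " uses " type
        }
    }'
}

# Possible usage on a live system, as root like in the listing above:
#   echo ::interrupts -d | mdb -k | intr_type_for nge
```

Matching on the driver name followed by "#" keeps a query for one driver from also matching another whose name merely contains it.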