Bug #3126
openoi_151a5 ixgbe driver
0%
Description
I upgraded my OI151 to OI151a5. After I rebooted into my new BE, everything worked except the Intel 10GE card. The switch showed the port it was connected to as "UP". ifconfig -a showed the interfaces with their usual IP addresses.
dladm show-phys showed everything as ok...
Everything seemed OK, except that it didn't work. Not even a ping to or from that interface got through.
Files
Updated by Milan Jurik about 11 years ago
- Status changed from New to Feedback
- Priority changed from Normal to High
- Tags deleted (needs-triage)
Could you identify which prestable release shows this regression? There were several rounds of ixgbe updates among 5 prestable releases.
Also, it would be nice if you could give more info about your card, e.g. from scanpci output.
How do you configure your IP address? Static configuration or dhcp?
Any suspicious values in kstat (-m ixgbe) ?
Updated by György Pásztor about 11 years ago
The "before state":
root@thumper:~# uname -a
SunOS thumper 5.11 oi_151a i86pc i386 i86pc Solaris
The "not working state": oi_151a5
I deleted the non-working BE, so this is just from memory:
I assume the image update updated the whole old system (which was released 14th Sep. 2011) to oi_151a5 (which was released 02nd Jul. 2012).
scanpci -v :
http://pastebin.com/dBiZGYFV
kstat -m ixgbe:
http://pastebin.com/iCZMnjXr
I think another kstat from the non-working kernel/system would be useful.
Please be a little patient, because it's a production storage server. I'll try to request a maintenance window ASAP to get that data too.
Updated by György Pásztor about 11 years ago
I almost forgot the IP data. It's static, since it's a storage server with production data. (I trust in a devel version of OI more than Oracle's anything ;-) )
root@thumper:~# grep . /etc/hostname.*
/etc/hostname.ixgbe0:10.10.1.1
/etc/hostname.ixgbe1:10.11.1.1
/etc/hostname.nge0:10.1.7.16
/etc/hostname.nge1:10.0.1.10
/etc/hostname.nge2:10.2.0.2
root@thumper:~# grep ^10\. /etc/netmasks
10.11.0.0 255.255.0.0
10.10.0.0 255.255.0.0
10.1.0.0 255.255.0.0
10.0.0.0 255.255.0.0
10.2.0.0 255.255.0.0
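For readers checking the static configuration, here is a minimal Python sketch that matches each address from /etc/hostname.* against the entries from /etc/netmasks. The addresses and masks are copied from the output above; the function name and dictionaries are only illustrative and not part of any OpenIndiana tool.

```python
import ipaddress

# Values copied from the grep output above.
HOSTNAMES = {
    "ixgbe0": "10.10.1.1",
    "ixgbe1": "10.11.1.1",
    "nge0": "10.1.7.16",
    "nge1": "10.0.1.10",
    "nge2": "10.2.0.2",
}
NETMASKS = {
    "10.11.0.0": "255.255.0.0",
    "10.10.0.0": "255.255.0.0",
    "10.1.0.0": "255.255.0.0",
    "10.0.0.0": "255.255.0.0",
    "10.2.0.0": "255.255.0.0",
}


def interface_networks(hostnames, netmasks):
    """Map each interface to the /etc/netmasks network containing its address."""
    nets = [ipaddress.ip_network(f"{net}/{mask}") for net, mask in netmasks.items()]
    result = {}
    for ifname, addr in hostnames.items():
        ip = ipaddress.ip_address(addr)
        # First matching network, or None if no entry covers the address.
        result[ifname] = next((str(n) for n in nets if ip in n), None)
    return result


if __name__ == "__main__":
    for ifname, net in interface_networks(HOSTNAMES, NETMASKS).items():
        print(ifname, net)
```

With these values ixgbe0 lands in 10.10.0.0/16, which is also the network of the 10.10.3.1 address pinged later in this report.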
Updated by Dan McDonald about 11 years ago
György Pásztor wrote:
scanpci -v :
http://pastebin.com/dBiZGYFV
This shows an Intel X520. I looked at the diffs for X540 support -- I didn't see any changes in these codepaths. Can you snoop on the interface with the most recent version?
Updated by György Pásztor about 11 years ago
Do you need any specific thing, or I should just do: snoop -rv -s 9000 -I ixgbe0 or snoop on datalink layer, or should I snoop into a file? (Or all of them? :D)
Snoop will arrive tomorrow with the new kstat snippets, if I can get a maintenance window... ;-)
Updated by Dan McDonald about 11 years ago
György Pásztor wrote:
Do you need any specific thing, or I should just do: snoop -rv -s 9000 -I ixgbe0 or snoop on datalink layer, or should I snoop into a file? (Or all of them? :D)
Snoop will arrive tomorrow with the new kstat snippets, if I can get a maintenance window... ;-)
My first test is always to run "snoop -d ixgbe0 -o test-file". That'll test the data link first. Also, differential kstats (kstat ; ping a bit ; kstat, diff the two kstat outputs) can help a lot. Include "ip" in those kstats as well.
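A minimal Python sketch of the differential-kstat step suggested above, assuming the default kstat text output (a "module:/instance:" line, a "name:/class:" line, then name/value rows). The function names and the sample snapshots are illustrative; only the two link_xoff_xmitd values are taken from this report, and the "statistics" kstat name is an assumption.

```python
def parse_kstat(text):
    """Collect integer counters keyed by (module, instance, name, statistic)."""
    counters = {}
    module = instance = name = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0] == "module:" and fields[2] == "instance:":
            module, instance = fields[1], fields[3]
        elif len(fields) >= 2 and fields[0] == "name:":
            name = fields[1]
        elif len(fields) == 2 and fields[1].isdigit():
            # Non-integer values such as crtime/snaptime are skipped.
            counters[(module, instance, name, fields[0])] = int(fields[1])
    return counters


def kstat_delta(before, after):
    """Return only the counters whose value changed between two snapshots."""
    a, b = parse_kstat(before), parse_kstat(after)
    return {k: b[k] - a[k] for k in a.keys() & b.keys() if b[k] != a[k]}


# Made-up sample snapshots in the assumed layout.
BEFORE = """module: ixgbe                           instance: 0
name:   statistics                      class:    net
        crtime                          63.518
        link_xoff_xmitd                 92131
"""
AFTER = BEFORE.replace("92131", "156311")

if __name__ == "__main__":
    for key, diff in sorted(kstat_delta(BEFORE, AFTER).items()):
        print(*key, diff)
```

In practice the two inputs would be the saved outputs of "kstat -m ixgbe" (and "kstat -m ip") taken before and after the ping run.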
Updated by György Pásztor about 11 years ago
Revision:
root@thumper:~# uname -a
SunOS thumper 5.11 oi_151a5 i86pc i386 i86pc Solaris
Shell#1: snoop -d ixgbe -o test-file.0 > wait for not alive
Shell#2: ping 10.10.3.1 --
Shell#1: kstat -m ixgbe > wait for not alive
http://pastebin.com/yHc3FWzV
Shell#1: snoop -d ixgbe -o test-file.1
Shell#2: ping 10.10.3.1 --
Shell#1: kstat -m ixgbe > wait for not alive
http://pastebin.com/UvMLH4tS
Shell#1: snoop -I ixgbe0 -o test-file.2
Shell#2: ping 10.10.3.1 --
The test files, after gzip test-file.*:
pasztor@io:~/Sun$ cat test-file.0.gz | uuencode -m /dev/stdout
begin-base64 644 /dev/stdout
H4sICLflPFAAA3Rlc3QtZmlsZS4wACvOy88vYAADJiBmAWItKPYACQbYPK5h
4CkQ/Q8GDNKKvhcqONgYGDkY2FgYGCF8Li5GRog8FxczIwMMoJtTy8By9QFV
zOEp16CCOXUMPBWqVDCnHuivX1Qxh6f8GhXMaWDgKTtIBXMaGVj+bKaKOTxz
d1LBnCYGnnlHqWBOM9Bf56liDs+Rb1Qwp4WB51gvFcxpZWDVnUYVc3he76eC
OW0MPG9cqGBOOwNrzEaqmMPzmhrlWAcDrzA1yo1OBtZCZ6qYwyvkRAVzuoDm
TKeCOd1Af1lQxRxeSzMqmNPDwGt3ggrm9DKwLmCgijm8NrxUMKePgdeFGumw
n4F13jyqmMNr8ZhUcwCk/+kkgAgAAA==
====
pasztor@io:~/Sun$ cat test-file.1.gz | uuencode -m /dev/stdout
begin-base64 644 /dev/stdout
H4sICLflPFAAA3Rlc3QtZmlsZS4xAK3SPUvCURgF8HOPIvIQUvSevVlW0GJN
IbS1aBQ2ONR3KP9WCjYEDa0lBK5BRB8gyk1saGxxaHayj9Bs/7SmQAzOgTPc
e+HHA889zXleHp3Qb9Dv6k9T35d7m60R8KLZ7gTR2G6jFA7BhREKwnXPZs51
380CDr/569jwlcAZBY8/Bc4YWHiVODZ0JnDG4TL1fh3Xy6H3IphnAsw3JI5t
3AucSbByInCmwFxb4tjgvsCJgkcDAmcaPHyTOBbp+x/2cGbA4qXAmQVvkhLH
Ds4Fzhz4XhI48+DtmsSx9TuBEwPLaYGzAF6XJY4ldgTOoj/PlsCJg8+KfcVh
2ZrAWQKftgXOMlgNSBzLPgicFfDx47/OF11YMMnICAAA
====
pasztor@io:~/Sun$ cat test-file.2.gz | uuencode -m /dev/stdout
begin-base64 644 /dev/stdout
H4sICLflPFAAA3Rlc3QtZmlsZS4yAK3US0iUURQH8PONTTOe8TWjpT00q+nh
q3LssRGyxSzKzUgPcleLFoI0QerM+BhHHbVliyAKjIJKIqhMS6001KahCKFQ
x0ppRplNElG0qJWe7zBwxcVdxP3D4XIPl8tvce69fNHtvgQcQwDASGttoq7o
TVd5vA7whF0zgEbbU/o5SGSZ4qSe9qEClrXzYURNQ0zSzFBVbaoAqIuXu/6V
IJiT0ZKSmpaeYbVlZm3YmJ2zafOWrbl52/K379hp37V7z96CwqLikn37D5Q6
yg4eOnwkcf1aRz2g0yl1TLIjJBx/dIdWT47TxcocDYAnf0kdH9kxLhy9ZeQw
NJDjTpUyhwfwzJLU8YkdY8Ix4SBHkoccQXUOL2BNSOqYYsdr4ZjMJcc6Lzm8
F5Q5fICu+1LHNDtGheNHKTmMPnKcdSlzNJLjmdQxw44R4TDqjvWN5DiqztEE
eNUtdUTY8Uo4ImZymJrIUetR5mgG7Psrdcyy46VwFMTIYW4mR84tZY4WwJ5x
qeMzO14IRzhKjuQWcvh6lDn8gG9tUscXdgwLx8MhcqCfHNfeKHO0Ar7Lkzq+
smNIOK4PksPSSo5edY4A4PcRqWOOHYPCUXmTHCkBcvyOKnO0Ac74pY55djwX
Dv9dcqS2kWN0WpmjHSzWY1LHN3YMCEf2Y3KktZPD/l6ZowMwfE/qiLKjXziq
B8iR3kGO2IQyRxAs8EDqiLHjqXA4dEdGkBwmdY5OwPl+qWOBHX3CcfwGOayd
5Pg5p8zRBThbI3UssuOJcJy7TQ5bFzkiU8oc3YChYakjzo5Hq96tPqeZ3eRY
+s//YwUINW0KsAoAAA==
====
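To turn one of the "uuencode -m" blocks above back into a capture file, something like the following Python sketch should work. It assumes the paste contains a single "begin-base64 <mode> <name>" header with no spaces in the name, a "====" trailer, and a gzip-compressed snoop capture whose header follows RFC 1761 (8-byte "snoop" magic, then big-endian version and datalink type). The function names are illustrative.

```python
import base64
import gzip
import struct

SNOOP_MAGIC = b"snoop\x00\x00\x00"


def decode_begin_base64(text):
    """Return the raw bytes between 'begin-base64 <mode> <name>' and '===='."""
    tokens = text.split()  # also works if line breaks were flattened to spaces
    start = tokens.index("begin-base64") + 3  # skip keyword, mode, file name
    end = tokens.index("====", start)
    return base64.b64decode("".join(tokens[start:end]))


def snoop_header(gz_bytes):
    """Gunzip a capture and return (version, datalink_type) from its header."""
    data = gzip.decompress(gz_bytes)
    magic, version, datalink = struct.unpack(">8sII", data[:16])
    if magic != SNOOP_MAGIC:
        raise ValueError("not a snoop capture file")
    return version, datalink


if __name__ == "__main__":
    import sys

    # Usage: paste one uuencode block on stdin.
    print(snoop_header(decode_begin_base64(sys.stdin.read())))
```

The decompressed bytes can also be written to disk and read back with "snoop -i <file>".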
Finally, a dmesg:
http://pastebin.com/k2cgSJ2V
Updated by György Pásztor about 11 years ago
- File test-file.0.gz test-file.0.gz added
- File test-file.1.gz test-file.1.gz added
- File test-file.2.gz test-file.2.gz added
- File kstat0.txt kstat0.txt added
- File kstat1.txt kstat1.txt added
Hmm... I've just noticed, I can upload files. :D
Updated by Dan McDonald almost 11 years ago
Sorry for taking so long on the followup.
Your most interesting datapoint comes from the kstats:
108c108
< link_xoff_xmitd 92131
---
> link_xoff_xmitd 156311
The link itself is off.
Can you show the output of: "dladm show-link" and "dladm show-ether" as well? Bonus points if you can show these from the early, working, version.
Updated by György Pásztor almost 11 years ago
- File oi151a_working_kstat.txt oi151a_working_kstat.txt added
- File oi151a_working_dladm_ipadm.txt oi151a_working_dladm_ipadm.txt added
- File oi151a7_nworking_kstat.txt oi151a7_nworking_kstat.txt added
- File oi151a7_nworking_dladm_ipadm.txt oi151a7_nworking_dladm_ipadm.txt added
Sorry too for the long wait...
But I can't get a maintenance window just anytime...
So, let's see the results. I think the filenames are good enough that I don't need to say more ;-)
The new try was on 151a7, not on a5. I don't know if that counts or not...
Updated by Dan McDonald almost 11 years ago
György Pásztor wrote:
The new try was on 151a7, not on a5. I don't know if that counts or not...
That counts just fine.
Hmmm. For some reason the newer ixgbe doesn't set your card up properly. I don't suppose "grep ixgbe /var/adm/messages" shows anything unusual, does it?
Updated by György Pásztor almost 11 years ago
- File oi151a7_nworking_admmsg.txt oi151a7_nworking_admmsg.txt added
- File oi151a_working_admmsg.txt oi151a_working_admmsg.txt added
openindiana is the currently working BE, with oi151a, and openindiana-1 is the freshly created oi151a7 BE.
root@thumper:~# beadm mount openindiana-1
root@thumper:~# cd /tmp/tmp.YQa4RB/
root@thumper:/tmp/tmp.YQa4RB# tail -5000 var/adm/messages |grep Oct\ 27 >/tmp/oi151a7_nworking_admmsg.txt
root@thumper:/tmp/tmp.YQa4RB# cd
root@thumper:~# beadm umount openindiana-1
root@thumper:~# tail -5000 /var/adm/messages |grep Oct\ 27 >/tmp/oi151a_working_admmsg.txt
The tail was necessary because I forgot to install logrotate, and now pkg is only willing to install it into a new BE, where it updates other things too... (among other things, that's why we want to upgrade)
Updated by Kevin Crowe about 10 years ago
Anyone know if this might be related to flowcontrol problems in ixgbe on X520 HW? If so, see bug 4063 I just submitted.