Bug #14174
pcieadm needs to handle v1 pcie cap better
Status: Closed
% Done: 100%
Description
When running pcieadm against an instance of e1000g, we end up aborting on an internal assertion, which at least means we do not read random memory:
$ /usr/lib/pci/pcieadm show-cfgspace -f ./e1000g.out
Device ./e1000g.out -- Type 0 Header
Vendor ID: 0x8086 -- Intel Corporation
Device ID: 0x10d3 -- 82574L Gigabit Network Connection
Command: 0x47
  |--> I/O Space: enabled (0x1)
  |--> Memory Space: enabled (0x2)
...
Slot Status: 0x0
  |--> Attention Button Pressed: no (0x0)
  |--> Power Fault Detected: no (0x0)
  |--> MRL Sensor Changed: no (0x0)
  |--> Presence Detect Changed: no (0x0)
  |--> Command Complete: no (0x0)
  |--> MRL Sensor State: closed (0x0)
  |--> Presence Detect State: not present (0x0)
  |--> Electromechanical Interlock: disengaged (0x0)
  |--> Data Link Layer State Changed: no (0x0)
Root control: 0x0
  |--> CRS Software Visibility: disabled (0x0)
Root Capabilities: 0x0
  |--> System Error on Correctable Error: disabled (0x0)
  |--> System Error on Non-Fatal Error: disabled (0x0)
  |--> System Error on Fatal Error: disabled (0x0)
  |--> PME Interrupt: disabled (0x0)
  |--> CRS Software Visibility: disabled (0x0)
assertion failed for thread 0xfffffc7fef372a40, thread-id 1: print->pcp_off + print->pcp_len + walkp->pcw_capoff <= walkp->pcw_valid (0x104 <= 0x100), file pcieadm_cfgspace.c, line 440
Abort
Here we ended up trying to read right at the border between basic and extended configuration space (offset 0x100). Capabilities aren't supposed to cross that boundary. Of note, this device is old enough that it implements version 1 of the PCIe capability. While the v2 capability requires devices to hardwire to zero any registers they don't implement, this was not the case for the v1 capability: devices were allowed to expose a shortened version of it.
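The arithmetic behind the failed assertion can be sketched as follows. This is not the pcieadm source: the helper name is hypothetical, and the capability offset of 0xe0 is an assumption. It is merely consistent with the reported 0x104 if the field being printed was a 4-byte register at offset 0x20 of the PCIe capability (where Root Status lives). The parameters mirror the pcw_capoff, pcp_off, pcp_len and pcw_valid names from the assertion message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of basic (PCI-compatible) configuration space. */
#define	CFG_BASIC_SIZE	0x100

/*
 * Hypothetical helper, not taken from pcieadm: returns true if a field
 * of `len` bytes at offset `off` within a capability starting at
 * `capoff` lies entirely inside the first `valid` bytes of
 * configuration space.
 */
bool
field_in_bounds(uint32_t capoff, uint32_t off, uint32_t len, uint32_t valid)
{
	return (capoff + off + len <= valid);
}
```

Under the assumed offsets, `field_in_bounds(0xe0, 0x20, 4, CFG_BASIC_SIZE)` compares 0x104 against 0x100 and fails, while a 2-byte Link Status field at offset 0x12 of the same capability ends at 0xf4 and passes.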
The solution here is to make sure that when we parse a v1 PCIe capability we also look at the device type and the slot implemented bit to decide which registers are actually present. With this in hand, we successfully parse the rest of the capability and then find the serial number capability afterwards. For example, for this device we now see:
$ ./pcieadm show-cfgspace -f ./e1000g.out
Device ./e1000g.out -- Type 0 Header
Vendor ID: 0x8086 -- Intel Corporation
Device ID: 0x10d3 -- 82574L Gigabit Network Connection
Command: 0x47
  |--> I/O Space: enabled (0x1)
  |--> Memory Space: enabled (0x2)
  |--> Bus Master: enabled (0x4)
  |--> Special Cycle: disabled (0x0)
  |--> Memory Write and Invalidate: disabled (0x0)
  |--> VGA Palette Snoop: disabled (0x0)
  |--> Parity Error Response: enabled (0x40)
  |--> IDSEL Stepping/Wait Cycle Control: unsupported (0x0)
  |--> SERR# Enable: disabled (0x0)
  |--> Fast Back-to-Back Transactions: disabled (0x0)
  |--> Interrupt X: enabled (0x0)
Status: 0x10
  |--> Immediate Readiness: unsupported (0x0)
  |--> Interrupt Status: not pending (0x0)
  |--> Capabilities List: supported (0x10)
  |--> 66 MHz Capable: unsupported (0x0)
  |--> Fast Back-to-Back Capable: unsupported (0x0)
  |--> Master Data Parity Error: no error (0x0)
  |--> DEVSEL# Timing: fast (0x0)
  |--> Signaled Target Abort: no (0x0)
  |--> Received Target Abort: no (0x0)
  |--> Received Master Abort: no (0x0)
  |--> Signaled System Error: no (0x0)
  |--> Detected Parity Error: no (0x0)
Revision ID: 0x0
Class Code: 0x20000
  |--> Class Code: 0x2
  |--> Sub-Class Code: 0x0
  |--> Programming Interface: 0x0
Cache Line Size: 0x40 bytes
Latency Timer: 0x0 cycles
Header Type: 0x0
  |--> Header Layout: Device (0x0)
  |--> Multi-Function Device: no (0x0)
BIST: 0x0
  |--> Completion Code: 0x0
  |--> Start BIST: 0x0
  |--> BIST Capable: unsupported (0x0)
Base Address Register 0
  |--> Space: Memory Space (0x0)
  |--> Address: 0xf44c0000
  |--> Address: 32-bit (0x0)
  |--> Prefetchable: no (0x0)
Base Address Register 1
  |--> Space: Memory Space (0x0)
  |--> Address: 0xf4400000
  |--> Address: 32-bit (0x0)
  |--> Prefetchable: no (0x0)
Base Address Register 2
  |--> Space: I/O Space (0x1)
  |--> Address: 0x5000
Base Address Register 3
  |--> Space: Memory Space (0x0)
  |--> Address: 0xf44e0000
  |--> Address: 32-bit (0x0)
  |--> Prefetchable: no (0x0)
Base Address Register 4
  |--> Space: Memory Space (0x0)
  |--> Address: 0x0
  |--> Address: 32-bit (0x0)
  |--> Prefetchable: no (0x0)
Base Address Register 5
  |--> Space: Memory Space (0x0)
  |--> Address: 0x0
  |--> Address: 32-bit (0x0)
  |--> Prefetchable: no (0x0)
Cardbus CIS Pointer: 0x0
Subsystem Vendor ID: 0x8086 -- Intel Corporation
Subsystem Device ID: 0xa01f -- Gigabit CT Desktop Adapter
Expansion ROM: 0xf4480000
  |--> Enable: disabled (0x0)
  |--> Base Address: 0x3d120000000
Capabilities Pointer: 0xc8
Interrupt Line: 0xa
Interrupt Pin: 0x1 -- INTA
Min_Gnt: 0x0
Min_Lat: 0x0
PCI Power Management Capability (0x1)
Power Management Capabilities: 0xc822
  |--> Version: 0x2
  |--> PME Clock: not required (0x0)
  |--> Immediate Readiness on Return to D0: no (0x0)
  |--> Device Specific Initialization: yes (0x20)
  |--> Auxiliary Current: 0 (0x0)
  |--> D1: unsupported (0x0)
  |--> D2: unsupported (0x0)
  |--> PME Support: 0xc800
  |--> D0 (0x800)
  |--> D3hot (0x4000)
  |--> D3cold (0x8000)
Message Signaled Interrupts Capability (0x5)
Message Control: 0x81
  |--> MSI Enable: enabled (0x1)
  |--> Multiple Message Capable: 1 vector (0x0)
  |--> Multiple Message Enabled: 1 vector (0x0)
  |--> 64-bit Address Capable: supported (0x80)
  |--> Per-Vector Masking Capable: unsupported (0x0)
  |--> Extended Message Data Capable: unsupported (0x0)
  |--> extended Message Data Enable: unsupported (0x0)
Message Address: 0xfee3e000
Upper Message Address: 0x0
Message Data: 0x20
PCI Express Capability (0x10)
Capability Register: 0x1
  |--> Version: 0x1
  |--> Device/Port Type: PCIe Endpoint (0x0)
  |--> Slot Implemented: No (0x0)
  |--> Interrupt Message Number: 0x0
Device Capabilities: 0x8cc1
  |--> Max Payload Size Supported: 256 bytes (0x1)
  |--> Phantom Functions Supported: No (0x0)
  |--> Extended Tag Field: 5-bit (0x0)
  |--> L0s Acceptable Latency: 512 ns (0xc0)
  |--> L1 Acceptable Latency: 64 us (0xc00)
  |--> Role Based Error Reporting: supported (0x8000)
  |--> ERR_COR Subclass: unsupported (0x0)
  |--> Captured Slot Power Limit: 0x0
  |--> Captured Slot Power Limit Scale: 1.0x (0x0)
  |--> Function Level Reset: unsupported (0x0)
Device Status: 0x10
  |--> Correctable Error Detected: no (0x0)
  |--> Non-Fatal Error Detected: no (0x0)
  |--> Fatal Error Detected: no (0x0)
  |--> Unsupported Request Detected: no (0x0)
  |--> AUX Power Detected: yes (0x10)
  |--> Transactions Pending: no (0x0)
  |--> Emergency Power Reduction Detected: no (0x0)
Link Capabilities: 0x31c11
  |--> Maximum Link Speed: 2.5 GT/s (0x1)
  |--> Maximum Link Width: 0x1
  |--> ASPM Support: L0s/L1 (0xc00)
  |--> L0s Exit Latency: 64-128ns (0x1000)
  |--> L1 Exit Latency: >64us (0x30000)
  |--> Clock Power Management: unsupported (0x0)
  |--> Surprise Down Error Reporting: unsupported (0x0)
  |--> Data Link Layer Active Reporting: unsupported (0x0)
  |--> Link Bandwidth Notification Capability: unsupported (0x0)
  |--> ASPM Optionality Compliance: not compliant (0x0)
  |--> Port Number: 0x0
Link Control: 0x40
  |--> ASPM Control: None (0x0)
  |--> Read Completion Boundary: 64 byte (0x0)
  |--> Link Disable: not force disabled (0x0)
  |--> Retrain Link: 0x0
  |--> Common Clock Configuration: common (0x40)
  |--> Extended Sync: 0x40e434
  |--> Clock Power Management: 0x40e434
  |--> Hardware Autonomous Width: 0x40e422
  |--> Link Bandwidth Management Interrupt: 0x40e434
  |--> Link Autonomous Bandwidth Interrupt: 0x40e434
  |--> DRS Signaling Control: 0x40f58b
Link Status: 0x1011
  |--> Link Speed: 2.5 GT/s (0x1)
  |--> Link Width: 0x1
  |--> Link Training: no (0x0)
  |--> Slot Clock Configuration: common (0x1000)
  |--> Data Link Layer Link Active: no (0x0)
  |--> Link Bandwidth Management Status: no change (0x0)
  |--> Link Autonomous Bandwidth Status: no change (0x0)
MSI-X Capability (0x11)
Control Register: 0x4
  |--> Table Size: 0x5
  |--> Function Mask: unmasked (0x0)
  |--> MSI-X Enable: disabled (0x0)
Table Offset: 0x3
  |--> Table BIR: BAR 3 (0x3)
  |--> Table Offset: 0x0
PBA Offset: 0x2003
  |--> PBA BIR: BAR 3 (0x3)
  |--> PBA Offset: 0x2000
Advanced Error Reporting Capability (0x1)
Capability Header: 0x14010001
  |--> Capability ID: 0x1
  |--> Capability Version: 0x1
  |--> Next Capability Offset: 0x140
Uncorrectable Error Status: 0x0
  |--> Data Link Protocol Error: 0x0
  |--> Surprise Down Error: 0x0
  |--> Poisoned TLP Received: 0x0
  |--> Flow Control Protocol Error: 0x0
  |--> Completion Timeout: 0x0
  |--> Completion Abort: 0x0
  |--> Unexpected Completion: 0x0
  |--> Receiver Overflow: 0x0
  |--> Malformed TLP: 0x0
  |--> ECRC Error: 0x0
  |--> Unsupported Request Error: 0x0
  |--> ACS Violation: 0x0
  |--> Uncorrectable Internal Error: 0x0
  |--> MC Blocked TLP: 0x0
  |--> AtomicOp Egress Blocked: 0x0
  |--> TLP Prefix Blocked Error: 0x0
  |--> Poisoned TLP Egress Blocked: 0x0
Uncorrectable Error Mask: 0x100000
  |--> Data Link Protocol Error: 0x0
  |--> Surprise Down Error: 0x0
  |--> Poisoned TLP Received: 0x0
  |--> Flow Control Protocol Error: 0x0
  |--> Completion Timeout: 0x0
  |--> Completion Abort: 0x0
  |--> Unexpected Completion: 0x0
  |--> Receiver Overflow: 0x0
  |--> Malformed TLP: 0x0
  |--> ECRC Error: 0x0
  |--> Unsupported Request Error: 0x1
  |--> ACS Violation: 0x0
  |--> Uncorrectable Internal Error: 0x0
  |--> MC Blocked TLP: 0x0
  |--> AtomicOp Egress Blocked: 0x0
  |--> TLP Prefix Blocked Error: 0x0
  |--> Poisoned TLP Egress Blocked: 0x0
Uncorrectable Error Severity: 0x62011
  |--> Data Link Protocol Error: 0x1
  |--> Surprise Down Error: 0x0
  |--> Poisoned TLP Received: 0x0
  |--> Flow Control Protocol Error: 0x1
  |--> Completion Timeout: 0x0
  |--> Completion Abort: 0x0
  |--> Unexpected Completion: 0x0
  |--> Receiver Overflow: 0x1
  |--> Malformed TLP: 0x1
  |--> ECRC Error: 0x0
  |--> Unsupported Request Error: 0x0
  |--> ACS Violation: 0x0
  |--> Uncorrectable Internal Error: 0x0
  |--> MC Blocked TLP: 0x0
  |--> AtomicOp Egress Blocked: 0x0
  |--> TLP Prefix Blocked Error: 0x0
  |--> Poisoned TLP Egress Blocked: 0x0
Correctable Error Status: 0x0
  |--> Receiver Error: 0x0
  |--> Bad TLP: 0x0
  |--> Bad DLLP: 0x0
  |--> REPLAY_NUM Rollover: 0x0
  |--> Replay timer Timeout: 0x0
  |--> Advisory Non-Fatal Error: 0x0
  |--> Correctable Internal Error: 0x0
  |--> Header Log Overflow: 0x0
Correctable Error Mask: 0x0
  |--> Receiver Error: 0x0
  |--> Bad TLP: 0x0
  |--> Bad DLLP: 0x0
  |--> REPLAY_NUM Rollover: 0x0
  |--> Replay timer Timeout: 0x0
  |--> Advisory Non-Fatal Error: 0x0
  |--> Correctable Internal Error: 0x0
  |--> Header Log Overflow: 0x0
Advanced Error Capabilities and Control: 0x0
  |--> First Error Pointer: 0x0
  |--> ECRC Generation Capable: unsupported (0x0)
  |--> ECRC Generation Enable: disabled (0x0)
  |--> ECRC Check Capable: unsupported (0x0)
  |--> ECRC Check Enable: disabled (0x0)
Header Log 0: 0x0
Header Log 1: 0x0
Header Log 2: 0x0
Header Log 3: 0x0
Root Error Command: 0x0
  |--> Correctable Error Reporting: disabled (0x0)
  |--> Non-Fatal Error Reporting: disabled (0x0)
  |--> Fatal Error Reporting: disabled (0x0)
Root Error Status: 0x0
  |--> ERR_COR Received: 0x0
  |--> Multiple ERR_COR Received: 0x0
  |--> ERR_FATAL/NONFATAL Received: 0x0
  |--> Multiple ERR_FATAL/NONFATAL Received: 0x0
  |--> First Uncorrectable Fatal: 0x0
  |--> Non-Fatal Error Messages Received: 0x0
  |--> Fatal Error Messages Received: 0x0
  |--> ERR_COR Subclass: ECS Legacy (0x0)
  |--> Advanced Error Interrupt Message: 0x0
Error Source Identification: 0x0
  |--> ERR_COR Source: 0x0
  |--> ERR_FATAL/NONFATAL Source: 0x0
Serial Number Capability (0x3)
Capability Header: 0x10003
  |--> Capability ID: 0x3
  |--> Capability Version: 0x1
  |--> Next Capability Offset: 0x0
Serial Number: 6c-b3-11-ff-ff-0f-d0-12
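The idea of sizing a v1 PCIe capability from the device type and the slot implemented bit might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the actual commit: the macro and function names are hypothetical, the bit positions and register-group offsets follow the standard PCIe capability layout, and the choice of which device types get the root registers is a simplification.

```c
#include <stdint.h>

/* Fields of the 16-bit PCI Express Capabilities register. */
#define	PCIE_CAP_VER(r)		((r) & 0xf)
#define	PCIE_CAP_DEVTYPE(r)	(((r) >> 4) & 0xf)
#define	PCIE_CAP_SLOT(r)	(((r) >> 8) & 0x1)

/* Device/Port Type encodings that carry the root registers. */
#define	PCIE_TYPE_ROOT_PORT	0x4
#define	PCIE_TYPE_RC_EC		0xa

/* Capability length in bytes through the end of each register group. */
#define	PCIE_LEN_LINK	0x14	/* through Link Status */
#define	PCIE_LEN_SLOT	0x1c	/* through Slot Status */
#define	PCIE_LEN_ROOT	0x24	/* through Root Status */
#define	PCIE_LEN_V2	0x3c	/* full v2 layout */

/*
 * Hypothetical helper: given the PCI Express Capabilities register,
 * return how many bytes of the capability are safe to parse. Anything
 * that is not version 1 is treated as the full v2 layout (an
 * assumption made to keep the sketch short).
 */
uint32_t
pcie_cap_len(uint16_t capreg)
{
	uint8_t type = PCIE_CAP_DEVTYPE(capreg);

	if (PCIE_CAP_VER(capreg) != 1)
		return (PCIE_LEN_V2);
	if (type == PCIE_TYPE_ROOT_PORT || type == PCIE_TYPE_RC_EC)
		return (PCIE_LEN_ROOT);
	if (PCIE_CAP_SLOT(capreg))
		return (PCIE_LEN_SLOT);
	return (PCIE_LEN_LINK);
}
```

For the e1000g dump above, the Capability Register reads 0x1 (version 1, PCIe Endpoint, no slot), so this sketch would stop after Link Status and never touch the slot or root registers. That matches the output, where the MSI-X capability follows Link Status directly.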
Updated by Robert Mustacchi 7 months ago
In addition, this was tested with a run of the util-tests suite:
rm@iliad:/ws/rm$ pfexec /opt/util-tests/bin/utiltest
Test: /opt/util-tests/tests/allowed-ips (run as root) [00:01] [PASS]
Test: /opt/util-tests/tests/chown_test (run as root) [00:01] [PASS]
Test: /opt/util-tests/tests/date_test (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/find/findtest (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/grep_test (run as root) [00:03] [PASS]
Test: /opt/util-tests/tests/head/head_test (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libjedec_test (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libsff/libsff (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/make_test (run as root) [00:01] [PASS]
Test: /opt/util-tests/tests/mdb/mdbtest (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/mergeq/mqt (run as root) [00:01] [PASS]
Test: /opt/util-tests/tests/mergeq/wqt (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/pcidbtest (run as root) [00:02] [PASS]
Test: /opt/util-tests/tests/pcieadm-priv (run as root) [00:02] [PASS]
Test: /opt/util-tests/tests/pcieadmtest (run as root) [00:05] [PASS]
Test: /opt/util-tests/tests/printf_test (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/set-linkprop (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/sleep/sleeptest (run as root) [00:44] [PASS]
Test: /opt/util-tests/tests/smbios (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/svr4pkg_test (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/xargs_test (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/awk/runtests.sh (run as nobody) [02:33] [PASS]
Test: /opt/util-tests/tests/ctf/precheck (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/ctf/ctftest (run as root) [00:10] [PASS]
Test: /opt/util-tests/tests/demangle/afl-fast (run as root) [00:01] [PASS]
Test: /opt/util-tests/tests/demangle/gcc-libstdc++ (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/demangle/llvm-stdcxxabi (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libcustr/custr_remove (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libcustr/custr_trunc (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_00_blank (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_01_boolean (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_02_numbers (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_03_empty_arrays (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_04_number_arrays (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_05_strings (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_06_nested (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_07_nested_arrays (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/sed/sed_addr (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/sed/multi_test (run as root) [00:01] [PASS]

Results Summary
PASS 39

Running Time: 00:03:54
Percent passed: 100.0%
Log directory: /var/tmp/test_results/20211027T084400
Updated by Electric Monk 7 months ago
- Status changed from New to Closed
- % Done changed from 50 to 100
git commit bc729d490568bb6599aac50d559e64c366738e85
commit bc729d490568bb6599aac50d559e64c366738e85
Author: Robert Mustacchi <rm@fingolfin.org>
Date: 2021-10-28T21:30:51.000Z

14174 pcieadm needs to handle v1 pcie cap better
14175 pcieadm show-devs can do better on missing pcidb entries
14176 pcieadm aer cap compares wrong field
Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
Approved by: Dan McDonald <danmcd@joyent.com>