Bug #14174

closed

pcieadm needs to handle v1 pcie cap better

Added by Robert Mustacchi about 1 month ago. Updated 29 days ago.

Status: Closed
Priority: Normal
Category: cmd - userland programs
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

When running pcieadm against an instance of e1000g, pcieadm aborts on an internal assertion. The assertion at least keeps it from reading random memory:

$ /usr/lib/pci/pcieadm show-cfgspace -f ./e1000g.out 
Device ./e1000g.out -- Type 0 Header
  Vendor ID: 0x8086 -- Intel Corporation
  Device ID: 0x10d3 -- 82574L Gigabit Network Connection
  Command: 0x47
    |--> I/O Space: enabled (0x1)
    |--> Memory Space: enabled (0x2)
...
  Slot Status: 0x0
    |--> Attention Button Pressed: no (0x0)
    |--> Power Fault Detected: no (0x0)
    |--> MRL Sensor Changed: no (0x0)
    |--> Presence Detect Changed: no (0x0)
    |--> Command Complete: no (0x0)
    |--> MRL Sensor State: closed (0x0)
    |--> Presence Detect State: not present (0x0)
    |--> Electromechanical Interlock: disengaged (0x0)
    |--> Data Link Layer State Changed: no (0x0)
  Root control: 0x0
    |--> CRS Software Visibility: disabled (0x0)
  Root Capabilities: 0x0
    |--> System Error on Correctable Error: disabled (0x0)
    |--> System Error on Non-Fatal Error: disabled (0x0)
    |--> System Error on Fatal Error: disabled (0x0)
    |--> PME Interrupt: disabled (0x0)
    |--> CRS Software Visibility: disabled (0x0)
assertion failed for thread 0xfffffc7fef372a40, thread-id 1: print->pcp_off + print->pcp_len + walkp->pcw_capoff <= walkp->pcw_valid (0x104 <= 0x100), file pcieadm_cfgspace.c, line 440
Abort

Here we ended up trying to read right at the boundary between basic and extended configuration space (offset 0x100). A capability is not supposed to cross that boundary. Of note, this device is old enough that it implements version 1 of the PCIe capability, the one from the PCIe Gen 1 era. While the v2 capability requires devices to hardwire to zero any registers they don't implement, this was not the case for the v1 capability: devices were allowed to implement a shorter version of the capability that simply omits the trailing registers.
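
As a rough illustration of the length problem described above, the following C sketch computes how many bytes of a PCIe capability can be expected to exist, based on the version, the device/port type, and the Slot Implemented bit of the PCI Express Capabilities register. This is not the pcieadm code: the macro and function names are hypothetical, the capability offset of 0xe0 used in the comments is only inferred from the assertion values (0x104 - 0x24), and Root Complex Integrated Endpoints, which may omit the link registers as well, are deliberately ignored for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of the 16-bit PCI Express Capabilities register. */
#define	PCIE_CAP_VERSION(reg)	((reg) & 0xf)
#define	PCIE_CAP_DEVTYPE(reg)	(((reg) >> 4) & 0xf)
#define	PCIE_CAP_SLOT_IMPL(reg)	(((reg) >> 8) & 0x1)

/* Device/Port Type encodings used below. */
#define	PCIE_TYPE_ROOT_PORT	0x4
#define	PCIE_TYPE_DOWNSTREAM	0x6
#define	PCIE_TYPE_RC_EVT_COLL	0xa

/* Offsets, relative to the capability, one past each register group. */
#define	PCIE_END_LINK	0x14	/* through Link Status */
#define	PCIE_END_SLOT	0x1c	/* through Slot Status */
#define	PCIE_END_ROOT	0x24	/* through Root Status */
#define	PCIE_END_V2	0x3c	/* full v2 layout */

static_assert(PCIE_END_LINK < PCIE_END_SLOT &&
    PCIE_END_SLOT < PCIE_END_ROOT && PCIE_END_ROOT < PCIE_END_V2,
    "register groups must be laid out in ascending order");

/*
 * Hypothetical helper: number of bytes of the PCIe capability that a
 * parser may read, given the capability register value.
 */
uint32_t
pcie_cap_len(uint16_t capreg)
{
	uint8_t type = PCIE_CAP_DEVTYPE(capreg);

	/* v2 and later always present the full layout. */
	if (PCIE_CAP_VERSION(capreg) >= 2)
		return (PCIE_END_V2);

	/* v1: the root registers only exist on root-complex devices. */
	if (type == PCIE_TYPE_ROOT_PORT || type == PCIE_TYPE_RC_EVT_COLL)
		return (PCIE_END_ROOT);

	/* v1: the slot registers only exist when a slot is implemented. */
	if (type == PCIE_TYPE_DOWNSTREAM && PCIE_CAP_SLOT_IMPL(capreg))
		return (PCIE_END_SLOT);

	/* Everything else stops after the link registers. */
	return (PCIE_END_LINK);
}

/* True if the whole capability fits in the valid part of config space. */
bool
pcie_cap_fits(uint32_t capoff, uint16_t capreg, uint32_t valid)
{
	return (capoff + pcie_cap_len(capreg) <= valid);
}
```

With the capability register value of 0x1 reported for this device (version 1, PCIe Endpoint, no slot), `pcie_cap_len(0x1)` yields 0x14. Under the assumed offset of 0xe0 the capability then ends at 0xf4, inside the 0x100 bytes of basic configuration space, whereas reading through Root Status would reach 0x104 and trip a bounds check like the one in the assertion.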

The solution here is to make sure that when we parse a v1 PCIe capability, we look at the device type and the Slot Implemented bit to decide which registers are actually present. With this in hand, we successfully parse the rest of the PCIe capability and then find the serial number capability afterwards. For example, we now see the following for this device:

$ ./pcieadm show-cfgspace -f ./e1000g.out 
Device ./e1000g.out -- Type 0 Header
  Vendor ID: 0x8086 -- Intel Corporation
  Device ID: 0x10d3 -- 82574L Gigabit Network Connection
  Command: 0x47
    |--> I/O Space: enabled (0x1)
    |--> Memory Space: enabled (0x2)
    |--> Bus Master: enabled (0x4)
    |--> Special Cycle: disabled (0x0)
    |--> Memory Write and Invalidate: disabled (0x0)
    |--> VGA Palette Snoop: disabled (0x0)
    |--> Parity Error Response: enabled (0x40)
    |--> IDSEL Stepping/Wait Cycle Control: unsupported (0x0)
    |--> SERR# Enable: disabled (0x0)
    |--> Fast Back-to-Back Transactions: disabled (0x0)
    |--> Interrupt X: enabled (0x0)
  Status: 0x10
    |--> Immediate Readiness: unsupported (0x0)
    |--> Interrupt Status: not pending (0x0)
    |--> Capabilities List: supported (0x10)
    |--> 66 MHz Capable: unsupported (0x0)
    |--> Fast Back-to-Back Capable: unsupported (0x0)
    |--> Master Data Parity Error: no error (0x0)
    |--> DEVSEL# Timing: fast (0x0)
    |--> Signaled Target Abort: no (0x0)
    |--> Received Target Abort: no (0x0)
    |--> Received Master Abort: no (0x0)
    |--> Signaled System Error: no (0x0)
    |--> Detected Parity Error: no (0x0)
  Revision ID: 0x0
  Class Code: 0x20000
    |--> Class Code: 0x2
    |--> Sub-Class Code: 0x0
    |--> Programming Interface: 0x0
  Cache Line Size: 0x40 bytes
  Latency Timer: 0x0 cycles
  Header Type: 0x0
    |--> Header Layout: Device (0x0)
    |--> Multi-Function Device: no (0x0)
  BIST: 0x0
    |--> Completion Code: 0x0
    |--> Start BIST: 0x0
    |--> BIST Capable: unsupported (0x0)
  Base Address Register 0
    |--> Space: Memory Space (0x0)
    |--> Address: 0xf44c0000
    |--> Address: 32-bit (0x0)
    |--> Prefetchable: no (0x0)
  Base Address Register 1
    |--> Space: Memory Space (0x0)
    |--> Address: 0xf4400000
    |--> Address: 32-bit (0x0)
    |--> Prefetchable: no (0x0)
  Base Address Register 2
    |--> Space: I/O Space (0x1)
    |--> Address: 0x5000
  Base Address Register 3
    |--> Space: Memory Space (0x0)
    |--> Address: 0xf44e0000
    |--> Address: 32-bit (0x0)
    |--> Prefetchable: no (0x0)
  Base Address Register 4
    |--> Space: Memory Space (0x0)
    |--> Address: 0x0
    |--> Address: 32-bit (0x0)
    |--> Prefetchable: no (0x0)
  Base Address Register 5
    |--> Space: Memory Space (0x0)
    |--> Address: 0x0
    |--> Address: 32-bit (0x0)
    |--> Prefetchable: no (0x0)
  Cardbus CIS Pointer: 0x0
  Subsystem Vendor ID: 0x8086 -- Intel Corporation
  Subsystem Device ID: 0xa01f -- Gigabit CT Desktop Adapter
  Expansion ROM: 0xf4480000
    |--> Enable: disabled (0x0)
    |--> Base Address: 0x3d120000000
  Capabilities Pointer: 0xc8
  Interrupt Line: 0xa
  Interrupt Pin: 0x1 -- INTA
  Min_Gnt: 0x0
  Min_Lat: 0x0
PCI Power Management Capability (0x1)
  Power Management Capabilities: 0xc822
    |--> Version: 0x2
    |--> PME Clock: not required (0x0)
    |--> Immediate Readiness on Return to D0: no (0x0)
    |--> Device Specific Initialization: yes (0x20)
    |--> Auxiliary Current: 0 (0x0)
    |--> D1: unsupported (0x0)
    |--> D2: unsupported (0x0)
    |--> PME Support: 0xc800
      |--> D0 (0x800)
      |--> D3hot (0x4000)
      |--> D3cold (0x8000)
Message Signaled Interrupts Capability (0x5)
  Message Control: 0x81
    |--> MSI Enable: enabled (0x1)
    |--> Multiple Message Capable: 1 vector (0x0)
    |--> Multiple Message Enabled: 1 vector (0x0)
    |--> 64-bit Address Capable: supported (0x80)
    |--> Per-Vector Masking Capable: unsupported (0x0)
    |--> Extended Message Data Capable: unsupported (0x0)
    |--> extended Message Data Enable: unsupported (0x0)
  Message Address: 0xfee3e000
  Upper Message Address: 0x0
  Message Data: 0x20
PCI Express Capability (0x10)
  Capability Register: 0x1
    |--> Version: 0x1
    |--> Device/Port Type: PCIe Endpoint (0x0)
    |--> Slot Implemented: No (0x0)
    |--> Interrupt Message Number: 0x0
  Device Capabilities: 0x8cc1
    |--> Max Payload Size Supported: 256 bytes (0x1)
    |--> Phantom Functions Supported: No (0x0)
    |--> Extended Tag Field: 5-bit (0x0)
    |--> L0s Acceptable Latency: 512 ns (0xc0)
    |--> L1 Acceptable Latency: 64 us (0xc00)
    |--> Role Based Error Reporting: supported (0x8000)
    |--> ERR_COR Subclass: unsupported (0x0)
    |--> Captured Slot Power Limit: 0x0
    |--> Captured Slot Power Limit Scale: 1.0x (0x0)
    |--> Function Level Reset: unsupported (0x0)
  Device Status: 0x10
    |--> Correctable Error Detected: no (0x0)
    |--> Non-Fatal Error Detected: no (0x0)
    |--> Fatal Error Detected: no (0x0)
    |--> Unsupported Request Detected: no (0x0)
    |--> AUX Power Detected: yes (0x10)
    |--> Transactions Pending: no (0x0)
    |--> Emergency Power Reduction Detected: no (0x0)
  Link Capabilities: 0x31c11
    |--> Maximum Link Speed: 2.5 GT/s (0x1)
    |--> Maximum Link Width: 0x1
    |--> ASPM Support: L0s/L1 (0xc00)
    |--> L0s Exit Latency: 64-128ns (0x1000)
    |--> L1 Exit Latency: >64us (0x30000)
    |--> Clock Power Management: unsupported (0x0)
    |--> Surprise Down Error Reporting: unsupported (0x0)
    |--> Data Link Layer Active Reporting: unsupported (0x0)
    |--> Link Bandwidth Notification Capability: unsupported (0x0)
    |--> ASPM Optionality Compliance: not compliant (0x0)
    |--> Port Number: 0x0
  Link Control: 0x40
    |--> ASPM Control: None (0x0)
    |--> Read Completion Boundary: 64 byte (0x0)
    |--> Link Disable: not force disabled (0x0)
    |--> Retrain Link: 0x0
    |--> Common Clock Configuration: common (0x40)
    |--> Extended Sync: 0x40e434
    |--> Clock Power Management: 0x40e434
    |--> Hardware Autonomous Width: 0x40e422
    |--> Link Bandwidth Management Interrupt: 0x40e434
    |--> Link Autonomous Bandwidth Interrupt: 0x40e434
    |--> DRS Signaling Control: 0x40f58b
  Link Status: 0x1011
    |--> Link Speed: 2.5 GT/s (0x1)
    |--> Link Width: 0x1
    |--> Link Training: no (0x0)
    |--> Slot Clock Configuration: common (0x1000)
    |--> Data Link Layer Link Active: no (0x0)
    |--> Link Bandwidth Management Status: no change (0x0)
    |--> Link Autonomous Bandwidth Status: no change (0x0)
MSI-X Capability (0x11)
  Control Register: 0x4
    |--> Table Size: 0x5
    |--> Function Mask: unmasked (0x0)
    |--> MSI-X Enable: disabled (0x0)
  Table Offset: 0x3
    |--> Table BIR: BAR 3 (0x3)
    |--> Table Offset: 0x0
  PBA Offset: 0x2003
    |--> PBA BIR: BAR 3 (0x3)
    |--> PBA Offset: 0x2000
Advanced Error Reporting Capability (0x1)
  Capability Header: 0x14010001
    |--> Capability ID: 0x1
    |--> Capability Version: 0x1
    |--> Next Capability Offset: 0x140
  Uncorrectable Error Status: 0x0
    |--> Data Link Protocol Error: 0x0
    |--> Surprise Down Error: 0x0
    |--> Poisoned TLP Received: 0x0
    |--> Flow Control Protocol Error: 0x0
    |--> Completion Timeout: 0x0
    |--> Completion Abort: 0x0
    |--> Unexpected Completion: 0x0
    |--> Receiver Overflow: 0x0
    |--> Malformed TLP: 0x0
    |--> ECRC Error: 0x0
    |--> Unsupported Request Error: 0x0
    |--> ACS Violation: 0x0
    |--> Uncorrectable Internal Error: 0x0
    |--> MC Blocked TLP: 0x0
    |--> AtomicOp Egress Blocked: 0x0
    |--> TLP Prefix Blocked Error: 0x0
    |--> Poisoned TLP Egress Blocked: 0x0
  Uncorrectable Error Mask: 0x100000
    |--> Data Link Protocol Error: 0x0
    |--> Surprise Down Error: 0x0
    |--> Poisoned TLP Received: 0x0
    |--> Flow Control Protocol Error: 0x0
    |--> Completion Timeout: 0x0
    |--> Completion Abort: 0x0
    |--> Unexpected Completion: 0x0
    |--> Receiver Overflow: 0x0
    |--> Malformed TLP: 0x0
    |--> ECRC Error: 0x0
    |--> Unsupported Request Error: 0x1
    |--> ACS Violation: 0x0
    |--> Uncorrectable Internal Error: 0x0
    |--> MC Blocked TLP: 0x0
    |--> AtomicOp Egress Blocked: 0x0
    |--> TLP Prefix Blocked Error: 0x0
    |--> Poisoned TLP Egress Blocked: 0x0
  Uncorrectable Error Severity: 0x62011
    |--> Data Link Protocol Error: 0x1
    |--> Surprise Down Error: 0x0
    |--> Poisoned TLP Received: 0x0
    |--> Flow Control Protocol Error: 0x1
    |--> Completion Timeout: 0x0
    |--> Completion Abort: 0x0
    |--> Unexpected Completion: 0x0
    |--> Receiver Overflow: 0x1
    |--> Malformed TLP: 0x1
    |--> ECRC Error: 0x0
    |--> Unsupported Request Error: 0x0
    |--> ACS Violation: 0x0
    |--> Uncorrectable Internal Error: 0x0
    |--> MC Blocked TLP: 0x0
    |--> AtomicOp Egress Blocked: 0x0
    |--> TLP Prefix Blocked Error: 0x0
    |--> Poisoned TLP Egress Blocked: 0x0
  Correctable Error Status: 0x0
    |--> Receiver Error: 0x0
    |--> Bad TLP: 0x0
    |--> Bad DLLP: 0x0
    |--> REPLAY_NUM Rollover: 0x0
    |--> Replay timer Timeout: 0x0
    |--> Advisory Non-Fatal Error: 0x0
    |--> Correctable Internal Error: 0x0
    |--> Header Log Overflow: 0x0
  Correctable Error Mask: 0x0
    |--> Receiver Error: 0x0
    |--> Bad TLP: 0x0
    |--> Bad DLLP: 0x0
    |--> REPLAY_NUM Rollover: 0x0
    |--> Replay timer Timeout: 0x0
    |--> Advisory Non-Fatal Error: 0x0
    |--> Correctable Internal Error: 0x0
    |--> Header Log Overflow: 0x0
  Advanced Error Capabilities and Control: 0x0
    |--> First Error Pointer: 0x0
    |--> ECRC Generation Capable: unsupported (0x0)
    |--> ECRC Generation Enable: disabled (0x0)
    |--> ECRC Check Capable: unsupported (0x0)
    |--> ECRC Check Enable: disabled (0x0)
  Header Log 0: 0x0
  Header Log 1: 0x0
  Header Log 2: 0x0
  Header Log 3: 0x0
  Root Error Command: 0x0
    |--> Correctable Error Reporting: disabled (0x0)
    |--> Non-Fatal Error Reporting: disabled (0x0)
    |--> Fatal Error Reporting: disabled (0x0)
  Root Error Status: 0x0
    |--> ERR_COR Received: 0x0
    |--> Multiple ERR_COR Received: 0x0
    |--> ERR_FATAL/NONFATAL Received: 0x0
    |--> Multiple ERR_FATAL/NONFATAL Received: 0x0
    |--> First Uncorrectable Fatal: 0x0
    |--> Non-Fatal Error Messages Received: 0x0
    |--> Fatal Error Messages Received: 0x0
    |--> ERR_COR Subclass: ECS Legacy (0x0)
    |--> Advanced Error Interrupt Message: 0x0
  Error Source Identification: 0x0
    |--> ERR_COR Source: 0x0
    |--> ERR_FATAL/NONFATAL Source: 0x0
Serial Number Capability (0x3)
  Capability Header: 0x10003
    |--> Capability ID: 0x3
    |--> Capability Version: 0x1
    |--> Next Capability Offset: 0x0
  Serial Number: 6c-b3-11-ff-ff-0f-d0-12

Actions #1

Updated by Electric Monk about 1 month ago

  • Gerrit CR set to 1765
Actions #2

Updated by Robert Mustacchi about 1 month ago

In addition, this was tested with a run of the util-tests suite:

rm@iliad:/ws/rm$ pfexec /opt/util-tests/bin/utiltest 
Test: /opt/util-tests/tests/allowed-ips (run as root)             [00:01] [PASS]
Test: /opt/util-tests/tests/chown_test (run as root)              [00:01] [PASS]
Test: /opt/util-tests/tests/date_test (run as root)               [00:00] [PASS]
Test: /opt/util-tests/tests/find/findtest (run as root)           [00:00] [PASS]
Test: /opt/util-tests/tests/grep_test (run as root)               [00:03] [PASS]
Test: /opt/util-tests/tests/head/head_test (run as root)          [00:00] [PASS]
Test: /opt/util-tests/tests/libjedec_test (run as root)           [00:00] [PASS]
Test: /opt/util-tests/tests/libsff/libsff (run as root)           [00:00] [PASS]
Test: /opt/util-tests/tests/make_test (run as root)               [00:01] [PASS]
Test: /opt/util-tests/tests/mdb/mdbtest (run as root)             [00:00] [PASS]
Test: /opt/util-tests/tests/mergeq/mqt (run as root)              [00:01] [PASS]
Test: /opt/util-tests/tests/mergeq/wqt (run as root)              [00:00] [PASS]
Test: /opt/util-tests/tests/pcidbtest (run as root)               [00:02] [PASS]
Test: /opt/util-tests/tests/pcieadm-priv (run as root)            [00:02] [PASS]
Test: /opt/util-tests/tests/pcieadmtest (run as root)             [00:05] [PASS]
Test: /opt/util-tests/tests/printf_test (run as root)             [00:00] [PASS]
Test: /opt/util-tests/tests/set-linkprop (run as root)            [00:00] [PASS]
Test: /opt/util-tests/tests/sleep/sleeptest (run as root)         [00:44] [PASS]
Test: /opt/util-tests/tests/smbios (run as root)                  [00:00] [PASS]
Test: /opt/util-tests/tests/svr4pkg_test (run as root)            [00:00] [PASS]
Test: /opt/util-tests/tests/xargs_test (run as root)              [00:00] [PASS]
Test: /opt/util-tests/tests/awk/runtests.sh (run as nobody)       [02:33] [PASS]
Test: /opt/util-tests/tests/ctf/precheck (run as root)            [00:00] [PASS]
Test: /opt/util-tests/tests/ctf/ctftest (run as root)             [00:10] [PASS]
Test: /opt/util-tests/tests/demangle/afl-fast (run as root)       [00:01] [PASS]
Test: /opt/util-tests/tests/demangle/gcc-libstdc++ (run as root)  [00:00] [PASS]
Test: /opt/util-tests/tests/demangle/llvm-stdcxxabi (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libcustr/custr_remove (run as root)   [00:00] [PASS]
Test: /opt/util-tests/tests/libcustr/custr_trunc (run as root)    [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_00_blank (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_01_boolean (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_02_numbers (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_03_empty_arrays (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_04_number_arrays (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_05_strings (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_06_nested (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/libnvpair_json/json_07_nested_arrays (run as root) [00:00] [PASS]
Test: /opt/util-tests/tests/sed/sed_addr (run as root)            [00:00] [PASS]
Test: /opt/util-tests/tests/sed/multi_test (run as root)          [00:01] [PASS]

Results Summary
PASS      39

Running Time:   00:03:54
Percent passed: 100.0%
Log directory:  /var/tmp/test_results/20211027T084400
Actions #3

Updated by Electric Monk 29 days ago

  • Status changed from New to Closed
  • % Done changed from 50 to 100

git commit bc729d490568bb6599aac50d559e64c366738e85

commit  bc729d490568bb6599aac50d559e64c366738e85
Author: Robert Mustacchi <rm@fingolfin.org>
Date:   2021-10-28T21:30:51.000Z

    14174 pcieadm needs to handle v1 pcie cap better
    14175 pcieadm show-devs can do better on missing pcidb entries
    14176 pcieadm aer cap compares wrong field
    Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
