An open spec for the Mellanox ConnectX-4/5/6 family is available online, and it is good enough to write a driver from, so we should have one!

These are available in 10G/25G/40G/100G parts and have generous limits on many of their resources, so they're very practical server parts and should work well for hosting lots of VNICs.

Updated by Jorge Schrauwen almost 4 years ago

Have we reached out to Mellanox? I believe they provided support to FreeBSD in writing the drivers.

Updated by Robert Mustacchi almost 4 years ago

Well, Alex has written the driver already, so I'm not sure there's much need to ask them to do something here.

Updated by Garrett D'Amore almost 4 years ago

This is very timely -- we (RackTop) had just retained another driver engineer (Paul Winder) to write just such a driver (he had already begun). I believe Paul has reached out already to Alex, and will be working to collaborate on test, review, and also address any bugs found. I think they are already working together (I know that Alex's work has some dependencies on things that are present in the Joyent tree but not in upstream, for example.)

Updated by Alex Wilson over 3 years ago

Testing notes:

  • Built and tested on SmartOS together with #12205
  • Testing was done against ConnectX-4 Lx 10G (1-port) and 25G (2-port) parts on PCIe cards
  • Did add_drv, verified that driver attaches and NIC appears in dladm show-phys. Checked MAC address.
  • Basic state and transceiver testing
    • Removed and re-inserted transceiver (10G DAC), verified that interface state and speed update correctly
    • Inserted 1G copper transceiver, verified interface state and speed
    • Checked transceiver information in dltraninfo
    • Tested LEDs with dlled
    • Tested that rem_drv correctly shuts the card down and detaches cleanly
  • Basic tx and rx
    • Connected other end of DAC to a separate SmartOS box (using ixgbe on the other end)
    • Used dlsend and dlrecv to verify that basic L2 packet transmission and reception works
    • Used dlsend (on the remote end) to a different MAC combined with snoop to verify that promisc works
    • Plumbed interface with ipadm create-addr -T static
    • Verified that ping works (both directions) between the ixgbe IP address on remote machine and the one on mlxcx
    • Used snoop to watch ARP traffic to verify that broadcast traffic is being received
    • Used iperf3 in both directions between the remote ixgbe and mlxcx to verify that performance is not completely boned
    • Left iperf3 running for extended periods (>72 hrs) and monitored kernel memory usage as a cursory check for bad leaks
    • Verified that rem_drv still works and cleanly detaches the driver after interface is operating
  • VNICs
    • Created VNICs using dladm create-vnic on both ends of the ixgbe-mlxcx link -- one VNIC directly on the interface, one VNIC on a tagged VLAN, another VNIC on a different tagged VLAN
    • Verified ping in both directions on each VNIC
    • Used mdb -k to examine the flow tables being generated within the driver to check that they made sense for the VNICs and VLANs in use and that rings were being allocated as expected
    • Tested VNIC performance using iperf3 with explicit listen addresses
    • Created a zone with a VNIC and tested iperf3 again
  • Aggrs
    • On the 2-port 25G card, inserted two 25G DACs going to a Force10 switch with the two ports in a port-channel (single chassis)
    • Created an aggr on the mlxcx end using dladm create-aggr
    • Plumbed an interface on top of the aggr using ipadm create-addr -T static
    • Verified ping and SSH across the link
    • Then created 20 VNICs on top of the aggr using dladm create-vnic in a shell script -- some untagged and some on tagged VLANs.
    • Gave some of the VNICs static addresses and some DHCP, verified ping through each of them with a shell script
    • Used mdb -k to inspect the state of rings and flow tables in the driver to check that VNIC mac addresses and other filters were being constructed correctly on both sides
    • Started iperf3 on the aggr0 interface
    • Physically unplugged one side of the aggr at a time to verify that failover works while watching iperf3 output
    • Repeated this with "shutdown" commands on the switch
    • Left iperf3 on top of the aggr running for 48 hrs to verify the box was still up and not egregiously leaking kernel memory
  • Fault testing
    • Changed code in mlxcx_sq_add_buffer to deliberately place invalid pointers and lkeys into work queue entries
    • Tried to tx packets, verified that the queues involved go into error state and FM ereports are generated (also an unexpected_telemetry fault, since there's no ESC for the cqe.err events yet; that is the subject of a future patch)
    • Changed code in mlxcx_rq_add_buffer to do the same
    • Modified mlxcx_eq_arm to write incorrect event counter value, checked that driver reports EQ stall and generates FM ereports for it
    • Modified mlxcx_cq_check to always report that CQ status is "overflow", waited for it to run and checked that ereports and messages are generated as expected
  • Also built on OmniOS against vanilla illumos-gate. Attached a ConnectX-4 Lx passed through to an OmniOS VM but did not extensively test it.
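
For anyone wanting to repeat the aggr and VNIC steps above, the batch of VNICs could be generated with something like the following dry-run sketch. It only prints the dladm commands rather than running them. The link names (mlxcx0, mlxcx1, aggr0), the VNIC names, the VLAN IDs and the even/odd split between untagged and tagged VNICs are all assumptions made for illustration, not what was actually used in the testing described here.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the dladm commands for a batch of
# VNICs on top of an aggregation. All link names, VNIC names and VLAN IDs
# below are hypothetical.

# gen_vnic_cmds LINK COUNT
# Emits one "dladm create-vnic" line per VNIC. Even-numbered VNICs are
# untagged; odd-numbered ones are placed on VLAN (100 + index).
gen_vnic_cmds() {
    link=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ $((i % 2)) -eq 0 ]; then
            echo "dladm create-vnic -l $link vnic$i"
        else
            echo "dladm create-vnic -l $link -v $((100 + i)) vnic$i"
        fi
        i=$((i + 1))
    done
}

# Aggregate the two ports first (mlxcx0/mlxcx1 are assumed link names),
# then create 20 VNICs on top of the aggr.
echo "dladm create-aggr -l mlxcx0 -l mlxcx1 aggr0"
gen_vnic_cmds aggr0 20
```

After reviewing the printed commands, the output can be piped to sh as root on the test box to actually apply them.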

Note that Paul Winder's changes on top of this patch (which he's going to RTI separately) are needed for this driver to function correctly on ConnectX-5 (ConnectX-5 Lx might work as-is). I don't believe either of us has tested a ConnectX-6 part yet, as they're quite expensive.

Updated by Electric Monk over 3 years ago

commit  ebb7c6fd4f966f94af3e235242b8a39b7a53664a
Author: Alex Wilson <>
Date:   2020-03-04T02:46:53.000Z

    12204 want driver for Mellanox ConnectX-4/5/6 NICs
    Reviewed by: Robert Mustacchi <>
    Reviewed by: Paul Winder <>
    Approved by: Garrett D'Amore <>


