Feature #11202

Allow the number of NVMe submission and completion queues to be different

Added by Paul Winder 4 months ago. Updated 4 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: driver - device drivers
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:

Description

Currently there is a one-to-one relationship between a submission queue and a completion queue, represented by a queue pair. The number of completion queues (and hence queue pairs) is restricted by the number of interrupt vectors available. This makes sense, as each completion queue can only be processed by one interrupt handler.

This direct relationship between these queues is a restriction of the driver and not the NVMe specification.

This enhancement removes the one-to-one tie between completion and submission queues, and allows as many submission queues to be created as the drive will permit. The number of completion queues can be specified independently, but is limited by the drive hardware, the number of interrupt vectors, and the number of submission queues (there is no point in having more completion queues than submission queues).
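The limits on the completion queue count described above can be sketched as a simple clamp. This is only an illustration: the function and parameter names (clamp_cq_count, drive_max_cq, intr_vectors, sq_count) are hypothetical and are not taken from the nvme driver, and the sketch assumes every limit is at least 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the nvme driver. */
static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/*
 * Clamp a requested completion queue count by the three limits named
 * in the description: what the drive supports, how many interrupt
 * vectors are available, and how many submission queues exist.
 * All names here are illustrative assumptions.
 */
uint32_t
clamp_cq_count(uint32_t requested, uint32_t drive_max_cq,
    uint32_t intr_vectors, uint32_t sq_count)
{
	/* Assumption: the caller has already ensured every value is >= 1. */
	assert(requested >= 1 && drive_max_cq >= 1);
	assert(intr_vectors >= 1 && sq_count >= 1);

	uint32_t n = requested;

	n = min_u32(n, drive_max_cq);	/* drive hardware limit */
	n = min_u32(n, intr_vectors);	/* one handler per completion queue */
	n = min_u32(n, sq_count);	/* no point in more CQs than SQs */
	return (n);
}
```

For example, under these assumptions a request for 8 completion queues on a system with 4 interrupt vectors would be reduced to 4, regardless of how many submission queues the drive permits.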

The concept of a queue pair is maintained. There is one queue pair for each submission queue, but if there are more queue pairs than completion queues, the completion queues are shared. An NVMe command directed at a given submission queue will have its completion posted to the completion queue referenced in the queue pair.

+------------+
|  QP 1      |
| +-------+  |--------------+
| | SQ 1  |  |              |
| +-------+  |              v
+------------+            +-------+
                          | CQ 1  |
                          +-------+
+------------+              ^
|  QP 2      |              |
| +-------+  |--------------+
| | SQ 2  |  |
| +-------+  | 
+------------+

+------------+
|  QP 3      |
| +-------+  |--------------+
| | SQ 3  |  |              |
| +-------+  |              v
+------------+            +-------+
                          | CQ 2  |
                          +-------+
+------------+              ^
|  QP 4      |              |
| +-------+  |--------------+
| | SQ 4  |  |
| +-------+  | 
+------------+
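One possible way to build the sharing shown in the diagram is to spread the submission queues evenly over the completion queues in contiguous blocks. The sketch below is an assumption for illustration only: the qpair_t type, its fields, and the distribution policy are hypothetical, and the driver may assign completion queues differently. Queue IDs are 1-based to match the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical queue pair: one per submission queue. */
typedef struct qpair {
	uint32_t qp_sq_id;	/* submission queue ID, 1-based */
	uint32_t qp_cq_id;	/* completion queue ID, 1-based, may be shared */
} qpair_t;

/*
 * Pick the completion queue for a submission queue by dividing the
 * submission queues into n_cq contiguous blocks. This policy is an
 * assumption chosen to reproduce the diagram; it is not taken from
 * the driver source.
 */
uint32_t
cq_for_sq(uint32_t sq_id, uint32_t n_sq, uint32_t n_cq)
{
	assert(n_cq >= 1 && n_cq <= n_sq);
	assert(sq_id >= 1 && sq_id <= n_sq);

	/* 64-bit intermediate avoids overflow in the multiplication. */
	return ((uint32_t)(((uint64_t)(sq_id - 1) * n_cq) / n_sq) + 1);
}

/* Fill in n_sq queue pairs that share n_cq completion queues. */
void
qpairs_init(qpair_t *qp, uint32_t n_sq, uint32_t n_cq)
{
	for (uint32_t i = 0; i < n_sq; i++) {
		qp[i].qp_sq_id = i + 1;
		qp[i].qp_cq_id = cq_for_sq(i + 1, n_sq, n_cq);
	}
}
```

With four submission queues and two completion queues, this assigns SQ 1 and SQ 2 to CQ 1, and SQ 3 and SQ 4 to CQ 2, as in the diagram. A command submitted on a queue pair then has its completion looked up through that pair's qp_cq_id.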

As well as increasing concurrency through more submission queues, sharing (and possibly reducing) the number of completion queues can make interrupt handling more efficient, because each interrupt can process more completions.


Related issues

Related to illumos gate - Bug #11228: nvme may queue more submissions than allowed (Closed)
Related to illumos gate - Bug #11229: nvme_get_logpage() can allocate a too small buffer to receive logpage data (Closed)
Related to illumos gate - Bug #11230: Panic in nvme_fill_prp() because of miscalculation of the number of PRPs per page (Closed)
Related to illumos gate - Bug #11231: nvme in polled mode ignores the command call back (Closed)

History

#1

Updated by Paul Winder 4 months ago

  • % Done changed from 0 to 80
#2

Updated by Paul Winder 4 months ago

  • Status changed from New to In Progress
#3

Updated by Paul Winder 4 months ago

  • Subject changed from Allow the NVMe submission and completion queue sizes to be different to Allow the number of NVMe submission and completion queues to be different
#4

Updated by Paul Winder 4 months ago

#5

Updated by Gergő Mihály Doma 4 months ago

  • Related to Bug #11228: nvme may queue more submissions than allowed added
#6

Updated by Gergő Mihály Doma 4 months ago

  • Related to Bug #11229: nvme_get_logpage() can allocate a too small buffer to receive logpage data added
#7

Updated by Gergő Mihály Doma 4 months ago

  • Related to Bug #11230: Panic in nvme_fill_prp() because of miscalculation of the number of PRPs per page added
#8

Updated by Gergő Mihály Doma 4 months ago

  • Related to Bug #11231: nvme in polled mode ignores the command call back added
#9

Updated by Electric Monk 4 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 80 to 100

git commit 0999c1123c1ab769df080ccc5f1626d50663e7a8

commit  0999c1123c1ab769df080ccc5f1626d50663e7a8
Author: Paul Winder <Paul.Winder@wdc.com>
Date:   2019-06-20T14:02:46.000Z

    11202 Allow the number of NVMe submission and completion queues to be different
    11228 nvme may queue more submissions than allowed
    11229 nvme_get_logpage() can allocate a too small buffer to receive logpage data
    11230 Panic in nvme_fill_prp() because of miscalculation of the number of PRPs per page
    11231 nvme in polled mode ignores the command call back
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
    Reviewed by: Gergő Mihály Doma <domag02@gmail.com>
    Reviewed by: Youzhong Yang <youzhong@gmail.com>
    Approved by: Dan McDonald <danmcd@joyent.com>

#10

Updated by Paul Winder 4 months ago

Testing included:
  • Ran a complete ZFS test suite with NVMe drives as the targets
  • Ran vdbench with 50% r/w workload
  • In use in WDC's O/S in their storage servers
