comm page should fallback to syscall after excessive migration
Copied from SmartOS OS-6135:
On platforms that have unsynchronized TSCs and lack support for the RDTSCP instruction, determining the TSC offset is a multi-step process. The current CPU ID is queried (via the lsl/GDT method), the TSC is read, then finally the CPU ID is re-checked in case a migration occurred. This is meant to ensure that the offset applied to the TSC reading corresponds to the CPU from which it was taken. If the CPU IDs don't match, the logic is repeated until it is successful.
While this ID-checking loop is rather tight, it can become a problem in certain circumstances. If the system is heavily loaded, migrations may occur rapidly. Any process "stuck" in this loop represents an additional source of CPU load, potentially exacerbating the problem. It would be valuable to have bail-out logic that limits the loop iterations and performs the time reading via syscall if the userspace code cannot complete it in a timely manner.
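The proposed bail-out could look roughly like the sketch below. This is not the code from the actual change. The retry cap MAX_MIGRATION_RETRIES, the tsc_ops_t hooks, and the sim_* helpers are all hypothetical stand-ins (for the lsl/GDT CPU ID query, the TSC read, and the syscall path) so that the control flow can be exercised without real hardware. It also only applies the per-CPU offset and skips any later scaling of the TSC value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical retry cap; the ticket does not state the real limit. */
#define MAX_MIGRATION_RETRIES 4

/*
 * Hypothetical hooks standing in for the lsl/GDT CPU ID query, the raw
 * TSC read, and the syscall-based time reading.
 */
typedef struct tsc_ops {
	uint32_t (*cpu_id)(void *arg);
	uint64_t (*read_tsc)(void *arg);
	uint64_t (*syscall_time)(void *arg);
	void *arg;
} tsc_ops_t;

/*
 * Read the TSC and apply the offset for the CPU it was read on.
 * Returns 1 if the userspace path succeeded, or 0 if the CPU ID kept
 * changing and the reading was taken via the syscall hook instead.
 */
static int
read_time(const tsc_ops_t *ops, const int64_t *tsc_offsets, uint64_t *out)
{
	for (unsigned int i = 0; i < MAX_MIGRATION_RETRIES; i++) {
		uint32_t before = ops->cpu_id(ops->arg);
		uint64_t tsc = ops->read_tsc(ops->arg);
		uint32_t after = ops->cpu_id(ops->arg);

		if (before == after) {
			/* No migration observed: the offset matches the reading. */
			*out = tsc + (uint64_t)tsc_offsets[before];
			return (1);
		}
	}

	/* Too many migrations: stop spinning and fall back to the syscall. */
	*out = ops->syscall_time(ops->arg);
	return (0);
}

/* Simulation helpers: replay a fixed sequence of CPU IDs. */
typedef struct sim {
	const uint32_t *cpu_seq;
	size_t len;
	size_t pos;
} sim_t;

static uint32_t
sim_cpu_id(void *arg)
{
	sim_t *s = arg;
	uint32_t id = s->cpu_seq[s->pos];

	/* Advance through the sequence, holding on the last entry. */
	if (s->pos + 1 < s->len)
		s->pos++;
	return (id);
}

static uint64_t
sim_read_tsc(void *arg)
{
	(void) arg;
	return (1000);
}

static uint64_t
sim_syscall_time(void *arg)
{
	(void) arg;
	return (9999);
}
```

Feeding sim_cpu_id a sequence in which the ID never changes exercises the fast path. A sequence that alternates on every query exhausts the cap and takes the syscall path, which bounds the extra CPU load a "stuck" process can generate.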
Extensive testing notes are included in the comments of the ticket.
Updated by Electric Monk over 3 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
commit e121b61f5e8ffbeb2f6b373c967c80351333ee21
Author: Patrick Mooney <email@example.com>
Date: 2020-04-01T15:22:43.000Z

    12345 comm page should fallback to syscall after excessive migration
    Reviewed by: Jerry Jelinek <firstname.lastname@example.org>
    Reviewed by: Ryan Zezeski <email@example.com>
    Reviewed by: Robert Mustacchi <firstname.lastname@example.org>
    Approved by: Dan McDonald <email@example.com>