Bug #4281

open

nhm and nhmex are dangerous

Added by Garrett D'Amore over 7 years ago. Updated over 7 years ago.

Status: New
Priority: High
Category: driver - device drivers
Start date: 2013-10-31
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

Both the intel_nhm and intel_nhmex drivers crash on some platforms. In particular, I have an Ivy Bridge based platform that uses a different region of PCI configuration space than is typically found. It also lacks ACPI.

On this system, both intel_nhm and intel_nhmex crash during nhm_init().

The reason is that nhm_init() blithely starts doing PCI configuration space accesses (these go to memory-mapped regions) that don't work on this platform.

nhm_init(void) {
    int slot;

    /* return ENOTSUP if there is no PCI config space support. */
    if (pci_getl_func == NULL)
        return (ENOTSUP);
    /* note: CPU_ID_RD() is itself a PCI configuration space read */
    for (slot = 0; slot < MAX_CPU_NODES; slot++) {
        nhm_chipset = CPU_ID_RD(slot);
        if (nhm_chipset == NHM_EP_CPU || nhm_chipset == NHM_WS_CPU ||
            nhm_chipset == NHM_JF_CPU || nhm_chipset == NHM_WM_CPU)
            break;
    }
    if (slot == MAX_CPU_NODES) {
        return (ENOTSUP);
    }
    mem_reg_init();
    return (0);
}

Now, it turns out that we probably need to put some limit on which spaces are addressable via the memory-mapped accesses, as the addresses on this platform are not accessible. But the code above is just flat out broken. It makes no attempt to validate whether the CPU is a Nehalem before going and hitting configuration space.

intel_nhmex is closed source, and probably should just be killed. That does mean we'd lose memory-based self-healing on those platforms, but without source it's unclear how we could even meaningfully diagnose any problems there.

intel_nhm should use something -- perhaps CPUID instructions -- to verify whether it is on a candidate CPU before accessing configuration space blindly.
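
As a rough illustration of the CPUID idea, here is a minimal sketch in plain C. It decodes the family and model from a CPUID leaf 1 EAX signature and compares them against a short list of models. The function names (sig_family, sig_model, nhm_candidate_cpu) are hypothetical and are not part of the driver, and the model list is an assumption that would need to be checked against Intel's documentation for the parts intel_nhm is meant to support. It also assumes the caller has already confirmed the vendor is Intel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Display family from a CPUID leaf 1 EAX signature. */
static unsigned int
sig_family(uint32_t sig)
{
	unsigned int family = (sig >> 8) & 0xf;

	/* The extended family field only applies when the base family is 0xf. */
	if (family == 0xf)
		family += (sig >> 20) & 0xff;
	return (family);
}

/* Display model from a CPUID leaf 1 EAX signature. */
static unsigned int
sig_model(uint32_t sig)
{
	unsigned int base_family = (sig >> 8) & 0xf;
	unsigned int model = (sig >> 4) & 0xf;

	/* The extended model field applies to base families 0x6 and 0xf. */
	if (base_family == 0x6 || base_family == 0xf)
		model |= ((sig >> 16) & 0xf) << 4;
	return (model);
}

/*
 * Hypothetical pre-check: true only if the signature looks like a part
 * the driver might support. ASSUMPTION: the model list is illustrative
 * (family 6, models 0x1a, 0x1e, 0x2c), not taken from the driver source.
 */
static bool
nhm_candidate_cpu(uint32_t sig)
{
	static const unsigned int models[] = { 0x1a, 0x1e, 0x2c };
	size_t i;

	if (sig_family(sig) != 6)
		return (false);
	for (i = 0; i < sizeof (models) / sizeof (models[0]); i++) {
		if (sig_model(sig) == models[i])
			return (true);
	}
	return (false);
}
```

With a check like this, nhm_init() could return ENOTSUP before the first CPU_ID_RD() call whenever nhm_candidate_cpu() is false. For example, a signature of 0x000306A9 (an Ivy Bridge part) decodes to family 6, model 0x3a and is rejected, while 0x000106A5 (a Nehalem part) decodes to family 6, model 0x1a and is accepted.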
