Bug #5928

htable_walk strays into the VA hole

Added by Robert Mustacchi about 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Category: kernel
Start date: 2015-05-15
Due date:
% Done: 100%
Estimated time:
Difficulty: Hard
Tags:
Gerrit CR:

Description

It would appear that hat_swapout() (or potentially just htable_walk()) does not cope terribly well when there is memory in use at the underside of the VA hole. This is very common on LX-branded processes because we nail the top (base) of the stack close to the underside of the hole, i.e. at 0x7ffffff00000, 1MB below 0x800000000000.
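
To make the address arithmetic concrete, here is a minimal sketch in C. It assumes the usual 48-bit amd64 layout, where the hole spans 0x0000800000000000 up to (but not including) 0xffff800000000000. The names HOLE_START, HOLE_END, in_va_hole() and bytes_below_hole() are hypothetical stand-ins for illustration, not the kernel's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds of the amd64 VA hole with 48-bit virtual addresses. */
#define	HOLE_START	0x0000800000000000ULL
#define	HOLE_END	0xffff800000000000ULL

/* Hypothetical stand-in for the kernel's IN_VA_HOLE() check. */
static bool
in_va_hole(uint64_t va)
{
	return (va >= HOLE_START && va < HOLE_END);
}

/* Distance in bytes from va up to the underside of the hole. */
static uint64_t
bytes_below_hole(uint64_t va)
{
	return (HOLE_START - va);
}
```

Under these assumptions, the LX stack base at 0x7ffffff00000 is not itself in the hole, but it sits only 0x100000 bytes (1MB) below it, so any walk that rounds up to a larger page-table boundary from there lands on the hole's lower edge.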

To reproduce, first you must run DEBUG bits. Then, encourage the swapping out of entire processes by cranking up the desired free memory:

# mdb -kw
> minfree/W 1000000
minfree:        0x7ee           =       0x1000000
> desfree/W 1000000
desfree:        0xfdd           =       0x1000000

Then, run a 64-bit LX-branded program that explodes memory usage in lots of processes, e.g. the LTP test case /opt/ltp/testcases/bin/msgctl10.

The panic looks like:

panic[cpu0]/thread=fffffffffbc3f420: assertion failed: !IN_VA_HOLE(va), file: ../../i86pc/vm/htable.c, line: 1717

fffffffffbc81840 genunix:process_type+166670 ()
fffffffffbc818e0 unix:htable_walk+297 ()
fffffffffbc81950 unix:hat_swapout+8e ()
fffffffffbc81990 genunix:as_swapout+5b ()
fffffffffbc81a50 genunix:swapout+406 ()
fffffffffbc81af0 genunix:sched+2de ()
fffffffffbc81b30 genunix:main+4cc ()
fffffffffbc81b40 unix:_locore_start+90 ()

Further reading of the code, together with runtime analysis of the events leading up to the failure, leads me to believe that htable_walk() is really at fault here.

The htable_walk function is documented as locating the address (and htable_t) of the first populated translation in the provided VA range. It explicitly mentions that it uses level information to skip unpopulated sections of the VA space – the amd64 VA hole is most definitely unpopulated.

The function uses htable_lookup to attempt to find an exact match for the next VA in the walk. If it finds a potential match, it then uses htable_scan, which explicitly deals with the VA hole. If it does not find a match, the NEXT_ENTRY_VA macro is used to skip to the next level up, again explicitly dealing with skipping the VA hole.
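
The hole-skipping step described above can be sketched roughly as follows. This is an illustration of the idea only, not the actual NEXT_ENTRY_VA definition: the helper name next_entry_va(), the hole bounds, and the use of a plain region size in place of the kernel's level macros are all assumptions.

```c
#include <stdint.h>

/* Assumed bounds of the amd64 VA hole with 48-bit virtual addresses. */
#define	HOLE_START	0x0000800000000000ULL
#define	HOLE_END	0xffff800000000000ULL

/*
 * Hypothetical helper: advance va to the start of the next region of
 * the given size (which must be a power of two). If that step lands
 * exactly on the lower edge of the hole, resume at the upper edge
 * instead, so the walk never yields an address inside the hole.
 */
static uint64_t
next_entry_va(uint64_t va, uint64_t size)
{
	uint64_t next = (va & ~(size - 1)) + size;

	return (next == HOLE_START ? HOLE_END : next);
}
```

With a 2MB region size, stepping from 0x7ffffff00000 rounds down to 0x7fffffe00000 and then advances to 0x800000000000, which is the lower edge of the hole, so the sketch returns 0xffff800000000000 instead.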

I believe the ASSERT() is misplaced, and should instead be a VERIFY() in the path where we return a PTE, i.e.

     while (va < eaddr && va >= *vaddr) {
-        ASSERT(!IN_VA_HOLE(va));
-
         /*
          *  Find lowest table with any entry for given address.
          */
         for (l = 0; l <= TOP_LEVEL(hat); ++l) {
             ht = htable_lookup(hat, va, l);
             if (ht != NULL) {
                 pte = htable_scan(ht, &va, eaddr);
                 if (PTE_ISPAGE(pte, l)) {
+                    VERIFY(!IN_VA_HOLE(va));
                     *vaddr = va;
                     *htp = ht;
                     return (pte);
                 }
                 htable_release(ht);
                 break;
             }
Actions #1

Updated by Electric Monk about 7 years ago

  • Status changed from New to Closed

git commit 8c1d5be330d8ad770aaaa74b0da6cac9139842af

commit  8c1d5be330d8ad770aaaa74b0da6cac9139842af
Author: Joshua M. Clulow <jmc@joyent.com>
Date:   2015-05-19T04:34:09.000Z

    5928 htable_walk strays into the VA hole
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Reviewed by: Richard Lowe <richlowe@richlowe.net>
    Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
    Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
    Approved by: Dan McDonald <danmcd@omniti.com>
