Project

General

Profile

Bug #4110

panic while reaping htable_cache

Added by Boris Protopopov almost 6 years ago. Updated almost 5 years ago.

Status:
Closed
Priority:
Normal
Category:
-
Start date:
2013-09-04
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

The following panic was observed (quite rare, under heavy load, ):
$c
vpanic()
mutex_panic+0x73(fffffffffb94b618, ffffff2328dd64c0)
mutex_vector_enter+0x446(ffffff2328dd64c0)
hat_enter+0x15(ffffff2328dd64c0)
htable_free+0x4f(ffffff373b379900)
htable_reap+0x76(0)
kmem_cache_reap+0x38(ffffff22f0cd0008)
taskq_thread+0x285(ffffff230f902d18)
thread_start+8()

::panicinfo
cpu 5
thread ffffff00f4017c60
message mutex_enter: bad mutex, lp=ffffff2328dd64c0 owner=deadbeefdeadbee8 thread=ffffff00f4017c60

This is with kmem_flags=0xf, so it appears that this is "use after free" situation, and we are able to see where the buffer is freed from with ::bufctl -v:
mem_cache_free_debug+0x12f
kmem_cache_free+0xb4
hat_free_end+0xd2
as_free+0x1d9
relvm+0xdd
proc_exit+0x46f
exit+0x15
rexit+0x1c
dtrace_systrace_syscall+0x11a

This appears to be a process address space teardown code path. After reviewing the code, the apparent design intent is as follows: in order to prevent the race between the teardown and reap, the hat_t's are atomically checked for HAT_VICTIM and marked HAT_FREEING in hat_free_start(). Then the reap code skips the HAT_FREEING hat_t's when victimizing.

Yet it also appears that we still see the race. Looking at the htable_steal(), we find that the inner for(;;) loop seems to be re-setting the HAT_VICTIM flag after a collection of htable_t's was taken off the hat_t list but before they are actually freed.

This seems to allow for a situation where the teardown thread waiting for the hat_t's HAT_VICTIM flag to be cleared in hat_free_start(), will come out and set the HAT_FREEING flag at some later point in time, and the hat_free_end() will teardown/free that hat_t. Yet the reap code path will not check for the HAT_FREEING (or will fail to notice the flag setting - depending on the scheduling - the checks are not done under the lock), so the htable_t's taken off the (freed) hat_t list in htable_steal() will still be reaped.

Specifically, in our case, if the teardown is completed first (the buffer is freed and reused), and then the htable_free()s are called in the reap code path, the htable_free()s will be looking at garbage when checking for the HAT_FREEING flag in the first if(...!(hat->hat_flag & HAT_FREEING)...) statement of the HAT_FREEING. As a consequence, hat_enter() will be called, and it will cause the above panic if kmem_flags=0xf are set or the mutex is trashed in some other way.

To summarize, the cause of the panic appears to be stale ht_hat pointer in the htable_t at the time of calling htable_free(). The ht_hat points to freed space (arbitrary garbage). The reason for this is a race condition between teardown and reap control flows, which is not being properly addressed by the combination of HAT_VICTIM and HAT_FREEING flags.

The proposed solution is to disassociate htable_t's to be reaped from the hat_t object at the time the former a gathered for reaping. This happens in htable_steal() function. The latter is also called from htable_alloc(), and in this case, we do not reap/disassociate from the hat_t. Therefore, an additional parameter that conveys the context is needed for htable_steal() to do the right thing in both cases: in case of reaping context, we zero out hat_t pointer, and we perform platform specific ptable_free() work (as this was previously done in the original code).

Later, in htable_free() we check for ht_hat pointer in htable and perfrom the needed actions, while guaranteeing race-free execution in reap context - there is no dependency on ht_hat, so we do not need to worry about it going away.


Related issues

Related to illumos gate - Bug #5923: Panic on ht->ht_lock_cnt == 0 assertion in htable_releaseNew2015-05-13

Actions
Related to illumos gate - Bug #5924: System deadlocked in htable_steal_activeNew2015-05-13

Actions

History

#1

Updated by Boris Protopopov over 5 years ago

One other detail that seemed amiss in the code was that it was not obvious whether the htable_dont_cache flag was set when htable_free() was called from htable_reap(). However, it would seem that if the flag was not set, then htable_free() could put the htable back on the hat's cached_list. This would have defeated the purpose of reaping - the htable would not be freed but would still be cached.

The proposed fix is to set/increment the dont_cache flag in htable_reap() before calling htable_steal() and to decrement the flag only after htable_free() loop is done, in order to assure proper freeing.

#2

Updated by Electric Monk almost 5 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit b59c4a48daf5a1863ecac763711b497b2f8321e4

commit  b59c4a48daf5a1863ecac763711b497b2f8321e4
Author: Boris Protopopov <boris.protopopov@nexenta.com>
Date:   2014-11-05T22:53:03.000Z

    4110 panic while reaping htable_cache
    Reviewed by: Gordon Ross <Gordon.Ross@nexenta.com>
    Reviewed by: Ilya Usvyatsky <Ilya.Usvyatsky@nexenta.com>
    Reviewed by: Thomas Keiser <tkeiser@gmail.com>
    Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
    Approved by: Garrett D'Amore <garrett@damore.org>

#3

Updated by Alexander Kolbasov over 4 years ago

  • Related to Bug #5923: Panic on ht->ht_lock_cnt == 0 assertion in htable_release added
#4

Updated by Alexander Kolbasov over 4 years ago

  • Related to Bug #5924: System deadlocked in htable_steal_active added

Also available in: Atom PDF