Project

General

Profile

Actions

Bug #12820

closed

failure in dladm_ether_info() can lead to a double free

Added by Robert Mustacchi over 1 year ago. Updated over 1 year ago.

Status:
Closed
Priority:
Normal
Category:
cmd - userland programs
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

A user had a core dump with the following:

> ::status
debugging core file of dladm (32-bit) from omnibot
initial argv: dladm show-ether
threading model: raw lwps
status: process terminated by SIGABRT (Abort), pid=647 uid=0 code=-1
> $C
0803fa58 libc.so.1`_lwp_kill+7(1, 6, 0, 1, fef42000, fef20dfd)
0803fa78 libc.so.1`raise+0x2b(6)
0803fa98 libumem.so.1`umem_do_abort+0x53()
0803faa8 libumem.so.1`__umem_assert_failed(fef20dfd, fef21077, 86eef78)
0803fae8 libumem.so.1`process_free+0x74(86eef78, 1, 0)
0803fb08 libumem.so.1`umem_malloc_free+0x1a(86eef78)
0803fb38 libdladm.so.1`dladm_ether_info_done+0x2a(803fb5c)
0803fc58 show_etherprop+0x6b(86ebfb8, 3, 803fd08)
0803fcd8 libdladm.so.1`dladm_walk_datalink_id+0x9d(8061b50, 86ebfb8, 803fd08, 1, 4, 0)
0803fd38 do_show_ether+0x14c(1, 803fde0, 8067890)
0803fd78 main+0x10a(fed3d7e7, fedb75e8)
0803fdb8 _start_crt+0x96(2, 803fddc, fefd0d2f, 0, 0, 0)
0803fdd0 _start+0x1a(2, 803ff14, 803ff1a, 0, 803ff25, 803ff39)
> ::umem_status -v
Status:         ready and active
Concurrency:    4
Logs:           (inactive)
Message buffer:
free(86eef78): invalid or corrupted buffer
stack trace:
libumem.so.1'umem_err_recoverable+0x37
libumem.so.1'process_free+0x74
libumem.so.1'umem_malloc_free+0x1a
libdladm.so.1'dladm_ether_info_done+0x2a
dladm'show_etherprop+0x6b
libdladm.so.1'dladm_walk_datalink_id+0x9d
dladm'do_show_ether+0x14c
dladm'main+0x10a
dladm'_start_crt+0x96
dladm'_start+0x1a

If we look at the buffer and context, we're trying to free the dladm_ether_info_t structure that's constructed from dladm_ether_info(). Now, the comment from umem is that this is either a corrupt buffer or already freed. If we look at the implementation of dladm_ether_info() these values are allocated in one place and then freed when dladm_ether_info_done() is called. Now here's the interesting thing that can happen:

1. If dladm_ether_info() fails, it'll call dladm_ether_info_done() to free the state, before returning a failure.
2. When dladm_ether_info() fails, show_etherprop() also calls dladm_ether_info_done() on the structure/

This means that we end up trying to free the same data twice which leads us here. This means that a transient failure to get one of the stats would have had to happen. However, we hear that this always happens the first time the user calls this and then everything is fine after that. We should do two things here:

1. Make sure that dladm_ether_info_done() NULLs out the various values it frees so if a double free where to happen again, it'd be a no-op.
2. Make sure that show_etherprop() doesn't call the clean up function in the failure case.

Actions #1

Updated by Robert Mustacchi over 1 year ago

To test this I simulated a failure by injecting a failure with mdb by causing the act of getting a mac stat to fail. For example, this is what happened on an old binary:

root@beowulf:~# export LD_PRELOAD=libumem.so
root@beowulf:~# export UMEM_DEBUG=default
root@beowulf:~# mdb /usr/sbin/dladm
> ::bp libdladm.so.1`dladm_ether_info     
> ::run show-ether
mdb: stop at libdladm.so.1`dladm_ether_info
mdb: target stopped at:
libdladm.so.1`dladm_ether_info: pushl  %ebp
mdb: You've got symbols!
Loading modules: [ ld.so.1 libumem.so.1 libc.so.1 ]
> libdladm.so.1`dladm_ether_info+0xf4::bp
> :c
mdb: stop at libdladm.so.1`dladm_ether_info+0xf4
mdb: target stopped at:
libdladm.so.1`dladm_ether_info+0xf4:    movl   %eax,%esi
mdb: You've got symbols!
Loading modules: [ libnvpair.so.1 ]> <eip::dis
libdladm.so.1`dladm_ether_info+0xd7:    addl   $0xc,%esp
libdladm.so.1`dladm_ether_info+0xda:    movl   %eax,(%esi)
libdladm.so.1`dladm_ether_info+0xdc:    movl   0x3c(%edi),%eax
libdladm.so.1`dladm_ether_info+0xdf:    movl   -0x34(%ebp),%edx
libdladm.so.1`dladm_ether_info+0xe2:    movl   %edx,0x4(%eax)
libdladm.so.1`dladm_ether_info+0xe5:    leal   0x24(%edi),%eax
libdladm.so.1`dladm_ether_info+0xe8:    pushl  %eax
libdladm.so.1`dladm_ether_info+0xe9:    pushl  0xc(%ebp)
libdladm.so.1`dladm_ether_info+0xec:    pushl  0x8(%ebp)
libdladm.so.1`dladm_ether_info+0xef:    call   -0x11fa6 <libdladm.so.1`dladm_get_state>
libdladm.so.1`dladm_ether_info+0xf4:    movl   %eax,%esi
libdladm.so.1`dladm_ether_info+0xf6:    addl   $0x10,%esp
libdladm.so.1`dladm_ether_info+0xf9:    testl  %eax,%eax
libdladm.so.1`dladm_ether_info+0xfb:    jne    -0xbc    <libdladm.so.1`dladm_ether_info+0x45>
libdladm.so.1`dladm_ether_info+0x101:   leal   0x2b00(%ebx),%esi
libdladm.so.1`dladm_ether_info+0x107:   leal   0x28(%edi),%eax
libdladm.so.1`dladm_ether_info+0x10a:   movl   $0x1,-0x40(%ebp)
libdladm.so.1`dladm_ether_info+0x111:   leal   -0x1c(%ebp),%ecx
libdladm.so.1`dladm_ether_info+0x114:   movl   %ecx,-0x48(%ebp)
libdladm.so.1`dladm_ether_info+0x117:   leal   -0x20(%ebp),%ecx
libdladm.so.1`dladm_ether_info+0x11a:   movl   %ecx,-0x4c(%ebp)
> 2>eax
> :c
mdb: stop on SIGABRT
mdb: target stopped at:
libc_hwcap2.so.1`_lwp_kill+7:   jae    +0xc     <libc_hwcap2.so.1`_lwp_kill+0x15>
> $C
08040c88 libc_hwcap2.so.1`_lwp_kill+7(1, 6, 0, 1, fef32000, fef10d1d)
08040ca8 libc_hwcap2.so.1`raise+0x2b(6)
08040cc8 libumem.so.1`umem_do_abort+0x53()
08040cd8 libumem.so.1`__umem_assert_failed(fef10d1d, fef10f97, 8eadf48)
08040d18 libumem.so.1`process_free+0x74(8eadf48, 1, 0)
08040d38 libumem.so.1`umem_malloc_free+0x1a(8eadf48)
08040d68 libdladm.so.1`dladm_ether_info_done+0x2a(8040d8c)
08040e88 show_etherprop+0x6b(8ea2fb0, 1, 8040f38)
08040f08 libdladm.so.1`dladm_walk_datalink_id+0x9d(805f428, 8ea2fb0, 8040f38, 1, 4, 0)
08040f68 do_show_ether+0x14c(1, 8041018, 8065360)
08040fa8 main+0xd2(feda0ef7, fee1c628)
08040fe8 _start_crt+0x96(2, 8041014, f7d5477e, 0, 0, 0)
08041008 _start+0x1a(2, 80410ec, 80410f2, 0, 80410fd, 8041111)

However, if we did the same thing on the newer binary, this no longer crashes and I can verify that we don't call into dladm_ether_info_done() twice any more. An important consideration with this test is that if libumem is not loaded, then the system will swallow the double free, unfortunately.

Actions #2

Updated by Electric Monk over 1 year ago

  • Status changed from New to Closed
  • % Done changed from 80 to 100

git commit 7fa359c09493c7b56e7c3e98fccc7e5cc0f09b48

commit  7fa359c09493c7b56e7c3e98fccc7e5cc0f09b48
Author: Robert Mustacchi <rm@fingolfin.org>
Date:   2020-06-11T16:56:18.000Z

    12820 failure in dladm_ether_info() can lead to a double free
    Reviewed by: Andy Fiddaman <omnios@citrus-it.co.uk>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Toomas Soome <toomas@me.com>
    Approved by: Dan McDonald <danmcd@joyent.com>

Actions

Also available in: Atom PDF