Bug #13977
closedksh93: stack robustness fixes
100%
Description
OpenSuse, the modern github ksh93 project and Solaris have all imported a set of patches to improve stack robustness in ksh93.
This patch addresses an issue with the ksh93 internal stack restoration
and resizing logic which was causing random coredumps, mainly in nested shell scripts.
https://github.com/ksh93/ksh/commit/4604df9ada906e0a6537157a63b6ce7c0509f34d
Updated by Andy Fiddaman almost 2 years ago
The original code intention is to exit the sh_debug function with the stack in the same state as when it was entered, so the pointer to the top element on the stack is gathered at the start and checked at the end and used to restore the top stack frame (or just its seek position). However, the man page for libast/stak says that the pointer returned from stakptr is only valid until the next stak operation and there are other operations in here. In particular, the out_string(), sfputc() and sfwrite() calls grow the top stack element and can cause a reallocation. By freezing the old top element as 'sav', these calls now all write to a new top of stack element.
At the end of the function, stakset() is used to restore the stack:
The stakset() function finds the frame containing the given address, frees all frames that were created after the one containing the given address, and sets the current object to the given address. The top of the current object is set to offset bytes from current object.
The corruption occurred when the stashed pointer was no longer in the stack after the other calls in this function, which also led to the author's change in stakset() to make passing an unknown stack address fatal for an early crash rather than a crash later in some unrelated code.
In fact, with this change, the
if(sav != stkptr(stkp,0))
test should no longer be necessary (it will always be true since the top stack element is new given the freeze at the top of the function), but there is no need to diverge from upstream for this.
Updated by Andy Fiddaman almost 2 years ago
I ran the full ksh testsuite from the system/test/ksh93
package and observed the same results from both before and after the patch in the associated gerrit review.
I also wrote some test scripts that were heavily nested/recursive and exercised the DEBUG
trap, to try and reproduce the original error reported upstream. I could not, but these test scripts also produced the same results both before and after the patch.
Updated by Electric Monk almost 2 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
git commit 3636ae5408a4c1fb68954277f99be72f62e3489e
commit 3636ae5408a4c1fb68954277f99be72f62e3489e Author: Andy Fiddaman <omnios@citrus-it.co.uk> Date: 2021-08-06T09:08:05.000Z 13977 ksh93: stack robustness fixes 13976 contrib/ast: remove unreferenced man pages Reviewed by: Dan McDonald <danmcd@joyent.com> Reviewed by: Robert Mustacchi <rm@fingolfin.org> Approved by: Gordon Ross <gordon.w.ross@gmail.com>