Bug #3912

closed

crti needs to make sure _init and _fini are 16-byte stack aligned

Added by Robert Mustacchi about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: High
Category: lib - userland libraries
Start date: 2013-07-26
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

A while back, gcc changed its expectations for the i386 ABI: it now expects the stack to be aligned to 16 bytes rather than the 4 bytes of the traditional SYSV psABI. This was announced rather ceremoniously here: https://groups.google.com/forum/#!topic/ia32-abi/T5s-UGmUO_E. git commit ebe15f48e9897d68d978938414a5c16cb0ceb049 [6881217 32bit stack frames should be aligned on 16-byte boundaries (for sse2 code)] fixes this in crt1.s for main. However, _init and _fini do not follow this pattern. Specifically, nothing guarantees this alignment before _start itself begins setting up the fp environment and calls main(). At this point _fini should be aligned, but it doesn't leave itself in the proper state.

The solution is to modify the _init and _fini stubs in crti.s so that they guarantee the proper alignment regardless of the alignment they are entered with. It's important to note what the actual expectations are: gcc expects the stack to be 16-byte aligned before the call instruction is executed. Because call pushes a 4-byte return address, this means that for a call to foo(), %esp at foo's entry will sit 0xc past a 16-byte boundary (%esp mod 16 == 0xc). This may not be intuitive and was unfortunately not documented well with gcc's ABI change.
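The alignment arithmetic can be sketched in plain C. This only models stack-pointer values as integers; it is not the crti.s change itself. The helper names esp_at_entry and align_down_16, and any addresses passed to them, are made up for illustration.

```c
#include <stdint.h>

/*
 * Illustrative helpers only (hypothetical names). Stack pointers are
 * modelled as plain 32-bit integers, not real %esp values.
 */

/*
 * Value of %esp at a callee's first instruction: the call instruction
 * pushes a 4-byte return address onto the stack.
 */
static uint32_t
esp_at_entry(uint32_t esp_before_call)
{
        return (esp_before_call - 4);
}

/*
 * Round a stack pointer down to a 16-byte boundary. The stack grows
 * downward, so rounding down never moves into the caller's frame.
 */
static uint32_t
align_down_16(uint32_t esp)
{
        return (esp & ~(uint32_t)0xf);
}
```

For example, if %esp is 0x08047ff0 (16-byte aligned) just before the call, esp_at_entry() gives 0x08047fec, which is 0xc past a 16-byte boundary. align_down_16() shows the kind of masking a stub could use to reach a known alignment whatever value it is entered with.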

To better demonstrate the problem, consider the following bit of C code:

typedef int v4si __attribute__ ((vector_size (16)));

v4si s1, s2;

v4si y(v4si *s3)
{
        v4si a;
        a + s1;
        return a;
}

v4si x(void)
{
        v4si s3 = s1 + s2;
        return y(&s3);
}

static __attribute__((constructor)) void
initmain()
{
        y(&s1);
}

int
main(void)
{
        v4si a, b;
        x();
        y(&a);
}

static __attribute__((destructor)) void
finimain()
{
        y(&s1);
}

This code is compiled via:

gcc -fno-inline -msse2 -fomit-frame-pointer new.c

Now, if we look at the function y's disassembly we'll see the following:

disassembly for a.out

y()
    y:     83 ec 1c           subl   $0x1c,%esp
    y+0x3: 66 0f 6f 04 24     movdqa (%esp),%xmm0
    y+0x8: 83 c4 1c           addl   $0x1c,%esp
    y+0xb: c3                 ret    

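Note that gcc subtracts 0x1c from the stack pointer. The part we care about is the extra 0xc: 0x10 bytes hold the vector a, and the remaining 0xc brings %esp from its expected entry value (0xc past a 16-byte boundary) down to a 16-byte boundary, which the movdqa load from (%esp) requires.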
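A small C model of y()'s prologue shows why the entry alignment matters. The frame size 0x1c comes from the disassembly above; the function names and the sample addresses are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Frame size taken from "subl $0x1c,%esp" in y(). */
#define Y_FRAME_SIZE    0x1c

/*
 * Illustrative model (hypothetical names): value of %esp after y()'s
 * prologue, given the value of %esp at y()'s entry.
 */
static uint32_t
esp_after_prologue(uint32_t entry_esp)
{
        return (entry_esp - Y_FRAME_SIZE);
}

/*
 * movdqa requires its memory operand to be 16-byte aligned; returns
 * nonzero when the address satisfies that requirement.
 */
static int
movdqa_operand_ok(uint32_t addr)
{
        return ((addr & 0xf) == 0);
}
```

With an entry %esp of 0x08047fec (0xc past a 16-byte boundary, as gcc expects), the prologue leaves %esp at 0x08047fd0 and the movdqa operand is aligned. With an entry %esp of 0x08047fe8, as could happen when the caller did not align the stack, %esp ends up at 0x08047fcc and the movdqa operand is misaligned.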


Related issues

Related to OpenIndiana Distribution - Bug #3874: Packagemanager core dumps with hipster | Closed | OI Userland | 2013-07-08 | 2013-12-10

Actions #1

Updated by Robert Mustacchi almost 8 years ago

  • Status changed from New to Resolved
  • % Done changed from 80 to 100

Resolved in a86e931db9089f6b514c9b98b206d0eb2c1a2a34.
