Bug #3912
closed
crti needs to make sure _init and _fini are 16-byte stack aligned
100%
Description
A while back, gcc changed its expectations for the i386 ABI. Specifically, it now expects the stack to be aligned to 16 bytes instead of the traditional 4 bytes of the SYSV psABI. This was announced rather ceremoniously here: https://groups.google.com/forum/#!topic/ia32-abi/T5s-UGmUO_E. git commit ebe15f48e9897d68d978938414a5c16cb0ceb049 [6881217 32bit stack frames should be aligned on 16-byte boundaries (for sse2 code)] fixes this in crt1.s for main. However, _init and _fini do not follow this pattern. Specifically, this guarantee is not made anywhere before _start itself sets up the fp environment and calls main(). At this point _fini should be aligned, but it doesn't leave itself in the proper state.
The solution here is to modify the _init and _fini stubs in crti.s so that they guarantee the proper alignment regardless of what they come in with. It's important to note here what the actual expectations are. gcc expects the stack to be 16-byte aligned before the call instruction is executed. Because call pushes a 4-byte return address, that means if you have a call foo(), at foo's entry the stack will be 0xc aligned, i.e. %esp modulo 16 is 0xc. This may not be intuitive and was unfortunately not documented well with gcc's ABI change.
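The alignment arithmetic above can be sketched in C. This is only a model of the stack pointer as a plain integer, not the actual crti.s change; the function names and the sample address used below are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset of a stack pointer within its 16-byte block. */
unsigned
stack_misalignment(uint32_t esp)
{
	return (esp & 0xf);
}

/*
 * Hypothetical helper: round a stack pointer down to a 16-byte boundary,
 * which is the effect of `andl $-16, %esp`. The stack grows down, so
 * rounding down never clobbers data already on the stack.
 */
uint32_t
align_down_16(uint32_t esp)
{
	return (esp & ~(uint32_t)0xf);
}

/*
 * Hypothetical helper: model an i386 `call`, which pushes a 4-byte
 * return address. gcc's expectation is that the stack pointer is
 * 16-byte aligned just before the call, which we check here.
 */
uint32_t
esp_after_call(uint32_t esp_before_call)
{
	assert(stack_misalignment(esp_before_call) == 0);
	return (esp_before_call - 4);
}
```

For example, starting from an arbitrarily aligned value such as 0x08047fe6, align_down_16() yields 0x08047fe0, and esp_after_call() on that yields 0x08047fdc, whose low four bits are 0xc. That is the state the callee sees at entry.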
To better demonstrate the problem, consider the following bit of C code:
typedef int v4si __attribute__ ((vector_size (16)));

v4si s1, s2;

v4si
y(v4si *s3)
{
	v4si a;
	a + s1;
	return a;
}

v4si
x(void)
{
	v4si s3 = s1 + s2;
	return y(&s3);
}

static __attribute__((constructor)) void
initmain()
{
	y(&s1);
}

int
main(void)
{
	v4si a, b;
	x();
	y(&a);
}

static __attribute__((destructor)) void
finimain()
{
	y(&s1);
}
This code is compiled via:
gcc -fno-inline -msse2 -fomit-frame-pointer new.c
Now, if we look at the function y's disassembly we'll see the following:
disassembly for a.out

y()
    y:       83 ec 1c           subl   $0x1c,%esp
    y+0x3:   66 0f 6f 04 24     movdqa (%esp),%xmm0
    y+0x8:   83 c4 1c           addl   $0x1c,%esp
    y+0xb:   c3                 ret
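Note here that gcc is subtracting 0x1c from the stack pointer. The part we care about is the extra 0xc. gcc assumes that at entry %esp sits 0xc past a 16-byte boundary, so subtracting 0x1c (0x10 + 0xc) leaves %esp 16-byte aligned, which movdqa requires of its memory operand. If the caller did not provide that alignment, the movdqa faults.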
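To make the effect of that prologue concrete, here is a small C model of the check. It treats %esp as an integer and uses the 0x1c frame size from the disassembly above. The function name and the sample addresses below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Frame size taken from `subl $0x1c,%esp` in y()'s prologue. */
#define	Y_FRAME_SIZE	0x1cu

/*
 * Hypothetical helper: given the value of %esp at entry to y(), report
 * whether the `movdqa (%esp),%xmm0` that follows the prologue would see
 * a 16-byte aligned address.
 */
bool
movdqa_operand_aligned(uint32_t esp_at_entry)
{
	uint32_t esp = esp_at_entry - Y_FRAME_SIZE;

	return ((esp & 0xf) == 0);
}
```

With an entry value such as 0x08047fec (0xc modulo 16, as gcc expects) the function returns true. With 0x08047fe8 (0x8 modulo 16, as could happen when the caller was entered without the expected alignment) it returns false, which corresponds to the faulting case.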
Updated by Robert Mustacchi about 10 years ago
- Status changed from New to Resolved
- % Done changed from 80 to 100
Resolved in a86e931db9089f6b514c9b98b206d0eb2c1a2a34.