Bug #308
ld may misalign sections only preceded by empty sections
Description
Copy of OpenSolaris bug 6988300
When objects containing COMMON (bss) data are linked into
an executable, and a separate BSS segment is established
via a mapfile, the symbols that reference that segment end
up with addresses outside the section boundaries.
% sh -x build.sh
+ cc -M /usr/lib/ld/map.pagealign -M /usr/lib/ld/map.bssalign -M /usr/lib/ld/map.noexbss -B direct -o bad.xterm button.o cachedGCs.o charproc.o charsets.o cursor.o data.o doublechr.o fontutils.o input.o linedata.o main.o menu.o misc.o print.o ptydata.o scrollback.o screen.o scrollbar.o tabs.o util.o xstrings.o xtermcap.o VTPrsTbl.o TekPrsTbl.o Tekproc.o charclass.o precompose.o wcwidth.o -R/usr/lib -lXft -lfontconfig -R/usr/lib -lXaw7 -lXmu -R/usr/lib -lXt -lX11 -lICE -ltermcap
% elfdump bad.xterm > /dev/null
bad.xterm: .dynsym: index[224]: bad symbol entry: smeBSBObjectClass: section[25] size: 0x26c8: symbol (address 0x84026ec, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[251]: bad symbol entry: sessionShellWidgetClass: section[25] size: 0x26c8: symbol (address 0x8402700, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[560]: bad symbol entry: scrollbarWidgetClass: section[25] size: 0x26c8: symbol (address 0x84026f4, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[579]: bad symbol entry: simpleMenuWidgetClass: section[25] size: 0x26c8: symbol (address 0x84026e8, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[605]: bad symbol entry: max_plus1: section[25] size: 0x26c8: symbol (address 0x84026f8, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[676]: bad symbol entry: my_wcwidth: section[25] size: 0x26c8: symbol (address 0x84026f0, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[821]: bad symbol entry: resource: section[25] size: 0x26c8: symbol (address 0x84026a8, size 0x40) lies outside of containing section
bad.xterm: .dynsym: index[855]: bad symbol entry: first_widechar: section[25] size: 0x26c8: symbol (address 0x84026fc, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[874]: bad symbol entry: tekshellwidget: section[25] size: 0x26c8: symbol (address 0x8402704, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[865]: bad symbol entry: smeBSBObjectClass: section[25] size: 0x26c8: symbol (address 0x84026ec, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[892]: bad symbol entry: sessionShellWidgetClass: section[25] size: 0x26c8: symbol (address 0x8402700, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[1201]: bad symbol entry: scrollbarWidgetClass: section[25] size: 0x26c8: symbol (address 0x84026f4, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[1220]: bad symbol entry: simpleMenuWidgetClass: section[25] size: 0x26c8: symbol (address 0x84026e8, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[1246]: bad symbol entry: max_plus1: section[25] size: 0x26c8: symbol (address 0x84026f8, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[1317]: bad symbol entry: my_wcwidth: section[25] size: 0x26c8: symbol (address 0x84026f0, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[1462]: bad symbol entry: resource: section[25] size: 0x26c8: symbol (address 0x84026a8, size 0x40) lies outside of containing section
bad.xterm: .symtab: index[1496]: bad symbol entry: first_widechar: section[25] size: 0x26c8: symbol (address 0x84026fc, size 0x4) lies outside of containing section
bad.xterm: .symtab: index[1515]: bad symbol entry: tekshellwidget: section[25] size: 0x26c8: symbol (address 0x8402704, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[821]: bad symbol entry: resource: section[25] size: 0x26c8: symbol (address 0x84026a8, size 0x40) lies outside of containing section
bad.xterm: .dynsym: index[579]: bad symbol entry: simpleMenuWidgetClass: section[25] size: 0x26c8: symbol (address 0x84026e8, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[224]: bad symbol entry: smeBSBObjectClass: section[25] size: 0x26c8: symbol (address 0x84026ec, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[676]: bad symbol entry: my_wcwidth: section[25] size: 0x26c8: symbol (address 0x84026f0, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[560]: bad symbol entry: scrollbarWidgetClass: section[25] size: 0x26c8: symbol (address 0x84026f4, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[605]: bad symbol entry: max_plus1: section[25] size: 0x26c8: symbol (address 0x84026f8, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[855]: bad symbol entry: first_widechar: section[25] size: 0x26c8: symbol (address 0x84026fc, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[251]: bad symbol entry: sessionShellWidgetClass: section[25] size: 0x26c8: symbol (address 0x8402700, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[874]: bad symbol entry: tekshellwidget: section[25] size: 0x26c8: symbol (address 0x8402704, size 0x4) lies outside of containing section
Many of these symbols have copy relocations, which 'ldd -r' reveals are out of
bounds, and ignored by the runtime linker:
% ldd -r bad.xterm
    libXft.so.2 => /usr/lib/libXft.so.2
    libfontconfig.so.1 => /usr/lib/libfontconfig.so.1
    libXaw7.so.7 => /usr/lib/libXaw7.so.7
    libXmu.so.4 => /usr/lib/libXmu.so.4
    libXt.so.4 => /usr/lib/libXt.so.4
    libX11.so.4 => /usr/lib/libX11.so.4
    libICE.so.6 => /usr/lib/libICE.so.6
    libcurses.so.1 => /usr/lib/libcurses.so.1
    libc.so.1 => /usr/lib/libc.so.1
    libfreetype.so.6 => /usr/lib/libfreetype.so.6
    libXrender.so.1 => /usr/lib/libXrender.so.1
    libexpat.so.1 => /usr/lib/libexpat.so.1
    libXext.so.0 => /usr/lib/libXext.so.0
    libSM.so.6 => /usr/lib/libSM.so.6
    libXpm.so.4 => /usr/lib/libXpm.so.4
    libXmuu.so.1 => /usr/lib/libXmuu.so.1
    libm.so.2 => /usr/lib/libm.so.2
    libnsl.so.1 => /usr/lib/libnsl.so.1
    libsocket.so.1 => /usr/lib/libsocket.so.1
    libXau.so.6 => /usr/lib/libXau.so.6
    libXdmcp.so.6 => /usr/lib/libXdmcp.so.6
    libz.so.1 => /usr/lib/libz.so.1
    libmp.so.2 => /lib/libmp.so.2
    libmd.so.1 => /lib/libmd.so.1
    relocation R_386_COPY offset invalid: simpleMenuWidgetClass: offset=0x84026e8 lies outside memory image; relocation discarded
    relocation R_386_COPY offset invalid: smeBSBObjectClass: offset=0x84026ec lies outside memory image; relocation discarded
    relocation R_386_COPY offset invalid: scrollbarWidgetClass: offset=0x84026f4 lies outside memory image; relocation discarded
    relocation R_386_COPY offset invalid: sessionShellWidgetClass: offset=0x8402700 lies outside memory image; relocation discarded
    libXevie.so.1 => /usr/lib/libXevie.so.1
    libXss.so.1 => /usr/lib/libXss.so.1
Since the copy relocations are not carried out, global data in the
program is uninitialized, leading to the runtime failure:
% ./bad.xterm
./bad.xterm Xt error: XtAppCreateShell requires non-NULL widget class
Updated by Jason King over 10 years ago
A workaround is to not use the /usr/lib/ld/map.bssalign and /usr/lib/ld/map.noexbss mapfiles.
Updated by Jason King over 10 years ago
Doing a little digging, one of the issues is that the .bss section's sh_size value is too small. Using elfedit shdr:sh_size .bss 0x2504 (for xterm) resolves the elfdump issues; however, the R_386_COPY errors remain.
Or rather, the issue seems to be confined to the .bss segment itself.
Updated by Jason King over 10 years ago
The program header for the bss segment is also too small; adjusting that resolves the R_386_COPY issues. So it appears the linker is somehow ignoring the last 3-4 objects when calculating the segment sizes.
Updated by Rich Lowe almost 10 years ago
- Category set to cmd - userland programs
- Status changed from New to In Progress
- Assignee changed from Jason King to Rich Lowe
Taking this, on the basis that I have a likely fix etc. Jason, I'm assuming you're ok with this?
Updated by Rich Lowe almost 10 years ago
- Subject changed from separate bss segment causes symbol/section mismatch to ld may misalign sections only preceded by empty sections
If a mapfile which places .bss in a separate segment (such as /usr/lib/ld/map.bssalign or /usr/lib/ld/map.noexbss) is used, and sections for thread-local storage are created [1], the linker may misalign .bss, possibly causing symbols to be placed outside its bounds.
This has mostly been seen with builds of Xorg from the xnv source tree, which use map.bssalign.
Symptoms
% elfdump bad.xterm > /dev/null
bad.xterm: .dynsym: index[226]: bad symbol entry: smeBSBObjectClass: section[25] size: 0x26c8: symbol (address 0x84026ec, size 0x4) lies outside of containing section
bad.xterm: .dynsym: index[252]: bad symbol entry: sessionShellWidgetClass: section[25] size: 0x26c8: symbol (address 0x8402700, size 0x4) lies outside of containing section
...
And xterm fails to run:
% ./bad.xterm
./bad.xterm Xt error: XtAppCreateShell requires non-NULL widget class
Anecdotally, there are binaries which suffer from this problem yet don't end up with symbols hanging outside their section. One of my test builds showed vncconfig in this state.
% elfdump root_i386/usr/bin/i86/vncconfig > /dev/null
% elfdump root_i386/usr/bin/i86/vncconfig | ggrep -A5 'sh_name: \.bss$'
Section Header[31]:  sh_name: .bss
    sh_addr:      0x8400024       sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x474           sh_type:    [ SHT_NOBITS ]
    sh_offset:    0x1c340         sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x40
A simple test over an xnv tree finds 101 invalid binaries (with bss symbols straying outside the section), and about 15 more with .bss sections whose sh_addr does not honour their sh_addralign.
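Both conditions counted above can be checked mechanically once the section header and symbol values are known. The following Python sketch is illustrative only: the helper names are invented here, and the numbers are copied from the elfdump output quoted in this report (sh_addr 0x8400004 is the value elfdump shows for xterm's section[25]).

```python
def is_aligned(sh_addr, sh_addralign):
    """True if the section address honours its own sh_addralign."""
    return sh_addralign <= 1 or sh_addr % sh_addralign == 0


def lies_outside(sh_addr, sh_size, sym_addr, sym_size):
    """True if any byte of the symbol falls outside the section."""
    return sym_addr < sh_addr or sym_addr + sym_size > sh_addr + sh_size


# vncconfig .bss: misaligned, although elfdump reported no stray symbols.
print(is_aligned(0x8400024, 0x40))

# xterm .bss (sh_addr 0x8400004, sh_size 0x26c8) and two reported symbols.
print(lies_outside(0x8400004, 0x26C8, 0x84026A8, 0x40))  # resource
print(lies_outside(0x8400004, 0x26C8, 0x8402704, 0x4))   # tekshellwidget
```

Note that resource starts inside the section (which ends at 0x84026cc) but runs past its end, which is why it is reported alongside the symbols that lie wholly outside.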
Why it's misaligned
map.bssalign and map.noexbss create a new loadable segment in which to place the bss, and in the case of the former heavily align it. This new segment holds .tdata [2], .tbss, and .bss.
The alignment of the first section's sh_offset is controlled solely by the alignment of that initial section -- in our case this will be .tdata -- an empty section requiring no alignment:
Section Header[23]:  sh_name: .tdata
    sh_addr:      0x8400000       sh_flags:   [ SHF_WRITE SHF_ALLOC SHF_TLS ]
    sh_size:      0               sh_type:    [ SHT_PROGBITS ]
    sh_offset:    0x5fc7c         sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x1
followed by .tbss, also empty and similarly lacking in required alignment.
Section Header[24]:  sh_name: .tbss
    sh_addr:      0x8400000       sh_flags:   [ SHF_WRITE SHF_ALLOC SHF_TLS ]
    sh_size:      0               sh_type:    [ SHT_NOBITS ]
    sh_offset:    0x5fc7c         sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x1
The .bss, though, does have alignment requirements (0x40), and so the offset is moved to accommodate them, exposing our problem: the linker assumes that address and offset have congruent alignment, such that moving a section through one (offset) to achieve alignment also guarantees it in the other (address). See the loop at source:usr/src/cmd/sgs/libld/common/update.c lines 4056,4081.
In this instance, aligning the offset of .bss moves it 4 bytes forward, from 0x5fc7c to 0x5fc80. The address moves the same 4 bytes forward, to 0x8400004, which is not a multiple of 0x40, so the section ends up misaligned:
Section Header[25]:  sh_name: .bss
    sh_addr:      0x8400004       sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x26c8          sh_type:    [ SHT_NOBITS ]
    sh_offset:    0x5fc80         sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x40
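The arithmetic behind the misalignment can be reproduced in a few lines. This is a sketch of the reasoning only, not the code in update.c; align_up is a helper written for this illustration, and the input values are taken from the .tbss and .bss headers in this report.

```python
def align_up(value, align):
    """Round value up to the next multiple of align (a power of two)."""
    return (value + align - 1) & ~(align - 1)


# Offset and address shared by the empty .tdata and .tbss sections.
prev_offset, prev_addr = 0x5FC7C, 0x8400000
bss_align = 0x40

bss_offset = align_up(prev_offset, bss_align)
delta = bss_offset - prev_offset
# The address is moved by the same amount as the offset, on the assumption
# that the two are congruent modulo the alignment. Here they are not.
bss_addr = prev_addr + delta

print(hex(bss_offset), hex(bss_addr), bss_addr % bss_align)
```

The offset is correctly aligned after the move, but the address has a remainder of 4 modulo 0x40, matching the sh_addr elfdump reports for .bss.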
Why that causes us to overflow the section
The size of the bss is calculated from the size and alignment of the symbols in the input section, assuming that each is correctly aligned relative to a correctly aligned start of the section, which in our case it is not (see: source:usr/src/cmd/sgs/libld/common/syms.c lines 1517,1536).
Symbols are placed in the output section at their actual correct alignment, with no assumption as to the initial section alignment. This pushes their location within the section forward to the next aligned address (see source:usr/src/cmd/sgs/libelf/common/update.c lines 234,252), eventually causing us to spill over the section end.
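The mismatch between sizing and placement can be shown with a small model. The sketch below is not the logic from syms.c or libelf; the function names are invented, and the symbol list is hypothetical rather than taken from xterm. Only the misaligned start address comes from this report.

```python
def align_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def section_size(symbols):
    """Size needed if the section starts at a fully aligned address (0)."""
    end = 0
    for size, align in symbols:
        end = align_up(end, align) + size
    return end


def place(symbols, sh_addr):
    """Give each symbol the next address satisfying its own alignment."""
    addr, placed = sh_addr, []
    for size, align in symbols:
        addr = align_up(addr, align)
        placed.append((addr, size))
        addr += size
    return placed


# Hypothetical COMMON symbols as (size, alignment) pairs.
symbols = [(0x40, 0x40), (0x4, 0x4), (0x4, 0x4)]
sh_addr = 0x8400004                  # misaligned start, as in the report
sh_size = section_size(symbols)      # computed as if the start were aligned
last_addr, last_size = place(symbols, sh_addr)[-1]
overflow = (last_addr + last_size) - (sh_addr + sh_size)
print(hex(sh_size), hex(last_addr), hex(overflow))
```

In this model the first 0x40-aligned symbol is pushed forward by 0x3c bytes, and everything after it overshoots the section end by the same amount. That is consistent with the xterm output: the highest-addressed symbol reported, tekshellwidget, ends at 0x8402708, which is 0x3c bytes past the section end at 0x84026cc.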
Footnotes
[1] This happens much more frequently since the integration of PSARC 2010/299 GNU/Linux/BSD compatibility functions and 6960818 add get_nprocs(), getline(), strdupa(), strndup() to libc in commit:48f2dbca79a2, which added magical incantations for strdupa() and strndupa() to string.h which make use of TLS.
[2] .tdata ends up in the bss segment to retain its spatial relationship with .tbss, see: 6910387 .tdata and .tbss separation invalidates TLS program header information
Updated by Jason King almost 10 years ago
By all means, feel free to take the lead on this... no objections here.
Updated by Rich Lowe almost 10 years ago
The easiest way to deal with this is to alter the alignment requirement of empty sections at the beginning of a segment to match the requirement of the first non-empty section in that segment (if one exists). This keeps .tdata, .tbss, and .bss at the same virtual address and file offset, and allows the assumptions elsewhere in the link editor to remain valid.
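The effect of that change can be illustrated with a toy layout model. This is a sketch under stated assumptions, not the integrated fix: inherit_alignment and layout are names invented here, the model assumes the segment's starting address is already suitably aligned, and the section sizes and alignments are the ones elfdump shows for xterm.

```python
def align_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def inherit_alignment(sections):
    """Give empty sections at the start of a segment the alignment of the
    first non-empty section, if there is one."""
    first = next((s for s in sections if s["size"] > 0), None)
    if first is None:
        return list(sections)
    fixed, leading = [], True
    for s in sections:
        leading = leading and s["size"] == 0
        fixed.append({**s, "align": first["align"]} if leading else s)
    return fixed


def layout(sections, seg_addr, offset):
    """Toy model: the first section's alignment positions the segment in the
    file; afterwards address and offset always advance by the same amount."""
    offset = align_up(offset, sections[0]["align"])
    addr, placed = seg_addr, {}
    for s in sections:
        aligned = align_up(offset, s["align"])
        addr += aligned - offset
        offset = aligned
        placed[s["name"]] = (addr, offset)
        addr += s["size"]
        if not s["nobits"]:          # SHT_NOBITS occupies no file space
            offset += s["size"]
    return placed


segment = [
    {"name": ".tdata", "size": 0, "align": 0x1, "nobits": False},
    {"name": ".tbss", "size": 0, "align": 0x1, "nobits": True},
    {"name": ".bss", "size": 0x26C8, "align": 0x40, "nobits": True},
]
before = layout(segment, 0x8400000, 0x5FC7C)
after = layout(inherit_alignment(segment), 0x8400000, 0x5FC7C)
print({k: (hex(a), hex(o)) for k, (a, o) in before.items()})
print({k: (hex(a), hex(o)) for k, (a, o) in after.items()})
```

Without the adjustment the model reproduces the reported state, with .bss at address 0x8400004 and offset 0x5fc80. With it, all three sections share one address and one offset, and .bss is aligned in both. The exact offset a fixed linker chooses may differ from this model's.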
A fixed xterm binary suggests that this is also the tack taken by Oracle's linker.
Updated by Rich Lowe almost 10 years ago
- Status changed from In Progress to Resolved
Integrated in r13296 commit:2d47a00dfb9b