Bug #13247

CTF conversion fails with large files

Added by Andy Fiddaman about 1 month ago. Updated 15 days ago.

Status: Closed
Priority: Normal
Assignee:
Category: tools - gate/build tools
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

In continuing the rollout of CTF to more of OmniOS userland, I hit a memory scaling problem with a 30MiB shared library with 3,816 compilation units. ctfconvert cannot even initialise the die list without running out of memory (ctfconvert is 32-bit).

Jonathan Perkin tells me he hit the same issue with the same shared library (libicudata.so) a while back and wrote a patch for ctfconvert. This is open as https://smartos.org/bugview/OS-6485 and https://github.com/joyent/illumos-joyent/pull/215, but CTF in gate has moved on a bit since.

Here's the write-up from his issue:


During the work to CTF convert pkgsrc, libicudata.so exposed memory scaling issues with ctfconvert. It is a 29MB shared library containing 3,449 DIEs.

The current conversion process allocates memory as follows:

  • A new ctf_die_t is created for each DIE during the initialisation process.
  • Each ctf_die_t holds a DWARF handle open on the input file.
  • Each ctf_die_t includes a new ctf_file_t allocation, each of which mmap()'s its own private copy of the CTF data, symtab, and strtab from the object.
  • During the merge process, for each DIE, an extra ctf_file_t is allocated, again with its own mmap'ed private copies, as the destination for the merge output.
  • These allocations are only freed at the end of the entire conversion process.
  • With all these allocations, a 32-bit ctfconvert process runs out of available memory while processing libicudata.so.

With the proposed patch the conversion process is changed to be as follows:

  • Instead of allocating a full ctf_die_t for every DIE up-front, we instead defer full initialisation to the main conversion process in ctf_dwarf_convert_one().
  • DIEs are processed in batches, defaulting to a batch size of 256 (configurable via -b on the command line if necessary).
  • Each batch is converted and merged until a single merged ctf_file_t is returned as the result.
  • After processing a batch, the input DIEs for that batch are freed.
  • The merged ctf_file_t is added as an input to the next batch.
  • The process continues until we have processed all batches and end with a final merged ctf_file_t, or a failure.
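
The batched flow in the list above can be sketched roughly as follows. This is an illustrative Python model rather than the libctf code: `convert_in_batches`, `convert_one`, and `merge` are hypothetical names standing in for the C implementation, and placing the carried-over result first among the merge inputs is an assumption.

```python
def convert_in_batches(units, convert_one, merge, batch_size=256):
    """Convert units batch by batch, carrying one merged result forward.

    units       -- sequence of inputs (one per DIE/CU in the write-up)
    convert_one -- hypothetical callable: unit -> converted container
    merge       -- hypothetical callable: list of containers -> one container
    Returns the final merged container, or None if units is empty.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    merged = None
    for start in range(0, len(units), batch_size):
        # The previous batch's result becomes an input to this batch.
        inputs = [] if merged is None else [merged]
        inputs.extend(convert_one(u) for u in units[start:start + batch_size])
        # After this merge, the per-unit inputs are no longer referenced,
        # so at most batch_size + 1 containers are live at any time.
        merged = merge(inputs)
    return merged
```

In this model the peak number of live containers is bounded by the batch size plus one, rather than growing with the total number of units, which is the property the patch relies on to stay within a 32-bit address space.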

The default batchsize of 256 is based on a few constraints:

  • Processing DIEs in multiple batches means that ctf_id_t's will be different compared to those generated by a previous ctfconvert. If we're able to choose a batchsize which is larger than the number of DIEs in most objects then we will avoid changing ctf_id_t's. Whilst mostly cosmetic, it's still nice to avoid differences if possible.
  • Performance goes up as the batchsize increases, at least when using the default of 4 threads.
  • The batchsize needs to be well below the number of DIEs that can be processed without hitting memory limits.

Related issues

Related to illumos gate - Bug #13251: CTF conversion fails if any CU is missing DWARF data (Closed, Andy Fiddaman)
Related to illumos gate - Bug #13252: ctf_update()/ctf_dwarf_convert_function() leak memory (Closed, Andy Fiddaman)
Related to illumos gate - Bug #13278: CTF assertion failed cmp->cm_tmap[id].cmt_map == suid (Closed, Andy Fiddaman)
#1

Updated by Electric Monk about 1 month ago

  • Gerrit CR set to 1012
#2

Updated by Andy Fiddaman about 1 month ago

I've tested the change in the attached Gerrit review.

Using wsdiff (recently updated to compare CTF) on two gate builds showed no CTF differences apart from in libctf itself.

For the single-object conversions done in gate, the new version is measurably faster, most likely because it no longer allocates two libdwarf handles and does not use the task queue for processing.

Testing with two sample objects:

sunddi.o - 741K

Old:
% hyperfine -w1 -r50 'ctfconvert -o /dev/null sunddi.o'
  Time (mean):     225.8 ms +-  16.4 ms    [User: 215.2 ms, System: 8.2 ms]
  Range (min | max):   200.5 ms | 253.4 ms    50 runs

New:
% hyperfine -w1 -r50 'ctfconvert -o /dev/null sunddi.o'
  Time (mean):     213.4 ms +-  21.2 ms    [User: 203.1 ms, System: 8.3 ms]
  Range (min | max):   184.3 ms | 255.5 ms    50 runs

unix.o - 29.1M

Old:
% hyperfine -w1 -r10 'ctfconvert -o /dev/null unix.o'
  Time (mean):     26.849 s +-  0.613 s    [User: 66.796 s, System: 2.360 s]
  Range (min | max):   25.582 s | 27.639 s    10 runs

New:
% hyperfine -w1 -r10 'ctfconvert -o /dev/null unix.o'
  Time (mean):     17.836 s +-  0.467 s    [User: 52.066 s, System: 2.359 s]
  Range (min | max):   17.049 s | 18.467 s    10 runs
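
To put the two comparisons on the same scale, the mean times above can be turned into a relative reduction. The snippet below is only a small arithmetic sketch; `relative_reduction` is a hypothetical helper, and the figures are copied from the hyperfine output. It works out to roughly 5.5% less time for sunddi.o and 33.6% for unix.o.

```python
def relative_reduction(old_mean, new_mean):
    """Fraction by which the mean run time dropped from old to new."""
    return (old_mean - new_mean) / old_mean


# Mean times copied from the hyperfine output above.
MEANS = {
    "sunddi.o": (225.8, 213.4),    # milliseconds (old, new)
    "unix.o": (26.849, 17.836),    # seconds (old, new)
}

for name, (old, new) in MEANS.items():
    print(f"{name}: {relative_reduction(old, new):.1%} less time, "
          f"{old / new:.2f}x speedup")
```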

I also tested converting libcrypto.so.1.1, 13MiB, which contains 659 DWARF CUs. The current gate version cannot convert this file due to one of the CUs (from aesni-mb-x86_64.s) not having any debug data - it aborts the whole process.
I did this test to get a better feel for the default level of threads and batch size.

ctfconvert has always defaulted to 4 threads and, with this change, defaults to batching the CUs up into blocks of 256.

Threads   Batch size
--------------------
... 1 ... 16 ...   29.74s user 1.10s system  99% cpu 30.924 total
... 1 ... 32 ...   19.45s user 1.17s system  99% cpu 20.693 total
... 1 ... 64 ...   15.64s user 1.18s system  99% cpu 16.893 total
... 1 ... 128 ...  13.79s user 1.12s system  99% cpu 14.977 total
... 1 ... 256 ...  12.87s user 1.19s system  99% cpu 14.140 total
... 1 ... 512 ...  11.58s user 1.17s system  99% cpu 12.844 total
... 4 ... 16 ...   41.40s user 1.94s system 140% cpu 30.875 total
... 4 ... 32 ...   28.28s user 1.85s system 169% cpu 17.817 total
... 4 ... 64 ...   23.80s user 1.83s system 191% cpu 13.408 total
... 4 ... 128 ...  21.26s user 1.82s system 213% cpu 10.810 total
... 4 ... 256 ...  19.23s user 1.75s system 235% cpu  8.911 total
... 4 ... 512 ...  18.59s user 1.87s system 243% cpu  8.413 total
... 8 ... 16 ...   47.24s user 2.76s system 163% cpu 30.494 total
... 8 ... 32 ...   34.63s user 2.60s system 205% cpu 18.153 total
... 8 ... 64 ...   28.96s user 2.54s system 245% cpu 12.819 total
... 8 ... 128 ...  25.95s user 2.52s system 282% cpu 10.079 total
... 8 ... 256 ...  23.45s user 2.46s system 321% cpu  8.063 total
... 8 ... 512 ...  23.43s user 2.56s system 333% cpu  7.801 total
... 16 ... 16 ...  53.59s user 3.65s system 172% cpu 33.138 total
... 16 ... 32 ...  40.53s user 3.56s system 232% cpu 18.947 total
... 16 ... 64 ...  35.39s user 3.36s system 285% cpu 13.591 total
... 16 ... 128 ... 32.44s user 3.31s system 335% cpu 10.650 total
... 16 ... 256 ... 30.04s user 3.37s system 382% cpu  8.744 total
... 16 ... 512 ... 29.72s user 3.38s system 406% cpu  8.135 total
... 32 ... 16 ...  52.85s user 4.06s system 176% cpu 32.242 total
... 32 ... 32 ...  43.53s user 4.21s system 239% cpu 19.897 total
... 32 ... 64 ...  38.01s user 3.91s system 295% cpu 14.200 total
... 32 ... 128 ... 34.79s user 3.65s system 353% cpu 10.878 total
... 32 ... 256 ... 33.24s user 3.63s system 416% cpu  8.862 total
... 32 ... 512 ... 32.23s user 3.74s system 428% cpu  8.402 total
... 64 ... 16 ...  54.10s user 4.46s system 176% cpu 33.251 total
... 64 ... 32 ...  42.43s user 4.40s system 236% cpu 19.833 total
... 64 ... 64 ...  38.14s user 4.28s system 290% cpu 14.591 total
... 64 ... 128 ... 33.67s user 4.03s system 339% cpu 11.104 total
... 64 ... 256 ... 33.59s user 3.93s system 401% cpu  9.352 total
... 64 ... 512 ... 32.82s user 3.97s system 424% cpu  8.660 total

4 threads and batches of 256 still seems to be a sweet spot. jperkin reached the same conclusion in https://smartos.org/bugview/OS-6485 for libicudata.so, which I still can't convert.
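
One way to read the table is to look at how much wall-clock time each doubling of the batch size saves. The sketch below parses the 4-thread rows (copied verbatim from the table) and prints the fractional reduction per step. `parse_rows` and `doubling_gains` are hypothetical helpers written for this illustration. For these rows the saving shrinks at every step, and going from 256 to 512 saves under 6%.

```python
import re

# Matches rows such as:
# ... 4 ... 16 ...   41.40s user 1.94s system 140% cpu 30.875 total
ROW = re.compile(
    r"\.\.\.\s+(\d+)\s+\.\.\.\s+(\d+)\s+\.\.\.\s+"
    r"([\d.]+)s user\s+([\d.]+)s system\s+(\d+)% cpu\s+([\d.]+) total"
)

# The 4-thread rows, copied from the table above.
FOUR_THREAD_ROWS = """\
... 4 ... 16 ...   41.40s user 1.94s system 140% cpu 30.875 total
... 4 ... 32 ...   28.28s user 1.85s system 169% cpu 17.817 total
... 4 ... 64 ...   23.80s user 1.83s system 191% cpu 13.408 total
... 4 ... 128 ...  21.26s user 1.82s system 213% cpu 10.810 total
... 4 ... 256 ...  19.23s user 1.75s system 235% cpu  8.911 total
... 4 ... 512 ...  18.59s user 1.87s system 243% cpu  8.413 total
"""


def parse_rows(text):
    """Return one dict per table row that matches the expected layout."""
    rows = []
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            threads, batch, user, system, _cpu, total = match.groups()
            rows.append({
                "threads": int(threads),
                "batch": int(batch),
                "user": float(user),
                "system": float(system),
                "total": float(total),
            })
    return rows


def doubling_gains(rows):
    """Fractional wall-clock reduction for each step up in batch size."""
    ordered = sorted(rows, key=lambda r: r["batch"])
    return [
        ((a["batch"], b["batch"]), (a["total"] - b["total"]) / a["total"])
        for a, b in zip(ordered, ordered[1:])
    ]


for (small, large), gain in doubling_gains(parse_rows(FOUR_THREAD_ROWS)):
    print(f"batch {small:>3} -> {large:<3} saves {gain:.1%}")
```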

#3

Updated by Andy Fiddaman about 1 month ago

  • Related to Bug #13251: CTF conversion fails if any CU is missing DWARF data added
#4

Updated by Andy Fiddaman about 1 month ago

  • Related to Bug #13252: ctf_update()/ctf_dwarf_convert_function() leak memory added
#5

Updated by Andy Fiddaman about 1 month ago

  • Gerrit CR changed from 1012 to 1014
#6

Updated by Andy Fiddaman 24 days ago

  • Related to Bug #13278: CTF assertion failed cmp->cm_tmap[id].cmt_map == suid added
#7

Updated by Andy Fiddaman 18 days ago

With thanks to Jonathan Perkin, a full pkgsrc build has been done with the updated CTF tools with good results.

A total of 22608 objects were successfully converted, slightly up on the existing pkgsrc CTF tools (which are patched versions of older gate bits).

The summarised output from the full pkgsrc build with the updated tools is:

      1 assertion failed for thread 0xfe7e0a40, thread-id 3: kind != CTF_ERR, file /data/omnios-build/omniosorg/bloody/illumos/usr/src/lib/libctf/common/ctf_dwarf.c, line 1377
      1 ctfconvert: CTF conversion failed: Failed to mmap a needed data section
      1 ctfconvert: die main.cpp: failed to convert strong functions and variables: Invalid type identifier
      1 ctfconvert: die posix.c: failed to add inputs for merge: Failed to mmap a needed data section
      1 ctfconvert: die qrc_kdedeprecated.cxx: failed to convert strong functions and variables: Invalid type identifier
      1 ctfconvert: failed to get tag type: DW_DLE_DIE_NULL (52)
      1 ctfconvert: failed to get unsigned attribute for type: DW_DLE_BAD_REF_FORM. The form code is 0x10 which does not have an offset  for dwarf_formref() to return.
      2 ctfconvert: failed to get DW_FORM_ref4 (19) value for DW_AT_upper_bound: DW_DLE_ATTR_FORM_BAD: In function formudata (internal function) on seeing form  0x13  (DW_FORM_ref4)
      4 ctfconvert: failed to add member hidden: Invalid type identifier
      5 ctfconvert: encountered unknown DWARF encoding: 16
      5 ctfconvert: failed to add member tlsh: Invalid type identifier
      5 ctfconvert: failed to add member variable: Invalid type identifier
      6 ctfconvert: failed to add member obj: Invalid type identifier
      8 CTF conversion failed: Invalid type identifier
      8 ctfconvert: failed to add member UNNAMED: Duplicate member name definition
     10 ctfconvert: failed to get unsigned attribute for type: DW_DLE_ATTR_FORM_BAD: In function formudata (internal function) on seeing form  0xd  (DW_FORM_sdata)
     12 ctfconvert: failed to add member <various>: Limit on number of dynamic type members reached
     22 ctfconvert: CTF conversion failed: No such file or directory
     27 CTF conversion failed: Invalid argument
     29 /usr/bin/bash: ctfconvert: command not found
    109 truncating enumeration <various> at member <various>
    205 ctfconvert: CTF conversion failed: Invalid type identifier
    682 ctfconvert: CTF conversion failed: Invalid argument
    690 ctfconvert: file <various>.c is missing debug info
   3283 ctfconvert: file does not contain DWARF data

It's difficult to directly compare with the existing tools since they produce different output and less informative messages (that's mostly due to 6885 CTF Everywhere Part 1 - the old tools were forked before this came in), but Jonathan tells me this is comparable, and that the additional information in the messages is helpful in terms of working out why some objects do not convert.

#8

Updated by Robert Mustacchi 18 days ago

It certainly wasn't the intent of the CTF everywhere changes to regress the output on error messages, so if there's something concrete we should change, can we get a bug open on that?

#9

Updated by Andy Fiddaman 17 days ago

Sorry, I didn't phrase that very well.

It's the existing pkgsrc tools (which were forked before the CTF everywhere changes) that produce less useful error messages.
That is what makes it difficult to directly compare an old pkgsrc build with a newer one that uses the gate tools + the four reviews I have open.

#10

Updated by Electric Monk 15 days ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit effb27ee30c48fe502152c38487ced379d9f8693

commit  effb27ee30c48fe502152c38487ced379d9f8693
Author: Andy Fiddaman <omnios@citrus-it.co.uk>
Date:   2020-11-12T21:15:16.000Z

    13247 CTF conversion fails with large files
    13251 CTF conversion fails if any CU is missing DWARF data
    Portions Contributed by: Jonathan Perkin <jperkin@joyent.com>
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Approved by: Dan McDonald <danmcd@joyent.com>
