Bug #13247
CTF conversion fails with large files
Description
In continuing the rollout of CTF to more of OmniOS userland, I hit a memory scaling problem with a 30MiB shared library containing 3,816 compilation units. ctfconvert cannot even initialise the DIE list without running out of memory (ctfconvert is 32-bit).
Talking to Jonathan Perkin, I learned that he hit the same issue with the same shared library (libicudata.so) a while back and wrote a patch for ctfconvert. This is open as https://smartos.org/bugview/OS-6485 and https://github.com/joyent/illumos-joyent/pull/215, but CTF in gate has moved on a bit since then.
Here's the write-up from his issue:
During the work to CTF convert pkgsrc, libicudata.so exposed memory scaling issues with ctfconvert. It is a 29MB shared library containing 3,449 DIEs.
The current conversion process allocates memory as follows (a sketch of this pattern follows the list):
- A new ctf_die_t is created for each DIE during the initialisation process.
- Each ctf_die_t holds a DWARF handle open on the input file.
- Each ctf_die_t includes a new ctf_file_t allocation, each of which mmap()s its own private copy of the CTF data, symtab, and strtab from the object.
- During the merge process, for each DIE, an extra ctf_file_t is allocated, again with its own mmap()ed private copies, as the destination for the merge output.
- These allocations are only freed at the end of the entire conversion process.
- With all these allocations, a 32-bit ctfconvert process runs out of available memory while processing libicudata.so.
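To make the scaling concrete, here is a simplified sketch of the allocation pattern just described. The structure layout and helper names are purely illustrative and are not the real ctfconvert/libctf interfaces; the point is that every DIE carries its own open DWARF handle plus privately mmap()ed copies of the CTF data, symtab, and strtab, the merge adds a second ctf_file_t per DIE, and none of it is released until the very end.

    /*
     * Illustrative sketch only -- the types and helpers below are
     * hypothetical, not the actual ctfconvert code.
     */
    typedef struct conv_die {
        void *cd_dwarf;  /* open DWARF handle on the input object */
        void *cd_ctf;    /* ctf_file_t with private mmap()ed CTF data, symtab, strtab */
        void *cd_merge;  /* per-DIE merge output, again with its own mmap()s */
    } conv_die_t;

    /* Old flow: per-DIE state is built up front... */
    for (size_t i = 0; i < ndies; i++) {
        dies[i].cd_dwarf = open_dwarf_handle(input);  /* hypothetical helper */
        dies[i].cd_ctf = open_private_ctf(input);     /* hypothetical helper */
    }

    /* ...the merge adds one more ctf_file_t (and more mmap()s) per DIE... */
    for (size_t i = 0; i < ndies; i++)
        dies[i].cd_merge = merge_one(&dies[i]);       /* hypothetical helper */

    /* ...and nothing is freed until the whole conversion finishes. */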
With the proposed patch, the conversion process changes as follows (see the sketch after this list):
- Instead of allocating a full ctf_die_t for every DIE up front, we defer full initialisation to the main conversion process in ctf_dwarf_convert_one().
- DIEs are processed in batches, defaulting to a batch size of 256 (configurable via -b on the command line if necessary).
- Each batch is converted and merged until a single merged ctf_file_t is returned as the result.
- After processing a batch, the input DIEs for that batch are freed.
- The merged ctf_file_t is added as an input to the next batch.
- The process continues until we have processed all batches and end with a final merged ctf_file_t, or a failure.
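A minimal sketch of that batched loop, assuming hypothetical helpers (init_batch(), batch_add_input(), convert_and_merge(), free_batch()) rather than the patch's actual functions; only the shape of the control flow is the point:

    /* Hypothetical sketch of the batched conversion described above. */
    ctf_file_t *merged = NULL;

    for (size_t off = 0; off < ndies; off += batchsize) {
        size_t n = MIN(batchsize, ndies - off);

        /* Fully initialise only the DIEs in this batch. */
        batch_t *b = init_batch(input, off, n);

        /* The previous batch's merged result becomes an extra input. */
        if (merged != NULL)
            batch_add_input(b, merged);

        /* Convert and merge the batch down to a single ctf_file_t. */
        merged = convert_and_merge(b);
        if (merged == NULL)
            return (-1);    /* conversion failed */

        /* The input DIEs for this batch are no longer needed. */
        free_batch(b);
    }

    /* merged now holds the CTF container for the whole object. */

Because each batch's merged result feeds into the next batch, peak memory is bounded by roughly one batch's worth of DIEs plus one merged container, rather than by the total number of DIEs in the object.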
The default batchsize of 256 is based on a few constraints:
- Processing DIEs in multiple batches means that ctf_id_t's will be different compared to those generated by a previous ctfconvert. If we're able to choose a batchsize which is larger than the number of DIEs in most objects then we will avoid changing ctf_id_t's. Whilst mostly cosmetic, it's still nice to avoid differences if possible.
- Performance goes up as the batchsize increases, at least when using the default of 4 threads.
- The batchsize needs to be well below the number of DIEs that can be processed without hitting memory limits.
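For example, with the default batchsize of 256, libicudata.so's 3,449 DIEs are processed in 14 batches (3,449 / 256, rounded up), while any object with 256 or fewer DIEs is still converted in a single batch and so keeps the same ctf_id_t assignment as before. For an object sitting just above the default, the patch's -b option can raise the batch size, trading more memory per batch for a single-pass conversion.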
Related issues