Bug #8379

illumos-gate 'install' make target is too eager building things

Added by Jason King 4 months ago. Updated 4 months ago.

Status: Closed
Start date: 2017-06-12
Priority: Normal
Due date:
Assignee: Jason King
% Done: 100%
Category: tools - gate/build tools
Target version: -
Difficulty: Medium
Tags: needs-triage

Description

I had nightly(1) fail with errors like this (reformatted for clarity and sanity):

/root/illumos/illumos-gate/usr/src/tools/proto/root_i386-nd/opt/onbld/bin/i386/cw \
    -_gcc -O -m32 -K pic -xspace -Xa -xildoff \
    -errtags=yes \
    -errwarn=%all \
    -erroff=E_EMPTY_TRANSLATION_UNIT \
    -erroff=E_STATEMENT_NOT_REACHED \
    -_gcc=-Wno-missing-braces \
    -_gcc=-Wno-sign-compare \
    -_gcc=-Wno-unknown-pragmas \
    -_gcc=-Wno-unused-parameter \
    -_gcc=-Wno-missing-field-initializers \
    -_gcc=-Wno-array-bounds \
    -_gcc=-Wno-type-limits \
    -_gcc=-Wno-parentheses \
    -_gcc=-Wno-unused-value \
    -_gcc=-Wno-type-limits \
    -_gcc=-Wno-switch \
    -xc99=%all \
    -W0,-xglobalstatic \
    -_gcc=-fno-inline-small-functions \
    -_gcc=-fno-inline-functions-called-once \
    -_gcc=-fno-ipa-cp \
    -v \
    -DDEBUG \
    -xF=%all \
    -g \
    -xc99=%all \
    -W0,-noglobal \
    -xdebugformat=stabs \
    -I. \
    -I../common \
    -I../../include \
    -I../../include/i386 \
    -D_TS_ERRNO \
    -I/root/illumos/illumos-gate/proto/root_i386/usr/include \
    -I/root/illumos/illumos-gate/../proto.strap/usr/include \
    -I/root/illumos/illumos-gate/usr/src/common/elfcap \
    -I../../../../lib/libc/inc \
    -I/root/illumos/illumos-gate/usr/src/common/elfcap \
    -I/root/illumos/illumos-gate/usr/src/common/sgsrtcid \
    -DPIC \
    -D_REENTRANT \
    -c -o pics/demangle.o ../common/demangle.c
+ /root/illumos/illumos-gate/../proto.strap/usr/bin/gcc \
    -fident -finline -fno-inline-functions -fno-builtin -fno-asm \
    -fdiagnostics-show-option -nodefaultlibs \
    -D__sun \
    -O -m32 \
    -fpic \
    -Wall -Wextra -Werror -Wno-missing-braces -Wno-sign-compare \
    -Wno-unknown-pragmas -Wno-unused-parameter -Wno-missing-field-initializers \
    -Wno-array-bounds -Wno-type-limits -Wno-parentheses -Wno-unused-value \
    -Wno-type-limits -Wno-switch \
    -std=gnu99 \
    -fno-inline-small-functions -fno-inline-functions-called-once -fno-ipa-cp \
    -DDEBUG \
    -gdwarf-2 \
    -std=gnu99 \
    -I. \
    -I../common \
    -I../../include \
    -I../../include/i386 \
    -D_TS_ERRNO \
    -I/root/illumos/illumos-gate/proto/root_i386/usr/include \
    -I/root/illumos/illumos-gate/../proto.strap/usr/include \
    -I/root/illumos/illumos-gate/usr/src/common/elfcap \
    -I../../../../lib/libc/inc \
    -I/root/illumos/illumos-gate/usr/src/common/elfcap \
    -I/root/illumos/illumos-gate/usr/src/common/sgsrtcid \
    -DPIC \
    -D_REENTRANT \
    -c -o pics/demangle.o ../common/demangle.c
In file included from /usr/include/demangle.h:31,
                 from ../common/demangle.c:27:
/root/illumos/illumos-gate/proto/root_i386/usr/include/stddef.h:36:24: error: \
    sys/stddef.h: No such file or directory
*** Error code 1
make: Warning: Command failed for target `pics/demangle.o'
Current working directory /root/illumos/illumos-gate/usr/src/cmd/sgs/libconv/i386

And similar errors for usr/src/cmd/sendmail/libmilter, etc...

Building things by hand worked fine, and examination of $ROOT/usr/include afterwards showed that indeed $ROOT/usr/include/sys/stddef.h was present, suggesting a race.

Spelunking through the build system, the following portions of usr/src/Makefile are of interest:

COMMON_SUBDIRS= data uts lib cmd ucblib ucbcmd psm man test
sparc_SUBDIRS= stand
i386_SUBDIRS= grub boot

#
# sparc needs to build stand before psm
#
$(SPARC_BLD)psm: stand

SUBDIRS= $(COMMON_SUBDIRS) $($(MACH)_SUBDIRS)
...
#
# Headers that can be built in parallel
#
PARALLEL_HEADERS = sysheaders userheaders libheaders cmdheaders

#
# Directories that can be built in parallel
#
PARALLEL_DIRS = data uts lib man
...
all: mapfiles closedbins sgs .WAIT $(SUBDIRS) pkg
...
install: install1 install2 _msg stage-licenses
        @cd msg; pwd; $(MAKE) _msg
        @rm -rf "$(ROOT)/catalog" 
...
install1: mapfiles closedbins sgs

install2: install1 $(SUBDIRS)
...
#
# Declare what parts can be built in parallel
# DUMMY at the end is used in case macro expansion produces an empty string to
# prevent everything going in parallel
#
.PARALLEL: $(PARALLEL_HEADERS) DUMMY
.PARALLEL: $(PARALLEL_DIRS) DUMMY
...
# librpcsvc has a dependency on headers installed by
# userheaders, hence the .WAIT before libheaders.
sgs: rootdirs .WAIT sysheaders userheaders .WAIT \
        libheaders cmdheaders
...
setup: closedbins bldtools sgs mapfiles

From this, key observations:

  • The 'sgs' target is what creates the proto area and installs the header files from the various locations in the gate into the proto area
  • As the install1 target is a phony target and has no build rules, the dependencies of install2 are effectively 'mapfiles closedbins sgs $(SUBDIRS)'
  • The 'all' target performs a .WAIT between the sgs target and $(SUBDIRS)
  • While 'cmd' is not in the list of parallel targets, 'lib' is

It should also be noted that nightly(1) builds the gate via '[d]make i install' and the 'setup' target.

These observations by themselves are highly suggestive: why would the 'all' target wait for everything up to and including sgs to complete before descending into $(SUBDIRS) while the 'install' target does not? That means things under usr/src/lib could in theory be built while the proto area (including header files) is still being populated. This in itself is almost certainly an error, but it doesn't explain the error messages I saw (after all, those came from usr/src/cmd, which is not in the list of parallel targets).

However, examination of usr/src/lib/Makefile revealed something rather surprising: it builds targets under usr/src/cmd! In fact, the targets it builds include '../cmd/sgs/libconv' and '../cmd/sendmail/libmilter', the sources of two of the errors I encountered. This also explains why I could not reproduce the failure outside of nightly: most if not all of the documentation (including bldenv) tells everyone to '[d]make setup; make' or similar, which of course won't trigger the race.

While we could try to get extremely surgical, I think doing what 'make all' does, adding a .WAIT prior to $(SUBDIRS), will fix the issue without eliminating any significant amount of parallelism. Basically, it will wait for the proto area and the header files to finish installation prior to building anything else (which will then proceed as today).
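As a rough sketch of that idea, the install rules could mirror the ordering the 'all' target already uses. The target and macro names below are taken from the usr/src/Makefile excerpt quoted earlier. Where exactly the .WAIT goes is an assumption for illustration, and the committed change may differ in detail.

```make
# Sketch only, not the committed diff: same targets as the excerpt above.
install1: mapfiles closedbins sgs

# Assumed placement: .WAIT tells [d]make to finish install1 (and therefore
# sgs, which populates the proto area and installs the headers) before it
# starts descending into any of $(SUBDIRS).
install2: install1 .WAIT $(SUBDIRS)
```

With this ordering, the parallelism declared by .PARALLEL for $(PARALLEL_DIRS) is untouched. The only thing serialized is the boundary between header installation and the subdirectory builds.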

History

#1 Updated by Electric Monk 4 months ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

git commit f15a6fde3c0a52aca95943ffd637d7b8220e2261

commit  f15a6fde3c0a52aca95943ffd637d7b8220e2261
Author: Jason King <jason.brian.king@gmail.com>
Date:   2017-06-19T15:07:42.000Z

    8379 illumos-gate 'install' make target is too eager building things
    8360 ipdadm missing 'all' target
    8359 libzpool Makefiles are slightly broken
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Yuri Pankov <yuripv@gmx.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
