Bug #13837: Race condition synchronously enabling "svccfg add"-ed instance
Race condition adding services crashes startd
While trying to debug https://www.illumos.org/issues/13837 (which is worth reading, for background), I managed to trigger a (reproducible!) assertion error in startd.
$ svccfg import <manifest>
$ svccfg -s <service FMRI> add <instance>
$ svcadm enable -t <instance FMRI>
Running these commands in a tight loop caused an assertion to fail in startd:
"Assertion failed: 0, file restarter.c, line 532", which refers to "usr/src/cmd/svc/startd/restarter.c".
I've attached a shell script, manifest file, and core dump of startd. Warning: running these reproducers will crash startd, so make sure you're on a machine where you can access the console to recover.
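For readers without the attachments, here is a minimal sketch of what such a tight loop could look like. It is not the attached script. The variable names (MANIFEST, SERVICE, COUNT, RUN), the function names, the default values, and the choice of a fresh instance name per iteration are all assumptions made for illustration. It defaults to a dry run that only prints the commands, since running them for real may crash startd.

```shell
#!/bin/sh
# Sketch only: NOT the script attached to this issue.
# Hypothetical placeholders; substitute your own manifest and service FMRI.
MANIFEST=${MANIFEST:-/path/to/manifest.xml}
SERVICE=${SERVICE:-svc:/site/example}
COUNT=${COUNT:-100}

# Dry run by default: each command is printed instead of executed.
# Invoke with RUN= (set but empty) to execute for real. That may crash
# startd, so only do it where you have console access.
RUN=${RUN-echo}

# One iteration: import the manifest, add an instance, then temporarily
# enable it. The per-iteration instance name is an assumption.
repro_once() {
	inst="inst$1"
	$RUN svccfg import "$MANIFEST" &&
	    $RUN svccfg -s "$SERVICE" add "$inst" &&
	    $RUN svcadm enable -t "$SERVICE:$inst"
}

# Repeat the three commands back to back, COUNT times.
repro() {
	i=0
	while [ "$i" -lt "$COUNT" ]; do
		repro_once "$i" || return 1
		i=$((i + 1))
	done
}

repro
```

With the defaults this only prints the command sequence, which makes it easy to check the FMRIs before committing to a real run.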
Updated by Jason King 16 days ago
Looking at the code a bit, it appears that the assert is triggered because libscf_get_startd_properties() is returning an unexpected error value (amusingly, there is a bad_error() macro that seems like it'd be appropriate here, but it isn't being used).
Besides ENOENT, it appears that libscf_get_startd_properties() can also return ECHILD (usr/src/cmd/svc/startd/libscf.c:2278). That seems like a good candidate for the cause (though I'm not sure offhand how that error should be handled).