Bug #13838
Related: Bug #13837 (open): Race condition synchronously enabling "svccfg add"-ed instance
Race condition adding services crashes startd
Description
While trying to debug https://www.illumos.org/issues/13837 (which is worth reading, for background), I managed to trigger a (reproducible!) assertion error in startd.
TL;DR:
$ svccfg import <manifest>
$ svccfg -s <service FMRI> add <instance>
$ svcadm enable -t <instance FMRI>
Running these commands in a tight loop caused an assertion to fail in startd:
"Assertion failed: 0, file restarter.c, line 532", which refers to "usr/src/cmd/svc/startd/restarter.c".
I've attached a shell script, manifest file, and core dump of startd. Warning: running these reproducers will crash startd, so make sure you're on a machine where you can access the console to recover.
Updated by Jason King 12 months ago
Looking at the code a bit, it appears that assert is triggered because libscf_get_startd_properties() is returning an unexpected error value (amusingly, there is a bad_error() macro that seems like it'd be appropriate here, but isn't being used).
Looking at libscf_get_startd_properties(), it appears that besides 0, ECONNABORTED, ECANCELED, and ENOENT, it can also return an error of ECHILD (usr/src/cmd/svc/startd/libscf.c:2278). That seems like a good candidate for the cause (though I'm not sure offhand how that error should be handled).
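To illustrate the suspected failure mode, here is a minimal sketch of a caller that only anticipates 0, ECONNABORTED, ECANCELED, and ENOENT. It is not the code from restarter.c: the enum, its values, and classify_props_error() are hypothetical names made up for this example, and the default branch returns a value where the real caller presumably hits its assert.

```c
#include <errno.h>

/* Hypothetical outcome categories; these names do not come from startd. */
typedef enum {
	PROPS_OK,		/* the call returned 0 */
	PROPS_KNOWN_ERROR,	/* one of the errors the caller anticipates */
	PROPS_UNEXPECTED	/* anything else, e.g. ECHILD */
} props_outcome_t;

/*
 * Classify a return value the way a caller that only knows about
 * 0, ECONNABORTED, ECANCELED and ENOENT would. This is an assumed
 * shape for the caller, not a copy of it.
 */
props_outcome_t
classify_props_error(int err)
{
	switch (err) {
	case 0:
		return (PROPS_OK);
	case ECONNABORTED:
	case ECANCELED:
	case ENOENT:
		return (PROPS_KNOWN_ERROR);
	default:
		/* A caller that does assert(0) here would abort on ECHILD. */
		return (PROPS_UNEXPECTED);
	}
}
```

With this shape, classify_props_error(ECHILD) lands in the default branch, which would match the "Assertion failed: 0" message if that branch asserts. How ECHILD ought to be handled there is still an open question.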