Bug #13837
Race condition synchronously enabling "svccfg add"-ed instance
Status: open
Description
Hi! I'm new to the illumos community, so if there is any additional info I can supply, I'm very happy to do so - lemme know.
TL;DR: I have effectively run the following three commands:
$ svccfg import <manifest>
$ svccfg -s <service FMRI> add <instance>
$ svcadm enable -st <instance FMRI>
In doing so, I occasionally - but not always - see this error while executing svcadm:
svcadm: <instance FMRI> is misconfigured ("restarter" property group lacks "state" property).
Also, sometimes I see the following error instead (to be explicit - using exactly the same commands in a repro script):
svcadm: <instance FMRI> is misconfigured (lacks "restarter" property group)
Fortunately, I'm able to reproduce this pretty easily, and I think the fault is specific to the "-s" flag to svcadm. I have attached two files, "repro.sh" and "manifest.xml", which are simplifications of my "real service" that triggered this bug. Running repro.sh (which refers to "manifest.xml") reproduces the error within seconds.
I'm currently running OmniOS r151038.
Updated by Sean Klein 12 months ago
To add more color here:
- "svcadm enable -st <FMRI>" should temporarily enable the instance, and wait for it to enter the online or degraded state.
- When "svccfg add" (to create the instance) has completed, the instance does not have a "restarter" property group. Typically svc.startd is responsible for adding this property group if it has not been supplied...
- ... but I believe the addition of this "restarter" property group is racy with the synchronous check in "svcadm enable". Sometimes svcadm returns and the service is online; other times, the property group hasn't been added yet.
As a workaround in my "real" code, I'm avoiding all usage of the "-s" flag to "svcadm enable", and instead spinning in a loop which runs svcprop. This is a hack; I'd prefer the "-s" flag be usable.
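For anyone who needs the same workaround, here is a minimal sketch of such a polling loop. It is not the code from my real service: the helper name "wait_for_online", the one-second poll interval, the default timeout, and the choice to give up early on "maintenance" are all assumptions made for illustration. It relies only on "svcprop -p restarter/state <FMRI>", and treats a failing svcprop (no "restarter" property group yet) as "not ready".

```shell
#!/bin/sh
# Hypothetical helper (not part of SMF): poll an instance until svc.startd
# reports it as "online" or "degraded".
# Usage: wait_for_online <instance FMRI> [timeout in seconds, default 60]
# Returns 0 on online/degraded, 2 on maintenance, 1 on timeout.
wait_for_online() {
    fmri=$1
    remaining=${2:-60}

    while [ "$remaining" -gt 0 ]; do
        # svcprop exits non-zero while the "restarter" property group or its
        # "state" property is still missing; treat that as "not ready yet"
        # rather than as a misconfiguration.
        state=$(svcprop -p restarter/state "$fmri" 2>/dev/null) || state=""

        case "$state" in
            online|degraded) return 0 ;;
            # Assumption: maintenance needs an administrator, so stop waiting.
            maintenance)     return 2 ;;
        esac

        sleep 1
        remaining=$((remaining - 1))
    done

    return 1
}
```

With this in place, the enable step becomes "svcadm enable -t <instance FMRI>" (without "-s") followed by "wait_for_online <instance FMRI> 30". The trade-off is that a genuinely misconfigured instance is only detected by the timeout.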
Updated by Sean Klein 12 months ago
WARNING: As a follow-up, when I ran these repro steps without the "-s" flag to svcadm, I was able to crash svc.startd. It hit "Assertion failed: 0, file restarter.c, line 532", which refers to "usr/src/cmd/svc/startd/restarter.c".
A minified example to repro this case exists here: https://www.illumos.org/issues/13838