Bug #13837


Race condition synchronously enabling "svccfg add"-ed instance

Added by Sean Klein 16 days ago. Updated 16 days ago.

Status: New
Priority: Normal
Assignee: -
Category: smf
Start date:
Due date:
% Done: 0%
Estimated time: (Total: 0.00 h)
Difficulty: Medium
Tags:
Gerrit CR:

Description

Hi! I'm new to the illumos community, so if there is any additional info I can supply, I'm very happy to do so - lemme know.

TL;DR: I have effectively run the following three commands:

$ svccfg import <manifest>
$ svccfg -s <service FMRI> add <instance>
$ svcadm enable -st <instance FMRI>

In doing so, I occasionally - but not always - see this error while executing svcadm:

svcadm: <instance FMRI> is misconfigured ("restarter" property group lacks "state" property).

Also, sometimes I see the following error instead (to be explicit - using exactly the same commands in a repro script):

svcadm: <instance FMRI> is misconfigured (lacks "restarter" property group)

Fortunately, I can reproduce this easily, and I think the fault is specific to the "-s" flag to svcadm. I have attached two files, "repro.sh" and "manifest.xml", which are simplifications of the "real service" that triggered this bug. Running repro.sh (which refers to "manifest.xml") reproduces the error within seconds.
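
For readers without the attachments, a loop along the following lines might exercise the same three commands repeatedly. This is only a sketch and is not the attached repro.sh: the function name repro_loop, its arguments, and the cleanup step (svccfg delete -f) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical repro loop; NOT the attached repro.sh.
# Arguments: manifest path, service FMRI, instance name, max iterations.
repro_loop() {
    manifest=$1
    service=$2
    instance=$3
    max=${4:-100}
    n=1
    while [ "$n" -le "$max" ]; do
        # The three commands from the description.
        svccfg import "$manifest" || return 2
        svccfg -s "$service" add "$instance" || return 2
        if ! svcadm enable -st "$service:$instance"; then
            echo "svcadm failed on iteration $n"
            return 1
        fi
        # Assumed cleanup: delete the service so the next pass starts clean.
        svccfg delete -f "$service" || return 2
        n=$((n + 1))
    done
    echo "no failure in $max iterations"
    return 0
}
```

It would be invoked as, for example, repro_loop manifest.xml <service FMRI> <instance> 1000, and returns 1 on the first iteration where svcadm reports an error.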

I'm currently running OmniOS r151038.


Files

repro.sh (263 Bytes) repro.sh Reproducer script Sean Klein, 2021-05-27 12:00 PM
manifest.xml (836 Bytes) manifest.xml Super simple manifest which repros the issue. Sean Klein, 2021-05-27 12:00 PM

Subtasks 1 (1 open, 0 closed)

Bug #13838: Race condition adding services crashes startd (New)

Actions #1

Updated by Sean Klein 16 days ago

To add more color here:

- svcadm enable -st <FMRI> should temporarily enable the instance ("-t") and wait ("-s") for it to enter the online or degraded state.
- When svccfg add (to create the instance) has completed, the instance does not yet have a "restarter" property group. Typically svc.startd is responsible for adding this property group if it has not been supplied.
- I believe the addition of this "restarter" property group races with the synchronous check in svcadm enable. Sometimes svcadm returns and the service is online; other times the property group hasn't been added yet and svcadm fails with one of the errors above.
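
The three situations described in this list can be told apart from the shell. The helper below is a minimal sketch, assuming that svcprop exits non-zero when the requested property group or property does not exist; restarter_status is a hypothetical name, not an existing tool.

```shell
#!/bin/sh
# Sketch: report how far svc.startd has got in initialising an instance.
# restarter_status is a hypothetical helper; it assumes svcprop exits
# non-zero when the property group or property is missing.
restarter_status() {
    fmri=$1
    if ! svcprop -p restarter "$fmri" >/dev/null 2>&1; then
        # Matches: lacks "restarter" property group
        echo "no-restarter-pg"
    elif ! state=$(svcprop -p restarter/state "$fmri" 2>/dev/null); then
        # Matches: "restarter" property group lacks "state" property
        echo "no-state-property"
    else
        echo "state=$state"
    fi
}
```

Calling restarter_status <instance FMRI> immediately after svccfg add, and again a moment later, should show whether the property group appears asynchronously.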

As a workaround in my "real" code, I'm avoiding the "-s" flag to svcadm enable entirely and instead spinning in a loop that runs svcprop. This is a hack; I'd prefer "-s" to be usable.
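
A polling loop of this kind might look roughly like the following. It is a sketch of the idea, not the code from the "real" service: the name wait_for_online, the retry count, and the POLL_INTERVAL variable are all assumptions.

```shell
#!/bin/sh
# Sketch: poll restarter/state instead of relying on "svcadm enable -s".
# wait_for_online, its retry count, and POLL_INTERVAL are hypothetical.
wait_for_online() {
    fmri=$1
    tries=${2:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # While the property does not exist yet svcprop fails;
        # treat that as "not ready" rather than as an error.
        state=$(svcprop -p restarter/state "$fmri" 2>/dev/null) || state=""
        case "$state" in
            online|degraded) return 0 ;;
        esac
        i=$((i + 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    # Gave up: the instance never reached online or degraded.
    return 1
}
```

Typical use would be svcadm enable -t <instance FMRI> followed by wait_for_online <instance FMRI> 60, so a missing "restarter" property group is retried instead of being reported as a misconfiguration.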

Actions #2

Updated by Sean Klein 16 days ago

Warning, as a follow-up: when I ran these repro steps without the "-s" flag to svcadm, I was able to crash startd. It hit "Assertion failed: 0, file restarter.c, line 532", which refers to "usr/src/cmd/svc/startd/restarter.c".

A minimal example that reproduces this case is here: https://www.illumos.org/issues/13838
