Bug #3068
vmstat swap/free wrong in local zones
50%
Description
vmstat sometimes displays incorrect values when run inside a local zone with a 1-second interval and a count set:
    root@oitest01:~# vmstat 1 3
     kthr      memory            page            disk          faults      cpu
     r b w   swap  free  re  mf pi po fr de sr f0 s0 s1 s3   in   sy   cs us sy id
     1 0 0 6085124 3000516 324 4440 0 0 0 0 26 1 1 4 23 1481 3826 1218 4 2 94
     0 0 0 0 0 3048 60544 0 0 0 0 0 0 0 2 4 2053 48958 1875 84 16 0
     6 0 0 4061440 979512 829 31250 0 0 0 0 0 0 0 102 7 1840 26619 2033 90 10 0
    root@oitest01:~# vmstat 1 3
     kthr      memory            page            disk          faults      cpu
     r b w   swap  free  re  mf pi po fr de sr f0 s0 s1 s3   in   sy   cs us sy id
     1 0 0 6084940 3000328 325 4446 0 0 0 0 26 1 1 4 23 1481 3831 1218 4 2 94
     13 0 0 4024012 951772 1785 52162 0 0 0 0 0 0 0 2 4 1041 60424 712 85 15 0
     0 0 0 0 0 1322 43275 0 0 0 0 0 0 20 20 0 1324 38769 916 89 11 0
This happens if the time between the first and the second or third snapshot is too long. A timestamp is taken before the snapshot is acquired; if acquiring it takes too long, sleep_until computes a negative sleep time, which results in a sleep period of half the specified interval.
/illumos-gate/usr/src/cmd/stat/common/common.c:
    70 now = gethrtime();
    71 pause = *wakeup + interval - now;
    72
    73 if (pause <= 0 || pause < (interval / 4))
This can cause two snapshots to be taken before the kstats have been updated, which happens once per second.
Since the memory counters are cumulative, the difference between two such snapshots is zero in this case, so vmstat shows no free swap or memory.
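To illustrate why two snapshots taken within the same kstat update period produce zeros, here is a minimal sketch. The `snap_t` structure, its field names, and `avg_free` are hypothetical and only model the idea of cumulative counters; this is not the actual vmstat code, and the guard against a zero update count is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of two cumulative, kstat-style counters. */
typedef struct {
	uint64_t updates;	/* number of once-per-second updates so far */
	uint64_t free_sum;	/* running sum of free pages over all updates */
} snap_t;

/*
 * Average number of free pages per update between two snapshots.
 * If no update happened in between, both deltas are zero and the
 * result is 0, which matches the "no free memory" symptom.
 */
static uint64_t
avg_free(const snap_t *old, const snap_t *new)
{
	uint64_t dupd;

	assert(new->updates >= old->updates);
	dupd = new->updates - old->updates;
	if (dupd == 0)
		return (0);	/* assumed guard against division by zero */
	return ((new->free_sum - old->free_sum) / dupd);
}
```

With one update between the snapshots the function returns the free value for that second; with identical snapshots it returns 0, regardless of how much memory is actually free.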
The problem is not noticeable with intervals of 2 s or longer, since half the interval is then always at least 1 s and the kstats will have been updated. If no count is given, the half-interval logic is skipped as well:
    73 if (pause <= 0 || pause < (interval / 4))
    74     if (forever || *caught_cont) {
    75         /* Reset our cadence (see comment below) */
    76         *wakeup = now + interval;
    77         pause = interval;
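The pause decision can be modelled as a pure function to see when the half-interval sleep is chosen. This is a sketch, not the code in common.c: `compute_pause` and `sk_hrtime_t` are hypothetical names, the current time is passed in instead of calling gethrtime(), and the half-interval branch and the way `*wakeup` advances are assumptions based on the behaviour described in this report.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t sk_hrtime_t;	/* stand-in for hrtime_t, in nanoseconds */

/*
 * Returns how long to sleep (ns) and advances *wakeup to the next
 * scheduled wakeup time.
 */
static sk_hrtime_t
compute_pause(sk_hrtime_t *wakeup, sk_hrtime_t interval, sk_hrtime_t now,
    int forever, int caught_cont)
{
	sk_hrtime_t pause_ns;

	assert(interval > 0);
	pause_ns = *wakeup + interval - now;

	if (pause_ns <= 0 || pause_ns < (interval / 4)) {
		if (forever || caught_cont) {
			/* Reset the cadence: a full interval from now. */
			*wakeup = now + interval;
			pause_ns = interval;
		} else {
			/* Assumed: catch up by sleeping half an interval. */
			pause_ns = interval / 2;
			*wakeup += interval;
		}
	} else {
		/* On schedule: sleep until the next planned wakeup. */
		*wakeup += interval;
	}
	return (pause_ns);
}
```

For example, with a 1 s interval, a count set, and a snapshot that took 1.2 s, the computed pause is negative and the function returns 0.5 s, which is shorter than the once-per-second kstat update period.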
The problem only occurs inside local zones because get_pretty_name calls libdevinfo'di_dim_init, which ends up in libdevinfo'devlink_create, which in turn tries to start devfsadmd. That daemon cannot run inside a zone, so the call retries once and sleeps between the attempts.
/illumos-gate/usr/src/lib/libdevinfo/devinfo_devlink.c:
    3267 #define MAX_DAEMON_ATTEMPTS 2
    ...
    3377 #define DAEMON_STARTUP_TIME 1 /* 1 second. This may need to be adjusted */
    …
    3311 } while ((++i < MAX_DAEMON_ATTEMPTS) &&
    3312     start_daemon(root, install) == 0);
    …
    3500 static int
    3501 start_daemon(const char *root, int install)
    …
    3520 (void) sleep(DAEMON_STARTUP_TIME);
Easy to fix but a bit hard to find. I have prepared a patch that modifies devinfo_devlink.c to return without trying to start the daemon inside local zones.
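The effect of such a change can be sketched with a small model of the retry loop. This is not the patch itself: `daemon_wait_seconds` and the `SK_` constants are hypothetical, the zone ID is passed in as a parameter (real code would presumably compare getzoneid() against GLOBAL_ZONEID), and every attempt to contact the daemon is assumed to fail.

```c
#include <assert.h>

#define	SK_GLOBAL_ZONEID	0	/* hypothetical stand-in for GLOBAL_ZONEID */
#define	SK_MAX_DAEMON_ATTEMPTS	2	/* mirrors MAX_DAEMON_ATTEMPTS */
#define	SK_DAEMON_STARTUP_TIME	1	/* seconds, mirrors DAEMON_STARTUP_TIME */

/*
 * Seconds spent sleeping while trying to reach a daemon that never
 * answers. With skip_in_local_zone set, a non-global zone returns
 * immediately instead of trying to start the daemon.
 */
static int
daemon_wait_seconds(int zoneid, int skip_in_local_zone)
{
	int i, slept = 0;

	assert(zoneid >= 0);
	if (skip_in_local_zone && zoneid != SK_GLOBAL_ZONEID)
		return (0);	/* assumed early return in local zones */

	for (i = 0; i < SK_MAX_DAEMON_ATTEMPTS; i++) {
		/* The contact attempt is assumed to fail here. */
		if (i + 1 < SK_MAX_DAEMON_ATTEMPTS)
			slept += SK_DAEMON_STARTUP_TIME; /* sleep before retry */
	}
	return (slept);
}
```

Under these assumptions the unpatched path costs one second per call in a local zone, enough to push a 1-second vmstat interval past its deadline, while the early return costs nothing.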
Updated by Henrik Johansson over 11 years ago
- Status changed from New to In Progress