Bug #4246
kstat read improperly returned ENOMEM
Status: Resolved
Priority: Normal
Assignee: -
Category: kernel
Start date: 2013-10-20
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Description
Summarized from the Joyent bug report:
This code is taking the result of kstat_read(), looking at the "data" field, and pulling out the "zonename". But "data" is undefined. That's odd. Well, there's this code in node-kstat's "read" function:

    Handle<Value>
    KStatReader::read(kstat_t *ksp)
    {
        Handle<Object> rval = Object::New();
        Handle<Object> data;

        rval->Set(String::New("class"), String::New(ksp->ks_class));
        rval->Set(String::New("module"), String::New(ksp->ks_module));
        rval->Set(String::New("name"), String::New(ksp->ks_name));
        rval->Set(String::New("instance"), Integer::New(ksp->ks_instance));

        if (kstat_read(ksr_ctl, ksp, NULL) == -1) {
            /*
             * It is deeply annoying, but some kstats can return errors
             * under otherwise routine conditions. (ACPI is one
             * offender; there are surely others.) To prevent these
             * fouled kstats from completely ruining our day, we assign
             * an "error" member to the return value that consists of
             * the strerror().
             */
            rval->Set(String::New("error"), String::New(strerror(errno)));
            return (rval);
        }

We can check core.node.44430 on RM08218 to see if we hit this case. Indeed, this shows us the set of possible objects that could represent kstat read results:

    ::findjsobjects -p instance
    a4e93511
    8598301d
    85575481
    85983291
    8511e795
    81975c9d

and this one's our winner:

    > 85575481::jsprint
    {
        name: "z6385_net0",
        error: "Not enough space",
        module: "link",
        class: "net",
        instance: 0,
    }

According to usr/src/lib/libc/port/gen/errlist, "Not enough space" is errno 12, or ENOMEM. kstat_read() is documented to return ENOMEM when:

    ENOMEM    Insufficient storage space is available.

It's unclear why this would have happened.
There certainly haven't been any anonymous memory allocation failures in the global zone:

    # kstat -m memory_cap -i 0
    module: memory_cap                      instance: 0
    name:   global                          class:    zone_memory_cap
            anon_alloc_fail                 0
            anonpgin                        0
            crtime                          0
            execpgin                        32440
            fspgin                          287917
            n_pf_throttle                   0
            n_pf_throttle_usec              0
            nover                           0
            pagedout                        0
            pgpgin                          320357
            physcap                         18446744073709551615
            rss                             0
            snaptime                        2939894.787492560
            swap                            1955274752
            swapcap                         18446744073709551615
            zonename                        global

I'm not sure what we can do about this in Marlin. We're hitting this when we gather the baseline kstat values before dispatching a task to a zone for the first time in this zone's lifetime. If this condition is extremely transient, we could retry the kstat_read() a few times (possibly even with a timeout). It's unclear to me how likely this is to work. The alternative is to fail the task, which is not retryable in this context, and so sucks for the user.
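The retry idea floated above could look roughly like the sketch below. It is only an illustration: read_with_retry(), read_fn_t and flaky_read() are hypothetical names, the read is abstracted behind a callback instead of calling libkstat directly, and whether retrying helps at all is exactly what the report is unsure about.

```c
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() under strict C modes */

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns 0 on success, -1 with errno set on failure. */
typedef int (*read_fn_t)(void *arg);

/*
 * Call read_fn up to max_attempts times, sleeping delay_ms between attempts,
 * but only while it keeps failing with ENOMEM. Any other error is returned
 * immediately. Returns 0 on success, or -1 with errno as set by the last
 * attempt.
 */
static int
read_with_retry(read_fn_t read_fn, void *arg, int max_attempts, long delay_ms)
{
	struct timespec delay;
	int attempt;

	assert(read_fn != NULL && max_attempts > 0 && delay_ms >= 0);
	delay.tv_sec = delay_ms / 1000;
	delay.tv_nsec = (delay_ms % 1000) * 1000000L;

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		if (read_fn(arg) == 0)
			return (0);
		if (errno != ENOMEM || attempt == max_attempts)
			return (-1);
		(void) nanosleep(&delay, NULL);
	}
	return (-1);
}

/* Stand-in for a kstat read: fails with ENOMEM until *arg reaches zero. */
static int
flaky_read(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		errno = ENOMEM;
		return (-1);
	}
	return (0);
}
```

With a counter of 2, read_with_retry(flaky_read, &counter, 3, 100) succeeds on the third attempt. With a counter of 5 it gives up after three attempts and leaves errno set to ENOMEM, at which point the caller would still have to fail the task.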
--
This is much simpler than we thought, and the clue is in the output from the very first D script. This output is from the third one, but it's the same as the first one:

     11    144    read_kstat_data:entry entry: pid 55897
    user ksp = kstat32_t {
        hrtime_t ks_crtime = 0x292ae73c9fd858
        caddr32_t ks_next = 0x9630290
        kid32_t ks_kid = 0x74dd3b
        char [31] ks_module = [ "link" ]
        uint8_t ks_resv = 0
        int32_t ks_instance = 0
        char [31] ks_name = [ "z9363_net0" ]
        uint8_t ks_type = 0x1
        char [31] ks_class = [ "net" ]
        uint8_t ks_flags = 0x21
        caddr32_t ks_data = 0x9f1f488
        uint32_t ks_ndata = 0x16
        size32_t ks_data_size = 0x420
        hrtime_t ks_snaptime = 0x292ae73c9fd858
        int32_t _ks_update = 0
        caddr32_t _ks_private = 0
        int32_t _ks_snapshot = 0
        caddr32_t _ks_lock = 0
    }

     11    145    read_kstat_data:return return: pid 55897
    user ksp = kstat32_t {
        hrtime_t ks_crtime = 0x292ae73c9fd858
        caddr32_t ks_next = 0x9630290
        kid32_t ks_kid = 0x74dd3b
        char [31] ks_module = [ "link" ]
        uint8_t ks_resv = 0
        int32_t ks_instance = 0
        char [31] ks_name = [ "z9363_net0" ]
        uint8_t ks_type = 0x1
        char [31] ks_class = [ "net" ]
        uint8_t ks_flags = 0x1
        caddr32_t ks_data = 0x9f1f488
        uint32_t ks_ndata = 0x16
        size32_t ks_data_size = 0x445
        hrtime_t ks_snaptime = 0x292c55a57d94e6
        int32_t _ks_update = 0
        caddr32_t _ks_private = 0
        int32_t _ks_snapshot = 0
        caddr32_t _ks_lock = 0
    }

     11    145    read_kstat_data:return 2013 Oct 2 00:16:05: kstat_read() returned ENOMEM!
    pid = 55897, psargs = /opt/smartdc/agents/lib/node_modules/marlin/build/node/bin/node /opt/smartdc/ag

Note that on the way into read_kstat_data(), ks_flags is 0x21, which has KSTAT_FLAG_INVALID (0x20) set; on the way out it is 0x1, with that bit cleared. This flag is set when the kstat is allocated, cleared in kstat_install(), and set again in kstat_delete(). It's supposed to indicate when a kstat is not visible to userland. However, this userland program has such a kstat_t, which means it was returned by the kstat_read() of kstat 0. Indeed, header_kstat_snapshot() does not look at this flag at all, nor does libkstat.
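To make the ks_flags values in the DTrace output easier to read, here is a small decoding sketch. The bit values are the ones defined in illumos <sys/kstat.h> for the first six flags (any higher bits are ignored here); kstat_flags_str() is a hypothetical helper and not part of libkstat.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ks_flags bit values as defined in illumos <sys/kstat.h>. */
#define	KSTAT_FLAG_VIRTUAL	0x01
#define	KSTAT_FLAG_VAR_SIZE	0x02
#define	KSTAT_FLAG_WRITABLE	0x04
#define	KSTAT_FLAG_PERSISTENT	0x08
#define	KSTAT_FLAG_DORMANT	0x10
#define	KSTAT_FLAG_INVALID	0x20

static const struct {
	unsigned char	bit;
	const char	*name;
} flag_names[] = {
	{ KSTAT_FLAG_VIRTUAL,		"VIRTUAL" },
	{ KSTAT_FLAG_VAR_SIZE,		"VAR_SIZE" },
	{ KSTAT_FLAG_WRITABLE,		"WRITABLE" },
	{ KSTAT_FLAG_PERSISTENT,	"PERSISTENT" },
	{ KSTAT_FLAG_DORMANT,		"DORMANT" },
	{ KSTAT_FLAG_INVALID,		"INVALID" },
};

/*
 * Hypothetical helper: write a '|'-separated list of the flag names set in
 * ks_flags into buf and return buf. Stops early if buf runs out of room.
 */
static char *
kstat_flags_str(unsigned char ks_flags, char *buf, size_t buflen)
{
	size_t i, used = 0;

	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof (flag_names) / sizeof (flag_names[0]); i++) {
		int n;

		if (!(ks_flags & flag_names[i].bit))
			continue;
		n = snprintf(buf + used, buflen - used, "%s%s",
		    used > 0 ? "|" : "", flag_names[i].name);
		if (n < 0 || (size_t)n >= buflen - used)
			break;
		used += (size_t)n;
	}
	return (buf);
}
```

Decoding the entry value 0x21 gives "VIRTUAL|INVALID", and the return value 0x1 gives "VIRTUAL", which matches the observation that only the INVALID bit changed between the two snapshots.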
Also of note, read_kstat_data() does check KSTAT_FLAG_INVALID on the actual kstat, not the one that the user passed in.

So it looks like what happened here is that this program did a kstat_chain_update() which read kstat 0 during the small window in dls_stat_create() where the size is 0x420. It read a kstat header with size 0x420 and flag KSTAT_FLAG_INVALID. dls_stat_create() finished creating the kstat with the correct size and cleared the INVALID flag. Then the program tried to read the kstat, but got ENOMEM because the size in its stale header (0x420) was smaller than the kstat's final size (0x445).

The cleanest fix would seem to be to have header_kstat_update() and header_kstat_snapshot() skip kstats with this flag set.
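The proposed fix can be illustrated with a heavily simplified model. This is not the kernel code: model_kstat_t and model_header_snapshot() are hypothetical stand-ins, and the real header_kstat_snapshot() works on the kernel's kstat chain, not on a flat array. The sketch only shows the filtering step, under the assumption that the snapshot simply leaves out any header with KSTAT_FLAG_INVALID set.

```c
#include <assert.h>
#include <stddef.h>

#define	KSTAT_FLAG_INVALID	0x20	/* value from illumos <sys/kstat.h> */

/* Hypothetical, simplified stand-in for a kstat header. */
typedef struct model_kstat {
	int		mk_kid;		/* kstat ID */
	unsigned char	mk_flags;	/* KSTAT_FLAG_* bits */
	size_t		mk_data_size;	/* size of the data section */
} model_kstat_t;

/*
 * Copy headers from chain into out, skipping any that are flagged INVALID
 * (not yet installed, or being deleted). Returns the number of headers
 * copied, which is at most nout.
 */
static size_t
model_header_snapshot(const model_kstat_t *chain, size_t nchain,
    model_kstat_t *out, size_t nout)
{
	size_t i, n = 0;

	assert(chain != NULL && out != NULL);

	for (i = 0; i < nchain && n < nout; i++) {
		if (chain[i].mk_flags & KSTAT_FLAG_INVALID)
			continue;	/* not visible to userland yet */
		out[n++] = chain[i];
	}
	return (n);
}
```

In this model, a half-created kstat with flags 0x21 and size 0x420 never reaches the consumer. It would only show up in a later snapshot, after the INVALID bit is cleared and the size is final, so the consumer would not be holding a stale size when it reads the data.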
History
Updated by Robert Mustacchi about 6 years ago
- Status changed from New to Resolved
- % Done changed from 90 to 100
Resolved in dc9df4786b08572d6032efbd47f727287e691656.