zoneadm takes ages to do something on a system with many snapshots
I was investigating why the "zones:default" service failed on my laptop, and found that attempts to boot zones timed out. Eventually I traced a simple "zoneadm" call and saw long streams of repeating "ZFS_IOC_SNAPSHOT_LIST_NEXT" and "open("/etc/default/be"...)" and "B E _ H A S _ G R U B = t r u e\n\n" that go on for minutes...
I have auto-snapshots there, or more recently znapzend, so the ZFS tree under the zone root (all the subtree datasets, plus some zbe versions from OS updates, times the amount of snapshots) has almost 2000 items, which even with an SSD in place takes a while to parse (clocked 6 seconds for a `zfs list`) :\
It seems that it looks at all those datasets and snaps, and for each one goes back to re-read the "/etc/default/be" file and discards what it sees, I guess... Note that it is a LOT of reads of that "be" file, which is the same one in the GZ every time; there is no such file in the local zone. Per discussion on IRC, "it does look for container name there (ROOT)" - so the first optimization question is whether it can look up that value once and then iterate over all the datasets it needs. Also, during the startup phase there are a lot of repetitive open and open64 calls to other files, like locale data, the zone index and some others, but that might be initialization of libraries etc., and the impact is limited (although it does repeat): just a few hundred calls that could be scrapped, compared to tens of thousands of "/etc/default/be" reads.
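To illustrate the "look it up once" idea, here is a minimal Python sketch. It is only a model of the access pattern, not the actual zoneadm or libbe code: the function names, the temporary stand-in for the "be" file and the "rpool/zones/z1" dataset prefix are all made up for the example. It counts how often the file gets opened when the value is re-read per dataset versus memoized.

```python
import functools
import os
import tempfile

open_count = 0  # how many times the config file was actually opened


def read_setting(path, key):
    """Parse a KEY=value file and return the value for key, or None."""
    global open_count
    open_count += 1
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            name, sep, value = line.partition("=")
            if sep and name.strip() == key:
                return value.strip()
    return None


@functools.lru_cache(maxsize=None)
def read_setting_cached(path, key):
    """Same lookup, but the file is parsed only once per (path, key)."""
    return read_setting(path, key)


def scan(datasets, path, lookup):
    """Call lookup once per dataset, mimicking the per-dataset re-read."""
    return [(ds, lookup(path, "BE_HAS_GRUB")) for ds in datasets]


def demo():
    """Return (opens without caching, opens with caching) for 2000 items."""
    global open_count
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for /etc/default/be; the content mirrors the traced read.
        path = os.path.join(tmp, "be")
        with open(path, "w") as f:
            f.write("BE_HAS_GRUB=true\n\n")

        # Hypothetical dataset names, roughly the size of my tree.
        datasets = [f"rpool/zones/z1/ROOT/zbe@snap{i}" for i in range(2000)]

        open_count = 0
        scan(datasets, path, read_setting)
        uncached = open_count

        open_count = 0
        read_setting_cached.cache_clear()
        scan(datasets, path, read_setting_cached)
        cached = open_count
    return uncached, cached
```

Calling `demo()` gives one open per dataset in the first case and a single open in the second. Whether the real code can cache the value that simply depends on how the library is structured, which I have not checked.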
And also, if it is stuck iterating my ZFS tree while looking for a zoneroot, shouldn't it be interested in just the ZBE filesystem datasets? Why care about all those snaps?
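What I would expect instead is roughly the following, again as a hedged Python sketch rather than anything taken from the real code. It assumes the listing is available as (name, type) pairs and that ZBEs sit directly under the "ROOT" container of the zone dataset; the function name and the "rpool/zones/z1" dataset names are hypothetical.

```python
def zbe_filesystems(entries, zone_dataset, container="ROOT"):
    """Return only the filesystems directly under <zone_dataset>/<container>.

    entries is an iterable of (name, type) pairs; snapshots, volumes and
    deeper child datasets are skipped without further inspection.
    """
    prefix = f"{zone_dataset}/{container}/"
    result = []
    for name, dstype in entries:
        if dstype != "filesystem":
            continue  # ignore snapshots and anything else
        if not name.startswith(prefix):
            continue  # not under the ZBE container
        if "/" in name[len(prefix):]:
            continue  # child dataset below a ZBE, not a ZBE itself
        result.append(name)
    return result


# Hypothetical listing for one zone with two ZBEs and some snapshots.
listing = [
    ("rpool/zones/z1", "filesystem"),
    ("rpool/zones/z1/ROOT", "filesystem"),
    ("rpool/zones/z1/ROOT/zbe", "filesystem"),
    ("rpool/zones/z1/ROOT/zbe@snap1", "snapshot"),
    ("rpool/zones/z1/ROOT/zbe@snap2", "snapshot"),
    ("rpool/zones/z1/ROOT/zbe-1", "filesystem"),
    ("rpool/zones/z1/ROOT/zbe-1/var", "filesystem"),
    ("rpool/zones/z1/ROOT/zbe-1@snap1", "snapshot"),
]
zbes = zbe_filesystems(listing, "rpool/zones/z1")
```

On the command line the same restriction is what `zfs list -H -o name,type -t filesystem -r <dataset>` does, so the snapshots never have to be enumerated in the first place.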
I'll post a trace a bit later.