Feature #2596
Support UPS-initiated graceful shutdown in halt/bootadm
Status: open
0%
Description
Several years ago I installed NUT (Network UPS Tools from http://networkupstools.org/) into a number of servers I maintained to enable graceful shutdowns according to signals from direct-connected or networked (APC SNMP) uninterruptible power supplies. Some of those servers were running Solaris (8 and 9 at that time), others had an assortment of Linux builds.
Some time ago I discovered that changes in OpenSolaris (and likely Solaris 10) prevent this integration, and that re-enabling it requires some changes in bootadm.c and/or halt.c (thus I think it's an illumos and not just an OpenIndiana bug/feature).
For those not acquainted with NUT, it has a multi-tiered architecture of UPS-drivers, UPS-monitors, and ultimately the shutdown scripts which take UPS powered system quirks into account.
First, there are a number of programs talking UPS protocols (over a USB or serial wire, or over an IP network as with SNMP UPSes); you can think of them as "UPS drivers". A program called "upsdrvctl" controls the UPS-driver process lifecycle. Out of the box NUT claims support for thousands of UPS models, and since it is open-source, it is not difficult to add new ones if you can get hold of, or reverse-engineer, the UPS's protocol (I did this for newer Powercom models at that time). Now the project is actively supported by some UPS vendors.
There is a server program "upsd" which runs on the same box where the UPSes are connected (in case of direct-connect ones; it can run on each box in case of networked UPSes). This program presents the UPS data in a standardized fashion over TCP/IP to its clients.
Then there are clients. They talk to a server's "upsd" and use the most recent UPS information for their needs. For example, a CGI client presents this data through a web server and can pass an administrator's commands back to the UPSes (like causing a remote reboot if required).
The most common and important client is the "upsmon" - it waits for the power to go bad (according to "upsd") and initiates the safe OS shutdown if the number of powered UPSes is smaller than the number of power supplies required to run this box (configurable). In case of interesting events (like UPS on battery, UPS online, shutting down, etc.) "upsmon" can call notification scripts and send an email, etc.
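The decision rule described above can be sketched roughly as follows. This is only an illustration, not upsmon's actual code: the function name `needs_shutdown` and the dictionary of UPS states are made up for the example, and each UPS is assumed to feed exactly one power supply.

```python
def needs_shutdown(ups_has_power, min_supplies):
    """Return True when fewer UPSes have good power than the box needs.

    ups_has_power: dict mapping a UPS name to True while that UPS still
                   provides usable power (hypothetical representation).
    min_supplies:  number of powered supplies required to keep running.
    """
    powered = sum(1 for ok in ups_has_power.values() if ok)
    return powered < min_supplies


# A box with two power supplies that can run on either one of them:
states = {"ups-a": True, "ups-b": False}
print(needs_shutdown(states, min_supplies=1))  # False: one UPS is enough

states["ups-a"] = False
print(needs_shutdown(states, min_supplies=1))  # True: no powered UPS left
```

With `min_supplies=2` the same box would already shut down when the first UPS goes bad, which is the "configurable" part mentioned above.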
Now, for most OSes the shutdown call is irreversible. So it is quite possible that "upsmon" initiates the safe shutdown, the machine takes some 10 minutes to disable its databases and other servers, and powers off to wait for external power to come up (when the depleted UPS recharges and powers itself on), so that BIOS or OBP would power the server box back on and boot. However, a lot can happen during the 10 minutes of shutdown. For example, external power can come back up and the UPS will not power off. So there is no power-on event for the BIOS to boot the server.
There are some UPS models where you can schedule the UPS shutdown/reboot regardless of external power availability (i.e. if the power is available, the UPS will power up, but it will power off first), or perhaps you could script some magic into an IPMI card to detect such conditions and try to power on the server after a while.
For most deployments (without very well managed UPSes), however, it was common practice for the shutdown attempts to actually force depletion of the UPS so it would power off, by keeping the servers running as long as possible. If a reasonable timeout expires (more time than the UPS is known to support on battery), this means that the line power is back and the UPS is not going to die, so the remnants of the server's OS just issue a reboot call.
This logic usually took place in shutdown scripts, like /etc/rc0 for Solaris 8 and 9, with a background loop filtering away the OS signals and waiting to issue a reboot call after a timeout, and the foreground routine calling halt.
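For readers who have not seen such a script, the "wait for the UPS to die, otherwise reboot" logic can be sketched like this. It is a minimal illustration in Python rather than the original shell code; the names `ignore_catchable_signals` and `wait_then_reboot` are invented here, and the actual reboot is passed in as a callable so that the sketch stays harmless to run.

```python
import signal
import time


def ignore_catchable_signals():
    """Ignore the usual termination signals (SIGKILL cannot be ignored)."""
    for name in ("SIGHUP", "SIGINT", "SIGTERM"):
        sig = getattr(signal, name, None)
        if sig is not None:
            signal.signal(sig, signal.SIG_IGN)


def wait_then_reboot(timeout_s, reboot, sleep=time.sleep, poll_s=5):
    """Wait up to timeout_s for the UPS to cut power, then call reboot().

    If the UPS runs out of battery, the machine simply loses power while
    this loop is sleeping. If the loop outlives the whole timeout, line
    power must have returned, so the only way back is a reboot.
    """
    waited = 0
    while waited < timeout_s:
        step = min(poll_s, timeout_s - waited)
        sleep(step)
        waited += step
    reboot()
    return waited


# Dry run with a fake sleep and a fake reboot, so nothing really happens:
naps = []
wait_then_reboot(12, reboot=lambda: print("reboot now"), sleep=naps.append)
print(naps)  # [5, 5, 2]
```

A real script would call `ignore_catchable_signals()` first and pick a timeout longer than the UPS is known to last on battery.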
However, newer Solaris descendants seem to rely on bootadm for the OS lifecycle and ultimately call halt/poweroff/reboot, and there are explicit calls to kill all remaining processes during a shutdown. That is, all processes are sent a SIGTERM, then after some 5 or 30 seconds a SIGKILL is issued to all remaining processes (including the "hanging" background loop, which ignores all other signals but cannot ignore SIGKILL).
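To show why the background loop cannot survive this sequence, here is a small self-contained demonstration, assuming a POSIX system. It is not the code from halt or bootadm; the helper name `term_then_kill` and the child program are made up for the example. The child ignores SIGTERM, just like the old rc0 loop, and is still removed by SIGKILL.

```python
import signal
import subprocess
import sys

# Child process that ignores SIGTERM, like the old background loop did.
CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def term_then_kill(grace_s):
    """Send SIGTERM, wait grace_s seconds, then SIGKILL a stubborn child."""
    child = subprocess.Popen([sys.executable, "-c", CHILD],
                             stdout=subprocess.PIPE, text=True)
    child.stdout.readline()            # wait until SIGTERM is being ignored
    child.send_signal(signal.SIGTERM)
    try:
        child.wait(timeout=grace_s)
        survived_term = False
    except subprocess.TimeoutExpired:
        survived_term = True           # still alive after the grace period
    child.kill()                       # SIGKILL cannot be caught or ignored
    child.wait()
    child.stdout.close()
    return survived_term, child.returncode


if __name__ == "__main__":
    survived, status = term_then_kill(grace_s=1)
    print("survived SIGTERM:", survived)
    print("exit status:", status)      # negative signal number on POSIX
```

Any process that must outlive this phase therefore needs cooperation from whatever issues the kills, which is what this request is about.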
NOTE: I did find this set of kill() calls somewhere in bootadm or halt for OpenSolaris SXCE back in the day, but I can't pinpoint it now in illumos sources. Perhaps this part of the problem has been resolved. If so, this bug report is to track that the fix stays in place and that the forced kills are not added back to break it.
Now, in order to integrate illumos-derived OSes with graceful UPS shutdowns, including the scenario outlined above, an optional loop should be implemented in halt and/or bootadm. It would gracefully halt the services and local zones (see also bug #2594) and fast-reboot the GZ after a timeout if the system is still powered up by then. Perhaps a special command-line switch could select this codepath (and perhaps also set the timeout length before a reboot). This call to halt-and-reboot would be used from the "upsmon" shutdown handler, and might be used to integrate alternate power-cycle management solutions other than NUT.
Updated by Jim Klimov over 11 years ago
It is also important to know that particular UPS deployments can have different shutdown routines than the two discussed above - powering the server off (init 5), or halting and rebooting after a timeout (new proposed feature).
For example, when using managed UPSes which support poweroff/reboot commands, the master box might want to shut down last by calling the correct UPS driver's appropriate command. For this to be feasible, the "/usr" filesystem most likely must stay mounted at least read-only (or a statically built copy of the needed drivers must be available in the rootfs, like in "/sbin"). The shutdown logic should also likely be implementable as a (shell) script which would be called from halt itself, or as an SMF/milestone stop method (with some guarantee that its child processes are not killed by halt).
Also note that the new shutdown modes (i.e. the "halt, timeout, reboot" routine), if implemented, might warrant new "uadmin" actions, probably in the AF_REBOOT family with an optional integer timeout argument, and appropriate manpage updates.