I've been dealing with UPS (Uninterruptible Power Supply) problems [1] for a week and a half now, and things have finally calmed down a bit. Bunny's UPS has been replaced, and I'm waiting for Smirk to order replacement batteries for my UPS; in the meantime, I'm using a spare UPS from The Company.
Bunny suspects the power situation here at Chez Boca is due to some overgrown trees interfering with the power lines, causing momentary fluctuations in the power and basically playing hell with not only the UPSes but the DVRs as well. This past Wednesday was particularly bad: the UPS would take a hit and drop power to my computers, and by the time I got up and running, I would take another hit (three times, all within half an hour). It got so bad I ended up climbing around underneath the desks rerunning power cables with the hope of keeping my computers powered for more than ten minutes.
It wasn't helping matters that I was fighting my syslogd replacement [2] during each reboot (but that's another post [3]).
So Smirk dropped off a replacement UPS, and had I just used the thing, yesterday might have been better. But nooooooooooooooooo! I want to monitor the device (because, hey, I can), but since it's not an APC [4], I can't use apcupsd [5] to monitor it (Bunny's new UPS is an APC, and the one I have with the dead battery is an APC). In searching for some software to monitor the Cyber Power 1000AVR LCD [6] UPS, I came across NUT (Network UPS Tools) [7], which supports a whole host of UPSes [8], and it looks like it can support monitoring multiple UPSes on a single computer (functionality that apcupsd lacks).
It's nice, but it does have its quirks (and caused me to have nuclear meltdowns yesterday). I did question the need for five configuration files and its own user accounting system, but upon reflection, the user accounting system is probably warranted (maybe), given that you can remotely command the UPSes to shut down. And the configuration files aren't that complex; I just found them annoying. I also found the one process per UPS, plus two processes for monitoring, a bit excessive, but the authors of the program were following the Unix philosophy of small tools collectively working together. Okay, I can deal.
The one quirk that drove me towards nuclear meltdown was the inability of the USB (Universal Serial Bus) “driver” (the program that actually queries the UPS over the USB bus) to work properly when a particular directive was present in the configuration file and running in “explore” mode (used to query the UPS for all its information). So I have the following in the UPS configuration file:
```
[apc1000]
driver = usbhid-ups
port = auto
desc = "APC Back UPS XS 1000"
vendorid = 051D
```
I try to run usbhid-ups in explore mode, and it fails. Comment out the vendorid but add it to the command line, and it works. But without the vendorid in the configuration file, the usbhid-ups program won't function normally (it's the interface between the monitoring processes and the UPS).
It's bad enough that you can only use the explore mode when the rest of the UPS monitoring software isn't running, but this? It took me about three hours to figure out what was (or wasn't) going on.
[You can obviously generate kilowatt usage, yet I can't query for it over USB? Not even as a vendor extension? You suck!] [9]
Then there was the patch I made to keep NUT from logging to syslogd every second (I changed one line from “if result > 0 return else log error” to “if result >= 0 return else log error”, since 0 isn't an error code). Then I found this bug report [10] on the mailing list archive, and yes, that bug was affecting me as well; after I applied the patch, I was able to get more information from the Cyber Power UPS (and it didn't affect the monitoring of the APC).
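To make that one-line change concrete, here is a minimal sketch in Python. NUT itself is written in C, and every name below is hypothetical rather than taken from NUT's source. It assumes a poll returns a negative number on a real error and 0 when there was simply nothing new to report:

```python
def is_error_original(result: int) -> bool:
    """Original check: anything that is not > 0 gets logged as an error."""
    return not (result > 0)


def is_error_patched(result: int) -> bool:
    """Patched check: 0 is accepted, only negative results are errors."""
    return not (result >= 0)


def count_logged_errors(results, is_error) -> int:
    """Count how many poll results would produce a log line."""
    return sum(1 for r in results if is_error(r))


# Hypothetical poll results: mostly "nothing new" (0) and one real failure (-1).
polls = [8, 0, 0, 0, -1, 0]
noisy = count_logged_errors(polls, is_error_original)
quiet = count_logged_errors(polls, is_error_patched)
```

With the original comparison, every quiet poll counts as an error, which is how a once-a-second poll turns into a once-a-second log line.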
And their logging program, upslog, doesn't log to syslogd. It's not even an option. I could, however, have it output to stdout and pipe that into logger, but that's an additional four processes (two per UPS) just to log some stats into syslogd. Fortunately, the protocol used to communicate with the UPS monitoring software is well documented and easy to implement, so it was an easy thing to write a script (Lua, of course) to query the information I wanted to log to syslogd and run that every five minutes via cron.
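For a sense of how little is involved, here is a rough sketch of such a query script in Python (not the Lua script mentioned above, which isn't shown here). It assumes upsd is listening on its default TCP port 3493 on localhost, and that a `LIST VAR <ups>` request is answered with lines of the form `VAR <ups> <name> "<value>"` between `BEGIN LIST VAR` and `END LIST VAR` markers. The function names and the choice of variables to log are my own.

```python
import shlex
import socket


def parse_var_line(line: str):
    """Parse one 'VAR <ups> <name> "<value>"' line into (name, value).

    Returns None for any other line (BEGIN/END markers, ERR responses).
    """
    fields = shlex.split(line)  # handles the quoted, backslash-escaped value
    if len(fields) == 4 and fields[0] == "VAR":
        return fields[2], fields[3]
    return None


def parse_var_list(lines):
    """Collect every VAR line of a LIST VAR response into a dict."""
    parsed = (parse_var_line(line) for line in lines)
    return dict(item for item in parsed if item is not None)


def query_ups(ups: str, host: str = "localhost", port: int = 3493):
    """Send LIST VAR to upsd and return the variables as a dict."""
    lines = []
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(f"LIST VAR {ups}\n".encode("ascii"))
        with sock.makefile("r", encoding="utf-8", errors="replace") as reader:
            for raw in reader:
                line = raw.rstrip("\n")
                lines.append(line)
                # Stop at the end of the list or at an error response.
                if line.startswith(("END LIST VAR", "ERR")):
                    break
    return parse_var_list(lines)


def log_ups_stats(ups: str, wanted=("ups.status", "battery.charge", "ups.load")):
    """Write the selected variables to syslog as a single line."""
    import syslog  # Unix-only; imported here so the parsers work anywhere

    values = query_ups(ups)
    message = " ".join(f"{name}={values.get(name, '?')}" for name in wanted)
    syslog.syslog(syslog.LOG_INFO, f"{ups}: {message}")
```

Calling `log_ups_stats("apc1000")` from a small script run by cron every five minutes would give one syslog line per UPS per run, with no extra long-lived processes.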
Now, the information you get is impressive. apcupsd gives out rather terse information like (from Bunny's system, which is still running apcupsd):
```
APC : 001,038,0997
DATE : Sat Apr 17 22:23:25 EDT 2010
HOSTNAME : bunny-desktop
VERSION : 3.14.6 (16 May 2009) debian
UPSNAME : apc-xs900
CABLE : USB Cable
MODEL : Back-UPS XS 900
UPSMODE : Stand Alone
STARTTIME: Thu Apr 08 23:20:10 EDT 2010
STATUS : ONLINE
LINEV : 118.0 Volts
LOADPCT : 16.0 Percent Load Capacity
BCHARGE : 084.0 Percent
TIMELEFT : 48.4 Minutes
MBATTCHG : 5 Percent
MINTIMEL : 3 Minutes
MAXTIME : 0 Seconds
SENSE : Low
LOTRANS : 078.0 Volts
HITRANS : 142.0 Volts
ALARMDEL : Always
BATTV : 25.9 Volts
LASTXFER : Unacceptable line voltage changes
NUMXFERS : 6
XONBATT : Fri Apr 16 00:40:37 EDT 2010
TONBATT : 0 seconds
CUMONBATT: 11 seconds
XOFFBATT : Fri Apr 16 00:40:39 EDT 2010
SELFTEST : NO
STATFLAG : 0x07000008 Status Flag
MANDATE : 2007-07-03
SERIALNO : JB0727006727
BATTDATE : 2143-00-36
NOMINV : 120 Volts
NOMBATTV : 24.0 Volts
NOMPOWER : 540 Watts
FIRMWARE : 830.E6 .D USB FW:E6
APCMODEL : Back-UPS XS 900
END APC : Sat Apr 17 22:24:00 EDT 2010
```
NUT will give back:
```
battery.charge: 42
battery.charge.low: 10
battery.charge.warning: 50
battery.date: 2001/09/25
battery.mfr.date: 2003/02/18
battery.runtime: 3330
battery.runtime.low: 120
battery.type: PbAc
battery.voltage: 24.8
battery.voltage.nominal: 24.0
device.mfr: American Power Conversion
device.model: Back-UPS RS 1000
device.serial: JB0307050741
device.type: ups
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.vendorid: 051D
driver.version: 2.4.3
driver.version.data: APC HID 0.95
driver.version.internal: 0.34
input.sensitivity: high
input.transfer.high: 138
input.transfer.low: 97
input.transfer.reason: input voltage out of range
input.voltage: 121.0
input.voltage.nominal: 120
ups.beeper.status: disabled
ups.delay.shutdown: 20
ups.firmware: 7.g3 .D
ups.firmware.aux: g3
ups.load: 2
ups.mfr: American Power Conversion
ups.mfr.date: 2003/02/18
ups.model: Back-UPS RS 1000
ups.productid: 0002
ups.serial: JB0307050741
ups.status: OL CHRG
ups.test.result: No test initiated
ups.timer.reboot: 0
ups.timer.shutdown: -1
ups.vendorid: 051d
```
Same information, but better variable names, plus you can query for any number of variables. Not all UPSes support all variables, though (and there are plenty more variables that my UPSes don't support, like temperature). You can also send commands to the UPS (for instance, I was able to shut off the beeper on the failing APC) using this software.
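As a rough illustration of how the two listings line up, here is a small Python sketch that parses both formats and translates a few apcupsd fields into NUT-style names. The mapping table is hand-picked by matching the meaning of fields in the two listings above; it is my assumption, not something either project publishes. It also assumes `TIMELEFT` is in minutes (as the listing says) and `battery.runtime` is in seconds.

```python
def parse_apcupsd(text: str) -> dict:
    """Parse 'NAME : value' lines, splitting on the first colon only
    (the date values contain colons of their own)."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def parse_nut(text: str) -> dict:
    """Parse 'name: value' lines as printed in the NUT listing."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


# Hand-picked correspondences (an assumption, see the note above).
APC_TO_NUT = {
    "LINEV": "input.voltage",
    "LOADPCT": "ups.load",
    "BCHARGE": "battery.charge",
    "BATTV": "battery.voltage",
    "NOMINV": "input.voltage.nominal",
    "NOMBATTV": "battery.voltage.nominal",
    "LOTRANS": "input.transfer.low",
    "HITRANS": "input.transfer.high",
}


def apcupsd_to_nut(fields: dict) -> dict:
    """Translate numeric apcupsd fields to NUT-style names, dropping units."""
    converted = {}
    for apc_name, nut_name in APC_TO_NUT.items():
        if apc_name in fields:
            # '118.0 Volts' -> 118.0
            converted[nut_name] = float(fields[apc_name].split()[0])
    if "TIMELEFT" in fields:
        minutes = float(fields["TIMELEFT"].split()[0])
        converted["battery.runtime"] = round(minutes * 60)  # minutes -> seconds
    return converted
```

Feeding the apcupsd listing through `apcupsd_to_nut(parse_apcupsd(...))` yields values that can be compared directly with the output of `parse_nut(...)`, units included.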
So yes, it's nice, but its quirky nature was something I wasn't expecting after a week of electric musical chairs.
[5] http://sourceforge.net/projects/apcupsd/
[7] http://www.networkupstools.org/
[8] http://www.networkupstools.org/compat/stable.html
[9] /boston/2010/04/17/ups.jpg
[10] http://lists.alioth.debian.org/pipermail/nut-upsdev/2010-March/004673.html