2022-04-10

Rant: hardware, software, data --- why issit not werka?

#hardware

#software

#data

Sigh.

If you are not into electronic fiddling, overambitious ideas, overestimated skills and tools, and a continuous, if not annoying, stream of wacky ideas, better stop reading now. You have been warned!

Some time after the beginning of the year 2000 I started fiddling with microcontrollers. The PIC16F84 was my first adventure, an Atmel instance of the venerable 8051 my second. Assembly and C were my tools of choice. I then discovered Forth and the Renesas R8C controller. Then I changed jobs, and controllers. Atmel ATmega controllers came to my workbench to stay for a long time. I did not like the MSP430 much, and I never got the hang of ARM M0 or M4 controllers. And when the first RISC-V controllers came around I was tired.

So what to do with them fancy little machines? I settled on data acquisition quite soon, the environment being my main interest. Temperature is fairly easy as long as you have only one sensor. Once you switch to several, you soon find out that proper calibration is not to be dismissed. Later you find out that moisture will gnaw on your solder joints, connectors, parts. Measuring the distance to the water surface in a cistern using ultrasonic sensors sounds easy. However, at a suspected 100% relative humidity (or more?) your sensor will slowly crumble. It's a mechanical device after all. For that very reason I didn't want any mechanical things like a float with chains to some count-the-links contraption. You will find out that, for still unknown reasons, said measurement drifts by the equivalent of some 1 cm, only for the drift to disappear hours or days later for no apparent reason.

You will find out that a controller program running for an hour is peanuts. Making said program run for months without any failure is where things become interesting. You start wondering about clock drift and sensor drift. Some hard-to-find occasional memory leak subtly filling up RAM and then crashing the thing, seemingly for no reason, shows up a few times a year --- just enough to be annoying. Bonus points for overwriting something in EEPROM or flash, rendering the device basically dead. And no, I did not manage to find this one. It's still lurking somewhere.

Extra credit points if you find out why, for your favourite goddess's sake, adding radio links to the game unfolds a whole new universe of hiccups --- even though the radio collects a full byte before telling the controller to read it. OK, nowadays the radio collects several bytes, then checks and decodes them before informing the controller. You will learn that the receiving radio works nicely until the day you rearrange the stuff on your shelf. Moving the radio an arm's length to the left killed it. Well, not really: it kept blinking away happily, but it was unable to read the signal. Move the bloody thing back and everything works again, while the goddesses conspiring against you chuckle a bit.

A similar degree of fun can be achieved by trying to read and decode the signal of the DCF77 time source (old-fashioned AM radio at 77.5 kHz). Oh nice, I can just wait for the pulses coming from the receiver, attach an interrupt service routine to them and be done, right? Don't ever try this. There will be pulses missing. There will be extra spikes. There will be times when your receiver is just deaf or feeling sick or both. Don't ever wait for the next edge. Instead, sample the signal every so often and try to make sense of it later. That does not make it simpler, but at least your controller will continue to run.
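
To make the "sample first, make sense later" idea a bit more concrete, here is a rough sketch in Python, not my actual controller code. DCF77 sends one pulse per second, nominally 100 ms for a 0 and 200 ms for a 1, and leaves out the pulse in second 59. Everything else here is an assumption for illustration: the 10 ms sampling interval, samples being 1 while the receiver reports a pulse, windows that already start on a second boundary, and the thresholds, which would need tuning for a real receiver.

```python
# Sketch: classify one second of sampled DCF77 receiver output.
# Assumptions (illustration only): 10 ms sampling, a sample is 1 while the
# receiver reports a pulse, and each window starts on a second boundary.

SAMPLES_PER_SECOND = 100  # assumed 10 ms sampling interval


def classify_second(samples):
    """Return '0', '1', 'mark' (no pulse: minute marker) or '?' (unusable)."""
    if len(samples) != SAMPLES_PER_SECOND:
        return "?"
    active = sum(samples)  # number of 10 ms slots with a pulse present
    if active <= 3:
        return "mark"      # nothing but the odd spike
    if 6 <= active <= 14:
        return "0"         # nominally 100 ms, i.e. 10 slots
    if 16 <= active <= 24:
        return "1"         # nominally 200 ms, i.e. 20 slots
    return "?"             # too noisy; the caller should drop this minute


def classify_stream(samples):
    """Split a list of samples into whole seconds and classify each one."""
    n = SAMPLES_PER_SECOND
    return [classify_second(samples[i:i + n])
            for i in range(0, len(samples) - n + 1, n)]
```

Counting slots instead of timing edges means a single spike or a dropout inside a pulse shifts the count by one and nothing else. Note that a deaf receiver looks just like a minute marker here, so a real decoder would still have to check that a marker shows up only once per 60 seconds, and verify the parity bits.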

Most of my sensor stations are connected by RS-485 cables. And yes, that works OK. Two years ago I decided to update the stations to newer software and, while at it, change the protocol on the wire to fix a few of my homebrew shortcomings. And, while at it, change the database and the visualization of the whole thing. Needless to say, the whole thing exploded in my face. Stability of the wire protocol / wiring / power supply / whatever was not coming together for some reason. Things would tick along for a week and then stop for no apparent reason. After 6 months or so I gave up. I no longer had any idea where to look. A few months later I did replace a few smaller things like the database and the visualization. I did touch the program that collects the data and added a feature like "send an email if some sensor does not deliver readings for more than one hour". I had to change the delivery of data to the database, but that made the thing simpler. And this is still running today.
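
The "no readings for more than one hour" check is simple enough to sketch. This is not the code from my collector, just the idea in Python. The sensor names, the `last_seen` dictionary and both function names are made up for illustration, and actually sending the mail is left out.

```python
from datetime import datetime, timedelta

MAX_SILENCE = timedelta(hours=1)  # the "more than one hour" rule


def silent_sensors(last_seen, now, max_silence=MAX_SILENCE):
    """Return the sorted names of sensors whose last reading is too old.

    last_seen maps a (hypothetical) sensor name to the datetime of its
    most recent reading.
    """
    return sorted(name for name, seen in last_seen.items()
                  if now - seen > max_silence)


def alert_text(names):
    """Build the body of the warning mail; sending it is not shown."""
    return "No recent readings from: " + ", ".join(names)
```

The collector would call `silent_sensors` every few minutes with the current time and mail `alert_text(...)` whenever the list is not empty. A real version should also remember which sensors it has already complained about, otherwise it sends the same mail every few minutes.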

So it doesn't look too bad, does it? Well, I spent several hours this week finding out why sensor station X ceased to work (broken solder joint). And then I found that the Raspberry Pi, which acts as collector, was running in a degraded state, its root file system being mounted read-only. Additionally, the mounted root file system was the one on the SD card instead of the one on the connected SSD. That got broken by a kernel update editing cmdline.txt. Fixed that, only to find out that the collector program would get timeouts, and the connected controller would signal "no one collects my wonderful data" (collected from the radio link). This afternoon it started working again. For no apparent reason.

TL;DR: Go and learn something better than fiddling with computers.

Cheers,

~ew

PS: And I didn't even mention the really cool data to gather:
