Smash: An efficient compression algorithm for microcontrollers

Author: gbrown_

Score: 82

Comments: 28

Date: 2020-10-29 21:54:31

________________________________________________________________________________

jhfdbkofdcho wrote at 2020-10-29 22:15:39:

Reminds me of this compression project from a few years back from Steve Chamberlin at Big Mess O' Wires, FC8.

_FC8 is designed to be as fast as possible to decompress on "legacy" hardware, while still maintaining a decent compression ratio. Generic C code for compression and decompression is provided, as well as an optimized 68K decompressor for the 68020 or later CPUs. The main loop of the 68K decompressor is exactly 256 bytes, so it fits entirely within the instruction cache of the 68020/030. Decompression speed on a 68030 is about 25% as fast as an optimized memcpy of uncompressed data._

_The algorithm is based on the classic LZ77 compression scheme, with a sliding history window and duplicated data replaced by (distance,length) markers pointing to previous instances of the same data. No extra RAM is required during decompression, aside from the input and output buffers. The match-finding code and length lookup table were borrowed from liblzg by Marcus Geelnard._

https://www.bigmessowires.com/2016/05/06/fc8-faster-68k-deco...

oso2k wrote at 2020-10-30 06:30:44:

This sounds like it shares some similarities with LZ4.

jhfdbkofdcho wrote at 2020-10-31 03:46:54:

Indeed

reidrac wrote at 2020-10-30 07:43:48:

This is basically an ad.

The author of LZSA has a nice comparison that may help:

https://github.com/emmanuel-marty/lzsa

LZSA focuses on very fast decompression on 8-bit systems; so I suspect some of those could be used for MCUs as well, although getting to the level of integration that Segger is selling with their compression method may not be trivial.

rkagerer wrote at 2020-10-30 09:08:56:

After I failed to find the code and an accompanying license, I reached the same conclusion:

_This is basically an ad._

keithnz wrote at 2020-10-29 23:59:23:

A nice library that's super small (embedded into a 16 bit PIC chip) and works pretty nicely is lzfx

https://code.google.com/archive/p/lzfx/#

but it got abandoned, though the code is still completely useable for small micro controllers. It works pretty well for small ish payloads

ed25519FUUU wrote at 2020-10-30 04:39:49:

Abandoned or finished?

jononor wrote at 2020-10-30 11:08:59:

The author calls it "unmaintained". And Google Code was shut down in 2016, everything there is read-only archived. While such a project probably does not need regular updates, bugs are often found in code over time. For this project there is effectively no place to report, discuss, fix and integrate such. So abandoned does not seem inaccurate.

keithnz wrote at 2020-10-30 11:25:37:

yeah, it doesn't really have a proper home anymore, but it is pretty much finished as a library. While I have used it for many years now and not had a problem, it still may have problems on edge cases I haven't touched on. I mainly mention it here because it took me a while to find a nice small simple library that can be easily embedded in smaller microcontrollers and uses no dynamic memory. Often a hard thing to come across anything done well.

jononor wrote at 2020-10-30 13:06:00:

Your tip is much appreciated! I have also been on the lookout for a microcontroller-friendly compression library, and it has indeed been hard to find.

KMnO4 wrote at 2020-10-29 23:27:38:

Segger makes high quality stuff, but they’re expensive and their business model is “anti-modern”. The cost to use emCompress is $2500 USD.

Also, they specify that SMASH gets a worse compression ratio than LZMA but is an “order of magnitude less complex measured in lines of code”.

I’m not convinced that fewer lines of code means better performance on embedded systems. If anything, embedded code that has several thousand lines of code typically means there are hundreds of preprocessor macros that enable/disable certain paths depending on your architecture for performance.

jononor wrote at 2020-10-30 11:11:58:

Lines of code is not all that important. But program size is a key metric for microcontroller-type embedded systems, since the total available FLASH is typically in the 1-1000 kB range.

alpanka wrote at 2020-10-30 11:59:46:

Segger also provides cheap high quality development boards and very affordable academic versions.

Lauterbach on the other hand...

cozzyd wrote at 2020-10-30 05:20:31:

At least they let you use a GdbServer on the J-Link Edu Mini (although I suppose that may not be an option for non-academic users).

snops wrote at 2020-10-29 22:26:36:

superdisk wrote at 2020-10-30 04:04:23:

So, uh, where's the algorithm? This article is just hyping it up but I can't find the actual algo anywhere.

vardump wrote at 2020-10-30 09:27:10:

While interesting the niche for Smash seems to be small.

You'd have to have a system that's big enough to have some RAM where to decompress in the first place. So forget those ultra small 256 bytes or less RAM systems.

Mid-sized systems (say 16 kB+ RAM), I think you'd still mostly run directly from ROM. Any RAM is probably needed for other use.

Larger systems with ARM M0 core (or better) and 64 kB+ RAM have a ton of CPU power and typically plenty of other resources as well.

Perhaps there are people who need to deal with mask ROMs, but do those need to be that small either in 2020? Or penny pincher cases where you just can't have an extra I2C EEPROM / flash chip and your target uC doesn't have enough flash storage.

Maybe I'm missing something.

In any case, LZ-family compressors (like LZ4) have served me well so far.

Edit: That LZSA mentioned in other comment(s) looks cool. Regardless, a small niche of applications where you'd need this in the first place.

amelius wrote at 2020-10-29 23:31:46:

Does this allow code to be decompressed while the code is executing? See [1] for one such example.

[1]

https://link.springer.com/chapter/10.1007/978-94-017-8798-7_...

simcop2387 wrote at 2020-10-30 03:16:35:

would depend a lot on the architecture of the controller. not all of them allow you to run code from ram (Harvard vs von Neumann architecture differences).

bluedino wrote at 2020-10-29 22:38:49:

Can decompression cause issues with real-time applications? Say a packet of data takes a little longer to process than another.

icegreentea2 wrote at 2020-10-29 23:11:58:

You would need to understand what the worst case decoding time/operations for a given sized packet, and build your system/code around that.

Real-time does not necessarily mean that all operation X must take exact same time, it just means that you know that all operation X will occur within some span.

magicsmoke wrote at 2020-10-30 01:54:08:

Real-time systems are unfortunately named. Time-constrained systems would have been better. Variable naming is, after all, one of the two real hard things in Compsci.

jononor wrote at 2020-10-30 11:13:09:

Latency constrained perhaps? Time has many meanings.

mhh__ wrote at 2020-10-29 22:50:31:

I think yes in principle although this is why you test things thoroughly (In an embedded context there both probably won't be much data flowing around and the code will probably be simple enough that any glaring problems should be fairly obvious)

Ultimately it depends on whether you mean embedded as in a self-driving car, or embedded as in a microwave controller.

mmcallister wrote at 2020-10-29 22:08:41:

Error establishing a database connection

Hugged to death :-(

jhfdbkofdcho wrote at 2020-10-29 22:11:20:

https://web.archive.org/web/20201029220223/https://blog.segg...

robinanil wrote at 2020-10-30 00:44:06:

How does it compare again zstd?

mmastrac wrote at 2020-10-30 04:41:52:

Poorly, since it's designed for different goals (scale vs microprocessor).