
Using Filespooler over Syncthing

Filespooler[1] is a way to execute commands in strict order on a remote machine, and its communication method is by files. This is a perfect mix for Syncthing[2] (and others, but this page is about Filespooler and Syncthing).

1: /filespooler/

2: /syncthing/

This page also functions as a tutorial for Filespooler.

While I talk about Syncthing in particular here, the instructions here apply to pretty much any directory-synchronization tool, such as Dropbox, Box, etc. All that's required is that it meet the requirements laid out in Guidelines for Writing To Filespooler Queues Without Using Filespooler[3]. The most important one is that when a file appears in the `jobs` directory with a name matching `fspl-*.fspl`, it must be ready to process. Most sync tools will save files to a temporary filename while syncing them, then rename them when done, which meets these requirements.

3: /guidelines-for-writing-to-filespooler-queues-without-using-filespooler/
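To make the "ready when it appears" rule concrete, here is a minimal sketch of the write-then-rename pattern that most sync tools follow. The function name `deliver_job` and the `tmp.` prefix are my own inventions for illustration; the input is assumed to be a job file that `fspl prepare` already produced. This is not how any particular tool is implemented, just the general idea.

```shell
# Hypothetical helper (not part of Filespooler): place a finished job file
# into QUEUEDIR/jobs so that it only becomes visible once complete.
# Usage: deliver_job QUEUEDIR < already-prepared-job-file
deliver_job() {
    jobs_dir="$1/jobs"
    # Temporary name that does NOT match fspl-*.fspl, so a queue reader
    # would not pick it up while it is still being written.
    tmp=$(mktemp "$jobs_dir/tmp.XXXXXXXX") || return 1
    # Write the complete job data under the temporary name first.
    cat > "$tmp" || { rm -f "$tmp"; return 1; }
    # Reuse mktemp's random suffix to build the final fspl-*.fspl name.
    final="$jobs_dir/fspl-${tmp##*.}.fspl"
    # A rename within one directory is atomic on POSIX filesystems.
    mv "$tmp" "$final" && printf '%s\n' "$final"
}
```

The important design point is that the data is fully written before any name matching `fspl-*.fspl` exists.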

Preparation

Before you can use Filespooler, of course you will need to install it. This is quick and easy; see the Installation section on the Filespooler homepage[4] for details.

4: /filespooler/

We are going to assume we have two machines: one that is the "sender" and one that is the "receiver". We're going to implement a service that can encode or decode a payload using base64(1). The base64 command is part of most Unix/Linux distributions, and is included in GNU coreutils. We will be using Syncthing as the transport for this service!

Creating the queue

For the sake of this demonstration, we will create the queue in an already-existing directory. Let's say you have a directory already being synced between the two computers called `~/sync`. Let's make a queue in there:

sender$ fspl queue-init -q ~/sync/b64queue

As with most Unix commands, if it has no output, it worked properly.

All `fspl` commands that work with a queue take a `-q queuedir` parameter (or `--queuedir` if you prefer). In this case, it gives the path to the queue we will be creating.

Let's see what's in that directory:

sender$ ls -lR ~/sync/b64queue
/home/jgoerzen/sync/b64queue:
total 3
drwxr-xr-x 2 jgoerzen jgoerzen 2 May 16 20:17 jobs
-rw-r--r-- 1 jgoerzen jgoerzen 2 May 16 20:17 nextseq
-rw-r--r-- 1 jgoerzen jgoerzen 0 May 16 20:17 nextseq.lock

/home/jgoerzen/sync/b64queue/jobs:
total 0

The `~/sync/b64queue/jobs` directory will hold the data about the actual jobs we want to process. It is written to by the sender and read from by the receiver. `nextseq` and `nextseq.lock` are files that help the queue processor do its job. They should only be touched by the receiver, never by the sender.
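If you want a quick sanity check in a script, something like the following sketch may help. It only tests for the `jobs` directory and `nextseq` file seen in the listing above; the function name `looks_like_queue` is hypothetical, and Filespooler itself may check more than this.

```shell
# Hypothetical check (not part of Filespooler): does DIR have the layout
# that `fspl queue-init` produced in the listing above?
looks_like_queue() {
    [ -d "$1/jobs" ] && [ -f "$1/nextseq" ]
}

# Example use:
#   looks_like_queue ~/sync/b64queue && echo "queue is initialized"
```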

Sending a first request

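Now, we're going to use `fspl prepare` to prepare a request. This command takes a payload, stamps it with the next sequence number from a sender-side sequence file, and writes the resulting job file to stdout.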

Now since this command writes its output to stdout, we need a way to save it to the queue. You can, of course, Write To Filespooler Queues Without Using Filespooler[5]. But it's easier to just use a Filespooler command: `fspl queue-write`. This is separate from `fspl prepare` because you might want to encode the job file in your pipeline (see, for instance, Compressing Filespooler Jobs[6] and Encrypting Filespooler Jobs with GPG[7]).

5: /guidelines-for-writing-to-filespooler-queues-without-using-filespooler/

6: /compressing-filespooler-jobs/

7: /encrypting-filespooler-jobs-with-gpg/

That may have sounded complicated, but let's put the pieces together and it will be easy:

sender$ echo Hi | fspl prepare -s ~/b64seq -i - | fspl queue-write -q ~/sync/b64queue

Let's break that down:

* `-s seqfile` gives the path to a *sequence file* used on the sender side. This file holds a simple counter that is incremented for every generated job file, so each job gets a unique sequence number. It is matched with the `nextseq` file within the queue to make sure that the receiver processes jobs in the correct order. It MUST be separate from the file that is in the queue and should NOT be placed within the queue. There is no need to sync this file, and it would be ideal to not sync it.

* The `-i` option tells `fspl prepare` to read a file for the packet payload. `-i -` tells it to read stdin for this purpose. So, the payload will consist of three bytes: "Hi\n" (that is, including the terminating newline that `echo` wrote).

* `fspl queue-write` reads stdin and writes it to a file in the queue directory in a safe manner. The file will ultimately match the `fspl-*.fspl` pattern and have a random string in the middle.
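To illustrate the idea of a sender-side sequence file, here is a toy counter in shell. It is only a sketch of the concept: the function name `next_seq` is made up, starting at 1 is an assumption that happens to match the IDs seen later on this page, and unlike a real implementation it does no locking.

```shell
# Hypothetical illustration of a sequence file; NOT Filespooler's code.
# Prints the current number stored in SEQFILE, then stores number + 1.
next_seq() {
    seqfile="$1"
    if [ -f "$seqfile" ]; then
        read -r n < "$seqfile"
    else
        # Assumption for this sketch: a missing file means "start at 1".
        n=1
    fi
    printf '%s\n' $((n + 1)) > "$seqfile"
    printf '%s\n' "$n"
}
```

Each call hands out the next number, which is why two senders must never share (or sync) the same sequence file.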

At this point, wait a few seconds (or however long it takes) for the queue files to be synced over to the recipient.

Inspecting the queue

On the receiver, we can see if any jobs have arrived yet:

receiver$ fspl queue-ls -q ~/sync/b64queue
ID                   creation timestamp          filename
1                    2022-05-16T20:29:32-05:00   fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl

If you have an empty output, you could examine the `~/sync/b64queue/jobs` directory to see if it actually does contain files synced over from the sender.

Let's say we'd like some information about the job. Try this:

receiver$ fspl queue-info -q ~/sync/b64queue -j 1
FSPL_SEQ=1
FSPL_CTIME_SECS=1652940172
FSPL_CTIME_NANOS=94106744
FSPL_CTIME_RFC3339_UTC=2022-05-17T01:29:32Z
FSPL_CTIME_RFC3339_LOCAL=2022-05-16T20:29:32-05:00
FSPL_JOB_FILENAME=fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl
FSPL_JOB_QUEUEDIR=/home/jgoerzen/sync/b64queue
FSPL_JOB_FULLPATH=/home/jgoerzen/sync/b64queue/jobs/fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl

This information is intentionally emitted in a format convenient for parsing.
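Since the output is plain KEY=VALUE lines, picking out one field takes only a few lines of shell. This is a minimal sketch; `info_field` is a hypothetical helper name, not an `fspl` feature.

```shell
# Hypothetical helper: print the value of KEY from KEY=VALUE lines on stdin.
# Returns nonzero if the key is not present.
info_field() {
    key="$1"
    # Split each line at the first "="; the rest of the line is the value.
    while IFS='=' read -r name value; do
        if [ "$name" = "$key" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1
}

# Example use:
#   fspl queue-info -q ~/sync/b64queue -j 1 | info_field FSPL_JOB_FILENAME
```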

Running the queued jobs

Well, now that we've done all this, let's get to business and run the jobs!

receiver$ fspl queue-process -q ~/sync/b64queue --allow-job-params base64
SGkK

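There are two new parameters here:

* `--allow-job-params` lets parameters that the sender supplied with a job be passed on to the command. We'll make use of this later.

* `base64` is the command to run for each job. It runs with environment variables set as we just saw in `queue-info`, and with the text we previously prepared - "Hi\n" - piped to it.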

By default, `fspl queue-process` doesn't do anything special with the output; see Handling Filespooler Command Output[8] for details on other options. So, the base64-encoded version of our string is "SGkK". We successfully sent a packet using Syncthing as a transport mechanism!

8: /handling-filespooler-command-output/
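You can get a feel for what the command sees without a queue at all. The sketch below simulates it by hand: `handle_job` is a hypothetical stand-in for the command, and I set `FSPL_SEQ` manually where `fspl queue-process` would normally provide it.

```shell
# Hypothetical stand-in for the command that fspl queue-process runs.
# It reads the payload on stdin and can look at the FSPL_* environment.
handle_job() {
    echo "processing job ${FSPL_SEQ:-?}" >&2
    base64
}

# Simulate one job by hand: payload on stdin, FSPL_SEQ in the environment.
printf 'Hi\n' | FSPL_SEQ=1 handle_job
```

The encoded payload goes to stdout, and the note about which job is running goes to stderr.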

At this point, if you do a `fspl queue-ls` again, you'll see the queue is empty. By default, `fspl queue-process` deletes jobs that have been successfully processed.

Demonstrating Ordering

Let's make a demo to see if ordering is always preserved. We'll intentionally inject jobs into the queue in the wrong order:

sender$ echo first | fspl prepare -s ~/b64seq -i - > /tmp/first
sender$ echo second | fspl prepare -s ~/b64seq -i - | fspl queue-write -q ~/sync/b64queue

So we only injected the second packet into the queue. What happens on the receiver?

receiver$ fspl queue-ls -q ~/sync/b64queue
ID                   creation timestamp          filename
3                    2022-05-16T20:22:19-05:00   fspl-7671ed24-8cb7-46fc-baff-1c99ceb6cfc9.fspl

The packet we saved off as "first" is ID 2; we can verify it this way:

sender$ fspl stdin-info < /tmp/first
FSPL_SEQ=2
FSPL_CTIME_SECS=1652750533
FSPL_CTIME_NANOS=412072059
FSPL_CTIME_RFC3339_UTC=2022-05-17T01:22:13Z
FSPL_CTIME_RFC3339_LOCAL=2022-05-16T20:22:13-05:00

`fspl stdin-info` does the same thing as `fspl queue-info`, just for things on stdin. Most `fspl queue` commands have a `fspl stdin` version also.

OK, so we have set up an out-of-order situation in the queue. What happens when we try to process the queue?

receiver$ fspl queue-process -q ~/sync/b64queue --allow-job-params base64

Nothing! Because we are missing packet 2 in the queue. If we increase the log level, we can see why:

receiver$ fspl --log-level debug queue-process -q ~/sync/b64queue --allow-job-params base64
DEBUG prepare_seqfile_lock{path="/home/jgoerzen/sync/b64queue/nextseq"}: filespooler::seqfile: Attempting to prepare lock at "/home/jgoerzen/sync/b64queue/nextseq.lock"
DEBUG open{path="/home/jgoerzen/sync/b64queue/nextseq"}: filespooler::seqfile: Attempting to acquire write lock
DEBUG open{path="/home/jgoerzen/sync/b64queue/nextseq"}: filespooler::seqfile: Attempting to open file "/home/jgoerzen/sync/b64queue/nextseq"
DEBUG scanqueue_map{queuedir="/home/jgoerzen/sync/b64queue" decoder=None}: filespooler::jobqueue: Reading header from "/home/jgoerzen/sync/b64queue/jobs/fspl-7671ed24-8cb7-46fc-baff-1c99ceb6cfc9.fspl"
DEBUG filespooler::cmd::cmd_exec: Sequence processing: Stopping processing as there is no job with the next sequence ID to process
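The behavior in that last log line can be modeled in a few lines of shell. This is a toy simulation of strict ordering, not Filespooler's actual code; `process_in_order` is a hypothetical name, and the IDs are passed as arguments instead of being read from job files.

```shell
# Hypothetical simulation of strict ordering; NOT Filespooler's code.
# Usage: process_in_order NEXT ID...
# Prints each ID it would run, in sequence, and stops at the first gap.
process_in_order() {
    next="$1"
    shift
    while :; do
        found=no
        # Look for a job carrying exactly the next expected sequence number.
        for id in "$@"; do
            if [ "$id" -eq "$next" ]; then
                found=yes
                break
            fi
        done
        # No such job: stop, even if later IDs are already waiting.
        [ "$found" = yes ] || break
        echo "run job $next"
        next=$((next + 1))
    done
    echo "next expected: $next"
}

process_in_order 2 3      # only job 3 present: nothing runs
process_in_order 2 3 2    # job 2 arrives late: 2 runs, then 3
```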

OK, let's move our saved packet into the queue:

sender$ fspl queue-write -q ~/sync/b64queue < /tmp/first
sender$ rm /tmp/first

And now we should see it process two packets:

receiver$ fspl queue-process -q ~/sync/b64queue --allow-job-params base64
Zmlyc3QK
c2Vjb25kCg==

And indeed, it processed two packets -- and in the correct order! (You can tell because "first" is shorter than "second", and so is its base64 encoding.)

Passing Parameters

I mentioned before that you can pass parameters. I also indicated that we would build an encoder *and decoder*. The base64 command accepts a `-d` to decode. So, how about we try it? Let's use the encoded strings we've been playing with and see if we can get the original back out:

sender$ echo SGkK | fspl prepare -s ~/b64seq -i - -- -d | fspl queue-write -q ~/sync/b64queue
sender$ echo Zmlyc3QK | fspl prepare -s ~/b64seq -i - -- -d | fspl queue-write -q ~/sync/b64queue
sender$ echo c2Vjb25kCg== | fspl prepare -s ~/b64seq -i - -- -d | fspl queue-write -q ~/sync/b64queue

OK, now we should have 3 jobs in the queue:

receiver$ fspl queue-ls -q ~/sync/b64queue
ID                   creation timestamp          filename
4                    2022-05-16T20:30:40-05:00   fspl-ef38c718-59b1-4302-ae80-df3ec8e9ee46.fspl
5                    2022-05-16T20:31:10-05:00   fspl-34063b9b-9393-4f60-9a2e-d7ff1b8456bf.fspl
6                    2022-05-16T20:31:27-05:00   fspl-f330e9ac-16ba-4517-b6bb-523c81ef6fa6.fspl

Good. Let's inspect one:

receiver$ fspl queue-info -q ~/sync/b64queue -j 4
FSPL_SEQ=4
...
FSPL_PARAM_1=-d

I omitted most of the output, but notice `FSPL_PARAM_1`. Since we have been processing the queue with `--allow-job-params`, anything after the `--` on the `fspl prepare` command line is passed on to the command we execute: in this case, `base64`. So we can mix encode and decode jobs in a single queue. Let's add a fourth job, an encode one, just to be sure:
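The effect of a job parameter can be tried locally before sending anything. In this sketch, `run_job` is a hypothetical stand-in for what ends up running on the receiver: the configured command plus whatever parameters the job carried.

```shell
# Hypothetical stand-in for the receiver side: the configured command
# (base64) is run with the parameters that came along with the job.
run_job() {
    base64 "$@"
}

printf 'SGkK\n' | run_job -d    # a decode job (parameter: -d)
printf 'foo\n' | run_job        # an encode job (no parameters)
```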

sender$ echo foo | fspl prepare -s ~/b64seq -i - | fspl queue-write -q ~/sync/b64queue

All right! Let's process the queue:

receiver$ fspl queue-process -q ~/sync/b64queue --allow-job-params base64
Hi
first
second
Zm9vCg==

Yes! That is absolutely correct! The three decode jobs did indeed decode, and the encode job did indeed encode, and they were all processed in the correct order.

Dealing with errors

Thus far, everything we have done has been with commands that succeed. What happens when a command fails? Let's try it.

`base64` can encode anything, but it can only decode things that are valid base64 strings. What if we accidentally send plain text to the decoder? `base64 -d` will exit with an error. Let's see how that works:

sender$ echo Hi | fspl prepare -s ~/b64seq -i - -- -d | fspl queue-write -q ~/sync/b64queue

And on the recipient, we've got one item in the queue:

receiver$ fspl queue-ls -q ~/sync/b64queue
ID                   creation timestamp          filename
8                    2022-05-16T20:36:49-05:00   fspl-5e8fe338-3c9c-4c77-b207-e7c296fb9ca3.fspl

Let's see what happens when we process this:

receiver$ fspl queue-process -q ~/sync/b64queue --allow-job-params base64
base64: invalid input
ERROR filespooler::exec: Command exited abnormally with status ExitStatus(ExitStatus(256))
Error: Aborting processing due to exit status ExitResult(ExitStatus(ExitStatus(256))) from command

Well, we certainly detected the error! Let's decode what is happening: `base64` printed "invalid input" and exited with a failure status, so Filespooler logged the abnormal exit and aborted processing.

The packet is still in the queue (run `fspl queue-ls` to confirm). You can keep running `fspl queue-process` as much as you like, but it will never proceed because this command will always exit with an error. Even if you add 100 more packets after it, none of them will be processed: the *first* one has an error, and ordering is strict by default.
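A note on that 256: I'm assuming here that it is a raw Unix wait status, in which a normal exit code sits in the high byte. Under that assumption, the following sketch (with a hypothetical helper name) turns it into the exit code you would see in `$?`.

```shell
# Assumption: the number is a raw Unix wait status from a normal exit,
# where the exit code is stored in bits 8-15.
exit_code_from_wait_status() {
    echo $(( ($1 >> 8) & 255 ))
}

exit_code_from_wait_status 256    # prints 1

# You can also reproduce the failure locally, without any queue:
if printf 'Hi\n' | base64 -d >/dev/null 2>&1; then
    echo "decode succeeded"
else
    echo "decode failed"
fi
```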

This is perfect for many use cases. For instance, if you are Using Filespooler for Backups[9], you probably *want* an incremental that fails because of a full disk to keep failing until you've fixed the disk problem.

9: /using-filespooler-for-backups/

What can you do now? There are quite a few choices; see the Filespooler Reference[10] for more.

10: /filespooler-reference/

Let's look at one of them: skipping the failed job. In this example, the offending job is ID 8. Recall that Filespooler has a sequence file that tells it what to process next. We can inspect that file:

receiver$ fspl queue-get-next -q ~/sync/b64queue
8

That's as it should be. We can tell it to just skip job 8:

receiver$ fspl queue-set-next -q ~/sync/b64queue 9

That's fine. But job 8 is still sitting in the queue. There is (intentionally, for now) no `fspl queue-rm` command, but it is easy enough to get rid of job 8 manually:

receiver$ fspl queue-info -q ~/sync/b64queue -j 8 | grep ^FSPL_JOB_FULLPATH=
FSPL_JOB_FULLPATH=/home/jgoerzen/sync/b64queue/jobs/fspl-5e8fe338-3c9c-4c77-b207-e7c296fb9ca3.fspl
receiver$ rm /home/jgoerzen/sync/b64queue/jobs/fspl-5e8fe338-3c9c-4c77-b207-e7c296fb9ca3.fspl
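If you find yourself doing this often, the two steps can be combined. Here is a minimal sketch; `remove_job_from_info` is a hypothetical helper of my own, and the filename check is just a small safety net so it only ever deletes something that looks like a job file.

```shell
# Hypothetical helper: read `fspl queue-info` output on stdin and remove
# the job file it names.
remove_job_from_info() {
    path=$(sed -n 's/^FSPL_JOB_FULLPATH=//p')
    # Refuse to delete anything that is not named like a job file.
    case "${path##*/}" in
        fspl-*.fspl) ;;
        *) echo "refusing to remove: $path" >&2; return 1 ;;
    esac
    rm -- "$path"
}

# Example use:
#   fspl queue-info -q ~/sync/b64queue -j 8 | remove_job_from_info
```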

And done!

Advanced topics and observations

Only syncing the jobs/ subdirectory with an append-only queue

In this example, we synced the entire `b64queue` directory. This is unnecessary. It would be ideal to sync only the `jobs` subdirectory of it, to prevent it looking like a valid queue for processing on the sender. To do this, see Filespooler Append-Only Queues[11].

11: /filespooler-append-only-queues/

Getting help

Besides the Filespooler Reference[12], you can get a list of all `fspl` subcommands with `fspl help`. Summaries of options valid for each subcommand are available with `fspl SUBCOMMAND --help`.

12: /filespooler-reference/

Additional topics

13: /filespooler-reference/

14: /one-to-many-with-filespooler/

15: /many-to-one-with-filespooler/

16: /handling-filespooler-command-output/

17: /filespooler/

18: /introduction-to-filespooler/


(c) 2022-2024 John Goerzen