<?xml version="1.0" encoding="utf-8"?> <feed xmlns="http://www.w3.org/2005/Atom"> <updated>2021-12-01T09:12:31+00:00</updated> <title>buetow.org feed</title> <subtitle>Having fun with computers!</subtitle> <link href="gemini://buetow.org/gemfeed/atom.xml" rel="self" /> <link href="gemini://buetow.org/" /> <id>gemini://buetow.org/</id> <entry> <title>Bash Golf Part 1</title> <link href="gemini://buetow.org/gemfeed/2021-11-29-bash-golf-part-1.gmi" /> <id>gemini://buetow.org/gemfeed/2021-11-29-bash-golf-part-1.gmi</id> <updated>2021-11-29T14:06:14+00:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>This is the first blog post about my Bash Golf series. This series is random Bash tips, tricks and weirdnesses I came across. It's a collection of smaller articles I wrote in an older (in German language) blog, which I translated and refreshed with some new content.. .....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Bash Golf Part 1</h1> <pre> '\ . . |>18>> \ . ' . | O>> . 'o | \ . | /\ . | / / .' | jgs^^^^^^^`^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Art by Joan Stark </pre> <p class="quote"><i>Published by Paul Buetow 2021-11-29</i></p> <p>This is the first blog post about my Bash Golf series. This series is about random Bash tips, tricks and weirdnesses I came across. It's a collection of smaller articles I wrote in an older (in German language) blog, which I translated and refreshed with some new content.</p> <h2>TCP/IP networking</h2> <p>You probably know the Netcat tool, which is a swiss army knife for TCP/IP networking on the command line. 
But did you know that the Bash natively supports TCP/IP networking?</p> <p>Have a look at how that works:</p> <pre> ❯ cat < /dev/tcp/time.nist.gov/13 59536 21-11-18 08:09:16 00 0 0 153.6 UTC(NIST) * </pre> <p>The Bash treats /dev/tcp/HOST/PORT in a special way so that it actually establishes a TCP connection to HOST:PORT. The example above redirects the TCP output of the time-server to cat, and cat prints it on standard output (stdout).</p> <p>A more sophisticated example is firing up an HTTP request. Let's create a new read-write (rw) file descriptor (fd) 5, redirect the HTTP request string to it, and then read the response back:</p> <pre> ❯ exec 5<>/dev/tcp/google.de/80 ❯ echo -e "GET / HTTP/1.1\nhost: google.de\n\n" >&5 ❯ cat <&5 | head HTTP/1.1 301 Moved Permanently Location: http://www.google.de/ Content-Type: text/html; charset=UTF-8 Date: Thu, 18 Nov 2021 08:27:18 GMT Expires: Sat, 18 Dec 2021 08:27:18 GMT Cache-Control: public, max-age=2592000 Server: gws Content-Length: 218 X-XSS-Protection: 0 X-Frame-Options: SAMEORIGIN </pre> <p>You would assume that this also works with the ZSH, but it doesn't. This is one of the few things that work in the Bash but not in the ZSH. 
There might be plugins you could use for ZSH to do something similar, though.</p> <h2>Process substitution</h2> <p>The idea here is, that you can read the output (stdout) of a command from a file descriptor:</p> <pre> ❯ uptime # Without process substitution 10:58:03 up 4 days, 22:08, 1 user, load average: 0.16, 0.34, 0.41 ❯ cat <(uptime) # With process substitution 10:58:16 up 4 days, 22:08, 1 user, load average: 0.14, 0.33, 0.41 ❯ stat <(uptime) File: /dev/fd/63 -> pipe:[468130] Size: 64 Blocks: 0 IO Block: 1024 symbolic link Device: 16h/22d Inode: 468137 Links: 1 Access: (0500/lr-x------) Uid: ( 1001/ paul) Gid: ( 1001/ paul) Context: unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 Access: 2021-11-20 10:59:31.482411961 +0000 Modify: 2021-11-20 10:59:31.482411961 +0000 Change: 2021-11-20 10:59:31.482411961 +0000 Birth: - </pre> <p>This example doesn't make any sense practically speaking, but it clearly demonstrates how process substitution works. The standard output pipe of "uptime" is redirected to an anonymous file descriptor. That fd then is opened by the "cat" command as a regular file.</p> <p>A useful use case is displaying the differences of two sorted files:</p> <pre> ❯ echo a > /tmp/file-a.txt ❯ echo b >> /tmp/file-a.txt ❯ echo c >> /tmp/file-a.txt ❯ echo b > /tmp/file-b.txt ❯ echo a >> /tmp/file-b.txt ❯ echo c >> /tmp/file-b.txt ❯ echo X >> /tmp/file-b.txt ❯ diff -u <(sort /tmp/file-a.txt) <(sort /tmp/file-b.txt) --- /dev/fd/63 2021-11-20 11:05:03.667713554 +0000 +++ /dev/fd/62 2021-11-20 11:05:03.667713554 +0000 @@ -1,3 +1,4 @@ a b c +X ❯ echo X >> /tmp/file-a.txt # Now, both files have the same content again. 
❯ diff -u <(sort /tmp/file-a.txt) <(sort /tmp/file-b.txt) ❯ </pre> <p>Another example is displaying the differences of two directories:</p> <pre> ❯ diff -u <(ls ./dir1/ | sort) <(ls ./dir2/ | sort) </pre> <p>More (Bash golfing) examples:</p> <pre> ❯ wc -l <(ls /tmp/) /etc/passwd <(env) 24 /dev/fd/63 49 /etc/passwd 24 /dev/fd/62 97 total ❯ ❯ while read foo; do > echo $foo > done < <(echo foo bar baz) foo bar baz ❯ </pre> <p>So far, we only used process substitution for stdout redirection. But it also works for stdin. The following two commands result in the same outcome, but the second one writes the uncompressed tar data stream to an anonymous file descriptor, which is substituted by the "bzip2" command reading the data stream from stdin and compressing it to its own stdout, which then gets redirected to a file:</p> <pre> ❯ tar cjf file.tar.bz2 foo ❯ tar cf >(bzip2 -c > file.tar.bz2) foo </pre> <p>Just think a while and see whether you understand fully what is happening here.</p> <h2>Grouping</h2> <p>Command grouping can be quite useful for combining the output of multiple commands:</p> <pre> ❯ { ls /tmp; cat /etc/passwd; env; } | wc -l 97 ❯ ( ls /tmp; cat /etc/passwd; env; ) | wc -l 97 </pre> <p>But wait, what is the difference between curly braces and normal braces? 
I assumed that the normal braces create a subprocess whereas the curly ones don't, but I was wrong:</p> <pre> ❯ echo $$ 62676 ❯ { echo $$; } 62676 ❯ ( echo $$; ) 62676 </pre> <p>One difference is that the curly braces require you to end the last statement with a semicolon, whereas with the normal braces you can omit the last semicolon:</p> <pre> ❯ ( env; ls ) | wc -l 27 ❯ { env; ls } | wc -l > > ^C </pre> <p>In case you know more (subtle) differences, please write me an E-Mail and let me know.</p> <h2>Expansions</h2> <p>Let's start with simple examples:</p> <pre> ❯ echo {0..5} 0 1 2 3 4 5 ❯ for i in {0..5}; do echo $i; done 0 1 2 3 4 5 </pre> <p>You can also add leading zeros or expand to any number range:</p> <pre> ❯ echo {00..05} 00 01 02 03 04 05 ❯ echo {000..005} 000 001 002 003 004 005 ❯ echo {201..205} 201 202 203 204 205 </pre> <p>It also works with letters:</p> <pre> ❯ echo {a..e} a b c d e </pre> <p>Now it gets interesting. The following takes a list of words and expands it so that all words are quoted:</p> <pre> ❯ echo \"{These,words,are,quoted}\" "These" "words" "are" "quoted" </pre> <p>Let's also expand to the cross product of two given lists:</p> <pre> ❯ echo {one,two}\:{A,B,C} one:A one:B one:C two:A two:B two:C ❯ echo \"{one,two}\:{A,B,C}\" "one:A" "one:B" "one:C" "two:A" "two:B" "two:C" </pre> <p>Just because we can:</p> <pre> ❯ echo Linux-{one,two,three}\:{A,B,C}-FreeBSD Linux-one:A-FreeBSD Linux-one:B-FreeBSD Linux-one:C-FreeBSD Linux-two:A-FreeBSD Linux-two:B-FreeBSD Linux-two:C-FreeBSD Linux-three:A-FreeBSD Linux-three:B-FreeBSD Linux-three:C-FreeBSD </pre> <h2>- aka stdin and stdout placeholder</h2> <p>Some commands and Bash builtins use "-" as a placeholder for stdin and stdout:</p> <pre> ❯ echo Hello world Hello world ❯ echo Hello world | cat - Hello world ❯ cat - <<ONECHEESEBURGERPLEASE Hello world ONECHEESEBURGERPLEASE Hello world ❯ cat - <<< 'Hello world' Hello world </pre> <p>Let's walk through all four examples from the above snippet:</p> 
<ul> <li>The first example is obvious (the Bash builtin "echo" prints its arguments to stdout).</li> <li>The second pipes "Hello world" via stdout to stdin of the "cat" command. As cat's argument is "-" it reads its data from stdin and not from a regular file named "-". So "-" has a special meaning for cat.</li> <li>The third and fourth examples are interesting as we don't use a pipe ("|") but a so-called HERE-document and a HERE-string. But the end results are the same.</li> </ul> <p>The "tar" command understands "-" too. The following example tars up some local directory and sends the data to stdout (this is what "-f -" tells it to do). stdout is then piped via an SSH session to a remote tar process (running on snonux.de), which reads the data from stdin (as we told it with "-f -") and extracts it on the remote machine:</p> <pre> ❯ tar -czf - /some/dir | ssh hercules@snonux.de tar -xzvf - </pre> <p>This is yet another example of using "-", but this time using the "file" command:</p> <pre> $ head -n 1 grandmaster.sh #!/usr/bin/env bash $ file - < <(head -n 1 grandmaster.sh) /dev/stdin: a /usr/bin/env bash script, ASCII text executable </pre> <p>Some more golfing:</p> <pre> $ cat - hello hello ^C $ file - #!/usr/bin/perl /dev/stdin: Perl script text executable </pre> <h2>Alternative argument passing</h2> <p>This is a quite unusual way of passing arguments to a Bash script:</p> <pre> ❯ cat foo.sh #!/usr/bin/env bash declare -r USER=${USER:?Missing the username} declare -r PASS=${PASS:?Missing the secret password for $USER} echo $USER:$PASS </pre> <p>So what we are doing here is passing the arguments to the script via environment variables. The script will abort with an error when there's an undefined argument.</p> <pre> ❯ chmod +x foo.sh ❯ ./foo.sh ./foo.sh: line 3: USER: Missing the username ❯ USER=paul ./foo.sh ./foo.sh: line 4: PASS: Missing the secret password for paul ❯ echo $? 
1 ❯ USER=paul PASS=secret ./foo.sh paul:secret </pre> <p>You have probably noticed this *strange* syntax:</p> <pre> ❯ VARIABLE1=value1 VARIABLE2=value2 ./script.sh </pre> <p>That's just another way to pass environment variables to a script. You can also write it like this:</p> <pre> ❯ export VARIABLE1=value1 ❯ export VARIABLE2=value2 ❯ ./script.sh </pre> <p>But the downside of it is that the variables will also be defined in your current shell environment and not just in the script's sub-process.</p> <h2>: aka the null command</h2> <p>First, let's use the "help" Bash built-in to see what it says about the null command:</p> <pre> ❯ help : :: : Null command. No effect; the command does nothing. Exit Status: Always succeeds. </pre> <p>PS: IMHO, people should use the Bash help more often. It is a very useful Bash reference. Too many people fall back to a Google search and then land on Stack Overflow. Sadly, there's no help built-in for the ZSH shell though (so even when I am using the ZSH I make use of the Bash help as most of the built-ins are compatible). </p> <p>OK, back to the null command. What happens when you try to run it? As you can see, absolutely nothing. And its exit status is 0 (success):</p> <pre> ❯ : ❯ echo $? 0 </pre> <p>Why would that be useful? 
You can use it as a placeholder in an endless while-loop:</p> <pre> ❯ while : ; do date; sleep 1; done Sun 21 Nov 12:08:31 GMT 2021 Sun 21 Nov 12:08:32 GMT 2021 Sun 21 Nov 12:08:33 GMT 2021 ^C ❯ </pre> <p>You can also use it as a placeholder for a function body not yet fully implemented, as an empty function will result in a syntax error:</p> <pre> ❯ foo () { } -bash: syntax error near unexpected token `}' ❯ foo () { :; } ❯ foo ❯ </pre> <p>Or use it as a placeholder for not yet implemented conditional branches:</p> <pre> ❯ if foo; then :; else echo bar; fi </pre> <p>Or (not recommended) as a fancy way to comment your Bash code:</p> <pre> ❯ : I am a comment and have no other effect ❯ : I am a comment and result in a syntax error () -bash: syntax error near unexpected token `(' ❯ : "I am a comment and don't result in a syntax error ()" ❯ </pre> <p>As you can see in the previous example, the Bash still tries to interpret some syntax of all text following ":". This can be exploited (also not recommended) like this:</p> <pre> ❯ declare i=0 ❯ $[ i = i + 1 ] bash: 1: command not found... ❯ : $[ i = i + 1 ] ❯ : $[ i = i + 1 ] ❯ : $[ i = i + 1 ] ❯ echo $i 4 </pre> <p>For these kinds of expressions it's always better to use "let" though. And you should also use $((...expression...)) instead of the old (deprecated) way $[ ...expression... ] like this example demonstrates:</p> <pre> ❯ declare j=0 ❯ let j=$((j + 1)) ❯ let j=$((j + 1)) ❯ let j=$((j + 1)) ❯ let j=$((j + 1)) ❯ echo $j 4 </pre> <h2>(No) floating point support</h2> <p>I have to give a plus-point to the ZSH here. 
The ZSH supports floating point calculation, whereas the Bash doesn't:</p> <pre> ❯ bash -c 'echo $(( 1/10 ))' 0 ❯ zsh -c 'echo $(( 1/10 ))' 0 ❯ bash -c 'echo $(( 1/10.0 ))' bash: line 1: 1/10.0 : syntax error: invalid arithmetic operator (error token is ".0 ") ❯ zsh -c 'echo $(( 1/10.0 ))' 0.10000000000000001 ❯ </pre> <p>It would be nice to have native floating point support for the Bash too, but you don't want to use the shell for complicated calculations anyway. So it's fine that Bash doesn't have that, I guess. </p> <p>In the Bash you will have to fall back to an external command like "bc" (the arbitrary precision calculator language):</p> <pre> ❯ bc <<< 'scale=2; 1/10' .10 </pre> <p>See you later for the next post of this series. E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Defensive DevOps</title> <link href="gemini://buetow.org/gemfeed/2021-10-22-defensive-devops.gmi" /> <id>gemini://buetow.org/gemfeed/2021-10-22-defensive-devops.gmi</id> <updated>2021-10-22T10:02:46+03:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>I have seen many different setups and infrastructures during my career. My roles always included front-line, ad-hoc firefighting of production issues. This often involves identifying and fixing these under time pressure, without the comfort of 2-week-long SCRUM sprints and without an exhaustive QA process. I also wrote a lot of code (Bash, Ruby, Perl, Go, and a little Java), and I followed the typical software development process, but that did not always apply to critical production issues.. 
.....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Defensive DevOps</h1> <pre> c=====e H ____________ _,,_H__ (__((__((___() //| | (__((__((___()()_____________________________________// |ACME | (__((__((___()()()------------------------------------' |_____| ASCII Art by Clyde Watson </pre> <p class="quote"><i>Published by Paul Buetow 2021-10-22</i></p> <p>I have seen many different setups and infrastructures during my career. My roles always included front-line, ad-hoc firefighting of production issues. This often involves identifying and fixing these under time pressure, without the comfort of 2-week-long SCRUM sprints and without an exhaustive QA process. I also wrote a lot of code (Bash, Ruby, Perl, Go, and a little Java), and I followed the typical software development process, but that did not always apply to critical production issues.</p> <p>Unfortunately, no system is 100% reliable, and there is always a subset of the possible problem space you can't be prepared for. IT infrastructures can be complex. Not even mentioning Kubernetes yet, a Microservice-based infrastructure can complicate things even further. You can take care of 99% of all potential problems by following all DevOps best practices. Those best practices are not the subject of this blog post; this post is about the remaining sub-1% of issues that arise out of nowhere and that you can't be prepared for. </p> <p>Is there a software bug in production, even though the software passed QA (after all, it is challenging to reproduce production behaviour in an artificial testing environment) and the software didn't show any issues running in production until a special case came up just now after it got deployed to production a week ago? Are there multiple hardware failures happening, causing loss of service redundancy or data inaccessibility? 
Is the automation of external customers connected to your infrastructure putting unexpected extra pressure on your grid, driving higher latencies and putting the SLAs at risk? You bet the solution is: Sysadmins, SREs and DevOps Engineers to the rescue. </p> <p>You will agree that fixing production issues this way is not proactive but rather reactive. I prefer to call it defensive, though, as you "defend" your system against a production issue. But, at the same time, you have to take a cautious (defensive) approach to fix it, as you don't want to make things worse. </p> <p>Over time, I have compiled a list of fire-fighting automation strategies, which I would like to share here. </p> <h2>Meet Defensive DevOps</h2> <p>Defensive DevOps is a term I coined myself. I define it this way:</p> <ul> <li>It is the practice of automating production issues away ASAP as they appear. </li> <li>For rapid development, ignore most of the CI and QA best practices.</li> <li>Ignore the SCRUM process (if your team does SCRUM), as it will take too long to implement a solution. </li> <li>Be extremely careful (defensive) executing any fixing code in production: take all failure scenarios into consideration and always have a rollback plan at hand. </li> <li>Still deliver a high-quality solution so that no customer will ever notice that there was an issue in the first place.</li> </ul> <p>That sounds a bit crazy, but this is, unfortunately, on rare occasions the reality. The question is not whether production issues will happen; the question is WHEN they will happen. Every large provider, such as Google, Netflix, and so on, has suffered significant outages before, and I firmly believe that their engineers know what they are doing. But you can prepare for the unexpected only to a certain degree.</p> <h2>Don't fully automate from the beginning</h2> <p>Do you have to solve problem X? The best solution would be to fully automate it away, correct? 
No, the best way is to fix problem X manually first. Does the problem appear on one server or on a thousand servers? The scale does not matter here. The point is that you should fix the problem at least once manually, so you understand the problem and how to solve it before implementing automation around it.</p> <p>You should also have a short meeting with your team. Every person may have a different perspective and can give valuable input for determining the best strategy. But keep the session short and efficient. Focus on the facts. After all, you are the domain expert and you probably know what you are doing.</p> <p>Once you understand the problem, fix it again on a different server. This time maybe write a small program or script. Semi-automate the process, but don't fully automate it yet. Start the semi-automated solution manually on a couple more servers and observe the result. You want to gain more confidence that this really solves the problem. This can take a couple of hours of manually running it over and over again. During that process, you will improve your script iteratively.</p> <h2>Develop code directly on production systems</h2> <p>You have to develop code directly on a production system. This sounds a bit controversial, but you want to get a working solution ASAP, and there is a very high chance that you can't reproduce problem X in a development or QA environment. Or at least it will consume significant effort and time to reproduce the problem, and by the time your code is ready, it's already too late. So the most practical solution is to directly develop your solution against a production system with the problem at hand. </p> <p>You might not have your full-featured IDE available on a production system, but a text editor, such as Vim (or Neovim), is sufficient for writing scripts. Some editors allow you to edit files remotely. With Vim you can accomplish it with "vim scp://SERVER///path/to/file.sh". 
Every time you save the file, it will be automatically uploaded via SCP to the server. From there, you can execute it directly. This comes with the additional benefit of still having access to all the Vim plugins installed locally, which you might not have installed on any production machines. This approach also avoids the network delays you might experience when running your editor directly on a remote machine. </p> <p>Unfortunately, it will be a bit more complicated when you rely on code reviews (e.g. in a FIPS environment). Pair-programming could be the solution here.</p> <h3>Don't make it worse</h3> <p>You want to triple-check that your script is not damaging your system even further. You might introduce a bug to the code, so there should always be a way to roll back any permanent change it causes. You have to program it in a defensive style:</p> <ul> <li>Make sure that everything your script does is logged to a file. If it's a Bash script, it is best to use "set -x". This makes the script print all commands as they are executed. Always write the output to a file. This helps to verify that your script is working as intended. The log output should always include timestamps for each significant operation performed.</li> <li>Make sure that no command executed by your script is failing. You should use "set -e" in your script, which makes the whole script terminate immediately if a command in it exits with a non-zero status. This will save you from obvious errors, e.g. trying to move files to a non-existent directory or trying to operate on a non-existent file. </li> <li>Your script should never delete any files. If solving problem X involves deleting files, don't delete them but rename or move them to a separate directory so that these can be recovered just in case. </li> <li>When you rename/move files around, always add a timestamp to a directory or the end of the file name (e.g. with "mv FILE FILE.$(date +%s)"). 
This ensures that a backup never gets overwritten by another backup during a subsequent run of your script. Alternatively, before renaming a file, check whether the destination file already exists or not. </li> <li>When solving problem X involves manipulating files in place, be ultra-cautious. Best try to avoid in-place file manipulation. But if you really have to, you should, if disk space permits, always create a backup of the file first. Depending on the particular case, you might add a timestamp to the backup or only keep the very first backup of a file.</li> <li>You should implement a "--dry" switch in your script. When you run the script in dry mode, it won't manipulate anything on the system; it only prints out what it would do. Always run your script in dry mode before running it for real. </li> </ul> <p>Furthermore, when you write a Bash script, always run the tool ShellCheck (https://www.shellcheck.net/) on it. This helps to catch many potential issues before applying it in production. </p> <h2>Test your code</h2> <p>You probably won't have time for writing unit tests. But what you can do is to pedantically test your code manually. But you have to do the testing on a production machine. So how can you test your code in production without causing more damage? </p> <p>Your script should be idempotent. This means you can run it any number of times in a row, and you will always get the same result. For example, in the first run of the script, a file A gets renamed to A.backup. The second time you run the script, it attempts to do the same, but it recognises that A has already been renamed to A.backup and skips that step. This is very helpful for manual testing, as it means that you can re-run the script every time you extend it. You should dry-run the script at least once before running it for real. You can apply the same principle for almost all features you add to the code. 
</p> <p>You may also want to inject manual negative testing into your script. For example, you want to run a particular function F in your script but only if a certain pre-condition is met, and you want to ensure that the code branching works as expected. The pre-condition check could be pretty complex (e.g. N log messages containing a specific warning string are found in the application's logs, but only on the cluster leader server). You can flip the switch directly in the code manually (e.g. run F only when the pre-condition isn't met) and then perform a dry run of the script and study the output. Once done, flip the switch back to its correct configuration. For double insurance, test the same on a different server type (e.g. on a follower and not on a leader system).</p> <p>By following these principles, you test every line of code while you are developing it. </p> <h2>Automation</h2> <p>At some point, you will be tired of manually running your script and also confident enough to automate it. You could deploy it with a configuration management system such as Puppet and schedule a periodic execution via cron, a systemd timer or even a separate background daemon process. You have to be extremely careful here. The more you automate, the more damage you can cause. You don't want to automate it on all servers involved at once, but you want to slowly ramp up the automation. </p> <p>First, automate it only on one single server and monitor the result closely. At first, only automate running the script in dry mode. Also, don't forget that you still should log everything that the script is doing. Once everything looks fine, you can automate the script on the canary server for real. It shouldn't be a disaster if something goes wrong, as systems are usually designed in an HA fashion, where the same data is still available on at least one other server. 
In the worst-case scenario, you could recover data from there or from the local backup files your script created.</p> <p>Now, you can add a handful more canary servers to the automation. You should pay close attention to what the automation is doing. You could use a tool like DTail for distributed log file following. At this point, you could also think of deploying a monitoring check (e.g. Icinga) to verify that your script is not terminating abnormally or logging warnings or errors.</p> <a class="textlink" href="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program.html">DTail - The distributed log tail program</a><br /> <p>From there, you could automate the solution on more and more servers. Ideally, ramp up the automation to a handful of systems, and later to a whole line of servers (e.g. all secondary servers of a given cluster). And afterwards, automate it on all servers.</p> <p>Remember, whenever something goes wrong, you will have plenty of logs and backup files available. The disaster recovery would involve extending your script to take care of that too or writing a new script for restoring from the backups. </p> <h2>Out of office hours</h2> <p>If possible, don't deploy any automation shortly before out of office hours, such as in the evening, before holidays or weekends. The only exception would be that you, or someone else, will be available to monitor the automation out of office hours. If it is a critical issue, someone, for example, the on-call person, could take over. Or ask your boss whether you can work now and take another day off to compensate.</p> <p>You should add an easy off-switch to your automation so that everyone on your team knows how to pause it if something goes wrong in order to adjust the automation accordingly. Of course, you should still follow all the principles mentioned in this blog post when making any changes. 
</p> <h2>Retrospective</h2> <p>For every major incident, you need to follow up with an incident retrospective. A blame-free, detailed description of exactly what went wrong to cause the incident, along with a list of steps to take to prevent a similar incident from occurring again in the future.</p> <p>This usually means creating one or more tickets, which will be dealt with soon. Once the permanent fix is deployed, you can remove your ad-hoc automation and monitoring around it and focus on your regular work again.</p> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Keep it simple and stupid</title> <link href="gemini://buetow.org/gemfeed/2021-09-12-keep-it-simple-and-stupid.gmi" /> <id>gemini://buetow.org/gemfeed/2021-09-12-keep-it-simple-and-stupid.gmi</id> <updated>2021-09-12T09:39:20+03:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>A robust computer system must be kept simple and stupid (KISS). The fancier the system is, the more can break. Unfortunately, most systems tend to become complex and challenging to maintain in today's world. In the early days, so I was told, engineers understood every part of the system, but nowadays, we see more of the 'lasagna' stack. One layer or framework is built on top of another layer, and in the end, nobody has got a clue what's going on.. .....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Keep it simple and stupid</h1> <pre> _______________ |*\_/*|_______ | ___________ | .-. .-. ||_/-\_|______ | | | | | .****. .****. | | | | | | 0 0 | | .*****.*****. | | 0 0 | | | | - | | .*********. | | - | | | | \___/ | | .*******. | | \___/ | | | |___ ___| | .*****. | |___________| | |_____|\_/|_____| .***. 
|_______________| _|__|/ \|_|_.............*.............._|________|_ / ********** \ / ********** \ / ************ \ / ************ \ -------------------- -------------------- </pre> <p class="quote"><i>Published by Paul Buetow 2021-09-12, last updated 2021-10-22</i></p> <p>A robust computer system must be kept simple and stupid (KISS). The fancier the system is, the more can break. Unfortunately, most systems tend to become complex and challenging to maintain in today's world. In the early days, so I was told, engineers understood every part of the system, but nowadays, we see more of the "lasagna" stack. One layer or framework is built on top of another layer, and in the end, nobody has got a clue what's going on.</p> <h1>Need faster hardware</h1> <p>This not only makes the system much more complex, difficult to maintain and challenging to troubleshoot, but also slow. So more experts are needed to support it. Also, newer and faster hardware is required to make it run smoothly. Often, it's so much easier to buy speedier hardware than to rewrite a whole system from the bottom up. The latter would require many more resources in the short run, but in the long run, it should pay off. Unfortunately, many project owners shy away from it as they only want to get their project done and then move on.</p> <h1>Too complex to be replaced</h1> <h2>On COBOL</h2> <p>Have a look at COBOL, a prevalent programming language of the past. No one is learning COBOL in college or university anymore, but many legacy systems still require COBOL experts. Why is this? It's just too scary to rewrite everything from scratch. There's too much COBOL code out there that can't be replaced overnight. 
</p> <a class="textlink" href="https://nymag.com/intelligencer/2020/04/what-is-cobol-what-does-it-have-to-do-with-the-coronavirus.html">https://nymag.com/intelligencer/2020/04/what-is-cobol-what-does-it-have-to-do-with-the-coronavirus.html</a><br /> <h2>On Kubernetes</h2> <p>Now have a look at Kubernetes (k8s), the currently trendy infrastructure technology. Of course, there are many benefits of using k8s (auto-scaling, reproducible deployments, dynamic resource allocation and resource sharing, saving of hardware costs, good advertising to potential employees as it is the current hot sauce of infrastructure). But all of this also comes with costs: you need experts operating the k8s cluster (or you need to pay extra for a managed cluster in the cloud), and the complexity of the system increases (k8s comes with a steep learning curve). The latter not only applies to the engineers managing the k8s cluster - it also applies to the software engineers, who now have to develop 'cloud native' applications and, therefore, have to change how they used to develop software. They all need to be re-educated on what cloud-native means, and they also need to understand the key concepts of k8s for writing optimal software for it.</p> <h2>The younger generation of IT professionals</h2> <p>Maybe the younger generation knows all of this already after graduation, but then they are missing other critical parts of the system for sure. I have seen engineers who knew about containers and how to configure resource restrictions for a Docker container managed via k8s but had never heard of Linux control groups and Linux namespaces. So obviously, there is some knowledge gap regarding the underlying architecture. This can be a big problem when you have to troubleshoot such a system during a production incident, and k8s adds a lot of abstraction to the mix, which doesn't make it easier. </p> <p>Coming back to COBOL, k8s is on its way to becoming something similar. 
One day, k8s might not be the hottest tech everyone wants to use. But there will still be many legacy k8s clusters around and not enough experts available to manage them:</p> <a class="textlink" href="https://www.techrepublic.com/article/why-kubernetes-is-our-modern-day-cobol-says-a-tech-expert/">https://www.techrepublic.com/article/why-kubernetes-is-our-modern-day-cobol-says-a-tech-expert/</a><br /> <p>Another article which struck me is:</p> <a class="textlink" href="https://it.slashdot.org/story/21/09/23/163212/todays-students-dont-understand-the-basics-of-computer-operations">Today's Students Don't Understand the Basics of Computer Operations </a><br /> <p>And here is something to smile about:</p> <a class="textlink" href="https://christine.website/blog/theres-a-node-2021-10-02">https://christine.website/blog/theres-a-node-2021-10-02</a><br /> <h1>The bloated web</h1> <p>Another example is the modern web. Have you ever wondered why the internet keeps getting slower and slower? The modern web is so much like lasagna that I decided to use Gemini as the primary protocol of my website. The HTML version of this website is just a fallback as many visitors don't know what Gemini is and don't have any compatible software installed for surfing the Geminispace:</p> <a class="textlink" href="2021-04-24-welcome-to-the-geminispace.html">2021-04-24-welcome-to-the-geminispace.html</a><br /> <p>The Gemtext markup format is KISS. There's no way to do any formatting other than headings, links, paragraphs, lists, quotes, and bare text blocks (e.g., ASCII art or code snippets). There's no way to create bloated Gemini sites, and due to its limited capabilities, there's also no way to commercialise it (e.g. there's no good way to track the site visitors as things like cookies don't exist). By design, the Gemini protocol can't be extended, so there is no chance to abuse it even in the future.
Gemini sites will stay KISS forever, and there won't be any fancy HTML/JavaScript frameworks like we see on the modern web.</p> <h1>Fancy log-management solutions</h1> <p>Yet another example I want to bring up is DTail, the distributed log tail program I wrote. There are many great and fancy log-management solutions available to choose from, and they all seem complex to set up and maintain. The ELK stack, for example, requires you to operate an ElasticSearch cluster (or multiple, if you are geo-redundant), Logstash (different configurations and instances, depending on your infrastructure) and a Kibana web-frontend (which also needs to be highly available). I have operated ElasticSearch clusters on multiple occasions, and I must say that it is not an easy task to optimise them for the particular workload you might encounter. I have also seen many ES clusters operated by other people, and I have seen these clusters failing a lot (so it's not just me). The reduced complexity of DTail also makes it more robust against outages. You won't troubleshoot your distributed application very well if the log management infrastructure isn't working either.</p> <a class="textlink" href="2021-04-22-dtail-the-distributed-log-tail-program.html">2021-04-22-dtail-the-distributed-log-tail-program.html</a><br /> <p>I am not saying that the ELK stack doesn't work, but it requires experts and additional hardware resources to support it. If you instead keep your infrastructure simple (e.g. only use DTail), it will pretty much maintain itself. </p> <h1>More KISS</h1> <h2>The Adslowbe PDF Reader</h2> <p>Another perfect example is the Adobe PDF reader. How can it be that the inventor of the PDF format creates such a terrible user experience with its official reader? The reader is awfully bloated and slow. There are much better alternatives around (especially for Linux and other UNIX-like operating systems, look at Zathura for example).
I believe the reason Adobe's reader is like this is featuritis: 90% of the users don't use 90% of all available features. Less is more; keep it simple and stupid. </p> <h2>The power of plain text files</h2> <p>Speaking of file formats, never underestimate the power of plain text files. Plain text files don't require any special software to be opened, and they outlive the software which created them in the first place. You will still be able to read a plain text file on a modern computer system ten (or twenty) years from now, but you probably won't be able to read an equally old Adobe Photoshop image file if the software required for reading that format isn't supported anymore and doesn't run anymore on modern computers.</p> <h2>KISS for programmers</h2> <p>Not to mention, keeping things simple and stupid also reduces the potential malicious attack surface. It's not just about the software and services you use and operate. It's also about the software you write. Here is a nice article about the KISS principle in software development:</p> <a class="textlink" href="https://thevaluable.dev/kiss-principle-explained/">https://thevaluable.dev/kiss-principle-explained/</a><br /> <h1>When KISS is not KISS anymore</h1> <p>There is, however, a trap. The more time you spend with things, the more these things feel natural to you and you become an expert. The more you become an expert, the more you introduce abstractions and other clever ways of doing things. For you, things still seem to be KISS, but another person may not be an expert and might not understand what you do. One of the fundamental challenges is to keep things really KISS. You might add abstraction upon abstraction to a system and not even notice it until it is too late.</p> <p>Enough ranting for now :-).
E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>On being Pedantic about Open-Source</title> <link href="gemini://buetow.org/gemfeed/2021-08-01-on-being-pedantic-about-open-source.gmi" /> <id>gemini://buetow.org/gemfeed/2021-08-01-on-being-pedantic-about-open-source.gmi</id> <updated>2021-08-01T10:37:58+03:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>I believe that it is essential to always have free and open-source alternatives to any kind of closed-source proprietary software available to choose from. But there are a couple of points you need to take into consideration. . .....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>On being Pedantic about Open-Source</h1> <pre> __ _____....--' .' ___...---'._ o -`( ___...---' \ .--. `\ ___...---' | \ \ `| | |o o | | | | \___'.-`. '. | | `---' '^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^' LGB - Art by lgbearrd </pre> <p class="quote"><i>Published by Paul Buetow 2021-08-01</i></p> <p>I believe that it is essential to always have free and open-source alternatives to any kind of closed-source proprietary software available to choose from. But there are a couple of points you need to take into consideration. </p> <h2>The costs of open-source</h2> <p>One benefit of using open-source software is that it doesn't cost anything, right? That's correct in many cases. However, in some cases you still need to spend a significant amount of time configuring the software to work for you. If you aren't careful, open-source software can end up more expensive than a proprietary commercial product. </p> <p>Not to say that I haven't seen the same effect with commercial software, where people had to, after buying it, put a bunch of effort into making it work due to a lack of quality or high complexity. But that's either bad luck or bad decision-making.
Most commercial providers I have worked with try to make it work for you, so that you will also buy other products and services from them later on and they don't lose you as a happy customer.</p> <h2>Commercial providers</h2> <p>Producers of commercial software want to earn money after all. This is to grow their businesses and also to be able to pay their employees, who also need to care for their families. Employees build up their careers, build houses, and are proud of their accomplishments in the company.</p> <p>So per se, commercial software is not a bad thing. Right? At least, commercial closed-source software is not a bad thing at heart. Unfortunately, some companies have to keep their software closed-source so as not to lose their edge over competitors. </p> <h2>Earning on open-source</h2> <p>There are also companies that earn money with open-source software. All the code they write is free to download and use, but you, as a customer, can pay for service and support if you are not an expert and can't manage it by yourself. </p> <p>I like this approach, as you can balance the effort and costs the way it suits you best, and when in doubt, you can audit the source code. Are you already an expert? Perfect, you don't need to buy additional support for the software. Everything can be set up by yourself, given that you have the time and priority.</p> <p>Also, once an open-source project has reached a certain size, it is unlikely to be abandoned. As long as at least one person is willing to be the open-source maintainer, the project won't die. Commercial providers, in contrast, can decide overnight to retire software, or they can go bankrupt (unless you purchase Microsoft Word, I don't believe it will die anytime soon). </p> <h2>Open-source organizations and individual contributors</h2> <p>Besides corporations, millions of individual open-source contributors write free and open-source software not for money but for pleasure.
Often, they are organized in non-profit organizations, working together to reach a common goal (it is worth mentioning that there are also many professionals, paid by large corporations, working full-time for non-profit open-source projects in order to push the features and reach the goals of the corporations). Sometimes, people don't agree on the project goal, so it gets forked, which can be a good thing. The more diversity, the better, as this is where competition and innovation happen. Also, the end user will end up with more choices. </p> <p>These open-source projects are of a very high quality standard and are rock-solid, if not better, alternatives to proprietary counterparts. If the project isn't backed by a large corporation already, you should donate to these open-source organizations and/or individual contributors. I have donated to some projects I use personally. Are you learning a foreign language with Anki flashcards? Anki is entirely free and open-source, and its developers happily accept donations to ensure future maintenance and development.</p> <h2>Lesser known projects and the charm of clunkiness</h2> <p>With the smaller, lesser-known open-source projects (not talking about established open-source projects like FreeBSD and Linux), however, you can't expect the software to be perfect and bug-free. After all, most of the code is written for pleasure and fun in the developers' free time. Besides the developer, you might be the only user of the project. The software may be a bit clunky to use, bugs are probably lurking around, and it might only work for a very specific use case.</p> <p>Clunkiness can be charming, though. And it can also encourage you to contribute code to make it better. There is a lot of such code in personal GitHub and GitLab repositories. The quality of such small open-source projects varies drastically. Many hobbyist programmers see programming as an art and put tons of effort into their projects.
Others upload broken crap, which is dangerous to use. So have a look at the code before you use it!</p> <h2>The security aspect</h2> <p>One of the main beliefs about open-source software is that it is more secure than closed-source software because everybody can read and fix the code. Is that actually true? You can only be sure when you audit the code yourself. If you are like me, you won't have time to audit all the open-source software you use. It's impossible for a single person to audit the tens of millions of lines of Linux kernel code. Static code analysis tools come in handy here, but they still require humans to look at the results.</p> <p>Security bugs in open-source projects are exposed to the public and fixed quickly, while we don't know exactly what happens to security bugs in closed-source ones. Still, hackers and security specialists can find them through reverse engineering and penetration testing. Overall, regarding security, I think it is still better to prefer open-source software because the more significant the project, the higher the probability that security bugs are found and fixed as more parties are looking into it. Furthermore, provided you have the necessary resources, you could still conduct an audit yourself. The latter especially happens when companies with their own security and penetration testing departments are evaluating the use of open-source. This is something not every company can afford though.</p> <h2>Always watch out for open-source alternatives</h2> <p>Do you need Microsoft Word? Why don't you just use the Vim text editor or GNU Emacs to write your letters? If that's too nerdy, you can still use open-source alternatives such as AbiWord or LibreOffice. Larger organizations have the tendency to standardize the software their employees have to use. Unfortunately, as Microsoft Word is the de-facto standard text processing program, most companies prefer Word over LibreOffice.
Same with Microsoft Excel vs LibreOffice Calc or other spreadsheet alternatives like Gnumeric. I don't know why that is; if you do, please E-Mail me, and I will update this blog article. I guess the devil is in the details here.</p> <p>I only use free and open-source operating systems on my personal Laptops, Desktop PCs and servers (FreeBSD and Linux based ones). Most of the programs and apps I use on them are free and open-source as well, and I have been comfortable with that for over twenty years. Exceptions are the BIOSes and some firmwares of my devices. I also use Skype as most of my friends and family are using it. These are, unfortunately, still proprietary software. But I will be looking into Matrix as a Skype alternative when I have time. There are also open BIOS alternatives, but they usually don't work on my devices.</p> <h2>What about mobile?</h2> <p>I struggle to go 100% open-source on my Smartphone. I use a Samsung phone with the stock Android as provided by Samsung. I love the device as it is large enough to use as a portable reading and note-taking device, and it can also take decent pictures. As a cloud backup solution, I have my own NextCloud server (open-source). Android is mainly open-source software, but many closed parts are still included. I replaced most of the standard apps with free and open-source variants from the F-Droid store though.</p> <p>I could get a LineageOS based phone to get rid of the proprietary Android parts (I tried that out a couple of times in the past). But then a couple of convenient apps, such as Google Maps, banking apps, Skype, the E-Ticket apps of various airlines, review apps for finding restaurants, Audible (I think Audible offers an excellent service), etc., won't work anymore. The proprietary Google Maps is still the best maps app, even though there are open alternatives available.
It's not that I couldn't live without these apps, but they make life a lot more convenient.</p> <h2>Know the alternatives</h2> <p>Thinking about alternative solutions is always a good idea. My advice is never to be entirely dependent on any proprietary software. Before you decide to use proprietary software, try to find alternatives in the open-source world. You might need to invest some time playing around with the options available. Maybe they are good enough for you, or maybe not.</p> <p>If you still want to use proprietary software, use it with caution. Have a look at the recent change at Google Photos: For a long time, "high quality" photos could be uploaded there for free without any quota. However, Google recently changed the model so that people exceeding a quota have to start paying for the extra space consumed. I am not against Google's decision, but it shows you that a provider can always change its direction. So you can't entirely rely on them. I repeat myself: Don't fully rely on anything proprietary, but you might still use proprietary software or services for your own convenience.</p> <h2>You can't control it all</h2> <p>The biggest problem I have with going 100% open-source is actually time. You can't control all the software you use or might be using in the future. You have only a finite amount of time available in your life. So you have to decide what's more important: Investigate and use an open-source alternative to every program and app you have installed, or rather spend quality time with your family and have a nice walk in the park or go to a sports class or cook a nice meal? You can't control it all in today's world of tech, not as a user and not even as a tech worker.
There's a great blog post worth reading: </p> <a class="textlink" href="https://unixsheikh.com/articles/how-to-stay-sane-in-todays-world-of-tech.html">https://unixsheikh.com/articles/how-to-stay-sane-in-todays-world-of-tech.html</a><br /> <h2>The middle way</h2> <p>Regarding my personal Smartphone dilemma: I guess the middle way is to use two phones: </p> <ul> <li>Have a secondary, proprietary Android phone with Google Play store (or an Apple iPhone if this is more your thing) and all its benefits for occasional use. Use the proprietary phone only with intention. Such a phone implies some risks regarding your privacy. If you aren't careful, app providers will collect your personal data for building a digital profile of you, which gets used for online advertisement and other things. This doesn't only apply to the Smartphone; it also applies to some proprietary software (including cloud services such as Google Photos) you use on your home computer and to websites you visit (I am looking at you, Facebook, Twitter and friends). Try to disable all tracking features on such a phone. It's not a guarantee that nobody will be collecting data from you anymore, but you should at least take the chance. Cal Newport once mentioned that you should not use apps that raise privacy concerns as much anyway and instead spend more time on things which matter.</li> <li>Have a primary phone, entirely based on free and open-source software. There will probably be no app collecting your personal data. Try to use the primary phone for all of your everyday activities and fall back to the proprietary phone only for particular use cases. Once there is decent hardware (with a decent camera) running Linux (such as Mobian, for example) available, I will consider a purchase. The only 3rd party which will then still be able to track you will be your network provider. You could start your own phone network, but that seems overkill.
There are already the Pinephone and the Librem 5, both running a real Linux (Android is Linux based, but it doesn't count as a real Linux for me). Still, I want to wait a bit longer for better hardware to be available (I want to have a good camera always with me).</li> <li>You could also add a tertiary phone to the mix, which you only use for work and nothing else. That one will very likely be a proprietary phone too. You only have to keep this one around when you are working or when you are on-call.</li> </ul> <p>I have been playing with other smartphone OS alternatives too, especially with MeeGo (which has died already) and SailfishOS. Security and privacy seem to be significantly improved compared to Android. As a matter of fact, I bought a cheap, used Sony Xperia XA2 last year and installed SailfishOS on it. It's a nice toy, but it's still not the holy open-source grail as there are also proprietary parts in SailfishOS. Platforms such as Mobian, Ubuntu Touch and Plasma Mobile are more compelling to me. People must explore alternatives to Android and Apple here, as otherwise, you won't own any gadgets anymore:</p> <a class="textlink" href="https://news.slashdot.org/story/21/07/10/0120236/by-2030-you-wont-own-any-gadgets">https://news.slashdot.org/story/21/07/10/0120236/by-2030-you-wont-own-any-gadgets</a><br /> <p>Anyhow, any gadget, including your phone, should be a tool you use. Don't let the phone use you!</p> <h2>The downside of being a nobody</h2> <p>Be aware that it might be to your disadvantage if you manage to go completely undercover without anyone collecting data from you. Suppose you are a nobody on the web (no social media profiles, no tracking history, etc.). In that case, you aren't behaving like the masses, and therefore you are suspicious. So it might even be a good thing to leave your marks here and there once in a while. You aren't hiding anything anyway, correct? Just be mindful of what you are sharing about yourself.
I share personal things very rarely on Facebook, for example. And I only share a small subset of my personal life on my personal homepage, this blog and all of my social media accounts. Nobody is interested in what I have for breakfast anyway, I guess. Write me an E-Mail if you are interested in what I am having for breakfast.</p> <h2>Mobile open-source OSes are still evolving</h2> <p>You might have noticed that I wrote a lot about Smartphones in this article. The reason is that free and open-source software for Smartphones is still evolving. In contrast, for Laptops and Desktop PCs, it's already there. There is no reason to use proprietary operating systems such as Windows or macOS on your computers unless your employer forces you to use one of these. Why would they force you? It has to do with standardization again. The IT department can only manage so many platforms. It wouldn't be manageable by IT if every employee installed their own Linux distribution or one of the *BSDs. That might work for small startups but not for larger companies, especially not for security-focused companies.</p> <p>I would love a standardized Linux at work, though. Dell and Lenovo also officially support Linux on their notebooks. The culprit may be a lack of knowledgeable IT staff to maintain and support the Desktop Linux users. Not all colleagues are Linux geeks like you and me. I am using macOS for work, but I am not an Apple expert. Occasionally I have to contact IT support regarding some issues I have. I don't use the macOS GUI a lot; I mainly live in the terminal so I can run the same tools I also use on Linux.</p> <h2>Conclusion</h2> <p>Should you be pedantic about open-source software? It depends. It depends on your fundamental values and how much time you are ready to invest. Open-source software is not just free as in money, but also free as in freedom. You will gain back complete control of your personal data.
Unfortunately, installing ready-made proprietary apps from the Play Store is much more convenient than building up a trustworthy open-source-based infrastructure by yourself. As a guideline, use proprietary software and services with caution. Be mindful about your choices and where you leave your digital fingerprints. When in doubt, think less is more. Do you really need this new shiny app? What benefit does it provide to you? Probably you don't really need that shiny new app.</p> <p>You have better chances when you know how to manage your own server and install and manage alternatives to the big cloud providers by yourself. I have the advantage that I have work experience as a Linux Systems Administrator here. I mentioned NextCloud already. I use NextCloud for online photo and file storage, contact and calendar sync and as an RSS news feed server. You could do the same with your own E-Mail server, and you can also host your own website and blog. I also mentioned Matrix as a Skype alternative (which could also be an alternative to WhatsApp, Telegram, Viber, ...). I don't know a lot about Matrix yet, but it seems to be a very neat alternative. I am ready to invest time in it as one of my future personal pet projects. Not only because I think it's better, but also for fun and as a hobby. But this doesn't mean that I invest *all* of my personal free time in it.</p> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>The Well-Grounded Rubyist</title> <link href="gemini://buetow.org/gemfeed/2021-07-04-the-well-grounded-rubyist.gmi" /> <id>gemini://buetow.org/gemfeed/2021-07-04-the-well-grounded-rubyist.gmi</id> <updated>2021-07-04T10:51:23+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>When I was a Linux System Administrator, I programmed in Perl for years. I still maintain some personal Perl programming projects (e.g. Xerl, guprecords, Loadbars).
After switching jobs a couple of years ago (becoming a Site Reliability Engineer), I found Ruby (and some Python) widely used there. As I wanted to do something new, I then decided to give Ruby a go for all medium-sized programming and scripting projects.. .....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>The Well-Grounded Rubyist</h1> <p class="quote"><i>Published by Paul Buetow 2021-07-04</i></p> <p>When I was a Linux System Administrator, I programmed in Perl for years. I still maintain some personal Perl programming projects (e.g. Xerl, guprecords, Loadbars). After switching jobs a couple of years ago (becoming a Site Reliability Engineer), I found Ruby (and some Python) widely used there. As I wanted to do something new, I decided to give Ruby a go.</p> <p>You should learn or try out one new programming language a year anyway. If you end up not using the new language, that's not a problem. You will learn new techniques with each new programming language, and this also helps you to improve your overall programming skills, even in other languages. Also, having some background in a similar programming language makes it reasonably easy to get started. Besides that, learning a new programming language is kick-a** fun!</p> <a href="https://buetow.org/gemfeed/2021-07-04-the-well-grounded-rubyist/book-cover.jpg"><img src="https://buetow.org/gemfeed/2021-07-04-the-well-grounded-rubyist/book-cover.jpg" /></a><br /> <p>Superficially, Ruby seems to have many similarities to Perl (but, of course, it is entirely different when you look closer), which pushed me towards Ruby instead of Python. I have tried Python a couple of times before, and I managed to write good code, but I never felt satisfied with the language. I didn't love the syntax, especially the significant indentation; it always confused me.
I don't dislike Python, but I prefer not to program in it if I have a choice, especially when there are more compelling alternatives available. Personally, I find it so much more fun to program in Ruby than in Python.</p> <a href="https://buetow.org/gemfeed/2021-07-04-the-well-grounded-rubyist/book-backside.jpg"><img src="https://buetow.org/gemfeed/2021-07-04-the-well-grounded-rubyist/book-backside.jpg" /></a><br /> <p>Yukihiro Matsumoto, the inventor of Ruby, said: "I wanted a scripting language that was more powerful than Perl and more object-oriented than Python" - So I can see where some of the similarities come from. I personally don't believe that Ruby is more powerful than Perl, though, especially when you take CPAN and/or Perl 6 (now known as Raku) into the equation. Well, it all depends on what you mean by "more powerful". But I want to stay pragmatic and use what's already used at my workplace.</p> <h2>My Ruby problem domain</h2> <p>I wrote a lot of Ruby code over the last couple of years. There were many small to medium-sized tools and other projects such as Nagios monitoring checks, and even an internal monitoring and reporting site based on Sinatra. All Ruby scripts I wrote do their work well; I didn't encounter any significant problems using Ruby for any of these tasks. Of course, there's nothing that couldn't be written in Perl (or Python) instead; after all, these languages are all Turing-complete, and all of them come with a huge set of 3rd party libraries :-).</p> <p>I don't use Ruby for all programming projects, though.
</p> <ul> <li>I am using Bash for small (usually below 500 lines of code) scripts and ad-hoc command-line automation.</li> <li>I program in Google Go for more complex tools (such as DTail) and for problem solving involving data crunching.</li> <li>Occasionally, I write some lines of Java code for minor feature enhancements and fixes to improve the reliability of some of the services.</li> <li>Sometimes, I still program in good old C. This is for special projects (e.g. I/O Riot) or low-level PoCs or SystemTap guru mode scripts.</li> </ul> <a class="textlink" href="https://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.html">Also have a look at my personal Bash coding style.</a><br /> <a class="textlink" href="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program.html">Read here about DTail - the distributed log tail program.</a><br /> <a class="textlink" href="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux.html">This is a magazine article about I/O Riot I wrote.</a><br /> <p>For all other in-between tasks I mainly use the Ruby programming language (unless I decide to give something new a shot once in a while).</p> <h2>Being stuck in Ruby-mediocrity</h2> <p>As a Site Reliability Engineer, I had many tasks and problems to solve as efficiently and quickly as possible and, of course, without bugs. So I learned Ruby relatively fast, by doing and through the occasional web search for "how to do thing X". I was always eager to get the problem at hand solved, and as long as the code solved the problem, I was usually happy.</p> <p>Until now, I had never read a whole book or taken a course on Ruby. As a result, I found myself writing Ruby in a Perl-ish procedural style (with Perl, you can do object-oriented programming too, but Perl wasn't designed from the ground up to be an object-oriented language).
I didn't take advantage of all the specialities Ruby has to offer as I invested most of my time in the problems at hand and not in the Ruby idiomatic way of doing things.</p> <p>An unexpected benefit was that most of my Ruby code (probably not all, there are always dark corners lurking around in some old code bases) was easy to follow and extend or fix, even by people who usually don't speak Ruby, as there wasn't too much magic involved in my code - However, I could have done better still. Looking at other Ruby projects, I noticed over time that there is so much more to the language I wanted to explore. For example, I wanted to learn about new techniques, Ruby best practices and much more about how things work under the hood.</p> <h2>O'Reilly Safari Books Online</h2> <p>I do have an O'Reilly Safari Online subscription (thank you, employer). To my delight, I found "The Well-Grounded Rubyist" there (the text version and also the video version of it). I watched the video version over a couple of weeks, chunking the content into small pieces so that it fit into my schedule, increasing the playback speed for the topics I already knew well enough, slowing it down to the actual pace when there was something new to learn, and occasionally jumping back to the text book to review what I had just learned. To my satisfaction, I was already familiar with over half of the language. But there was still a big chunk, especially how the magic happens under the hood in Ruby, which I had missed out on, and I am happy to be aware of it now.</p> <p>I also loved the occasional dry humour in the book: "An enumerator is like a brain in a science fiction movie, sitting on a table with no connection to a body but still able to think". :-)</p> <p>Will I rewrite and refactor all of my existing Ruby programs? Probably not, as they all do their work as intended. Some of these scripts will eventually be replaced or retired.
But depending on the situation, I might refactor a module, class or a method or two once in a while. I already knew how to program in an object-oriented style from other languages (e.g. Java, C++, and Perl, both with Moose and plain) before I started Ruby, so my existing Ruby code is not as bad as you might assume after reading this article :-). In contrast to Java/C++, Ruby is a dynamic language, and the idiomatic ways of doing things differ from statically typed languages.</p> <h2>Key takeaways</h2> <p>These are my key takeaways. These only point out some specific things I have learned, and represent, by far, not everything I've learned from the book.</p> <h3>"Everything" is an object</h3> <p>In Ruby, everything is an object. However, Ruby is not Smalltalk. It depends on what you mean by "everything". Fixnums are objects. Classes also are, as instances of class Class. Methods, operators and blocks aren't but can be wrapped by objects via a "Proc". A simple assignment isn't and can't be. Statements like "while" also aren't and can't be. Comments obviously also fall in the latter group. Ruby is more object-oriented than anything else I have ever seen, except for Smalltalk.</p> <p>In Ruby, like in Java/C++, classes are classes, objects are instances of classes, and there is class inheritance. There is single inheritance in Ruby, but with the power of mixing in modules, you can extend your classes in a better way than multiple class inheritance (like in C++) would allow. It's also different to Java interfaces, as interfaces in Java only come with the method prototypes and not with the actual method implementations like Ruby modules.</p> <h3>"Normal" objects and singleton objects</h3> <p>In Ruby, you can also have singleton objects. A singleton object can be an instance of a class but be modified after its creation (e.g. a method added to only this particular instance after its instantiation). 
Or, another variant of a singleton object is a class (yes, classes are also objects in Ruby). All of that is way better described in the book, so have a read by yourself if you are confused now; just remember: Ruby's object system is very dynamic and flexible. At runtime, you can add and modify classes, objects of classes, singleton objects and modules. You don't need to restart the Ruby interpreter; you can change the code during runtime dynamically through Ruby code.</p> <h3>Domain specific languages</h3> <p>Due to Ruby's flexibility through object individualization (e.g. adding methods at runtime, changing the core behaviour of classes, or catching unknown method calls and dynamically dispatching and/or generating the missing methods via the "method_missing" method), Ruby is a very good language to write your own small domain specific language (DSL) on top of Ruby syntax. I only noticed that after reading this book. Maybe this is one of the reasons why even the configuration management system Puppet once tried to use a Ruby DSL instead of the Puppet DSL for its manifests. I am not sure why the project got abandoned, though; it probably has to do with performance. To be honest, Ruby is not the fastest language, but it is fast enough for most use cases. And, especially from Ruby 3, performance is one of the main things being worked on currently. If I want performance, I can always use another programming language.</p> <h3>Ruby is "self-ish"</h3> <p>Ruby will fall back to the default "self" object if you don't specify an object method receiver. To give you an example, some more explanation is needed: There is the "Kernel" module mixed into almost every Ruby object. For example, "puts" is just a method of module "Kernel". When you write "puts :foo", Ruby sends the message "puts" to the current object "self". The class of object "self" is "Object". Class Object has module "Kernel" mixed in, and "Kernel" defines the method "puts". 
</p> <pre> >> self => main >> self.class => Object >> self.class.included_modules => [PP::ObjectMixin, Kernel] >> Kernel.class => Module >> Kernel.methods.grep(/puts/) => [:puts] >> puts 'Hello Ruby' Hello Ruby => nil >> self.puts 'Hello World' Hello World => nil </pre> <p>Ruby offers a lot of syntactic sugar and seemingly magic, but it all comes back to objects and messages to objects under the hood. As all is hidden in objects, you can unwrap and even change the magic and see what's happening under the hood. Then, suddenly everything makes so much sense.</p> <h3>Functional programming</h3> <p>Ruby embraces an object-oriented programming style. But there is good news for fans of the functional programming paradigm. Immutable data (frozen objects), pure functions, lambdas and higher-order functions, lazy evaluation, tail-recursion optimization, method chaining, currying and partial function application: all of that is there. I am delighted about that, as I am a big fan of functional programming (having played with Haskell and Standard ML before).</p> <p>Remember, however, that Ruby is not a pure functional programming language. You, the Rubyist, need to explicitly decide when to apply a functional style, as, at heart, Ruby is designed to be an object-oriented language. The language will not enforce side effect avoidance, you will have to enable tail-recursion optimization (as of Ruby 2.5) explicitly, and variables/objects aren't immutable by default either. But none of that hinders you from using these features. </p> <p>I liked this book so much that I even bought myself a (used) paper copy of it. To my delight, there was also a free eBook version in ePub format included, which I now have on my Kobo Forma eBook reader. :-)</p> <h2>Perl</h2> <p>Will I abandon my beloved Perl? Probably not. There are also some Perl scripts I use at work. But unfortunately I only have a limited amount of time and I have to use it wisely. 
I might look into Raku (formerly known as Perl 6) next year and use it for a personal pet project, who knows. :-). I also highly recommend reading the two Perl books "Modern Perl" and "Higher-Order Perl".</p> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Gemtexter - One Bash script to rule it all</title> <link href="gemini://buetow.org/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all.gmi" /> <id>gemini://buetow.org/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all.gmi</id> <updated>2021-06-05T19:03:32+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>You might have read my previous blog post about entering the Geminispace, where I pointed out the benefits of having and maintaining an internet presence there. This whole site (the blog and all other pages) is composed in the Gemtext markup language. . .....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Gemtexter - One Bash script to rule it all</h1> <pre> o .,<>., o |\/\/\/\/| '========' (_ SSSSSSs )a'`SSSSSs /_ SSSSSS .=## SSSSS .#### SSSSs ###::::SSSSS .;:::""""SSS .:;:' . . \\ .::/ ' .'| .::( . | :::) \ /\( / /) ( | .' \ . ./ / _-' |\ . | _..--.. . /"---\ | ` | . | -=====================,' _ \=(*#(7.#####() | `/_.. , ( _.-''``';'-''-) ,. \ ' '+/// | .'/ \ ``-.) \ ,' _.- (( `-' `._\ `` \_/_.' ) /`-._ ) | ,'\ ,' _.'.`:-. \.-' / <_L )" | _/ `._,' ,')`; `-'`' | L / / / `. ,' ,|_/ / \ ( <_-' \ \ / `./ ' / /,' \ /|` `. | )\ /`._ ,'`._.-\ |) \' / `.' )-'.-,' )__) |\ `| : /`. `.._(--.`':`':/ \ ) \ \ |::::\ ,'/::;-)) / ( )`. | ||::::: . .::': :`-( |/ . | ||::::| . :| |==[]=: . - \ |||:::| : || : | | /\ ` | ___ ___ '|;:::| | |' \=[]=| / \ \ | /_ ||``|||::::: | ; | | | \_.'\_ `-. : \_``[]--[]|::::'\_;' )-'..`._ .-'\``:: ` . \ \___.>`''-.||:.__,' SSt |_______`> <_____:::. . . 
\ _/ `+a:f:......jrei''' </pre> <p class="quote"><i>Published by Paul Buetow 2021-06-05</i></p> <p>You might have read my previous blog post about entering the Geminispace, where I pointed out the benefits of having and maintaining an internet presence there. This whole site (the blog and all other pages) is composed in the Gemtext markup language. </p> <a class="textlink" href="https://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace.html">Welcome to the Geminispace</a><br /> <p>This comes with the benefit that I can write content in my favourite text editor (Vim). </p> <h2>Motivation</h2> <p>Another benefit of using Gemini is that the Gemtext markup language is easy to parse. As my site is dual-hosted (Gemini+HTTP), I could, in theory, just write a shell script to deal with the conversion from Gemtext to HTML; there is no need for a full-featured programming language here. I have done a lot of Bash in the past, but I also often revisit old tools and techniques to refresh my knowledge and keep it up to date.</p> <a href="https://buetow.org/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all/blog-engine.jpg"><img alt="Motivational comic strip" title="Motivational comic strip" src="https://buetow.org/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all/blog-engine.jpg" /></a><br /> <p>I did exactly that: I wrote a Bash script, named Gemtexter, for this purpose:</p> <a class="textlink" href="https://codeberg.org/snonux/gemtexter">https://codeberg.org/snonux/gemtexter</a><br /> <p>In short, Gemtexter is a static site generator and blogging engine that uses Gemtext as its input format.</p> <h2>Output formats</h2> <p>Gemtexter takes the Gemtext markup files as the input and generates the following outputs from them (you can find examples for each of these output formats on the Gemtexter GitHub page):</p> <ul> <li>HTML files for my website</li> <li>Markdown files for a GitHub page</li> <li>A Gemtext Atom feed for my blog posts</li> <li>A 
Gemfeed for my blog posts (a particular feed format commonly used in Geminispace. The Gemfeed can be used as an alternative to the Atom feed).</li> <li>An HTML Atom feed of my blog posts</li> </ul> <p>I could have done all of that with a more robust language than Bash (such as Perl, Ruby, Go...), but I didn't. The purpose of this exercise was to challenge what I can do with a "simple" Bash script and learn new things.</p> <h2>Taking it as far as I should, but no farther</h2> <p>Bash is very well suited for small scripts and ad-hoc automation on the command line. But it is for sure not a robust programming language. At the time of writing this blog post, Gemtexter is nearing 1000 lines of code, which is actually a pretty large Bash script.</p> <h3>Modularization</h3> <p>I modularized the code so that each core functionality has its own file in ./lib. All the modules are included from the main Gemtexter script. For example, there is one module for HTML generation, one for Markdown generation, and so on. </p> <pre> paul in uranus in gemtexter on 🌱 main ❯ wc -l gemtexter lib/* 117 gemtexter 59 lib/assert.source.sh 128 lib/atomfeed.source.sh 64 lib/gemfeed.source.sh 161 lib/generate.source.sh 50 lib/git.source.sh 162 lib/html.source.sh 30 lib/log.source.sh 63 lib/md.source.sh 834 total </pre> <p>This way, the script could grow far beyond 1000 lines of code and still be maintainable. With more features, execution speed may slowly become a problem, though. I already notice that Gemtexter doesn't produce results instantly but requires a few seconds of runtime. That's not a problem yet, though. 
</p> <h3>Bash best practices and ShellCheck</h3> <p>While working on Gemtexter, I also had a look at the Google Shell Style Guide and wrote a blog post on that:</p> <a class="textlink" href="https://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.html">Personal bash coding style guide</a><br /> <p>I followed all these best practices, and in my opinion, the result is a pretty maintainable Bash script (given that you are fluent with all the sed and grep commands I used).</p> <p>ShellCheck, a shell script analysis tool written in Haskell, is run on Gemtexter to ensure that all code is acceptable. I am pretty impressed with what ShellCheck found. </p> <p>For example, it detected "some_command | while read var; do ...; done" loops and hinted that these create a new subprocess for the while part. The result is that all variable modifications taking place in the while-subprocess won't be reflected in the primary Bash process. ShellCheck then recommended rewriting the loop so that no subprocess is spawned, as "while read -r var; do ...; done < <(some_command)". ShellCheck also suggested adding "-r" to "read"; otherwise, there could be an issue with backslashes in the loop data.</p> <p>Furthermore, ShellCheck recommended many more improvements. Declaration of unused variables and missing variable and string quotations were the most common ones. ShellCheck immensely helped to improve the robustness of the script.</p> <a class="textlink" href="https://shellcheck.net">https://shellcheck.net</a><br /> <h3>Unit testing</h3> <p>There is a basic unit test module in ./lib/assert.source.sh, which is used for unit testing. I found this to be very beneficial for cross-platform development. For example, I noticed that some unit tests failed on macOS while everything still worked fine on my Fedora Linux laptop. 
</p> <p>After digging a bit, I noticed that I had to install the GNU versions of the sed and grep commands on macOS and a newer version of the Bash to make all unit tests pass and Gemtexter work.</p> <p>It proved quite helpful to have unit tests in place for the HTML part already when working on the Markdown generator part. To test the Markdown part, I copied the HTML unit tests and changed the expected outcome in the assertions. This way, I could implement the Markdown generator in a test-driven way (writing the test first and afterwards the implementation).</p> <h3>HTML unit test example</h3> <pre> gemtext='=> http://example.org Description of the link' assert::equals "$(generate::make_link html "$gemtext")" \ '<a class="textlink" href="http://example.org">Description of the link</a><br />' </pre> <h3>Markdown unit test example</h3> <pre> gemtext='=> http://example.org Description of the link' assert::equals "$(generate::make_link md "$gemtext")" \ '[Description of the link](http://example.org) ' </pre> <h2>Handcrafted HTML styles</h2> <p>I had a look at some off-the-shelf CSS styles, but they all seemed too bloated. There is a whole industry selling CSS styles on the interweb. I preferred an effortless and minimalist style for the HTML site. So I handcrafted the Cascading Style Sheets manually with love and included them in the HTML header template. </p> <p>For now, I have to re-generate all HTML files whenever the CSS changes. That should not be an issue now, but I might move the CSS into a separate file one day.</p> <p>It's worth mentioning that all generated HTML files and Atom feeds pass the W3C validation tests.</p> <h2>Configurability</h2> <p>In case someone other than me wants to use Gemtexter for their own site, it is fairly configurable. It is possible to specify your own configuration file and your own HTML templates. 
Have a look at the GitHub page for examples.</p> <h2>Future features</h2> <p>I could think of the following features added to a future version of Gemtexter:</p> <ul> <li>Templating of Gemtext files so that the .gmi files are generated from .gmi.tpl files. The template engine could do such things as an automatic table of contents and sitemap generation. It could also include the output of inlined shell code, e.g. a fortune quote. </li> <li>Add support for more output formats, such as Groff, PDF, plain text, Gopher, etc.</li> <li>External CSS file for HTML.</li> <li>Improve speed by introducing parallelism and/or concurrency and/or better caching.</li> </ul> <h2>Conclusion</h2> <p>It was quite a lot of fun writing Gemtexter. It's a relatively small project, but given that I worked on that in my spare time once in a while, it kept me busy for several weeks. </p> <p>I finally revamped my personal internet site and started to blog again. I wanted the result to be exactly how it is now: A slightly retro-inspired internet site built for fun with unconventional tools. </p> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Personal Bash coding style guide</title> <link href="gemini://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi" /> <id>gemini://buetow.org/gemfeed/2021-05-16-personal-bash-coding-style-guide.gmi</id> <updated>2021-05-16T14:51:57+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>Lately, I have been polishing and writing a lot of Bash code. Not that I never wrote a lot of Bash, but now as I also looked through the 'Google Shell Style Guide' I thought it is time to also write my own thoughts on that. I agree to that guide in most, but not in all points. . 
.....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Personal Bash coding style guide</h1> <pre> .---------------------------. /,--..---..---..---..---..--. `. //___||___||___||___||___||___\_| [j__ ######################## [_| \============================| .==| |"""||"""||"""||"""| |"""|| /======"---""---""---""---"=| =|| |____ []* ____ | ==|| // \\ // \\ |===|| hjw "\__/"---------------"\__/"-+---+' </pre> <p class="quote"><i>Published by Paul Buetow 2021-05-16</i></p> <p>Lately, I have been polishing and writing a lot of Bash code. Not that I never wrote a lot of Bash, but now as I also looked through the Google Shell Style Guide, I thought it is time also to write my thoughts on that. I agree with that guide in most, but not in all points. </p> <a class="textlink" href="https://google.github.io/styleguide/shellguide.html">Google Shell Style Guide</a><br /> <h2>My modifications</h2> <p>These are my modifications to the Google Guide.</p> <h3>Shebang</h3> <p>Google recommends using always...</p> <pre> #!/bin/bash </pre> <p>... as the shebang line, but that does not work on all Unix and Unix-like operating systems (e.g., the *BSDs don't have Bash installed to /bin/bash). Better is:</p> <pre> #!/usr/bin/env bash </pre> <h3>Two space soft-tabs indentation</h3> <p>I know there have been many tab- and soft-tab wars on this planet. Google recommends using two space soft-tabs for Bash scripts. </p> <p>I don't care if I use two or four space indentations. I agree, however, that we should not use tabs. I tend to use four-space soft-tabs as that's how I currently configured Vim for any programming language. What matters most, though, is consistency within the same script/project.</p> <p>Google also recommends limiting the line length to 80 characters. For some people, that seems to be an old habit from the '80s, where all computer terminals couldn't display longer lines. 
But I think that the 80 character mark is still a good practice, at least for shell scripts. For example, I am often writing code on a Microsoft Go Tablet PC (running Linux, of course), and it is convenient if the lines are not too long due to the relatively small display on the device.</p> <p>I hit the 80 character line length quicker with the four spaces than with two spaces, but that makes me refactor the Bash code more aggressively, which is a good thing. </p> <h3>Breaking long pipes</h3> <p>Google recommends breaking up long pipes like this:</p> <pre> # All fits on one line command1 | command2 # Long commands command1 \ | command2 \ | command3 \ | command4 </pre> <p>I think there is a better way like the following, which is less noisy. A trailing pipe | already tells Bash that another command is expected, thus making the explicit line continuations with \ unnecessary:</p> <pre> # Long commands command1 | command2 | command3 | command4 </pre> <h3>Quoting your variables</h3> <p>Google recommends always quoting your variables. Generally, I would do that only for variables where you are unsure about the content/values of the variables (e.g., content is from an external input source and may contain whitespace or other special characters). In my opinion, the code will become quite noisy when you always quote your variables like this:</p> <pre> greet () { local -r greeting="${1}" local -r name="${2}" echo "${greeting} ${name}!" } </pre> <p>In this particular example, I agree that you should quote them as you don't know the input (are there, for example, whitespace characters?). But if you are sure that you are only using simple bare words, then I think that the code looks much cleaner when you do this instead:</p> <pre> say_hello_to_paul () { local -r greeting=Hello local -r name=Paul echo "$greeting $name!" } </pre> <p>You see, I also omitted the curly braces { } around the variables. 
I only use the curly braces around variables when it makes the code either easier/clearer to read or if it is necessary to use them:</p> <pre> declare FOO=bar # Curly braces around FOO are necessary echo "foo${FOO}baz" </pre> <p>A few more words on always quoting the variables: For the sake of consistency (and for making ShellCheck happy), I am not against quoting everything I encounter. I also think that the larger the Bash script becomes, the more critical it becomes to always quote variables. That's because it becomes more likely that you won't remember that some of the functions don't work on values with spaces in them, for example. It's just that I won't quote everything in every small script I write. </p> <h3>Prefer built-in commands over external commands</h3> <p>Google recommends using the built-in commands over available external commands where possible:</p> <pre> # Prefer this: addition=$(( X + Y )) substitution="${string/#foo/bar}" # Instead of this: addition="$(expr "${X}" + "${Y}")" substitution="$(echo "${string}" | sed -e 's/^foo/bar/')" </pre> <p>I can't entirely agree here. The external commands (especially sed) are much more sophisticated and powerful than the built-in Bash versions. Sed can do much more than the Bash can ever do by itself when it comes to text manipulation (the name "sed" stands for stream editor, after all).</p> <p>I prefer to do light text processing with the Bash built-ins and more complicated text processing with external programs such as sed, grep, awk, cut, and tr. However, there is also medium-light text processing where I would want to use external programs. That is because I remember how to use them better than the Bash built-ins. 
The Bash can get relatively obscure here (even Perl will be more readable then - Side note: I love Perl).</p> <p>Also, you will want to use an external command for floating-point calculation (e.g., bc) instead of using the Bash built-ins (worth noting that ZSH supports built-in floating-points).</p> <p>I haven't even started on what you can do with awk (especially GNU Awk), a fully-fledged programming language. Tiny Awk snippets tend to be used quite often in Shell scripts without honouring the real power of Awk. But if you did everything in Perl or Awk or another scripting language, then it wouldn't be a Bash script anymore, would it? ;-)</p> <h2>My additions</h2> <h3>Use of 'yes' and 'no'</h3> <p>Bash does not support a boolean type. I tend to just use the strings 'yes' and 'no' here. I used 0 for false and 1 for true for some time, but I think that the yes/no strings are easier to read. Yes, the Bash script would need to perform string comparisons on every check, but if performance is crucial to you, you wouldn't want to use a Bash script anyway, correct?</p> <pre> declare -r SUGAR_FREE=yes declare -r I_NEED_THE_BUZZ=no buy_soda () { local -r sugar_free=$1 if [[ $sugar_free == yes ]]; then echo 'Diet Dr. Pepper' else echo 'Pepsi Coke' fi } buy_soda $I_NEED_THE_BUZZ </pre> <h3>Non-evil alternative to variable assignments via eval</h3> <p>Google is of the opinion that eval should be avoided. I think so too. They list these examples in their guide:</p> <pre> # What does this set? # Did it succeed? In part or whole? eval $(set_my_variables) # What happens if one of the returned values has a space in it? variable="$(eval some_function)" </pre> <p>However, if I want to read variables from another file, I don't have to use eval here. 
I only have to source the file:</p> <pre> % cat vars.source.sh declare foo=bar declare bar=baz declare bay=foo % bash -c 'source vars.source.sh; echo $foo $bar $bay' bar baz foo </pre> <p>And suppose I want to assign variables dynamically. In that case, I could just run an external script and source its output (This is how you could do metaprogramming in Bash without the use of eval - write code which produces code for immediate execution):</p> <pre> % cat vars.sh #!/usr/bin/env bash cat <<END declare date="$(date)" declare user=$USER END % bash -c 'source <(./vars.sh); echo "Hello $user, it is $date"' Hello paul, it is Sat 15 May 19:21:12 BST 2021 </pre> <p>The downside is that ShellCheck won't be able to follow the dynamic sourcing anymore.</p> <h3>Prefer pipes over arrays for list processing</h3> <p>When I do list processing in Bash, I prefer to use pipes. You can chain them through Bash functions as well, which is pretty neat. Usually, my list processing scripts are of a structure like this:</p> <pre> filter_lines () { echo 'Start filtering lines in a fancy way!' >&2 grep ... | sed .... } process_lines () { echo 'Start processing line by line!' >&2 while read -r line; do ... do something and produce a result... echo "$result" done } # Do some post-processing of the data postprocess_lines () { echo 'Start removing duplicates!' >&2 sort -u } generate_report () { echo 'My boss wants to have a report!' >&2 tee outfile.txt wc -l outfile.txt } main () { filter_lines | process_lines | postprocess_lines | generate_report } main </pre> <p>The stdout is always passed as a pipe to the next stage. The stderr is used for info logging.</p> <h3>Assign-then-shift</h3> <p>I often refactor existing Bash code. That leads me to adding and removing function arguments quite often. It's pretty repetitive work changing the $1, $2.... 
function argument numbers every time you change the order or add/remove possible arguments.</p> <p>The solution is to use the "assign-then-shift" method, which goes like this: "local -r var1=$1; shift; local -r var2=$1; shift". The idea is that you only use "$1" to assign function arguments to named (more readable) local function variables. You will never have to bother about "$2" or above. That is very useful when you constantly refactor your code and remove or add function arguments. It's something that I picked up from a colleague (a pure Bash wizard) some time ago:</p> <pre> some_function () { local -r param_foo="$1"; shift local -r param_baz="$1"; shift local -r param_bay="$1"; shift ... } </pre> <p>Want to add a param_bar? Just do this:</p> <pre> some_function () { local -r param_foo="$1"; shift local -r param_bar="$1"; shift local -r param_baz="$1"; shift local -r param_bay="$1"; shift ... } </pre> <p>Want to remove param_foo? Nothing easier than that:</p> <pre> some_function () { local -r param_bar="$1"; shift local -r param_baz="$1"; shift local -r param_bay="$1"; shift ... } </pre> <p>As you can see, I didn't need to change any other assignments within the function. Of course, you would also need to change the function argument lists everywhere the function is invoked - you would do that within the same refactoring session.</p> <h3>Paranoid mode</h3> <p>I call this the paranoid mode. Bash will stop executing when a command exits with a status not equal to 0:</p> <pre> set -e grep -q foo <<< bar echo Jo </pre> <p>Here 'Jo' will never be printed out as the grep didn't find any match. It's unrealistic for most scripts to run purely in paranoid mode, so there must be a way to add exceptions. Critical Bash scripts of mine tend to look like this:</p> <pre> #!/usr/bin/env bash set -e some_function () { .. some critical code ... set +e # Grep might fail, but that's OK now grep .... local -i ec=$? set -e .. critical code continues ... 
if [[ $ec -ne 0 ]]; then ... fi ... } </pre> <h2>Learned</h2> <p>There are also a couple of things I've learned from Google's guide.</p> <h3>Unintended lexicographical comparison.</h3> <p>The following looks like valid Bash code:</p> <pre> if [[ "${my_var}" > 3 ]]; then # True for 4, false for 22. do_something fi </pre> <p>... but it is probably an unintended lexicographical comparison. A correct way would be:</p> <pre> if (( my_var > 3 )); then do_something fi </pre> <p>or</p> <pre> if [[ "${my_var}" -gt 3 ]]; then do_something fi </pre> <h3>PIPESTATUS</h3> <p>I have never used the PIPESTATUS variable before. I knew that it's there, but until now I never bothered to thoroughly understand how it works.</p> <p>The PIPESTATUS variable in Bash allows checking of the return code from all parts of a pipe. If it's only necessary to check the success or failure of the whole pipe, then the following is acceptable:</p> <pre> tar -cf - ./* | ( cd "${dir}" && tar -xf - ) if (( PIPESTATUS[0] != 0 || PIPESTATUS[1] != 0 )); then echo "Unable to tar files to ${dir}" >&2 fi </pre> <p>However, as PIPESTATUS will be overwritten as soon as you do any other command, if you need to act differently on errors based on where it happened in the pipe, you'll need to assign PIPESTATUS to another variable immediately after running the command (don't forget that [ is a command and will wipe out PIPESTATUS).</p> <pre> tar -cf - ./* | ( cd "${DIR}" && tar -xf - ) return_codes=( "${PIPESTATUS[@]}" ) if (( return_codes[0] != 0 )); then do_something fi if (( return_codes[1] != 0 )); then do_something_else fi </pre> <h2>Use common sense and BE CONSISTENT.</h2> <p>The following two paragraphs are quoted verbatim from the Google guidelines. But they hit the nail on the head:</p> <p class="quote"><i>If you are editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around their if clauses, you should, too. 
If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too.</i></p> <p class="quote"><i>The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying rather than on how you are saying it. We present global style rules here, so people know the vocabulary. But local style is also important. If the code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. Try to avoid this.</i></p> <h2>Advanced Bash learning pro tip</h2> <p>I also highly recommend having a read through the "Advanced Bash-Scripting Guide" (not from Google). I use it as the universal Bash reference and learn something new every time I look at it.</p> <a class="textlink" href="https://tldp.org/LDP/abs/html/">Advanced Bash-Scripting Guide</a><br /> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Welcome to the Geminispace</title> <link href="gemini://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace.gmi" /> <id>gemini://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace.gmi</id> <updated>2021-04-24T19:28:41+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>Have you reached this article already via Gemini? You need a special client for that, web browsers such as Firefox, Chrome, Safari etc. don't support the Gemini protocol. The Gemini address of this site (or the address of this capsule as people say in Geminispace) is: ... to read on visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Welcome to the Geminispace</h1> <p class="quote"><i>Published by Paul Buetow 2021-04-24, last updated 2021-06-18, ASCII Art by Andy Hood</i></p> <p>Have you reached this article already via Gemini? 
It requires a Gemini client; web browsers such as Firefox, Chrome, Safari, etc., don't support the Gemini protocol. The Gemini address of this site (or the address of this capsule as people say in Geminispace) is:</p> <a class="textlink" href="gemini://buetow.org">gemini://buetow.org</a><br /> <p>However, if you still use HTTP, you are just surfing the fallback HTML version of this capsule. In that case, I suggest reading on to find out what this is all about :-).</p> <pre> /\ / \ | | |NASA| | | | | | | ' ` |Gemini| | | |______| '-`'-` . / . \'\ . .' ''( .'\.' ' .;' '.;.;' ;'.;' ..;;' AsH </pre> <h2>Motivation</h2> <h3>My urge to revamp my personal website</h3> <p>For some time, I had the urge to revamp my personal website. Not to update the technology and its design but to update all the content (+ keep it current) and start a small tech blog again. So unconsciously, I began to search for an excellent platform to do all of that in a KISS (keep it simple & stupid) way.</p> <h3>My still great Laptop running hot</h3> <p>Earlier this year (2021), I noticed that my almost seven-year-old but still great Laptop started to become hot and slowed down while surfing the web. Also, the Laptop's fan became quite noisy. This was all due to the bloat on the website I was visiting, such as JavaScript, excessive use of CSS, tracking cookies and pixels, ads, and so on. </p> <p>All I wanted was to read an interesting article, but after a big advertising pop-up banner appeared and made everything worse, I gave up and closed the browser tab.</p> <h2>Discovering the Gemini internet protocol</h2> <p>Around the same time, I discovered a relatively new, more lightweight protocol named Gemini, which does not support all these CPU-intensive features like HTML, JavaScript, and CSS. Also, tracking and ads are unsupported by the Gemini protocol.</p> <p>The "downside" is that due to the limited capabilities of the Gemini protocol, all sites look very old and spartan. 
But that is not a downside; that is, in fact, a design choice people made. It is up to the client software how your capsule looks. For example, you could use a graphical client, such as Lagrange, with nice font renderings and colours to improve the appearance. Or you could use a very minimalistic command line black-and-white Gemini client. It's your (the user's) choice.</p> <a href="https://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace/amfora-screenshot.png"><img alt="Screenshot Amfora Gemini terminal client surfing this site" title="Screenshot Amfora Gemini terminal client surfing this site" src="https://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace/amfora-screenshot.png" /></a><br /> <a href="https://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace/lagrange-screenshot.png"><img alt="Screenshot graphical Lagrange Gemini client surfing this site" title="Screenshot graphical Lagrange Gemini client surfing this site" src="https://buetow.org/gemfeed/2021-04-24-welcome-to-the-geminispace/lagrange-screenshot.png" /></a><br /> <p>Why is there a need for a new protocol? As the modern web is a superset of Gemini, can't we use simple HTML 1.0 instead? That's a good and valid question. It is not a technical problem but a human problem. We tend to abuse the features once they are available. You can ensure that things stay efficient and straightforward as long as you are using the Gemini protocol. On the other hand, you can't force every website on the modern web to only create plain and straightforward-looking HTML pages.</p> <h2>My own Gemini capsule</h2> <p>As it is effortless to set up and maintain your own Gemini capsule (Gemini server + content composed via the Gemtext markup language), I decided to create my own. What I like about Gemini is that I can use my favourite text editor and get typing. I don't need to worry about the style and design of the presence, and I also don't have to test anything in ten different web browsers. 
I can only focus on the content! As a matter of fact, I am using the Vim editor + its spellchecker + auto word completion functionality to write this. </p> <p>This site was generated with Gemtexter. You can read more about it here:</p> <a class="textlink" href="https://buetow.org/gemfeed/2021-06-05-gemtexter-one-bash-script-to-rule-it-all.html">Gemtexter - One Bash script to rule it all</a><br /> <h2>Gemini advantages summarised</h2> <ul> <li>Supports an alternative to the modern bloated web</li> <li>Easy to operate and easy to write content</li> <li>No need to worry about various web browser compatibilities</li> <li>It's the client's responsibility how the content is designed+presented</li> <li>Lightweight (although not as lightweight as the Gopher protocol)</li> <li>Supports privacy (no cookies, no request header fingerprinting, TLS encryption)</li> <li>Fun to play with (it's a bit geeky, yes, but a lot of fun!)</li> </ul> <h2>Dive into deep Gemini space</h2> <p>Check out one of the following links for more information about Gemini. For example, you will find a FAQ that explains why the protocol is named Gemini. Many Gemini capsules are dual-hosted via Gemini and HTTP(S) so that people new to Gemini can sneak peek at the content with a regular web browser. 
Some people go as far as tri-hosting all their content via HTTP(S), Gemini and Gopher.</p> <a class="textlink" href="gemini://gemini.circumlunar.space">gemini://gemini.circumlunar.space</a><br /> <a class="textlink" href="https://gemini.circumlunar.space">https://gemini.circumlunar.space</a><br /> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>DTail - The distributed log tail program</title> <link href="gemini://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program.gmi" /> <id>gemini://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program.gmi</id> <updated>2021-04-22T19:28:41+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>This article first appeared at the Mimecast Engineering Blog but I made it available here in my personal Gemini capsule too. ...to read on visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>DTail - The distributed log tail program</h1> <p class="quote"><i>Published by Paul Buetow 2021-04-22, last updated 2021-04-26</i></p> <a href="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program/title.png"><img alt="DTail logo image" title="DTail logo image" src="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program/title.png" /></a><br /> <p>This article first appeared at the Mimecast Engineering Blog but I made it available here in my personal internet site too.</p> <a class="textlink" href="https://medium.com/mimecast-engineering/dtail-the-distributed-log-tail-program-79b8087904bb">Original Mimecast Engineering Blog post at Medium</a><br /> <p>Running a large cloud-based service requires monitoring the state of huge numbers of machines, a task for which many standard UNIX tools were not really designed. 
In this post, I will describe a simple program, DTail, that Mimecast has built and released as Open-Source, which enables us to monitor log files of many servers at once without the costly overhead of a full-blown log management system.</p> <p>At Mimecast, we run over 10 thousand server boxes. Most of them host multiple microservices and each of them produces log files. Even with the use of time series databases and monitoring systems, raw application logs are still an important source of information when it comes to analysing, debugging, and troubleshooting services.</p> <p>Every engineer familiar with UNIX or a UNIX-like platform (e.g., Linux) is well aware of tail, a command-line program for displaying a text file content on the terminal which is also especially useful for following application or system log files with tail -f logfile.</p> <p>Think of DTail as a distributed version of the tail program which is very useful when you have a distributed application running on many servers. DTail is an Open-Source, cross-platform, fairly easy to use, support and maintain log file analysis & statistics gathering tool designed for Engineers and Systems Administrators. It is programmed in Google Go.</p> <h2>A Mimecast Pet Project</h2> <p>DTail got its inspiration from public domain tools available already in this area but it is a blue sky from-scratch development which was first presented at Mimecast’s annual internal Pet Project competition (awarded with a Bronze prize). It has gained popularity since and is one of the most widely deployed DevOps tools at Mimecast (reaching nearly 10k server installations) and many engineers use it on a regular basis. The Open-Source version of DTail is available at:</p> <a class="textlink" href="https://dtail.dev">https://dtail.dev</a><br /> <p>Try it out — We would love any feedback. But first, read on…</p> <h2>Differentiating from log management systems</h2> <p>Why not just use a full-blown log management system? 
There are various Open-Source and commercial log management solutions available on the market you could choose from (e.g. the ELK stack). Most of them store the logs in a centralized location and are fairly complex to set up and operate. Possibly they are also pretty expensive to operate if you have to buy dedicated hardware (or pay fees to your cloud provider) and have to hire support staff for it.</p> <p>DTail does not aim to replace any of the log management tools already available but is rather an additional tool crafted especially for ad-hoc debugging and troubleshooting purposes. DTail is cheap to operate: it does not require any dedicated hardware for log storage because it operates directly on the source of the logs. This means that there is a DTail server installed on all server boxes producing logs. This decentralized approach comes with the direct advantage that no delay is introduced, because the logs are not shipped to a central log storage device. The reduced complexity also makes it more robust against outages. You won’t be able to troubleshoot your distributed application very well if the log management infrastructure isn’t working either.</p> <a href="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program/dtail.gif"><img alt="DTail sample session animated gif" title="DTail sample session animated gif" src="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program/dtail.gif" /></a><br /> <p>As a downside, you won’t be able to access any logs with DTail when the server is down. Furthermore, a server can store logs only up to a certain capacity as disks will fill up. For the purpose of ad-hoc debugging, these are not typically issues. Usually, it’s the application you want to debug and not the server. And disk space is rarely an issue for bare metal and VM-based systems these days, with sufficient space for several weeks’ worth of log storage being available. DTail also supports reading compressed logs. 
The currently supported compression algorithms are gzip and zstd.</p> <h2>Combining simplicity, security and efficiency</h2> <p>DTail also has a client component that connects to multiple servers concurrently to read log files (or any other text files).</p> <p>The DTail client interacts with a DTail server on port TCP/2222 via SSH protocol and does not interact in any way with the system’s SSH server (e.g., OpenSSH Server) which might be running at port TCP/22 already. As a matter of fact, you don’t need a regular SSH server running for DTail at all. There is no support for interactive login shells at TCP/2222 either, as by design that port can only be used for text data streaming. The SSH protocol is used for the public/private key infrastructure and transport encryption only and DTail implements its own protocol on top of SSH for the features provided. There is no need to set up or buy any additional TLS certificates. The port 2222 can be easily reconfigured if you prefer to use a different one.</p> <p>The DTail server, which is a single static binary, will not fork an external process. This means that all features are implemented in native Go code (exception: Linux ACL support is implemented in C, but it must be enabled explicitly at compile time), which helps to make it robust, secure, efficient, and easy to deploy. A single client, running on a standard Laptop, can connect to thousands of servers concurrently while still maintaining a small resource footprint.</p> <p>Recent log files are very likely still in the file system caches on the servers. 
Therefore, there tends to be a minimal I/O overhead involved.</p> <h2>The DTail family of commands</h2> <p>Following the UNIX philosophy, DTail includes multiple command-line tools, each for a different purpose:</p> <ul> <li>dserver: The DTail server, the only binary required to be installed on the servers involved.</li> <li>dtail: The distributed log tail client for following log files.</li> <li>dcat: The distributed cat client for concatenating and displaying text files.</li> <li>dgrep: The distributed grep client for searching text files for a regular expression pattern.</li> <li>dmap: The distributed map-reduce client for aggregating stats from log files.</li> </ul> <a href="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program/dgrep.gif"><img alt="DGrep sample session animated gif" title="DGrep sample session animated gif" src="https://buetow.org/gemfeed/2021-04-22-dtail-the-distributed-log-tail-program/dgrep.gif" /></a><br /> <h2>Usage example</h2> <p>The use of these commands is almost self-explanatory for a person already used to the standard command line in Unix systems. One of the main goals is to make DTail easy to use. A tool that is too complicated to use under high-pressure scenarios (e.g., during an incident) can be quite detrimental.</p> <p>The basic idea is to start one of the clients from the command line and provide a list of servers to connect to with --servers. You also must provide a path of remote (log) files via --files. If you want to process multiple files per server, you could either provide a comma-separated list of file paths or make use of file system globbing (or a combination of both).</p> <p>The following example would connect to all DTail servers listed in the serverlist.txt, follow all files with the ending .log and filter for lines containing the string error. You can specify any Go compatible regular expression. 
In this example we add the case-insensitive flag to the regex:</p> <pre> dtail --servers serverlist.txt --files '/var/log/*.log' --regex '(?i:error)' </pre> <p>You usually want to specify a regular expression as a client argument. This means that responses are pre-filtered for all matching lines on the server-side, so only the relevant lines are sent back to the client. If your logs are growing very rapidly and the regex is not specific enough, there is a chance that your client is not fast enough to keep up with processing all of the responses. This could be due to a network bottleneck or something as simple as a slow terminal emulator displaying the log lines on the client-side.</p> <p>A green 100 in the client output before each log line received from the server always indicates that there were no such problems and 100% of all log lines could be displayed on your terminal (have a look at the animated GIFs in this post). If the percentage falls below 100 it means that some of the channels used by the servers to send data to the client are congested and lines were dropped. In this case, the color will change from green to red. The user then could decide to run the same query but with a more specific regex.</p> <p>You could also provide a comma-separated list of servers as opposed to a text file. There are many more options you could use. The ones listed here are just the very basic ones. There are more instructions and usage examples on the GitHub page. Also, you can study even more of the available options via the --help switch (some real treasures might be hidden there).</p> <h2>Fitting it in</h2> <p>DTail integrates nicely into the user management of existing infrastructure. It follows normal system permissions and does not open new “holes” on the server which helps to keep security departments happy. The user would not have more or less file read permissions than he would have via a regular SSH login shell. 
There is a full SSH key, traditional UNIX permissions, and Linux ACL support. There is also a very low resource footprint involved. On average for tailing and searching log files less than 100MB RAM and less than a quarter of a CPU core per participating server are required. Complex map-reduce queries on big data sets will require more resources accordingly.</p> <h2>Advanced features</h2> <p>The features listed here are out of the scope of this blog post but are worthwhile to mention:</p> <ul> <li>Distributed map-reduce queries on stats provided in log files with dmap. dmap comes with its own SQL-like aggregation query language.</li> <li>Stats streaming with continuous map-reduce queries. The difference to normal queries is that the stats are aggregated over a specified interval only on the newly written log lines. Thus, giving a de-facto live stat view for each interval.</li> <li>Server-side scheduled queries on log files. The queries are configured in the DTail server configuration file and scheduled at certain time intervals. Results are written to CSV files. This is useful for generating daily stats from the log files without the need for an interactive client.</li> <li>Server-side stats streaming with continuous map-reduce queries. This for example can be used to periodically generate stats from the logs at a configured interval, e.g., log error counts by the minute. These then can be sent to a time-series database (e.g., Graphite) and then plotted in a Grafana dashboard.</li> <li>Support for custom extensions. E.g., for different server discovery methods (so you don’t have to rely on plain server lists) and log file formats (so that map-reduce queries can parse more stats from the logs).</li> </ul> <h2>For the future</h2> <p>There are various features we want to see in the future.</p> <ul> <li>A spartan mode, not printing out any extra information but the raw remote log files would be a nice feature to have. 
This will make it easier to post-process the data produced by the DTail client with common UNIX tools. (To some degree this is possible already, just disable the ANSI terminal color output of the client with -noColors and pipe the output to another program).</li> <li>Tempting would be implementing the dgoawk command, a distributed version of the AWK programming language purely implemented in Go, for advanced text data stream processing capabilities. There are 3rd party libraries available implementing AWK in pure Go which could be used.</li> <li>A more complex change would be the support of federated queries. You can connect to thousands of servers from a single client running on a laptop. But does it scale to 100k of servers? Some of the servers could be used as middleware for connecting to even more servers.</li> <li>Another aspect is to extend the documentation. Especially the advanced features such as map-reduce query language and how to configure the server-side queries currently do require more documentation. For now, you can read the code, sample config files or just ask the author for that! But this will be certainly addressed in the future.</li> </ul> <h2>Open Source</h2> <p>Mimecast highly encourages you to have a look at DTail and submit an issue for any features you would like to see. Have you found a bug? Maybe you just have a question or comment? If you want to go a step further: We would also love to see pull requests for any features or improvements. 
Either way, if in doubt just contact us via the DTail GitHub page.</p> <a class="textlink" href="https://dtail.dev">https://dtail.dev</a><br /> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Realistic load testing with I/O Riot for Linux</title> <link href="gemini://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux.gmi" /> <id>gemini://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux.gmi</id> <updated>2018-06-01T14:50:29+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>This text was first published in the German IT-Administrator computer Magazine. 3 years have passed since and I decided to publish it on my blog too. . .....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Realistic load testing with I/O Riot for Linux</h1> <pre> .---. / \ \.@-@./ /`\_/`\ // _ \\ | \ )|_ /`\_`> <_/ \ jgs\__/'---'\__/ </pre> <p class="quote"><i>Published by Paul Buetow 2018-06-01, last updated 2021-05-08</i></p> <h2>Foreword</h2> <p>This text was first published in the German IT-Administrator computer Magazine. 3 years have passed since and I decided to publish it on my blog too. </p> <a class="textlink" href="https://www.admin-magazin.de/Das-Heft/2018/06/Realistische-Lasttests-mit-I-O-Riot">https://www.admin-magazin.de/Das-Heft/2018/06/Realistische-Lasttests-mit-I-O-Riot</a><br /> <p>I haven't worked on I/O Riot for some time now, but everything written here is still valid. I am still using I/O Riot to debug I/O issues and patterns once in a while, so the tool is by no means obsolete yet. 
The tool even helped to resolve a major production incident at work caused by disk I/O.</p> <p>I am eagerly looking forward to revamping I/O Riot so that it uses the new BPF Linux capabilities instead of plain old Systemtap (alternatively, newer versions of Systemtap can also use BPF as the backend, as I have learned). Also, when I wrote I/O Riot initially, I didn't have any experience with the Go programming language yet and therefore I wrote it in C. Once it gets revamped I might consider using Go instead of C as it would spare me from many segmentation faults and headaches during development ;-). I might also just stick to C for plain performance reasons and just refactor the code dealing with concurrency.</p> <p>Please note that some of the screenshots show the command "ioreplay" instead of "ioriot". That's because the name was changed after they were taken.</p> <h1>The article</h1> <p>With I/O Riot IT administrators can load test and optimize the I/O subsystem of Linux-based operating systems. The tool makes it possible to record I/O patterns and replay them at a later time as often as desired. This means bottlenecks can be reproduced and eradicated. </p> <p>When storing huge amounts of data, such as more than 200 billion archived emails at Mimecast, it's not only the available storage capacity that matters, but also the data throughput and latency. At the same time, operating costs must be kept as low as possible. The more systems involved, the more important it is to optimize the hardware, the operating system and the applications running on it.</p> <h2>Background: Existing Techniques</h2> <p>Conventional I/O benchmarking: Administrators usually use open source benchmarking tools like IOZone and bonnie++. Available database systems such as Redis and MySQL come with their own benchmarking tools. The common problem with these tools is that they work with prescribed artificial I/O patterns. 
Although this can test both sequential and randomized data access, the patterns do not correspond to what can be found on production systems.</p> <p>Testing by load test environment: Another option is to use a separate load test environment in which, as far as possible, a production environment with all its dependencies is simulated. However, an environment consisting of many microservices is very complex. Microservices are usually managed by different teams, which means extra coordination effort for each load test. Another challenge is to generate the load as authentically as possible so that the patterns correspond to a productive environment. Such a load test environment can only handle as many requests as its weakest link can handle. For example, load generators send many read and write requests to a frontend microservice, whereby the frontend forwards the requests to a backend microservice responsible for storing the data. If the frontend service does not process the requests efficiently enough, the backend service is not well utilized in the first place. As a rule, all microservices are clustered across many servers, which makes everything even more complicated. Under all these conditions it is very difficult to test I/O of separate backend systems. Moreover, for many small and medium-sized companies, a separate load test environment would not be feasible for cost reasons.</p> <p>Testing in the production environment: For these reasons, benchmarks are often carried out in the production environment. In order to derive value from this such tests are especially performed during peak hours when systems are under high load. However, testing on production systems is associated with risks and can lead to failure or loss of data without adequate protection.</p> <h2>Benchmarking the Email Cloud at Mimecast</h2> <p>For email archiving, Mimecast uses an internally developed microservice, which is operated directly on Linux-based storage systems. 
A storage cluster is divided into several replication volumes. Data is always replicated three times across two secure data centers. Customer data is automatically allocated to one or more volumes, depending on throughput, so that all volumes are automatically assigned the same load. Customer data is archived on conventional, but inexpensive hard disks with several terabytes of storage capacity each. I/O benchmarking proved difficult for all the reasons mentioned above. Furthermore, there are no ready-made tools for this purpose in the case of self-developed software. The service operates on many block devices simultaneously, which can make the RAID controller a bottleneck. None of the freely available benchmarking tools can test several block devices at the same time without extra effort. In addition, emails typically consist of many small files. Randomized access to many small files is particularly inefficient. In addition to many software adaptations, the hardware and operating system must also be optimized.</p> <p>Mimecast encourages employees to be innovative and pursue their own ideas in the form of an internal competition, Pet Project. The goal of the pet project I/O Riot was to simplify OS and hardware level I/O benchmarking. The first prototype of I/O Riot was awarded an internal roadmap prize in the spring of 2017. A few months later, I/O Riot was used to reduce write latency in the storage clusters by about 50%. The improvement was first verified by I/O replay on a test system and then successively applied to all storage systems. I/O Riot was also used to resolve a production incident caused by disk I/O load.</p> <h2>Using I/O Riot</h2> <p>First, all I/O events are logged to a file on a production system with I/O Riot. It is then copied to a test system where all events are replayed in the same way. The crucial point here is that you can reproduce I/O patterns as they are found on a production system as often as you like on a test system. 
This makes it possible to adjust the system's tuning parameters after each run.</p> <h3>Installation</h3> <p>I/O Riot was tested under CentOS 7.2 x86_64. For compiling, the GNU C compiler and Systemtap including kernel debug information are required. Other Linux distributions are theoretically compatible but untested. First of all, you should update the systems involved as follows:</p> <pre> % sudo yum update </pre> <p>If the kernel is updated, please restart the system. The installation could be done without a restart, but that would complicate it. The installed kernel version should always correspond to the currently running kernel. You can then install I/O Riot as follows:</p> <pre> % sudo yum install gcc git systemtap yum-utils kernel-devel-$(uname -r) % sudo debuginfo-install kernel-$(uname -r) % git clone https://github.com/mimecast/ioriot % cd ioriot % make % sudo make install % export PATH=$PATH:/opt/ioriot/bin </pre> <p>Note: It is not best practice to install any compilers on production systems. For further information please have a look at the enclosed README.md.</p> <h3>Recording of I/O events</h3> <p>All I/O events are kernel related. If a process wants to perform an I/O operation, such as opening a file, it must inform the kernel of this by a system call (syscall for short). I/O Riot relies on the Systemtap tool to record I/O syscalls. Systemtap, available for all popular Linux distributions, lets you take a look at the running kernel in production environments, which makes it well suited to monitoring all I/O-relevant Linux syscalls and logging them to a file. Other tools, such as strace, are not an alternative because they slow down the system too much.</p> <p>During recording, ioriot acts as a wrapper and executes all relevant Systemtap commands for you. 
Use the following command to log all events to io.capture:</p> <pre> % sudo ioriot -c io.capture </pre> <a href="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure1-ioriot-io-recording.png"><img alt="Screenshot I/O recording" title="Screenshot I/O recording" src="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure1-ioriot-io-recording.png" /></a><br /> <p>A Ctrl-C (SIGINT) stops recording prematurely. Otherwise, ioriot terminates itself automatically after 1 hour. Depending on the system load, the output file can grow to several gigabytes. Only metadata is logged, not the read and written data itself. When replaying later, only random data is used. Under certain circumstances, Systemtap may omit some system calls and issue warnings. This is to ensure that Systemtap does not consume too many resources.</p> <h3>Test preparation</h3> <p>Then copy io.capture to a test system. The log also contains all accesses to the pseudo file systems devfs, sysfs and procfs. Replaying these makes little sense, which is why you must first generate a cleaned and playable version io.replay from io.capture as follows:</p> <pre> % sudo ioriot -c io.capture -r io.replay -u $USER -n TESTNAME </pre> <p>The parameter -n allows you to assign a freely selectable test name. An arbitrary system user under which the test is to be played is specified via parameter -u.</p> <h3>Test Initialization</h3> <p>The test will most likely want to access existing files. These are files the test wants to read but does not create by itself. The existence of these must be ensured before the test. You can do this as follows:</p> <pre> % sudo ioriot -i io.replay </pre> <p>To avoid any damage to the running system, ioriot only works in special directories. The tool creates a separate subdirectory for each file system mount point (e.g. /, /usr/local, /store/00,...) 
(here: /.ioriot/TESTNAME, /usr/local/.ioriot/TESTNAME, /store/00/.ioriot/TESTNAME,...). By default, the working directory of ioriot is /usr/local/ioriot/TESTNAME.</p> <a href="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure2-ioriot-test-preparation.png"><img alt="Screenshot test preparation" title="Screenshot test preparation" src="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure2-ioriot-test-preparation.png" /></a><br /> <p>You must re-initialize the environment before each run. Data from previous tests will be moved to a trash directory automatically, which can be permanently deleted with "sudo ioriot -P".</p> <h3>Replay</h3> <p>After initialization, you can replay the log with -r. You can use -R to initiate both test initialization and replay in a single command and -S can be used to specify a file in which statistics are written after the test run.</p> <p>You can also influence the playback speed: "-s 0" is interpreted as "Playback as fast as possible" and is the default setting. With "-s 1" all operations are performed at original speed. "-s 2" would double the playback speed and "-s 0.5" would halve it.</p> <a href="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure3-ioriot-replay.png"><img alt="Screenshot replaying I/O" title="Screenshot replaying I/O" src="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure3-ioriot-replay.png" /></a><br /> <p>As an initial test, for example, you could compare the two Linux I/O schedulers CFQ and Deadline and check with which scheduler the test runs fastest. To do so, run the test separately for each scheduler. The following shell loop iterates through all attached block devices of the system and changes their I/O scheduler to the one specified in variable $new_scheduler (in this case either cfq or deadline). 
Subsequently, all I/O events from the io.replay log are played back. At the end, an output file with statistics is generated:</p> <pre> % new_scheduler=cfq % for scheduler in /sys/block/*/queue/scheduler; do echo $new_scheduler | sudo tee $scheduler; done % sudo ioriot -R io.replay -S cfq.txt % new_scheduler=deadline % for scheduler in /sys/block/*/queue/scheduler; do echo $new_scheduler | sudo tee $scheduler; done % sudo ioriot -R io.replay -S deadline.txt </pre> <p>According to the results, the test ran 940 seconds faster with the Deadline scheduler:</p> <pre> % cat cfq.txt Num workers: 4 Threads per worker: 128 Total threads: 512 Highest loadavg: 259.29 Performed ioops: 218624596 Average ioops/s: 101544.17 Time ahead: 1452s Total time: 2153.00s % cat deadline.txt Num workers: 4 Threads per worker: 128 Total threads: 512 Highest loadavg: 342.45 Performed ioops: 218624596 Average ioops/s: 180234.62 Time ahead: 2392s Total time: 1213.00s </pre> <p>In any case, you should also set up a time series database, such as Graphite, where the I/O throughput can be plotted. Figures 4 and 5 show the read and write access times of both tests. The dip makes it clear when the CFQ test ended and the Deadline test was started. The read latency of both tests is similar. Write latency is dramatically improved using the Deadline scheduler.</p> <a href="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure4-ioriot-read-latency.png"><img alt="Graphite visualization of the mean read access times in ms with CFQ and Deadline Scheduler." title="Graphite visualization of the mean read access times in ms with CFQ and Deadline Scheduler." 
src="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure4-ioriot-read-latency.png" /></a><br /> <a href="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure5-ioriot-write-latency.png"><img alt="Graphite visualization of the average write access times in ms with CFQ and Deadline Scheduler." title="Graphite visualization of the average write access times in ms with CFQ and Deadline Scheduler." src="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure5-ioriot-write-latency.png" /></a><br /> <p>You should also take a look at the iostat tool. The iostat screenshot shows the output of iostat -x 10 during a test run. As you can see, one block device is fully loaded at 99% utilization, while all other block devices still have sufficient headroom. This could be an indication of poor data distribution in the storage system and is worth pursuing. It is not uncommon for I/O Riot to reveal software problems.</p> <a href="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure6-iostat.png"><img alt="Output of iostat. The block device sdy seems to be almost fully utilized by 99%." title="Output of iostat. The block device sdy seems to be almost fully utilized by 99%." src="https://buetow.org/gemfeed/2018-06-01-realistic-load-testing-with-ioriot-for-linux/figure6-iostat.png" /></a><br /> <h2>I/O Riot is Open Source</h2> <p>The tool has already proven to be very useful and will continue to be actively developed as time and priority permit. Mimecast intends to be an ongoing contributor to Open Source. You can find I/O Riot at:</p> <a class="textlink" href="https://github.com/mimecast/ioriot">https://github.com/mimecast/ioriot</a><br /> <h2>Systemtap</h2> <p>Systemtap is a tool for the instrumentation of the Linux kernel. The tool provides an AWK-like programming language. 
Programs written in it are compiled by Systemtap to C and then into a dynamically loadable kernel module. Loaded into the kernel, the program has access to Linux internals. A Systemtap program written for I/O Riot monitors which I/O syscalls take place, with which parameters, at which time, and from which process, along with their return values.</p> <p>For example, the open syscall opens a file and returns the corresponding file descriptor. The read and write syscalls can operate on a file descriptor and return the number of read or written bytes. The close syscall closes a given file descriptor. I/O Riot comes with a ready-made Systemtap program, which you have already compiled into a kernel module and installed to /opt/ioriot. In addition to open, read and close, it logs many other I/O-relevant calls.</p> <a class="textlink" href="https://sourceware.org/systemtap/">https://sourceware.org/systemtap/</a><br /> <h2>More references</h2> <a class="textlink" href="http://www.iozone.org/">IOZone</a><br /> <a class="textlink" href="https://www.coker.com.au/bonnie++/">Bonnie++</a><br /> <a class="textlink" href="https://graphiteapp.org">Graphite</a><br /> <a class="textlink" href="https://en.wikipedia.org/wiki/Memory-mapped_I/O">Memory mapped I/O</a><br /> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Methods in C</title> <link href="gemini://buetow.org/gemfeed/2016-11-20-methods-in-c.gmi" /> <id>gemini://buetow.org/gemfeed/2016-11-20-methods-in-c.gmi</id> <updated>2016-11-20T18:36:51+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>You can do some sort of object oriented programming in the C Programming Language. However, that is very limited. But also very easy and straight forward to use.. 
.....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Methods in C</h1> <p class="quote"><i>Published by Paul Buetow 2016-11-20</i></p> <p>You can do some sort of object-oriented programming in the C Programming Language. However, that is very limited. But also very easy and straightforward to use.</p> <h2>Example</h2> <p>Let's have a look at the following sample program. All you have to do is add a function pointer such as "calculate" to the definition of struct "something_s". Later, during the struct initialization, assign a function address to that function pointer:</p> <pre> #include <stdio.h> typedef struct { double (*calculate)(const double, const double); char *name; } something_s; double multiplication(const double a, const double b) { return a * b; } double division(const double a, const double b) { return a / b; } int main(void) { something_s mult = (something_s) { .calculate = multiplication, .name = "Multiplication" }; something_s div = (something_s) { .calculate = division, .name = "Division" }; const double a = 3, b = 2; printf("%s(%f, %f) => %f\n", mult.name, a, b, mult.calculate(a,b)); printf("%s(%f, %f) => %f\n", div.name, a, b, div.calculate(a,b)); } </pre> <p>As you can see, you can call the function (pointed to by the function pointer) the same way as in C++ or Java via:</p> <pre> printf("%s(%f, %f) => %f\n", mult.name, a, b, mult.calculate(a,b)); printf("%s(%f, %f) => %f\n", div.name, a, b, div.calculate(a,b)); </pre> <p>However, that's just syntactic sugar for:</p> <pre> printf("%s(%f, %f) => %f\n", mult.name, a, b, (*mult.calculate)(a,b)); printf("%s(%f, %f) => %f\n", div.name, a, b, (*div.calculate)(a,b)); </pre> <p>Output:</p> <pre> pbuetow ~/git/blog/source [38268]% gcc methods-in-c.c -o methods-in-c pbuetow ~/git/blog/source [38269]% ./methods-in-c Multiplication(3.000000, 2.000000) => 6.000000 Division(3.000000, 2.000000) => 1.500000 </pre> <p>Not complicated at all, but 
nice to know, and it helps to make the code easier to read!</p> <h2>The flaw</h2> <p>However, that's not really how it works in object-oriented languages such as Java and C++. The method call in this example is not a real method call, as "mult" and "div" are not "message receivers": the functions cannot access the state of the "mult" and "div" struct objects. In C, if you wanted to access the state of "mult" from within the calculate function, you would have to pass it as an argument:</p> <pre> mult.calculate(mult,a,b); </pre> <p>How to overcome this? You need to take it further.</p> <h2>Taking it further</h2> <p>If you want to take it further, type "Object-Oriented Programming with ANSI-C" into your favourite internet search engine, and you will find some crazy stuff. Some go as far as writing a C preprocessor in AWK, which takes some object-oriented pseudo-C and transforms it to plain C so that the C compiler can compile it to machine code. This is similar to how the C++ language had its origins.</p> <p>E-Mail me your thoughts at comments@mx.buetow.org!</p> </div> </content> </entry> <entry> <title>Spinning up my own authoritative DNS servers</title> <link href="gemini://buetow.org/gemfeed/2016-05-22-spinning-up-my-own-authoritative-dns-servers.gmi" /> <id>gemini://buetow.org/gemfeed/2016-05-22-spinning-up-my-own-authoritative-dns-servers.gmi</id> <updated>2016-05-22T18:59:01+01:00</updated> <author> <name>Paul Buetow</name> <email>comments@mx.buetow.org</email> </author> <summary>Finally, I had time to deploy my own authoritative DNS servers (master and slave) for my domains 'buetow.org' and 'buetow.zone'. My domain name provider is Schlund Technologies. They allow their customers to manually edit the DNS records (BIND files). And they also give you the opportunity to set your own authoritative DNS servers for your domains. From now I am making use of that option.. 
.....to read on please visit my site.</summary> <content type="xhtml"> <div xmlns="http://www.w3.org/1999/xhtml"> <h1>Spinning up my own authoritative DNS servers</h1> <p class="quote"><i>Published by Paul Buetow 2016-05-22</i></p> <h2>Background</h2> <p>Finally, I had time to deploy my own authoritative DNS servers (master and slave) for my domains "buetow.org" and "buetow.zone". My domain name provider is Schlund Technologies. They allow their customers to edit the DNS records (BIND files) manually. And they also allow you to set your own authoritative DNS servers for your domains. From now on, I am making use of that option.</p> <a class="textlink" href="http://www.schlundtech.de">Schlund Technologies</a><br /> <h2>All FreeBSD Jails</h2> <p>To set up my authoritative DNS servers, I used Puppet to install a FreeBSD Jail dedicated to DNS on my root machine as follows:</p> <pre> include freebsd freebsd::ipalias { '2a01:4f8:120:30e8::14': ensure => up, proto => 'inet6', preflen => '64', interface => 're0', aliasnum => '5', } include jail::freebsd class { 'jail': ensure => present, jails_config => { dns => { '_ensure' => present, '_type' => 'freebsd', '_mirror' => 'ftp://ftp.de.freebsd.org', '_remote_path' => 'FreeBSD/releases/amd64/10.1-RELEASE', '_dists' => [ 'base.txz', 'doc.txz', ], '_ensure_directories' => [ '/opt', '/opt/enc' ], 'host.hostname' => "'dns.ian.buetow.org'", 'ip4.addr' => '192.168.0.15', 'ip6.addr' => '2a01:4f8:120:30e8::15', }, . . } } </pre> <h2>PF firewall</h2> <p>Please note that "dns.ian.buetow.org" is just the Jail name of the master DNS server (and "caprica.ian.buetow.org" the name of the Jail for the slave DNS server) and that I am using the DNS names "dns1.buetow.org" (master) and "dns2.buetow.org" (slave) for the actual service names (these are the DNS servers visible to the public). Please also note that the IPv4 address is an internal one. I use PF for NAT and PAT. The DNS ports (TCP and UDP) are forwarded to that Jail. 
By default, all ports are blocked, so I am adding an exception rule for the IPv6 address. These are the PF rules in use:</p> <pre> % cat /etc/pf.conf . . # dns.ian.buetow.org rdr pass on re0 proto tcp from any to $pub_ip port {53} -> 192.168.0.15 rdr pass on re0 proto udp from any to $pub_ip port {53} -> 192.168.0.15 pass in on re0 inet6 proto tcp from any to 2a01:4f8:120:30e8::15 port {53} flags S/SA keep state pass in on re0 inet6 proto udp from any to 2a01:4f8:120:30e8::15 port {53} flags S/SA keep state . . </pre> <h2>Puppet managed BIND zone files</h2> <p>In "manifests/dns.pp" (the Puppet manifest for the master DNS Jail itself), I configured the BIND DNS server this way:</p> <pre> class { 'bind_freebsd': config => "puppet:///files/bind/named.${::hostname}.conf", dynamic_config => "puppet:///files/bind/dynamic.${::hostname}", } </pre> <p>The Puppet module is a pretty simple one. It installs the file "/usr/local/etc/namedb/named.conf" and populates the "/usr/local/etc/namedb/dynamic" directory with all my zone files.</p> <p>Once applied with Puppet inside the Jail, I get this:</p> <pre> paul uranus:~/git/blog/source [4268]% ssh admin@dns1.buetow.org.buetow.org pgrep -lf named 60748 /usr/local/sbin/named -u bind -c /usr/local/etc/namedb/named.conf paul uranus:~/git/blog/source [4269]% ssh admin@dns1.buetow.org.buetow.org tail -n 13 /usr/local/etc/namedb/named.conf zone "buetow.org" { type master; notify yes; allow-update { key "buetoworgkey"; }; file "/usr/local/etc/namedb/dynamic/buetow.org"; }; zone "buetow.zone" { type master; notify yes; allow-update { key "buetoworgkey"; }; file "/usr/local/etc/namedb/dynamic/buetow.zone"; }; paul uranus:~/git/blog/source [4277]% ssh admin@dns1.buetow.org.buetow.org cat /usr/local/etc/namedb/dynamic/buetow.org $TTL 3600 @ IN SOA dns1.buetow.org. domains.buetow.org. ( 25 ; Serial 604800 ; Refresh 86400 ; Retry 2419200 ; Expire 604800 ) ; Negative Cache TTL ; Infrastructure domains @ IN NS dns1 @ IN NS dns2