💾 Archived View for thrig.me › blog › 2023 › 05 › 05 › testing-new-and-strange-functions.gmi captured on 2024-08-18 at 17:52:09. Gemini links have been rewritten to link to archived content
strlcpy(dictionary->d_name, name, 31);
What does the above do, especially as one bumps up against the dstsize limit? 31 is too long for my feeble brain so we'll go with a shorter test string, say three characters. Also we want some means to see if the function writes past the end of the string, as that would be super bad. This suggests the use of a union, though in hindsight you could probably just work on a char[6] or whatever and see if any subsequent bytes got clobbered. I had a union elsewhere in the actual code, so went with one.
// cpy1 - test strlcpy
#include <stdio.h>
#include <string.h>

#define MAXLEN 3

union {
    char buf[MAXLEN];
    char overlay[MAXLEN * 2];
} bar;

int main(void) {
    char *toolong = "verymuchtoolongofaninputstring";
    for (int i = 0; i < MAXLEN * 2; i++) {
        bar.overlay[i] = 'x';
    }
    strlcpy(bar.buf, toolong, MAXLEN);
    fprintf(stderr, "%s\n%s\n", bar.buf, bar.overlay);
}
This is a bad test, as it does not show whether something bad happened to the overlay past the end of the shorter buf: %s stops printing at the first '\0', so both lines come out as "ve". However, it does reveal that a MAXLEN of three results in only "ve" in bar.buf, when actually we do want up to three characters and no '\0' termination; for this application there is a string length being stored somewhere else so the "string" doesn't actually need NUL termination, and we want to make use of all the bytes available.
A debugger might actually be useful here (I rarely reach for the debugger) to inspect the contents of the overlay.
$ CC=egcc CFLAGS=-g3 make cpy2
egcc -g3 -o cpy2 cpy2.c
$ bt
Reading symbols from cpy2...
(gdb) b 19
Breakpoint 1 at 0x1a91: file cpy2.c, line 19.
(gdb) r
Starting program: /tmp/cpy2

Breakpoint 1, main () at cpy2.c:19
19          fprintf(stderr, "%s\n%s\n", bar.buf, bar.overlay);
(gdb) x/6c bar.overlay
0x7a509b3e18 <bar>:     118 'v' 101 'e' 114 'r' 0 '\000' 120 'x' 120 'x'
bt(1) is a script that finds the most recent *.core or otherwise executable in the current directory and loads said up in a debugger. It's not portable to Linux for various reasons, so you should write your own version thereof if interested.
Anyways, with a MAXLEN + 1 for the dstsize we get a full three characters copied, but the next byte in overlay has been clobbered with '\0', which for this application is inappropriate. Probably this fact is somewhere in the fine manual, but "try it and see" is also a thing, and with software that usually won't do something costly or catastrophic like totally ruining an engine block.
However, one may question the validity of such optimizations, as they defeat the whole purpose of strlcpy() and strlcat(). As a matter of fact, the first version of this manual page got it wrong.
Hubris is a thing. So you probably do want to test these things carefully. Especially if it's a power tool, or equivalent. Or, I don't know, maybe you are King Lear this time around?
https://www.youtube.com/watch?v=MJZqx28acE0
HISTORY
     strlcpy() and strlcat() first appeared in OpenBSD 2.4.
This function is maybe not so new (OpenBSD has a six month release cadence, is on 7.3 now, and so 2.4 can be dated with some degree of accuracy) but everything is new to someone who has not seen it before.
// cpy3 - test copy with no NUL termination
#include <stdio.h>
#include <string.h>

#define MAXLEN 3

union {
    char buf[MAXLEN];
    char overlay[MAXLEN * 2];
} bar;

int main(void) {
    char *toolong = "verymuchtoolongofaninputstring";
    for (int i = 0; i < MAXLEN * 2; i++) {
        bar.overlay[i] = 'x';
    }
    // copies exactly MAXLEN bytes and writes no '\0'
    memcpy(bar.buf, toolong, MAXLEN);
    // nothing here is NUL-terminated, so bound both prints with %.*s
    fprintf(stderr, "buf %.*s\nlay %.*s\n", MAXLEN, bar.buf, MAXLEN * 2, bar.overlay);
}
tags #c #openbsd