💾 Archived View for thrig.me › blog › 2024 › 09 › 14 › forwhile.gmi captured on 2024-12-17 at 10:43:33. Gemini links have been rewritten to link to archived content
Two inefficient cat(1) implementations, one of which has a catastrophic failure mode.
    $ yes | perl -e 'print for readline'
    Out of memory!
    $ yes | perl -e 'print while readline'
    y
    y
    y
    y
    y
    y
    y
    ^C^C^C
What's the difference here? The 'for' loop reads the entire input into a list and then loops over that list; the 'while' loop instead deals with only a single line at a time. Therefore, if you only ever need one line at a time, the 'while' form should be preferred: it avoids catastrophic memory growth or, in milder cases, a process that uses more memory than it should.

These days some systems have massive amounts of memory, so such a concern may seem minor or irrelevant ("why not just slurp the whole file into memory??"). Slurping may be fine if you're in a hurry and the files are generally small, but not if you want to be efficient with memory, or the files start getting too big, especially if the code needs to run on a microvirt and not on some horking huge system that a developer might typically use.
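The two forms can be spelled out with explicit variables, which may make the context difference easier to see. This is only a sketch: the function names cat_for and cat_while are made up for illustration, and readline(STDIN) stands in for the bare readline of the one-liners. On finite input both should produce the same output; the difference is in how much is held in memory.

```shell
#!/bin/sh
# Hypothetical helper names; both copy standard input to standard output.

# List context: readline returns every line at once, so the whole input
# is held in a temporary list before the first print happens.
cat_for() {
    perl -e 'for my $line (readline(STDIN)) { print $line }'
}

# Scalar context: readline returns one line per call, so only the
# current line needs to be held in memory.
cat_while() {
    perl -e 'while (defined(my $line = readline(STDIN))) { print $line }'
}

# Finite input, so the "for" form is safe to run here.
printf 'a\nb\nc\n' | cat_for
printf 'a\nb\nc\n' | cat_while
```

Feeding either function from yes(1) instead of printf would reproduce the behaviour shown earlier, so only do that with cat_while.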
Chunky chortleware browser: memory use? hold my beer.
If one did not know the above fact, apart from learning it the hard way when a process falls apart or takes too long to run on large input, a typical approach would be to benchmark the two implementations with increasing amounts of input, say 10 lines, 1000 lines, 100000 lines, etc. This method has all sorts of uses in programming: one graphs or tabulates the results and sees whether the time to run is the same or different for the different approaches over different amounts of input. (Memory use could be measured as well, but it can be harder to track, and excessive CPU use is likely anyway if lots of memory is being allocated and fiddled with.) The inputs could be user requests instead of lines of input, though properly simulating typical user requests has complications, especially as system complexity goes up.