Investigating bug report FS#1763 (signature check does not fail even if some sources are not signed) reinforced my impression from a year ago that the pkgmk source code had become too disorganized for easy modification. Although I did eventually track down the cause of the behaviour reported in FS#1763, the exercise prompted me to imagine how the script might be refactored so that such tickets would receive attention within days rather than months.
When the first version of pkgmk was written, it had a modest goal: read the Pkgfile and build a compressed archive. Over the years pkgmk acquired more features, such as the ability to use a shared directory for downloads. The addition of features to pkgmk was rarely accompanied by the thankless work of weaving these disparate parts into a coherent structure. As a result, new users who scroll through the pkgmk script typically feel lost, constantly paging up and down to get their bearings and understand how all the pieces fit together.
Requests for new pkgmk features have probably suffered years of delay because only a small fraction of CRUX users feel at home in the pkgmk source code. For example, a new contributor might wonder: "do I need to make my changes in the download_file() function, the download_source() function, or both? should I be passing this argument to get_filename() or get_basename()? has make_work_dir() been called by this point or not?" In planning a replacement script, I aimed to introduce as few subroutines as possible, and never to use two function names that are easily confused. If a code block only gets used once across all the possible invocations of pkgmk, I tried not to file it away in a separate subroutine, but instead to leave it in the main routine so that the reader will not be paging up and down to see the exact commands involved in the action. With less mental bandwidth allocated to remembering the distinction between similarly-named functions, or to keeping an inventory of all the subroutines that might be impacted by a new line of code, future contributors will more easily be able to locate the right place for adding new code or fixing undesired behaviour.
At least with pkgmk, the requisite shell scripting experience is widespread among CRUX users, and inaction on a bug report might be more easily explained by apathy than by a disorganized code base. In contrast, C++ experience is in short supply these days, which renders the source code of prt-get and pkgadd that much more intimidating to would-be contributors. One workaround I considered, thanks to the time I spent running OpenBSD as my daily driver, was to expand the prt-get manpage with more useful examples. The examples in the existing manpage illustrate only the simplest tasks, leaving new users ignorant of such advanced techniques as piping prt-get output to awk and grep so that newly-added dependencies of installed ports can be identified. My expansion of the prt-get manpage was intended to lessen the burden of supporting new users in common tasks, whose solutions currently must be learned by digging through the Flyspray tasklist or by searching through the IRC logs (two sources of insider knowledge that a CRUX newcomer is unlikely to reach for in a moment of frustration).
But training new users to work around the limitations of old tools is a band-aid solution, hardly addressing the fact that patch submissions to prt-get can sit for years with no action. A more direct approach is to rewrite the CRUX system tools in languages more familiar to the average contributor, so that the maintenance of these tools does not fall exclusively to the handful of developers with C++ experience.
In refactoring pkgmk, I started from the "micro-pkgmk" written by therealfun in 2018, and added only as much code as needed to satisfy some of the additional functionality that CRUX users had come to expect. I also kept in mind FS#1763 and #1851, trying to observe the recommended practices of defensive coding. Lastly, I incorporated experimental parsing of git URLs, so that port maintainers can specify a git repo rather than relying on the git host to offer downloadable tarballs. I dubbed the resulting script 'pkgmeek' (which in some languages might be regarded as a homophone for 'pkgmake').
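To make the defensive-coding point concrete, here is a minimal shell sketch of the kind of check that FS#1763 calls for: refuse to proceed unless every source file has an entry in the signature manifest. This is an illustration only, not code taken from pkgmk or pkgmeek. The function name unsigned_sources is hypothetical, and the sketch assumes the manifest lists files in BSD checksum style ("SHA256 (name) = digest").

```shell
#!/bin/sh
# Hypothetical helper (not from pkgmk or pkgmeek).
# Usage: unsigned_sources MANIFEST NAME...
# Prints each NAME that has no "SHA256 (NAME) = " entry in MANIFEST,
# and returns non-zero if at least one NAME is missing.
# Assumption: the manifest uses BSD-style checksum lines.
unsigned_sources() {
    manifest=$1
    shift
    rc=0
    for src in "$@"; do
        # -F treats the pattern as a fixed string, so dots and plus signs
        # in file names are not interpreted as regex metacharacters.
        if ! grep -qF -- "SHA256 ($src) = " "$manifest"; then
            printf '%s\n' "$src"
            rc=1
        fi
    done
    return $rc
}
```

A caller could then abort the build with something like `unsigned_sources .signature Pkgfile .footprint foo-1.0.tar.gz || exit 1` (the file names here are placeholders), so that a missing entry is an error rather than something silently skipped.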
Having acquired a Perl reference book last December, I first built up my confidence by translating the bash scripts prtwash and prtsweep into the Perl idiom. Then I took a fresh look at prt-get, diving into its C++ source code to understand its inner workings and data structures. This preliminary research (and the manpage expansion mentioned above) almost qualifies as "readme-oriented development", because I had a definite, documented behaviour in mind for the finished product, and the code would have to exhibit this behaviour in order to be deemed a viable prt-get replacement. The core of my prt-get replacement (dubbed 'prt-auf') took only a couple of days to write, but at this early stage it still lacked some of the luxuries like 'printf' (printing a formatted list of ports), 'dup' (listing the ports that appear more than once among active repositories), and saving pkgmk build logs. I added the missing features one by one over the next couple of weeks, eventually catching up with prt-get in everything except code size.
- prt-auf: about 850 lines of Perl
- prt-get: about 6000 lines of C++, and 1200 lines of header files
Only time will tell whether these replacements for pkgmk and prt-get will ever see wider adoption beyond the handful of early testers. It is a common trait of developers to look at legacy code and see a mess that ought to be rewritten from the ground up, ignoring all the painful lessons and bug fixes that are embedded in the legacy code. Having actually gotten my patch to the legacy code accepted by core maintainers last year, I don't believe I am writing from a place of ignorance, though.
It is mildly distressing that patch submissions on our bug tracker are nowhere near the volume they were in 2017. The patch submissions that do arrive are not dealt with in a timely manner, which is more easily explained by a reduction in the number of core developers than by supposing that CRUX package tools have reached full maturity and cannot gain much benefit from new patches. The next most probable explanations for the slow response to bug reports are: (1) interest in CRUX is waning (and could stand to be reinvigorated), or (2) would-be contributors start paging through a legacy code base and lose confidence in their ability to help out. If either of these latter explanations has any merit, pkgmeek and prt-auf are offered as a possible remedy to the problem that "we don't have people that actually sign up to care about our core tools" (FS#1410).