💾 Archived View for gemini.hitchhiker-linux.org › gemlog › re_dependencies.gmi captured on 2023-03-20 at 17:53:15. Gemini links have been rewritten to link to archived content

Re: dependencies

2022-10-29

Idiomrottning has a great article on software dependencies.

Dependencies

1. There are dependencies that are gratuitous and unnecessary.
2. There are dependencies that make your code significantly more readable and comfy.

That second fact, to me, suggests that a hardline NIH approach is not the right path forward. That first fact suggests that pruning our deps a little bit is possible and probably good, but we are gonna need a good way to handle dependencies regardless. The “zero deps” approach is something I’m not a fan of, just as little as those unweildy, over-dependent trees.

I think this part is pretty easy to agree upon. Most people have come to understand it over time: dependency handling in C and C++ was unwieldy for a very long time, which led to an almost universal adoption of language level package managers among the last few generations of programming languages. While there are definite concerns with the implementation and security implications of many of these systems, on the whole I believe it has meant a great leap forward.

My experience is largely around Cargo, having written much more Rust in the past few years than anything else. But I am sure that much of what I'm going to say will also apply more or less to `go get`, or `npm`, or whatever your language of choice happens to be using. Even Fortran has `fpm` now.

As an aside, she links to a great talk given in Milan by the creator of Zig, Andrew Kelley. I watched that talk on YouTube just a few days ago. Well worth checking out. But I digress from my digression. Zig, being at a pre-1.0 stage currently, still has no official language level package manager, although Mr. Kelley has stated clearly that he intends to build one with the community around the time of the 1.0 release of the language. Filling the void in the interim we have a few community sponsored projects, of which the two with the most traction are `Gyro` and `Zigmod`. I've tried both, and they both work great. Of the two, I prefer Zigmod and think it has some novel features that I'd like to see make it into the "official" Zig codebase eventually.

Now, what I really want to address is the friction between distro and package maintainers versus language level package managers. I'm going to lump BSD in with the Linux distros here because they also have to package this software, and their choices are relevant to the discussion. First I would say that the situation at Debian is untenable and their policies need to change. Said policies and tooling evolved around the C language and have not kept up with the times.

Then there's the almost totally hands-off approach taken by Arch (and yes, I do use Arch). Arch just builds the software in the way that the upstream recommends. They rely on using the most up to date stable releases to keep the system secure, in the hope that the upstreams will handle their own shit. As a software developer, I rather like this approach. But I also recognize the desire of many for more formal verification, as well as the fact that a large number of projects will allow their code to go stale, particularly in regard to dependencies.

I use NetBSD's pkgsrc for packages in my little distro, HitchHiker. Pkgsrc takes a bit of a middle ground approach, at least when building Rust projects. Whereas normally you build a Rust project just by typing `cargo build` and let it download and verify dependencies itself, pkgsrc downloads all of the crates that a Rust package requires and verifies the checksums before running Cargo, passing command line switches to tell Cargo where to find the crates. I believe that FreeBSD does something similar. I'm not sure what this is actually accomplishing myself, apart from adding quite a bit to the workload of the package maintainers and the ridiculous size of the checksum files that have to be included in the pkgsrc tree. It's also quite a bit slower than Cargo. It would be one thing if they were attempting to have only one blessed version of a particular dependency in the entire distribution, like Debian does, but that is not what they are doing. They're just adding work and fighting the tool to make it work the way they wish that it worked.

Some proposed solutions from the original link I posted earlier:

1. Every module has at least two versions: latest-uploaded and latest-vetted. By default, you get the latest-vetted. This goes especially for recursive dependency resolution. You can pin to older versions (if so, we need some sorta in-app CVE alert system) and you can, with enough hoops, get newer, unvetted versions.
2. The modules hook into the distros’ normal package managers. So you can apt-remove and dpkg -L and so on.
3. And, when it’s time for the distro to package an app, the distro can bring in and pin vetted versions of the dependencies (again, that means some sorta built in CVE alerter is needed) and bless those versions officially.
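
To make the first proposal a little more concrete, here is a minimal sketch of how a resolver might choose between latest-vetted, latest-uploaded, and a pinned version. Everything in it is hypothetical: `Release`, `Policy`, and `resolve` are names made up for illustration and do not exist in Cargo, crates.io, or any distro tooling, and versions are simplified to a `(major, minor, patch)` tuple.

```rust
// Hypothetical model of the "latest-uploaded vs latest-vetted" proposal.
// None of these types exist in Cargo or crates.io; they are illustration only.

/// A simplified semantic version: (major, minor, patch).
pub type Version = (u32, u32, u32);

/// One published release of a module.
#[derive(Debug, Clone, PartialEq)]
pub struct Release {
    pub version: Version,
    /// True once some reviewer (registry, distro, ...) has vetted this release.
    pub vetted: bool,
}

/// How the caller wants a dependency resolved.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Policy {
    /// The default: newest release that has been vetted.
    LatestVetted,
    /// Opt-in: newest release, vetted or not.
    LatestUploaded,
    /// Pin to one exact version, even if it is old.
    Pinned(Version),
}

/// Pick the release matching `policy`, or `None` if nothing qualifies.
pub fn resolve(releases: &[Release], policy: Policy) -> Option<&Release> {
    match policy {
        Policy::LatestVetted => releases
            .iter()
            .filter(|r| r.vetted)
            .max_by_key(|r| r.version),
        Policy::LatestUploaded => releases.iter().max_by_key(|r| r.version),
        Policy::Pinned(v) => releases.iter().find(|r| r.version == v),
    }
}
```

With releases 1.0.0 (vetted), 1.1.0 (vetted) and 1.2.0 (unvetted), the default policy would hand back 1.1.0, while `Policy::LatestUploaded` would hand back 1.2.0. A real system would also need the CVE alerting that the proposal mentions for pinned versions, which this sketch leaves out.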

There's some good sense here. I would love to see the Cargo developers reach out to distro maintainers and work out a system together. I also don't think that's going to happen in the near future. In the meantime, there are things that developers can and should do.

In the Rust ecosystem, there are a few nice Cargo plugins that can help with the first two. I started using `cargo audit` a while back in my projects. It provides alerts if your dependencies are out of date, unmaintained, or have any CVEs filed against them, including transitive deps. I have made it a habit to run it before tagging any release. I'm not a big fan of CI builds unless you are part of a large organization, but it can work nicely in a CI context as well.

I'm also in favor of changing the way publishing happens. For Rust and crates.io, currently the only requirement to publish to crates.io is a GitHub account. I have an account, grudgingly, partly for this reason. But that is an incredibly low bar for entry. I would like to see some form of formal review process, if only for the initial upload of a crate, to ensure that it is legitimate and not malware. I'd also like to see a very strict, one strike and you're banned for life, anti-malware policy in place. Others have also proposed adding a group of vetted core packages which extend the std library in useful and common ways, and I also think that this is a good idea.

Short of all of that, as a developer there are certain metrics that can be helpful to sort through crates.io to find the (hopefully) better packages. Number of downloads is a starting point, as it shows how commonly used a library is. I also like to look at release history to see how often something is updated. Both of these can be misleading, however. Just because a piece of software is often used doesn't automatically make it the best fit for your application. And to the second metric, after a project has attained a certain level of maturity, it is often feature complete and releases can and should slow down. So in addition to those metrics I also like to look at a crate's dependents and its dependencies. Looking at the former will give you some idea of the sort of projects which are using a crate, which can be either a big confidence booster or quite the opposite, if it is largely unused. The latter is important, as it will give you at least part of the picture of how large your dependency graph might get from adding this one package.
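
As a rough illustration of that last point, the sketch below walks a dependency graph and collects every crate that one package pulls in, directly or transitively. The graph is a hand-built map and `transitive_deps` is a made-up helper, not anything Cargo provides; for a real project, `cargo tree` prints the resolved graph.

```rust
use std::collections::{HashMap, HashSet};

/// Collect every distinct crate pulled in, directly or transitively, by `root`.
/// `graph` maps a crate name to the names of its direct dependencies.
/// This is an illustrative helper, not part of Cargo.
pub fn transitive_deps<'a>(
    graph: &HashMap<&'a str, Vec<&'a str>>,
    root: &str,
) -> HashSet<&'a str> {
    let mut seen: HashSet<&'a str> = HashSet::new();
    // Start from the direct dependencies of `root`.
    let mut stack: Vec<&'a str> = graph.get(root).cloned().unwrap_or_default();

    while let Some(name) = stack.pop() {
        // `insert` returns false if we already visited this crate,
        // which also keeps cycles from looping forever.
        if seen.insert(name) {
            if let Some(children) = graph.get(name) {
                stack.extend(children.iter().copied());
            }
        }
    }

    // A crate is not its own dependency, even if a cycle leads back to it.
    seen.remove(root);
    seen
}
```

If a hypothetical `my-app` depends on `cli` and `http`, `cli` depends on `term`, and `http` depends on `tls` and `term`, the two direct dependencies already expand to four crates, with `term` counted once because it is shared.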

This is not a simple issue, folks

There are those who are absolutely against language level package managers. The common argument is that your system package manager is the only tool that should be used to solve the problems that the likes of Cargo and NPM were designed to solve, but this is disingenuous at best. I'm looking at you, Drew DeVault. In reality, if you look at most large C++ projects you will find dozens of sub-projects which are all vendored code. Due to the thrown together nature of this ecosystem, these dependencies are often incredibly difficult to update, and this code winds up becoming extremely stale. This is the garbage that might be available as a shared library on your distribution, but because distributions are shipping wildly varied versions of it, the software vendor just ships a specific version of the code and compiles it in statically in order to avoid a massive kludge of ifdefs and other preprocessor abuse. Some libraries which have been available for a long time have managed to avoid this by providing a consistent interface over the years, but they are probably the exception at this point. The reality is that your distro's package manager never has known and never will know anything about this pinned code. Nobody is really managing it, because there are no tools that were designed TO manage it. That's the situation a language level package manager was designed to avoid, and they do their job well. And frankly most of the people who shout this argument the loudest know this and just happen to leave it out of their arguments.

On the flip side, it's way too easy to have a dependency graph explode just by adding a few packages. And it's pretty much impossible to audit all of that code yourself, so there is most definitely a problem.

I don't claim to have all of the answers. But I do think that there needs to be a more healthy and productive discussion, with less finger pointing and a lot better listening on all sides.

Tags for this page

software

package-managers

dependencies

All content for this site is released under the CC BY-SA license.

© 2022 by JeanG3nie
