
Open Source and Infrastructure

2022-07-03

The Software Freedom Conservancy has gathered some attention recently with their "Give Up GitHub" campaign. In addition to some news articles, it's been thoroughly discussed on the fediverse and seems to be gathering at least some momentum among individual developers.

Give Up GitHub landing page

The tipping point seems to have been the launch of Copilot, GitHub's AI programming assistant. While there is definitely a lot that I find creepy about Copilot, there are plenty of other good reasons why open source projects should probably not be built on proprietary code forges or proprietary infrastructure. As is often the case, Drew DeVault already covered this topic pretty well in the past.

https://drewdevault.com/2022/03/29/free-software-free-infrastructure.html

I don't want to rehash the same things that others have been saying. Copilot is creepy, like I said. It's also probably inevitable, as much as I don't like it. But I digress.

In context - the wider internet

People have slowly been waking up to the realization that if you're not paying for a product, then you are the product. This is most commonly observed in relation to Google and Facebook, but it applies to software forges as well. Microsoft is the very definition of a for-profit company, and when they bought GitHub there was surely an expectation of squeezing more profit from it. Well looky looky, they found a way to monetize all of those free accounts. And guess what? They're going to create a world in which there are two classes of developer: those with an AI assistant and those without. Those with an AI will be dramatically more productive, at least in terms of code output (code quality is a different story, and doesn't matter much in a corporate culture that has largely adopted the "move fast and break things" mentality).

So "embrace, extend, extinguish" has become "embrace, lock in, monetize." Yeah, Microsoft changed. They've changed from a monopolistic software company into software as social media. You, and by extension your code, are the product. Are we surprised?

But there is still so much more to say. GitHub gives you access to so much *stuff* for free, and you can bet that people take advantage of it.

Let's talk about Docker and CI

I have been avoiding learning Docker for years. In all that time I haven't really put into words why it bothers me as much as it does, but I want to start. I'll start with software development and continuous integration. Let's think in terms of energy efficiency balanced against developer efficiency. Really, this *should* be weighted drastically towards energy efficiency, because the only way we're going to survive as a species is by curbing our energy consumption.

Now, for a large company with many developers working on a common codebase, CI is likely a net win. If every developer ran their own compiler every time they made a change, along with a full test suite, that would cost a lot of energy compared with a CI setup that runs periodically throughout the day on everyone's changes.

In contrast, for a project with one primary developer, or even a small group, it's quite easy to turn this into extreme overkill. The way I see CI deployed by a lot of open source projects on GitHub falls into this category. In particular, if your CI runs on every commit, and everyone who contributes to the project has a fork of the main repo, then each of those forks runs the CI on every commit as well. This is on top of each developer compiling the code on their own local machine and/or running the test suite locally. Of course, you can set this up so that the CI runs only for pull requests (see the sketch below), but not everyone knows enough about Docker to make wise choices when setting up their pipelines.
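
As a minimal sketch of what that looks like, here's a hypothetical GitHub Actions workflow that triggers only on pull requests instead of on every push. The workflow name, branch name, and test command are all assumptions for illustration:

```
# .github/workflows/ci.yml -- hypothetical example
name: ci
on:
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Assumes the project has a 'make test' target
      - run: make test
```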

Now, thinking about some of Docker's other use cases, I start to get really pissed off. Yeah, it's fast and convenient to be able to spin up a server in Docker. And so everyone does, and this is now the industry standard way to deploy your website or application. So an average piece of hardware in a data center might now be running literally hundreds of web servers and database servers at a single time, each with its own copies of libc, openssl, and probably a whole ton of things that are just sitting there idle while consuming power and resources. In contrast, a single Apache instance and database server could likely serve a much larger number of sites from that same hardware, while consuming a fraction of the disk space and energy. But that would require skill and coordination to set up. It really ought to be outlawed.
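
To make the contrast concrete, here's a rough sketch of the two deployment styles. The site and container names are made up, and the shared setup assumes a Debian-style Apache install:

```
# Container style: every site gets its own server process and its own
# copies of libc, openssl, and friends (names are hypothetical)
docker run -d --name blog nginx
docker run -d --name shop nginx
docker run -d --name wiki nginx

# Shared style: one Apache instance serving many virtual hosts
a2ensite blog.example.org shop.example.org wiki.example.org
systemctl reload apache2
```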

I keep hearing about containerization being the future of the Linux desktop as well. Umm, wait.. guys? Is this a good idea? It's already drastically wasting resources in the datacenter, and now you want to bring it to the desktop too? And you want to sandbox every application in the name of security? Right, because every application having its own copy of every system library, many with conflicting versions, is going to *decrease* the potential attack surface? Oh, you didn't think about that? Well you f$@king should have. Piling on more abstractions comes with real costs. Every. Single. Abstraction. in a system should have to be thoroughly justified, and audited if its justification is security. But really, the reason a lot of people are pushing for this is that it's easy, not that it's better. We're only now beginning to see the costs become apparent. Look at the Firefox snap fiasco on Ubuntu for why rushing headlong in this direction might not be a great idea. But again, I digress.

Open source infrastructure

As an individual software dev, I could easily get by without using *any* code forge, proprietary or open. Git was in fact designed as a fully distributed system, and forges are an artificial construct designed to re-centralize that system. While I applaud the push to add federation to Gitea and hopefully other forges, almost all of these forges share the common problem of being designed as centralized platforms. Yes, you can run a local Gitea or GitLab instance, but in order to contribute to it someone has to create an account. Federation will only change which server that account lives on; the account will still have to exist. Git itself doesn't have this requirement. The problem is that both GitLab and Gitea were modeled too closely on GitHub. But it's what we have to work with. SourceHut is the outlier here, retaining the possibility of working with git send-email directly (see the sketch below).

Even so, I would submit that the only reason to use a code forge is visibility. You want people to know about your project and hopefully contribute to it, so you throw it up on a forge. But the current state of the art in software forges is that they all have major flaws, making it even more important to use one that is open source, so that the development of the forge software itself can be influenced in the right direction.
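
For reference, contributing by email needs nothing beyond git itself. A minimal sketch, assuming a SourceHut-style mailing list (the list address is a made-up placeholder):

```
# Point git send-email at the project's list (address is hypothetical)
git config sendemail.to "~someuser/someproject-devel@lists.sr.ht"

# Mail your latest commit as a patch for review; no forge account needed
git send-email HEAD^
```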

There was a question on Mastodon yesterday about how to set up a mirror, because this developer wanted to move to Codeberg but keep their GitHub repo for visibility. I'm not going to say that's wrong, because I do the same. But it's a surprising question to me, because you don't need anything other than git itself to do it. In fact, someone had beaten me to the correct answer of adding a second push URL. But then the original poster replied, asking how to synchronize the two, because one repo might have changes that the other does not. Say, because you merged a pull request on GitHub. Oh my, have we gotten so far from our tools? I tried to calmly reply that git is still all you need. Make the push, and if one of those repos has changes that your local copy does not, the push to it will be rejected as a non-fast-forward. Then merge the changes from that repo into your local copy and push again. I know, getting off into the weeds again, but it's related. It's related because it demonstrates how developers, particularly younger ones, have gotten so far removed from their tools that they don't realize what their options are. And this has happened as a direct result of the popularity of GitHub and, to a lesser extent, VS Code.
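
Concretely, the whole mirror setup is a couple of git commands. A sketch, with remote names and URLs as hypothetical placeholders:

```
# One remote, two push URLs: a single 'git push' updates both mirrors
git remote set-url --add --push origin git@codeberg.org:user/project.git
git remote set-url --add --push origin git@github.com:user/project.git

# If one mirror gained commits you lack (say, a PR merged on GitHub),
# the push to it is rejected. Fetch from that mirror, merge, push again:
git remote add github git@github.com:user/project.git
git fetch github
git merge github/main
git push origin main
```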

Yeah, that's right. VS Code. I don't want to start an editor war here, but if your editor takes care of creating and managing GitHub repositories for you, then you are that much less likely to actually learn how to use git in anger. Your skill set is limited, and Microsoft isn't going to build tools into VS Code that make it just as easy to migrate away from GitHub, are they? Remember, they are now partly a social media company, and you are their product. Why do you think they made VS Code available for free? Out of the goodness of their hearts? Or because it's part of a strategy to funnel more data into their control? Don't be naive. That's as far into an editor flame war as I want to go.

Anyway, if you are developing open source software, you owe it to yourself and to future generations to consider open infrastructure first. It's disturbing to me that we're not seeing more large organizations making the move, only individual developers. But hey, you have to start somewhere. I personally endorse codeberg.org and sr.ht as great alternatives for those who don't want to go to the trouble of self-hosting, or who still want their code somewhere with at least some centralization for visibility's sake. Of the two I use Codeberg, for no particular reason other than that I found their site first.

Tags for this page

code

open_source

gpl

technology

environment

Home

All posts

All content for this site is licensed as CC BY-SA.

© 2022 by JeanG3nie

Finger

Contact