2018-04-20 GitHub Backup

I don’t like the fact that GitHub has turned out to be the *de facto* central hub for Free Software development. But I use it, too. It’s easy to track issues, it’s easy to accept merge requests and comment on them without using email, and so on. And that’s cool. But if GitHub goes down, are you sure you still have a local copy of all your repos, or did you delete some over the years?

We use multiple URLs for Oddmuse and that works quite well. Basically we pull changes from one repository and push to three of them. That means that the other two repositories are our backups.

So, here’s what I think I need to do:

1. have a local repo on my laptop for every GitHub project I ever maintained

2. keep a remote repo on my own server just in case

3. eventually, in case it’s ever necessary, install one of stagit, cgit, gitea, or gitlab (whichever is simpler to install and maintain)

I’ve been following chapter 4.4 Git on the Server - Setting Up the Server in the Pro Git book.

Start as root:

# check whether git-shell is listed and add it if it is missing
cat /etc/shells
echo /usr/bin/git-shell >> /etc/shells
# answer some questions
adduser git
# switch to user git
su git
cd
mkdir .ssh && chmod 700 .ssh
touch .ssh/authorized_keys && chmod 600 .ssh/authorized_keys
# back to root
exit
# copy authorized keys
cat /home/alex/.ssh/authorized_keys >> /home/git/.ssh/authorized_keys
# only now restrict git to git-shell: it refuses interactive logins
# unless ~/git-shell-commands exists, so "su git" would fail after this
chsh git -s /usr/bin/git-shell
# make sure git can use ssh if you use AllowUsers to restrict access
emacs /etc/ssh/sshd_config
service sshd reload

In order to add a new project, as root:

cd /home/git/
mkdir hex-mapping.git
cd hex-mapping.git
git init --bare
cd ..
chown -R git:git hex-mapping.git/
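
Since these steps are the same for every project, they could be wrapped in a small shell function. This is only a sketch: the name `new_bare_repo` and the `GIT_HOME` variable are made up for illustration, and the ownership change is skipped unless the script runs as root and a `git` user exists.

```shell
#!/bin/sh
# Sketch: create a bare repository for a new project.
# new_bare_repo and GIT_HOME are hypothetical names, not part of git.
GIT_HOME="${GIT_HOME:-/home/git}"

new_bare_repo() {
    name="$1"
    repo="$GIT_HOME/$name.git"
    # Refuse to touch an existing repository.
    if [ -e "$repo" ]; then
        echo "$repo already exists" >&2
        return 1
    fi
    mkdir -p "$repo" || return 1
    git init --bare --quiet "$repo" || return 1
    # Hand the repository over to the git user (only root can do this).
    if [ "$(id -u)" -eq 0 ] && id git >/dev/null 2>&1; then
        chown -R git:git "$repo"
    fi
}
```

With that in place, `new_bare_repo hex-mapping` would do the same as the commands above.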

On my laptop, using `magit` in Emacs, this is what the remoting popup looks like:

Configure origin
 u remote.origin.url     git@github.com:kensanata/hex-mapping.git
                         ssh://git@alexschroeder.ch:882/home/git/hex-mapping.git
 U remote.origin.fetch   +refs/heads/*:refs/remotes/origin/*
 s remote.origin.pushurl unset
 S remote.origin.push    unset
 O remote.origin.tagOpts [--no-tags|--tags]

(The only complication is that I don’t run my SSH daemon on the standard port because of all the login attempts.)

Or on the command line:

$ git remote -v
origin	git@github.com:kensanata/hex-mapping.git (fetch)
origin	git@github.com:kensanata/hex-mapping.git (push)
origin	ssh://git@alexschroeder.ch:882/home/git/hex-mapping.git (push)

You can add URLs to an existing remote using `git remote set-url --add origin <url>`. The first URL is used for fetching; pushes go to all of them.
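
To see the effect without touching a real project, here is a sketch that sets up one remote with two URLs in a scratch repository. Both `example.org` URLs are placeholders, not my actual servers.

```shell
#!/bin/sh
# Sketch: one remote, two URLs. Both URLs below are placeholders.
scratch="$(mktemp -d)"
git init --quiet "$scratch"
cd "$scratch" || exit 1

# The first URL is used for fetching and for pushing.
git remote add origin git@example.org:me/project.git
# Every additional URL is only pushed to.
git remote set-url --add origin ssh://git@backup.example.org/home/git/project.git

# Lists one fetch line and two push lines.
git remote -v
# Prints both URLs, one per line.
git config --get-all remote.origin.url
```

Nothing is contacted over the network here; the commands only edit the configuration of the scratch repository.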

This works!

Continuing to the next chapter, 4.5 Git on the Server - Git Daemon.

Edit `/etc/systemd/system/git-daemon.service` as root:

[Unit]
Description=Start Git Daemon

[Service]
ExecStart=/usr/bin/git daemon --reuseaddr --base-path=/home/git/ /home/git/

Restart=always
RestartSec=500ms

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=git-daemon

User=git
Group=git

[Install]
WantedBy=multi-user.target

Then, as root:

systemctl enable git-daemon
systemctl start git-daemon

And something to do for every repository you create:

cd /home/git/hex-mapping.git
touch git-daemon-export-ok
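
With more than a handful of repositories, a loop could take care of this. A sketch, assuming all bare repositories sit directly below one directory and have names ending in `.git`; `export_all` is a made-up name, not a git command.

```shell
#!/bin/sh
# Sketch: allow git daemon to serve every bare repo below a directory.
# export_all is a hypothetical helper, not a git command.
export_all() {
    base="${1:-/home/git}"
    for repo in "$base"/*.git; do
        # Skip the unexpanded pattern and anything that is not a directory.
        [ -d "$repo" ] || continue
        touch "$repo/git-daemon-export-ok"
    done
}
```

Calling `export_all /home/git` as user git would mark every repository. Given the `--base-path` used above, a client should then be able to clone with something like `git clone git://alexschroeder.ch/hex-mapping.git`, as long as the daemon is running and its port is reachable.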

Todo: adding monitoring to Munin and Monit. Then again... Why go through all that? I think having a backup is enough for the moment. So I turned the daemon off again:

systemctl stop git-daemon
systemctl disable git-daemon

​#Git ​#Administration ​#cgit

Comments

This isn’t really a backup though. It’s a lot of work instead. Unless I misunderstood, you need to manually create a new repository twice every time you create something new, once at GitHub and once on your server. And then you need to make sure to always push to both.

A proper backup should be effortless - set it up once and have it work forever. So, something that uses the GitHub API to read all your GitHub repositories and clone and update them automatically on your server would make more sense IMHO.

But still, you’d miss important data if GitHub suddenly imploded. What about all the issues tracked in their bug tracker? What about open PRs by other people? What about the wikis (though you might not use those)?

A proper backup script should also back up those things. Unfortunately I have never found such a solution and was always too lazy to write one 😉

– Andreas Gohr 2018-04-21 07:39 UTC

---

Yes, I need to add a mechanism to automate the creation of repos on my own server. Pushing to both repos is no problem. As soon as you add a second URL to your “origin”, every `push` will push to both repos.

As for issues and PRs, you are right. But then again the closed ones I don’t care about too much and the open ones are hopefully small in number. I already keep my wikis elsewhere, as you know. 😀

So yes, it’s not perfect but right now it seems good enough. My fear is that I’ll forget repos, they slowly disappear from my laptop and on the day GitHub goes down, I don’t even know what I had there.

– Alex Schroeder 2018-04-21 17:04 UTC