The text below comes straight from my Git repository (with slight modifications). Find it at the link below:
[Git Repository][1]
--------------------------------------------------------------------------------
Git has a major privacy problem. With only 3 commands anyone can find out the times and dates (down to the second) someone worked on their Git repo.
git clone <target-repo>
cd <target-repo>
git log --format=fuller
An unmodified Git repo reveals too much about a developer's life. It reveals what dates and times they made commits and when those commits were modified. Based on that, with some inference techniques, others can deduce when the developer sleeps, their range of likely timezones and roughly how efficient they are as a developer. Combined with other data sets, Git poses a serious privacy issue.
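To make the inference concrete, here is a minimal sketch of how commit timestamps could be bucketed by hour of day. The function name commit_hours is made up for illustration, and the sketch assumes a Git version recent enough to support --date=format:.

```shell
# commit_hours is a hypothetical helper name, not part of Git.
# It prints how many commits in the repo at "$1" were authored in each
# hour of the day, using the timezone recorded in each commit.
commit_hours() {
    git -C "$1" log --format=%ad --date=format:%H | sort | uniq -c
}
```

Run against a long-lived repo, a listing like this tends to show a stretch of hours with few or no commits, which hints at when the developer sleeps.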
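Git commit objects[2] have 2 (sometimes 3) timestamps to worry about. I'll get to the 3rd later. Here are the 2 main ones: the author date (GIT_AUTHOR_DATE), which records when the commit was originally made, and the committer date (GIT_COMMITTER_DATE), which records when the commit was last modified.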
Git doesn't have a way to remove timestamps, but both GIT_AUTHOR_DATE and GIT_COMMITTER_DATE can be set to any arbitrary date, for instance 1 Jan 2000 at midnight. This gives maximum privacy. Simply export both variables in your shell's environment. For Bash:
export GIT_COMMITTER_DATE="2000-01-01 00:00:00+0000"
export GIT_AUTHOR_DATE="2000-01-01 00:00:00+0000"
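One way to confirm the variables take effect is to make a throwaway commit and print both dates back. The sketch below assumes a scratch directory from mktemp and a placeholder identity; neither comes from the original text.

```shell
# Sketch: create a scratch repo, commit with fixed dates, print both dates.
export GIT_COMMITTER_DATE="2000-01-01 00:00:00+0000"
export GIT_AUTHOR_DATE="2000-01-01 00:00:00+0000"

scratch="$(mktemp -d)"   # throwaway location (assumption)
git init --quiet "$scratch"

# Placeholder identity so the commit works without any global config.
git -C "$scratch" -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet --allow-empty -m "timestamp test"

# %ai is the author date and %ci is the committer date.
dates="$(git -C "$scratch" log -1 --format='%ai / %ci')"
echo "$dates"
```

Both dates should read 1 Jan 2000 at midnight rather than the real time of the commit.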
To make the changes permanent in bash, append the commands to ~/.bashrc:
echo -e "export GIT_COMMITTER_DATE=\"2000-01-01 00:00:00+0000\"\nexport GIT_AUTHOR_DATE=\"2000-01-01 00:00:00+0000\"" >> ~/.bashrc
However, if necessary it's just as simple to set both the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE to the real date without the seconds, minutes and hours. This provides greater privacy yet still meaningful timestamps:
export GIT_COMMITTER_DATE="$(date +%Y-%m-%d) 00:00:00+0000"
export GIT_AUTHOR_DATE="$(date +%Y-%m-%d) 00:00:00+0000"
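To make the changes permanent in bash, append the commands to ~/.bashrc just as before. Note the escaped \$ so that date runs each time a shell starts, instead of being expanded once when the line is written:
echo -e "export GIT_COMMITTER_DATE=\"\$(date +%Y-%m-%d) 00:00:00+0000\"\nexport GIT_AUTHOR_DATE=\"\$(date +%Y-%m-%d) 00:00:00+0000\"" >> ~/.bashrc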
Environment variables don't change after they're set. Therefore the date updates when you open a new shell, not upon a new day.
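A shell left open past midnight therefore keeps stamping commits with the previous day. One possible workaround, sketched here, is to recompute both variables before every prompt. The function name refresh_git_dates is hypothetical, and PROMPT_COMMAND is specific to Bash.

```shell
# refresh_git_dates is a hypothetical helper; it re-derives both dates
# from today's date every time it is called.
refresh_git_dates() {
    local today
    today="$(date +%Y-%m-%d)"
    export GIT_COMMITTER_DATE="$today 00:00:00+0000"
    export GIT_AUTHOR_DATE="$today 00:00:00+0000"
}

# Bash runs PROMPT_COMMAND before printing each prompt, so the dates stay
# current. Any existing PROMPT_COMMAND is kept and runs afterwards.
PROMPT_COMMAND="refresh_git_dates${PROMPT_COMMAND:+; $PROMPT_COMMAND}"
```

Placed in ~/.bashrc, this would replace the two export lines. It costs one call to date per prompt, which is unlikely to be noticeable.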
It's important to digitally sign Git commits and especially releases to prevent man-in-the-middle attacks. These signatures contain their own timestamps which can be just as bad for privacy as Git timestamps, especially if every commit is signed.
To automatically 'remove' timestamps from GnuPG signatures in new Git commits, the system time needs to be faked. Luckily GnuPG has a flag for just that: --faked-system-time <iso>. Git needs to run a version of the GnuPG program that always fakes the system time.
To accomplish that, a bash script can be placed in a directory on $PATH, for instance /usr/bin/gpg2-git. gpg2-git should contain:
#!/bin/bash
gpg2 --faked-system-time <iso>! "$@"
The <iso> time can be any time after the signing key was generated. For reference, my iso value is 20201130T000000 (30 November 2020 at midnight). My key was created 29 November 2020.
For enhanced privacy, exclude the GnuPG version number and comments from signatures by changing the gpg2 line in /usr/bin/gpg2-git to:
gpg2 --faked-system-time <iso>! --no-emit-version --no-comments "$@"
And don't forget:
chmod +x <path>/gpg2-git
Finally, to make Git use the new gpg2-git program, add the following lines to ~/.gitconfig:
[gpg]
    program = gpg2-git
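The wrapper steps above could be collected into one small helper. This is only a sketch: the function name install_gpg_wrapper and its two arguments (target directory and ISO time) are made up for illustration, and the target directory is assumed to be on $PATH.

```shell
# install_gpg_wrapper is a hypothetical helper, not part of Git or GnuPG.
#   $1 = directory to install into (assumed to be on $PATH)
#   $2 = ISO time to freeze signatures at, e.g. 20201130T000000
install_gpg_wrapper() {
    dest_dir="$1"
    iso="$2"
    mkdir -p "$dest_dir"
    # Unquoted heredoc: ${iso} is filled in now, \$@ stays literal in the file.
    cat > "$dest_dir/gpg2-git" <<EOF
#!/bin/sh
exec gpg2 --faked-system-time "${iso}!" --no-emit-version --no-comments "\$@"
EOF
    chmod +x "$dest_dir/gpg2-git"
}
```

After something like install_gpg_wrapper "$HOME/.local/bin" 20201130T000000, running git config --global gpg.program gpg2-git writes the same [gpg] setting to ~/.gitconfig without editing the file by hand.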
Done. Git will now use a fake system time for every signed commit. Git preserves almost no metadata[3] by design, so privacy is looking pretty good.
The most popular code hosting platform, GitHub, is known to record when commits are pushed[4]. See the ticket about GitHub contribution activity[5].
Push times aren't exclusive to GitHub. It's possible that other code hosting platforms track them outside of the public API. It's easy enough for anyone to crawl a public repo and track push times anyway. Unless developers control the code hosting platform themselves, they can't know for certain whether push times are being tracked.
The easiest way to resolve this is to not push any code manually. Instead, use a cron job that pushes all repositories to their remotes automatically at midnight.
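A cron-driven push could look roughly like the following. It is a sketch under assumptions: the function name push_all_repos, the layout of one repo per subdirectory, and the script path in the crontab comment are all hypothetical.

```shell
# push_all_repos is a hypothetical helper. It pushes every branch of each
# Git repo found directly under the directory given as "$1".
push_all_repos() {
    root="$1"
    for gitdir in "$root"/*/.git; do
        [ -d "$gitdir" ] || continue      # skip if the glob matched nothing
        repo="${gitdir%/.git}"
        # With no remote named, git push falls back to the branch's
        # configured remote, or to origin if none is configured.
        git -C "$repo" push --all --quiet || echo "push failed: $repo" >&2
    done
}

# Example crontab entry (path is a placeholder) to run a script that
# calls the function every day at midnight:
#   0 0 * * * /path/to/push-all.sh
```

Pushing over SSH from cron needs a key that works without a passphrase prompt, which is a trade-off to weigh separately.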
Environment variables may seem a very crude way to obfuscate Git timestamps. It's possible to use Git hooks to accomplish timestamp obfuscation, but it doesn't work very well since it's still necessary to manually override the date for some Git commands. Git developers need to make timestamp obfuscation a feature of Git to finally resolve the privacy problem.
This text is licensed under CC-BY-SA 4.0.
Link(s):
2: https://mirrors.edge.kernel.org/pub/software/scm/git/docs/user-manual.html#commit-object
3: https://git.wiki.kernel.org/index.php/ContentLimitations
4: https://api.github.com/repos/cirosantilli/china-dictatorship/events
5: https://github.com/isaacs/github/issues/142
Unless otherwise noted, the writing in this journal is licensed under CC BY-SA 4.0
Copyright 2019-2022 Nicholas Johnson