💾 Archived View for tilde.pink › ~kaction › log › 2021-11-25.1.gmi captured on 2023-09-28 at 16:12:54. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Lately I was trying to make an offline dump of all activity around a specific project hosted on GitHub. GitHub provides a REST API to access most of the hosted content, but it is designed in a way that makes the undertaking very hard.
You can list the issues of a project, sure, but only in tiny batches (at most 100 per call). Issue comments are fetched by a separate call, also in small batches, so the total number of API calls necessary is huge. And you are rate-limited.
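To get a feel for the numbers, here is a rough back-of-the-envelope sketch. It assumes a page size of 100 and at least one comment-listing call per issue, even for issues without comments. The function name and the sample project are made up for illustration; this is not GitHub client code.

```python
import math

PAGE_SIZE = 100  # assumed maximum batch size per call


def estimate_calls(comment_counts, page_size=PAGE_SIZE):
    """Estimate API calls needed to dump issues and their comments.

    comment_counts: one entry per issue, the number of comments on it.
    """
    # Calls to list the issues themselves, page by page.
    issue_calls = math.ceil(len(comment_counts) / page_size)
    # At least one call per issue to list comments, more for long threads.
    comment_calls = sum(max(1, math.ceil(n / page_size)) for n in comment_counts)
    return issue_calls + comment_calls


# Hypothetical project: 5000 issues with 3 comments each.
print(estimate_calls([3] * 5000))  # prints 5050
```

For that made-up project almost all of the 5050 calls go to comment listings, one per issue, which is why the small batch size for issues is the lesser problem.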
But the worst part is incremental updates. Since issue numbers are increasing, you can tell which issues are new. But there is no way to know whether an old issue was modified or got a new comment. In other words, making a one-off dump for a migration away from GitHub is feasible; keeping it up to date, not so much.
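Taking that premise at face value, the fallback is to re-download everything and compare it with the previous dump locally. A minimal sketch of that comparison, assuming each dump is a dict mapping issue number to a JSON-serializable record with its comments included. The record layout and function names are hypothetical.

```python
import hashlib
import json


def fingerprint(issue):
    """Stable hash of an issue record (any JSON-serializable dict)."""
    blob = json.dumps(issue, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def changed_issues(old_dump, new_dump):
    """Return sorted issue numbers that are new or differ from the old dump.

    Both dumps map issue number -> issue record, comments included.
    """
    return sorted(
        number
        for number, issue in new_dump.items()
        if number not in old_dump
        or fingerprint(old_dump[number]) != fingerprint(issue)
    )
```

This only tells you what changed after the full download has already been paid for, so it does nothing to reduce the number of API calls.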
It is no surprise that the GitHub API is optimized for its biggest user, the clicky-clicky web interface, but I wish they provided periodic dumps, like Stack Exchange does.
https://archive.org/details/stackexchange
But note how well the same problem is solved by the Linux project: everything is exported as a single git repository, and getting up to speed is a matter of "git pull".
https://lore.kernel.org/lkml/_/text/mirror/
Sigh. I have ranted about the online-first generation enough already. Anyway, it looks like my best bet to survive this mobile-first madness is surfraw(1) and mastering the GitHub search syntax.
PS. Glad that the extradition of Julian Assange failed. PPS. Damn, too early.