{% include 'fragments/header.gmi' %}

## News

### 2024-01-05 happy birthday :)
On this very day three years ago, "geminispace.info" was born; it was announced on the long-gone mailing list a few days later.

And still today, with some bits adjusted here and some additions there, geminispace.info essentially stands on the foundations that ~natpen built with GUS. Kudos to Natalie!

We had some ups and downs over the years, but mostly geminispace.info has been a reliable source of information for the Gemini community. At least we feel so, and we hope you do as well.

That being said, we'd like to ask the community to help fund the hosting costs of geminispace.info.
=> /about More information about donations can be found on our "About" page.

### 2023-12-23 IPv6 is back
geminispace.info is now available through IPv6 at the address 2a02:247a:207:8e00:1::1

### 2023-08-12 robots.txt clarification
We've added some clarification on how geminispace.info parses robots.txt files:
=> /documentation/indexing Indexing documentation

### 2023-07-30 pubnix & robots.txt
When fetching a robots.txt, geminispace.info is now aware of pubnix-style user dirs (domain.com/~joe and domain.com/users/joe), allowing for per-user robots settings. The provisions for this feature have been in the code for quite a long time, but we didn't make use of them until now.

### 2023-07-29 robots.txt
Dear capsule pilots (especially those who run mirrors or gateways of bloated websites): Please put a robots.txt in place.
=> gemini://geminiprotocol.net/docs/companion/robots.gmi robots.txt companion spec at circumlunar

### 2023-06-07 server switch
Welcome to our brand new production instance sporting Debian 12.

### 2023-03-04 twtxt
The "known feeds" page now includes twtxt.txt feeds.
=> known-feeds check the "Known feeds" page

### 2023-02-10
We now provide a list of URIs that are currently excluded from crawling and indexing. This should improve transparency about what geminispace.info is doing.
At the moment, no reason is given as to why a specific exclude is in place. We might add this in the future.
=> documentation/filters list of excluded URIs

### 2023-01-29 updated TLS certificate
geminispace.info now uses an updated certificate that uses X.509 version 3. I hope this improves compatibility with clients, as the previously used X.509 v1 seems to be moving out of support in some implementations.

### 2023-01-27 update delay
We had some issues with the crawler getting stuck in an "infinite maze" that should never have been crawled. This is solved for the moment and the index is up to date again.

Additionally, there is some intermittent trouble with name resolution. I have no clue what causes this. If someone has experience in debugging name resolution on Linux (Debian), I'd gratefully accept any advice.

### 2023-01-05
I've made some adjustments to the raw database for some major performance improvements. This mostly helps when we update the index or restart the server; it does not affect searching on geminispace.info.

Due to the announced price increases from our hoster, I'm thinking about hosting geminispace.info on a spare RasPi at home. The downside would be that geminispace.info would change its IP every 24 hours, so IP-based blocking of the crawler would be impossible. Do you think this is acceptable? Feedback welcome.

### 2023-01-01
Happy new year everyone, hope you're all doing well.

Our provider (netcup.de) announced a price increase, so maybe we are going to migrate to another provider some day in June. We need at least 2 vCPUs (4 would be better), 8 GB of RAM and at least 100 GB of block storage. Suggestions welcome.

### 2022-12-18
After some small adjustments to the indexing, I'm confident we can postpone the need for a major rewrite by quite a few months.

### 2022-12-01 donations
As of today, geminispace.info has received donations that sum up to 82.78€. That's almost 8 months of hosting costs covered. Thank you very much.

### 2022-11-12
With the still ongoing increase of Gemini capsules, the current tech stack of geminispace.info hits its limits in one case or another. There are several options to improve the situation:

1. do a major rewrite and move to another tech stack (especially regarding data storage and full text search)
2. move to another implementation like the one tlgs.one uses
3. shut down the service

At the moment I have no motivation to put the required effort into options 1 and 2. We'll see if this changes in the next few months; the current contract for the VPS will end in July 2023.

### 2022-08-22 donations welcome
We've set up a way to send donations to help cover the ongoing costs of running geminispace.info.
=> about more information can be found on our About page

### 2022-08-18 duplicate results
Due to a small glitch in the crawler, we had duplicate results in the dataset for a few weeks. Thanks to a report from Acidus, this has now been fixed and the duplicate entries have been removed.

Despite this, Gemini keeps growing organically. The raw data known to geminispace.info currently exceeds 10 GB, and we already exclude some high-traffic capsules like news or Wikipedia relays.

### 2022-07-21 crawling issues
We had some crawling issues over the last few days. In the end, it turned out that someone had decided to serve huge video files over Gemini. At the moment we process all files in memory, so the crawl simply got killed by the OOM killer once the downloaded video size hit the available memory.

This has been worked around by excluding the capsule in question from the crawl. A more proper fix needs to be implemented in the future.

### 2022-06-07 suggestions disabled
The poor performance on "no result searches" was caused by some misbehaviour when trying to compute suggestions for alternate search terms, which eventually led to an exception. I have disabled suggestions for empty search results for the moment. Suggestions will come back once I have sorted this out.
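
As an illustration of what a more proper fix for the 2022-07-21 crawling issue could look like (a sketch only, not the code geminispace.info runs): read each response in chunks and give up once a size limit is exceeded, so that a huge file never has to fit into memory. The names `read_capped` and `max_bytes` are hypothetical.

```python
import io
from typing import BinaryIO, Optional


def read_capped(stream: BinaryIO, max_bytes: int,
                chunk_size: int = 64 * 1024) -> Optional[bytes]:
    """Read a response body in chunks; return None once it exceeds max_bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of stream reached within the limit.
            return b"".join(chunks)
        total += len(chunk)
        if total > max_bytes:
            # Too large: stop reading instead of buffering the rest.
            return None
        chunks.append(chunk)


# Usage with an in-memory stream standing in for a network response.
small = read_capped(io.BytesIO(b"# hello gemini\n"), max_bytes=1024)
huge = read_capped(io.BytesIO(b"x" * 10_000), max_bytes=1024, chunk_size=256)
```

With this approach, memory use per request stays bounded by roughly `max_bytes` plus one chunk, regardless of how large the served file is.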

### 2022-05-16
There's currently an issue with search queries that lead to no results (i.e. geminispace.info can't find a page that matches the criteria): these searches take a very long time until a "no results" page is returned, and sometimes they even fail with a "42 TEMPORARY FAILURE".

Any search for a known pattern will return results within seconds, so if a search takes more than 20 seconds, expect that geminispace.info does not know about a page that matches your criteria. We are looking into this.

### 2022-05-13 speeding up crawling
Our crawl engine is now multi-threaded. This means that multiple requests are made in parallel and the overall crawl time is greatly reduced. Additionally, the crawl order is now more random, which should avoid requesting huge numbers of pages from a single capsule in a short time.

### 2022-05-09 memory usage issue solved?
It seems like we've finally solved our memory issue. In the end, it may have been a small parameter for whoosh which ended up loading the whole index into RAM. At first glance, the change didn't cause any performance drain; the system even seems more responsive now, maybe because the high memory pressure had been causing overhead.

### 2022-03-27 Debian update
The server running geminispace.info has been updated to Debian 11.3 without any issues. :)

### 2022-03-25 improved indexing speed
Some small tweaks to the indexing process, together with the removal of old, now defunct capsules which we still tried to crawl, have dramatically reduced the time needed for a complete update.

### 2022-03-20 dependency hell
We had an outage due to a dependency upgrade that hit us late. `markupsafe`, which is not used by geminispace.info directly but is a dependency of `jinja`, shipped a breaking change in a minor release, which caused trouble for various people. We were just late to the party. It's worked around for the moment; we'll have a look at it later.
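
A minimal sketch of a startup guard against this kind of breakage, not the workaround actually used here: it compares an installed package version against the first release assumed to break things. The function names and the threshold "2.1.0" are assumptions made for illustration; the entry above does not state which release caused the outage.

```python
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(text: str) -> Tuple[int, int, int]:
    """Turn '2.0.1' (or '3.0.0rc1') into a comparable (major, minor, patch) tuple."""
    numbers = []
    # Pad with zeros so that '2.1' compares like '2.1.0'.
    for piece in (text.split(".") + ["0", "0"])[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        numbers.append(int(digits or "0"))
    return (numbers[0], numbers[1], numbers[2])


def is_below(installed: str, first_breaking: str) -> bool:
    """True if the installed version predates the first breaking release."""
    return version_tuple(installed) < version_tuple(first_breaking)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumed threshold, for illustration only.
FIRST_BREAKING = "2.1.0"
found = installed_version("markupsafe")
if found is not None and not is_below(found, FIRST_BREAKING):
    print("warning: markupsafe", found, "may be incompatible")
```

Pinning the transitive dependency in the project's requirements achieves the same thing more directly; a guard like this only makes the failure explicit at startup.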

### 2022-03-19 TLS config update
geminispace.info now allows more variants of TLS ciphers, which will hopefully allow us to crawl even more capsules.

### 2022-03-05 monitoring
geminispace.info is now monitored by shit.cx (and I will be alerted if something goes wrong). Big thanks to Jon for providing this service.
=> gemini://status.shit.cx shit.cx status monitoring.

### 2022-02-08 oopsie
So the last refactor went...erm...upside down. We had an outage for a few hours because of this. I rolled the changes back and will make another attempt at a (hopefully successful) refactor in the next few days *fingers crossed*

### 2022-02-06 filtering clients
I've blocked two IPs for repeatedly making the same stupid requests again and again: