💾 Archived View for station.martinrue.com › krixano › 9185188a3a024a76b271414b50ef010a captured on 2023-03-20 at 19:54:41. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

👽 krixano

New Stats Page on AuraGem: gemini://auragem.space/search/stats

All of indexed Geminispace is about 39 GB, and all text files are around 7.6 GB.
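As a quick check of how those two figures relate, here is a minimal sketch that only divides the approximate sizes quoted above. The variable names are made up for illustration.

```python
# Approximate sizes quoted in the post, in GB.
total_gb = 39.0  # everything AuraGem has indexed
text_gb = 7.6    # all text files

# Fraction of the indexed data that is text.
text_share = text_gb / total_gb
print(f"text files: {text_share:.1%} of indexed data")  # prints "text files: 19.5% of indexed data"
```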

7 months ago · 👍 astroseneca, kakafega, freezr, bencollversphone, smokey

Links

[1] gemini://auragem.space/search/stats

3 Replies

👽 krixano

@smokey As for gemtext accounting for only ~20%, that makes sense once you take into account that gemtext files tend to be smaller than files in other formats. So even if there are more gemtext files, they take up less space. Right now you can see all of the file formats in the index, with counts, on the mimetypes page (though not with sizes, because those take a while for the db to calculate since they aren't cached atm). I will add a link from the stats page to the mimetypes page right now. · 7 months ago
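To illustrate the point about counts versus sizes, here is a rough sketch. The counts and average sizes are invented for illustration and are not AuraGem's real numbers, and `shares` is a hypothetical helper, not part of any crawler.

```python
# Hypothetical index rows: (mimetype, file count, average size in bytes).
# These numbers are invented for illustration, not AuraGem's real stats.
index = [
    ("text/gemini", 8000, 4_000),
    ("image/jpeg", 1500, 60_000),
    ("application/pdf", 500, 200_000),
]

def shares(rows):
    """Return {mimetype: (share of file count, share of total bytes)}."""
    total_count = sum(count for _, count, _ in rows)
    total_bytes = sum(count * avg for _, count, avg in rows)
    return {
        mime: (count / total_count, count * avg / total_bytes)
        for mime, count, avg in rows
    }

result = shares(index)
for mime, (by_count, by_size) in result.items():
    print(f"{mime:18} {by_count:6.1%} of files, {by_size:6.1%} of bytes")
```

With these made-up numbers, gemtext is 80% of the files but under 15% of the bytes, which is the same shape of result as the stats page shows.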

👽 krixano

@smokey You can't ever be sure that you have crawled everything, because there can be servers that aren't linked from anywhere. By "all of indexed" I mainly meant everything AuraGem has indexed, not all of "indexable" geminispace (indexed and indexable mean different things, of course). Also, for something to be indexable, it has to be linked from a page that is itself indexable, so indexable by definition means you would be able to crawl it *eventually*. However, this depends heavily on what seeds you have (what pages you start out with). AuraGem started out with a fairly broad set of seeds, which increases its chances of reaching more sites. · 7 months ago
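The effect of seeds on what is reachable can be sketched with a breadth-first walk over a toy link graph. This is only an illustration under assumed data, not AuraGem's crawler: the URLs, the `links` table, and the `crawl` function are all hypothetical.

```python
from collections import deque

# Hypothetical link graph: each URL maps to the URLs it links to.
links = {
    "gemini://a.example/": ["gemini://b.example/"],
    "gemini://b.example/": ["gemini://a.example/", "gemini://c.example/"],
    "gemini://c.example/": [],
    # This capsule links out, but nothing links to it.
    "gemini://island.example/": ["gemini://a.example/"],
}

def crawl(seeds, links):
    """Return every URL reachable from the seeds by following links."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

narrow = crawl(["gemini://a.example/"], links)
broad = crawl(["gemini://a.example/", "gemini://island.example/"], links)
print(len(narrow), len(broad))  # prints "3 4"
```

The narrow crawl finishes without ever seeing the unlinked capsule, and nothing in its own results would reveal that something is missing. Only a broader seed set picks it up.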

👽 smokey

Very cool! How can you be sure auragem has crawled *all* of indexable geminispace and not just part of it? Does the crawler's discovery rate eventually slowing to a near halt give a good indication of a complete crawl? Also, if gemtext only accounts for ~20 percent of geminispace data, what are the other data types, ordered from highest to lowest? · 7 months ago