I think most people would agree that the quality of search engine results is worsening. Half of them are either SEO spam or pages trying to sell you something. With AI-written articles apparently being a thing now, the situation won't get any better.
When I mention this to people, they usually suggest something like Searx: a service that aggregates results from multiple sites - Wikipedia, Stack Overflow, etc.
This way you don't get as many results from random sites - which avoids the spam ones... at the cost of also avoiding the well-written personal sites. It throws the baby out with the bathwater.
I think a better approach would be to create a custom crawler, seeded from people's bookmarks, with a depth limit. You'd convince a bunch of people to convert their bookmarks to an HTML page you could point the crawler at, build a bookmarking service together with the search engine, or whatever.
The point is - people don't tend to bookmark garbage. As long as you don't accept bookmarks from arbitrary people (why would you?), it will be hard for SEO spammers to make their way in. And if some spam results crept in anyway, the crawl depth would be small enough that you could trace how they got in and do something about it.
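To make the idea concrete, here's a minimal sketch of what such a depth-limited crawler could look like, assuming the seeds come from a browser's HTML bookmarks export ("bookmarks.html" is a hypothetical filename). It's Python standard library only; a real crawler would also need robots.txt handling, politeness delays, and persistence.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

MAX_DEPTH = 2  # small enough that any result can be traced back to a seed

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(base_url, html):
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, link) for link in parser.links]

def crawl(seed_urls):
    """Breadth-first crawl: bookmarks are depth 0, their links depth 1, etc."""
    seen = set(seed_urls)
    queue = deque((url, 0) for url in seed_urls)
    while queue:
        url, depth = queue.popleft()
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue
        yield url, depth, html  # hand the page off to the indexer here
        if depth < MAX_DEPTH:
            for link in extract_links(url, html):
                if link.startswith("http") and link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))

if __name__ == "__main__":
    # Every link in the bookmarks export counts as a depth-0 seed.
    with open("bookmarks.html") as f:
        seeds = [l for l in extract_links("", f.read()) if l.startswith("http")]
    for url, depth, _ in crawl(seeds):
        print(depth, url)
```

Because every page carries its depth, answering "how did this spam get in?" is just a matter of walking back at most MAX_DEPTH links to a seed bookmark.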
This might not be that crazy of an idea. There are already plenty of open-source full-text search engines out there, and bandwidth isn't that expensive. However, I don't have the experience or the resources to implement this. I'm just throwing the idea out there to see what people think. Hopefully it will interest someone.
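As a rough sketch of the indexing side, SQLite's built-in FTS5 extension (included in most SQLite builds, and usable from Python's stdlib sqlite3 module) can stand in for one of those full-text engines. The crawl() generator here is the one from the sketch above.

```python
import sqlite3

db = sqlite3.connect("index.db")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(url, body)")

def index_pages(pages):
    """Feed (url, depth, html) tuples from the crawler into the index."""
    for url, depth, html in pages:
        db.execute("INSERT INTO pages (url, body) VALUES (?, ?)", (url, html))
    db.commit()

def search(query):
    # FTS5's hidden "rank" column is bm25 by default; ascending order
    # puts the best matches first.
    return db.execute(
        "SELECT url FROM pages WHERE pages MATCH ? ORDER BY rank", (query,)
    ).fetchall()
```

A production setup would strip HTML to text before indexing and store the depth and referrer alongside each page, but even this much gets you a searchable corpus out of a pile of bookmarks.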