After the creation date, putting any information in the name is asking for trouble one way or another.
Clearly these suggestions predate SEO.
Do you really feel that the old URIs cannot be kept running? If so, you chose them very badly. Think of your new ones so that you will be able to keep them running after the next redesign.
"Never do anything until you're willing to commit to it forever" is not a philosophy I'm willing to embrace for my own stuff, thanks. Bizarre how blithely people toss this out there. Follow the logic further: don't rent a domain name until you have enough money in a trust to pay for its renewals in perpetuity!
Think of the URI space as an abstract space, perfectly organized. Then, make a mapping onto whatever reality you actually use to implement it. Then, tell your server. You can even write bits of your server to make it just right.
Oh, well if it's capable of implementing something _abstract_, I'm sure that means there will never be any problems. (See: the history of taxonomy and library science)
Going a little in the opposite direction is the best compromise: keep the information that was available via the old url accessible via the old url in perpetuity (or for the duration of the website). Even a redirect is better than destroying URLs that have been linked to elsewhere for years.
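A minimal sketch of that "even a redirect is better" idea, using only the Python standard library; the paths, the mapping, and the port are made-up examples, and a real site would normally let the web server's own redirect directives do this:

```python
# Keep old URLs answering with a permanent redirect instead of a 404.
# The old->new mapping is hand-maintained and only ever grows.
from http.server import BaseHTTPRequestHandler, HTTPServer

REDIRECTS = {
    "/2003/old-article.html": "/articles/old-article/",   # hypothetical
    "/cgi-bin/page.php?id=42": "/pages/42/",               # hypothetical
}

class RedirectingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        new_path = REDIRECTS.get(self.path)
        if new_path:
            self.send_response(301)                 # moved permanently
            self.send_header("Location", new_path)
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), RedirectingHandler).serve_forever()
```

Anything not in the map still 404s, so the map has to grow with every reorganisation, which is exactly the discipline the thousand-redirect Apache config further down describes.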
So what happened to that URN discussion? It has been 20 years. Have there been any results I can actually use on the Web today? I am aware that BitTorrent, Freenet and IPFS use hash-based URIs, though none of them are really part of the actual Web. There is also rfc6920, but I don't think I have ever seen that one in the wild.
Hashes aside, linking to a book by its ISBN doesn't seem to exist either as far as I am aware, at least not without using Wikipedia's or books.google.com's services.
IEEE Xplore at least uses DOIs for research papers. Don't know if anyone else does, though.
Everyone uses DOIs for research papers, and prefixing the DOI with https://doi.org/ will take you there. In fact, I think the URI form is now the preferred way of printing DOIs.
There are over a thousand redirects in the Apache config file of a company I contracted with. The website was 20 years old when I worked there; it is now 26, and AFAIK they still stick to this principle. And it's still a creaky old LAMP stack. It can be done, but only if this inequality holds:
URL indexing discipline > number of site URLs
(There was no CMS, every page was hand-written PHP. And to be frank, maintenance was FAR simpler than the SPA frameworks I work with today.)
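One way to keep that kind of discipline once the list passes a thousand entries is to treat the mapping as data and generate the Apache directives from it. A sketch, assuming a hypothetical two-column redirects.csv (old path, new path); Redirect permanent is the stock mod_alias directive:

```python
# Generate "Redirect permanent /old /new" lines from a CSV mapping,
# so the redirect list stays reviewable and diffable as it grows.
import csv

def generate_apache_redirects(mapping_csv: str) -> str:
    lines = []
    with open(mapping_csv, newline="") as f:
        for old, new in csv.reader(f):
            lines.append(f"Redirect permanent {old} {new}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(generate_apache_redirects("redirects.csv"))  # hypothetical file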
Cool rules of thumb don't run contrary to human behaviour and/or rules of nature.
If what you want is a library and a persistent namespace, you'll need to create institutions which enforce those. Collective behaviour on its own won't deliver, and chastisement won't help.
(I'd fought this fight for a few decades. I was wrong. I admit it.)
People can know what good behaviour is, and not do good; that doesn't mean it isn't helpful to disseminate (widely-agreed-upon!) ideas about what is good. The point is to give the people who _want_ to do good, the information they need in order to do good.
It's all just the Golden Rule in the end; but the Golden Rule needs an accompaniment of knowledge about _what_ struggles people tend to encounter in the world—what invisible problems you might be introducing for others, that you won't notice because they haven't happened to you yet.
"Clicking on links to stuff you needed only to find them broken" is one such struggle; and so "not breaking your own URLs, such that, under the veil of ignorance, you might encounter fewer broken links in the world" is one such corollary to the Golden Rule.
In this case ... it's all but certainly a losing battle.
Keep in mind that when this was written, the Web had been in general release for about 7 years. The rant itself was a response to the emergent phenomenon that _URIs were not static and unchanging_. The Web as a whole was a small fraction of its present size --- the online population was (roughly) 100x smaller, and it looks as if the number of Internet domains has grown by about the same (1.3 million ~1997 vs. > 140 million in 2019Q3, growing by about 1.5 million per year). The total number of websites in 2021 depends on what and how you count, but is around 200 million active and 1.7 billion total.
https://www.nic.funet.fi/index/FUNET/history/internet/en/kas...
https://makeawebsitehub.com/how-many-domains-are-there/
https://websitesetup.org/news/how-many-websites-are-there/
And we've got thirty years of experience telling us that the _mean_ life of a URL is on the order of months, not decades.
If your goal is stable and preserved URLs and references, you're gonna need another plan, 'coz this one? It ain't workin' sunshine.
What's good, in this case, is to provide a mechanism for archival, preferably multiple, and a means of searching that archive to find specific content of interest.
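One archival mechanism that exists today is the Internet Archive's "Save Page Now" endpoint, which takes the target URL appended to https://web.archive.org/save/. A sketch only; the snapshot-location header is an assumption and error handling is omitted:

```python
# Ask the Wayback Machine to capture a URL ("Save Page Now").
import urllib.request

def archive_url(url: str) -> str:
    req = urllib.request.Request(
        "https://web.archive.org/save/" + url,
        headers={"User-Agent": "archive-sketch/0.1"},
    )
    with urllib.request.urlopen(req) as resp:
        # The snapshot location is usually reflected in Content-Location
        # or in the final redirected URL (assumption; verify the response).
        return resp.headers.get("Content-Location", resp.geturl())

if __name__ == "__main__":
    print(archive_url("https://www.w3.org/Provider/Style/URI"))
```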
Collective behavior can work if it’s incentivized.
Not where alternative incentives are stronger.
Preservation for infinity is competing with current imperatives. The future virtually always loses that fight.
Any favorite strategies for achieving this in practice, e.g. across site infrastructure migrations (changing CMS, static site generators, etc.)?
Personally, about the only things that have worked for me are UUID/SHA/random-ID links (awful for humans, but it's relatively easy to migrate a database) or hand-maintaining a list of all pages hosted and hand-checking them on changes. Neither is a Good Solution™ imo: one's human-unfriendly, and the other's impossible to scale, has a high failure rate, and rarely survives migrating between humans.
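A sketch of the hand-maintained-list approach, which at least takes the hand-checking out of it: walk a hypothetical urls.txt index (one URL per line) and flag anything that no longer resolves after a migration:

```python
# Check every URL in a hand-maintained index and report breakage.
import urllib.error
import urllib.request

def check(url: str) -> str:
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return f"OK {resp.status} {url}"       # redirects are followed
    except urllib.error.HTTPError as e:
        return f"BROKEN {e.code} {url}"            # e.g. 404 after a move
    except urllib.error.URLError as e:
        return f"ERROR {e.reason} {url}"           # DNS, TLS, timeouts...

if __name__ == "__main__":
    with open("urls.txt") as f:                    # hypothetical index file
        for line in f:
            url = line.strip()
            if url:
                print(check(url))
```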
June 17, 2021, 309 points, 140 comments
https://news.ycombinator.com/item?id=27537840
July 17, 2020, 387 points, 156 comments
https://news.ycombinator.com/item?id=23865484
May 17, 2016, 297 points, 122 comments
https://news.ycombinator.com/item?id=11712449
June 25, 2012, 187 points, 84 comments
https://news.ycombinator.com/item?id=4154927
April 28, 2011, 115 points, 26 comments
https://news.ycombinator.com/item?id=2492566
April 28, 2008, 33 points, 9 comments
https://news.ycombinator.com/item?id=175199
(and a few more that didn't take off)
I know it seems to be part of HN culture to make these lists, but I'm not sure why. There's a "past" link with every story that provides a comprehensive search for anyone who is interested in the past discussions :-/
Immediacy and curation have value.
Note that dang will post these as well. He's got an automated tool to generate the lists, which ... would be nice to share if it's shareable.
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
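For anyone who wants to generate such a list themselves, the public Algolia HN Search API can do it. A sketch; the endpoint and field names reflect that API as I understand it, so treat them as assumptions:

```python
# List past HN submissions matching a query, most-upvoted first.
import json
import urllib.parse
import urllib.request

def past_submissions(query: str) -> None:
    qs = urllib.parse.urlencode({"query": query, "tags": "story"})
    url = f"https://hn.algolia.com/api/v1/search?{qs}"
    with urllib.request.urlopen(url) as resp:
        hits = json.load(resp)["hits"]
    for h in sorted(hits, key=lambda h: h.get("points") or 0, reverse=True):
        print(f'{h["created_at"][:10]}, {h.get("points")} points, '
              f'{h.get("num_comments")} comments')
        print(f'https://news.ycombinator.com/item?id={h["objectID"]}')

if __name__ == "__main__":
    past_submissions("Cool URIs don't change")
```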
that URL changed; it used to start `http:` -- now it starts `https:` -- not cool!
The HTTP URL still works fine; it sends you to the right place.
Not exactly, though: it only redirects you to the HTTPS version if it was set up that way. Otherwise, it will show a broken page.
but the entire point of the rule is that you should set up your sites so that old URLs continue to work.
I can't follow – does it or doesn't it?
Except they changed the name from URL to URI!
IPFS gives you immutable links
cool!
Except now you can't update the contents of a page, not even one tiny bit, without also changing its URL, which is equally useless.
While I concede that the ability to retrieve the previous version of a page by visiting the old URL (provided anybody actually still _has_ that old content) might come in handy sometimes, I posit that in the majority of cases people will want to visit the _current_ version of a page by default. Even more so, I as the author of my homepage will want the internal page navigation to always point to the latest version of each page, too.
So then you need an additional translation layer for transforming an "always give me the latest version of this resource"-style link into a "this particular version of this resource" IPFS link (I gather IPNS is supposed to fill that role?), which will then suffer from the same problem as URLs do today.
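To make that trade-off concrete, here is a toy sketch; plain hashlib stands in for IPFS, and the name-to-hash dict stands in for whatever IPNS actually does, so none of this is the real API:

```python
# Content addressing: the link is a hash of the bytes, so any edit
# produces a new link; a second, mutable table maps a stable name to
# the current hash and carries the "latest version" semantics.
import hashlib

def content_address(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

latest = {}  # stable name -> current content address (the mutable layer)

v1 = b"<html>my homepage, v1</html>"
v2 = b"<html>my homepage, v1 (typo fixed)</html>"

addr1, addr2 = content_address(v1), content_address(v2)
assert addr1 != addr2           # one tiny edit, entirely new address

latest["/home"] = addr1         # publish v1
latest["/home"] = addr2         # republish after the edit
print(latest["/home"])          # followers of the stable name now get v2
```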
A content-derived URI is a necessity for some things (indisputable facts, scientific papers, opinions at a moment in time, etc.) but foolish for others. Think of a website displaying the current time, or anything inherently mutable.
But having unchanged documents move to new locations on the same domain and answer with a 404 instead of a redirect is utter, unforgivable failure. Silently deleted documents are also an uncool nuisance.
Both happen a lot. That's what comes to my mind when I read the initial quote.
Until the IPFS devs decide that IPFS protocol v1 might become insecure in the medium term and create a new, secure, better, incompatible IPFS protocol v2, and every link, every search index, every community ever suddenly disappears. Don't laugh. It happened to Tor two months ago. The onion service bandwidth has been reduced by about 2/3 over November as distros and people upgraded to incompatible, v3-only Tor software.
https://metrics.torproject.org/hidserv-rend-relayed-cells.pn...
The weakness is always the people.
> The onion service bandwidth has been reduced by about 2/3 over November
Back to 2019 levels of bandwidth? I feel like I may be misreading that graph, but I'm more curious about what made the bandwidth suddenly spike over the last two years than about the drop back down.
At any rate, I always understood the main point of Tor was providing an overlay network for accessing the internet and maintaining secure anonymity as much as possible, with its own internal network being more of a happy side effect.
I don't think IPFS would be as quick to kill compatibility with a vulnerable hashing algo compared to Tor since they're not aiming for security and anonymity as primary goals.
I feel that, at this point, reminiscing about a time when the web was actually designed to be usable isn't really productive.
Some of the largest companies on the planet are actively opposed to this concept. If you care about this kind of thing champion it from within your own organization.
A lot of things can be said about the impact that the largest companies have on the web, but on the specific discussion about not breaking URIs, I think they are generally good at keeping, or redirecting, links to content for as long as the content is up.
They just create walled gardens instead.
Pinterest is a great example of how organizations put business interests ahead of building an accessible web.