It's probably a good thing some malformed URLs are considered “valid”

It seems it's all too easy to generate double slashes in the path component [1] of a URL (Uniform Resource Locator): I received a report via email that my current [2] feed [3] files [4] all had that issue.

Sigh.

I made a change a few months ago in how I internally store the base URL of my blog. It used to be that I did not store the trailing slash (so that "https://boston.conman.org/" would be stored as "https://boston.conman.org"), so I had code to keep adding it back in when generating links. I changed the code to store the trailing slash, but missed one section of code because I don't subscribe to any of my feed files and didn't notice the issue.
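For illustration only, here is a minimal sketch in Python of one way to make link generation indifferent to whether the stored base URL has a trailing slash. This is not the blog's actual code; the function name join_url is hypothetical, and the sketch assumes the base URL has no query string or fragment.

```python
def join_url(base: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash between them.

    Hypothetical helper: strips any trailing slashes from the base and
    any leading slashes from the path, then joins them with a single "/".
    """
    return base.rstrip("/") + "/" + path.lstrip("/")


# Both ways of storing the base URL produce the same link.
print(join_url("https://boston.conman.org/", "/index.atom"))
print(join_url("https://boston.conman.org", "index.atom"))
```

With a helper like this, a section of code that still adds its own slash would produce the same link as one that doesn't, so a missed call site would not end up emitting "//" in the path.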

I also fixed an actual crashing bug. All I have to say about that is that web robots are quite good at generating really garbage requests [5] using a variety of methods [6]—it's like free fuzz testing [7]! Woo hoo! Sob!

[1] /boston/2023/01/11.1

[2] https://boston.conman.org/bostondiaries.rss

[3] https://boston.conman.org/index.atom

[4] https://boston.conman.org/index.json

[5] /boston/2019/07/09.1

[6] https://www.iana.org/assignments/http-methods/http-methods.xhtml

[7] https://en.wikipedia.org/wiki/Fuzzing
