No really, these are related.
The HTTP User-Agent is what web clients are maybe supposed to send to identify themselves; not sending this header will result in various web servers hating on the request. For example, with User-Agent not sent, stackoverflow.com denied the request along the lines of:
Access Denied ... Method: block XID: 1045912792-BFI IP: ... X-Forwarded-For: ... User-Agent:
Note the empty User-Agent. So, one must set this header to something. But not to just anything; certain strings will cause requests to be rejected by certain web servers. For example, with a User-Agent of
User-Agent: lwp-request/la la la
the Discord chat CDN will reject the request with the helpful message "error code: 1010". Change the agent to
User-Agent: lpw-request/fa la la la la la
and everything is fine. The conclusion is that various website operators are assholes who block various User-Agent strings. Who knew?
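Scripted clients make setting the header a one-liner. A minimal sketch with libcurl (my choice of library here, not anything mandated above; the URL is a placeholder):

#include <curl/curl.h>

int main(void) {
    CURL *curl = curl_easy_init();
    if (curl == NULL)
        return 1;
    /* any non-empty string the far end does not hate; this one riffs
     * on the lpw-request example above */
    curl_easy_setopt(curl, CURLOPT_USERAGENT, "lpw-request/fa la la la la la");
    curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
    CURLcode res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return res == CURLE_OK ? 0 : 1;
}

Compile with something like cc agent.c -lcurl, then go fish for whichever agent string the server of the day deigns to accept.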
Folks who use web clients that are not zeroday-prone Google bloatware will often pick a User-Agent that identifies their web client as zeroday-prone Google bloatware. This may of course cause problems, as one will run into the CSS esoterica web developers deploy where the zeroday-prone Google bloatware differs in behavior from the zeroday-prone, Google-funded bloatware. That would be Chrome and Firefox, respectively. On the other hand, given that website operators are assholes, one may need to set Chrome as the User-Agent so that various websites then magically work in Firefox. At least we have moved on, somewhat, from those Internet Explorer on Windows only websites?
Statistics based on the User-Agent therefore will have error bars of some size attached, though the number of people who do actually modify the header is probably pretty low.
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 8.0; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; MS-RTC LM 8)
I wonder how many puritanical webservers will reject this agent...
To be fair, web client users are also assholes. Given that IT--or whatever they are calling it these days, devops, perhaps--is generally overworked and underfunded, if a string match on the User-Agent saves you X amount of work per day or Y amount of needless CPU burn, that block is going to happen.
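The blunt instrument in question is usually just a substring match. A hedged sketch of the idea; the deny list and the function are made up for illustration, not any particular CDN's rules:

#include <string.h>

/* hypothetical deny list; real operators have longer and stranger ones */
static const char *blocked[] = { "lwp-request", "python-requests", "curl/" };

int agent_is_blocked(const char *agent) {
    if (agent == NULL || agent[0] == '\0')
        return 1;   /* empty User-Agent: rejected, as stackoverflow.com did above */
    for (size_t i = 0; i < sizeof blocked / sizeof blocked[0]; i++)
        if (strstr(agent, blocked[i]) != NULL)
            return 1;   /* matched; send a 403 and get on with your day */
    return 0;
}

A strstr() on "lwp-request" is also exactly why the "lpw-request" typo above sails right through.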
$ urlgrep '[.]pdf' https://www.drdobbs.com/parallel/graphics-programming-black-book/184404919
error: could not fetch resource: errno=Forbidden, status=403
$ http_agent=meow urlgrep '[.]pdf' https://www.drdobbs.com/parallel/graphics-programming-black-book/184404919
...
What am I supposed to do? Manually click on 77 links like a caveman?
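Supporting that sort of workaround in a client is cheap: read the agent from the environment, as the http_agent=meow invocation above does. A sketch of the idea; the default string is an assumption, not urlgrep's actual internals:

#include <stdlib.h>

/* pick the User-Agent, preferring an override from the environment */
const char *pick_user_agent(void) {
    const char *agent = getenv("http_agent");
    if (agent != NULL && agent[0] != '\0')
        return agent;           /* e.g. http_agent=meow */
    return "urlgrep/0.1";       /* made-up default for illustration */
}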
Next up is ACPI, the Advanced Configuration and Power Interface, whose relation to the HTTP User-Agent header may seem dim. However, just as web clients forge the User-Agent string, non-Windows operating systems forge their identity to the firmware:
The ACPI specification is not followed by most vendors. It seems it is just a point reference at best. Some machines call magic CMOS methods to do stuff and create stubs for the related specification methods ... OpenBSD has to register as Windows to the BIOS in order to get the proper methods since many implementations will provide empty or broken methods for other operating systems.
https://irofti.net/papers/zzz.pdf
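The forgery here looks much like the HTTP one. A rough sketch of an OS answering the firmware's _OSI() queries; the interface strings are the usual Windows ones a DSDT probes for, but the table and function are illustrative, not copied from OpenBSD:

#include <string.h>

/* interface strings a DSDT commonly probes via _OSI(); claiming them all
 * keeps the firmware on its well-tested Windows code paths */
static const char *claimed[] = {
    "Windows 2001",     /* XP era */
    "Windows 2006",     /* Vista era */
    "Windows 2012",     /* Windows 8 era */
    "Windows 2015",     /* Windows 10 era */
};

int osi_query(const char *interface) {
    for (size_t i = 0; i < sizeof claimed / sizeof claimed[0]; i++)
        if (strcmp(interface, claimed[i]) == 0)
            return 1;   /* why yes, we are totally Windows */
    return 0;
}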
One might wonder whether these protocols could be simplified: why waste bytes on the User-Agent header, and since ACPI firmware in practice supports little besides Windows, why not make that the sole interface to implement? Chesterton's Fence is a thing. And fixing a protocol after the barn doors were left wide open will likely not happen, or could take a very long time. Hindsight is also a thing, and the computer field is very young.
Odds are, we'll have to kluge along as best we can. (This view of the future may differ somewhat from that of the Marketing department, or whatever it is that goes on in the C-Suite.)
tags #w3m #legacyweb