💾 Archived View for thebird.nl › gn-gemtext-threads › issues › buggy-use-of-urljoin.gmi captured on 2023-01-29 at 02:54:18. Gemini links have been rewritten to link to archived content

Buggy Use of `urllib.parse.urljoin`

Tags

Description

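The `urllib.parse.urljoin` function resolves its second argument relative to the first, the way a browser resolves a relative link: if the path of the base URL does not end in a slash, its last segment is dropped. This leads to subtle errors if the configured URLs do not include the trailing slash.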

For example, if you call the `get_highest_user_access_role` function with the arguments

get_highest_user_access_role("123", "456", gn_proxy_url="https://genenetwork.org/gn3-proxy")

the function does not actually access

https://genenetwork.org/gn3-proxy/available?resource=123&user=456

as one might expect. Instead, it accesses

https://genenetwork.org/available?resource=123&user=456

If you compare the two URLs, you see that the "gn3-proxy" part of the path is dropped. If you include the trailing slash as follows

get_highest_user_access_role("123", "456", gn_proxy_url="https://genenetwork.org/gn3-proxy/")

then it accesses

https://genenetwork.org/gn3-proxy/available?resource=123&user=456

as is expected.
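
The difference can be reproduced with `urljoin` alone. This is a minimal sketch: the relative reference below is an assumption inferred from the URLs shown above, not copied from the implementation of `get_highest_user_access_role`.

```python
from urllib.parse import urljoin

# Assumed relative reference, inferred from the URLs in this issue.
relative = "available?resource=123&user=456"

# Base without a trailing slash: the last path segment ("gn3-proxy")
# is treated like a file name and replaced.
without_slash = urljoin("https://genenetwork.org/gn3-proxy", relative)

# Base with a trailing slash: the whole path is kept.
with_slash = urljoin("https://genenetwork.org/gn3-proxy/", relative)

print(without_slash)  # https://genenetwork.org/available?resource=123&user=456
print(with_slash)     # https://genenetwork.org/gn3-proxy/available?resource=123&user=456
```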

This failure mode is subtle, and troubleshooting it wastes time. We need a more robust way to join the URIs, so that the system always does the expected thing regardless of whether one remembers to add the trailing slash.
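
One possible approach, sketched here under assumptions, is a small wrapper that normalises the slashes before delegating to `urljoin`. The name `join_url` is hypothetical and is not part of the existing code. The sketch assumes the base URL carries no query string or fragment.

```python
from urllib.parse import urljoin


def join_url(base: str, relative: str) -> str:
    """Join `relative` onto `base`, always keeping the full path of `base`.

    Hypothetical helper: forces exactly one trailing slash on the base and
    strips leading slashes from the relative part, so that urljoin neither
    drops the last path segment nor resets to the site root.
    """
    return urljoin(base.rstrip("/") + "/", relative.lstrip("/"))


relative = "available?resource=123&user=456"
# Both calls produce the same URL, with "gn3-proxy" preserved.
print(join_url("https://genenetwork.org/gn3-proxy", relative))
print(join_url("https://genenetwork.org/gn3-proxy/", relative))
```

With such a wrapper the result no longer depends on how the base URL is written in the configuration. The trade-off is that a relative part starting with "/" is no longer treated as relative to the site root, which may or may not be wanted.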