💾 Archived View for perso.pw › blog › articles › openbsd-pkg_add_performance_analysis.gmi captured on 2024-03-21 at 15:54:08. Gemini links have been rewritten to link to archived content

OpenBSD: pkg_add performance analysis

Introduction

OpenBSD's package manager, pkg_add, is known to be quite slow and to use a lot of bandwidth. I'm trying to find easy ways to improve it, and I may have nailed something today by replacing the ftp(1) HTTP client with curl.

Testing protocol

On an OpenBSD -current amd64 system, I ran the command "pkg_add -u -v | head -n 70", which checks for updates of the first 70 packages and then stops. The packages tested are always the same, so the test is reproducible.

The traditional "ftp" client is tested, along with "curl" and "curl -N" (curl with output buffering disabled).

Bandwidth usage was measured with "pfctl -s labels", using a labeled pf match rule that matches the mirror IP. The counters were reset after each test.
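
The timing part of this protocol can be sketched as a small shell helper. This is only an illustration, not the script used for the measurements: the function name "bench" and the labels are made up, and "date +%s" only gives whole seconds, so time(1) is a better fit when finer resolution is needed.

```shell
#!/bin/sh
# bench LABEL CMD [ARGS...]
# Run CMD, discard its output, and print "LABEL<TAB>elapsed seconds".
# Hypothetical helper: name and output format are assumptions.
bench() {
    label=$1
    shift
    start=$(date +%s)            # wall-clock time before the run
    "$@" >/dev/null 2>&1         # the command under test
    end=$(date +%s)              # wall-clock time after the run
    printf '%s\t%d\n' "$label" "$((end - start))"
}

# Example usage on OpenBSD, as root (not executed here):
#   bench ftp-http  sh -c 'pkg_add -u -v | head -n 70'
#   bench curl-http env FETCH_CMD="/usr/local/bin/curl -L -s -q -N" \
#       sh -c 'pkg_add -u -v | head -n 70'
```

Passing FETCH_CMD through env(1) keeps the variable scoped to a single run, so each combination starts from the same environment.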

What happens when pkg_add runs

Here is a quick intro to what happens in the code when you run pkg_add -u on http://

Using the FETCH_CMD variable, it's possible to tell pkg_add to use a command other than /usr/bin/ftp, as long as it understands the "-o -" parameter and also "-S session" for https:// connections. Because curl doesn't support the "-S session=..." parameter, I used a shell wrapper that discards it.
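
A minimal sketch of such a wrapper could look like the following. It is not the script used for these tests. The script name (pkg-curl), the CURL_BIN variable and the fetch_curl function are assumptions for illustration, and since the exact form in which pkg_add passes the option is not shown here, the sketch drops both "-S value" and "-Svalue". The curl flags are the ones given in the conclusion.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/pkg-curl and used as
# FETCH_CMD=/usr/local/bin/pkg-curl (the name is an assumption).

# Path to curl; can be overridden from the environment.
CURL_BIN=${CURL_BIN:-/usr/local/bin/curl}

# Call curl with the given arguments, minus the ftp(1)-style "-S" option.
fetch_curl() {
    skip_next=0
    for arg do
        shift                        # drop the argument being examined
        if [ "$skip_next" -eq 1 ]; then
            skip_next=0              # this was the value following "-S"
            continue
        fi
        case $arg in
            -S)  skip_next=1; continue ;;   # "-S value" in two words
            -S*) continue ;;                # "-Svalue" in one word
        esac
        set -- "$@" "$arg"           # keep every other argument, in order
    done
    "$CURL_BIN" -L -s -q -N "$@"
}

# pkg_add always passes arguments; do nothing when called without any.
if [ "$#" -gt 0 ]; then
    fetch_curl "$@"
fi
```

The filtering matters because curl's own "-S" flag takes no value, so an unfiltered "session=..." word would be read as a URL. The price of dropping it is that whatever ftp(1) does with that session parameter is lost.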

Raw results

I measured the total execution time and the total bytes downloaded for each combination. I don't show the full results here, but I ran the tests multiple times and the standard deviation was close to 0, meaning repeated runs gave the same result each time.

operation               time to run     data transferred
---------               -----------     ----------------
ftp http://             39.01           26
curl -N http://         28.74           12
curl http://            31.76           14
ftp https://            76.55           26
curl -N https://        55.62           15
curl https://           54.51           15
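
To put the table in relative terms, the savings can be computed with a small helper. This is only a sketch: the function name "pct_saved" is made up, and the inputs are copied from the table above, which does not state its units (percentages do not depend on them).

```shell
#!/bin/sh
# pct_saved BASE NEW
# Print by how many percent NEW is lower than BASE, with one decimal.
# Hypothetical helper for reading the results table.
pct_saved() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) * 100 / base }'
}

# ftp vs "curl -N" over http://
pct_saved 39.01 28.74   # time:  prints 26.3
pct_saved 26 12         # data:  prints 53.8

# ftp vs "curl -N" over https://
pct_saved 76.55 55.62   # time:  prints 27.3
```

With these numbers, "curl -N" takes roughly a quarter less time than ftp on both protocols and transfers about half the data over http://.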

Charts with results

Analysis

There are a few surprising facts from the results.

Conclusion

Using http:// is much faster than https://. The trade-off is privacy: a man in the middle can see which packages are downloaded, but the signify signature prevents a maliciously modified package from being installed. Using 'FETCH_CMD="/usr/local/bin/curl -L -s -q -N"' gave the best results.

However, I can't yet explain the very different behaviors between ftp and curl, or between http and https.

Extra: set a download speed limit for pkg_add operations

When curl is used as FETCH_CMD, you can add the "--limit-rate 900k" parameter to cap the transfer speed at the given rate.
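
As a rough sketch, the rate cap can be appended to the curl command line from the conclusion. The helper name "fetch_cmd_limited" is made up for illustration, and only http:// was considered here, since https:// needs the "-S" handling described earlier.

```shell
#!/bin/sh
# fetch_cmd_limited [RATE]
# Print a FETCH_CMD value that caps curl's transfer speed (default 900k).
# Hypothetical helper; the other flags are the ones from the conclusion.
fetch_cmd_limited() {
    rate=${1:-900k}
    printf '%s\n' "/usr/local/bin/curl -L -s -q -N --limit-rate $rate"
}

# Example usage on OpenBSD, as root (not executed here):
#   env FETCH_CMD="$(fetch_cmd_limited 900k)" pkg_add -u
```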