Don't write internal cli tools in python
100%. I stopped writing anything that had to be deployed (basically everything except Jupyter notebooks for data stuff) in Python because it’s truly a nightmare. Go and goreleaser is great for writing a CLI (and if it’s public, it can auto generate binaries and upload to GitHub, create a Homebrew/Scoop bucket, etc)
https://docs.python.org/3/library/zipapp.html
_Using the zipapp module, it is possible to create self-contained Python programs, which can be distributed to end users who only need to have a suitable version of Python installed on their system. The key to doing this is to bundle all of the application’s dependencies into the archive, along with the application code_
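For reference, the workflow from the zipapp docs boils down to a couple of commands. A minimal sketch, assuming your code lives in a `myapp/` directory with a `__main__.py` entry point (the names here are illustrative):

```
# Vendor the dependencies into the app directory, then zip the whole thing up.
python -m pip install -r requirements.txt --target myapp
python -m zipapp -p "/usr/bin/env python3" myapp

# End users only need a suitable Python on PATH:
python3 myapp.pyz
```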
Venv+pip or pipx are also fairly good if you have a pip registry.
If your users are all developers I’ve found `go install` an absolute winner for CLIs.
I have the same issue. I'm pretty sure you can dig through my post history and find a blanket statement of me swearing off python for all new projects. Here are some ... reflections.
In short, a package manager + git is the easiest way to distribute and maintain software ... once you know how to use them. Configuring in a new environment? There is nothing that works reliably, other than maybe a full-blown, human-level intelligent agent that is a senior developer with direct access to poke the system until it works.
Further, developers with little time or experience packaging software for regular users may balk at what they rightly perceive as a route that is more difficult and creates more work in the long run, because as a dev you can't just drop a "fix pushed" message in your messenger of choice and go back to work.
Static linking seems like it is a solution until you have to distribute to multiple operating systems. Only things like cosmopolitan libc have been able to overcome this, but even that requires a binary blob.
In long ....
I maintain a dozen or so python packages. Small stuff, mostly used by myself and my research group. At some point I needed to make some of that functionality available for other devs and some non-technical users. It was and is a massive pain.
Why was it such an issue? I think it is because many of the quick internal scripts were originally written assuming that only technical users as advanced or nearly as advanced as the author would be using them, and that any such user would be able to use git, configure their environment, and generally keep their system up to date using a package manager.
Aside. Add to that the fact that my dev environment is Gentoo, which has one of the sanest python environment management solutions available, and every time I step outside it or get reports from the poor lost souls trying to run my code "elsewhere" I enter a world of sanity-destroying chaos. The workarounds that I created to try and mitigate some of the issues were poorly designed because I tried to YAGNI without realizing that you're inevitably gonna need it (YIGNI?) and that the technical debt surrounding issues with interfacing with the environment is 10x more time consuming than anything else, especially if it is in the form of reports from souls lost in the warp.
Even if a dev can use git and a package manager they can still get stuck configuring the environment. Seems easy until there are secrets to distribute or someone needs to run something in a slightly different environment and nothing works.
For non-technical users it can be a stretch to get docker set up correctly, especially if they need X11. If I can get ssh access to the machine then there is a chance, but that doesn't scale beyond maybe 2 or 3 users. I have seen enough of this to have started to seriously consider just learning how to create installers for macos and windows and build them via CI. The long term solution is to switch off python completely, but the legacy code will continue to haunt us for a long time.
I've learned docker to try to create controlled environments. I've looked at statically linking SBCL to musl libc. I've looked into cosmopolitan libc. I've explored trying to use Emacs, elisp, and org-mode as another possible route to bootstrap stable and predictable environments. I've looked into Gentoo prefix. I've wondered whether I could get users to install Pharo, but realized that it is already a tossup whether they can get docker running. I'm so sick of having to repeatedly do searches in high dimensional configuration space (fight with the environment) just to get some random code to run for a user who can't deal with the configuration space themselves (and it's not their fault!). At this point it would be easier to wire all my users up to a mitogen botnet ... except that some of them use windows!
Telling a user "install python" or "install x" and then run this code leaves out the "and resolve the 10 completely novel errors you will encounter along the way." This is because there are at least 5 ways to "install python" (oh, don't forget the even bigger nightmare that is "install pip") on any given operating system (forget distros [0]), and each one results in a setup where things like data or configuration files are installed in different places with no obvious way to figure out where those places are, or you are forced to resort to a runtime lookup that causes CLI program startup time to explode to hundreds of milliseconds, making the entire point of having a quick CLI script moot.
Hello world may be the easiest program to write but it is by far the hardest program to run. Dependencies? Data? Network? Good luck.
[0] https://repology.org/project/python/packages
Why is Python bad for CLI applications?
Right from the very first bytes of your CLI app you run into issues. The interpreter line. You have to choose #!/usr/bin/python or #!/usr/bin/python2 or #!/usr/bin/python3 (or similar variations with /usr/bin/env). All of these are happily ignored by Windows. The latter two may not exist on some platforms. The first one could be invoking either a python2 or python3 interpreter.
There's some python magic for determining the version, but if you use a feature not supported by the version it raises an uncatchable exception...because python's parser has control before your code has control.
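One workaround is to keep the entry-point file itself restricted to syntax that any interpreter you might land on can parse, and only import the modules that use newer features after a version check has passed; a sketch (the module name is made up):

```
#!/usr/bin/env python
# Entry point kept deliberately boring: only syntax that parses on both
# Python 2 and 3, so the version check runs before anything newer is parsed.
import sys

if sys.version_info < (3, 8):
    sys.stderr.write("This tool requires Python 3.8 or newer\n")
    sys.exit(1)

# Only imported after the check passes; this module is free to use
# f-strings, walrus operators, etc. 'mytool.cli' is a hypothetical name.
from mytool.cli import main

main()
```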
This isn't the only deployment issue, but a good example of what you're in for.
Packaging and deploying Python apps is an adventure.
you should try PyInstaller
If you don't know how to do something, everything is an adventure.
Your comment really isn't adding any value to understanding the state of Python packaging.
From my personal experience, there's so many things that can crop up, not limited to:
- Conflicts with system packages from e.g. apt
- Conflicts with system packages from easy_install/pip
- Mismatching Python versions (even among 3.x with syntax additions only available in newer versions)
- Different dependency management systems (pip's requirements.txt, setuptools's setup.py, poetry's pyproject.toml, etc.)
- ... and the list goes on if you start talking about C extensions!
Additionally, not everyone does things in the same way which means the easiest way to package Python tools tends to be just bundling Python + its entire standard library together. Contrast this with Go, where `go install` does basically the right thing 9 times out of 10. I'm not a huge Go fan, but the convenience of `go install` is really unmatched.
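For developer audiences that usually amounts to a single command (the module path below is just a placeholder):

```
# Fetches, builds, and drops a binary into $GOBIN (default: ~/go/bin).
go install github.com/example/mytool@latest
```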
In my experience, `pipx install` (with `--system-site-packages` if you want system Qt themes) generally works as a user, though it breaks when the Python version updates, it doesn't generate a zero-dependency easily-deployed binary, and the packaging ecosystem is indeed a serious problem as a developer. Have you experienced other issues using it?
I mentioned this in another reply, but as far as I remember Windows has a way to wrap a Python program as an .exe executable. Maybe it's time for Python to have a universal Go-like binary. Yes, a 1KB program will turn into a 300MB behemoth, but people don't seem to mind it with Electron (I usually don't) so maybe it can work.
virtual environments address all these concerns, no?
Absolutely not. A virtualenv is typically a pointer to a systemwide python, or something managed by yet another piece of crap - chpy, asdf, whatever - they are not self contained. Upgrading the system python will typically break the virtualenv.
Anything, even a statically linked binary can run into problems if you upgrade the entire OS..
I agree. I've used Python for a decade. Go and Node are a pain in the ass compared to Python, simply because I am unfamiliar with them. Use tox to create a virtualenv and then you can run irrespective of system packages and build the virtualenv in a single 'tox' command.
I have successfully used CPAN, Maven, NuGet, Homebrew, and other packaging systems while knowing little about them. You can copy a snippet, install the package, and get back to work. Why should Python be any different?
I think “adventure” implies significant depth into the unknown which one must traverse. A single coin flip is also an unknown but generally wouldn’t be described as an adventure.
Years ago I went on a long distance canoe trip in the Arctic. It rained every day, the mosquitos ate us alive, and I came uncomfortably close to hypothermia. It was a great experience and an _adventure_ - that's what I was thinking of.
Doing the same for a Go CLI or an Electron app is a walk in the park. I did it in minutes from not knowing anything and it worked on people's computers. I'd choose it any day.
I _like_ writing CLI apps in Python, and for various reasons haven't been subjected to the packaging woes, but I think execution latency is an important thing for CLI tools, and I've had enough Python CLI tools spend most of their time (a user-perceptible amount) loading modules that I prefer less dynamic languages for tools that aren't completely trivial.
It makes sense that _you_ as an author don’t think of python packaging as a problem - it’s only really inflicted upon non-developer end users of the products.
Could you write a guide? Pretty please? I've had nothing but woe in this realm, and anytime one asks about this the answers usually contradict each other, or leave out significant detail. I think an article that details your workflow could be tremendously interesting for others.
What problems do you have with CLI apps? In my experience, the problem is whether you're targeting developers (i.e., people that can run pip install <tool>) or end-users (those that just want to run a binary).
Python is hell for the latter. I support Windows and Linux, so I can bend pyinstaller to my will with enough effort, but I need to have a special Windows VM to "cross-compile".
For the former, though, it's a matter of creating a setup.py and telling the user to pip install the repo on GitHub. Or you can use Poetry, which makes things slightly easier.
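For example (the repository URL is a placeholder), installing straight from a Git host is one command, and pip will use whatever setup.py/pyproject.toml it finds there:

```
pip install git+https://github.com/example/mytool.git
```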
i understand that there are subtle differences between minor versions of python, and its batteries. But what about putting it all into a docker image?
So instead of choosing a different language without these particular problems, you add another tool to your chain to handle the problems?
Based on the idea of choosing the right tool for the job, it seems Python is not right for this job.
Don't know what the 'right' tool is. I think it is easier to code up stuff in Python, as compared to golang. I am much more productive in Python, the language is more expressive, imho.
An alternative would be to write in Perl5; perl is pretty much fixed these days (if you insist on not using containers).
Try to compile your go tools in a year or two. I think you will be hard pressed to find the dependent library versions. Packaging is not the strongest part of go. A version in your package translates to a tag in the referenced git repository; now sometimes people do funny things with their repositories, and older tags disappear.
Python and Perl5 are better at packaging. At least you have all of the older library versions on the central package index.
I think that the decentralized approach to packaging (as used in golang) is far from optimal.
Maybe this is all due to the fact that google is using a mono repository internally for their own code, so they are not eating their own 'dog food'.
Sorry for ranting, judgements like 'right tools' really get me going..
“Right tool” is not a value judgement. Python is better suited to certain applications, and go is better suited to others (with overlap obviously). This is partially a function of the language itself (eg, statically compiled or not) and partially a function of ecosystem (eg, writing container tools, go is by far the most pragmatic choice and Python for many ML applications).
I think GP was just saying that most times, fewer moving parts is better. Avoiding Docker is great if you don’t need it (especially for a CLI tool). Obviously there are trade offs that must be evaluated in the context of a given problem and domain.
Docker is a mess if you want to read or write any files on the host system. Which is often what you want to do from a cli.
Mainly because volumes are mounted as the host user UID, while the container might be running any other user, often root.
Quite baffling they haven’t made this easier since it would be a real killer feature unlocking even more potential from containers.
right, one possibility is to pipe standard output to an output file.
Another use case is running a http server locally, for something that has a graphic ui. Like one of my projects here:
https://github.com/mosermichael/s9k
I like oclif - it’s what powers Heroku CLI. Great TypeScript support.
For the past 3 years I've written almost all of my users' system tools in Bash v3. Nobody has yet reported a bug to me. Runs fine on Linux, Mac, and WSL, on every architecture. Single file, easy for anyone to read and edit, gets the job done. I don't even have to care that it's unpopular because there are no installation steps. Just 'bash foo.sh'. Sometimes that means they need to download some Go app that my script uses, but that's much better than me having to write and support all that other code.
If I have to do weird things with data structures or complex logic, I reach for Python, but I do it one of two ways: 1) so stupid that "python foo.py" works (no external deps), or 2) publish it to pypi so the user can just 'pip install foobar'.
I have a similar strategy but Bash... "easy for anyone to read and edit"?
Heck no. Bash is a terrible language by modern standards, with obscure syntax, many weird quirks, no type checking, no return values... How could developers program in Bash before Shellcheck!?
I think that internally, I draw the line at around 100 lines. Even refactoring a program shorter than that is not a pleasant experience (not difficult, but it requires a great deal of attention compared to any modern language).
I've also had bash as a strict requirement for the tooling language. As you say, there are 1% or fewer cases where you benefit from using something else, but that's 1%!
My team had js as the main language and that's what we had that 1% in. Mostly no-deps scripts, but where deps were needed we could hack around npx to install them on demand. I would use deno today for that actually.
But! Almost nobody gets it. If you're into operations: "here we write our tool chain in Python". If you're into dev: "here we write our tool chain in js/go/php/etc/etc Bash would just be a +1"
+1??? You still need to write bash, or shell in general, for orchestration of build/ci/cd pipelines. If someone thinks that in 2021 they can escape learning shell, they are out of their minds. So no, it's not a +1.
Bash is the language that is not only available pretty much everywhere, it's also the one that allows you to write one-liner solutions
https://twitter.com/andreineculau/status/1466107915450916867...
and move on.
As devs we should spend 10x more on solid foundations that work for as many of us, for as many years to come, as possible than on the hip-new-thingie-that-promises-to-fix-all-my-problems. It's as if we took the regex joke and our brains could only comprehend: I'm never using a regex. Keep it simple? Never.
I'm also a huge bash fan, and agree that distributing Python tools is a mess. But "Runs fine on Linux, Mac and WSL" is a stretch. Bash on Mac OS is a ball of shit unless you're installing a newer GNU bash via homebrew. Mac OS bash is just too old. I can't even remember the list of stuff that breaks because I stopped trying. If the target audience is mostly Mac users I'll just write zsh to begin with.
I ran into those problems too initially, but I've learned to avoid them. The version of Bash on Macs is version 3.2.57 iirc. This page has documentation of which features were introduced in v3, v4 (
https://tldp.org/LDP/abs/html/bash2.html
).
What I personally do is write POSIX shell, then add in things like arrays, $BASH_SOURCE, printf %q, and a few other handy features that have been around for a while. To test it, there is an official Docker image for Bash v3.
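A quick way to exercise a script against that old interpreter, assuming the Docker Hub `bash` image and its 3.x tags, might look like this:

```
# Mount the script read-only and run it under Bash 3.2 to catch v4-only features.
docker run --rm -v "$PWD/foo.sh:/foo.sh:ro" bash:3.2 bash /foo.sh
```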
I love bash. It's probably my favourite language for productivity. I spent some time a while back trying to write a Lisp in bash, and have written a ~1kloc performance testing tool to get results while an entire team worked on a Haskell implementation.
Don't write internal cli tools in python
A lot of the advice is good but I take issue with this one. With poetry and docker, packaging Python apps for easy consumption is a non-issue. Same for Ruby. If you can get your team to standardize on poetry, you might not even need containers -- but, honestly, running these tools from CI or automation (anywhere) is so useful that you probably want container versions anyway.
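For instance, Poetry lets you declare the CLI entry point in pyproject.toml, so `poetry install` (or a `pipx install` of the built wheel) puts the command on the PATH. A sketch with made-up names:

```
[tool.poetry]
name = "mytool"
version = "0.1.0"
description = "Internal CLI tool"
authors = ["Example Dev <dev@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"

[tool.poetry.scripts]
mytool = "mytool.cli:main"
```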
Golang is not a good fit for exploratory CLIs that work with complex data structures and are written for one-off, low-CPU consumption purposes -- not for scale-up API services. Just having an interactive shell (or `pry` in Ruby, those two are identical for the purpose) saved me probably weeks of time. Trying to unit test any moderately complex API surface brings me to tears when I compare it to trivial object mocking in something like Ruby.
Python (or Ruby) are ideal for this and have excellent frameworks for CLI tools.
> With poetry and docker, packaging Python apps for easy consumption is a non-issue.
This is why you shouldn't write CLI tools in Python, you need frigging docker to package them
You most certainly don't, but many people seem to have lost their minds around packaging.
Well it depends, right? My most recent Python packaging adventure was managed through Docker, but this was because I also had Java-based dependencies (reAgent RL framework).
Given how much easier that was than getting people to use conda/pip (conda is better for DS stuff as it handles C-level dependencies), I completely understand how people just suggest Docker.
Like, if you're doing pure Python lots of this may not be necessary, but as soon as you start having multiple C-level dependencies pip breaks (and pip didn't even do dependency resolution till last year(!)).
I recently had to package a machine learning Jupyter notebook created by a very smart person, albeit a scientist, not a developer. Getting the tool somehow into production, making it reproducible, testable, and maintainable, has proven to be a major headache.
Before, I only casually dabbled in the python world, but this was the first time I had to care about packaging, dependencies, CI and the like. Turns out, as TFA said, nobody knows how to package python apps right. For what it’s worth, I wasn’t even able to find some kind of best practice to manage friggin dependencies. There’s like a myriad of package managers, all of them work differently, and nobody seemed to have something like redistributing an app to other people in mind. Coming from PHP, JavaScript and Go, this was utterly ridiculous to me.
Go ahead, tell me I got it all wrong and it’s really easy using tool Xyz, but if a developer with some experience under their belt isn’t able to figure this out in a few days, things are just broken.
I have to ask, did you read
?
I find that the majority of the people who find difficulty with packaging are really just struggling with documentation and ecosystem fragmentation, as you hinted at. For example, that page I linked weirdly focuses on one specific tool (Pipenv) and doesn't attempt to survey the landscape of packaging and dependency management tools.
There tends to be a poor balance of "explanation", "example", and "overview" in these documents. I would say that I don't quite know why, but I know that writing good docs is incredibly difficult, so I have sympathy for people who've tried and didn't quite succeed.
One thing I'll offer is that, in the Python world, "managing dependencies" is distinct from "packaging". This seems to be different from other language ecosystems, where there is 1 package management tool that does everything.
That said, I take serious issue with the claim in the article that you quoted:
> Nobody knows how to correctly install and package Python apps.
This simply is not true. There are lots and lots of Python applications that are packaged and that you can install without a problem. You can complain about the need for virtual environments instead of having "app-local" packages by default, sure. But that's not the issue in most cases.
I think ultimately this amounts to FUD. Not because your experience is invalid, but because the blame is put on the wrong thing. The fact that Python packaging documentation is generally dogshit doesn't mean that the actual process of making a Python package is hard. If someone sat with you and showed you how to set things up the first time, I am confident that you would have no trouble cranking out Python packages that you can distribute to your engineering team without any problems.
> I wasn’t even able to find some kind of best practice to manage friggin dependencies
Again, this is a documentation problem.
> There’s like a myriad of package managers, all of them work differently.
Not really, Pip is still mostly the only package manager in town. But there are several tools like Pipenv and Poetry that replace the old Setuptools when it comes to actually building and installing packages, as well as generating lockfiles for dependencies (which Setuptools does not do at all).
I don't find it particularly bad that there are a few different tools to do similar jobs. I do however find it upsetting that the Python packaging document I linked doesn't even attempt to describe them or explain when/why you would use one or the other.
> nobody seemed to had something like redistributing an app to other people on their mind.
That's just outright untrue. Otherwise, PyPI wouldn't exist. Again, this sounds like a documentation problem.
> machine learning Jupyter notebook
The other issue here is that you're looking at a machine learning research script, not an "application" as such. Machine learning libraries tend to have complicated dependencies with bindings to C and C++ libraries that might or might not be bundled with the packages themselves. This would be and is a problem in _any_ language ecosystem, e.g. Node or Ruby or Perl.
These are NOT easy applications when it comes to packaging.
Another issue with your scientist's code is that they were very likely using Conda, which is kind of its own thing. Conda is somewhat language-agnostic, and largely bypasses all the established Python "developer" tooling, in pursuit of somewhat-reproducible computing environments in support of reproducible research. Conda is very popular precisely because machine learning libraries tend to be so complicated to package correctly, and Conda solves that problem as a kind of portable alternative to Deb, RPM, et al.
Unfortunately, Conda poses some challenges if/when you need to take a project that lives in a Conda environment and transfer it to a more typical Python package, mostly because Conda environments can control things like the C compiler, while Python package managers cannot.
However, I will insist that this is not a Python-specific problem. If they wrote their application in Clojure, for example, you would have the exact same issue in migrating it from Conda to the usual suite of Clojure tools.
Oh, and speaking of Clojure: Python isn't the only language with the "which tool do I use and why are they all so complicated?" problem. I don't have a goddamn clue how to package for Clojure (I tried!), but I'm not going off on a forum about how nobody should use Clojure for internal tools.
> Unfortunately, Conda poses some challenges if/when you need to take a project that lives in a Conda environment and transfer it to a more typical Python package, mostly because Conda environments can control things like the C compiler, while Python package managers cannot.
To be honest, having been in this situation it generally makes more sense to implement the pip dependencies inside the conda venv (by running pip), as conda can handle the C-level dependencies and pip is mostly perfect (modulo back when it didn't actually _manage dependencies_) for pure python stuff.
Also, it sounds like you know a lot about python packaging, please by the love of all that's holy, write this stuff down somewhere other than HN. There really isn't a lot of good docs around this.
That's pretty much how I've done the Conda transition too. Switch to using Pip inside the Conda env to suss out what the non-Python deps are. Also a lot of packages are distributed on PyPI as binary packages now, so as long as you aren't using Musl in production you should just be able to use those instead.
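In practice that pattern usually ends up captured in an environment.yml so the whole setup is a single `conda env create -f environment.yml`; a sketch with illustrative package names:

```
# environment.yml: conda supplies the interpreter and compiled deps,
# pip fills in the pure-Python packages on top.
name: mytool
channels:
  - conda-forge
dependencies:
  - python=3.9
  - numpy        # heavy compiled dependency, handled by conda
  - pip
  - pip:
      - requests # pure-Python dependency, installed by pip inside the env
```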
> Also, it sounds like you know a lot about python packaging, please by the love of all that's holy, write this stuff down somewhere other than HN. There really isn't a lot of good docs around this.
Believe me, I want to! I am pretty bad at managing my "project/writing wishlist".
After reading the linked documentation, it seems that it is fairly easy to package a python package. But building a python application (CLI) has another dedicated page that is much scarier and much less easy to follow:
https://packaging.python.org/tutorials/packaging-projects/
Let me preface this by saying thank you for your insightful response, and the many suggestions - I'm definitely going to revalidate some of my assumptions and try to package that app again!
I peeked at the documentation, but found it both overwhelming and unfocused, thus hard to follow. As you say, it's focused on specific tools and feels more like a reference than actual instructions for what I was aiming to do.
> One thing I'll offer is that, in the Python world, "managing dependencies" is distinct from "packaging". This seems to be different from other language ecosystems, where there is 1 package management tool that does everything.
That might be the actual thing that stumped me: I don't intend on publishing my package somewhere publicly, but simply enable other devs on the team to install the required dependencies with a single command, without any assumptions on their environment, which seems awkward to me.
> That said, I take serious issue with the claim in the article that you quoted:
> Nobody knows how to correctly install and package Python apps.
> This simply is not true. There are lots and lots of Python applications that are packaged and that you can install without a problem. You can complain about the need for virtual environments instead of having "app-local" packages by default, sure. But that's not the issue in most cases.
That is because we understood that quote differently. In your post you mention Pip, Pipenv, Poetry, Setuptools, PyPI, and Conda. Additionally I came across Virtualenv, and Pyenv. All of these tools are somehow related to python packages and dependency management, and everyone on the internet seems to have their own opinion on which combination is the best. There seems to be no consensus on one of the most foundational tasks in software engineering, namely sharing code with other people. To me that counts as nobody knowing how to do it correctly.
That said, reading the docs again, which clearly recommend using Pipenv - and that seems to be simple enough - I probably should have headed for the official docs right away.
> These are NOT easy applications when it comes to packaging.
I noticed that :) Caused me to hit the wall multiple times, for example while using Alpine Linux/MUSL and wondering why the C bindings had to be compiled on every install. But you're right, that would bring its own complexities in every language and might be partially responsible for my general frustration with python dependencies.
While the scientist didn't use Conda (I think he just happily installs everything he needs globally), that was the recommendation I found on one of his primary packages' website (I think it was pandas), so I followed through and tried to set things up using Conda. It was a nightmare, frankly.
I'm really not opposed to learning new things. And I get that my assumption that packages live inside an application directory by default might be just that, and colored by other ecosystems I'm more used to. And, as I said earlier, I'm definitely going to check out Pipenv and try to refactor that thing into something more standards-compliant.
Still, I think Python places more roadblocks than necessary in your way, and the combination of bad docs, several opinions on how to do basic stuff, as well as the bad habit of global dependencies, make it difficult for newbies to deal with Python.
Which, all in all, confirms the statement in TFA for me: Don't use Python for internal tools.
And frankly, as someone affected by Python packaging, I'm definitely going to complain on forums as much as I want if things are harder than they have to be. Just because other systems have their warts too doesn't mean there's nothing to improve.
It has taken everybody else thirty years to realise: Python is not very good....
Python is fucking phenomenal. Packaging python is an exercise in frustration until you become a level 3 wizard.
It really isn't that hard to package for Python. Certainly a lot easier than packaging for Debian, and not much different from packaging for Ruby or Node.
The bad part is the Setuptools documentation, which is slowly improving, but is (and has been for years) so bad that almost nobody can learn from reading it.
Ruby is pretty good in this, but Python I'd heavily argue against. Maybe in another decade when they manage to restabilize what they wrought - it used to be easy.
Python gets more and more complex the further away the people running the tool are from fancy recentish distros (Fedora, Ubuntu, Arch) or special constrained environments (Nix, running the CLI in docker). The moment you have to deal with unspecified RHEL version (6 is still reasonably common) or derivative, or Mac or Windows, kiss any expectation of python packaging being nice "bye bye".
Unless of course you have a platform team that can handle the packaging and distribution for you, but then it probably falls a bit under "constrained environment".
Ruby is also a pain in the butt, fwiw. It's just slightly better because gems are standardized, but you still need something like rbenv to manage Ruby versions, and gods help you if you need openssl.
In my experience the major difference is that rbenv feels more like a convenience for upgrading ruby on truly outdated systems (or when you can't use prepackaged for reasons), and that for the majority of work, minor version differences are at most one line in the Gemspec away.
I really, _really_ can't say that about Python.
I'd state it as "don't use external dependencies carelessly". It applies to more than just Python.
Writing and deploying Python tools is easy if all your dependencies come in distro packages.
And your code is carefully written to run perfectly fine on both 2.7 and anything between at least 3.4 and 3.10, though it might be better to handle versions as early as 3.0
... I might have somewhat similar scars to TFA author, I guess...
The only big problems with such an approach are string formatting (f-strings are really, really nice), and also dictionary ordering, which is massively different between 3.5 and 3.7
Presumably this is known to all the people who've been doing Python for a long time, but it bit me in the ass relatively recently.
> Golang is not a good fit for exploratory CLIs that work with complex data structures and are written for one-off, low-CPU consumption purposes
"exploratory CLIs that work with complex data structures and are written for one-off ... purposes"
Sounds like a maintenance nightmare if you are actually "deploying" Python scripts that fit this definition. Yeah, golang will usually require a bit more boilerplate up front, but it is going to make your workflows infinitely more maintainable & flexible in the long term due to type safety + easy (and docker-less) packaging.
IMHO, if it's not a literal one-liner in ($SHELL|curl|awk|jq|*), it should probably be done right the first time in golang/etc or go back to the drawing board.
There's a category of scripts that are complex enough so that shell is too convoluted, but can be a 10-20 lines Python script that would be 50-100 in Go.
By the way, OP specifically mentioned "one-off", which you also quoted. Why mention deployment?
> Why mention deployment?
Because the article does.
> With poetry and docker, packaging Python apps for easy consumption is a non-issue.
Yeah, apps it's worth doing this for, cli tools it's not.
The packaging thing (and I don't necessarily think "put it in Docker" is a great answer, especially if you're using Python as glue -- why isolate your glue?) is something you kind of solve once and are done with. You probably _already_ want/need a common development environment between people that is close to what you deploy, and adding a fixed Python into the mix with pyenv or whatever is not a big deal.
I don't know, I've generally had a good time with it. You have to know how to do it and where the foot guns are (I'm still learning), but it's better than being comparatively gimped in development velocity using Go, never mind Rust or C++.
No one wants a CLI in Ruby, and if you look around there is no popular CLI written in Ruby. It's one of the worst languages to write a CLI in because no one has the Ruby runtime installed on their machine, then you have different architectures, different OSes, etc ... Go is way, way better than Ruby on that topic.
As for complex data structures, I don't understand exactly what you mean, as if a dynamic language would model that more easily than a strongly typed language.
When you get a single binary for CLI it's hard to use anything else, the pain of using pip for anything python based.
Homebrew is written in Ruby; not only is it one of the most popular CLI tools, but it is also often cited as one of the best designed CLI user experiences.
> there is no popular CLI written in Ruby
Chef?
And Brew
Also Metasploit
Vagrant is written in Ruby but is currently getting ported to Golang.
Regarding being cloud provider agnostic: it’s not always for fault tolerance, there can be a couple different reasons.
1) it gives your company a stronger bargaining position with the cloud provider.
Granted, my companies tend to have extremely high spend- but being able to shave a dozen or so percent off your bill is enough to hire another 50 engineers in my org.
2) you may end up hitting some kind of unarguable problem.
These could be business driven (my CEO doesn’t like yours!), technical (GCP is not supported by $vendor) or political (you need to make a China version of your product, no GCP in China!)
Everything is trade offs. AWS never worked for us because the technical implementation of their hypervisor did not pin VMs to the machine's CPU cores, meaning you often compete with other VMs for memory bandwidth. But AWS works in China (kinda). So my solutions support both GCP and AWS as a slightly less supported backup.
I'd add another reason: Devs need to be able to run stuff locally sometimes.
It's neat having a serverless single page app that is hosted in s3, served through cloudfront, with lambdas that post messages to sqs queues that are read by god knows what else, but what happens when there's a bug? How do you test it? You can throw more cloud at it and give each dev a way to build their own copy of the stack, but that's even more work to manage. Maybe localstack behaves the same, but can you integrate it with your test framework?
I never took a hard "we must never use aws-only services" approach, but having the ability to run something locally was a huge plus. Postgres RDS? Totally fine, you don't need amazon to run postgres. Redshift? Worth the lock-in given the performance. Lambda? Eh, probably not, given that we already have a streamlined way to host a webapp.
Argh. I wish I could upvote this twice.
One of the main reasons we never used google spanner was that we can’t test it locally.
People don't often think of their local development environment as a "platform" that their stuff needs to work in, but it really is. In that sense, unless you're hosting off of your laptop (please don't!), every app is multi-platform.
Every startup I've worked at (and I've been at this for 15+ years) has moved hosting providers, but I still wouldn't put it high on the list of reasons to avoid vendor lock-in. If you make sure someone(s) know how the app actually runs, and you try to pick stuff you can run locally, the vendor lock-in stuff won't be your biggest challenge in the move.
I agree 100% that you need to structure the project so there's a way to develop locally on your dev machine -- without a network connection -- and run integration tests against local versions of services.
It looks like Google Spanner has plugged that workflow gap since you evaluated it for your project:
> The Cloud SDK provides a local, in-memory emulator, which you can use to develop and test your applications for free without creating a GCP Project or a billing account. As the emulator stores data only in memory, all state, including data, schema, and configs, is lost on restart. The emulator offers the same APIs as the Cloud Spanner production service and is intended for local development and testing, not for production deployments.
https://cloud.google.com/spanner/docs/emulator
although there are limitations:
https://cloud.google.com/spanner/docs/emulator#limitations_a...
Agree here.
At my $WORK we run all of our backend on AWS but anything I've touched has to also run locally.
We use serverless framework, and there are plugins for running the lambdas locally as well as for dynamodb, sqs, ses and eventbridge.
I think it's a case of choosing your dependencies carefully though. I would be wary of integrating against an AWS service which does not have an API compatible offline or provided-elsewhere alternative.
A case where we fail at this is Cognito. Even our 'offline'/local stack has to connect to our dev environment for that one.
>you may end up hitting some kind of unarguable problem.
Another example: there are multiple countries (for example, here in Russia) where personal data must be stored in data centers located within the country's borders, and not every country has an AWS datacenter on its soil
The guy also recommends using several proprietary AWS services in other points.
Then he goes on to advocate against designing for cloud flexibility.
This almost feels like AWS marketing.
Especially the alerts thing. I think every company I've ever worked for made the mistake of ignoring alert spam. If an alert doesn't require human action, then it should be a log or a metric. And by all means plot it on a graph (the metric that triggered the alarm, or in the case of a boolean test result, the frequency of the failure). Look at the graph during real incidents if you want. Talk about it at the monthly meeting. But don't generate an alert that people should ignore. You're playing Russian Roulette.
Don't migrate an application from the datacenter to the cloud
Reading the actual text of this one I get a different impression, but I'm still not sure I agree with this one. Applications can be radically different from each other in terms of how they are run.
At one company, we ran a simple application as SaaS for our customers or gave them packages to run on-prem. We'd stack something like seven SaaS customers on a single set of hardware (front-ends and DB servers). The cloud offering was a no-brainer, you can just migrate customers one by one to AWS or whatever, or spin up a new customer on AWS instead of in our colocation center.
Applications have a very wide range of operational complexity. Some applications are total beasts--you ask a new engineer to set up a test environment as part of on-boarding and it takes them a week. Some applications are very svelte, like a single JAR file + PostgreSQL database. The operational complexity (complexity of running the software) doesn't always correspond to the complexity of the code itself or its featureset.
> I've been involved now in three attempts to do large-scale migrations of applications written for a specific datacenter to the cloud and every time I have crashed upon the rocks of undocumented assumptions about the environment
I've only participated in a single on-prem to cloud migration. Some parts of the migration were easy, e.g. moving a postgres DB that was running on some on-prem linux server to run in AWS RDS. Some parts were rather unpleasant: where you discover that a bunch of the application code that runs in worker processes assumes it has access to a shared CIFS network share that can be used for communication through the filesystem, and absolute file paths to on-prem CIFS network share locations are stored in metadata throughout the database. So then your available moves for how to migrate the application code and migrate the CIFS network share and migrate the data in the database all become somewhat tangled together.
I helped migrate an app from on-prem to cloud. During the migration we found that the app needed a locally installed oracleDB. Well, it violates on-prem best practices and cloud best practices. I think migrating just exposes all the shortcuts baked into a "craplication."
You spring back to the present day, almost bolting out of your chair to object, "Don't do X!". Your colleagues are startled by your intense reaction, but they haven't seen the horrors you have.
They may be startled, but they almost certainly won't listen. The purgatory nature of IT work culture ensures this repetitive pattern.
That’s my experience too. The older I get the more frustrated I am that people don’t want to learn from my mistakes.
It’s not that I’m a jerk about it. Most of the time it’s business types saying “you don’t need to do all that stuff - just deploy it like that, it’ll be fine”.
And then it’s not fine - and also, it’s somehow my fault.
It seems to be more about personal aggrandisement (“I’m the boss and my word is law”) rather than trying to build a great business together. I’m pretty over it.
It is easier to right than to read.
It is easier to talk than to listen.
It is easier to build new than adapt.
All three of those statements are false. But all three are seductive.
> It is easier to right than to read.
I just wanted to point out that I found that particular error really funny given the context.
Leaving it!
Bummer about Python :/ it’s my go-to for CLI tools, but I’ve seen that problem too.. pipenv helps, but I wonder if there’s a better way to package them so they’re more future proof.. or do I really need to learn go?
I've worked with Python for over a decade and have worked with Go for a few years. The Go tooling and workflow for building and deploying applications is much more pleasant than the Python ecosystem. Once you know which target operating systems and CPU architectures you wish to deploy to, the toolchain makes it very easy to cross compile to produce n deployable binaries, then the static linking means you generally only need to copy 1 binary file to each target, plus your own application configuration. If you have a Python background and have done some programming with a static type system before, then Go is very quick to learn. The Go language itself is a bit annoyingly inexpressive for bashing out algorithms, but that probably doesn't matter if you're writing CLI tools.
For packaging python stuff, it kind of depends who or what the end user of the CLI tool is. I used to work in a small business that shipped windows desktop software to customers, a lot of the software was written in python. From memory we packaged it with py2exe -- so you end up building a zip file containing a self-contained executable, the version of the python interpreter your tool needs, along with all the python packages as well as native libraries. That worked quite reliably, but it'd seem rather distasteful to use that approach for sharing CLI tools with your colleagues or CI machines in a dev team!
edit: there are pitfalls to deploying go application binaries if you try to put them in scratch containers and use libraries that assume they can find data such as timezone data provided in the usual place in the filesystem by the distribution (there will be no such data files in a scratch container unless you explicitly copy them in), or build a go linux binary with dynamic linking assuming glibc, then try to run it in an alpine based environment with musl. So it's not magic. But it's mostly pretty nice.
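Cross-compiling is a matter of setting two environment variables per target (the binary and package paths below are placeholders); building with CGO_ENABLED=0 also sidesteps the glibc/musl dynamic-linking pitfall mentioned above:

```
# One statically linked binary per OS/architecture pair.
CGO_ENABLED=0 GOOS=linux   GOARCH=amd64 go build -o dist/mytool-linux-amd64 ./cmd/mytool
CGO_ENABLED=0 GOOS=darwin  GOARCH=arm64 go build -o dist/mytool-darwin-arm64 ./cmd/mytool
CGO_ENABLED=0 GOOS=windows GOARCH=amd64 go build -o dist/mytool-windows-amd64.exe ./cmd/mytool
```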
> Once you know which target operating systems and CPU architectures you wish to deploy to
If you know that, this is not the problem for you.
Often you do not know that. Then this is a huge problem.
If the standard tech stack at your organisation includes Python, there's no reason why you shouldn't write CLI tools in Python. Packaging and distribution is only a problem for organisations that do not usually deal with Python.
IMO we lose a lot with go: having to compile, losing the interactive shell, etc. Best case you work with a lot of people who know how to install python and use pip. Many people whine on boards, but it's not that complicated, especially with python 3.
> we lose a lot with go: having to compile
If you're concerned about compile time -- Go does a pretty reasonable job of caching (including caching unit test results). If you're working on a Python project with a large number of unit tests, because Python is so slow to execute anything, and the go build and test tools are quite fast, it wouldn't be that surprising if it was actually faster to compile and run the test suite in Go vs running the test suite in Python -- particularly for incremental work where you make a change in one library and rerun the impacted tests.
If you're concerned more about the workflow of needing an additional compile step, go has `go run script.go` which lets you use go like a scripting language, assuming you're in an environment with the go toolchain installed.
> install python and use pip
Now you've given your users 2 issues completely unrelated to the problem they're trying to solve. If you can't give your users a single simple command, or a single file to download your tool is too complicated to install.
Not to mention all kinds of python version issues, especially given that botched 2-to-3 migration means there's a lot more Python 2 than there should be.
Yes. Except. If I have a Python2 tool I paid $X for, and now I need to change it for $Y because, because, because no reason. Tough luck! Stop "whining" and write a cheque!
Where does that happen? Honestly I think I only know of open source python stuff, so I'm real curious!
What is open sauce?
My point being that to the users of the products we create, software licensing is not what they are thinking about.
I'm asking you about this
>Yes. Except. If I have a Python2 tool I paid $X for, and now I need to change it for $Y because, because, because no reason. Tough luck! Stop "whining" and write a cheque!
I used to abuse Python's REPL and realized I was wasting a lot of time in the REPL that could be saved by articulating my tests in a proper test file or even just main().
pkg.go.dev and gopls also automate away most of the "exploration" I would be doing in Python REPL.
you need to take advice with a grain of salt. Python is fine for CLI tools, just like Go is fine for them. If you know Python, that weird statement in the article was not for you. I honestly don't get what the problem is. You know Python? You use it for CLI? Power to ya. You know Go? Use it for CLI? Power to ya too.
I am working with python full-time. I can package a cli so it works on other systems, but I always envied go - just compiles to one file with no dependencies. This makes it a much better choice for cli apps or portable apps.
Hey you can always use docker for shipping /s
Go is a different animal, and yes it has superb packaging. Shipping a binary is a beautiful thing. I remember windows has some packager for Python that creates exe files (PyInstaller?). Maybe it's time for a universal binary executable for python....
The post has really great advice but...
Don't write internal cli tools in python
Disagree completely with this. This has been probably the biggest overall boost for both engineers and operators at a few companies I worked at.
You deliver fast, it's easy to debug, and requires no compilation -- which is usually a bigger hassle than any Python-specific problem. It gets really important if you have operators on Linux/Windows/Mac.
Have you tried go lately? It needs to "compile", but the compiler is so fast that the combined compile/run is essentially the same experience as running a script. goreleaser can also make cross compilation extremely easy.
Don't Design for Multiple Cloud Providers
Designing for portability is important. Otherwise you expose yourself to dreadful uncertainties.
"AWS will not disappear". That is probably true. The average business can take this risk (and if you are huge you are not listening to me). But AWS might raise its prices to a point they are getting all your profit. DO you trust Amazon? Really? The particular AWS feature you depended on with the tight coupling "Don't Design for Multiple Cloud Providers" implies may get deprecated. What then?
This is as old as the hills: Design in layers. Have an AWS layer. If AWS goes away, quadruples their fees, deprecates your services, or you are hit with USA sanctions then there is a layer that has to be rewritten.
Old wisdom. Use it.
Parler was on AWS and they got booted off. Because of that, they collapsed. If they had two cloud deployments they could have survived. AWS will never go away, but they can make your small (or medium) size business go away pretty quick.
And just a side note, Parler was toxic and I shed no tears about their demise...
"meaning developers were constantly reading about some great new feature of AWS they weren't able to use or try out"
That is a feature! Bleeding edge bleeds.
Not sure if I agree about the Python jab. I've seen "pip install ...." run flawlessly more times than I've had breakfast cereal, and I eat a lot of cereal.
I kinda agree on his first point about migrating stuff to the cloud, but if you've done your deployments on like-for-like platforms (on-prem containers to cloud containers) it's not that bad.
I use python professionally and can't make sense of this comment. I'm doing my thing, running my ls, cd, greps and cats, and now I need to run my python cli.... And so I do pip install in which env exactly? The system one? Or do I have to create a venv just to install my python cli, and have to enable it every time I wanna run the cli? It's easy to say "just pip install", but I can't see how the details make sense in the context of a cli. Explain please?
You can use pipx, which will create a new venv for each tool it installs. You call the program normally afterwards, and it will use the correct interpreter.
There's not much wrong with just installing the tool to the global (user) interpreter, though.
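Either way it's a single command per tool (the package name is a placeholder):

```
# Isolated venv per tool, with the command exposed on PATH:
pipx install mytool

# Or the simpler route: install into the user site-packages of the default interpreter.
python3 -m pip install --user mytool
```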
to the rescue!
Well, maybe it's the latest in a long line of options...
I started using asdf instead of homebrew for installing and managing Python and I haven't looked back. Asdf is infinitely better than homebrew python.
Let's take the AWS CLI as an example. Run "pip3 install awscli --upgrade --user" and you're done. Drop the "--upgrade --user" and you've installed it globally. Easy Peezy.
Sure, you can do a lot with venv, Anaconda and whatnot, but if you have a well written package, it can be very portable without the need of environments.
What if you have two well written packages you want to use, with conflicting dependency versions?
What if you have an old version of glib-c on your machine?
There’s always going to be odd edge cases where things break, no matter what language you build your tooling, but I’ve installed thousands of Python packages on CentOS7, Ubuntu, OSX, Debian via pip and haven’t had issues. Ergo, I disagree with the writer.
This article is basically saying “things are complex, complexity is hard, so don’t do complex things”. On that point I agree. If you don’t have the time to invest in the tech, don’t build unsupported systems. But don’t throw Python under the bus because you don’t have time to build a proper package.
No, I don't wanna install random shit on the systems env
Don't run your own Kubernetes cluster
If we ran our cluster in the cloud we'd be on the hook for hundreds of thousands of dollars of additional costs due to the high throughput of our service. There are always exceptions to any list of rules.
I think Python still has a place for CLI tools, both internal and external.
If you can get away with a zero dependency Python script then there's no struggle. You can download the single Python file and run it, that's it. It works without any ceremony and just about every major system has Python 3.x installed by default. I'd say it's even easier than a compiled Go binary because you don't need to worry about building it for a specific OS or CPU architecture and then instructing users on which one to download.
Argparse (part of the Python standard library) is also quite good for making quick work out of setting up CLI commands, flags, validation, etc..
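A minimal sketch of that kind of zero-dependency, single-file tool (the names are made up):

```
#!/usr/bin/env python3
"""Tiny CLI with no third-party dependencies: download the file and run it."""
import argparse


def main():
    parser = argparse.ArgumentParser(description="Repeat a message")
    parser.add_argument("message", help="text to print")
    parser.add_argument("-n", "--count", type=int, default=1, help="how many times")
    args = parser.parse_args()
    for _ in range(args.count):
        print(args.message)


if __name__ == "__main__":
    main()
```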
There's a number of tasks where using Python instead of Bash is easier. I tend to switch between both based on what I'm doing.
Python2 or Python3?
If you say "Python3 of course, is this 2002 or something?" what do I do with all my Python2 scripts?
Python 3. The 2 vs 3 era ended quite some time ago. Most major operating systems that aren't end of life have Python 3.6+ installed by default, giving you access to nice things like f-strings.
A lot of Python 2.x scripts will work with Python 3. If they don't then it's on you to fix them, since Python 2.x was officially labeled end of life almost 2 years ago from today. On the bright side, your Python 2.7 script had a good run. 2.7 was released back in 2010, so having ~10 years of not having to worry about anything is pretty good! Chances are we'll get the same experience or longer with Python 3.6+; we're already at the 5 year mark for Python 3.6.
Nobody wants to mention "don't roll your own security"? That's a 101 kind of question - very easy to feel clever when you try it as an amateur, nightmarish when (really not if, when) you get it wrong.
That is one area where I think you want to outsource that to specialists.
I absolutely agree with this, but I think it's good for hackers to have a play with it just to go down the rabbit hole a bit.
Build some toy security… but don’t deploy it.
I have lost work by recommending hiring experts to review sensitive code I would be tasked with writing.
Glad of that.
0) Don't write software - outsource it to someone smarter.
-1) Don't outsource writing software. Let someone smarter outsource it for you.
It’s the business equivalent of an AbstractButtonFactoryFactoryFactory.
They don't even need to be smarter. They just need to value their time less than you do.
Nobody knows how to correctly install and package Python apps
Anybody tried PyInstaller? It packages the whole Python project, dependencies included, into a single executable binary.
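For anyone who wants to try it, the single-file mode is a couple of commands (the script name is a placeholder), with the caveat that the output is platform-specific, so you build once per target OS:

```
pip install pyinstaller
pyinstaller --onefile mytool.py
# Output lands in dist/mytool (dist\mytool.exe on Windows).
```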
Soon I was auditing new services for "multi-cloud compatibility", ensuring that instead of using the premade SDKs from AWS, we maintained our own.
I wonder if a useful middle-ground is to have lint checker rules to enforce using a blessed subset of a cloud provider's SDK. So that some thought/effort must be put into using a new feature?
Don't write internal cli tools in python
What if your team is Python-based? Why would I write a CLI tool to be used by other Python programmers in Go or Rust, when some of them know neither?
It doesn't matter that you know Go and can generate all possible binaries; eventually, someone else will have to make a change in your tool. It will already be difficult for them to understand a new codebase, so you don't need to make it harder by also exposing them to another language.
I work with Azure and my issue is always problems with the Azure Key Vault. In a personal deployment, it's always passwords; I always forget to save them because I'm juggling tons of things at once lol.
Don't migrate an application from the datacenter to the cloud
Eh, the salesman told me it would be seamless while we were watching the football game from his company’s box. And they are the experts: it’s their cloud!
I’m gonna tell the team to do it this way when I get back to the office. I think they just like running hardware and aren’t thinking of our balance sheet.
If you are in AWS, don't pretend that there is a real need for your applications to be deployable to multiple clouds.
Isn't the reason people do this to make sure they have leverage in case AWS increases prices in the future? I can see how cloud providers have probably made it extremely difficult to design for multiple clouds and so this effort might not be worth it but the reason at least seems justifiable.
That's a seriously good and honest list.
Thank you.
If you are in AWS, don't pretend that there is a real need for your applications to be deployable to multiple clouds. If AWS disappeared tomorrow, yes you would need to migrate your applications. But the probability of AWS outliving your company is high
Well, it's not about AWS shutting down at all! It's about them having complete control over your infrastructure, so they dictate the terms. This has many consequences: (1) they can raise prices and you can do absolutely nothing about it, (2) since you chose AWS with its dynamic pricing instead of flat-rate dedicated servers, each expansion (traffic, new services) is a cost for you. This means at some point you will realize you would save sick amounts of money if you switched to bare metal (as several notable companies have done). Except that at this point it's really difficult because you have to basically start from zero, so inertia pulls you into continuing this vicious cycle.
So this is just a straw-man argument. Really, I haven't heard anyone saying "but Amazon can go out of business", it's just ridiculous.
Hopefully you will have enough money to pay AWS once they raise their prices.
I am not sure what will happen if you don't.