💾 Archived View for carcosa.net › journal › 20221020-static-typing.gmi captured on 2023-11-14 at 08:00:48. Gemini links have been rewritten to link to archived content
rlamacraft writes:
Dynamically Typed Code Is Just Not Good Engineering
I'm going to disagree with this, and I'm going to do it in two ways: first using empirical evidence, and second from my own experience.
Dan Luu did a literature review of the published peer-reviewed studies on the benefits of static vs. dynamic typing.
Literature review on the benefits of static types (WWW)
The upshot is that most of the studies have limitations that restrict their general applicability, but if you wanted to take home a message from them in aggregate, it's that if static typing provides stability/reliability/maintainability benefits to programs, the effect is very, very small. Likewise, if dynamic typing provides a benefit to developer productivity, it is also very, very small.
There are a couple of studies that both come to about the same estimate of what percentage of errors in dynamically-typed languages are from type errors — about two (2) percent. It ought to follow that this is about the reliability benefit that you should expect to see from using static typing.
I've been programming for about 35 years now, about 20 years professionally. I've used languages with barely any notion of a type system (Commodore BASIC), languages that are statically but weakly typed (C), languages that are dynamically and weakly typed (Perl, JavaScript), languages that are dynamically and strongly typed (Python, Emacs Lisp, Common Lisp), and languages that are statically and strongly typed (Java, C#, Rust). By this point, more than half of my professional work is in statically typed languages.
The main thing that I draw from this experience is to confirm what the studies I highlight above show: type errors in dynamically-typed languages aren't a very big proportion of errors. In a medium-sized Python application like brutaldon, they just didn't come up much. Brutaldon didn't even have unit tests, since it was a hobby project written at least partially as an art/protest piece.
The reality is even worse, though. Simply using static typing *also* doesn't protect you from these 2% type errors. When I did have an error in a Python program that could reasonably be called a type error, it was almost always an AttributeError raised by trying to access an attribute on None. Almost all of my dayjob work for the last 15 years or so has been C#. Can you guess what the most common error I encounter in C# code (both my own and others') is? That's right, a NullReferenceException, caused by trying to access a property on null. Any method in C# that returns a reference object (that is, basically anything that's not a primitive type or a struct) can return null, without violating any type constraints.
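To make the Python half of that concrete, here is a minimal sketch of the failure. The names User, find_user, and describe_missing_user_failure are made up for illustration and are not from brutaldon or any other real code.

```python
from typing import Optional


class User:
    def __init__(self, name: str) -> None:
        self.name = name


def find_user(users: dict[str, User], key: str) -> Optional[User]:
    # dict.get returns None when the key is missing.
    return users.get(key)


def describe_missing_user_failure() -> str:
    users = {"alice": User("Alice")}
    try:
        # find_user returns None for "bob", so reading .name fails at run time.
        return find_user(users, "bob").name
    except AttributeError as exc:
        # Report the exception class and message instead of crashing.
        return f"{type(exc).__name__}: {exc}"
```

Calling describe_missing_user_failure() returns a string that starts with "AttributeError", which is the Python counterpart of the NullReferenceException described above.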
Now, the type system in C# is considered "not very expressive", and by 2022 we recognize that the null reference is a "billion dollar mistake". In my own C# code, I enforce the rule that null never crosses method boundaries, and use option and result types (not part of the language, but straightforwardly implemented in a library) as an alternative to returning null, to ensure that missing return values are handled correctly. And certainly there are modern statically typed languages that don't implement the billion dollar mistake, but these are not the ones widely used by industry, at least not yet. And I'd note that the same patterns I'm using to avoid null references in C# could also be used in Python.
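As a rough illustration of what the option-type pattern could look like in Python, here is a small sketch. Some, Nothing, Option, find_email, and unwrap_or are hypothetical names invented for this example. They are not the library I use in C#, and real option libraries offer much more than this.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Some(Generic[T]):
    """A value that is present."""
    value: T


@dataclass(frozen=True)
class Nothing:
    """An explicitly absent value, used instead of None."""


# An Option is either Some(value) or Nothing(); None never crosses the boundary.
Option = Union[Some[T], Nothing]


def find_email(directory: dict[str, str], name: str) -> Option[str]:
    # Wrap the lookup so the caller has to deal with the missing case.
    if name in directory:
        return Some(directory[name])
    return Nothing()


def unwrap_or(option: Option[T], default: T) -> T:
    # The only way to get at the value is to say what happens when it is absent.
    if isinstance(option, Some):
        return option.value
    return default
```

With this, unwrap_or(find_email({"ann": "ann@example.org"}, "ann"), "unknown") gives back the address, and the same call with a missing name gives back "unknown". Nothing in Python forces callers to use unwrap_or, so this is a convention backed by an optional type checker, not a guarantee.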
There are also, of course, statically typed languages with more expressive type systems that let you catch more kinds of errors, and more meaningful kinds of errors, than C# does. Haskell, or languages with dependent types, would be the examples here. I don't really have experience with these, but my impression is that to really get the benefit, you must ensure that a lot of your domain logic is encapsulated in your types. That is, it's not enough to know that your variable is a float and not a character, and it's often not even enough to know that it's a Length and not a TimeSpan or a Velocity; you need to know that it's a LengthInMeters that doesn't interoperate with a LengthInFeet, but does correctly return a VelocityInMpS when divided by a TimeSpanInSeconds.
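The unit example can be approximated even in Python, which gives a feel for what "domain logic in the types" means. This is only a sketch under my own assumptions: the class names come from the paragraph above, the field names and operator choices are invented, and without an external type checker the mismatch is caught at run time, not at compile time as it would be in a language like Haskell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeSpanInSeconds:
    seconds: float


@dataclass(frozen=True)
class VelocityInMpS:
    meters_per_second: float


@dataclass(frozen=True)
class LengthInFeet:
    feet: float


@dataclass(frozen=True)
class LengthInMeters:
    meters: float

    def __add__(self, other: "LengthInMeters") -> "LengthInMeters":
        # Only meters add to meters; anything else is rejected.
        if not isinstance(other, LengthInMeters):
            return NotImplemented
        return LengthInMeters(self.meters + other.meters)

    def __truediv__(self, other: TimeSpanInSeconds) -> VelocityInMpS:
        # Length divided by time is a velocity, and the result type says so.
        if not isinstance(other, TimeSpanInSeconds):
            return NotImplemented
        return VelocityInMpS(self.meters / other.seconds)


# Meters divided by seconds yields a velocity in meters per second.
speed = LengthInMeters(100.0) / TimeSpanInSeconds(10.0)

# Mixing units raises TypeError instead of silently adding the numbers.
try:
    LengthInMeters(1.0) + LengthInFeet(3.0)
    mixed_units_rejected = False
except TypeError:
    mixed_units_rejected = True
```

Here speed is VelocityInMpS(10.0) and mixed_units_rejected is True. Even this toy version shows the cost: every unit and every allowed operation has to be written down before it can be checked.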
My guess is that these languages *do* actually provide better reliability than dynamically-typed or Java-like statically-typed languages, but they constitute a *very* different programming paradigm, and even if they don't reduce developer productivity in the long term, they certainly do in the short term. They are also not what most people are talking about when they argue about dynamic vs. static typing.
Use what you like, what's well-supported for your application domain, or what your employer requires you to use. Learn the weak points in whatever you use, and the patterns needed to avoid them. Don't expect a silver bullet.