💾 Archived View for yujiri.xyz › software › rust.gmi captured on 2023-01-29 at 03:12:03. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Rust is one of my favorite languages. It offers a lot of high-level conveniences while still qualifying - at least mostly - as a low level language.
The big unique thing about Rust is ownership. Every value is owned by the scope it's declared in, and only one scope is allowed to own it at a time. You have to *borrow* a value to pass it to another scope (like a function) without that scope taking ownership away from the caller. Values can be borrowed immutably or mutably, and mutable borrows are exclusive, which generally makes data races impossible.
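Here's a minimal sketch of the three ways to hand a value to a function: moving it, borrowing it immutably, and borrowing it mutably. The function names are made up for illustration.

```rust
// Takes ownership: the caller can no longer use the vector afterwards.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

// Immutable borrow: reads the data without taking it from the caller.
fn sum(v: &[i32]) -> i32 {
    v.iter().sum()
}

// Mutable borrow: exclusive access for the duration of the call.
fn push_twice(v: &mut Vec<i32>, x: i32) {
    v.push(x);
    v.push(x);
}

fn main() {
    let mut nums = vec![1, 2, 3];
    println!("sum = {}", sum(&nums)); // borrowed, so `nums` is still usable
    push_twice(&mut nums, 4);
    println!("sum = {}", sum(&nums));
    let n = consume(nums); // moved into `consume`
    println!("len = {}", n);
    // println!("{:?}", nums); // would not compile: `nums` was moved
}
```

Uncommenting the last line is a quick way to see what the borrow checker's "use of moved value" error looks like.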
I hear a lot of people saying the borrow checker is draconian and hard to satisfy, but I find its rules pretty intuitive and its error messages helpful. For almost every use case where you need to do something the borrow checker doesn't like, the standard library has a straightforward solution in the form of a generic "smart pointer" type.
Ownership also enables Rust to have *neither garbage collection nor manual memory management* - both things I hate. Instead, Rust knows at compile time exactly when everything should be freed and inserts the free() calls for you. You get the safety and ergonomics of automatic memory management, with the performance of manual memory management.
The type system is very sophisticated. Being inspired by ML, it has interface (trait)-based polymorphism, generics, sum types, and tuples, so it's very rare that static typing ever "gets in the way". It also has type inference for local variables (but not for statics and function signatures), and even more advanced features like parameterized traits are used for fancy stuff like generic `.into()` methods.
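As a small sketch of the `.into()` point: `From` is a trait parameterized by the source type, and implementing it gets you a matching `.into()` through a blanket implementation in the standard library. The `Celsius` and `Fahrenheit` types are hypothetical, just for illustration.

```rust
// Hypothetical newtypes, made up for this example.
struct Celsius(f64);
struct Fahrenheit(f64);

// `From<T>` is a parameterized trait: one impl per source type.
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

fn main() {
    // No `Into` impl was written by hand; std derives it from `From`.
    // The annotation tells the compiler which target type `.into()` should produce.
    let f: Fahrenheit = Celsius(100.0).into();
    println!("{}", f.0);
}
```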
I feel the type system is actually *too* sophisticated, as there's so much to it that I still struggle to understand parts of it after using the language for a long time, like turbofish, and the error messages I sometimes see with libraries like iced and serde are so arcane as to be meaningless to me. It's also common to see error messages saying something confusing like "that method exists for this type, but its trait bounds were not satisfied" when the truth seems to be that the method doesn't exist for the type and the error message is referring to some generic implementation that doesn't apply to that type.
Searching docs for what I want is sometimes hard because there are like literally 100 methods or 100 trait implementations to scroll through. Why? It's not because they're full of unnecessary, obscure things - at least not mostly. Look at the list of trait implementations for Vec (the basic growable array type), for example:
https://doc.rust-lang.org/stable/std/vec/struct.Vec.html
Most of these things are here for reasons. Why does Rust need all this while other languages don't? I think because Rust tries to be (and is) so fancy and powerful and flexible. Things like the AsRef trait (which took me like a year of using Rust before I finally understood its purpose, by the way) are necessary in Rust precisely because the compiler takes responsibility for managing ownership. And of course the mutable, immutable, and owned versions of everything...
Rust has basically no "magic". Syntactic constructs like indexing, iteration, and comparison use *traits* that you can implement on custom types. Even the implementation of everyday types like `Vec`, `Option` and `Result` are mostly library code (though the compiler does know about them to some extent to facilitate optimizations and better error messages).
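To illustrate, here's a sketch of a custom type that plugs into indexing, `for` loops, and comparison operators purely by implementing (or deriving) traits. `Grid` and `Version` are hypothetical types invented for the example.

```rust
use std::ops::Index;

// Comparison operators come from PartialEq/PartialOrd. The derived
// ordering compares fields in declaration order.
#[derive(Debug, PartialEq, PartialOrd)]
struct Version {
    major: u32,
    minor: u32,
}

struct Grid {
    width: usize,
    cells: Vec<u8>,
}

// `grid[(row, col)]` is sugar for a call to `Index::index`.
impl Index<(usize, usize)> for Grid {
    type Output = u8;
    fn index(&self, (row, col): (usize, usize)) -> &u8 {
        &self.cells[row * self.width + col]
    }
}

// `for cell in &grid` is sugar for `IntoIterator::into_iter`.
impl<'a> IntoIterator for &'a Grid {
    type Item = &'a u8;
    type IntoIter = std::slice::Iter<'a, u8>;
    fn into_iter(self) -> Self::IntoIter {
        self.cells.iter()
    }
}

fn main() {
    let grid = Grid { width: 2, cells: vec![1, 2, 3, 4] };
    println!("row 1, col 0: {}", grid[(1, 0)]);
    for cell in &grid {
        println!("{}", cell);
    }
    let old = Version { major: 1, minor: 2 };
    let new = Version { major: 1, minor: 10 };
    println!("{:?} < {:?}: {}", old, new, old < new);
}
```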
Error-handling was the #1 thing I hoped Rust would improve over Go. It does, but not as much as I hoped. One improvement is the `?` operator. Instead of:
    let val = match could_fail() {
        Ok(val) => val,
        Err(err) => return Err(err),
    };
We can do:
let val = could_fail()?;
Excellent. If `could_fail()` returns an `Err`, we'll just propagate it upward, and otherwise, `val` becomes the unwrapped success value. Barely any more verbose than an exception language, and more explicit.
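For a version you can actually compile, here's a sketch with integer parsing standing in for `could_fail()`. The two functions are made-up names and should behave identically. (Strictly speaking, `?` also passes the error through `From::from`, which is a no-op here because the error types already match.)

```rust
use std::num::ParseIntError;

// Hand-written propagation with `match`.
fn double_verbose(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.trim().parse::<i32>() {
        Ok(n) => n,
        Err(err) => return Err(err),
    };
    Ok(n * 2)
}

// The same thing with `?`.
fn double(s: &str) -> Result<i32, ParseIntError> {
    let n = s.trim().parse::<i32>()?;
    Ok(n * 2)
}

fn main() {
    println!("{:?}", double("21"));
    println!("{:?}", double_verbose("21"));
    println!("{:?}", double("not a number"));
}
```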
The big downside we still have is that there is no context by default. Propagating an error with `?` *only* propagates the original error, with no context added, so you might just get "error: file not found" out of a complex procedure that opens multiple files.
r/rust users informed me that the situation is similar to Go in that you're just expected to use third-party crates to get sane error handling. There isn't even just one that's dominant; apparently the verdict is that anyhow is appropriate for applications and thiserror is appropriate for libraries.
https://www.reddit.com/r/rust/comments/i1lyy5/how_does_new_error_handling_work/
anyhow works like this:
    use anyhow::{Context, Result};

    fn main() -> Result<()> {
        let args: Vec<String> = std::env::args().collect();
        do_stuff(&args[1], &args[2], &args[3])
    }

    fn do_stuff(file1: &str, file2: &str, file3: &str) -> Result<()> {
        let text = std::fs::read_to_string(file1).context("when reading")?;
        std::fs::write(file2, text).context("when writing")?;
        std::fs::remove_file(file3).context("when removing")?;
        Ok(())
    }
And if I pass a filename that doesn't exist, I'll get output like:
    Error: when reading

    Caused by:
        No such file or directory (os error 2)
This is almost exactly how Go's github.com/pkg/errors works.
Another problem with Rust error handling, this one actually a regression from Go, is that working with different possible types of errors is hard. Each function that can return an error is expected to have exactly one type it can return. For example OS stuff usually returns `io::Result<T>`, an alias for `Result<T, io::Error>`, where `io::Error` represents OS errors.
So what do you do if you have a function that might return an `io::Error` or might return a different kind of error? You have to convert every error that might get propagated into some common type. This is insanely cumbersome without the anyhow library, and even with, downcasting the error to check if it's a given specific type is really cumbersome. You can't do a simple comparison against a given error or use `errors.Is` like you can in Go.
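To show how much ceremony the "common type" approach takes with only the standard library, here's a rough sketch. `ConfigError`, `read_number` and the file name are all hypothetical; the point is the amount of boilerplate needed before `?` can convert two error types into one.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::num::ParseIntError;

// The common type that every propagated error gets converted into.
#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            ConfigError::Io(e) => write!(f, "I/O error: {}", e),
            ConfigError::Parse(e) => write!(f, "parse error: {}", e),
        }
    }
}

impl std::error::Error for ConfigError {}

// One `From` impl per source error type, so that `?` can convert.
impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        ConfigError::Io(e)
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

fn read_number(path: &str) -> Result<i64, ConfigError> {
    let text = fs::read_to_string(path)?; // io::Error -> ConfigError
    let n = text.trim().parse::<i64>()?; // ParseIntError -> ConfigError
    Ok(n)
}

fn main() {
    // Checking for one specific failure means matching through the wrapper.
    match read_number("hypothetical-missing-file.txt") {
        Ok(n) => println!("got {}", n),
        Err(ConfigError::Io(ref e)) if e.kind() == io::ErrorKind::NotFound => {
            println!("file is missing")
        }
        Err(e) => println!("other failure: {}", e),
    }
}
```

Crates like thiserror exist largely to generate the `Display` and `From` parts of this for you.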
Rust's syntax is pretty verbose. Not just that it's a brace and semicolon language, but types take extra characters: function parameters need `param: Type`, whereas in most other static languages it's just `param Type` or `Type param`, and parameterized types need a pair of angle brackets: `Vec<u8>` is a vector of bytes, instead of `Vec u8` like it would be with Haskell syntax.
The syntax for namespacing is `::` instead of `.` (except structs which still use `.`). A downside of this is that its precedence is misleading when combined with `.`: `users::table.load::<User>` looks like `users :: table.load :: <User>`, because the `::` is more visual separation so intuitively it should bind less tightly, but it's actually `users::table . load::<User>`.
Rust is littered with "glue" calls like `.unwrap()` when locking a lock, `.to_string()` on every string literal that you want to be a `String` instead of a `&str`, `.iter()` whenever you want to use an `Iterator` method on a sequence, etc. There is no concise string concatenation (`format!("{}{}", a, b)` is the best we got), and you need `impl StructName {...}` around method definitions, and `impl TraitName for StructName {...}` around trait implementations.
When pattern matching a sum type, variants are not in scope by default and need to be qualified with the sum type name:
    enum Either {
        A,
        B,
    }

    // e is of type Either
    match e {
        Either::A => ...,
        Either::B => ...,
    }
(You can put them in scope with `use Either::*`, and this is done in the prelude for the builtin `Option` and `Result`. But that puts them in scope *everywhere*. What I want here is something like Zig, where you can basically just write the variant names while matching on them because the compiler knows what type you're matching on.)
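For reference, a sketch of both forms side by side. One detail worth noting: a `use` declaration can also sit inside a function body, in which case the variants are only in scope for that function rather than the whole module. It's still an extra line per function, so it doesn't get you what Zig does. The function names are made up.

```rust
#[derive(Debug, Clone, Copy)]
enum Either {
    A,
    B,
}

// Variants qualified with the enum name.
fn describe_qualified(e: Either) -> &'static str {
    match e {
        Either::A => "a",
        Either::B => "b",
    }
}

// Variants brought into scope, but only inside this function.
fn describe_glob(e: Either) -> &'static str {
    use Either::*;
    match e {
        A => "a",
        B => "b",
    }
}

fn main() {
    println!("{}", describe_qualified(Either::A));
    println!("{}", describe_glob(Either::B));
}
```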
Rust has variable declarations, but you can shadow by redeclaring the same name, solving the lexical coupling issues other explicit declaration languages have:
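A minimal sketch of shadowing (the function name and the 8080 fallback are made up for illustration):

```rust
fn parse_port(port: &str) -> u16 {
    // `port` starts out as the raw text...
    let port = port.trim();
    // ...and the same name is redeclared with a different type,
    // so there's no need for `port_str` / `port_num` pairs.
    let port: u16 = port.parse().unwrap_or(8080);
    port
}

fn main() {
    println!("{}", parse_port(" 443 "));
    println!("{}", parse_port("nope"));
}
```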
All bindings are immutable by default, which is important for the borrowing rules to be practical, and contributes to the language's goal of maximum safety.
Rust has OS-level threads as well as async/await (which I haven't used). The designers said somewhere that OS-level threads made more sense than green threads because Rust is supposed to be a systems language, and I'm happy with that.
Its communication system is similar to Go's, but ownership solves data races *and mutex hell*. Mutexed data can't be accessed without locking because of the type system, and when the unwrapped data goes out of scope, it's automatically unlocked. You can still deadlock of course, but this makes mutexes much easier to work with.
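Here's a sketch of what that looks like with `std::sync::Mutex`. The function name is hypothetical. The integer lives *inside* the mutex, so the only way to reach it is through the guard returned by `lock()`, and the lock is released when that guard goes out of scope.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    // Arc lets several threads share ownership of the mutex.
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // `lock()` returns a guard; the data is reached through it.
                let mut guard = counter.lock().unwrap();
                *guard += 1;
            } // the guard is dropped at the end of each iteration, unlocking
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```

There is no `unlock()` call anywhere, and no way to write `counter += 1` without locking first.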
Rust actually leverages ownership to solve this too. Files are *automatically* closed when they go out of scope; it's handled by the Drop trait. I think it's the most elegant solution I've ever seen.
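A sketch of the same mechanism on a made-up type: `Connection` is hypothetical, and the shared log exists only so the order of events can be observed. Anything that implements `Drop` gets its cleanup run when it goes out of scope, which is how `std::fs::File` closes itself.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical resource that records when it gets cleaned up.
struct Connection {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Connection {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("closed {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Connection { name: "a", log: Rc::clone(&log) };
        let _b = Connection { name: "b", log: Rc::clone(&log) };
        log.borrow_mut().push("working".to_string());
    } // `_b` and then `_a` are dropped here, in reverse declaration order
    let events: Vec<String> = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```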
I've found the package/namespacing system very confusing. The keyword `mod` is used both to declare a module and to indicate that the definition is in another file. It took me a year to get to the point where I *think* I understand the 4 different import path prefixes (`crate::`, `super::`, `::`, and no prefix). There's the `extern crate` syntax which they say should only be necessary for "sysroot" crates, but I've found it seemingly necessary to work with Diesel. There is an explanation of some of the aspects that confuse me, but I still feel confused after reading it:
Path and module system changes
I actually love the way `use` works, though. You don't strictly have to have a `use` declaration at the top to be able to use an external crate because dependencies are all declared in Cargo.toml; all `use` does is unwrap namespaces. For example, `use std::env;` lets you use `env` directly in that scope, but without it you can still reference `env` as `std::env`, which can be more convenient for single uses, and I'm very attracted to the idea of not having to edit something at the top of the file when I realize I need to use a stdlib module on line 500.
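A small sketch of the two styles next to each other, with made-up function names: one module item imported with `use`, and one referenced by its full path at the single place it's needed.

```rust
use std::collections::HashMap;

fn word_counts(text: &str) -> HashMap<&str, usize> {
    // `HashMap` is usable unqualified because of the `use` above.
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn unique_words(text: &str) -> usize {
    // No `use` for HashSet: the full path works fine for a one-off.
    let set: std::collections::HashSet<&str> = text.split_whitespace().collect();
    set.len()
}

fn main() {
    println!("{:?}", word_counts("a b a"));
    println!("{}", unique_words("a b a"));
}
```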
In general, I despise Rust's zealous enforced privacy. Everything including struct fields is private by default, meaning you're basically out of luck if a dependency author didn't anticipate your use case.
Another onerous restriction is that a trait can only be implemented in either the crate that defines the type or the crate that defines the trait. You can't implement external traits on external types. And... serialization is an external trait! That means you just can't serialize types from a library that weren't defined to be serializable.
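A sketch of the rule using only std, where both the trait (`Display`) and the type (`Vec<u8>`) are foreign to your crate. The direct impl is rejected. The usual way around it, which is not covered above, is a local wrapper type; `Hex` here is a made-up name. It compiles, but every use site then has to wrap and unwrap, so take it as an illustration of the restriction more than a real fix.

```rust
use std::fmt;

// Rejected by the compiler: neither the trait nor the type is local.
// impl fmt::Display for Vec<u8> { ... }   // error[E0117]

// Accepted: `Hex` is defined in this crate, so it can implement a foreign trait.
struct Hex(Vec<u8>);

impl fmt::Display for Hex {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        for byte in &self.0 {
            write!(f, "{:02x}", byte)?;
        }
        Ok(())
    }
}

fn main() {
    let bytes = vec![0u8, 255, 16];
    // The wrapper has to be applied at every use site.
    println!("{}", Hex(bytes));
}
```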
A lot of other real basics are subject to this issue too, like the ability to clone a value, or to ask if it's equal to another value of the same type, or the ability to print a default representation of it for debugging! If a library author didn't put the necessary `derive` attributes on their types, you just can't do those things with their types.
Rust uses macros to have type-safe string formatting, JSON literals, automatically derived trait implementations, and other niceties. They're a lot more sophisticated than C macros though; instead of naive string substitution, they create their own syntax contexts:
https://doc.rust-lang.org/1.7.0/book/macros.html#hygiene
Macros are a much more enlightened solution to all of these problems than runtime reflection in an otherwise static language. I've used them to encapsulate repeated patterns that couldn't be dealt with using normal functions (because they needed to return from the outer scope), and to implement my own error handling sugar for an unusual use case, making the code much less crowded.
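As a sketch of the "return from the outer scope" case (not my actual code; `some_or_return!` and `first_char_upper` are made up for illustration): a function can't make its caller return, but a macro expands in place, so a `return` inside it belongs to the enclosing function.

```rust
// Hypothetical macro: unwrap an Option, or return a default
// from the *enclosing* function.
macro_rules! some_or_return {
    ($expr:expr, $default:expr) => {
        match $expr {
            Some(value) => value,
            None => return $default,
        }
    };
}

fn first_char_upper(s: &str) -> String {
    // If the string is empty, the whole function returns here.
    let c = some_or_return!(s.chars().next(), String::from("<empty>"));
    c.to_uppercase().collect()
}

fn main() {
    println!("{}", first_char_upper("rust"));
    println!("{}", first_char_upper(""));
}
```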
That said, the set of features involved is insanely complex and I believe I will never understand them. There are a bunch of different kinds of macros: the legacy ones (`macro_rules!`, with their own wacky rules about module visibility), the new declarative ones (`pub macro`), procedural ones, derive macros and attributes...
Rust doesn't have build system hell. Cargo Just Works (and comes with subcommands to generate the project boilerplate for you). Cargo.toml allows specifying dependencies not only by crates.io name but also by git URL or filesystem path.
The compiler is the most helpful I've ever seen. It shows source context with colored output, the error descriptions are good, and it automatically points out unused stuff and unhandled Results!
There's even an official linter, clippy, which can do things like alert you when you've reimplemented a function from the stdlib.
But not all of the Rust toolchain is so good.
There's a fancy HTML docs generator, and the output is good, but there's no plaintext version for terminal viewing; you can *only* view Rust docs in a fucking web browser. A WEB browser. No, it's not okay for a systems language to be stapled to the fucking web.
Also, crates.io doesn't render anything without Javascript! That's another travesty. (Go to lib.rs instead!)
rustfmt is useless. Its preferences are simply objectively awful. It always wants to break your statements onto way too many lines:
Despite the massive amount of options in rustfmt.toml, none of them lets you disable this behavior, except by raising the max line length, which will make it *un*-break lines that you broke yourself. There's no way to tell rustfmt "line breaks are not your responsibility", which is what it needs to be told.
Also, part of the point of autoformatters is that they make everyone's code look the same. rustfmt doesn't, because almost every aspect of it can be customized in rustfmt.toml (just not the ones that are most problematic). I even used a customized config myself before I gave up on trying to use it at all.
I'm sad to say it because again it *is* one of my favorite languages, but Rust is horribly bloated.
One consequence particularly painful to me is that Rust does not, and probably will never support Plan 9.
Why not?
From what I've heard, it's because the Rust toolchain depends on LLVM, which is insanely huge and in C++, which doesn't support Plan 9. This dependency path is so thick that it would be unfeasible to port Rust to any system that LLVM doesn't support.
This is the cost of features.
Despite the number of features Rust already has, they're still adding more! There's an entire GitHub repo *just* for proposing changes to Rust (many of which are new features, meaning more complexity, and you can bet none of them remove anything) with 600 open issues!
I want to link a certain thread on the Rust dev forum (no, the RFC repo isn't the only place new shit is proposed):
because I think this proposal is the perfect example of the bloat that these modern, powerful languages are experiencing. Writing `!` is too hard, we need a fucking new method!
In fairness, `!` does not really stand out visually, and that's part of why I prefer Python's decision to spell it `not`, but given Rust chose `!`, I'm strongly against duplicating that functionality.
I think the same thing about `.is_empty()` existing in the first place, although to a lesser extent. `.len() > 0` is not fucking hard, and it's crystal clear. It's actually even a character shorter.
In fact, notice this: if `.len() > 0` was the idiomatic way, there wouldn't have been a problem in the first place! `v.len() > 0` is much more visually different from `v.len() == 0` than `v.is_empty()` is from `!v.is_empty()`. Adding one unnecessary API led to the desire to add another.
And all while they're trying to cram every imaginable feature into this thing, some fundamental parts of the language itself aren't defined yet:
The Rust Reference: Behavior considered undefined
Warning: The following list is not exhaustive. There is no formal model of Rust's semantics for what is and is not allowed in unsafe code, so there may be more behavior considered unsafe.
Unfortunately, Rust hasn't actually defined its aliasing model. 🙀
Are you serious? The language has been 1.0 since 2015. You are supposed to have worked out everything like this by the time you call your language 1.0.
There are several ways in which Mozilla exerts a distasteful level of control over Rust:
Hyperbola wiki: Rust Trademark Concerns
Terrible in every way.
The standard library has a large number of obscure methods on its types, but from a breadth perspective, it includes absolutely nothing besides language basics, type methods and OS interfaces: no randomness, not even a time library. std::time has the Duration, Instant, and SystemTime types; the latter two are completely opaque, meaning you can't do things like get the year number out of them (and of course there's no strftime or strptime). Even Haskell has more of a stdlib than this.
The ecosystem is npm all over again. Every time I build a Rust project I watch in horror as cargo scrolls through fetching literally hundreds of dependencies, and it feels like a rare experience to find a crate that's reached 1.0.