yujiri.xyz

Software

Rust review

How Rust and I met

I first heard of Rust long ago through stray searches. I was interested because it seemed like a competitor to Go, which I had recently learned for a job but didn't really like. My biggest gripe with Go was error handling. But when I read about Rust's error handling, I misunderstood what I was reading, causing me to think it was even *more* verbose than Go's, so I stopped investigating.

Go review

I think that was before most of my search for a better language, which led me through brief dips into several languages, but mostly into Haskell. About a year passed.

Haskell review

Then I heard about Rust again from a friend who held it in high esteem. I did a little more research and found out that it has sugar to cut down on error handling boilerplate. That combined with an ML-inspired type system made it sound to me like Go done right, so I eagerly jumped in.

Ownership

So the big unique thing about Rust is ownership. Every value is owned by the scope it's declared in, and only one scope is allowed to own it at a time. You have to *borrow* a value to pass it to another scope (like a function) without that scope taking ownership away from the caller. Values can be borrowed immutably or mutably, and mutable borrows are exclusive, which generally makes data races impossible.
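A minimal sketch of those rules. The function names (`total_len`, `push_greeting`, `consume`) are made up for illustration:

```rust
// Borrow immutably: the caller keeps ownership and can keep using the value.
fn total_len(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

// Borrow mutably: exclusive access for the duration of the call.
fn push_greeting(words: &mut Vec<String>) {
    words.push(String::from("hello"));
}

// Take ownership: the caller can no longer use the value afterwards.
fn consume(words: Vec<String>) -> usize {
    words.len()
}

fn main() {
    let mut words = vec![String::from("hi")];
    push_greeting(&mut words);
    println!("{}", total_len(&words));
    let n = consume(words);
    // println!("{:?}", words); // would not compile: `words` was moved
    println!("{}", n);
}
```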

I hear a lot of people saying the borrow checker is draconian and hard to satisfy, but I find its rules pretty intuitive and its error messages helpful. For almost every use case where you need to do something the borrow checker doesn't like, the standard library has a straightforward solution in the form of a generic "smart pointer" type.
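As one example of such a smart pointer (my pick for illustration, not the only option), `Rc<RefCell<T>>` gives two handles shared, mutable access to one value by moving the borrow check to runtime. `shared_push` is a hypothetical name:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Two handles to the same vector. Rc counts the owners; RefCell checks the
// "one mutable borrow at a time" rule at runtime instead of compile time.
fn shared_push() -> Vec<i32> {
    let a = Rc::new(RefCell::new(vec![1, 2]));
    let b = Rc::clone(&a);
    b.borrow_mut().push(3);
    let result = a.borrow().clone();
    result
}

fn main() {
    println!("{:?}", shared_push());
}
```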

Ownership also enables Rust to have *neither garbage collection nor manual memory management* - both things I hate. Instead, Rust knows at compile time exactly when everything should be freed and inserts the free() calls for you. You get the safety and ergonomics of automatic memory management, with the performance and simplicity of manual memory management.

Type system

The type system is very sophisticated. Being inspired by ML, it has interface (trait)-based polymorphism, generics, sum types, and tuples, so it's very rare that static typing ever "gets in the way". It also has inference for local variables (but not for statics and function signatures), and even more advanced features like parameterized traits are used for fancy stuff like generic `.into()` methods.
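A rough sketch of how those pieces fit together. `Shape`, `Area` and `total_area` are hypothetical names invented for this example:

```rust
// A sum type: a value is exactly one of these variants.
enum Shape {
    Circle(f64),
    Rect(f64, f64),
}

// Trait-based polymorphism.
trait Area {
    fn area(&self) -> f64;
}

impl Area for Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Circle(r) => std::f64::consts::PI * r * r,
            Shape::Rect(w, h) => w * h,
        }
    }
}

// A generic function bounded by the trait.
fn total_area<T: Area>(items: &[T]) -> f64 {
    items.iter().map(|i| i.area()).sum()
}

// Implementing the parameterized trait `From` provides `.into()` as well.
impl From<(f64, f64)> for Shape {
    fn from(dims: (f64, f64)) -> Self {
        Shape::Rect(dims.0, dims.1)
    }
}

fn main() {
    // The annotation on `rect` tells `.into()` which type to produce.
    let rect: Shape = (2.0, 3.0).into();
    let shapes = [rect, Shape::Circle(1.0)];
    println!("{}", total_area(&shapes));
}
```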

One might say the type system is *too* sophisticated: there's so much to it that, even after using the language for a long time, I still struggle to understand parts of it, like turbofish, and the error messages I sometimes see with libraries like iced and serde appear horribly arcane. It is also common to see error messages saying something confusing like "that method exists for this type, but its trait bounds were not satisfied" while the truth seems to be that the method doesn't exist for the type and the error message is referring to some generic universal implementation that simply doesn't apply to that type.
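For reference, turbofish is the `::<T>` syntax for spelling out a generic parameter at the call site when inference has nothing else to go on. A small sketch, with `parse_all` as a made-up helper:

```rust
fn parse_all(input: &str) -> Vec<i32> {
    input
        .split(',')
        // `parse` is generic over its output type, so name it with turbofish.
        .filter_map(|s| s.trim().parse::<i32>().ok())
        // Same idea for `collect`; `_` lets the compiler infer the element type.
        .collect::<Vec<_>>()
}

fn main() {
    println!("{:?}", parse_all("1, 2, x, 4"));
}
```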

iced

serde

No magic

Rust has basically no "magic". Syntactic constructs like indexing, iteration, and comparison use *traits* that you can implement on custom types. Even everyday types like `Vec`, `Option` and `Result` are implemented as *library code*, not compiler magic.
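A small sketch of what that looks like in practice. `Version` is a hypothetical type; the comparison traits are derived here, but they could be written by hand the same way `Index` is:

```rust
use std::ops::Index;

// Deriving PartialEq/PartialOrd generates the impls behind `==` and `<`.
#[derive(Debug, PartialEq, PartialOrd)]
struct Version {
    parts: [u32; 3],
}

// `v[i]` is sugar for a call into the `Index` trait.
impl Index<usize> for Version {
    type Output = u32;
    fn index(&self, i: usize) -> &u32 {
        &self.parts[i]
    }
}

fn main() {
    let old = Version { parts: [1, 2, 0] };
    let new = Version { parts: [1, 10, 0] };
    println!("{}", new[1]);
    println!("{}", old < new);
}
```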

Error handling

So error handling was the #1 thing I hoped Rust would improve over Go (type system was #2). It does, but not as much as I hoped. The main improvement is the `?` operator. Instead of:

let val = match could_fail() {
	Ok(val) => val,
	Err(err) => return Err(err),
};

We can do:

let val = could_fail()?;

Excellent. If `could_fail()` returns an `Err`, we'll just propagate it upward, and otherwise, `val` becomes the unwrapped success value. Barely any more verbose than an exception language, and more explicit.

The big downside we still have is that there is no context by default. Propagating an error with `?` *only* propagates the original error, with no context added, so when you see the error, it won't have a line number or any other accompanying information, let alone a stack trace. You get that out of the box in most dynamic languages, but in Rust you have to work hard to get it.
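To make the missing context concrete, here is a std-only sketch. `parse_port` is a hypothetical helper:

```rust
use std::num::ParseIntError;

// `?` forwards the error untouched; nothing about the call site is attached.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let port = s.parse::<u16>()?;
    Ok(port)
}

fn main() {
    if let Err(e) = parse_port("abc") {
        // prints "Error: invalid digit found in string"
        // It says what went wrong, but not which input or where in the code.
        println!("Error: {}", e);
    }
}
```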

r/rust users informed me that the situation is similar to Go in that you're just expected to use third-party crates to get sane error handling. There isn't even just one that's dominant; apparently the verdict is that anyhow is appropriate for applications and thiserror is appropriate for libraries.

https://www.reddit.com/r/rust/comments/i1lyy5/how_does_new_error_handling_work/

anyhow works like this:

use anyhow::{Context, Result};

fn main() -> Result<()> {
    let args: Vec<String> = std::env::args().collect();
    do_stuff(&args[1], &args[2], &args[3])
}

fn do_stuff(file1: &str, file2: &str, file3: &str) -> Result<()> {
    let text = std::fs::read_to_string(file1).context("when reading")?;
    std::fs::write(file2, text).context("when writing")?;
    std::fs::remove_file(file3).context("when removing")?;
    Ok(())
}

And if I pass a filename that doesn't exist, I'll get output like:

Error: when reading

Caused by:
    No such file or directory (os error 2)

This is almost exactly how Go's github.com/pkg/errors works.

Another problem with Rust error handling, and this one actually is a regression from Go, is that working with different possible types of errors is hard. Each function that can return an error is expected to have exactly one type it can return. For example OS stuff usually returns `io::Result<T>`, an alias for `Result<T, io::Error>`, where `io::Error` has a method `kind` that returns `io::ErrorKind`, an enum of OS errors.

So what do you do if you have a function that might return an `io::Error` or might return a different kind of error? You have to convert every error that might get propagated to some common type. This is insanely cumbersome without the anyhow library, and even with it, downcasting the error to check if it's a given specific type is really un-ergonomic. You can't do a simple comparison against a given error or use `errors.Is` like you can in Go.
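For a sense of the boilerplate involved without any crates, here is a sketch of the hand-written common error type. `ConfigError` and `read_number` are hypothetical names, and this is only one way to do it:

```rust
use std::fmt;
use std::fs;
use std::io;
use std::num::ParseIntError;

// A common error type for a function that can fail in two different ways.
#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Io(e) => write!(f, "io error: {}", e),
            ConfigError::Parse(e) => write!(f, "parse error: {}", e),
        }
    }
}

impl std::error::Error for ConfigError {}

// These `From` impls are what let `?` convert each error automatically.
impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        ConfigError::Io(e)
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

fn read_number(path: &str) -> Result<i64, ConfigError> {
    let text = fs::read_to_string(path)?; // io::Error -> ConfigError
    let n = text.trim().parse::<i64>()?; // ParseIntError -> ConfigError
    Ok(n)
}

fn main() {
    match read_number("/nonexistent/number.txt") {
        Ok(n) => println!("{}", n),
        // Checking for one specific error means matching through two layers.
        Err(ConfigError::Io(e)) if e.kind() == io::ErrorKind::NotFound => {
            println!("file not found")
        }
        Err(e) => println!("{}", e),
    }
}
```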

Syntax

Rust's syntax is pretty verbose. Not just that it's a brace and semicolon language, but types take extra characters: function parameters need `param: Type`, whereas in most other static languages it's just `param Type` or `Type param`, and parameterized types need a pair of angle brackets: `Vec<u8>` is a vector of bytes, instead of `Vec u8` like it would be with Haskell syntax.

The syntax for namespacing is `::` instead of `.` (except structs which still use `.`). A downside of this besides being less ergonomic is that its precedence is misleading when combined with `.`: `users::table.load::<User>` looks like `users :: table.load :: <User>`, because the `::` is more visual separation so intuitively it should bind less tightly, but it's actually `users::table . load::<User>`.

Rust is often littered with "glue" calls like `.unwrap()` when locking an `RwLock`, `.to_string()` on every string literal that you want to be a `String` instead of a `&str`, `.iter()` whenever you want to use an `Iterator` method on a sequence, etc. There is no concise string concatenation (`format!("{}{}", a, b)` is the best we got), and you need `impl StructName {...}` around method definitions and `impl TraitName for StructName {...}` around trait implementations.
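A short sketch that puts those glue calls side by side. `shout_all` is a made-up function:

```rust
use std::sync::RwLock;

fn shout_all(names: &RwLock<Vec<String>>) -> String {
    // `.read()` returns a Result (the lock may be poisoned), hence `.unwrap()`.
    let guard = names.read().unwrap();
    // `.iter()` is needed before Iterator methods such as `map`.
    let upper: Vec<String> = guard.iter().map(|n| n.to_uppercase()).collect();
    upper.join(" ")
}

fn main() {
    // `.to_string()` turns each `&str` literal into an owned `String`.
    let names = RwLock::new(vec!["ann".to_string(), "bob".to_string()]);
    println!("{}", shout_all(&names));

    let a = "foo".to_string();
    let b = "bar";
    let joined = format!("{}{}", a, b);
    println!("{}", joined);
}
```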

When pattern matching a sum type, variants are not in scope by default and need to be qualified with the sum type name:

enum Either {
    A,
    B,
}

// e is of type Either
match e {
    Either::A => ...,
    Either::B => ...,
}

(You can put them in scope with `use Either::{A,B}`, and this is done in the prelude for the builtin `Option` and `Result`.)
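A runnable version of that, assuming the `use` sits at module level next to the enum (`name` is a hypothetical function):

```rust
enum Either {
    A,
    B,
}

// Bring the variants into scope so the match arms can drop the prefix.
use Either::{A, B};

fn name(e: Either) -> &'static str {
    match e {
        A => "a",
        B => "b",
    }
}

fn main() {
    println!("{} {}", name(A), name(Either::B));
}
```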

Variable declarations

Rust has variable declarations, but you can shadow by redeclaring the same name, solving the lexical coupling issues other explicit declaration languages have:

Variable declarations

All bindings are immutable by default, which is important for the borrowing rules to be practical, and contributes to the language's goal of maximum safety.
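A quick sketch of shadowing and default immutability together (`parse_and_double` is a hypothetical name):

```rust
fn parse_and_double(input: &str) -> i32 {
    // Shadowing: the same name is redeclared at each step, even with a new type.
    let value = input.trim();
    let value: i32 = value.parse().unwrap_or(0);
    let value = value * 2;
    // value += 1; // would not compile: bindings are immutable unless `let mut`
    value
}

fn main() {
    println!("{}", parse_and_double(" 21 "));
}
```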

Array operations

The Vec type (which is the main sequence type) accommodates most common operations out of the box: find, insert, remove, sort, reverse, filter, map, and a ton of obscure ones. The only thing you don't get is negative indexing.
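A sketch of those operations using only std. I'm assuming "find" means finding an element's position, so it is shown with `position`; `demo` is a made-up function:

```rust
fn demo() -> (Vec<i32>, Option<usize>, Vec<i32>) {
    let mut v = vec![3, 1, 2];
    v.insert(1, 10); // [3, 10, 1, 2]
    v.remove(0); // [10, 1, 2]
    v.sort(); // [1, 2, 10]
    v.reverse(); // [10, 2, 1]
    // `position` returns the index of the first match, if any.
    let pos = v.iter().position(|&x| x == 2);
    // filter and map come from the Iterator trait via `.iter()`.
    let doubled_big: Vec<i32> = v.iter().filter(|&&x| x > 1).map(|x| x * 2).collect();
    // No negative indexing: the last element is `v[v.len() - 1]` or `v.last()`.
    (v, pos, doubled_big)
}

fn main() {
    println!("{:?}", demo());
}
```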

Concurrency

Rust has OS-level threads as well as async/await (which I haven't used). The designers said somewhere that OS-level threads made more sense than green threads because Rust is supposed to be a systems language, and I'm happy with that.

Its communication system is similar to Go's, but ownership solves data races *and mutex hell*. Mutexed data can't be accessed without locking because of the type system, and when the unwrapped data goes out of scope, it's automatically unlocked. You can still deadlock of course, but this makes mutexes much easier to work with.
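A minimal sketch of the mutex part, using OS threads from std. `count_in_parallel` is a hypothetical function:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    // The counter lives inside the Mutex; there is no way to reach it unlocked.
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();
    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                let mut guard = counter.lock().unwrap();
                *guard += 1;
                // `guard` goes out of scope here, which releases the lock.
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```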

Resource management

Rust actually leverages ownership to solve this too. Files are *automatically* closed when they go out of scope; it's handled by the Drop trait. I think it's the most elegant solution I've ever seen.
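Files do this internally, but the same mechanism is available to any type. A sketch with a hypothetical `Resource` type that logs when it is released:

```rust
use std::cell::RefCell;

// A stand-in for something like a file handle; it records its own cleanup.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Resource<'_> {
    // Called automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("closed {}", self.name));
    }
}

fn run(log: &RefCell<Vec<String>>) {
    let _a = Resource { name: "a", log };
    {
        let _b = Resource { name: "b", log };
        log.borrow_mut().push("inner scope done".to_string());
    } // `_b` is dropped here
    log.borrow_mut().push("outer scope done".to_string());
} // `_a` is dropped here

fn main() {
    let log = RefCell::new(Vec::new());
    run(&log);
    for line in log.borrow().iter() {
        println!("{}", line);
    }
}
```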

On the other hand, it is less flexible than the Go and Python solutions in that it doesn't pull double-duty for other use cases. As far as I know, nothing in Rust provides the full power of `defer` or `finally`.

Module system

The thing I've found most confusing about Rust is the package import/namespacing system. One aspect is that the keyword `mod` is used both to explicitly declare a module *and* to indicate that the definition is in another file. There's weird stuff like import paths starting with `::` and the `crate` keyword for the current crate, and the `extern crate` keyword which they say should only be necessary for "sysroot" crates, but I've found it seemingly necessary to work with Diesel. There is an explanation of some of the aspects that confuse me, but I still feel confused after reading it:

Path and module system changes

I actually love the way `use` works. You don't strictly have to have a `use` declaration at the top to be able to use an external crate because dependencies are all declared in Cargo.toml; all `use` does is unwrap namespaces. For example, `use std::env;` lets you use `env` directly in that scope, but without it you can still reference `env` as `std::env`, which can be more convenient for single uses, and I'm very attracted to the idea of not having to edit something at the top of the file when I realize I need to use a stdlib module on line 500.
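Concretely, both spellings below refer to the same function (`arg_counts` is a made-up name):

```rust
use std::env;

// `use` only shortens the path; it is not what makes the module available.
fn arg_counts() -> (usize, usize) {
    let short = env::args().count(); // relies on the `use` above
    let full = std::env::args().count(); // works with no `use` at all
    (short, full)
}

fn main() {
    let (short, full) = arg_counts();
    println!("{} {}", short, full);
}
```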

In general, I despise Rust's zealous enforced privacy. Everything is private by default and there is no backdoor, meaning you're basically out of luck if a dependency author didn't anticipate your use case.

Enforced privacy is stupid

Another onerous restriction is that a trait can only be implemented in either the crate that defines the type or the crate that defines the trait. You can't implement external traits on external types. And... serialization is an external trait! That means you just can't serialize types from a library that weren't defined to be serializable.

The serialization library

A lot of other real basics are subject to this issue too, like the ability to clone a value, or to ask if it's equal to another value of the same type, or the ability to print a default representation of it for debugging! If a library author didn't put the necessary `derive` attributes on their types, you just can't do those things with their types.
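To show what hangs on that one attribute, here is a sketch with a hypothetical `Point` type standing in for a library's type. Delete the `derive` line and none of `describe` compiles:

```rust
#[derive(Clone, PartialEq, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn describe(p: &Point) -> String {
    let copy = p.clone(); // needs Clone
    let same = copy == *p; // needs PartialEq
    format!("{:?} {}", copy, same) // `{:?}` needs Debug
}

fn main() {
    println!("{}", describe(&Point { x: 1, y: 2 }));
}
```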

Macros

Rust uses macros to have type-safe string formatting, JSON literals, automatically derived trait implementations, and other niceties. They're a lot more sophisticated than C macros though; instead of naive string substitution, they create their own syntax contexts:

https://doc.rust-lang.org/1.7.0/book/macros.html#hygiene

Macros are a much more enlightened solution to all of these problems than runtime reflection in an otherwise static language. I've used them to encapsulate repeated patterns that couldn't be dealt with using normal functions (because they needed to return from the outer scope), and to implement my own error handling sugar for an unusual use case, making the code much less crowded.
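A sketch of the "return from the outer scope" case. This is not the macro I actually wrote, just a hypothetical `some_or_return!` showing the idea:

```rust
// Unwrap an Option or return early from the *enclosing* function, which an
// ordinary helper function cannot do.
macro_rules! some_or_return {
    ($expr:expr, $fallback:expr) => {
        match $expr {
            Some(v) => v,
            None => return $fallback,
        }
    };
}

fn first_char_upper(s: &str) -> String {
    let c = some_or_return!(s.chars().next(), String::from("<empty>"));
    c.to_uppercase().collect()
}

fn main() {
    println!("{}", first_char_upper("rust"));
    println!("{}", first_char_upper(""));
}
```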

That said, the set of features involved is insanely complex and I believe I will never understand them. There are a bunch of different kinds of macros: the legacy ones (`macro_rules!`, with their own wacky rules about module visibility), the new declarative ones (`pub macro`), procedural ones, derive macros and attributes...

Tooling

Rust doesn't have build system hell. Cargo Just Works (and comes with subcommands to generate the project boilerplate for you). Cargo.toml allows specifying dependencies not only by crates.io name but also by git URL or filesystem path. *Most* other languages I know make filesystem path imports hard.

Build systems are a scourge

The compiler is the most helpful I've ever seen. It shows source context with colored output, the error descriptions are good, and it automatically points out unused stuff and unhandled Results!

There's even an official linter, clippy, which can do things like alert you when you've reimplemented a function from the stdlib.

But not all of the Rust toolchain is so good.

rustdoc

There's a fancy HTML docs generator, and the output is good, but there's no plaintext version for terminal viewing; you can *only* view Rust docs in a fucking web browser. A WEB browser. No, it's not okay for a systems language to be stapled to the fucking web.

The web is evil

Web browsers need to stop

Also, crates.io doesn't render anything without Javascript! That's another travesty. (Go to lib.rs instead!)

rustfmt

rustfmt is useless. Its preferences are simply objectively awful. It always wants to break your statements onto way too many lines:

Example

Despite the massive number of options in rustfmt.toml, none of them lets you disable this behavior, except by raising the max line length, which will make it *un*-break lines that you broke yourself. There's no way to tell rustfmt "line breaks are not your responsibility", which is what it needs to be told.

Also, part of the point of autoformatters is that they make everyone's code look the same. rustfmt doesn't, because almost every aspect of it can be customized in rustfmt.toml (just not the ones that are most problematic). I even used a customized config myself before I gave up on trying to use it at all.

Bloat and lack of stability

I'm sad to say it because I really like so many things about it, but Rust is a horribly bloated language.

Features are costs

One consequence particularly painful to me is that Rust does not, and probably will never support Plan 9.

Why not?

From what I've heard, it's because the Rust toolchain depends on LLVM, which is insanely huge and in C++, which doesn't support Plan 9. This dependency path is so thick that it would be unfeasible to port Rust to any system that LLVM doesn't support.

This is the cost of features.

Despite the number of features Rust already has, they're still adding more! There's an entire GitHub repo *just* for proposing changes to Rust (many of which are new features and more complexity, and you can bet none of them remove anything) with 600 open issues!

Rust RFCs

I want to link a certain thread on the Rust dev forum (no, the RFC repo isn't the only place new shit is proposed):

https://internals.rust-lang.org/t/the-is-not-empty-method-as-more-clearly-alternative-for-is-empty/10612

because I think this proposal is the perfect example of the bloat that these modern, powerful languages are experiencing. Writing `!` is too hard, we need a fucking new method!

In fairness, `!` does not really stand out visually, and that's part of why I prefer Python's decision to spell it `not`, but given Rust chose `!`, I'm strongly against duplicating that functionality.

I think the same thing about `.is_empty()` existing in the first place, although to a lesser extent. `.len() > 0` is not fucking hard, and it's crystal clear. It's actually even a character shorter.

In fact, notice this: if `.len() > 0` was the idiomatic way, there wouldn't have been a problem in the first place! `v.len() > 0` is much more visually different from `v.len() == 0` than `v.is_empty()` is from `!v.is_empty()`. Adding one unnecessary API led to the desire to add another.

And all while they're trying to cram every imaginable feature into this thing, some fundamental parts of the language itself aren't defined yet:

The Rust Reference: Behavior considered undefined

Warning: The following list is not exhaustive. There is no formal model of Rust's semantics for what is and is not allowed in unsafe code, so there may be more behavior considered unsafe.

The Rustonomicon: Aliasing

Unfortunately, Rust hasn't actually defined its aliasing model. 🙀

Are you serious? The language has been 1.0 since 2015. You are supposed to have worked out everything like this by the time you call your language 1.0.

Stdlib and ecosystem

Terrible in every way.

The standard library has a large number of obscure methods on its types, but from a breadth perspective, it includes absolutely nothing besides language basics, type methods and OS interfaces: no randomness, not even a time struct. std::time features the Duration, Instant, and SystemTime types; the latter two are completely opaque, meaning you can't do things like get the year number out of them (of course there's no strftime or strptime). Even Haskell has more of a stdlib than this.
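For illustration, about the most you can get out of a `SystemTime` with std alone is an offset from the Unix epoch. `unix_seconds` is a hypothetical helper:

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Seconds since the Unix epoch. Turning that into a year, month or formatted
// date is left to your own arithmetic or a third-party crate.
fn unix_seconds(t: SystemTime) -> u64 {
    t.duration_since(UNIX_EPOCH)
        .expect("time is before the Unix epoch")
        .as_secs()
}

fn main() {
    let one_day_in = UNIX_EPOCH + Duration::from_secs(86_400);
    println!("{}", unix_seconds(one_day_in));
    println!("{}", unix_seconds(SystemTime::now()));
}
```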

The ecosystem is npm all over again. Every time I build a Rust project I watch in horror as cargo scrolls through fetching literally hundreds of dependencies, and it feels like a rare experience to find a crate that's reached 1.0.

Performance

My impression of Rust's performance is that it's extremely good, especially on memory use. Not having garbage collection is probably its main edge over other compiled languages. See this page about Rust vs C speed (yes, it's a two-sided comparison!):

https://kornel.ski/rust-c-speed

Basically, Rust is as performant as you can hope to be while offering great protection against the memory-safety bugs and other bugs that plague every program written in C or C++.

Rust currently occupies the top spot as my favorite language. Most languages are either high or low level, but Rust is both. It's almost as expressive as dynamic languages, but safer than other static languages and has facilities for when you need really fine control over things like memory layout. There's even a tutorial on writing an operating system in Rust.

OS tutorial

"Is it time to rewrite the operating system in Rust?" By Bryan Cantrill
