💾 Archived View for dioskouroi.xyz › thread › 24991848 captured on 2020-11-07 at 00:39:19. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
________________________________________________________________________________
If you're writing async code you don't need to worry about pinning unless you're manually writing futures or designing an executor. But if you are curious, this chapter explains it pretty well:
https://rust-lang.github.io/async-book/04_pinning/01_chapter...
Apparently, recursion adds a new level of complexity to async.
It really doesn't - it's just that some users may be surprised by the lack of magic when it comes to Rust. Async is implemented with futures, futures are types, and you can't recursively define a type without boxing.
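To make that type-level constraint concrete, here's a minimal (non-async) sketch: without the Box, `List` would have infinite size and the compiler rejects it. A recursive async fn hits the same wall, because its future type would contain itself.

```rust
// A recursive type must box its recursive field, or the compiler
// rejects it as infinitely sized -- the same constraint that forces
// recursive async fns to return a boxed future.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn sum(l: &List) -> i32 {
    match l {
        List::Cons(v, rest) => v + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{}", sum(&list)); // 3
}
```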
Apparently, this boxing and pinning is what you get when you don't have a GC, and that when you do have a GC, you simply don't need to deal with it. So that was the final straw for me.
It's kind of weird to me that language developers would pass on Rust because they don't understand memory management. How is memory going to be managed in Dark? If it's GC'd, how is that implemented?
>_It's kind of weird to me that language developers would pass on Rust because they don't understand memory management. How is memory going to be managed in Dark? If it's GC'd, how is that implemented?_
This was weird to me as well, but it looks like Dark is more like a RAD tool with a custom integrated scripting language. In that case I don't think you need a sophisticated GC.
I can understand the point that X Lang is easier to write a DSL parser/interpreter in than Y Lang for a bunch of reasons, including memory management (I've hit the recursion snags a few times writing parsers in Rust myself - I get it). So I don't want to denigrate their design decision.
Just the logic seemed out of nowhere to me, like this shouldn't be terribly surprising to a language developer, and it's framed like they're discovering things for the first time.
It's just delegated to the host platform, I'm assuming. Also, I don't know anything about Dark, but it's a DSL, and not a general purpose language, and could very well be running interpreted right now, just mapping things back/forth to the host.
This is all hypothetical, I actually don't know, just my best guess.
I’m guessing Dark is interpreted and that its objects are managed by the host language’s GC?
Yes, that's right.
> If it's GC'd, how is that implemented?
The language in which the Dark interpreter is built has a GC, and we just use that.
A GC is a huge undertaking, and we already have to build an editor, a language, a stdlib, and infrastructure.
Why do you need to build an editor too?
Side note, I've written a GC in Rust (stop the world, compacting). It's not a cutting edge, concurrent generational GC but it's not _that_ big of an undertaking. And it performs very well for small problem sets since memory is all slab allocated. Only a few hundred lines in Rust.
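For flavor, here's a toy bump/arena allocator in the spirit of that slab-allocated design -- a hedged sketch, not the commenter's actual collector. A real GC would also record object headers here so it could trace roots and compact live objects.

```rust
// A toy bump/arena allocator: allocation is just an add and a bounds
// check, which is why small problem sets fly on slab-style memory.
struct Arena {
    buf: Vec<u8>,
    offset: usize,
}

impl Arena {
    fn new(capacity: usize) -> Self {
        Arena { buf: vec![0; capacity], offset: 0 }
    }

    // Hand out `size` bytes by bumping an offset into the slab.
    fn alloc(&mut self, size: usize) -> Option<&mut [u8]> {
        if self.offset + size > self.buf.len() {
            return None; // exhausted: a real collector would run here
        }
        let start = self.offset;
        self.offset += size;
        Some(&mut self.buf[start..self.offset])
    }
}

fn main() {
    let mut arena = Arena::new(16);
    assert!(arena.alloc(8).is_some());
    assert!(arena.alloc(8).is_some());
    assert!(arena.alloc(1).is_none()); // out of space
}
```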
I do definitely see how writing a tree-walking interpreter is non-ideal in Rust if you don't have a ton of experience writing Rust. I have a tiny Lisp in Rust; compared to doing it in dynamic languages, it is quite verbose and hard to grok.
The premise of Dark is that we can remove a huge amount of accidental complexity by having an integrated toolkit of language, editor, and infrastructure. This allows for three major steps forward:
- we're able to deploy (safely!) in 50ms rather than minutes because of it. [1]
- we completely host the infra and you don't need to think about it (including DBs and queues, which are nicely integrated into the language) [2]
- trace driven development, where we use real values from production and show you them in your editor [2]
[1] https://blog.darklang.com/how-dark-deploys-code-in-50ms/ [2] https://darklang.com/launch/demo-video
I get the value prop and totally believe in your approach, I'm just confused why you would opt to write an entire editor when the LSP can do whatever you want and integrates into numerous editors.
They can significantly reduce tooling complexity with their own web-based editor. It's also a much more graphical application than a traditional text editor. Take a look at some screenshots from their demo video:
https://www.youtube.com/watch?v=orRn2kTtRXQ
It's pretty easy to understand non-moving garbage collection for a language where everything is boxed and heap-allocated without having to think about the concerns that lead to pinning. Pretty sure that's the order in which I encountered the concepts.
> because they don't understand memory management. How is memory going to be managed in Dark? If it's GC'd, how is that implemented?
That's one more thing you can afford to not understand and still be productive.
Sometimes, until you can't, then you have a really bad time.
For instance:
- You shouldn't rely on finalizers to manage resources due to the lack of determinism. No compiler support to help you through that.
- Similarly, when the magic blob GC starts acting up, how do you solve that?
- What about memory usage being orders of magnitude higher in a GC'd system?
It's just _different_ problems. To an extent the line where you have to care is much closer in Rust than Java, but in Java, when you cross the line, good luck. Rust forcing you to be expressive from the get-go pays dividends down the line, allowing you to be relatively more productive, later.
I guess it's like skiing vs. snowboarding. Skiing is easy to get going, but really difficult to get good at. Snowboarding is really hard to get started, but once you do, it's relatively much easier to be great at it.
If you don't do embedded programming (where you shouldn't do dynamic memory allocation anyway, with or without a GC), memory is often not a constraint at all. And in a lot of situations it costs you less to increase the RAM (if you can) than to optimize the code.
> when you cross the line, good luck
And if you never cross the line? You wasted thousands of dollars developing something in Rust that you could have written in Python in a quarter of the time and been fine.
To me the job of the programmer is not to reason about memory. The job of the programmer is to reason about algorithms and data structures, and thus a language that abstracts away memory management is better if you can afford it. It has a cost, sure, like it has a cost using C instead of assembly.
One thing that gets lost in these discussions is, a GC only helps with memory. Rust's ownership system (and to some degree, RAII & friends in C++) let you manage arbitrary resources here. There's a lot more than just memory going on.
I think you misunderstood, if you run into memory pressure issues in Java you need to spend tons of time tweaking and optimizing your specific garbage collector by poking the black box and hoping.
Of course in certain circumstances where it really doesn't matter you get your choice of languages.
> To me the job of the programmer is not to reason about memory.
The job of a programmer is to describe what needs to happen to a set of inputs to yield a set of outputs, and there are many different ways to do so. Sometimes you do need to describe the relationships between data. Rust attempts to leverage your description of the transformation and relationships to make memory management something you don't have to worry about either.
There are limits, of course, and sometimes you need to help it out, however this amounts to improving or clarifying the description of your system. On the other hand resolving garbage collection issues often involves making surprising and seemingly irrelevant changes to your system that happen to influence via spooky action at a distance the behavior of an unrelated component of the system. And if you remove them it all falls apart again.
To borrow a turn of phrase, it has a cost, sure like it has a cost using C instead of assembly.
> memory is often not a constraint at all
This type of thought process is how we get Electron applications and when several of them are running at the same time they can bring your OS to a crawl by having consumed every bit of memory available for some simple GUIs. Memory absolutely always matters.
Also the people who I see saying memory doesn't matter are usually developers who are running on 32GB+ work stations. They forget about the user still running with 4GB or 8GB of memory.
As much as I hate electron I'm still waiting for an equivalent, low memory native GUI toolkit that has an equally low barrier to entry.
I've seen memory leaks excused as "optimization" and other poor excuses enough times to know this is not a valid statement. It doesn't matter whether you have a GC or not; knowing how to manage your memory is the bare minimum.
Highly subjective and project-specific, but I still think it provides a nice counter-anecdote to the constant Rust praise we tend to see here. And I say that as a Rust fan myself.
...but remember that garbage collectors are great. By having a GC, we don't have to do any of the stuff that causes all these problems in Rust. Maybe that costs performance, but I need the ability to quickly write code a lot more than I need the extra performance.
This is very valid and sensible. Worth noting is that C++ would be a ridiculous choice for this type of application, and Rust manages to be a _not_ ridiculous choice which is already a huge accomplishment. But that doesn't mean it's always a _good_ choice.
Thanks for writing this Paul; Rust is not going to be for everyone, and it's always nice to see reasoning laid out.
For whatever it's worth, I wish I had more time to contribute some code back when you were asking for examples. It kinda felt to me like a lot of your code was trying to write OCaml in Rust, which causes a lot of this friction. But it was only a feeling; I didn't have the capacity to really dig in. It is also very true that taking time to get up to "what is the Rust-y way to do this" is a challenge for folks learning Rust, and a valid reason to use a language that works closer to how your brain works.
I'm excited to see how F# works out for you; it's something I've always wanted to spend some time with, but never found the chance to actually do so.
FWIW, I had the exact same experience as the OP; I was slowly getting comfortable with Rust until I wanted (high performance) cooperative multitasking, which in Rust meant getting into Async and all the complexities of Rust multiplied to a point where I'm reconsidering some life choices. And I have yet to match the performance of the C++ equivalent solution :(
I suspect I would have liked the original green threads more, despite the drawbacks.
Ultimately it was a no-brainer to choose F#.
Their current code base is in OCaml. Porting code from OCaml to F# sounds about as painful as moving from Python 2 to 3[1], which while painful, is waaay better than a total re-write. F# has much better libraries and ecosystem (because it is built on .NET). Their frustration was with the libraries and ecosystem of OCaml, not its syntax[2]. So, if they switch to F#, much of their core logic remains syntactically the same, and simply needs reworking (rather than rewriting). They also have to retool/call .NET libraries instead of C/OCaml ones, of course, but this is really the point of the move anyway.
So: better libraries for what they're doing, not a total rewrite (just a re-work), preserve the overall codebase's structure, less disruption. It just makes good business sense to me.
[1]:
https://stackoverflow.com/questions/4239121/code-compatibili...
[2]:
https://blog.darklang.com/leaving-ocaml/
Yeah, the problem with jumping into Rust headfirst is that you assume that the most complex parts of the language are what you need to use to solve your problem.
It turns out that the vast majority of Rust code is just fine using Arc and letting the reference counts handle your memory management.
The lifetimes are a really cool feature for library/data structure authors, but if you're writing "web server" style code just use Arc, call clone, and be done with it, unless you've got a specific perf issue.
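A minimal sketch of that "clone and move on" style, assuming read-only shared data (the config string here is made up for the example):

```rust
use std::sync::Arc;
use std::thread;

// "Web server" style sharing: each worker thread gets its own cheap
// handle to the same immutable data via Arc::clone.
fn main() {
    let config = Arc::new(String::from("shared config"));
    let mut handles = Vec::new();
    for _ in 0..3 {
        let config = Arc::clone(&config); // bumps a refcount, no deep copy
        handles.push(thread::spawn(move || config.len()));
    }
    for h in handles {
        assert_eq!(h.join().unwrap(), "shared config".len());
    }
}
```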
> It turns out that the vast majority of Rust code is just fine using Arc and letting the reference counts handle your memory management.
I wish there was more content in the Rust community that stressed this, or rather that made it clear what the happy path was so we don’t have to go off and learn it by trial and error. I always get the feeling that using Arc, etc. gratuitously is a bad practice that may bite me later instead of “your code will be fine unless it really is perf sensitive”. (Steve’s sibling comment also kind of illustrates the lack of consensus about what the happy path is.)
I think that folks are happy to _say_ "oh just toss an Arc on it", and there's no like, moral opposition to doing so. But the issue is that it is much easier said than done.
So for example, let's take a really simple program:
    fn takes_ref(r: &i32) {
        println!("r: {}", r);
    }

    fn main() {
        let v = 5;
        takes_ref(&v);
    }
Sure. Not a big deal. But imagine that, for some reason, we are having issues here. We can toss an Arc on it, sure, no big deal:
    use std::sync::Arc;

    fn takes_ref(r: &i32) {
        println!("r: {}", r);
    }

    fn main() {
        let v = Arc::new(5);
        takes_ref(&v);
    }
That's not _too_ bad, we're only wrapping up the constructor. Okay, sure. But what happens when our requirements change, and we need to mutate something inside takes_ref? We can do that very easily in our original program:
    fn takes_ref(r: &mut i32) {
        *r += 1;
        println!("r: {}", r);
    }

    fn main() {
        let mut v = 5;
        takes_ref(&mut v);
    }
But if we want to do this in our Arc world... we have to do this:
    use std::sync::{Arc, Mutex};

    fn takes_ref(r: &mut i32) {
        *r += 1;
        println!("r: {}", r);
    }

    fn main() {
        let v = Arc::new(Mutex::new(5));
        takes_ref(&mut v.lock().unwrap());
    }
This is _way_ more boilerplate. And, I even messed it up the first time.
So the end experience here is "ugh so much boilerplate in Rust", when often times, the answer is "don't use Arc/Rc or Mutex/RefCell". They exist because, in some cases, you legitimately do need to use them. But if we encourage people to reach for them too early, they can have an even worse time.
That makes sense, and this is exactly the kind of thing a newbie has to wade through without the benefit of knowing the tradeoffs in choosing one path or the other. There probably are docs that say when to use Arc and when not to, but there's still a cacophony of voices saying "just use Arc and deal with performance issues later" and it's hard for a newbie to understand whom to listen to. Note that this isn't a criticism of Rust, but rather a rough edge that I'm confident will eventually be polished off.
Yep, absolutely agreed. I think we'll get there, just a lot of work to be done around figuring out how to turn new Rustaceans into intermediate ones without just saying "uhhh I dunno write some code and you'll get it."
> It turns out that the vast majority of Rust code is just fine using Arc and letting the reference counts handle your memory management.
There are disadvantages to Rc/Arc compared with GC, though:
- They tend to be less performant
- Reference cycles can leak
- In languages where they exist at a library level instead of a language level, they're much less ergonomic than GC
So if your core problem domain involves creating and disposing of lots of heap-allocated stuff all the time, you're probably better off just using a language with a proper GC. The Rust version may not even _perform_ better if you're just using Arc everywhere by default.
Use Rust when your core domain/hot paths can mostly stick to the single-ownership model (and ideally the stack). Rc/Arc/RefCell are a band-aid for when that fails.
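To illustrate the cycle-leak point above, a small hedged sketch: a parent/child pair where the back-pointer is `Weak`, so dropping the parent actually frees it (a strong `Rc` back-pointer would form a cycle and leak both nodes). The `Node` type is made up for the example.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Storing the back-pointer as Weak breaks the parent<->child cycle
// that would otherwise leak under plain Rc.
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
}

fn main() {
    let parent = Rc::new(Node { value: 1, parent: RefCell::new(Weak::new()) });
    let child = Rc::new(Node { value: 2, parent: RefCell::new(Weak::new()) });
    *child.parent.borrow_mut() = Rc::downgrade(&parent);

    // The weak pointer upgrades while the parent is alive...
    assert_eq!(child.parent.borrow().upgrade().unwrap().value, 1);
    drop(parent);
    // ...and lapses once it's gone, instead of keeping it alive forever.
    assert!(child.parent.borrow().upgrade().is_none());
}
```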
> Rc/Arc/RefCell are a band-aid for when that fails.
I would not describe Rc/Arc/RefCell as "band-aids", let alone as "admissions of failure". Rc<> is precisely the right approach for objects that might need to have their lifecycle extended by multiple 'owning' references; Arc<> applies when the 'owning' references might span separate threads. Cell<> and RefCell<> are for shared mutable state within a single thread, whereas Mutex<> and Rwlock<> serialize concurrent access from multiple threads. To reiterate, these are not clumsy "band-aids" or "hacks"; they're elegant, self-contained solutions to rather well-defined resource management problems.
It's true that some "core domains" are inherently unsuited to Rust-as-it-currently-exists, due to the need for a general GC. But these domains are not very common, and future versions of Rust might well add some support for optional, self-contained GCs.
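As a small illustration of the single-threaded end of that spectrum, here's `Cell`, which handles shared mutation of `Copy` values with no runtime borrow tracking at all (the counter example is illustrative):

```rust
use std::cell::Cell;

// Cell covers single-threaded shared mutation of Copy types: no
// borrows to track, just get/set through a shared reference.
fn bump(counter: &Cell<u32>) {
    counter.set(counter.get() + 1);
}

fn main() {
    let counter = Cell::new(0);
    bump(&counter);
    bump(&counter); // two shared references mutating, no RefCell needed
    assert_eq!(counter.get(), 2);
}
```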
Maybe "trap door" is a better word.
My point is that these constructs exist so that you can do things with Rust that Rust isn't really ideal for. Any language or framework has cases that don't fit its golden path, and any (useful) one will have these trap-doors so that those cases aren't totally impossible. Another (more extreme) one that Rust has is unsafe { } blocks. Rc/Arc/RefCell move guarantees to runtime; unsafe { } removes them completely. And this is well and good: real-world requirements are messy and varied and rarely fit neatly into a predetermined model. But heavy use of trap-doors is a code smell, and may be an indicator that you're using the wrong language or framework for the job.
> My point is that these constructs exist so that you can do things with Rust that Rust isn't really ideal for.
And this is where I disagree. Shared mutable state and multiple ownership via automated reference counting are legitimate patterns in a system programming language, and using Rc, Cell, RefCell, etc. enables these patterns in a way that preserves memory safety. That's pretty close to ideal, especially when compared to what you often see in C/C++.
I haven't done much of what could be called "systems work" in my time with Rust, so it's hard for me to comment on that, but I would tend to assume the overhead of these constructs would be extra unappealing in that context, and that instead people would do some combination of re-structuring their code to fit the ownership model and/or using unsafe { }.
I guess it also depends on how much memory "churn" your program has. Maybe an operating system keeps objects around for a long time, in which case the overhead of reference-counting diminishes towards negligible. But a web server will typically allocate a bunch of memory and then throw it all away on a per-request basis. If all of this is happening behind ref-counters, I think it will start to become noticeable. Of course, for many kinds of web servers it's very possible to accomplish your goal using single-ownership. But it sounded like that wasn't true in the OP's case.
Assuming something is not a very useful cognitive tool.
https://www.techempower.com/benchmarks/
A typical web server application has absolutely no problems with Arcs. They will be just a small percentage of the overall number of objects allocated. Most of the time it's just your application dependencies/components that you need shared access to.
As someone who's relatively new to Rust, I'm curious: what is an example of a situation where someone might lean on Rc/Arc as a "band-aid", and what would be a more "idiomatic", non-Rc/Arc solution?
Rust's key feature - the borrow-checker - relies on the idea that each value has a single "owner" at any given time. This owner can be a function, another value (a parent struct), etc. You can put these values on the heap, but if you use Box (the go-to for heap allocation), that pointer still has to have a single logical "owner". Under idiomatic Rust, each value effectively lives in one single "place". This allows the compiler to determine with 100% confidence at what point it's no longer being used and can therefore be de-allocated.
Now, these values can be lent out ("borrowing") to sub-functions and such via references (mutable or immutable). Multiple immutable references can be handed out at once, but a mutable reference to a value has to be the only reference to that value of _any_ kind, at a given time.
The problem is, some domains really don't lend themselves to this restricted model. No two objects or functions can point, mutably, to the same object at the same time. You simply can't create a graph of inter-referenced objects where a single value may have multiple "parents". And sometimes even with a perfectly tree-like ownership structure moving values around can get complicated, because Rust has to know _for sure_ that the ownership model is adhered to. This is where explicit lifetimes and such can come into play. Even writing a linked-list in Rust without using unsafe { } (or Rc's) is _hard_ (https://rust-unofficial.github.io/too-many-lists/).
In Rust, Rc's are kind of an admission of defeat. You're telling Rust not to perform its normal "compile-time" automatic deallocation, instead having it track references at runtime (which comes with overhead) to know when to de-allocate. What this buys you is basically an out from the ownership system: instead of handing off a plain reference to multiple places, which Rust may not let you do, you just clone the Rc and hand off that "new" value which can go anywhere it wants. That Rc is then what gets tracked by the ownership system and de-allocated, and when de-allocated it decrements the count (again, at runtime), and eventually that runtime mechanism (hopefully) decides the real value can be de-allocated.
Basically any part of your code that uses Rc/Arc is giving up one of the biggest features of Rust. Which is totally fine, if you're reaping those advantages elsewhere and you just need to bridge a gap where ownership is too limiting. But if heap-juggling is going to be primary thing your program is doing, you'll probably have a better overall time with a GCed language.
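The runtime counting described above can be made visible with `Rc::strong_count` (a toy sketch):

```rust
use std::rc::Rc;

// Cloning an Rc bumps a runtime counter instead of copying the value;
// dropping a clone decrements it, and at zero the value is freed.
fn main() {
    let a = Rc::new(String::from("hello"));
    assert_eq!(Rc::strong_count(&a), 1);
    let b = Rc::clone(&a);
    assert_eq!(Rc::strong_count(&a), 2);
    drop(b);
    assert_eq!(Rc::strong_count(&a), 1);
    // when `a` drops too, the count hits zero and the String is freed
}
```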
Here's an example. You want to do some computations on an array of values:
    fn main() {
        let mut v = vec![1, 2, 3];
        for i in &mut v {
            *i += 1;
        }
        println!("v: {:?}", v);
    }
They want to speed this up with threads. So they ask "how do I do threads in Rust" and get pointed to std::thread. So they write this code:
    use std::thread;

    fn main() {
        let mut v = vec![1, 2, 3];
        for i in &mut v {
            thread::spawn(move || {
                *i += 1;
            });
        }
        println!("v: {:?}", v);
    }
and they get this error message:
    error[E0597]: `v` does not live long enough
      --> src/main.rs:6:18
       |
    6  |     for i in &mut v {
       |              ^^^^^^
       |              |
       |              borrowed value does not live long enough
       |              argument requires that `v` is borrowed for `'static`
    ...
    13 | }
       | - `v` dropped here while still borrowed
(there's more to the error message but I'm cutting it to the start)
So they ask "hey how do I make v live for 'static" and someone says "you use Arc" so they write this:
    use std::thread;
    use std::sync::Arc;

    fn main() {
        let v = Arc::new(vec![1, 2, 3]);
        for i in v.iter_mut() {
            thread::spawn(move || {
                *i += 1;
            });
        }
        println!("v: {:?}", v);
    }
and get this error:
    error[E0596]: cannot borrow data in an `Arc` as mutable
     --> src/main.rs:7:18
      |
    7 |     for i in v.iter_mut() {
      |              ^ cannot borrow as mutable
      |
      = help: trait `DerefMut` is required to modify through a dereference,
              but it is not implemented for `std::sync::Arc<std::vec::Vec<i32>>`
So then they ask "hey I have an arc, but I want to mutate things inside of it, how do I do that?" and the answer is "use a mutex", so they write this:
    use std::thread;
    use std::sync::{Arc, Mutex};

    fn main() {
        let v = Arc::new(Mutex::new(vec![1, 2, 3]));
        for i in v.lock().unwrap().iter_mut() {
            thread::spawn(move || {
                *i += 1;
            });
        }
        println!("v: {:?}", v);
    }
but this _still_ doesn't work, because the lock is held during multiple threads of execution. So they figure out that they can do this:
    use std::thread;
    use std::sync::{Arc, Mutex};

    fn main() {
        let v = Arc::new(Mutex::new(vec![1, 2, 3]));
        let mut joins = Vec::new();
        for i in 0..3 {
            let v = v.clone();
            let handle = thread::spawn(move || {
                v.lock().unwrap()[i] += 1;
            });
            joins.push(handle);
        }
        for handle in joins {
            handle.join().unwrap();
        }
        println!("v: {:?}", v);
    }
I've skipped a few iterations here because this comment is _already_ too large. The point is, they've now accomplished the task, but the boilerplate is _way way way_ out of control.
A more experienced Rust person would see this pattern and go "oh, hey, these threads don't actually live forever, because we want to join them all, but the compiler doesn't know that with thread::spawn because it's so general. What we want is scoped threads" and writes this:
    use scoped_threadpool::Pool;

    fn main() {
        let mut pool = Pool::new(3);
        let mut v = vec![1, 2, 3];
        pool.scoped(|scope| {
            for i in &mut v {
                scope.execute(move || {
                    *i += 1;
                });
            }
        });
        println!("v: {:?}", v);
    }
and moves on with life. Way more efficient, way easier to write, extremely hard for a new person to realize that this is what they should be doing.
This is exactly what I was struggling with over the weekend in a side project. My "vec" is lines from a file read from the filesystem, but my real goal is for it to be lines in the request body from an HTTP POST. As a Rust beginner, I get to go through these exact steps all over again but with tokio-flavored error messages instead, and it's at least 2x more complicated. Like you said, it's "extremely hard for [me] to realize [what it is I] should be doing."
Sorry to hear that. This is partially why there's a culture of helping people with questions; ideally when you run into an issue, you should be able to hop onto the forums or discord and get help, and people should be able to help suss out context. It's not always easy though :/
If you want an easier to use web framework, might I recommend Rocket
It is not my experience that Arc is used in the vast majority of Rust code.
As a C, Go and Javascript programmer (nearly 10 years of experience with both of them), I really feel what the author is saying when they say there are too many ways to do something and fighting the compiler. There comes a point when the languages quirks start to eat into your productivity.
That being said, I have a feeling that the author lacks programming expertise to develop a language (I am not saying this in an offensive way). They are dodging things that are needed to build a language (memory management, choices of libraries etc.).
> That being said, I have a feeling that the author lacks programming expertise to develop a language (I am not saying this in an offensive way). They are dodging things that are needed to build a language (memory management, choices of libraries etc.).
The author is Paul Biggar, a former Sr. Compiler Engineer @ Mozilla and founder of CircleCI.
> As a C, Go and Javascript programmer (nearly 10 years of experience with both of them)
Off by one. Classic ;)
IIRC the author has a PhD in CS, and his thesis work was on PLT.
http://www.tara.tcd.ie/handle/2262/77562
- "Design and implementation of an ahead-of-time compiler for PHP" is the PhD thesis in question.
A google tech talk related to this work is on youtube
https://www.youtube.com/watch?v=kKySEUrP7LA
You can find multiple other papers on google scholar where Paul was an author all related to programming languages, compilers etc.
Some people are much better with theory than the application of that theory. I've met a lot of theory people who just care about implementing something that gets the result they want, rather than the nitty gritty of making things efficient.
The article ends with why Rust wasn’t chosen, but not before randomly disqualifying some other arguably more obscure languages; the author ends with saying that low level languages “suck”, and that garbage collectors are “great”, and yet Go isn’t mentioned once in the entire post?
I’m not even arguing that Go would be the best or even a good choice here, but it seems strange to not even mention it as it also qualifies as great judging by the arbitrary statements of what does and doesn’t “suck” at the very end.
I can't speak for the author but I can share my experience from porting the Wren interpreter from C to Go as a hobby project:
Writing an interpreter in Go is doable, but interpreters are both very CPU-intensive and built on variant data structures that need to be fast, and Go interfaces are the only data type available that can represent variants containing pointers. Overhead is high compared to C. You don't have low-level access to things like NaN-packing and you also don't have a JIT available at runtime to speed things up.
In many cases it's fast enough, but if you really need to write code in a different language that's both fast on a CPU-bound task and interoperates well with Go packages, a better approach might be to write a compiler that generates Go code.
It's a bit of a weak point, but only for a specialized use case. Most code is either not that performance-sensitive or can be written to operate on homogeneous arrays.
I think the author values immutability and/or managed mutation quite highly, as well as functional programming. Since their last startup was CircleCI where they picked Clojure[Script] and later for Dark they picked OCaml, and said they were considering F#, Haskell and Rust. So I'm guessing that disqualified non functional mutation heavy alternatives like Go.
Yeah, Go isn’t a very pleasant language for writing parsers, interpreters, compilers, etc albeit in my opinion this has more to do with sum types and less to do with immutability (you can emulate sum types in Go via interfaces but the ergonomics are quite a lot worse than in an ML-inspired language). This has been my experience anyway, as an avid Go enthusiast.
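For contrast, here's a hedged sketch (in Rust, standing in for any ML-inspired language) of what first-class sum types buy an interpreter author; the toy AST and `eval` below are illustrative, not from any real project. `match` is exhaustive, so adding a variant makes the compiler flag every evaluation site you forgot, which Go's interface-based emulation can't do.

```rust
// A toy expression AST -- the kind of variant type interpreters lean on.
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Exhaustive match: forget a variant and this fails to compile.
fn eval(e: &Expr) -> i64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

fn main() {
    // (1 + 2) * 3
    let e = Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Num(1)), Box::new(Expr::Num(2)))),
        Box::new(Expr::Num(3)),
    );
    assert_eq!(eval(&e), 9);
}
```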
Yeah, that's sorta right. I didn't want to evaluate every language under the sun. Rust I assumed had similar semantics to what I had been using before, which is why it was so heavily considered.
>_and yet Go isn’t mentioned once in the entire post?_
The author is coming from OCaml and eventually switched to F#. Having a powerful type system probably eliminated the author from considering Go.
Go doesn't even have try-catch. In high level languages you can just write a huge amount of code in a try-catch block that runs in a single transaction, and if something goes bad, the entire transaction will be rolled back (without writing a single line of code for this to happen). In Go you have to handle return codes after each operation, and that is 100x more code. I don't have time for such micro-optimizations.
> Go doesn't even have try-catch.
This is true in the narrow sense that Go doesn't use the keywords "try" and "catch" .
It's false, though, in that the defer/recover/panic set of tools provides similar functionality to try/catch/finally/throw, with somewhat different syntax.
Idiomatic Go uses panics and the associated handling infrastructure less than many other languages use exceptions and their handling infrastructure, though.
It’s not a micro-optimization, and you really shouldn’t naively optimize for lines of code. The explicit error handling is to encourage programmers to think about errors and to make it easier for readers to trace the error path through the call stack. Making it harder to punt on error handling is the whole point, just like static typing is tedious for the programmer who only cares if their code _appears to be_ correct.
I see absolutely no point in doing this. For example, a network error may occur any time I execute a database query. Why should I handle it? The transaction will be automatically reverted to the previous state without me doing anything. That saves a lot of time (because I have to write much less code), and I've never had any issues with this in 15 years.
How does the language know whether you care about a particular error or not? Is the actual act of typing really a significant amount of your time? What about time spent reading and debugging code, which actively benefit from clear error tracing? If typing really is a significant part of your time, how would you feel if your editor autofilled `if err != nil { return err }` for you? What if the language added a `try` operator a la Rust (e.g., whenever you type `?`, it expands into `if err != nil { return err }` code behind the scenes)?
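For illustration, here's roughly what that `?` expansion looks like in today's Rust (the function name is made up for the example):

```rust
use std::num::ParseIntError;

// `?` propagates the error by returning early from the enclosing
// function -- the moral equivalent of Go's `if err != nil { return err }`
// at every call site, collapsed into one character.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n = s.parse::<i32>()?; // on Err, returns it to the caller here
    Ok(n * 2)
}

fn main() {
    assert_eq!(parse_and_double("21"), Ok(42));
    assert!(parse_and_double("not a number").is_err());
}
```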
This statement makes no sense; you're confusing DB transactions (rollback) with error handling in a programming language.
I'm implementing a language that's basically F#/OCaml. So it makes sense that it's easier to implement in F#/OCaml.
This is the key point. When your target and host language match in semantics, implementation is easier in the host. You can see how much gymnastics is necessary to implement nearly all langs in C.
---
I did a lot of F# before moving to Rust, and frankly, F# is great and Rust takes time to "click". I'm doing a lang in it (
) and the truth is that Rust makes things harder.
Some of the author's points are not that "bad", and with some time he will see that they are not the real issues, but the fact is that anyone, especially (IMHO) anyone who has done a lot of other langs (not named C or Pascal), will have a miserable time the first try.
Eventually, you jump the wall of complexity and a lot of stuff clicks... but then you hit the REAL showstoppers (which are different depending on what you do), and let's be real: they are truly hard. (Maybe: doing a lang, you hit this stuff more easily than in "regular" coding. Doing an ERP backend, Rust is super productive for me.)
---
2 things make this stuff harder: Rc/Arc complicate how you mutate things. You need to wrap again with RefCell. I wish there existed a super-charged RcRefCell that is "blessed" by the Rust team, as Rc is, so the correct idioms for dealing with this become widespread.
Also, the trait system and the restrictions of object safety somehow make it harder to mimic an OO system, which in part could make dealing with lang implementation a lot easier.
Plus, I wish I could make aliases everywhere, so I can cut the noise in the syntax...
> I wish exist a super-charged RcRefCell that is "blessed" by the rust team as is with Rc
The syntax for that is Rc<RefCell<…>> and yes, this is clearly intended. The idea is that Rc is limited to managing multiple owners, while the RefCell part deals with shared mutability.
Long ago, we did have those as one type:
https://doc.rust-lang.org/0.5/std/arc.html#struct-mutexarc
But yes, as you say, we decided to split them up, because it is easy to combine them, and you may want to combine other pieces in different ways. Rc<T> on its own is good in many cases, you may want Arc<RwLock<T>> instead of Arc<Mutex<T>>, etc.
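A minimal sketch of the composition being described here (shared ownership via Rc, runtime-checked mutability via RefCell; the `demo` function is made up for illustration):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns (vector length seen through the first handle, owner count).
fn demo() -> (usize, usize) {
    // Rc gives multiple owners; RefCell moves the mutability check to runtime.
    let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
    let other = Rc::clone(&shared); // second owner, same allocation

    other.borrow_mut().push(4); // mutate through one handle...
    let len = shared.borrow().len(); // ...visible through the other
    (len, Rc::strong_count(&shared))
}

fn main() {
    assert_eq!(demo(), (4, 2));
    println!("shared mutation works");
}
```

Swapping the pieces independently (Arc for Rc, Mutex or RwLock for RefCell) is exactly the flexibility that splitting the types buys.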
I'm surprised by:
Good parts: library ecosystem is great
Bad parts: having to fight the compiler
My limited impression of Rust was the opposite on both points - that the compiler is super helpful but the crate ecosystem sucks.
Depends on your point of reference and what you're using Rust for. The crate ecosystem is a huge breath of fresh air compared to package management in C and C++, but you can't expect it to have a very good solution for things like game engines and GUIs.
Rust does actually have an excellent game engine crate, called tetra. It's heavily MonoGame/XNA inspired and is at a good stable point (in my opinion).
I typically work more in the area of systems administration and automation, so I don't have much experience in the way of GUI development, but there are Rust bindings for dear-imgui, and I would be very surprised if there weren't Qt/GTK bindings.
What I got from all these posts was that F# and Scala are the only statically typed functional programming languages with an advanced type system that have access to a big ecosystem.
Is there a place in the blog series where the async requirement is expanded on? Is it just the desire to transparently have Dark code use async IO so it's cheaper / can be said to be very scalable?
edit: in
https://blog.darklang.com/adventures-in-async/
there's a short clue saying "The Dark web server is currently synchronous, and so long or slow requests--at sufficient volume--can cause operational issues for us" but it doesn't go into details.
Right, it's to make it more scalable. Since our users can write arbitrary code that runs on our servers, and can call third-party servers, it's really not that hard to DOS us accidentally. Switching to async means that we're not just sitting there wasting resources when users do common stuff.
The most surprising thing to me is that nobody has created a JVM port of F#.
F# has the expectation of reified generics (Don Syme, the language author created them for .NET before creating F#, after all), which are not a thing (yet?) on JVM.
Conversely, Scala has the expectation of (partial) type erasure as is done on JVM, and that's probably why the .NET port didn't get very far.
But does F# depend on reified generics? My understanding is that everything is still expressible without them, except some reflection operations (that could just not carry over), and dealing with int vectors etc would be slower. Seems like not a big blocker unless I'm mistaken. Is there a deeper problem that I'm missing?
There is Scala, which is a functional proglang for the JVM (but not F# ported to the JVM).
Sure but the author explicitly rejected scala due to its too-many features, even though he'd have preferred jvm. So it sounds like there is still some opportunity.
I think for people used to garbage collection, going to a non-gc ecosystem seems to be a big jump, akin to switching from an imperative to a functional language or from a dynamic to a statically typed language.
Articles like this one make me think that maybe the Rust community should develop more resources to help people make this jump because the language is more likely to attract people with this background than other non-GC languages that are typically seen as lower level and relegated to high performance applications.
I'm very anti-GC myself: I think the pros don't outweigh the cons in the overwhelming majority of the cases, they hide a complexity that you'll have to deal with eventually if you ever need to optimize your memory consumption and they encourage sloppy coding by not forcing you to think about ownership relations in your program's data.
Having this debate over the years (often here on HN), I notice that the GC fans tend to fall into two groups: those who really understand the tradeoffs and, unlike me, really think that GCs are worth it; and those who simply have never really used languages with manual memory management (or not for very long) and seem to base their dislike of them on a lack of comprehension, cargo culting, and associating it with C-style memory management, where tracking memory allocations and frees adds a true cognitive overhead (unlike languages like C++ or Rust, where RAII will usually take care of it 99% of the time).
I realize that I'm building a strawman here, but when I see the way the author talks about memory management (and how much they hate it) I can't help thinking that they're Doing It Wrong:
I was actually surprised at how little the actual memory management bothered me. I'm a big believer in garbage collection, and not having to think about memory, so I expected to hate this part of Rust, but it turns out it's kinda OK. You put everything in a Box::new (regular heap memory) or Rc::new (reference counted memory) or Arc::new (reference counted memory suitable to be used concurrently in different threads), and then when they go out of scope they'll be cleaned up.
See, here's the problem. You don't like to think about memory management and ownership, so you think "oh, I'm just going to put everything in a Box/Rc". Spoiler alert: (A)Rc _is_ garbage collection, it's just a very simplistic one. You haven't switched from GC to manual memory management, you've reimplemented a lousy GC in Rust and, unsurprisingly, it fails to compete with the good GCs of the languages you're used to.
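The "simplistic GC" point can be seen concretely: Rc reclaims memory by counting owners, and a reference cycle defeats it, which a tracing GC would handle. A minimal sketch (the `Node` and `build_cycle` names are made up for illustration):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A node that may point back into the same structure.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

// Builds a two-node cycle and reports the strong counts.
fn build_cycle() -> (usize, usize) {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    // Close the cycle: a -> b -> a. The counts can never reach zero,
    // so unlike a tracing GC, Rc will never reclaim these nodes.
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    (Rc::strong_count(&a), Rc::strong_count(&b))
} // the local Rcs drop here, but the cycle keeps both allocations alive

fn main() {
    assert_eq!(build_cycle(), (2, 2));
    println!("cycle built; nodes are leaked");
}
```

This is the standard reason Rust offers `Weak` alongside `Rc`: a real GC traces reachability, while refcounting only counts edges.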
It's like when I started using Python and I naively tried to do RAII with destructors, only to realize that it's a very bad idea because you never know when the destructor is going to run in such a language.
I'm currently implementing a rather asynchronous multicast Rust application where I have 3 threads processing messages and passing them around to deal with retransmits, packet reconstruction, etc. I have one (1) Arc, zero Rc, and one (1) Box in the entire codebase (and I intend to get rid of this Box soon).
Neither Rust nor C++ RAII makes coding without a GC pleasant, so they are not the answer. Maybe languages like Nim and Zig can help fill this gap when they are released.
Well that's kinda the crux of the issue, isn't it? Some people, including myself, actually disagree with you here. I actually find it absolutely fine, and I like that it forces me to think more about my architecture.
It may be a bit trickier when I start on a new project but I find that it pays off in the long run by making maintenance a lot easier.
I agree with you in part that all professional programmers should be comfortable coding most things without GC and not see it as obscure, scary stuff, but in the startup world there is not much "pays off in the long run" thinking; it's ship whatever is fastest to build now over great software.
Let's see if I can explain this. When you're writing an async, multi-threaded server using the tokio runtime, async processes can be moved between threads. This means the memory can be copied, and so you need to ... pin things? OK, that's as much as I remember. Look in the HN comments after I publish this and I'm sure someone will explain it better.
Ok, here's my attempt to explain Pin. First a smidgen of background. In garbage collected languages, we can more or less say "everything is a pointer". And given how garbage collection works, we can extend that a little bit to say "anything can keep anything else alive, by holding onto a pointer to it." Now in C/C++/Rust, this is very much not the case. The most common example is if you try to return a pointer to one of your local variables. Your pointer cannot keep that variable alive, it becomes a "dangling pointer", and your code definitely does not work (but might appear to for a while). Nothing new so far.
Now Rust comes along and decides to try to catch most of these bugs at compile time. So it makes you keep careful track of who points to what, and how long everything is planning on staying alive. As an important special case of this, one of the things Rust will essentially never let you do is create an object that holds a pointer to itself. It's basically the same problem we just mentioned: you might try to do something like returning that object, and the original memory location you took a pointer to would be destroyed, and the pointer in the returned copy would now be dangling. It's almost never safe, so Rust almost never allows it. That was pretty much the whole story for a while.
But then things got trickier. Folks started working on async, and designing the new "await" keyword. Like in most other languages, the goal there was that you could write something that looked a lot like a regular function, but secretly the compiler would generate some sort of struct for you, and the variables in your function would actually be the fields of that struct. (Why anyone wants to do this is a long topic, but the reason is the same as in many other languages, basically because having lots of threads is slow.) And this runs into a very tricky problem: Having one variable point to another variable is perfectly legal. We do that all the time. You can't write normal code without doing that. But now that variables are actually struct fields...that means we've got an object that points to itself. And we just said that was forbidden.
The solution the Rust team landed on for this problem is called Pin. When you "pin" an object, you're essentially swearing that it will always stay in the same spot in memory, so that any pointers it has to itself will never become dangling. The exact way this is implemented involves a lot of Rust specifics (Pin is a struct, but Unpin is a trait), and the fact that it sits right at the boundary of safe code and unsafe code makes things trickier. But that's the general idea of pinning: a promise that I will never move this object or copy its bits somewhere else in memory.
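A small sketch of the safe surface of that API (no self-referential struct here, which would need unsafe code; this just shows how `Box::pin` and `Pin::new` behave):

```rust
use std::pin::Pin;

fn main() {
    // Box::pin heap-allocates the value and promises it will never move.
    // For a self-referential future, that stable address is what keeps its
    // internal pointers valid; for ordinary types it is just a pinned box.
    let pinned: Pin<Box<String>> = Box::pin(String::from("pinned"));
    assert_eq!(&**pinned, "pinned");

    // Most types implement Unpin, meaning they are safe to move even when
    // pinned, so Pin places no real restriction on them:
    let mut n = 1u32;
    let mut p: Pin<&mut u32> = Pin::new(&mut n); // Pin::new requires Unpin
    *p = 2;
    assert_eq!(n, 2);
    println!("pinning ok");
}
```

The compiler-generated futures that borrow across an `.await` are exactly the types that opt out of Unpin, which is why executors demand a pinned future before polling it.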
Small note for the author, I believe this was supposed to be outside of the quote block:
It received 39 pretty enthusiast[ic] replies.
No, the quote block is correct. It was quoting a HN comment (which includes the typo).
No, that's correct. The sentence is part of the message that was being quoted (which is linked up above).
TL;DR - I don't like Rust.
Glorified code gen developer using his 15min of fame.
Did I miss something?
"I don't like x,y,z, languages so I picked F#."
I didn't see anything about F# in this post.
It's in a separate post, referenced by the 2nd link in the article. The link text is just the word "posts", so it's easy to miss.
https://blog.darklang.com/new-backend-fsharp/
Thanks. Just updated the post to be a bit clearer on this.
He links in the blog to his previous long post explaining why he chose F#:
https://blog.darklang.com/new-backend-fsharp/
No, they did three parts so they could hit the front page three times. The whole thing is clearly a Dark marketing campaign but people upvote it so the community must like it.
Your comment should be flagged, but perhaps review the guidelines:
https://news.ycombinator.com/newsguidelines.html
And factually I think your opinion is incorrect. The first post didn’t seem like it was tailored for HN, and he said he was surprised:
https://news.ycombinator.com/item?id=24981505
https://news.ycombinator.com/item?id=24980661
Meanwhile it looks like this discussion has been modded down to remove it from front page due to the low quality of comments (yes: this comment doesn’t meet standards either but maybe it helps you?...)
Which part of the guidelines are you referring to? I don't think this counts as an accusation of shilling/astroturfing or similar. And they didn't say it shouldn't have been submitted.
A follow up to this is what they did choose: F#
This is posted at
https://blog.darklang.com/new-backend-fsharp/
I wonder how one can not agree with Rich on Maybe/Option (and I am definitely not blindly in his camp; for example, I think persistent data structures and STM are total overkill for my kinds of apps).
Kotlin and TypeScript got null right. Rust got it wrong - it's not such a big deal because the compiler will tell you, but it's still wrong, it's still a breaking change, and it's also a kind of information leak.
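For context, the construct under debate: Rust models absence as a separate wrapper type, `Option<T>`, rather than a nullable form of `T` as in Kotlin (`T?`) or TypeScript (`T | null`). A minimal sketch (the `half` function is made up for illustration):

```rust
// "Maybe absent" is a distinct type: callers must unwrap explicitly,
// and changing a signature from i32 to Option<i32> changes the type
// at every call site - the "breaking change" the comment refers to.
fn half(n: i32) -> Option<i32> {
    if n % 2 == 0 { Some(n / 2) } else { None }
}

fn main() {
    assert_eq!(half(6), Some(3));
    assert_eq!(half(7), None);
    // Combinators replace null checks:
    assert_eq!(half(7).unwrap_or(0), 0);
    println!("option demo ok");
}
```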