Ruby's Proposed STM

Author: chrisseaton

Score: 283

Comments: 96

Date: 2020-10-28 17:32:39

Web Link

________________________________________________________________________________

AaronFriel wrote at 2020-10-30 06:57:43:

Now what happens when you add STM to these languages? You've just created an enormous footgun by creating another color of function (https://journal.stuffwithstuff.com/2015/02/01/what-color-is-... ; see also: https://news.ycombinator.com/item?id=8984648 ).

What this proposal misses is that reasoning about software transactional memory in functions all but demands being able to abort and retry code to make progress, and that therefore means having a high level separation between transactional code and IO code.

Suppose you had a banking application, you would like to use STM (as in the "ractor" example in this article) to move balances from one to another. In Haskell, you'd perform STM actions on those TVars (transactional vars) and use the default retry policy of STM to automatically construct an order of operations such that each operation appears to occur atomically. (STM doesn't impose a clock based ordering but I don't think it would be tremendously difficult to do so.)

If you don't use Haskell's "check" or "retry", you find that your code will abort and fail with near certainty. You can't perform lots of transactions on tvars and not, occasionally, read from a var another transaction is writing.

So you add "check" calls to verify your preconditions and postconditions such as balances cannot go negative, and you add "retry" calls to automatically retry transactions that fail because of concurrency.
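As a rough illustration of those semantics in Python (a toy analogy using a condition variable, not a real STM; the `Account` class and `transfer` function are invented for this sketch), the precondition blocks the transfer until it can commit without going negative:

```python
import threading

class Account:
    """Toy stand-in for a transactional variable holding a balance."""
    def __init__(self, balance):
        self.balance = balance

# One global monitor standing in for STM's retry machinery.
_cond = threading.Condition()

def transfer(m, a, b):
    """Block (like Haskell's `check` + `retry`) until account a can
    cover m, then move the money as a single atomic step."""
    with _cond:
        _cond.wait_for(lambda: a.balance >= m)  # precondition: no negative balances
        a.balance -= m
        b.balance += m
        _cond.notify_all()  # wake any transfers waiting on balances we changed
```

A real STM tracks exactly which variables a transaction read and re-runs it only when one of those changes; the single condition variable here is the bluntest possible version of that.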

Now imagine doing that in a modern Ruby codebase, or a modern JavaScript codebase. There's so much mutation that happens in these environments, can you guarantee within a Ruby function that it hasn't mutated another variable with ease? That it hasn't written to disk, made an API call, or changed some global state?

Haskell pioneered the use of STM. It's easier to reason about in Haskell because the world of the STM can be isolated from the world of effects due to the IO monad. It's easy to declare "this function is blue" and "this function is red" and never the twain shall meet.

In any other language, all bets are off.

thaumasiotes wrote at 2020-10-30 09:00:59:

> That’s why C# has that little caveat. You can avoid the pain of async in C# by using threads.

This is the first time I've seen threads referred to as a way to _avoid_ pain.

ancarda wrote at 2020-10-30 10:21:32:

I honestly prefer threads and mutex locks to async/await.

Feel free to burn me at the stake.

tzs wrote at 2020-10-30 14:38:40:

Same here. I'm curious how you got there.

I think a large part of that for me is because early in my career I got involved in Unix kernel work, first writing device drivers and later doing Unix ports, including both porting a swapping Unix to paged hardware (and rewriting pretty much all process and memory management from scratch) and porting a paging Unix to swapping hardware.

That was followed by a few years largely doing embedded stuff on systems too small to have operating systems or memory management units. For those I'd write a small kernel that supported threads and scheduling (sometimes cooperative and sometimes preemptive) and then write the embedded applications much like I would have written them on Unix, using threads instead of processes.

So for me threads are easy and things like async/await seem overly complicated and less clear because I came at concurrency from the operating system side of things and the hardware side of things (e.g., dealing with interrupts) where thread-like thinking is natural and perhaps required.

Maybe it is conceptually harder for people who come at it from the application side of things?

mumblemumble wrote at 2020-10-30 19:04:55:

I think it's more that threads are a poorer fit for the application side of things.

On the app side, "I want to do this thing that might take a while _without_ blocking the UI, and then do something with the result, and I have zero need to communicate with the code that's running the task beyond giving it its arguments and getting its result back," is the most common use case. Task parallelism is tailor-made for that kind of problem.

Using a dedicated thread, on the other hand, gives you some raw material you can use to solve that problem, but leaves you to handle all the messy details on your own. And doing it the easy way - one short-lived thread for every thing you need to do in the background - is expensive, while doing it the correct way and implementing a thread pool is essentially just committing to Greenspunning your own personal task parallelism library.
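For what it's worth, Python's standard library ships exactly such a task-parallelism layer in `concurrent.futures`; a minimal sketch of the "hand over the arguments, get the result back" pattern (the `slow_square` task is invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    # stand-in for "this thing that might take a while"
    return n * n

# The pool owns and reuses its threads; the caller only submits
# work and gets a future for the result back.
with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(slow_square, 12)
    # ... the submitting thread stays free to do other work here ...
    result = future.result()  # blocks only when the result is needed
```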

formerly_proven wrote at 2020-10-30 13:00:55:

Whether async or threads are preferable depends a lot on context and often the answer will be "both".

E.g. GUI code -> async, because you must do everything on the UI thread. Heavy lifting in a GUI app -> threads. Traditionally you'd do this with a huge mess of callbacks, but these days it's reasonably easy to just write the GUI portion using async and have background computations run in threads signal their results through futures into the async code, which makes everything so much easier.

andrewzah wrote at 2020-10-30 17:59:39:

From what I can tell, both solve different problems.

There are times when one might want async executors like node's event loop or rust's tokio/async-std, but in other situations one might want os threads or "green threads" with a RwLock or Mutex.

What about a few long running threads that use async internally? ;-)

mumblemumble wrote at 2020-10-30 21:06:07:

That's another thing that C# (or, rather, .NET) handles really well. I can't remember the exact syntax, but the Task API has a way of declaring a background task as "long-running". The key difference is, long-running tasks get a dedicated thread; short-running tasks execute on the thread pool.

dullgiulio wrote at 2020-10-30 19:10:11:

That's the Go runtime.

FpUser wrote at 2020-10-30 14:59:00:

Count me in your camp (with a couple of exceptions).

tasogare wrote at 2020-10-30 13:16:49:

They aren’t even solving the same problem. Threads exist to use multiple CPUs while async/await are here not to waste time waiting on I/O.

coder543 wrote at 2020-10-30 14:07:16:

Threads existed long before multi-CPU computers or multi-core CPUs were common, so no... they don’t exist for that reason.

Parallelism and concurrency are orthogonal concerns. Multiple threads can run on the same core, while concurrent awaits can run on separate cores (depending on the language). Or the opposite!

Threads definitely solve wasted time waiting on I/O, but OS-level threads are a very limited resource in many environments, which can cause problems more quickly in programs with unbounded concurrency, and they tend to have a lot of startup cost, so they have to be created judiciously to avoid hurting performance.

Languages like Go and Elixir give the programmer access to unlimited lightweight “threads”, and that works great for concurrency, without requiring developers to re-color their functions.

One additional note is that some OSes (including Linux) have struggled to support true nonblocking File I/O for many years, so you literally have had to use multiple OS threads for I/O concurrency... one for each file operation in progress. Otherwise you would block the async executor! Languages often handle this implementation detail for the programmer, so you just don’t realize you’re using a thread-per-await sometimes.

(io_uring seems to finally solve this in a meaningful way on Linux)

taffer wrote at 2020-10-30 09:32:19:

Threads are fine as long as they do not share mutable data structures. For example, multithreaded application servers are a rather painless experience.

eyelidlessness wrote at 2020-10-30 14:54:12:

Since all of these concurrency approaches’ pain points are around shared mutable state, maybe it’s shared mutable state that’s the pain point.

bglusman wrote at 2020-10-30 15:37:10:

Yup, this is the conclusion Joe Armstrong reached, and why Erlang/Elixir/BEAM just doesn't allow mutable state, other than the process dictionary, which is inherently non-shared.

mumblemumble wrote at 2020-10-30 13:16:05:

That observation doesn't really contribute to a comparison of dedicated threads and async/await, though, because the same is true of async routines.

mumblemumble wrote at 2020-10-30 13:13:55:

For a long time I experienced excessive pain with both threads and task parallelism. It took a good year or two to realize the problem was that I didn't have a good instinct for figuring out which technology is more appropriate for a given situation.

a1369209993 wrote at 2020-10-30 18:08:36:

Honestly the main feature of Haskell in this regard isn't STM-colored functions (as you note, that's a _bad_ thing), but rather that it _almost_ supports color-generic functions in the form of `Monad f => A -> f B -> f C`. Unfortunately, it doesn't correctly handle the identity monad `type Id a = a`, instead requiring `newtype Id a = Id a`, so it's impossible to write a function that matches both (eg) `A -> B` and `Monad f => A -> f B`.

kccqzy wrote at 2020-10-30 19:05:11:

I don't understand your argument.

It is correct that the standard library provides you with a newtype-based identity monad. But as you have said, you can also write your own `type Id a = a`. What's wrong with that?

a1369209993 wrote at 2020-10-30 20:08:34:

> What's wrong with [`type Id a = a`]?

The fact that it doesn't work:

    > - The type synonym 'Id' should have 1 argument, but has been given none
    > - In the instance declaration for 'Monad Id'
    > | instance Monad Id where

AaronFriel wrote at 2020-10-30 21:05:51:

I think Haskell enforcing a separation between the "color" (or monad) of functions is generally good.

chrisseaton wrote at 2020-10-30 11:00:35:

It’d be good if all writes in a transaction were automatically transactional and we had no TVars, and also if all IO automatically made a transaction irrevocable.

But I’m not sure this is tractable in either implementation effort or runtime overhead.

marcosdumay wrote at 2020-10-30 19:36:22:

Are you talking about the Haskell one?

If so, you can't do IO inside a transaction (it won't compile), and any value not on a TVar is thread-local.

chrisseaton wrote at 2020-10-30 22:44:59:

> Are you talking about the Haskell one?

No?

I'm responding to the idea of the 'colour' of functions and suggesting how I think it should be in Ruby - all side effects transactional, but IO irrevocable. Drop the idea of TVar and make all mutable locations transactional.

jpcooper wrote at 2020-10-30 09:06:40:

There’s no guarantee that you’re not calling unsafePerformIO in Haskell, which a lot of outwardly pure libraries do. In fact, wouldn’t STM be doing this under the hood? Even the FFI in Haskell allows you to import a function without the return type being in IO as long as you pinky promise that it is pure.

The idea in the end is to provide a language for working on memory concurrently. You either choose to speak it or not.

kpmah wrote at 2020-10-30 10:08:05:

As with Rust's 'unsafe', the fact that you can explicitly circumvent checks doesn't make those checks useless. If something goes wrong you know exactly where to look.

jpcooper wrote at 2020-10-30 10:19:45:

That may be the case for Rust. I am not sure what the implications of unsafe are for the type system.

In any case, the unsafeness of unsafePerformIO and importing an FFI function as a pure function are not reflected in the type in Haskell. These are things that can and do happen.

My point was that the ability to go outside the bounds of a language does not necessarily make the language useless.

a1369209993 wrote at 2020-10-30 19:04:02:

> I am not sure what the implications of unsafe are for the type system.

There are no implications, that's the point.

> the unsafeness of unsafePerformIO and importing an FFI function as a pure function are not reflected in the type in Haskell

Yes, that is correct behaviour; if it was reflected in the type it'd be called safePerformIO.

nextaccountic wrote at 2020-10-30 20:22:38:

> In any case, the unsafeness of unsafePerformIO and importing an FFI function as a pure function are not reflected in the type in Haskell.

It's reflected in Safe Haskell annotations. Unfortunately nobody cares about those. Haskell has a culture problem regarding safety, just like most other languages, both high level and low level (with the exception of Java, which doesn't make data races UB).

rurban wrote at 2020-10-30 12:35:12:

Not only are all bets off in a non-functional language; it's also the wrong solution.

STM puts the cost of locking into everything, ie slower default operations, whilst still being concurrency unsafe. Deadlocks can occur.

There are proven good solutions for fast and safe concurrency with ownership tracking and a good compiler/scheduler. No, not Rust. And not STM. We fought with this for years.

rrss wrote at 2020-10-30 12:45:34:

> There are proven good solutions for fast and safe concurrency with ownership tracking and a good compiler/scheduler.

Do you have any pointers to these solutions for the uninitiated?

rurban wrote at 2020-10-30 20:35:34:

All concurrency-safe languages.

E.g. Pony, Concurrent Pascal, the Parrot VM, ...

amelius wrote at 2020-10-30 15:19:46:

Eh, STM is often used as a solution to _avoid_ locking.

https://en.wikipedia.org/wiki/Software_transactional_memory

brandmeyer wrote at 2020-10-30 16:22:47:

If you read into the implementation papers a little more deeply, you'll find that they are typically implemented with locks under the covers.

orthoxerox wrote at 2020-10-30 14:26:17:

Haskell should just have transactional IO, then. RDBMSs have had it since god knows how long ago.

AaronFriel wrote at 2020-10-30 14:48:39:

Haskell is a general purpose programming language, so I'm not sure what you mean. You can't in the general case have "transactional IO".

cat199 wrote at 2020-10-30 11:12:07:

> In any other language, all bets are off.

No, in any other language one has to have an understanding of the codebase and what is mutating and what is not, instead of relying on the language to do this work for them. Which, granted, can be a lot more work.

Dismissing things because 'I don't want to deal with it because it is hard' is not the same as dismissing them because 'this is not possible'; this argument is the former.

See also: OCaml/Scheme-style FP, which can be 99.9% as pure-functional and type-checked as Haskell if one wants to build their system that way.

marcosdumay wrote at 2020-10-30 19:41:55:

Well, if you don't like the wording, translate "all bets are off" into "the programmer(s) must never make a mistake on those 10k to 10M lines of code".

That's not a reasonable thing to expect.

kimi wrote at 2020-10-30 07:08:16:

Clojure had STM from the start, and it was one of its selling points, but the real world experience with it is that nobody uses it - atoms (i.e. global serialized write access, with non-locked reading) are 99.9% good enough.

jwr wrote at 2020-10-30 08:53:35:

That is true if you take a narrow view/definition of "STM". It's true that atoms are enough in most cases, but the reason they work so well is because the rest of the language is designed with STM in mind. You need immutable data structures throughout the language in order to be able to use atoms for synchronization: when you access a data structure through an atom, you want to be sure you are getting your own "snapshot" that no one else will modify.

This is why I frown when people say "Clojure STM is useless" or "nobody uses Clojure STM". Everybody uses the foundations, and the reason why the foundations are so useful is precisely because they were designed for STM.

I've been writing Clojure apps for many years now, and it is true that atoms are usually enough. I used agents in the past for their serialization capability, but these days core.async is usually a better tool. I never found a use case for refs that could not be simplified. In most applications I wrote, transactionality was needed together with persistency, which shifted the responsibility to a database.

nlitened wrote at 2020-10-30 10:33:59:

I remember in one of his interviews, Rich Hickey mentioned that STM isn’t in the language to be used all over the place, it’s there to make some very rarely used things possible, which otherwise would not be possible at all.

Which is a very reasonable way to see it, in my opinion.

mullr wrote at 2020-10-30 07:41:23:

It certainly seems to be rarely used. It's interesting to look at the concurrency stuff in Clojure and its uptake:

- Atoms: very popular. Usually what people reach for, perhaps overly so.

- STM / refs: Not commonly used, but I wonder if this is because we aren't used to really considering them as an option?

- Agents: I think everybody agrees that you shouldn't use these

- core.async: What many people think you should use instead

- Reducers: I've never seen these used in the wild.

wtetzner wrote at 2020-10-30 18:29:46:

> Atoms: very popular. Usually what people reach for, perhaps overly so.

Having written Clojure professionally, and then moved on to Java (not intentionally, just kinda happened), I find myself using the same pattern in Java. AtomicReference + pcollections, along with Lombok's @Value is enough to get some pretty Clojurey concurrency patterns in Java.

minikomi wrote at 2020-10-30 09:48:04:

There's also pmap, which is amazing, and manifold & its little deferred/event-ish universe..

Both have their place!

reitzensteinm wrote at 2020-10-30 11:14:53:

I have used refs for a multiplayer server. Five years in to Clojure, there was finally a reason to break them out.

Players were in rooms, which were updated ten times a second. All of the rooms were largely independent, meaning the best strategy was to use a threadpool to update them in parallel.

However, occasionally operations would require synchronization between rooms (eg, move one player from one to another).

It worked pretty well.

blandflakes wrote at 2020-10-30 13:35:24:

I think this mirrors how concurrency really is in many applications:

* a lot of services simply defer their state management to some other service - a database or document store. Typically you have to use its concurrency/transaction management

* In apps that have state, a lot of the time only safety is required - this state only changes in one place, I can't afford a partial read, but otherwise I don't really care about concurrency. Atoms work for this.

* Sometimes there is an ordering or coordination to state, but a lot of time that is producer-consumer, so core.async can be a nice fit.

* In a (relatively) small subset of cases, I actually have multiple writers whose behavior I need to coordinate. STM is perfect, and I prefer it to managing my own locks in those cases.

augustl wrote at 2020-10-30 09:21:57:

Atoms - and immutable values :) Those two are really killer and solve all of my state problems, at least.

MrBuddyCasino wrote at 2020-10-30 07:36:35:

Also it isn’t particularly fast afaik. Hardware support was supposed to fix this, but the Intel primitives were buggy and not available on all models.

reitzensteinm wrote at 2020-10-30 08:18:57:

Hardware transactional memory wouldn't accelerate Clojure. The actual transactional bit is very cheap, built around compare and swap.

You first grab the current values of two refs, and build two new values out of them through some process.

Then on commit, you lock both refs and check they're unchanged since the start of the transaction. If they are, you swap in the new values; if they're not, you grab the new values and rerun the computation (on and on until timeout).

The overhead comes from the lack of granularity - if two transactions change different keys in a map they'll conflict even though in theory they could run simultaneously (and indeed can with a hash table and HTM like TSX).

And also because of the speed of persistent data structures. They're not exactly slow given the flexibility they give you, but each write involves allocation.
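The commit protocol described above can be sketched in Python (a deliberately simplified single-ref version with invented names; Clojure's STM locks and validates every ref the transaction touched):

```python
import threading

class Ref:
    """A ref pairs a version counter with an (ideally immutable) value."""
    def __init__(self, value):
        self.lock = threading.Lock()  # held only for the brief validate-and-swap
        self.version = 0
        self.value = value

def modify_optimistic(ref, fn):
    """Snapshot the ref, compute outside any lock, and commit only if
    nobody else committed in the meantime; otherwise rerun."""
    while True:
        v0 = ref.version
        x1 = fn(ref.value)            # possibly expensive, runs unlocked
        with ref.lock:
            if ref.version == v0:     # unchanged since the start: commit
                ref.version = v0 + 1
                ref.value = x1
                return
        # conflict: another transaction won the race, so rerun on the new value
```

Note the coarse granularity: any concurrent commit to the same ref forces a rerun, even if the two computations touched logically unrelated parts of the value.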

MrBuddyCasino wrote at 2020-10-30 15:48:43:

> The overhead comes from the lack of granularity - if two transactions change different keys in a map they'll conflict even though in theory they could run simultaneously (and indeed can with a hash table and HTM like TSX).

Can you expand on that? Couldn't Clojure introduce a STM-aware HashMap, why would TSX help?

reitzensteinm wrote at 2020-10-30 16:06:31:

It could, but it just wouldn't be Clojure any more. The whole thing is built on persistent data structures, where if you modify an object you get the new version, and keep the old version unless it's garbage collected away. Further, there are guarantees that both versions perform well.

This isn't just used for Clojure's STM (although it makes the STM model trivial). It's essentially the fundamental premise of the language. Functional programming with structural sharing to allow immutable data structures to be fast enough to be useful in the real world.

Introducing a new data structure that was (I assume you mean) HTM-aware would not offer the same guarantees. You'd be able to have two threads mutating it, but you couldn't (for instance) keep a history log of values by hanging on to a pointer to each one while still being reasonably space-efficient.

TSX or other HTM would allow you to build a hash map that two threads can modify simultaneously by literally just reading and writing to it as normal (with some nasty edge cases, but fundamentally it's similar to a single thread version). The hardware keeps track of which cache lines are involved in the transaction, ensures they're not written to memory, and if there's a conflict it throws them away to retry the transactions.

However in Clojure, you're not modifying the map, you're creating a new version that shares most of the structure with the old version except for your differences. If two threads modify the original hash map A, to yield B and C, all you've got now are three different trees. You might be able to invent a data structure that allows you to merge the diffs of A->B and A->C to yield D, containing both versions. But it's probably not going to be awfully fast and it's going to have corner cases that violate the transactional guarantees.

reitzensteinm wrote at 2020-10-30 16:12:31:

I should point out that I'm fairly familiar with the literature roughly up to the level that Clojure takes advantage of (I've used it to implement custom versions of persistent maps and RRB trees that store to disk). There may be more advances since then or techniques I'm not familiar with.

I think the problem I'm describing is fundamental, but I'm not a researcher and am not claiming any advanced knowledge beyond being an experienced Clojure developer that is familiar with the internals.

jwr wrote at 2020-10-30 09:02:48:

> Also it isn’t particularly fast afaik

What exactly "isn't particularly fast"? :)

In any real-world application the cost of STM machinery itself is, at a first approximation, 0.

There is a cost to immutable data structures, but I don't think that's what you were talking about.

mrh0057 wrote at 2020-10-30 09:31:25:

STM under a concurrent load can be expensive due to how modern CPUs work. Since you are updating values on different CPUs, you are going to have cache invalidation. Then there is the issue that an expensive calculation in the transaction can cause the same transaction to fail repeatedly, leading to starvation.

_pmf_ wrote at 2020-10-30 07:25:44:

It was supposed to be THE killer feature of Clojure. Rich Hickey's "ants" demo is still firmly in my mind.

mcintyre1994 wrote at 2020-10-30 07:08:35:

Thanks for sharing this here, it was a great read and really well written. I really like the style of communication of taking a problem and existing solution, and adding requirements and workarounds one at a time until that stops working to motivate a new approach. And this was a really nice introduction for me to STM too, which seems like an interesting topic to look into more. Appreciate the references too!

fnord123 wrote at 2020-10-30 09:21:01:

Let’s say we’re a bank managing many bank accounts. Each account has a total. We get a never-ending stream of requests to move a sum of money m from an account a to account b.

We all know that you should not do accounting using floating point numbers. But also, please understand that bank accounts are append-only ledgers and the balance is a cached computation. I truly wish that we had a different example use case because this is really bad.

tomstuart wrote at 2020-10-30 10:04:40:

This is the opposite of my point of view. I think it’s a great example because the _whole point_ is that, while most people’s everyday experience of a bank account is as a single mutable value, the reality is far more complicated and subtle. Teaching is all about using simplified examples to convey intuition, and “you might think it’s easy to keep a single number up to date, but moving money between two accounts is fraught with danger” is a perfect demonstration.

jeltz wrote at 2020-10-30 10:48:15:

Maybe for bank accounts that is true, but online casinos work pretty close to that example (except the floating point part), since we do not ever want to allow an account balance to be negative. Allowing balances to be negative would be a regulatory violation, so what most online casinos do is first lock and update the balance (or balances, since there is also bonus money) while making sure it does not go negative, and then log the transaction amount plus the after-balances.

So for online casinos I would not say that the balance is cached. And I imagine traditional banking also has accounts which are never allowed to go into the negatives where a similar solution is required.

the-dude wrote at 2020-10-30 09:23:29:

Regulators would like a word with you. Also, in theory it sounds nice, but reality is always more nuanced.

I have implemented a bank-like SaaS which was regulated by our national financial regulators.

fnord123 wrote at 2020-10-30 09:36:25:

Of course it's more nuanced than a short HN comment. It's also more complicated than using a mutex to make changes to two values.

> I have implemented a bank-like SaaS

Which one? What made it bank like? How was it implemented?

> which was regulated by our national financial regulators.

Which country?

> Regulators would like a word with you.

To say what?

the-dude wrote at 2020-10-30 09:40:42:

> To say what?

Among other things, they require you to store the before & after balance with each transaction, as a truth value, not a cache.

And now your whole premise is out of the window (there are a bunch of practical reasons it won't work in reality too).

kevingadd wrote at 2020-10-30 10:34:20:

A string of before and after balances is still a ledger. This is distinct from just 'account, current_balance' tuples like a more typical schema would have.

the-dude wrote at 2020-10-30 10:45:32:

Isn't one of the characterising features of a ledger that it is a string of relative values instead of absolute values?

a1369209993 wrote at 2020-10-30 19:10:42:

No, it is not. A ledger is an append-only data structure consisting of a list of (applied) _transactions_. What kind of transactions depends on what you're ledgering, but it certainly doesn't imply relative values (if anything the opposite, though probably not anything).
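That definition can be made concrete with a tiny sketch (the transaction shape here is invented; real ledger entries carry much more):

```python
# A ledger is an append-only list of applied transactions;
# the balance is a derived (cacheable) computation, not stored truth.
def balance(ledger):
    total = 0
    for kind, amount in ledger:
        total += amount if kind == "deposit" else -amount
    return total

ledger = [("deposit", 100), ("withdraw", 30), ("deposit", 5)]
```

Appending a new transaction never rewrites history; it just changes what the balance fold computes.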

the-dude wrote at 2020-10-30 20:01:50:

So if they are not related (or relative) to one another, why are they on the same list?

chrisseaton wrote at 2020-10-30 10:58:13:

> I truly wish that we had a different example use case because this is really bad.

Well... that’s why the article proposes a different, better example.

fnord123 wrote at 2020-10-30 11:12:57:

The second example was interesting and insightful.

pansa2 wrote at 2020-10-30 07:05:21:

A few years ago, the PyPy developers were working on STM for Python, as a way to mitigate the Global Interpreter Lock.

Was that work ever completed? How does it compare to this work for Ruby?

mattip wrote at 2020-10-30 17:28:26:

Here is the thesis paper that came out of it.

https://dl.acm.org/doi/epdf/10.1145/3276479

All the following is my opinion, and may contain errors. Take it with a grain of salt and kindly correct me where I misstated something.

If you view PyPy as a sophisticated test bed for virtual machine research, it was a success. There are lots of conclusions about what works and what doesn't, and benchmarks that show faster code when using many cores. But the project ended with the conclusion that the STM model and the tracing JIT model don't play well together, and the PyPy team preferred to stay with the tracing JIT.

Python and Ruby are very similar. The same JIT that powers PyPy powers Topaz, which still performs quite well in Ruby benchmarks even though it has been dormant since 2017.

danmur wrote at 2020-10-30 08:03:51:

My recollection (which could be completely wrong) was that they had a branch with it, but it really needed hardware transactional memory to have decent speed for most applications.

pansa2 wrote at 2020-10-30 08:57:39:

Makes sense - they seem to have stopped work on STM so presumably it couldn’t be made to work well.

How will STM in Ruby solve (or avoid) the problems that doomed STM in Python?

the_mitsuhiko wrote at 2020-10-30 10:07:03:

It exists, it's just quite slow. I doubt there are users of it.

im3w1l wrote at 2020-10-30 07:40:30:

So I have very little knowledge here, but won't the transaction log need a global lock? Is the trick that you hold it for a very short time so that it doesn't become a problem? Also with high contention, won't the transactional threads do _a lot_ of replaying each other's work to figure out if they have conflicts or not?

sriram_malhar wrote at 2020-10-30 08:56:54:

Yes, to both.

The transaction log is append-only, so there are more optimal methods (including lock-free approaches) to make it efficient. Also, the duration of the lock is limited to the append operation, and is not determined by the application (i.e. by when the atomically block ends).

And yes, STM and high contention don't mix. It works well for occasional writes and predominant reads.

Tim Sweeney's presentation on this topic is hugely informative.

https://www.st.cs.uni-saarland.de//edu/seminare/2005/advance...
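A bounded-duration lock around an append-only log might look like this in Python (hypothetical `TxLog`; the point is that the lock covers only the O(1) append, never the lifetime of a transaction):

```python
import threading

class TxLog:
    """Append-only transaction log with a strictly bounded lock duration."""
    def __init__(self):
        self._lock = threading.Lock()
        self._entries = []

    def append(self, entry):
        # The lock is held only for this O(1) list append, not for
        # however long the surrounding transaction runs.
        with self._lock:
            self._entries.append(entry)

    def snapshot(self):
        # Copy under the lock so readers never see a torn list.
        with self._lock:
            return list(self._entries)
```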

mrh0057 wrote at 2020-10-30 08:06:16:

STM is implemented using immutable data structures; using atomics, you can do a compare-and-swap to update the value. If the swap fails you try your transaction again. I'm not sure how you would implement this easily in a language with mutable data structures. I know Azul had hardware transactional memory that supported it.

klodolph wrote at 2020-10-30 07:50:19:

You use a lock-free structure for the log.

md5person wrote at 2020-10-30 08:07:06:

> "lock-free"

I never liked that term.

Lock-free algorithms will usually boil down to some combination of polling, waiting (sleep(), futex-style), or use what are essentially more fine-grained, hardware-backed locks (CAS, memory barriers, LOCK instructions, hardware-specific transactional instructions, etc). The locks are very much still there.

dbaupp wrote at 2020-10-30 11:02:48:

It’s used flippantly at times, but lock-free is a technical term referring to a guarantee of global progress

https://en.m.wikipedia.org/wiki/Non-blocking_algorithm

(I think it’s also sometimes used as a synonym for non-blocking, so including “obstruction-free” too).

For instance, if all threads except one are stopped, the single running thread should “finish” (for whatever finish means) in a finite time (lock-free is stronger than this condition, too). An algorithm using mutex-style locks fails this guarantee, if one of the stopped threads is inside the critical section.

chrisseaton wrote at 2020-10-30 10:57:10:

‘Lock-free’ means the application cannot ‘lock up’, rather than there are no locks. CAS does use a kind of hardware-level lock in the cache, but can never cause any application to lock up because there is no per-process state like in a software lock.

md5person wrote at 2020-10-30 12:31:03:

Yes, that aligns more closely with my mental model of a "lock-free algorithm", but what I see in practice is people avoiding (or re-inventing) synchronization primitives, thinking that they don't belong in a "lock-free algorithm".

You can still use standard kernel-provided synchronization objects (mutex/events/semaphores) in your "lock-free applications", as long as you provide timeouts to blocking wait() calls, and handle abandoned objects gracefully.

sriram_malhar wrote at 2020-10-30 09:02:35:

True, but the key difference in the lock-free approach is that the "lock" is basically a pointer-swizzling or timestamp-swizzling operation, and so the lock duration is strictly bounded; it is determined by the application or the thread-scheduler.

WJW wrote at 2020-10-30 10:43:14:

Naming is hard, but in this case lock-free is just misleading. Why oh why didn't they go with bounded-lock data structures? :)

klodolph wrote at 2020-10-30 16:24:45:

Because there is no lock.

With a lock, you acquire the lock, you perform your operation, and then you release the lock.

Lock-free is typically done differently. You _don’t_ acquire a lock, you start performing the operation “optimistically”, and you commit the result if no other thread has stomped on your data. If another thread has stomped on your data, you start over.

One of the important differences here is that it’s a race. Whoever commits first, wins. With a lock, you have to wait for whatever thread acquired the lock first. That thread may not even be running.

So, lock-free is fundamentally different because with a lock-free system, a thread that is running will always complete work (you just don’t know _which_ thread). In a system with locks, threads that are running may be prevented from making progress by threads that are not running.

Systems with locks can also deadlock; lock-free systems cannot. Usually, there is some sort of guarantee that lock-free systems always make progress.

md5person wrote at 2020-10-30 16:43:09:

Your explanation doesn't make sense to me.

> "you commit the result if no other thread has stomped on your data"

What does "committing" mean here? If it means performing an atomic write (a-la CAS), then you're using a lock (see my next point).

> "If another thread has stomped on your data, you start over."

So you let your thread sit there in a spin-loop and CAS a condition-variable. That's a lock for all intents and purposes, and your system can still "dead-lock" (read: your thread will never get to "win the race"), if your "data gets stomped on" over and over again in-between reads (which are obviously non-atomic, otherwise you'd be locking there too).

> "Usually, there is some sort of guarantee that lock-free systems always make progress."

Getting a time-slice to continue running in a tight-loop while trying to CAS is not the same as "making progress".

"Making progress" would be - you'd get a chance to commit your changes and proceed to your next bit of business logic. But with contention, that's not guaranteed to happen at-all?

klodolph wrote at 2020-10-30 19:35:47:

I think you’ve just mixed up wait-free with lock-free.

- Lock-free: at least one thread will make progress

- Wait-free: all threads will make progress

The difference between locks and lock-free is very noticeable if a thread holding a lock is suspended (which can happen on a modern system just by accessing memory that is paged out). On real-time systems it’s also possible to guarantee that high-priority tasks will make progress, regardless of how a low-priority task behaves. With locks, a low-priority task can easily prevent a high-priority task from running at all.

md5person wrote at 2020-10-30 10:57:25:

> "the lock duration is strictly bounded; it is determined by the application or the thread-scheduler"

Practically speaking - wouldn't the exact duration be heavily influenced by the countless layers of abstraction beneath the application itself (kernel, scheduler, hardware, speculative execution, etc)? If so, can we ever truly make the claim that the "duration is strictly bounded"?

chrisseaton wrote at 2020-10-30 11:06:17:

In terms of the level at which we reason about thread interleavings, which is what we’re talking about, it is bounded.

md5person wrote at 2020-10-30 16:29:51:

I wonder why every single comment I leave in this thread is getting down-voted. This community has truly deteriorated over the years.

coder543 wrote at 2020-10-30 16:46:20:

Because a lot of your comments in this thread are "missing the point" or (unintentionally) misleading at best, and it's easier for people to use a downvote to make those comments less prominent in the discussion than to go through the trouble of trying to explain why the comments are missing the point or unintentionally misleading.

But, a lot of people have gone through the effort to leave explanations in this case. Isn't that good enough?

For one unaddressed example:

> You can still use standard kernel-provided synchronization objects (mutex/events/semaphores) in your "lock-free applications", as long as you provide timeouts to blocking wait() calls, and handle abandoned objects gracefully.

This might be technically true, but atomics provide significant performance benefits compared to using a mutex with a timeout, and I don't think of "lock free" as relying on busy loops as you seem to think in your comments. I'm sure a busy loop makes sense in very specific algorithms, and mutexes themselves often use a limited amount of busy waiting to avoid context switches. You can simulate atomics with mutexes, but that's not the point. Lock free is hard to do correctly. Mutexes create the possibility of a deadlock, which is incompatible with being lock-free, so you have to avoid all the footguns of lock-free _and_ the footgun of using mutexes.

So, yes, you can use standard sync objects... but that doesn't make anything better in this case. It misses the point. "Technically correct" isn't always good enough to get an upvote.

If you don't agree, that's fine... but this is completely uncalled for and against the community guidelines:

> This community has truly deteriorated over the years.

No one is obligated to upvote things they consider incorrect, and downvotes are perfectly suitable for this purpose under HN guidelines.

md5person wrote at 2020-10-30 17:00:48:

> "While technically true... I don't see how it contributes anything useful to the conversation, so I downvoted it."

If you don't see how it contributes to the conversation - why not just ignore it then? Just because you think this doesn't contribute to the conversation doesn't make it so.

> "If you don't agree, that's fine..."

Clearly it's not "fine", otherwise you wouldn't try to silence comments you disagree with by down-voting them.

This is my last comment on HN.

coder543 wrote at 2020-10-30 17:07:39:

> why not just ignore it then?

Because it's misleading and/or missing the point. As stated. Why would I want more people to read something that I believe is misleading?

> Clearly it's not "fine"

Once again missing the point. The prevailing opinion on HN is what will get upvoted, and a lot of other stuff will get downvoted. I have seen that prevailing opinion be _hilariously wrong_ before.

If you want to be heard with an unpopular opinion, you have to make a compelling argument... you can't expect to be upvoted to the top just because you're sharing an opinion. It's rare for a comment to be lukewarm enough for _no one_ to feel strongly enough to downvote it. It's going up, going down, or it's on a thread no one is reading. Downvotes don't mean people hate the person leaving the downvoted comments; it's just a tool for disagreeing with comments.

> This is my last comment on HN.

Because people disagreed with you? People even left explanations, but that wasn't enough.

7373737373 wrote at 2020-10-30 10:35:46:

Seeing routing in action like this is very satisfying

cutler wrote at 2020-10-30 22:32:11:

Just use Clojure.

waffletower wrote at 2020-10-30 13:30:18:

Ruby's proposed software transactional memory (STM) is Clojure.

chrisseaton wrote at 2020-10-30 13:34:30:

Not sure what you mean? STM wasn't originally developed for Clojure. STM as we know it today pre-dates Clojure by over a decade.

waffletower wrote at 2020-10-30 14:24:55:

My troll was intended to say that when a Ruby developer desires to utilize STM, they would instead become a Clojurist and use a language where STM was a foundational design consideration. I am not sure why it is relevant to discuss the originality of STM. But again, my post was a troll, so say -- whatever -- as I did.