💾 Archived View for mediocregopher.com › posts › erlang-pitfalls.gmi captured on 2024-09-28 at 23:59:43. Gemini links have been rewritten to link to archived content
⬅️ Previous capture (2024-08-18)
-=-=-=-=-=-=-
Common pitfalls that people may run into when designing and writing large-scale erlang applications.
I've been involved with a large-ish scale erlang project at Grooveshark since sometime around 2011. I started this project knowing absolutely nothing about erlang, but now I feel I have accumulated enough knowlege over time that I could conceivably give some back. Specifically, common pitfalls that people may run into when designing and writing a large-scale erlang application. Some of these may show up when searching for them, but some of them you may not even know you need to search for.
The cononical way of getting the current timestamp in erlang is to use `erlang:now()`. This works great at small loads, but if you find your application slowing down greatly at highly parallel loads and you're calling `erlang:now()` a lot, it may be the culprit.
A property of this method you may not realize is that it is monotonically increasing, meaning even if two processes call it at the *exact* same time they will both receive different output. This is done through some locking on the low-level, as well as a bit of math to balance out the time getting out of sync in the scenario.
There are situations where fetching always unique timestamps is useful, such as seeding RNGs and generating unique identifiers for things, but usually when people fetch a timestamp they just want a timestamp. For these cases, `os:timestamp()` can be used. It is not blocked by any locks, it simply returns the time.
The built-in `rpc` module is slower than you'd think. This mostly stems from it doing a lot of extra work for every `call` and `cast` that you do, ensuring that certain conditions are accounted for. If, however, it's sufficient for the calling side to know that a call timed-out on them and not worry about it any further you may benefit from simply writing your own rpc module. Alternatively, use one which already exists.
One of erlang's niceties is transparent message sending between two phsyical erlang nodes. Once nodes are connected, a process on one can send any message to a process on the other exactly as if they existed on the same node. This is fine for many data-types, but for anonymous functions it should be avoided.
For example:
RemotePid ! {fn, fun(I) -> I + 1 end}.
Would be better written as
incr(I) -> I + 1. RemotePid ! {fn, ?MODULE, incr}.
and then using an `apply` on the RemotePid to actually execute the function.
This is because hot-swapping code messes with anonymous functions quite a bit. Erlang isn't actually sending a function definition across the wire; it's simply sending a reference to a function. If you've changed the code within the anonymous function on a node, that reference changes. The sending node is sending a reference to a function which may not exist anymore on the receiving node, and you'll get a weird error which Google doesn't return many results for.
Alternatively, if you simply send atoms across the wire and use `apply` on the other side, only atoms are sent and the two nodes involved can have totally different ideas of what the function itself does without any problems.
Hot swapping code is the bees-knees. It lets you not have to worry about rolling-restarts for trivial code changes, and so adds stability to your cluster. My warning is that you should not rely on it. If your cluster can't survive a node being restarted for a code change, then it can't survive if that node fails completely, or fails and comes back up. Design your system pretending that hot-swapping does not exist, and only once you've done that allow yourself to use it.
Erlang garbage collection (GC) acts on a per-erlang-process basis, meaning that each process decides on its own to garbage collect itself. This is nice because it means stop-the-world isn't a problem, but it does have some interesting effects.
We had a problem with our node memory graphs looking like an upwards facing line, instead of a nice sinusoid relative to the number of connections during the day. We couldn't find a memory leak *anywhere*, and so started profiling. We found that the memory seemed to be comprised of mostly binary data in process heaps. On a hunch my coworker Mike Cugini (who gets all the credit for this) ran the following on a node:
lists:foreach(erlang:garbage_collect/1, erlang:processes()).
and saw memory drop in a huge way. We made that code run every 10 minutes or so and suddenly our memory problem went away.
The problem is that we had a lot of processes which individually didn't have much heap data, but all-together were crushing the box. Each didn't think it had enough to garbage collect very often, so memory just kept going up. Calling the above forces all processes to garbage collect, and thus throw away all those little binary bits they were hoarding.
The `erl` process has tons of command-line options which allow you to tweak all kinds of knobs. We've had tons of performance problems with our application, as of yet not a single one has been solved with turning one of these knobs. They've all been design issues or just run-of-the-mill bugs. I'm not saying the knobs are *never* useful, but I haven't seen it yet.
The erlang model of allowing processes to manage global state works really well in many cases. Possibly even most cases. There are, however, times when it becomes a performance problem. This became apparent in the project I was working on for Grooveshark, which was, at its heart, a pubsub server.
The architecture was very simple: each channel was managed by a process, client connection processes subscribed to that channel and received publishes from it. Easy right? The problem was that extremely high volume channels were simply not able to keep up with the load. The channel process could do certain things very fast, but there were some operations which simply took time and slowed everything down. For example, channels could have arbitrary properties set on them by their owners. Retrieving an arbitrary property from a channel was a fairly fast operation: client `call`s the channel process, channel process immediately responds with the property value. No blocking involved.
But as soon as there was any kind of call which required the channel process to talk to yet *another* process (unfortunately necessary), things got hairy. On high volume channels publishes/gets/set operations would get massively backed up in the message queue while the process was blocked on another process. We tried many things, but ultimately gave up on the process-per-channel approach.
We instead decided on keeping *all* channel state in a transactional database. When client processes "called" operations on a channel, they really are just acting on the database data inline, no message passing involved. This means that read-only operations are super-fast because there is minimal blocking, and if some random other process is being slow it only affects the one client making the call which is causing it to be slow, and not holding up a whole host of other clients.
This one is probably a bit controversial, and definitely subject to use-cases. Do your own testing and profiling, find out what's right for you.
Mnesia is erlang's solution for global state. It's an in-memory transactional database which can scale to N nodes and persist to disk. It is hosted directly in the erlang processes memory so you interact with it in erlang directly in your code; no calling out to database drivers and such. Sounds great right?
Unfortunately mnesia is not a very full-featured database. It is essentially a key-value store which can hold arbitrary erlang data-types, albeit in a set schema which you lay out for it during startup. This means that more complex types like sorted sets and hash maps (although this was addressed with the introduction of the map data-type in R17) are difficult to work with within mnesia. Additionally, erlang's data model of immutability, while awesome usually, can bite you here because it's difficult (impossible?) to pull out chunks of data within a record without accessing the whole record.
For example, when retrieving the list of processes subscribed to a channel our application doesn't simply pull the full list and iterate over it. This is too slow, and in some cases the subscriber list was so large it wasn't actually feasible. The channel process wasn't cleaning up its heap fast enough, so multiple publishes would end up with multiple copies of the giant list in memory. This became a problem. Instead we chain spawned processes, each of which pull a set chunk of the subsciber list, and iterate over that. This is very difficult to implement in mnesia without pulling the full subscriber list into the process' memory at some point in the process.
It is, however, fairly trivial to implement in redis using sorted sets. For this case, and many other cases after, the motto for performance improvements became "stick it in redis". The application is at the point where *all* state which isn't directly tied to a specific connection is kept in redis, encoded using `term_to_binary`. The performance hit of going to an outside process for data was actually much less than we'd originally thought, and ended up being a plus since we had much more freedom to do interesting hacks to speedup up our accesses.
-----
Published 2014-10-29