In every Go project of significant size there comes a point where someone, in the name of performance, goes around the codebase changing functions to accept and return structs as pointers rather than values.
The idea is that it's faster to copy a pointer to a struct, which is 8 bytes on a 64-bit machine, than to copy the entire struct itself, which is usually bigger. The full size of a struct will depend on its fields, but on a 64-bit machine most fields in a Go struct are going to be 8 bytes each, with interfaces and strings being 16 and slices being 24. Having to copy all of those bytes must surely be slower than just copying the pointer.
As an example, a struct value with an integer, a slice, and a string will require 48 bytes to be copied every time it is passed into a function, while a pointer to that struct will only require 8. That's one-sixth of the bytes, a savings of about 83%!
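To make the arithmetic concrete, here is a minimal sketch that asks the compiler for those sizes. The `record` type and its field names are made up for illustration; only the field kinds (an integer, a slice, a string) come from the example above.

```go
package main

import (
	"fmt"
	"unsafe"
)

// record is a hypothetical struct with one int, one slice, and one string.
type record struct {
	id   int      // 8 bytes on a 64-bit machine
	tags []string // slice header: pointer, length, capacity
	name string   // string header: pointer, length
}

// sizes reports the size of a record value and of a pointer to one.
func sizes() (structSize, pointerSize uintptr) {
	var r record
	return unsafe.Sizeof(r), unsafe.Sizeof(&r)
}

func main() {
	s, p := sizes()
	fmt.Printf("struct: %d bytes, pointer: %d bytes\n", s, p)
}
```

On a 64-bit machine this reports 48 bytes for the struct and 8 for the pointer.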
I have always been against making this change, for a number of reasons.
First, while it's important for code to be performant, it's also important for it to be readable and easy to change. When a struct is passed by value it is a signal to the reader that the struct is being used as a piece of immutable data, and not as some mutable object which might get changed from afar at any moment.
Secondly, but somewhat relatedly, a pointer can have an extra state that a value cannot: nil. So now when reading a function which accepts a pointer I have to ask yet another question of myself: could this argument be nil? And if so, what's the intended meaning of it being nil? It's yet another piece of mental overhead to understand the code which, for structs carrying data, isn't necessary.
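Both of these points can be seen in a small sketch. The `user` type and the two rename functions are hypothetical, and treating nil as "do nothing" is just one assumption among several possible meanings, which is exactly the question the reader is forced to ask.

```go
package main

import "fmt"

// user is a hypothetical data-carrying struct.
type user struct {
	name string
}

// renameValue works on a copy; the caller's struct cannot be affected.
func renameValue(u user) user {
	u.name = "changed"
	return u
}

// renamePointer can modify the caller's struct, and has to decide what a
// nil argument means. Assumption here: nil is silently ignored.
func renamePointer(u *user) {
	if u == nil {
		return
	}
	u.name = "changed"
}

func main() {
	u := user{name: "original"}

	_ = renameValue(u)
	fmt.Println(u.name) // still "original"

	renamePointer(&u)
	fmt.Println(u.name) // now "changed"

	renamePointer(nil) // compiles fine; the meaning is up to the callee
}
```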
Finally, while it's true that passing by pointer is faster at the moment you're calling the function, it also means that the struct being passed is _most likely_ allocated on the heap, rather than the stack. Go does try to keep things on the stack when it can, but more often than not if a struct is passed by pointer it's going to go onto the heap, which means a heap allocation is required.
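One way to observe this is to count allocations directly. The sketch below is an illustration, not the benchmark used later on this page: the `point` type and the helper names are hypothetical, and `//go:noinline` plus the package-level sinks are there to force the pointer to escape, which real code may or may not do.

```go
package main

import (
	"fmt"
	"testing"
)

// point is a hypothetical 32-byte struct.
type point struct {
	x, y, z, w int64
}

// Package-level sinks keep the compiler from discarding the results.
var (
	valueSink   point
	pointerSink *point
)

//go:noinline
func makeValue(i int64) point { return point{x: i} }

//go:noinline
func makePointer(i int64) *point { return &point{x: i} }

// allocsPerCall reports the average number of heap allocations per call
// for the value-returning and pointer-returning constructors.
func allocsPerCall() (byValue, byPointer float64) {
	byValue = testing.AllocsPerRun(1000, func() { valueSink = makeValue(1) })
	byPointer = testing.AllocsPerRun(1000, func() { pointerSink = makePointer(1) })
	return byValue, byPointer
}

func main() {
	v, p := allocsPerCall()
	fmt.Printf("allocations per call: by value %.0f, by pointer %.0f\n", v, p)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape analysis decisions, which is another way to check whether a particular struct ends up on the heap.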
Heap allocations are not free, and in fact when I've done serious profiling work in the past they are one of the biggest sources of slowdowns. With that bias in mind I try to avoid them when possible. My reasoning is that even if a struct is relatively big, it's not generally _that_ big, and computers are good at copying contiguous blocks of memory anyway, so most likely any slowdown incurred by a multi-field struct is going to be dwarfed by the heap allocation.
While I stand by the first two points, the final point always felt flimsy to me, and so I decided to verify it.
My testing methodology was to write a benchmark which tests along two axes: the size of a struct and the number of times it is passed to a function. I ran the benchmark for multiple values along each axis using both pass-by-value and pass-by-pointer. I then produced a heatmap showing the relative performance of each.
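For a sense of what a single cell of such a benchmark might look like, here is a rough sketch covering one struct size and one call count. It is not the code behind the results on this page; the `big` type, the function names, and the choices of 64 bytes and 16 calls are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// big is a hypothetical 64-byte struct.
type big struct {
	fields [8]int64
}

// byValue passes the struct by value through `calls` nested calls.
//
//go:noinline
func byValue(s big, calls int) int64 {
	if calls <= 1 {
		return s.fields[0]
	}
	return byValue(s, calls-1)
}

// byPointer passes a pointer to the struct through `calls` nested calls.
//
//go:noinline
func byPointer(s *big, calls int) int64 {
	if calls <= 1 {
		return s.fields[0]
	}
	return byPointer(s, calls-1)
}

// Sinks stop the compiler from optimizing the work away; ptrSink also
// forces the pointed-to struct onto the heap (an assumption of this sketch).
var (
	sum     int64
	ptrSink *big
)

func main() {
	const calls = 16

	val := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			s := big{}
			s.fields[0] = int64(i)
			sum += byValue(s, calls)
		}
	})
	ptr := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			s := &big{}
			s.fields[0] = int64(i)
			ptrSink = s
			sum += byPointer(s, calls)
		}
	})

	fmt.Printf("by value: %d ns/op, by pointer: %d ns/op\n", val.NsPerOp(), ptr.NsPerOp())
}
```

Repeating this for several struct sizes and call counts would give the grid of measurements that a heatmap can be drawn from.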
The final results, colorized according to the degree to which performance was affected
Links showing how I arrived at this result can be found at the bottom of this page.
The chart above shows a number for each combination of struct size (vertical) and number of function calls (horizontal). The number itself is the pass-by-value benchmark result divided by the pass-by-pointer benchmark result. This means a value of `2` indicates that, for that particular combination of struct size and number of calls, pass-by-value is twice as fast as pass-by-pointer. A value of `0.2` means pass-by-pointer is 5 times as fast. A value of `1` means they are the same.
The raw numbers themselves aren't actually the point, just the general "shape" of the results. And looking at the results, I don't feel particularly vindicated in my position. Here's what can be said:
So what does it all mean? First, I still don't believe it's always worthwhile to pass-by-pointer. The benefits to code readability and resilience are real, and more often than not you have to go pretty far down the optimization rabbit-hole before pass-by-value becomes the biggest problem. In that context my default for personal projects will remain pass-by-value.
For work-related and open-source projects I'm not so sure. You can't always know how code will evolve, and what points will become so entrenched that refactoring them becomes practically impossible. It's good to establish patterns which make sense from the start and can avoid foreseeable problems. What should the pattern here be? For open-source libraries specifically it's important to avoid backwards-incompatible changes as much as possible, so the "start with pass-by-value and re-evaluate later" strategy is not so viable.
I can say for certain that at least for structs which are private to a package the programmer should remain free to start with pass-by-value and re-evaluate later. For public structs I don't think it's reasonable to just always use pass-by-pointer; if the user isn't going to pass that struct around more than 8-16 times then passing by pointer will be a net detriment. If they will pass it around more than that, but always on their side of the fence, then they can put it on the heap themselves and get the savings that way. Given just that you could say that the default should remain pass-by-value. The exception is if the value might be passed back and forth through your package's public functions/methods, in which case the decision of where to put the struct is on you and not the user. If you can be certain that the value will be passed across your package's functions frequently then pass-by-pointer is reasonable. If it's unclear then you'll just have to take a best guess, but personally I'll err on the side of pass-by-value for the non-performance benefits.
So that's where I've landed here. It's not quite an "I don't know", but it's pretty close. I guess that makes sense, otherwise there'd be an established right answer and I wouldn't have to do this. If you have any feedback for me on this topic I would love to hear it, please let me know your thoughts. This page will be kept up-to-date based on feedback and my future experiences.
Code to process the output of the benchmark into a CSV
-----
Last Updated: 2024-11-14