💾 Archived View for dioskouroi.xyz › thread › 29359521 captured on 2021-11-30 at 20:18:30. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
________________________________________________________________________________
Unfortunately this write-up didn't help me understand pinning at all.
It almost feels like I need to read the article to the end in order to understand the assumptions at the start.
I guess I’m missing some context.
- what is ‘trivially moveable’?
- why would I want to overload the assignment operator?
- what is a move constructor?
- why would this make memory safe apis tricky?
- what does moving references outside unsafe code mean?
- why do we suddenly talk about Pin and Unpin, without explaining them first?
I’m so confused
"Trivially movable" means anyone can take an instance (that they own) and copy its data (just the plain memory cells) to a new location with a new address, then continue to use that instance at that new location without breaking anything.
This is a weaker guarantee than `Copy` because they're not necessarily allowed to use both copies of the data as true instances. They can still delay choosing one, but they can only ever interact with one of them through its API.
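To make the difference concrete, here is a minimal sketch (the `Label` type and `relocate` function are made up for illustration): a move relocates the bytes and retires the old binding, while a `Copy` type leaves both copies usable.

```rust
// Hypothetical type for illustration: owns heap data, so it is not `Copy`.
struct Label {
    text: String,
}

// Moving into a `Box` copies the struct's bytes to a new heap address.
// The `String`'s own buffer is untouched; only the (pointer, length, capacity)
// triple is relocated.
fn relocate(label: Label) -> Box<Label> {
    Box::new(label)
}

fn main() {
    let a = Label { text: String::from("hello") };
    let b = relocate(a);
    // `a` can no longer be used here; the compiler rejects any use of a moved value.
    assert_eq!(b.text, "hello");

    // A `Copy` type, by contrast, leaves both copies usable as true instances.
    let x: u32 = 7;
    let y = x;
    assert_eq!(x + y, 14);
}
```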
I'll see if I can add a link to an explanation there. In Rust it's mostly a base assumption and many other languages (C#, Java, JS…) avoid the topic almost entirely, so none of them explain it very prominently.
C++ is much more explicit about it, but unfortunately calls this property a bunch of different names. (I've seen "trivially relocatable" as a general term for it, since a "move" seems to be a very specific action in C++.)
The intro section turned out a bit C++-y, maybe. You can read up on the terms by clicking any underlined text. I'm not too happy about links not being blue, but I can't change it without being in the Hashnode ambassador program. (Not much spare money means my entire web presence is strung together from free or very inexpensive platform services. Maybe I'll post about that in the future.)
You can also skip the "The Problem" section entirely, though. It's not that important overall and serves mostly as an on-ramp for users of other languages with manual memory management.
There was an interesting talk at RustConf 2021 discussing the possibility of adding in-place construction and C++-like "move constructors" for pinned data in Rust. You can find it under the title "Move Constructors: Is It Possible?" by Miguel Young de la Sota.
There's actually a direct mention of that in my post already, but it's all the way near the bottom in the "Weakening non-moveability" section. It probably goes a bit too far to link in the intro, so I inlined a quick definition as a relative clause there instead.
The `moveit` library is interesting also because it would allow pinning containers to expose a wider mutable API, though there's currently no trait to update instances post-move, which would allow `realloc` use. I filed an issue for it about a week ago:
https://github.com/google/moveit/issues/26
That still leaves the issue of callback targets though, which would have to be notified before a possible move to support one-step reallocation.
Thanks, this already helps quite a lot! I came to Rust from mostly higher level languages, Haskell/JVM/JS. So there is this continuous battle of Rust solving problems where I'm not necessarily sure what they are and why those problems exist in the first place.
That's understandable, I originally come from about the same position too, with a bit less functional programming background.
I updated the intro to also cover that angle:
https://github.com/Tamschi/Abstraction-Haven-Backup/commit/f...
Since it turned out quite a bit longer that way, I was pretty liberal with the bold formatting to make it still easy to skip for those familiar with the issue.
Trivially moveable means that you can move the object by simply copying the bytes to a new location, and non-trivially-moveable objects are those that would be incorrect to move in this manner. A move constructor is a C++ concept for non-trivially-moveable types that lets you do something more complicated than just copying the bytes over when you want to move the value to a new location. Together with the move assignment operator (an overload of the assignment operator), it is essentially how you customise what happens when a value is moved in C++.
The classic example of something that is not trivially moveable is something with a pointer into itself. If you just copy the bytes, that pointer still points at the old location, which would not be valid. In C++, you would handle that by having a move constructor update the pointer to point at the new location.
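A minimal sketch of that classic example, assuming a made-up `SelfRef` type. It only compares addresses and never dereferences the stale pointer, so it stays within safe Rust:

```rust
// Hypothetical self-referential type: `ptr_to_value` is meant to always
// point at the `value` field of the same instance.
struct SelfRef {
    value: u8,
    ptr_to_value: *const u8,
}

impl SelfRef {
    // True while the stored pointer still points at our own `value` field.
    fn is_consistent(&self) -> bool {
        std::ptr::eq(self.ptr_to_value, &self.value)
    }
}

// Returns (consistent before the move, consistent after the move).
fn make_and_move() -> (bool, bool) {
    let mut s = SelfRef { value: 1, ptr_to_value: std::ptr::null() };
    s.ptr_to_value = &s.value as *const u8;
    let before = s.is_consistent();

    // A trivial move: the bytes are copied to a fresh heap allocation,
    // but the stored pointer still holds the old stack address.
    let moved = Box::new(s);
    let after = moved.is_consistent();

    (before, after)
}

fn main() {
    let (before, after) = make_and_move();
    assert!(before);
    assert!(!after);
}
```

Nothing ran to fix up the pointer during the move, which is exactly the job a C++ move constructor would do.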
Dealing with non-trivially-moveable types makes memory safe APIs tricky because Rust generally guarantees that code with no unsafe block can never result in memory unsafety. This also applies to library design, and anyone designing a library with unsafe blocks in it should design their library to provide the same guarantee, i.e. that anyone using only the safe methods from the library must not be able to cause memory unsafety, even though the methods they call may contain unsafe blocks. This should hold no matter how ridiculous the thing they're doing is, as long as it doesn't require unsafe.
So the reason why this is tricky is that safe Rust permits anyone to perform a "trivial move" of any object they have ownership of. Thus, if you make even the constructor of such an object safe, then safe code could call that constructor, then move the object in an unsound way.
So how do you solve this? Any library that has a non-trivially-moveable type needs some way for the user to promise not to move it using a trivial move. The way they do this is by introducing a Pin type, which is a wrapper around a (typically mutable) reference. The primary constructor for Pin is called new_unchecked, and it is an unsafe method whose safety requirement for calling it is that you never move the object that the reference points at. Thus, any library that sees a reference wrapped in a Pin knows that whoever created it _must_ have used an unsafe block, promising to follow the rules for that unsafe method. It is, in a sense, a way to shift the blame if anything goes wrong in an unsafe block in the library. Normally the library would be to blame because the library is the only one with an unsafe block, but when using Pin, it is the person who _created_ the Pin object who is to blame for the incorrect use of unsafe, not the library.
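As a rough sketch of that division of blame (the `Tracker` type and its methods are hypothetical, not from any real library): the library only exposes the sensitive method through `Pin<&mut Self>`, and the caller makes the promise by calling `Pin::new_unchecked` in their own unsafe block.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical library type that must not move once it has been pinned.
struct Tracker {
    hits: u32,
    _pinned: PhantomPinned, // opts the type out of `Unpin`
}

impl Tracker {
    // Safe constructor: the value is not pinned yet and may still be moved freely.
    fn new() -> Self {
        Tracker { hits: 0, _pinned: PhantomPinned }
    }

    // Only callable through a pin, so the library may rely on the address being stable.
    fn hit(self: Pin<&mut Self>) -> u32 {
        // SAFETY: we only update a field in place and never move `*self`.
        let this = unsafe { self.get_unchecked_mut() };
        this.hits += 1;
        this.hits
    }
}

fn count_two_hits() -> u32 {
    let mut tracker = Tracker::new();
    // SAFETY (the caller's promise): `tracker` is never moved after this line
    // and is dropped in place at the end of the function.
    let mut pinned = unsafe { Pin::new_unchecked(&mut tracker) };
    pinned.as_mut().hit();
    pinned.as_mut().hit()
}

fn main() {
    assert_eq!(count_two_hits(), 2);
}
```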
It is worth noting that new_unchecked is not the only way to create a Pin. For example, there's a macro called pin! that uses variable shadowing to prevent the user from accessing the underlying variable, preventing the user from moving the object since they can't access the variable name anymore. This lets safe code create a Pin reference to something the safe code owns, and the unsafe block is inside the definition of the macro.
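The shadowing trick can be sketched roughly like this. `pin_on_stack!` is a made-up name for illustration, and the pinning macros found in various crates may differ in detail:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical macro: takes ownership of the variable, then shadows it
// with a `Pin<&mut T>` of the same name.
macro_rules! pin_on_stack {
    ($name:ident) => {
        // Move the value into a fresh binding so the macro is sure to own it.
        let mut $name = $name;
        // Shadow that binding: the owned value can no longer be named,
        // so safe code has no way left to move it.
        #[allow(unused_mut)]
        let mut $name = unsafe { ::std::pin::Pin::new_unchecked(&mut $name) };
    };
}

// Hypothetical `!Unpin` type.
struct Immovable {
    id: u32,
    _pinned: PhantomPinned,
}

fn pinned_id() -> u32 {
    let value = Immovable { id: 5, _pinned: PhantomPinned };
    pin_on_stack!(value);
    // From here on, `value` names the pin, not the owned struct.
    let pinned: Pin<&mut Immovable> = value;
    pinned.id
}

fn main() {
    assert_eq!(pinned_id(), 5);
}
```

The caller writes no `unsafe` at all; the promise is made once, inside the macro, and the shadowing is what makes it impossible to break.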
Regarding moving references outside unsafe code, well that's because if someone gets hold of a mutable reference to a pinned value, then they could call std::mem::swap to perform a trivial move of that value, and this would not involve an unsafe block. Thus, if the library exposes any way of getting such a mutable reference in safe code, then that would be an incorrect unsafe block, as safe code is allowed to do anything it wants, as long as it uses no unsafe.
The Pin wrapper type has methods for unwrapping it and getting back the mutable reference that it is a wrapper around, however they are unsafe, so if anyone did that and then called std::mem::swap, they would be at fault because the one who unwrapped it called an unsafe method without following its rules.
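For reference, this is all it takes to relocate two values once you hold plain mutable references to them (a small sketch with ordinary `String`s, where moving is harmless):

```rust
use std::mem;

// Swaps the two values in place: each one ends up at the other's address.
// No `unsafe` is needed, which is why handing out `&mut T` to a pinned,
// non-trivially-moveable value would break the pinning promise.
fn swap_moves_both() -> (String, String) {
    let mut left = String::from("left");
    let mut right = String::from("right");
    mem::swap(&mut left, &mut right);
    (left, right)
}

fn main() {
    let (left, right) = swap_moves_both();
    assert_eq!(left, "right");
    assert_eq!(right, "left");
}
```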
As for Unpin, that's a marker on types that are trivially moveable. The Pin wrapper is essentially a no-op for references to those types and provides no guarantees. There's a safe Pin::new method that can only be used for such types.
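A small sketch of what that no-op looks like for an `Unpin` type such as `u32` (the function name is made up):

```rust
use std::pin::Pin;

fn bump_through_pin(counter: &mut u32) -> u32 {
    // `Pin::new` is safe, but only exists for pointers to `Unpin` targets.
    let mut pinned: Pin<&mut u32> = Pin::new(counter);
    // Mutable deref works directly because `u32: Unpin`.
    *pinned += 1;
    // Unwrapping is safe too, for the same reason.
    let plain: &mut u32 = pinned.get_mut();
    *plain += 1;
    *plain
}

fn main() {
    let mut counter = 0;
    assert_eq!(bump_through_pin(&mut counter), 2);
    assert_eq!(counter, 2);
}
```

For a type that is not `Unpin`, neither `Pin::new` nor `get_mut` would compile, which leaves only the unsafe routes.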
It's worth pointing out that in Rust, non-trivially-moveable types must still be trivially moveable _until_ they get pinned. So they typically have an "init" state or something like that which can be safely moved around, but then once you pin it, the object becomes non-trivially-moveable.
Something else that I didn't mention in the post is that `Pin<Box<T>>` is `From<Box<T>>` (so it can be cast directly, and here without reallocation or such), and this could also be implemented for (other) exclusively-owning containers with heap storage.
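A quick sketch of that conversion, checking that the heap address stays the same (the function name is only for illustration):

```rust
use std::pin::Pin;

// Converts an existing `Box` into a pinned one and reports whether
// the heap address of the contents stayed the same.
fn pin_box_without_realloc() -> bool {
    let boxed = Box::new(42u64);
    let addr_before = &*boxed as *const u64;

    // `From<Box<T>> for Pin<Box<T>>`: only the type changes, not the allocation.
    let pinned: Pin<Box<u64>> = Pin::from(boxed);
    let addr_after = &*pinned as *const u64;

    addr_before == addr_after
}

fn main() {
    assert!(pin_box_without_realloc());
}
```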
The standard library collections don't do this because they don't have an API that would make it useful.
I suspect it's partly because you can't `self: &mut Pin<Self>` in stable, which would be needed to cleanly give them one, but there's also a fair argument for not making them kitchen-sink types while a wrapper could provide this easily.
Nearly all my confusion is gone. This is amazing, thanks!
This is an amazing summary. Thank you.
Not to detract from the author's work - it does express the concept in "plain English" as promised - but boy oh boy is Pinning a confusing concept. This is coming from someone who has used Rust pretty regularly since 2015.
Sure, in simplest terms, I get it - "Pinning is used to guarantee something is never moved", and the utility of this is mostly (exclusively?) for guarantees in concurrency.
The moment I see the symbol "!Unpin" my brain goes into a fog. "Pinning means things cannot be moved... but most things are Unpin... which can be unpinned after being pinned... but some things are !Unpin... so they can NOT be UN-pinned... so when they are Pinned they are REALLY pinned..."
It seems to me like "Pin" is a perfectly sound, logical concept, but it was designed by type-system-science nerds, that barely translates to the promise of Rust which is to empower _everyone_, not just type-system-science nerds, to make safe software.
I would largely agree. In the cases where I've needed something pinned I just drop to Box'ing it, keeping the box internal and managing the pinning directly. It's annoying because you lose some classes of things that you may want to pin (values on the stack for short durations), but I found similar complexities in the API.
For instance, if you take a pinned Arc (ex: Pin<Arc<T>>), there wasn't a way I could find to get out the mutable interior pointer (ex: via Arc::get_mut), since those APIs require a specific signature and you would need to un-pin the value to make those calls despite them being by reference (safe from a pinning perspective).
Most of Rust is incredibly well designed, so it's pretty jarring that Pinning is so much harder to use. I've found work-arounds for the cases where I've needed it, either via boxing or dropping to unsafe, but it would have been nice if it had worked for my use cases.
> safe from a pinning perspective
The reason you can't get &mut T from Pin<Arc<T>> is because the mutable reference lets you move the data in safe Rust, e.g. with std::mem::swap.
So getting the &mut T from a Pin<Arc<T>> is _not_ safe from a pinning perspective.
Fair, but it feels like that significantly reduces the utility of pinned types if for certain values you lose the ability to mutate them when the compiler otherwise knows that there is a single mutable reference.
There's a hole or oversight in `Arc`'s API in this regard, annoyingly. The function you're looking for would be something like
pub fn get_mut_pinned(this: &mut Pin<Arc<T>>) -> Option<Pin<&mut T>>
, which for pinning-aware values should give you enough mutability and for `T: Unpin` can give you `Option<&mut T>` through `Option::as_deref_mut`.
Unfortunately, I haven't seen any `Arc` implementations that actually provide it, since almost none of the third-party ones are pinning-aware. (My `tiptoe::Arc`'s `get_mut` has this signature, but that's a specialty container with additional requirements for the contained value.)
The same goes for `make_mut_pinned`, that's missing too (but to be fair would be much less useful, since pinning and `Clone` don't often mix that well).
Have you tried proposing these APIs on the internals.rust-lang.org forum or via posting a proposed RFC? It's not clear to me if they're sound in the general case, but if they are, they can absolutely be added.
I should. I'm not too familiar with the process so far, but I'll find some time for it.
These functions are indeed generally sound (also for `Rc`), as any value that cares would be `!Unpin`, which would still bar access to `&mut T`.
`make_mut_pinned` also shouldn't cause too much confusion, as availability of `Clone` would have to be declared explicitly just about everywhere that's relevant.
They can be implemented as very thin wrappers around their non-pinning equivalents, with only a few `unsafe` operations to make the types fit.
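A rough sketch of what such a thin wrapper could look like as a free function. `get_mut_pinned` is hypothetical and not part of the standard library, and the pointer cast assumes that `Pin` is a transparent wrapper around its pointer type, so treat this as an illustration rather than a vetted implementation:

```rust
use std::pin::Pin;
use std::sync::Arc;

// Hypothetical helper, not a std API.
pub fn get_mut_pinned<T>(this: &mut Pin<Arc<T>>) -> Option<Pin<&mut T>> {
    // SAFETY (assumption): `Pin<Arc<T>>` has the same layout as `Arc<T>`.
    // The `&mut Arc<T>` is only passed to `Arc::get_mut`, so the pointee is
    // never moved and the `Arc` itself is never replaced.
    let arc: &mut Arc<T> = unsafe { &mut *(this as *mut Pin<Arc<T>> as *mut Arc<T>) };

    // `Arc::get_mut` does the uniqueness check; the result is re-wrapped in a
    // pin right away, so safe callers never see a bare `&mut T` unless `T: Unpin`.
    // SAFETY: the value was already pinned behind the `Arc` and stays in place.
    Arc::get_mut(arc).map(|value| unsafe { Pin::new_unchecked(value) })
}

fn main() {
    let mut shared: Pin<Arc<u32>> = Arc::pin(1);

    // Unique owner: mutation through the pin works (`u32` is `Unpin`).
    if let Some(mut value) = get_mut_pinned(&mut shared) {
        *value += 1;
    }
    assert_eq!(*shared, 2);

    // With a second handle alive, the uniqueness check fails.
    let other = shared.clone();
    assert!(get_mut_pinned(&mut shared).is_none());
    drop(other);
}
```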
> I should. I'm not too familiar with the process so far, but I'll find some time for it.
The internals forum is good for informal discussion, so start there. Posting an RFC proposal is of course the formal step for actually getting the feature included in Rust proper.
Pins are very often `Unpin`, but this isn't relevant to their function.
This idea is what I've seen get a lot of people lost, but it's pretty natural. By analogy, a DANGER HIGH VOLTAGE sign doesn't itself need to be dangerous or high voltage in order to be useful. If your DANGER HIGH VOLTAGE sign (the sign) is dangerous or high voltage, you'd probably want a second DANGER HIGH VOLTAGE sign pointing out this fact about the first sign. The same for Pin: if you're wanting the pinned pointer (the pointer itself) to not be Unpin, your pinned pointer would need to be pinned by some second pinned pointer.
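The sign analogy can be checked at compile time with a small sketch (`Immovable` and `assert_unpin` are made-up names): the pins themselves are `Unpin`, even when the value they point at is not.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical `!Unpin` type: the "high voltage" thing itself.
struct Immovable {
    _pinned: PhantomPinned,
}

// Compiles only for types that implement `Unpin`.
fn assert_unpin<T: Unpin>() {}

fn pins_are_unpin() -> bool {
    // The "signs": pins pointing at an `Immovable` can themselves be moved freely.
    assert_unpin::<Pin<&mut Immovable>>();
    assert_unpin::<Pin<Box<Immovable>>>();
    // assert_unpin::<Immovable>(); // would not compile: `Immovable` is `!Unpin`
    true
}

fn main() {
    assert!(pins_are_unpin());
}
```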
This is why I insist on never calling it "pinned pointer" in the post, even if not explicitly.
I'm aware the term is used in the standard library docs (for context: `Pin`'s tagline is "A pinned pointer."), but in my eyes that's a misnomer 99% of the time.
It might be a good idea to split the wording thoroughly and officially (into "pinned" vs. "pin"/"pinning …"), since this is such a common sticking point and seemingly mixing concepts right now. (I.e. "pinned pointers" are (usually) "`Pin`-wrapped" but not "pinned in place". It only becomes more muddled once you have inherently-pinning or "add-on" pins without `Pin` in their type.)
This is supposed to be plain English?