1 //! Slots and global/thread local data for the Helping strategy.
2 //!
3 //! This is inspired (but not an exact copy) of
4 //! <https://pvk.ca/Blog/2020/07/07/flatter-wait-free-hazard-pointers/>. The debts are mostly
5 //! copies of the ones used by the hybrid strategy, but modified a bit. Just like in the hybrid
6 //! strategy, in case the slots run out or when the writer updates the value, the debts are paid by
7 //! incrementing the ref count (which is a little slower, but still wait-free/lock-free and still
8 //! in order of nanoseconds).
9 //!
10 //! ## Reader, the fast path
11 //!
12 //! * Publish an active address ‒ the address we'll be loading stuff from.
13 //! * Puts a generation into the control.
14 //! * Loads the pointer and puts it to the debt slot.
15 //! * Confirms by CaS-replacing the generation back to idle state.
16 //!
17 //! * Later, we pay it back by CaS-replacing it with the NO_DEPT (like any other slot).
18 //!
19 //! ## Writer, the non-colliding path
20 //!
21 //! * Replaces the pointer in the storage.
22 //! * The writer walks over all debts. It pays each debt that it is concerned with by bumping the
23 //!   reference and replacing the dept with NO_DEPT. The relevant reader will fail in the CaS
24 //!   (either because it finds NO_DEPT or other pointer in there) and knows the reference was
25 //!   bumped, so it needs to decrement it. Note that it is possible that someone also reuses the
26 //!   slot for the _same_ pointer. In that case that reader will set it to NO_DEPT and the newer
27 //!   reader will have a pre-paid debt, which is fine.
28 //!
29 //! ## The collision path
30 //!
31 //! The reservation of a slot is not atomic, therefore a writer can observe the reservation in
32 //! progress. But it doesn't want to wait for it to complete (it wants to be lock-free, which means
33 //! it needs to be able to resolve the situation on its own).
34 //!
35 //! The way it knows it is in progress of the reservation is by seeing a generation in there (it has
36 //! a distinct tag). In that case it'll try to:
37 //!
38 //! * First verify that the reservation is being done for the same address it modified, by reading
39 //!   and re-confirming the active_addr slot corresponding to the currently handled node. If it is
40 //!   for some other address, the writer doesn't have to be concerned and proceeds to the next slot.
41 //! * It does a full load. That is fine, because the writer must be on a different thread than the
42 //!   reader and therefore there is at least one free slot. Full load means paying the debt right
43 //!   away by incrementing the reference count.
44 //! * Then it tries to pass the already fully protected/paid pointer to the reader. It writes it to
45 //!   an envelope and CaS-replaces it into the control, instead of the generation (if it fails,
46 //!   someone has been faster and it rolls back). We need the envelope because the pointer itself
47 //!   doesn't have to be aligned to 4 bytes and we need the space for tags to distinguish the types
48 //!   of info in control; we can ensure the envelope is).
49 //! * The reader then finds the generation got replaced by a pointer to the envelope and uses that
50 //!   pointer inside the envelope. It aborts its own debt. This effectively exchanges the envelopes
51 //!   between the threads so each one has an envelope ready for future.
52 //!
53 //! ## ABA protection
54 //!
55 //! The generation as pre-reserving the slot allows the writer to make sure it is offering the
56 //! loaded pointer to the same reader and that the read value is new enough (and of the same type).
57 //!
58 //! This solves the general case, but there's also much less frequent but theoretical ABA problem
59 //! that could lead to UB, if left unsolved:
60 //!
61 //! * There is a collision on generation G.
62 //! * The writer loads a pointer, bumps it.
63 //! * In the meantime, all the 2^30 or 2^62 generations (depending on the usize width) generations
64 //!   wrap around.
65 //! * The writer stores the outdated and possibly different-typed pointer in there and the reader
66 //!   uses it.
67 //!
68 //! To mitigate that, every time the counter overflows we take the current node and un-assign it
69 //! from our current thread. We mark it as in "cooldown" and let it in there until there are no
70 //! writers messing with that node any more (if they are not on the node, they can't experience the
71 //! ABA problem on it). After that, we are allowed to use it again.
72 //!
73 //! This doesn't block the reader, it'll simply find *a* node next time ‒ this one, or possibly a
74 //! different (or new) one.
75 //!
76 //! # Orderings
77 //!
78 //! The linked lists/nodes are already provided for us. So we just need to make sure the debt
79 //! juggling is done right. We assume that the local node is ours to write to (others have only
80 //! limited right to write to certain fields under certain conditions) and that we are counted into
81 //! active writers while we dig through it on the writer end.
82 //!
83 //! We use SeqCst on a read-write operation both here at the very start of the sequence (storing
84 //! the generation into the control) and in the writer on the actual pointer. That establishes a
85 //! relation of what has happened first.
86 //!
87 //! After that we split the time into segments by read-write operations with AcqRel read-write
88 //! operations on the control. There's just one control in play for both threads so we don't need
89 //! SeqCst and the segments are understood by both the same way. The writer can sometimes use only
90 //! load-Acquire on that, because it needs to only read from data written by the reader. It'll
91 //! always see data from at least the segment before the observed control value and uses CaS to
92 //! send the results back, so it can't go into the past.
93 //!
94 //! There are two little gotchas:
95 //!
96 //! * When we read the address we should be loading from, we need to give up if the address does
97 //!   not match (we can't simply load from there, because it can be dangling by that point and we
98 //!   don't know its type, so we need to use our address for all loading ‒ and we just check they
99 //!   match). If we give up, we don't do that CaS into control, therefore we could have given up on
100 //!   newer address than the control we have read. For that reason, the address is also stored by
101 //!   reader with Release and we read it with Acquire, which'll bring an up to date version of
102 //!   control into our thread ‒ and we re-read that one to confirm the address is indeed between
103 //!   two same values holding the generation, therefore corresponding to it.
104 //! * The destructor doesn't have a SeqCst in the writer, because there was no write in there.
105 //!   That's OK. We need to ensure there are no new readers after the "change" we confirm in the
106 //!   writer and that change is the destruction ‒ by that time, the destroying thread has exclusive
107 //!   ownership and therefore there can be no new readers.
108 
109 use std::cell::Cell;
110 use std::ptr;
111 use std::sync::atomic::Ordering::*;
112 use std::sync::atomic::{AtomicPtr, AtomicUsize};
113 
114 use super::Debt;
115 use crate::RefCnt;
116 
117 pub const REPLACEMENT_TAG: usize = 0b01;
118 pub const GEN_TAG: usize = 0b10;
119 pub const TAG_MASK: usize = 0b11;
120 pub const IDLE: usize = 0;
121 
122 /// Thread local data for the helping strategy.
123 #[derive(Default)]
124 pub(super) struct Local {
125     // The generation counter.
126     generation: Cell<usize>,
127 }
128 
129 // Make sure the pointers have 2 empty bits. Always.
130 #[derive(Default)]
131 #[repr(align(4))]
132 struct Handover(AtomicUsize);
133 
134 /// The slots for the helping strategy.
135 pub(super) struct Slots {
136     /// The control structure of the slot.
137     ///
138     /// Different threads signal what stage they are in in there. It can contain:
139     ///
140     /// * `IDLE` (nothing is happening, and there may or may not be an active debt).
141     /// * a generation, tagged with GEN_TAG. The reader is trying to acquire a slot right now and a
142     ///   writer might try to help out.
143     /// * A replacement pointer, tagged with REPLACEMENT_TAG. This pointer points to an Handover,
144     ///   containing an already protected value, provided by the writer for the benefit of the
145     ///   reader. The reader should abort its own debt and use this instead. This indirection
146     ///   (storing pointer to the envelope with the actual pointer) is to make sure there's a space
147     ///   for the tag ‒ there is no guarantee the real pointer is aligned to at least 4 bytes, we
148     ///   can however force that for the Handover type.
149     control: AtomicUsize,
150     /// A possibly active debt.
151     slot: Debt,
152     /// If there's a generation in control, this signifies what address the reader is trying to
153     /// load from.
154     active_addr: AtomicUsize,
155     /// A place where a writer can put a replacement value.
156     ///
157     /// Note that this is simply an allocation, and every participating slot contributes one, but
158     /// they may be passed around through the lifetime of the program. It is not accessed directly,
159     /// but through the space_offer thing.
160     ///
161     handover: Handover,
162     /// A pointer to a handover envelope this node currently owns.
163     ///
164     /// A writer makes a switch of its and readers handover when successfully storing a replacement
165     /// in the control.
166     space_offer: AtomicPtr<Handover>,
167 }
168 
169 impl Default for Slots {
default() -> Self170     fn default() -> Self {
171         Slots {
172             control: AtomicUsize::new(IDLE),
173             slot: Debt::default(),
174             // Doesn't matter yet
175             active_addr: AtomicUsize::new(0),
176             // Also doesn't matter
177             handover: Handover::default(),
178             // Here we would like it to point to our handover. But for that we need to be in place
179             // in RAM (effectively pinned, though we use older Rust than Pin, possibly?), so not
180             // yet. See init().
181             space_offer: AtomicPtr::new(ptr::null_mut()),
182         }
183     }
184 }
185 
186 impl Slots {
slot(&self) -> &Debt187     pub(super) fn slot(&self) -> &Debt {
188         &self.slot
189     }
190 
get_debt(&self, ptr: usize, local: &Local) -> (usize, bool)191     pub(super) fn get_debt(&self, ptr: usize, local: &Local) -> (usize, bool) {
192         // Incrementing by 4 ensures we always have enough space for 2 bit of tags.
193         let gen = local.generation.get().wrapping_add(4);
194         debug_assert_eq!(gen & GEN_TAG, 0);
195         local.generation.set(gen);
196         // Signal the caller that the node should be sent to a cooldown.
197         let discard = gen == 0;
198         let gen = gen | GEN_TAG;
199         // We will sync by the write to the control. But we also sync the value of the previous
200         // generation/released slot. That way we may re-confirm in the writer that the reader is
201         // not in between here and the compare_exchange below with a stale gen (eg. if we are in
202         // here, the re-confirm there will load the NO_DEPT and we are fine).
203         self.active_addr.store(ptr, SeqCst);
204 
205         // We are the only ones allowed to do the IDLE -> * transition and we never leave it in
206         // anything else after an transaction, so this is OK. But we still need a load-store SeqCst
207         // operation here to form a relation between this and the store of the actual pointer in
208         // the writer thread :-(.
209         let prev = self.control.swap(gen, SeqCst);
210         debug_assert_eq!(IDLE, prev, "Left control in wrong state");
211 
212         (gen, discard)
213     }
214 
help<R, T>(&self, who: &Self, storage_addr: usize, replacement: &R) where T: RefCnt, R: Fn() -> T,215     pub(super) fn help<R, T>(&self, who: &Self, storage_addr: usize, replacement: &R)
216     where
217         T: RefCnt,
218         R: Fn() -> T,
219     {
220         debug_assert_eq!(IDLE, self.control.load(Relaxed));
221         // Also acquires the auxiliary data in other variables.
222         let mut control = who.control.load(SeqCst);
223         loop {
224             match control & TAG_MASK {
225                 // Nothing to help with
226                 IDLE if control == IDLE => break,
227                 // Someone has already helped out with that, so we have nothing to do here
228                 REPLACEMENT_TAG => break,
229                 // Something is going on, let's have a better look.
230                 GEN_TAG => {
231                     debug_assert!(
232                         !ptr::eq(self, who),
233                         "Refusing to help myself, makes no sense"
234                     );
235                     // Get the address that other thread is trying to load from. By that acquire,
236                     // we also sync the control into our thread once more and reconfirm that the
237                     // value of the active_addr is in between two same instances, therefore up to
238                     // date to it.
239                     let active_addr = who.active_addr.load(SeqCst);
240                     if active_addr != storage_addr {
241                         // Acquire for the same reason as on the top.
242                         let new_control = who.control.load(SeqCst);
243                         if new_control == control {
244                             // The other thread is doing something, but to some other ArcSwap, so
245                             // we don't care. Cool, done.
246                             break;
247                         } else {
248                             // The control just changed under our hands, we don't know what to
249                             // trust, so retry.
250                             control = new_control;
251                             continue;
252                         }
253                     }
254 
255                     // Now we know this work is for us. Try to create a replacement and offer it.
256                     // This actually does a full-featured load under the hood, but we are currently
257                     // idle and the load doesn't re-enter write, so that's all fine.
258                     let replacement = replacement();
259                     let replace_addr = T::as_ptr(&replacement) as usize;
260                     // If we succeed in helping the other thread, we take their empty space in
261                     // return for us that we pass to them. It's already there, the value is synced
262                     // to us by Acquire on control.
263                     let their_space = who.space_offer.load(SeqCst);
264                     // Relaxed is fine, our own thread and nobody but us writes in here.
265                     let my_space = self.space_offer.load(SeqCst);
266                     // Relaxed is fine, we'll sync by the next compare-exchange. If we don't, the
267                     // value won't ever be read anyway.
268                     unsafe {
269                         (*my_space).0.store(replace_addr, SeqCst);
270                     }
271                     // Ensured by the align annotation at the type.
272                     assert_eq!(my_space as usize & TAG_MASK, 0);
273                     let space_addr = (my_space as usize) | REPLACEMENT_TAG;
274                     // Acquire on failure -> same reason as at the top, reading the value.
275                     // Release on success -> we send data to that thread through here. Must be
276                     // AcqRel, because success must be superset of failure. Also, load to get their
277                     // space (it won't have changed, it does when the control is set to IDLE).
278                     match who
279                         .control
280                         .compare_exchange(control, space_addr, SeqCst, SeqCst)
281                     {
282                         Ok(_) => {
283                             // We have successfully sent our replacement out (Release) and got
284                             // their space in return (Acquire on that load above).
285                             self.space_offer.store(their_space, SeqCst);
286                             // The ref count went with it, so forget about it here.
287                             T::into_ptr(replacement);
288                             // We have successfully helped out, so we are done.
289                             break;
290                         }
291                         Err(new_control) => {
292                             // Something has changed in between. Let's try again, nothing changed
293                             // (the replacement will get dropped at the end of scope, we didn't do
294                             // anything with the spaces, etc.
295                             control = new_control;
296                         }
297                     }
298                 }
299                 _ => unreachable!("Invalid control value {:X}", control),
300             }
301         }
302     }
303 
init(&mut self)304     pub(super) fn init(&mut self) {
305         *self.space_offer.get_mut() = &mut self.handover;
306     }
307 
confirm(&self, gen: usize, ptr: usize) -> Result<(), usize>308     pub(super) fn confirm(&self, gen: usize, ptr: usize) -> Result<(), usize> {
309         // Put the slot there and consider it acquire of a „lock“. For that we need swap, not store
310         // only (we need Acquire and Acquire works only on loads). Release is to make sure control
311         // is observable by the other thread (but that's probably not necessary anyway?)
312         let prev = self.slot.0.swap(ptr, SeqCst);
313         debug_assert_eq!(Debt::NONE, prev);
314 
315         // Confirm by writing to the control (or discover that we got helped). We stop anyone else
316         // from helping by setting it to IDLE.
317         let control = self.control.swap(IDLE, SeqCst);
318         if control == gen {
319             // Nobody interfered, we have our debt in place and can proceed.
320             Ok(())
321         } else {
322             // Someone put a replacement in there.
323             debug_assert_eq!(control & TAG_MASK, REPLACEMENT_TAG);
324             let handover = (control & !TAG_MASK) as *mut Handover;
325             let replacement = unsafe { &*handover }.0.load(SeqCst);
326             // Make sure we advertise the right envelope when we set it to generation next time.
327             self.space_offer.store(handover, SeqCst);
328             // Note we've left the debt in place. The caller should pay it back (without ever
329             // taking advantage of it) to make sure any extra is actually dropped (it is possible
330             // someone provided the replacement *and* paid the debt and we need just one of them).
331             Err(replacement)
332         }
333     }
334 }
335