//! Slots and global/thread local data for the Helping strategy.
//!
//! This is inspired by (but is not an exact copy of)
//! <https://pvk.ca/Blog/2020/07/07/flatter-wait-free-hazard-pointers/>. The debts are mostly
//! copies of the ones used by the hybrid strategy, but modified a bit. Just like in the hybrid
//! strategy, in case the slots run out or when the writer updates the value, the debts are paid
//! by incrementing the ref count (which is a little slower, but still wait-free/lock-free and
//! still on the order of nanoseconds).
//!
//! ## Reader, the fast path
//!
//! * Publish an active address ‒ the address we'll be loading stuff from.
//! * Put a generation into the control.
//! * Load the pointer and put it into the debt slot.
//! * Confirm by CaS-replacing the generation back to the idle state.
//!
//! Later, we pay the debt back by CaS-replacing it with Debt::NONE (like any other slot).
//!
//! ## Writer, the non-colliding path
//!
//! * Replace the pointer in the storage.
//! * Walk over all debts. Pay each debt it is concerned with by bumping the reference count
//!   and replacing the debt with Debt::NONE. The relevant reader will fail its CaS (either
//!   because it finds Debt::NONE or some other pointer in there) and knows the reference was
//!   bumped, so it needs to decrement it. Note that it is possible someone also reuses the
//!   slot for the _same_ pointer. In that case that reader will set it to Debt::NONE and the
//!   newer reader will have a pre-paid debt, which is fine.
//!
//! ## The collision path
//!
//! The reservation of a slot is not atomic, therefore a writer can observe the reservation in
//! progress. But it doesn't want to wait for it to complete (it wants to be lock-free, which
//! means it needs to be able to resolve the situation on its own).
//!
//! The way it knows a reservation is in progress is by seeing a generation in there (it has a
//! distinct tag). In that case it'll:
//!
//! * First verify that the reservation is being done for the same address it modified, by
//!   reading and re-confirming the active_addr slot corresponding to the currently handled
//!   node. If it is for some other address, the writer doesn't have to be concerned and
//!   proceeds to the next slot.
//! * Do a full load. That is fine, because the writer must be on a different thread than the
//!   reader and therefore there is at least one free slot. A full load means paying the debt
//!   right away by incrementing the reference count.
//! * Try to pass the already fully protected/paid pointer to the reader. It writes the
//!   pointer into an envelope and CaS-replaces the envelope into the control, instead of the
//!   generation (if it fails, someone has been faster and it rolls back). We need the
//!   envelope because the pointer itself doesn't have to be aligned to 4 bytes and we need
//!   space for the tags that distinguish the types of info in the control; we can ensure the
//!   envelope is aligned.
//! * The reader then finds the generation got replaced by a pointer to the envelope and uses
//!   the pointer inside the envelope. It aborts its own debt. This effectively exchanges the
//!   envelopes between the threads, so each one has an envelope ready for the future.
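//!
//! As a rough sketch of the above (not authoritative ‒ the details live in the code below),
//! the control word of a slot moves through these states:
//!
//! ```text
//! IDLE         --(reader swaps in gen | GEN_TAG)-->                 generation
//! generation   --(reader confirms, swaps in IDLE)-->                IDLE  [fast path]
//! generation   --(writer CaSes in handover | REPLACEMENT_TAG)-->    replacement
//! replacement  --(reader swaps in IDLE, takes the handover)-->      IDLE  [collision path]
//! ```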
//!
//! ## ABA protection
//!
//! Having the generation pre-reserve the slot allows the writer to make sure it is offering
//! the loaded pointer to the same reader and that the read value is new enough (and of the
//! same type).
//!
//! This solves the general case, but there's also a much less frequent, theoretical ABA
//! problem that could lead to UB if left unsolved:
//!
//! * There is a collision on generation G.
//! * The writer loads a pointer and bumps it.
//! * In the meantime, all of the 2^30 or 2^62 generations (depending on the usize width)
//!   wrap around.
//! * The writer stores the outdated and possibly different-typed pointer in there and the
//!   reader uses it.
//!
//! To mitigate that, every time the counter overflows we take the current node and un-assign
//! it from our current thread. We mark it as in "cooldown" and leave it there until there are
//! no writers messing with that node any more (if they are not on the node, they can't
//! experience the ABA problem on it). After that, we are allowed to use it again.
//!
//! This doesn't block the reader, it'll simply find *a* node next time ‒ this one, or
//! possibly a different (or new) one.
//!
//! # Orderings
//!
//! The linked lists/nodes are already provided for us. So we just need to make sure the debt
//! juggling is done right. We assume that the local node is ours to write to (others have
//! only a limited right to write to certain fields under certain conditions) and that we are
//! counted among the active writers while we dig through it on the writer end.
//!
//! We use SeqCst on a read-write operation both here, at the very start of the sequence
//! (storing the generation into the control), and in the writer on the actual pointer. That
//! establishes which of the two happened first.
//!
//! After that, we split time into segments by AcqRel read-write operations on the control.
//! There's just one control in play for both threads, so SeqCst isn't strictly needed there
//! (though the code conservatively uses it) and both threads understand the segments the
//! same way. The writer can sometimes get by with only a load-Acquire on it, because it only
//! needs to read data written by the reader. It'll always see data from at least the segment
//! before the observed control value and it uses a CaS to send the results back, so it can't
//! go into the past.
//!
//! There are two little gotchas:
//!
//! * When we read the address we should be loading from, we need to give up if the address
//!   does not match (we can't simply load from there, because it could be dangling by that
//!   point and we don't know its type, so we need to use our own address for all loading ‒
//!   we just check that the two match). If we give up, we don't do the CaS into the control,
//!   therefore we could have given up on a newer address than the control value we have
//!   read. For that reason, the address is also stored by the reader with (at least) Release
//!   and we read it with (at least) Acquire, which will bring an up-to-date version of the
//!   control into our thread ‒ and we re-read the control to confirm the address is indeed
//!   framed by two identical values holding the generation, and therefore corresponds to it.
//! * The destructor doesn't have a SeqCst in the writer, because there was no write in
//!   there. That's OK. We need to ensure there are no new readers after the "change" we
//!   confirm in the writer, and that change is the destruction ‒ by that time, the
//!   destroying thread has exclusive ownership and therefore there can be no new readers.
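//!
//! To make the shape of the protocol easier to see, here is a minimal, non-compiling sketch
//! of the reader side (`next_generation` is a stand-in for the per-thread counter in
//! `Local`; the real entry points are `get_debt` and `confirm` below, and the fall-back
//! paths are omitted):
//!
//! ```ignore
//! slots.active_addr.store(storage_addr, SeqCst); // publish where we'll load from
//! let gen = next_generation() | GEN_TAG;
//! slots.control.swap(gen, SeqCst);               // reserve the slot
//! let ptr = storage.load(SeqCst);                // the load we are protecting
//! slots.slot.0.store(ptr as usize, SeqCst);      // note the debt
//! if slots.control.swap(IDLE, SeqCst) != gen {
//!     // A writer has helped us out: abort our own debt and use the pointer from the
//!     // offered handover envelope instead (that one is already paid for).
//! }
//! ```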

use std::cell::Cell;
use std::ptr;
use std::sync::atomic::Ordering::*;
use std::sync::atomic::{AtomicPtr, AtomicUsize};

use super::Debt;
use crate::RefCnt;

pub const REPLACEMENT_TAG: usize = 0b01;
pub const GEN_TAG: usize = 0b10;
pub const TAG_MASK: usize = 0b11;
pub const IDLE: usize = 0;

/// Thread local data for the helping strategy.
#[derive(Default)]
pub(super) struct Local {
    // The generation counter.
    generation: Cell<usize>,
}

// Make sure the pointers have 2 empty bits. Always.
#[derive(Default)]
#[repr(align(4))]
struct Handover(AtomicUsize);

/// The slots for the helping strategy.
pub(super) struct Slots {
    /// The control structure of the slot.
    ///
    /// Different threads signal what stage they are in through here. It can contain:
    ///
    /// * `IDLE` (nothing is happening, and there may or may not be an active debt).
    /// * A generation, tagged with `GEN_TAG`. The reader is trying to acquire a slot right
    ///   now and a writer might try to help out.
    /// * A replacement pointer, tagged with `REPLACEMENT_TAG`. This pointer points to a
    ///   `Handover` containing an already protected value, provided by the writer for the
    ///   benefit of the reader. The reader should abort its own debt and use this instead.
    ///   The indirection (storing a pointer to the envelope holding the actual pointer)
    ///   makes sure there's space for the tag ‒ there is no guarantee the real pointer is
    ///   aligned to at least 4 bytes, but we can force that for the `Handover` type.
    control: AtomicUsize,
    /// A possibly active debt.
    slot: Debt,
    /// If there's a generation in the control, this signifies what address the reader is
    /// trying to load from.
    active_addr: AtomicUsize,
    /// A place where a writer can put a replacement value.
    ///
    /// Note that this is simply an allocation. Every participating slot contributes one,
    /// but they may be passed around through the lifetime of the program. It is never
    /// accessed directly, only through `space_offer`.
    handover: Handover,
    /// A pointer to the handover envelope this node currently owns.
    ///
    /// A writer swaps its own and the reader's handover when it successfully stores a
    /// replacement in the control.
    space_offer: AtomicPtr<Handover>,
}

impl Default for Slots {
    fn default() -> Self {
        Slots {
            control: AtomicUsize::new(IDLE),
            slot: Debt::default(),
            // Doesn't matter yet
            active_addr: AtomicUsize::new(0),
            // Also doesn't matter
            handover: Handover::default(),
            // Here we would like it to point to our own handover. But for that we need to
            // be in place in RAM (effectively pinned, though Pin possibly postdates the
            // Rust versions we support), so not yet. See init().
            space_offer: AtomicPtr::new(ptr::null_mut()),
        }
    }
}
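
// A worked example of the control word layout (with made-up values, for illustration only):
// with TAG_MASK == 0b11, a control value splits into a payload (control & !TAG_MASK) and a
// tag (control & TAG_MASK). A generation of 8 is published as 8 | GEN_TAG == 0b1010; a
// Handover at the (hypothetical) address 0x1000 is published as 0x1000 | REPLACEMENT_TAG ==
// 0x1001. The two payload bits overwritten by the tag are known to be zero: generations go
// in steps of 4 and Handover is #[repr(align(4))].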

impl Slots {
    pub(super) fn slot(&self) -> &Debt {
        &self.slot
    }

    pub(super) fn get_debt(&self, ptr: usize, local: &Local) -> (usize, bool) {
        // Incrementing by 4 ensures we always have enough space for 2 bits of tags.
        let gen = local.generation.get().wrapping_add(4);
        debug_assert_eq!(gen & GEN_TAG, 0);
        local.generation.set(gen);
        // Signal the caller that the node should be sent to a cooldown.
        let discard = gen == 0;
        let gen = gen | GEN_TAG;
        // We will sync by the write to the control. But we also sync the value of the
        // previous generation/released slot. That way we may re-confirm in the writer that
        // the reader is not in between here and the compare_exchange below with a stale gen
        // (eg. if we are in here, the re-confirm there will load Debt::NONE and we are
        // fine).
        self.active_addr.store(ptr, SeqCst);

        // We are the only ones allowed to do the IDLE -> * transition and we never leave it
        // in anything else after a transaction, so this is OK. But we still need a
        // load-store SeqCst operation here to form a relation between this and the store of
        // the actual pointer in the writer thread :-(.
        let prev = self.control.swap(gen, SeqCst);
        debug_assert_eq!(IDLE, prev, "Left control in wrong state");

        (gen, discard)
    }

    pub(super) fn help<R, T>(&self, who: &Self, storage_addr: usize, replacement: &R)
    where
        T: RefCnt,
        R: Fn() -> T,
    {
        debug_assert_eq!(IDLE, self.control.load(Relaxed));
        // Also acquires the auxiliary data in the other variables.
        let mut control = who.control.load(SeqCst);
        loop {
            match control & TAG_MASK {
                // Nothing to help with
                IDLE if control == IDLE => break,
                // Someone has already helped out with that, so we have nothing to do here
                REPLACEMENT_TAG => break,
                // Something is going on, let's have a better look.
                GEN_TAG => {
                    debug_assert!(
                        !ptr::eq(self, who),
                        "Refusing to help myself, makes no sense"
                    );
                    // Get the address the other thread is trying to load from. That load
                    // also syncs the control into our thread once more; below we reconfirm
                    // that the value of active_addr sits between two identical control
                    // values and is therefore up to date with it.
                    let active_addr = who.active_addr.load(SeqCst);
                    if active_addr != storage_addr {
                        // Reload, for the same reason as at the top.
                        let new_control = who.control.load(SeqCst);
                        if new_control == control {
                            // The other thread is doing something, but to some other
                            // ArcSwap, so we don't care. Cool, done.
                            break;
                        } else {
                            // The control just changed under our hands, we don't know what
                            // to trust, so retry.
                            control = new_control;
                            continue;
                        }
                    }

                    // Now we know this work is for us. Try to create a replacement and
                    // offer it. This actually does a full-featured load under the hood, but
                    // we are currently idle and the load doesn't re-enter write, so that's
                    // all fine.
                    let replacement = replacement();
                    let replace_addr = T::as_ptr(&replacement) as usize;
                    // If we succeed in helping the other thread, we take their empty space
                    // in return for ours that we pass to them. It's already there; the
                    // value was synced to us by the load of the control above.
                    let their_space = who.space_offer.load(SeqCst);
                    // Relaxed would do here (nobody but our own thread writes in here), but
                    // we conservatively use SeqCst.
                    let my_space = self.space_offer.load(SeqCst);
                    // Relaxed would do here too ‒ we'll sync by the compare-exchange below,
                    // and if that fails, the value won't ever be read anyway.
                    unsafe {
                        (*my_space).0.store(replace_addr, SeqCst);
                    }
                    // Ensured by the align annotation on the type.
                    assert_eq!(my_space as usize & TAG_MASK, 0);
                    let space_addr = (my_space as usize) | REPLACEMENT_TAG;
                    // On failure we need at least Acquire (same reason as at the top:
                    // reading the value). On success we also need Release ‒ we send data to
                    // the other thread through here ‒ and success must be a superset of
                    // failure, so at least AcqRel; we conservatively use SeqCst for both.
                    // Also, note the failure case is a load that gets their space (it won't
                    // have changed; it only does when the control is set to IDLE).
                    match who
                        .control
                        .compare_exchange(control, space_addr, SeqCst, SeqCst)
                    {
                        Ok(_) => {
                            // We have successfully sent our replacement out (the release
                            // part) and got their space in return (the acquire part of the
                            // load above).
                            self.space_offer.store(their_space, SeqCst);
                            // The ref count went with it, so forget about it here.
                            T::into_ptr(replacement);
                            // We have successfully helped out, so we are done.
                            break;
                        }
                        Err(new_control) => {
                            // Something has changed in between. Let's try again; nothing
                            // has been changed by us (the replacement will get dropped at
                            // the end of the scope, we didn't do anything with the spaces,
                            // etc.).
                            control = new_control;
                        }
                    }
                }
                _ => unreachable!("Invalid control value {:X}", control),
            }
        }
    }
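
    // An illustrative (hypothetical) collision sequence between a reader R and a writer W,
    // each starting with one envelope of its own:
    //
    //   R: control <- gen | GEN_TAG                   (R is reserving, offers envelope_R)
    //   W: envelope_W.0 <- fully protected pointer
    //   W: control: gen | GEN_TAG -> envelope_W | REPLACEMENT_TAG   (the CaS in help)
    //   W: space_offer <- envelope_R                  (W now owns envelope_R)
    //   R: control -> IDLE (the swap in confirm), sees the replacement, takes the pointer
    //      out of envelope_W, space_offer <- envelope_W
    //
    // Both threads end up owning exactly one envelope again, just not the one they started
    // with.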

    pub(super) fn init(&mut self) {
        *self.space_offer.get_mut() = &mut self.handover;
    }

    pub(super) fn confirm(&self, gen: usize, ptr: usize) -> Result<(), usize> {
        // Put the slot in there and consider it the acquiring of a „lock“. For that we need
        // a swap, not a plain store (we need at least Acquire, and Acquire works only on
        // loads). The Release part is to make sure the control is observable by the other
        // thread (but that's probably not necessary anyway?); SeqCst covers both and is
        // what we conservatively use.
        let prev = self.slot.0.swap(ptr, SeqCst);
        debug_assert_eq!(Debt::NONE, prev);

        // Confirm by writing to the control (or discover that we got helped). We stop
        // anyone else from helping by setting it to IDLE.
        let control = self.control.swap(IDLE, SeqCst);
        if control == gen {
            // Nobody interfered, we have our debt in place and can proceed.
            Ok(())
        } else {
            // Someone put a replacement in there.
            debug_assert_eq!(control & TAG_MASK, REPLACEMENT_TAG);
            let handover = (control & !TAG_MASK) as *mut Handover;
            let replacement = unsafe { &*handover }.0.load(SeqCst);
            // Make sure we advertise the right envelope when we set the control to a
            // generation next time.
            self.space_offer.store(handover, SeqCst);
            // Note we've left the debt in place. The caller should pay it back (without
            // ever taking advantage of it) to make sure any extra reference is actually
            // dropped (it is possible someone provided the replacement *and* paid the debt,
            // and we need just one of them).
            Err(replacement)
        }
    }
}
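
// A few single-threaded sanity checks, added as an illustrative sketch. They exercise only
// the alignment and tag-arithmetic invariants described above, not the cross-thread
// protocol itself.
#[cfg(test)]
mod tests {
    use super::*;

    /// The two tag bits must be available in any Handover pointer; this is what the
    /// #[repr(align(4))] annotation is for.
    #[test]
    fn handover_alignment() {
        assert!(std::mem::align_of::<Handover>() > TAG_MASK);
        let handover = Handover::default();
        assert_eq!(&handover as *const Handover as usize & TAG_MASK, 0);
    }

    /// Generations move in steps of 4, so the two tag bits of a generation stay free, and
    /// the wrap-around to 0 is what sends a node into cooldown (see get_debt).
    #[test]
    fn generation_stepping() {
        let mut gen = 0usize;
        for _ in 0..1000 {
            gen = gen.wrapping_add(4);
            assert_eq!(gen & TAG_MASK, 0);
        }
        // The very last step of the cycle wraps around to exactly 0.
        assert_eq!((usize::MAX - 3).wrapping_add(4), 0);
    }

    /// After init, a slot offers its own envelope.
    #[test]
    fn init_offers_own_handover() {
        let mut slots = Slots::default();
        slots.init();
        let offered = slots.space_offer.load(Relaxed);
        assert!(ptr::eq(offered, &slots.handover));
    }
}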