architecture

john 2018-08-05T15:13:17.000044Z

So today I'm endeavoring to build, in clojurescript, a threadsafe alloc/`free` mechanism over a js/SharedArrayBuffer slab. Pointers are welcome! Feel free to point out any pointers on pointers.

orestis 2018-08-05T15:16:13.000004Z

I thought SAB was decommissioned after the Intel Spectre bugs?

john 2018-08-05T15:16:56.000044Z

Temporarily... They're going to introduce some time obfuscation at the super low level, at some later date, iiuc

john 2018-08-05T15:18:15.000085Z

SABs are critical to WASM eventually interoperating with JS, if I understand correctly

john 2018-08-05T15:20:20.000011Z

SABs aren't thread safe, but they have an js/Atomics interface where you can atomically (CAS) swap out one byte at a time.

john 2018-08-05T15:24:55.000003Z

You can js/Atomics.wait on a particular byte, and then another tread can `js/Atomics.wake` js/Atomics.notify on that byte and any waiting threads will be woken up.

john 2018-08-05T15:27:17.000099Z

So when we just wake up every thread trying to get a lock, each tries to swap their id into lock the byte. Whoever wins proceeds with access to the rest of the data structure in the allocation. The rest go back to sleep waiting on the lock byte.

john 2018-08-05T15:30:13.000004Z

What I've been doing, the data structure is duplicated in two parts of the SAB. So when someone gets a lock, they can only write to a second version of the structure. Once they're done writing, the flip the internal pointer to the new structure. That way, you get thread safe, non-blocking reads.

john 2018-08-05T15:32:12.000075Z

The downside is that we cannot synchronously expand the size of a SAB. So unless we can predict the possible expansion rate of the memory beforehand, we may fail on asynchronously growing the memory fast enough.

john 2018-08-05T15:33:41.000033Z

If you know you'll never add more than twice the previous size of the memory, we can pre-swap out the SAB with bigger SABs.

john 2018-08-05T15:36:48.000082Z

One problem with that scheme is that you double your memory usage, obviously. Also, constantly growing and shrinking little SABs is more likely to hit those edge cases where you fail to grow them fast enough. So just using pointers on a SAB slab would amortize that failure rate as well across a larger memory pool.

john 2018-08-05T15:39:12.000059Z

So this is the code I'm looking at right now: (you can ignore the sab and int32, as they're not yet touched in the below implementation. just the control maps are showing the memory would be managed. btw, in the maps, keys are indexes into the int32, and values are lengths)

john 2018-08-05T15:44:35.000057Z

john 2018-08-05T15:51:43.000009Z

Also, there's going to be another SAB in gmem which is the "control structure" that stores :alloc and :free which controls the locking and current address. There'll likely be a pointer map that atomically points to the latest/current address of the object. And a lock map (of CASable bytes) that control who has access to updating the structure.

john 2018-08-05T15:53:22.000014Z

I'm thinking I may need a read counter too, which readers can atomically inc and dec on, delaying any free calls until the read counter is at 0, so as to prevent new allocations from writing over a slow reader.

john 2018-08-05T15:56:29.000072Z

I can't imagine a writer winning that race though, since this is an ordered block of memory and writers will always be staring after readers, at a location to the left any readers in the buffer.

john 2018-08-05T16:00:37.000002Z

I'm not as much worried about the performance of the above implementation at them moment, over correctness. For best performance, the whole thing could be converted to bit-twiddling directly on the SAB

john 2018-08-05T16:02:19.000069Z

For the time being, I'm going to store all these control maps as serialized strings in the control SAB

john 2018-08-05T17:02:33.000029Z

Looks like they improved the situation with "site isolation" https://security.googleblog.com/2018/07/mitigating-spectre-with-site-isolation.html

john 2018-08-05T17:22:12.000012Z

the problem with the read-counter idea (if it's even necessary) is that misbehaving workers may die or get killed by a watchdog service during mid-read, so the counter may never reach zero. In that case, we'd need rather a read-registry of IDs, so a watchdog service can remove all registrations for readers that fail some health checks.

john 2018-08-05T17:59:57.000072Z

Or perhaps free can have a few dozen seconds timeout on reads, and just assume any threads that haven't decd the counter must be dead workers. No single read should take that long anyway.

john 2018-08-05T23:41:30.000004Z

K, here's a more thorough example model. It has a vector (vab) pretending to be an ArrayBuffer. As new objects are assigned to pointers, the old memory gets deallocated.