clojure-dev

Issues: https://clojure.atlassian.net/browse/CLJ | Guide: https://insideclojure.org/2015/05/01/contributing-clojure/
2019-09-06T08:00:03.086100Z

I do not have an experiment in code form that demonstrates that transient collections must have additional outside synchronization to pass them safely from one thread to another, but I am about 90% sure they do. More details in a thread, so those not interested can skip it more easily.

2019-09-06T08:00:54.086200Z

In particular, even with all of the volatile fields in the transient version of PersistentVector, it is possible to do an assoc! operation on a transient that returns the same identical object that was passed to assoc!, and the only changes made to the internal data structures are to one element of a Java object array. These are not AtomicReferenceArray's, but just normal Object arrays, so they have no special synchronization on them.

2019-09-06T08:01:12.086400Z

Such an assoc! operation reads some volatile fields, but never writes to any, and without writes to 'release' and a later matching volatile read to 'acquire', in the JMM synchronization lingo, there is nothing that I can see that enforces such java Object array element modifications being visible to a later thread that accesses the same transient vector, unless outside synchronization is used, e.g. storing a reference to the entire persistent vector object into an AtomicReference, a BlockingQueue, a volatile reference, etc.

😱 1
2019-09-06T08:03:40.086600Z

I am not proposing that anything change in the implementation of transients -- just that perhaps some documentation might be appropriate for explaining this, and perhaps some related issues.

2019-09-06T14:58:34.089100Z

I made some perf measurements regarding https://clojure.atlassian.net/browse/CLJ-2146

alexmiller 2019-09-06T15:14:23.089700Z

I ran it. most of the diffs were pretty low.

alexmiller 2019-09-06T15:14:28.089900Z

take 11.493305090156428%
take-nth 4.3784596592768725%
drop 23.302769624078746%
drop-while 41.14023842520035%
partition-by 7.9452740134274755%
keep-indexed 14.810195595843428%
map-indexed 12.049953792470527%
distinct 4.421372477465172%
interpose 17.946646239978342%
dedupe 23.70828320719508%

alexmiller 2019-09-06T15:14:59.090200Z

does not seem like "up to 50%" to me

alexmiller 2019-09-06T15:15:15.090400Z

on one run, I even had one that was faster

alexmiller 2019-09-06T15:16:02.090800Z

or slower I guess I should say

2019-09-06T15:17:43.091300Z

I think it's reasonable to say volatiles are not free

2019-09-06T15:18:43.091700Z

and I'm still to be convinced about the benefits

2019-09-06T17:49:28.091800Z

Noting mikerod's reaction, I don't think the sky is falling here -- From recent discussion and investigation of, for example, core.async, it appears to do its own synchronization for every let/loop local variable in your code that should be sufficient for this, so bouncing a go block's execution between different threads should not be a problem, even if your code uses transients, or anything else that is mutable.

2019-09-06T18:01:22.092Z

My understanding is that core.async channels can have transducers added on to them, and there the execution can be to any one of several threads in a thread pool. Are you saying that like hiredman, you believe that in that context, there is other synchronization in place that avoids the need for volatiles inside of transducer implementations?

2019-09-06T18:24:36.092200Z

yes, it's always protected by the channel lock

2019-09-06T20:30:47.092400Z

Do you have any code pointers to where that channel lock is, e.g. its precise name in the core.async source code? Is there a separate lock per channel?

cgrand 2019-09-06T21:40:33.095400Z

I agree with @hiredman and @leonoel. In general I think the focus should be on outer synchronization rather than making everything threadsafe.