clojure-dev

Issues: https://clojure.atlassian.net/browse/CLJ | Guide: https://insideclojure.org/2015/05/01/contributing-clojure/
2019-08-02T00:34:56.137Z

OK, for any Java concurrency experts out there, I am now less sure of transient collection thread safety for passing transient vectors (at least, as first example I am looking at) from one thread to another. Suppose everything is nicely synchronized between two threads when things begin, then thread 1 calls transient on a vector and does a pop! on it, which caused its count to decrease from 3 to 2. Thread 2 gets a reference to this post-pop! transient vector, and calls count on it. Is thread 2 guaranteed to get the updated count when pop! completes? A few more details in the thread I will start on this.

2019-08-02T07:13:00.141900Z

https://clojure.atlassian.net/browse/CLJ-1580 is the one I found that looks most related, and its patch was merged in with the same Clojure release where the birth-thread check was removed from transients.

2019-08-02T07:14:05.142300Z

what makes transients memory consistent is not volatiles, it's the golden rule of transients : always use return value for later changes. To follow this rule in a multithreaded context implies a synchronization of t1 and t2 such that t1's pop! returned HB t2's count called, thus ensuring all changes made by t1 are visible on t2.

2019-08-02T07:15:49.142500Z

I agree that may be correct, but is it necessarily true that t2 getting a reference to the updated transient value requires a synchronization from t1 to t2? If so, why all the warnings in JCIP about improper publishing of a reference to an object?

2019-08-02T07:18:43.142700Z

And if it isn't the volatile's, why bother adding them to the implementation?

2019-08-02T07:29:32.142900Z

JCIP's safe publication stuff is about visibility guarantees on race conditions. Transients are explicitly designed to be updated one thread at a time, so I guess it's safe to assume no race conditions here.

2019-08-02T07:33:22.143100Z

BTW I would be very glad to know the purpose of all these volatiles in the implementation as well

2019-08-02T11:40:39.143400Z

probably the same purpose as volatiles in stateful transducers

2019-08-02T20:48:57.143800Z

In response to this statement of yours Alex: "There is a total ordering of all volatiles and some additional rules related to happen-before too iirc" I think it is true that there is a total ordering of all accesses to a single volatile field among all threads, but at least from my reading so far I haven't seen anything that says there must be a total ordering among accesses to two different volatile fields (unless there is some other constraint like program order or monitor locks, etc.)

2019-08-02T23:48:16.144900Z

FYI for those interested I finally noticed that Goetz's book "Java Concurrency In Practice" (JCIP) has a chap. 16 that attempts to make explicit connections between the Java Memory Model specification terminology, and recommendations made elsewhere in the book. Nice.

šŸ‘ 1
2019-08-04T02:45:25.145300Z

One source for the purpose of the volatiles is here: https://clojure.atlassian.net/browse/CLJ-1580 Those were a continuation of changes started with this ticket: https://clojure.atlassian.net/browse/CLJ-1498

2019-08-02T00:36:36.137100Z

I think maybe the answer is "no". Why not? Because even though all of the fields of class PersistentVector$TransientVector are volatile: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L520-L524

2019-08-02T00:38:18.137400Z

The pop! operation for transient vectors ends by assigning a value to the root field, followed later by decrementing cnt: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L787-L792

2019-08-02T00:39:45.137700Z

Thread 2's count starts by calling ensureEditable, which reads the volatile field root (I am assuming that this happens strictly after pop! completes for this example), then reads the volatile field cnt: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L537-L540

2019-08-02T00:40:17.138Z

So we know Thread 1's pop! causes write of root to happen-before write of cnt, by program order of operations in a single thread.

2019-08-02T00:40:30.138200Z

And we know Thread 2's count causes read of root to happen-before read of cnt, for the same reason.

2019-08-02T00:41:59.138400Z

But even if we assume that Thread 1's write of root happens-before Thread 2's read of root, that doesn't seem to require Thread 2's read of cnt to see the value of Thread 1's update of cnt. That is: separate volatile fields do not have any coordination between them.

2019-08-02T00:43:18.138700Z

For this example scenario, it seems that putting an assignment to the field root in method pop at the very end, after updating cnt, would be safer.

2019-08-02T00:44:47.138900Z

All TransientVector methods that are public either start with ensureEditable(), or some other function that calls ensureEditable(), which reads the field root (good), but in order for that to guarantee a consistent state among all 4 volatile fields, it seems like all methods that modify the collection state should also end with an assignment to root.

2019-08-02T00:52:38.139100Z

Perhaps an answer is: depending upon precisely how thread 2 got the reference to the transient collection, it might introduce more synchronization constraints with thread 1, that guarantee things are correct.

alexmiller 2019-08-02T04:34:27.140500Z

You might be right, but Iā€™d have to do some code study to tell. I vaguely recall there being a jira in this area too if you want to hunt it up

alexmiller 2019-08-02T04:35:05.141700Z

There is a total ordering of all volatiles and some additional rules related to happen-before too iirc