Sorry if I sound like a broken record here -- new question from me but still have Java memory model correct synchronization of data between threads bouncing in my brain, and looked at the implementations of all of the transducers within Clojure. A bunch have no internal state, so look perfectly safe to move between threads, and a bunch more have a single item of internal state created and updated using volatile!, so also look safe to me.
partition-by and partition-all look unsafe to me, from what I know. They have not only a volatile! in their internal state, but also a java.util.ArrayList, whose Java API docs say require outside synchronization if you do add method calls on them, which those transducers do.
Those java.util.ArrayList objects are not synchronized in the code of partitional-all or partition-by.
there is a ticket for the partition ones
if you patch, I'd be happy to screen it, it's been vetted by rich already
Just went looking for an existing JIRA, and then noticed you already linked to it when I came back 🙂
Thinking about a patch -- quickly realizing that detecting that code might be improperly synchronized, and finding the most efficient correction, are different levels of difficulty.
you can follow the path of other stateful transducers using volatile
I don't think it's any different than those
Yep, that is one way to skin that cat -- just will look a bit messy probably since there are a fair number of Java method calls for mutating it.
I am trying to work out if any of the current ways we have to use transducers (in core an core.async) would have an issue there, and I can't see that they would.
in core things don't just change threads, and in core.async thread changes all happen around taking channel locks
That’s the idea
@andy.fingerhut not sure I understand what’s hard
nothing hard, I believe. Except I am finding sometimes, getting attachments onto JIRA tickets.
Yeah, make sure to turn off the “new” ui if you haven’t as it is bad
Given the optimizations around clearing young objects, I wonder if it would actually be better to use nil for empty partition and just throw it away when done rather than clearing and reusing
Would simplify the code a lot
And may even be faster
I can try a variant like that.
clearing young objects, meaning generational GC? Or some other optimizations you are referring to there?
I’m in the car so can’t properly look at the patch but doing multiple ops on the volatile has safe publication but not thread safety, but I guess that doesn’t matter in this context
If by thread safety you mean that things can go badly if multiple threads simultaneously call this transducer function, then perfectly agreed, but good that none of the stateful transducers are intended to work in such an environment.
@hiredman I would be curious to learn what you find out there.