core-async

dpsutton 2020-01-23T03:43:56.003200Z

I'm doing dining philosophers in cljs and wanted to recreate the standard deadlock situation before moving to the standard deadlock prevention strategies. However, I'm unable to actually trigger deadlock here. Does anyone see anything that would not allow the cyclic deadlock?

dpsutton 2020-01-23T04:06:07.001200Z

playing with it some more, it seems that there is no context switching between the (<! left) and (<! right) takes. if i put a small timeout channel take there it does happen. I thought it was a bit more cooperative multitasking there but it doesn't seem like it

2020-01-23T07:57:02.000600Z

That is because you are buffering the channels

2020-01-23T07:58:10.001400Z

The buffer immediately accepts 1 item, without having to wait for a consumer

2020-01-23T22:25:03.004300Z

Why would they switch?

2020-01-23T22:25:38.004700Z

If you find what you're looking for in the channel immediately, it won't context switch

2020-01-23T22:25:54.005Z

It will once something is waiting only

2020-01-23T22:26:04.005400Z

Otherwise it would just slow down the whole process

2020-01-23T22:27:35.005700Z

But, it does switch from the put to the take

2020-01-23T22:27:42.005900Z

You can think of it as a rendez-vous

2020-01-23T22:28:06.006200Z

put waits for a taker and take waits for a puter

2020-01-23T22:29:15.007100Z

Well, up to the buffer, so like @hiredman said, in your case the puter won't wait for the taker unless the buffer is full and the taker won't wait for the puter unless the buffer is empty

2020-01-23T22:30:49.008200Z

So the buffers control when things get to switch in a way. That said, from what I understand, two takes in a row won't switch if the take immediately succeeds.

2020-01-23T22:33:10.009600Z

I believe the context switch is round-robin? Actually, can someone confirm? So if you have something putting super quickly, and something taking a more slowly, it won't starve other processes, since between each rendez-vous of the put/take, it'll make a visit to the other processes

2020-01-23T22:37:10.011100Z

I like to think of it as I am the process. As I'm doing something, I won't stop to do anything else, until I reach a point where I can't continue on my current task, because I'm waiting for something, that's when I'll wonder off and see if there's anything else I can do in the mean time. Rince and Repeat

2020-01-23T22:38:00.011600Z

the context switch is a fifo queue

2020-01-23T22:39:14.012700Z

as in, go blocks are scheduled by being put on the end of a queue, which some threads are pulling tasks off and running

2020-01-23T22:40:00.013400Z

there is no central list of go blocks that is traversed to schedule them

2020-01-23T22:40:53.015Z

the way they are schedule is the go macro turns them into callbacks attached to channels, and when something happens on the channel the callback is put on the queue

alexmiller 2020-01-23T22:40:55.015100Z

having no central scheduler/controller is one of the most clever parts of core.async imo (esp when you look at alts etc)

2020-01-23T22:42:02.015500Z

Couldn't that open the door for starvation?

2020-01-23T22:42:23.015800Z

If two process just ping-pong with each other forever

2020-01-23T22:44:02.016200Z

they still wait in the queue

2020-01-23T22:45:38.017600Z

if A and B are ping ponging, then first A runs, which puts B on the queue and puts A "to sleep", and then B is taken off the queue and run, which puts A on the queue and puts B "to sleep"

2020-01-23T22:46:30.018600Z

at any point something else could be put on the queue which would be process beforeish(there are acutally multiple threads consuming and running tasks) the next A or B

2020-01-23T22:47:43.019700Z

there have been bugs related to this I think in the cljs implementation, where it would try to optimize by skipping the queue step, would easily results in starvation on a js vm because you only have one thread

2020-01-23T22:52:49.021600Z

That's exactly the scenario I was thinking. I can see how the thread pool for processed on the JVM would help a bit, but it still seems at least theoretically prone to it

2020-01-23T22:54:37.023100Z

no, given a fifo queue it won't happen, if you go through each step of A and B ping ponging, and assume at some point something else puts something on the queue, you can show eventually it reaches the front of the queue ahead of any work for A and B

2020-01-23T22:56:36.023500Z

Right, but how is that something else going to do that, if A and B are ping ponging

2020-01-23T22:57:19.024200Z

in the specific case of core.async, core.async itself has multiple threads in its pool and the jvm has multiple threads outside of core.async

2020-01-23T22:57:48.024800Z

but even if you are restrict yourself to a single thread and only core.async, it is still the case

2020-01-23T22:57:58.025100Z

in order for a task to be starved it must exist first

2020-01-23T22:58:56.026200Z

if the task exists for core.async it is either currently running, attached as a callback to some channel, or in the queue to be run

2020-01-23T22:59:19.026600Z

if it is attached to some channel that means it is waiting for activity on that channel and has nothing to do

2020-01-23T22:59:51.027300Z

when a task has nothing to do it cannot be starved (by definition starvation is a task with work to do not being able to get time to run)

2020-01-23T23:00:16.028200Z

so in order for a task to be starved it is either currently running or currently in the queue

2020-01-23T23:00:17.028300Z

"in order for a task to be starved it must exist first", hum, is this true? wouldn't task start processing before every go block is evaluated?

2020-01-23T23:00:55.028800Z

it is true

2020-01-23T23:01:26.029800Z

Say you had a go-loop just taking and puting on a channel

2020-01-23T23:01:26.029900Z

if you write a go block you cannot complain it is being starved because it is never run in my repl

2020-01-23T23:01:57.030700Z

and later you have another go trying to put a "stop" on it, which should sto the go-loop from ping-ponging

2020-01-23T23:02:03.031Z

if you write a program you cannot say it is being starved by the scheduler if you never run it

2020-01-23T23:02:07.031200Z

And we are in ClojureScript with only one thread ?

2020-01-23T23:04:35.032700Z

the problem with taking clojurescript as an example is I have seen people writing patches for cljs choose performance over what in my estimation is "correct" over and over again, so it feels like thin ice

2020-01-23T23:08:45.034700Z

The key invariants are: at some point one of the processes must yield (do some channel operation that can't complete immediately so it registers itself as a callback on the channel) and when it does complete the task must pass through the fifo queue before running again

2020-01-23T23:13:52.035500Z

(def a (async/chan 1))
(async/put! a 10)
(do (async/go-loop [a a]
      (let [v (async/<! a)]
        (println v)
        (async/>! a 20)
        (when v
          (recur a))))
    (async/go (async/>! a false)))

2020-01-23T23:14:03.035700Z

I think something like that

2020-01-23T23:14:32.036500Z

Logically, given a single threaded environment, I feel it would starve the second go process, and thus infinite loop

2020-01-23T23:14:33.036600Z

that is not two processes ping ponging

2020-01-23T23:17:32.037600Z

Hum..., isn't each recur a new process?

2020-01-23T23:17:37.037800Z

no

2020-01-23T23:18:13.038600Z

and even if it was, you would have successive processes doing stuff to a, not two processes ping ponging

2020-01-23T23:19:49.039600Z

True, sorry, I guess I meant with a buffer of 0:

(def a (async/chan))
  (async/put! a 10)
  (do (async/go-loop [a a]
        (let [v (async/<! a)]
          (println v)
          (async/>! a 20)
          (when v
            (recur a))))
      (async/go (println "ending the loop")
                (async/>! a false)))

2020-01-23T23:20:29.040500Z

Otherwise the first go-loop never yielded

2020-01-23T23:20:43.040800Z

not sure what changing the buffer size is supposed to change about it not being two processes ping ponging

2020-01-23T23:21:26.041800Z

Well, on the take of a, a continuation of the rest is created and put on the queue no ?

2020-01-23T23:21:34.042Z

a ping pong is A sending a ping to B then waiting for a pong, B waiting for the ping, then sending a pong to A, then A sending a ping to B

2020-01-23T23:22:06.042500Z

right, so here I'm using the same loop for that, put its like sending a ping to itself 😛

2020-01-23T23:22:17.042900Z

Okay, I mean you can do it with two seperate go block as well

2020-01-23T23:22:24.043300Z

you cannot

2020-01-23T23:22:27.043400Z

But I was seeing this as recursion

2020-01-23T23:23:28.043800Z

recursion is not the same thing as multiple processes

2020-01-23T23:26:12.045700Z

if instead of saying "couldn't two ping ponging processes lead to starvation" you said "couldn't an infinite loop that does nothing but read and write to a channel that no one else is reading and writing to lead to starvation", the answer is yes, because those are different things

2020-01-23T23:28:57.046700Z

(do
  (def a (async/chan))
  (async/put! a 10)
  (async/go-loop [a a]
    (when a
      (async/>! a 20)
      (recur a)))
  (async/go-loop [a a]
    (when a
      (println (async/<! a))
      (recur a)))
  (async/go (println "ending the loop")
            (async/close! a)))

2020-01-23T23:29:02.047Z

but even then I think it maybe a case where cljs has "optimized" things and it breaks things

2020-01-23T23:29:17.047400Z

What about this, this infinite loops, but isn't that due to starvation?

2020-01-23T23:29:35.047600Z

unless I just have a bug

2020-01-23T23:29:54.047900Z

Even in Clojure it infinite loops

2020-01-23T23:30:35.048400Z

nah, I was wrong, I was just checking the code

2020-01-23T23:31:16.049100Z

closing a doesn't end the loops

2020-01-23T23:31:30.049700Z

because you don't stop looping when a is closed

2020-01-23T23:31:35.049900Z

well, possibly my repl chokes because of the print, and it could eventually close

2020-01-23T23:31:59.050500Z

Wouldn't, oh ya

2020-01-23T23:32:01.050700Z

sorry

2020-01-23T23:32:05.050900Z

a is always truthy

2020-01-23T23:32:25.051300Z

ya, got confused with my old version

2020-01-23T23:32:57.052300Z

Hum... okay not sure how to create a repro of the scenario I'm imagining then 😛

2020-01-23T23:33:01.052400Z

on clojure that will terminate always anyway because you need at least 8 tasks not yielding

2020-01-23T23:48:32.052700Z

Alright, well I can't make it fail 😛

2020-01-23T23:48:43.053Z

This works as well, in clojure and cljs:

(do
  (def a (async/chan))
  (def keep-going (atom true))
  (async/go-loop [a a]
    (when @keep-going
      (async/>! a 20)
      (recur a)))
  (async/go-loop [a a]
    (when @keep-going
      (println (async/<! a))
      (recur a)))
  (async/go (println "ending the loop")
            (reset! keep-going false)
            (async/close! a)))

2020-01-23T23:52:44.053700Z

So, something makes the third go block execute, even though the first and second ping/pong between putting and taking

2020-01-23T23:53:50.054400Z

I think its got something to do with the evaluation actually

2020-01-23T23:55:53.055100Z

Ya, if I just send those 3 forms to the REPL, then it infinite loops, but wrapped in a do it doesn't, will try from a file