core-async

2021-02-01T05:19:47.006700Z

purely for learning purposes, why does

(require '[clojure.core.async :refer [chan go >!! close! <!! >! <! thread]])

(def c (chan))

(do
  (go
    (do
      (doseq [i (range 1000000)]
        (>! c i))
      (close! c)))

  (go
    (time
     (loop [i (<! c)]
       (when i
         (recur (<! c)))))))
perform better than
(require '[clojure.core.async :refer [chan go >!! close! <!! >! <! thread]])

(def c (chan))

(do
  (thread
    (do
      (doseq [i (range 1000000)]
        (>!! c i))
      (close! c)))

  (thread
    (time
     (loop [i (<!! c)]
       (when i
         (recur (<!! c)))))))
i understand that when there are lots of go processes, the cheaper switching pays off for goroutines compared to threads. but with chan size = 1 and only two goroutines, ultimately the threads from the go threadpool are going to block on the channel until a put happens. so essentially, at a low level, it is the same as a thread blocking, right?

phronmophobic 2021-02-01T05:25:28.007200Z

for the go example, it could theoretically just bounce back and forth between pushing and pulling on the same thread.

2021-02-01T07:25:27.007400Z

ya that’s what i thought too but the thread ids were surprisingly different

phronmophobic 2021-02-01T07:27:41.007600Z

Interesting. not sure how to investigate further without using a profiler like https://github.com/clojure-goes-fast/clj-async-profiler
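One way to see this without a profiler (a sketch, assuming a REPL with core.async on the classpath) is to print the name of the carrier thread each time a go block resumes:

```clojure
(require '[clojure.core.async :refer [chan go >! <! close!]])

;; Sketch: print which pool thread each go block resumes on.
;; Parked go blocks can be resumed by any thread in the fixed
;; dispatch pool, so the names may differ between takes.
(def c (chan))

(go
  (doseq [i (range 5)]
    (>! c i))
  (close! c))

(go
  (loop []
    (when-some [v (<! c)]
      (println v "taken on" (.getName (Thread/currentThread)))
      (recur))))
```

The dispatch pool threads are typically named along the lines of `async-dispatch-N`, so whether the two blocks bounce between pool threads or share one is visible directly in the output.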

2021-02-01T16:27:29.008400Z

at a low level a process / thread context switch is much more expensive than a goroutine state machine context switch

2021-02-01T16:33:41.008600Z

for example, an OS context switch has to save all of a thread's state, while go reuses most of its state and just swaps state machines to pick up different blocks. even if multiple thread ids are being used, it's a fixed number of them with cooperative context switching, and a go block never spends time in OS-allocated work slices waiting on a signal like a thread can.
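A quick way to feel the difference (a sketch, not a benchmark): a parked go block is just stored continuation state, so you can create a huge number of them without tying up any OS threads, which would be impossible with real threads:

```clojure
(require '[clojure.core.async :refer [chan go <! >!! close!]])

;; Sketch: 10,000 go blocks, all parked on takes, consume no OS
;; threads while parked -- each is just a stored state machine.
(def chans (vec (repeatedly 10000 chan)))

(doseq [ch chans]
  (go (<! ch)))          ; parks immediately; no thread is blocked

;; The equivalent with (thread (<!! ch)) would pin 10,000 real
;; threads, each with a full OS stack, for the same wait.
(doseq [ch chans]
  (>!! ch :wake))        ; un-park them all from the calling thread
```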

cassiel 2021-02-01T20:15:57.010900Z

Rather a newbie-flavoured question here re: combining core.async with Stuart Sierra’s Component machinery. Which would be better style?
(i) Have one of the component modules own all the channels in the system, and do (chan) and (close!) on them all, with other components referring to them when firing up their (go) blocks.
(ii) Assuming a pipeline of consume-produce components, have each one create and close only the channel it sends to (which it can be assumed to “own”).

2021-02-01T20:17:21.011300Z

I would not do that

cassiel 2021-02-01T20:18:05.011900Z

I think you got in as I accidentally posted … I went back and edited.

2021-02-01T20:18:16.012200Z

yes, I would do i

2021-02-01T20:18:29.012700Z

(i) is making a global singleton

cassiel 2021-02-01T20:18:45.013Z

Hmm, (ii) feels better structured to me, but (i) feels simpler and more reliable.

cassiel 2021-02-01T20:20:43.016500Z

Bonus question: given that components are firing up their own go threads/coroutines, should they also be responsible for shutting them down? Or should the go blocks be required to exit on closed input channels, so that close! everywhere brings them down? I’m veering towards the latter: close the channels to shut down the go blocks.

2021-02-01T20:21:08.017100Z

(i) is only more reliable if you don't understand that component does things synchronously, so if you have asynchronous tasks (via go blocks, threads, threadpools, whatever) you need to bridge that divide

2021-02-01T20:21:25.017400Z

components must be responsible for shutting them down

2021-02-01T20:21:52.018Z

your stop function shouldn't return until all the async tasks you have started have exited (it is not enough to signal them to stop)

cassiel 2021-02-01T20:23:08.020Z

(Aside: I’m in CLJS so it’s all coroutined.) The only way to ensure a shutdown then is to make every go block hang on an alt! and have explicit shutdown channels. That feels like a lot of machinery to achieve something which feels like it should be simpler.

cassiel 2021-02-01T20:23:35.020600Z

Unless I’m missing an obvious pattern to do that.

2021-02-01T20:24:00.021Z

that is not entirely correct

cassiel 2021-02-01T20:24:26.021900Z

Am happy to be enlightened…!

2021-02-01T20:24:35.022100Z

it is often easier to use an explicit shutdown channel, but you can make all your go blocks check for reading nil from the input channel
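The explicit shutdown-channel version might look like this (a sketch; `in`, `:stopped`, and `:input-closed` are made-up names):

```clojure
(require '[clojure.core.async :refer [chan close! go alt! <!]])

(def in (chan))
(def shutdown (chan))

(go
  (loop []
    (alt!
      shutdown ([_] :stopped)              ; (close! shutdown) lands here
      in       ([msg] (if (some? msg)      ; nil means `in` was closed
                        (recur)
                        :input-closed)))))
```

Closing `shutdown` stops the loop even while `in` still has a backlog, which is the difference from relying on the nil-on-closed-channel check alone.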

2021-02-01T20:24:54.022900Z

oh, in cljs you are screwed anyway

cassiel 2021-02-01T20:24:57.023100Z

Yes - they do - so a closed channel will always bring down a go-block consumer.

2021-02-01T20:25:07.023300Z

you can't bridge the async -> sync divide

2021-02-01T20:26:28.024500Z

so in clj the component's stop will close the input, which will signal the go block to exit, and then do a blocking take from the channel returned when that go block is started, to ensure that after the component is stopped the go loop has exited

2021-02-01T20:27:02.025100Z

(if you just close the channel, your component may return from stop while the go loop is processing a back log of messages in the channel)
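Put together, the clj pattern described above might look like this sketch (assuming Stuart Sierra's component library is on the classpath; `process` is a hypothetical handler function):

```clojure
(require '[clojure.core.async :refer [chan close! go-loop <! <!!]]
         '[com.stuartsierra.component :as component])

(defrecord Worker [process in done]
  component/Lifecycle
  (start [this]
    (let [in   (chan)
          ;; go-loop returns a channel that closes when the loop exits
          done (go-loop []
                 (when-some [msg (<! in)]
                   (process msg)
                   (recur)))]
      (assoc this :in in :done done)))
  (stop [this]
    (close! (:in this))   ; signal the loop to exit...
    (<!! (:done this))    ; ...and block until it actually has
    (assoc this :in nil :done nil)))
```

The blocking take on `done` is what guarantees any backlog in `in` has been drained before `stop` returns.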

2021-02-01T20:27:19.025400Z

of course you can't do blocking takes in cljs

cassiel 2021-02-01T20:27:26.025600Z

I was about to say…!

cassiel 2021-02-01T20:29:33.028Z

There’s no way to block on a channel from within the “main thread” in component, so I guess all I can do is arrange for go blocks to stop on their next read-from-closed, and be content with that.

2021-02-01T20:30:33.029300Z

the "correct" thing would be to write your own version of component where the lifecycle protocol is itself asynchronous

2021-02-01T20:30:42.029500Z

😐

cassiel 2021-02-01T20:30:53.029800Z

Sometimes something can be too correct!

cassiel 2021-02-01T20:32:14.031Z

OK, so it feels like I have to live with some decoupling here - I can shut down a component, and make it shut channels, but have to code the go blocks so that they obligingly shut themselves down when the thread jumps to them.

cassiel 2021-02-01T20:33:56.032100Z

Am happy to go with a singleton channel “owner” which mints fresh ones on start and closes them all on stop. I don’t think that leads to any deadlocks or orphaned go contexts.

cassiel 2021-02-01T20:35:24.032500Z

Thanks very much for the discussion - it’s been illuminating.

2021-02-01T20:49:46.034200Z

"orphaned go contexts" - IIRC go blocks are stored on channels, if they aren't running or waiting on a channel they go out of scope and get gc'd

cassiel 2021-02-01T20:51:49.035700Z

I believe so - they’re just parked contexts. But I think I read something recently that hinted that they came from a pool. As I think about it, that doesn’t make sense to me - I don’t see any reason why they’d need to be limited.

2021-02-01T20:52:20.036100Z

threads are pooled, the blocks are "unlimited"

cassiel 2021-02-01T20:58:27.037Z

Sure - though in CLJS I don’t have threads and am just jumping around inside a coroutine state machine, so I guess there are no limits of any kind.

2021-02-01T20:59:31.037500Z

well "only one thread ever exists" is a bit of a limit 😄

cassiel 2021-02-01T20:59:52.037800Z

True - but once you get over that, the sky’s the limit.