core-async

dharrigan 2019-10-08T16:43:25.027700Z

So, I'm trying to reason about core.async and how to use it. I'm new, so lots to learn. I have this, but I'm not sure if it's the correct way to use core.async; please ask questions if the intent isn't clear:

dharrigan 2019-10-08T16:43:33.027900Z

(let [c (chan)]
  (go-loop []
    (when-let [ids (a-function-that-gathers-ids)]
      (do-something-with-ids-dumping-results-into-channel c)
      (<!!
       (go
        (a-function-that-updates-something-with-the-results (alts!! [c (timeout 5000)])))))
    (Thread/sleep 60000)
    (recur)))

dharrigan 2019-10-08T16:44:20.028600Z

a-function-that-gathers-ids reads from a database, and do-something-with-ids... reads from a RESTful API.

dharrigan 2019-10-08T16:44:25.028800Z

(using clj-http)

2019-10-08T16:57:02.029800Z

you don't want anything to block in the dynamic extent of a go block, meaning you should not do any blocking operations or call any functions or methods that block from a go block

2019-10-08T16:57:24.030300Z

so for example <!! blocks, so you should not call it from a go block

2019-10-08T16:58:33.031400Z

the general form of what you want to do is something like:

(go-loop [] (<! (thread (some-blocking-thing))) (recur))

2019-10-08T16:59:13.032Z

you have a go loop, you do the blocking operations on a real thread, then use <! to get the result

2019-10-08T17:02:47.034500Z

operations with double bangs (<!!, >!!, alts!!) block the real jvm thread they are run on, while operations with a single bang (<!, >!, alts!) sort of act like they block (the execution of the go block will stop until that operation completes), but they actually release the jvm thread to do other work while waiting
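
A minimal sketch of that difference, assuming a made-up `slow-lookup` function standing in for any blocking call (it is not from the conversation). The blocking work runs on a real thread via `thread`, the go block parks on the result with `<!`, and the double-bang take is used only outside the go block:

```clojure
(require '[clojure.core.async :refer [go thread <! <!!]])

;; Hypothetical stand-in for a blocking call (db query, HTTP request, ...).
(defn slow-lookup [id]
  (Thread/sleep 100)
  {:id id :status :ok})

;; `thread` runs its body on a real thread and returns a channel.
;; `<!` parks the go block until that channel delivers the result.
(defn lookup-async [id]
  (go
    (let [result (<! (thread (slow-lookup id)))]
      (assoc result :seen true))))

;; Outside any go block (for example at the REPL) a blocking take is fine.
(def answer (<!! (lookup-async 42)))
```

One thing to watch: a channel cannot carry nil, so a `thread` body that returns nil just closes its channel and the take yields nil.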

dharrigan 2019-10-08T17:22:56.034800Z

I see, okay, thank you for the explanation. Will experiment 🙂

dharrigan 2019-10-08T17:23:18.035200Z

It always confuses me when I should do async or block, esp with network/db operations

2019-10-08T17:58:37.036100Z

unless you know for sure otherwise, the safe assumption is anything that does io (network, db, file, etc.) is going to block, so give it a real thread
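
Applied to the polling loop from the original question, that advice might look roughly like the sketch below. `gather-ids`, `fetch-results`, `updates`, and `start-poller` are hypothetical stand-ins, and the loop is bounded with a short pause so it can be run as-is; the real thing would loop indefinitely with a longer pause:

```clojure
(require '[clojure.core.async :refer [go-loop thread timeout <! <!!]])

;; Hypothetical stand-ins for the database read and the REST call.
(defn gather-ids [] [1 2 3])
(defn fetch-results [ids] (mapv #(* 10 %) ids))

;; Collects what each polling round produced.
(def updates (atom []))

(defn start-poller
  "Runs `rounds` polling cycles, pausing `pause-ms` between them.
  Returns a channel that closes when the loop finishes."
  [rounds pause-ms]
  (go-loop [n rounds]
    (when (pos? n)
      ;; Blocking io runs on a real thread; the go block parks on the result.
      (let [results (<! (thread (fetch-results (gather-ids))))]
        (swap! updates conj results))
      ;; Taking from a timeout channel parks the go block,
      ;; where Thread/sleep would tie up one of the go pool's threads.
      (<! (timeout pause-ms))
      (recur (dec n)))))

;; Block the calling thread (not a go block) until the poller is done.
(<!! (start-poller 2 50))
```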

dharrigan 2019-10-08T18:53:58.036300Z

that's great advice, thanks!

dharrigan 2019-10-08T20:31:09.037400Z

Generally, why do blocking operations in a thread and not in a go block? Shouldn't the go block run in a separate thread from the main thread and suspend itself if something blocking occurs (thus the main thread keeps going?)

2019-10-08T20:32:06.037700Z

go blocks use a thread pool of restricted size

2019-10-08T20:32:23.038100Z

they aren't meant for worker threads, they are meant for coordination between channels

2019-10-08T20:32:57.038700Z

jvm threads don't suspend themselves in the way the go block abstraction does

2019-10-08T20:33:58.039900Z

sure, threads don't block each other (unless they monopolize resources), but context switches on the OS level (which threads do) are much more expensive than context switches in a state machine in a thread (what go blocks do)

2019-10-08T20:34:13.040300Z

which is why we even have go blocks

dharrigan 2019-10-08T20:35:50.042100Z

Okay, cool, but aren't go blocks meant to be "light-weight" and thus hundreds (thousands?) can be created? I appreciate the explanation of the context, but are you saying that go blocks are designed for just channel (message) coordination? Simply trying to understand when/where to use a go block vs. threads 🙂

2019-10-08T20:36:14.042600Z

you can create thousands of go blocks, right, but only N of them can be running

2019-10-08T20:36:16.042900Z

N is small

2019-10-08T20:37:01.043600Z

I really want to say "there is no main thread" because the jvm is a real multithreaded runtime and threads are all the same, but technically there is often a thread with the name "main" which is just the first thread the jvm starts

2019-10-08T20:37:10.043800Z

hundreds is nothing

2019-10-08T20:38:09.044700Z

generally normal threads are light weight enough to run hundreds of thousands

2019-10-08T20:38:43.045400Z

go blocks only do anything useful when used with core.async channels

2019-10-08T20:41:31.047400Z

(core.async channels however are pretty useful outside of go blocks)

dharrigan 2019-10-08T20:42:12.048500Z

I see, so go blocks for channels. So what's the approach then, if I have something, i.e., in the original question I asked, that does two things, go out to a db and read results, then go out to the web for further results, and I want those two operations to be tucked away, doing their own thing without disturbing the "main" 🙂 flow (so-to-speak), should I do what you kindly put up above?

dharrigan 2019-10-08T20:42:32.048900Z

something that would "scale" (magical word!)

2019-10-08T20:43:16.049900Z

I think you need to get more specific about what you want and use fewer magic words 🙂

2019-10-08T20:43:46.050500Z

(go-loop [] (let [x (<! rq-chan) db-info (<! (thread (lookup x)))] (<! (thread (web-search db-info))) (recur)))

dharrigan 2019-10-08T20:44:09.051Z

very interesting. thanks noisesmith

2019-10-08T20:44:27.051800Z

I almost missed the first chan

2019-10-08T20:44:32.052400Z

but like, why use a go block at all there?

dharrigan 2019-10-08T20:45:01.053100Z

(to be honest, I assumed go-block since I thought it was 'the thing to do' (tm))

2019-10-08T20:45:18.053800Z

you likely just need future

2019-10-08T20:45:32.054300Z

yeah - my snippet doesn't really make sense without other coordination with async blocks

2019-10-08T20:45:48.055Z

(other channels used to coordinate or buffer)

dharrigan 2019-10-08T20:46:38.056200Z

You know, as a newbie, there are precious few examples of how to use go blocks/threads in clojure that do "real-world-bread-and-butter-stuff" (tm) of writing to a db, reading from a web service and coordinating that. I'm welcome to be shown wrong! 🙂

2019-10-08T20:47:50.056800Z

because just use threads

2019-10-08T20:48:53.058100Z

if you are not comfortable writing multithreaded software with real threads, I don't know that core.async is going to solve anything for you

2019-10-08T20:49:55.059100Z

like, absent any other requirements it sounds like you just want (future (do-something-else (do-something)))
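
Spelled out a little, and assuming toy bodies for `do-something` and `do-something-else` (the names come from the message above, the implementations are invented here), the future version could look like this:

```clojure
;; Hypothetical stand-ins for the two blocking steps.
(defn do-something []
  (Thread/sleep 50)
  [1 2 3])

(defn do-something-else [ids]
  (reduce + ids))

;; `future` starts the body on another thread and returns immediately.
(def job (future (do-something-else (do-something))))

;; deref blocks only at the point where the result is needed.
;; The three-argument form gives up after the timeout (in ms)
;; and returns the supplied default instead.
(def result (deref job 5000 :timed-out))
```

No channels or go blocks are involved: the caller keeps going while the work runs, and only waits if and when it asks for the value.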

dharrigan 2019-10-08T20:50:14.059600Z

Oh, learning, and trying - all the cool kids seem to be using core async and go blocks these days....so trying to understand if that's for me! 🙂

dharrigan 2019-10-08T20:51:08.060600Z

I really appreciate your very helpful feedback! I have much studying to do! 🙂

dharrigan 2019-10-08T20:51:31.061300Z

(both of you!) 🙂

2019-10-08T20:51:38.061500Z

my heuristic is that every PR that first adds core.async to a project has subtle bugs where the code only works accidentally, and the problem being solved doesn't actually need core.async - I've yet to see it proved wrong

2019-10-08T20:52:22.062600Z

but that being said, there are coordination tasks where core.async helps a lot (eg. when you have an expensive thing to process and something might be in flight or need to be retried...)

2019-10-08T20:52:25.062800Z

every pr that adds core.async or maybe just every pr

2019-10-08T20:52:30.063Z

haha

dharrigan 2019-10-08T20:52:56.063500Z

Perhaps future is all I need for now 🙂 keep it simple 🙂

2019-10-08T20:53:22.064500Z

yeah, core.async, in my view is all about communication between logical threads of execution, in your example you don't have any of that

2019-10-08T20:53:28.064800Z

there's also claypoole which has more flexibility than future but works in the same basic paradigm

dharrigan 2019-10-08T20:53:30.064900Z

It's definitely an area I need to understand a whole lot more

2019-10-08T20:55:03.066400Z

for personal stuff if I am playing around with flow control or a consensus algorithm, core.async ends up being a way to experiment with that stuff without having to re-invent how communication happens

2019-10-08T20:55:44.067200Z

for work we use core.async pretty heavily for our chat system, which lends itself to a sort of agent view of the world, lots of process loops exchanging messages

2019-10-08T20:57:43.068700Z

there are some cases where you might use something like pipeline-blocking from core.async without explicitly having a model of multiple communicating processes, but for the most part core.async is for communication between multiple things
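
A rough sketch of that `pipeline-blocking` case, assuming a made-up `fetch` function for the blocking call and a hypothetical `fetch-all` wrapper around it:

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical blocking call, e.g. one HTTP request per id.
(defn fetch [id]
  (Thread/sleep 50)
  {:id id})

(defn fetch-all
  "Fetches every id with up to `n` blocking calls in flight at once."
  [n ids]
  (let [in  (a/chan (count ids))
        out (a/chan (count ids))]
    ;; The buffer holds every id, so these puts never block.
    (doseq [id ids] (a/>!! in id))
    (a/close! in)
    ;; Up to n threads apply the transducer; `out` closes once `in` is drained.
    (a/pipeline-blocking n out (map fetch) in)
    ;; a/into returns a channel that yields the collected vector.
    (a/<!! (a/into [] out))))

(def fetched (fetch-all 4 [1 2 3]))
```

Results come out of `out` in the same order the ids went into `in`, even though the calls run in parallel.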

dharrigan 2019-10-08T20:59:21.068900Z

thank you all! 🙂