So, I'm trying to reason about core.async and how to use it. I'm new, so there's lots to learn. I have this, but I'm not sure if it's the correct way to use core.async; please ask questions if the intent isn't clear:
(let [c (chan)]
  (go-loop []
    (when-let [ids (a-function-that-gathers-ids)]
      (do-something-with-ids-dumping-results-into-channel c)
      (<!!
        (go
          (a-function-that-updates-something-with-the-results (alts!! [c (timeout 5000)])))))
    (Thread/sleep 60000)
    (recur)))
a-function-that-gathers-ids reads from a database, and do-something-with-ids... reads from a RESTful API (using clj-http)
you don't want anything to block in the dynamic extent of a go block, meaning you should not do any blocking operations or call any functions or methods that block from a go block
so for example <!! blocks, so you should not call it from a go block
the general form of what you want to do is something like:
(go-loop [] (<! (thread (some-blocking-thing))) (recur))
you have a go loop, you do the blocking operations on a real thread, then use <! to get the result
operations with double bangs (<!!, >!!, alts!!) block the real jvm thread they are run on, while operations with a single bang (<!, >!, alts!) sort of act like they block (the execution of the go block stops until that operation completes), but they actually release the jvm thread to do other work while waiting
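A minimal sketch of the "blocking work on a real thread, `<!` in the go block" shape described above. `slow-lookup` and `fetch-all` are hypothetical names standing in for a database or HTTP call; they are not from the original code.

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical stand-in for a blocking call (db query, HTTP request).
(defn slow-lookup [id]
  (Thread/sleep 50)
  {:id id :status :ok})

(defn fetch-all
  "Returns a channel that yields a vector with one result per id."
  [ids]
  (a/go-loop [remaining ids
              acc       []]
    (if (seq remaining)
      ;; a/thread runs the blocking call on a real thread and returns a
      ;; channel; <! parks the go block (without blocking a pool thread)
      ;; until the result arrives.
      (let [result (a/<! (a/thread (slow-lookup (first remaining))))]
        (recur (rest remaining) (conj acc result)))
      acc)))
```

From the REPL or any ordinary thread (not from inside a go block), `(a/<!! (fetch-all [1 2 3]))` waits for the vector of results.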
I see, okay, thank you for the explanation. Will experiment 🙂
It always confuses me when I should do async or block, esp with network/db operations
unless you know for sure otherwise, the safe assumption is that anything that does IO (network, db, file, etc.) is going to block, so give it a real thread
that's great advice, thanks!
Generally, why do blocking operations in a thread and not in a go block? Shouldn't the go block run in a separate thread from the main thread and suspend itself if something blocking occurs (thus the main thread keeps going?)
go blocks use a thread pool of restricted size
they aren't meant for worker threads, they are meant for coordination between channels
jvm threads don't suspend themselves in the way the go block abstraction does
sure, threads don't block each other (unless they monopolize resources), but context switches at the OS level (which threads do) are much more expensive than context switches in a state machine in a thread (what go blocks do)
which is why we even have go blocks
Okay, cool, but aren't go blocks meant to be "light-weight" and thus hundreds (thousands?) can be created? I appreciate the explanation of the context, but are you saying that go blocks are designed for just channel (message) coordination? Simply trying to understand when/where to use a go block vs. threads 🙂
you can create thousands of go blocks, right, but only N of them can be running
N is small
I really want to say "there is no main thread" because the jvm is a real multithreaded runtime and threads are all the same, but technically there is often a thread with the name "main" which is just the first thread the jvm starts
hundreds is nothing
generally normal threads are lightweight enough to run hundreds or even thousands
go blocks only do anything useful when used with core.async channels
(core.async channels however are pretty useful outside of go blocks)
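As a rough illustration of channels being useful without go blocks, here is a sketch where a producer and a consumer both run on real threads and talk over a buffered channel using the blocking operations. `sum-via-channel` is a made-up name for this example.

```clojure
(require '[clojure.core.async :as a])

(defn sum-via-channel
  "Sums numbers by passing them through a channel between two real threads.
  No go block is involved, so the blocking ops >!! and <!! are fine here."
  [numbers]
  (let [c        (a/chan 10)
        ;; Producer: puts each number, then closes the channel.
        _        (a/thread
                   (doseq [n numbers]
                     (a/>!! c n))
                   (a/close! c))
        ;; Consumer: takes until the channel is closed (<!! returns nil).
        consumer (a/thread
                   (loop [total 0]
                     (if-some [n (a/<!! c)]
                       (recur (+ total n))
                       total)))]
    ;; a/thread returns a channel carrying the body's result.
    (a/<!! consumer)))
```

Calling `(sum-via-channel [1 2 3])` blocks the calling thread until the consumer has drained the channel.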
I see, so go blocks are for channels. So what's the approach then if I have something, like in my original question, that does two things: go out to a db and read results, then go out to the web for further results? I want those two operations tucked away, doing their own thing without disturbing the "main" 🙂 flow (so to speak). Should I do what you kindly put up above?
something that would "scale" (magical word!)
I think you need to get more specific about what you want and use fewer magic words 🙂
(go-loop [] (let [x (<! rq-chan) db-info (<! (thread (lookup x)))] (<! (thread (web-search db-info))) (recur)))
very interesting. thanks noisesmith
I almost missed the first chan
but like, why use a go block at all there?
(to be honest, I assumed go-block since I thought it was 'the thing to do' (tm))
you likely just need future
yeah - my snippet doesn't really make sense without other coordination with async blocks
(other channels used to coordinate or buffer)
You know, as a newbie, there are precious few examples of how to use go blocks/threads in clojure that do "real-world-bread-and-butter-stuff" (tm) like writing to a db, reading from a web service, and coordinating that. I'm welcome to be shown wrong! 🙂
because just use threads
if you are not comfortable writing multithreaded software with real threads, I don't know that core.async is going to solve anything for you
like, absent any other requirements it sounds like you just want (future (do-something-else (do-something)))
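For comparison, a small sketch of the plain `future` approach using only clojure.core. `gather-ids` and `fetch-details` are hypothetical stand-ins for the database read and the REST call from the original question.

```clojure
;; Hypothetical stand-ins for the blocking db read and HTTP call.
(defn gather-ids []
  (Thread/sleep 20)
  [1 2 3])

(defn fetch-details [ids]
  (Thread/sleep 20)
  (mapv (fn [id] {:id id :fetched true}) ids))

;; Both blocking steps run on a background thread; the caller carries on.
(def result (future (fetch-details (gather-ids))))

;; deref with a timeout blocks the caller for at most 5 seconds and
;; returns :timed-out if the work has not finished by then.
(deref result 5000 :timed-out)
```

There is no channel here at all: the future is both the background thread and the handle to its result, which is all the two-step workflow seems to need.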
Oh, learning, and trying - all the cool kids seem to be using core async and go blocks these days....so trying to understand if that's for me! 🙂
I really appreciate your very helpful feedback! I have much studying to do! 🙂
(both of you!) 🙂
my heuristic is that every PR that first adds core.async to a project has subtle bugs where the code only works accidentally, and the problem being solved doesn't actually need core.async - I've yet to see it proved wrong
but that being said, there are coordination tasks where core.async helps a lot (e.g. when you have an expensive thing to process and something might be in flight or need to be retried...)
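One guess at what such a coordination task could look like: racing an in-flight call against a timeout and retrying it. This is only a sketch, not code from the discussion; `with-retry` and `slow-call` are invented names.

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical blocking call that takes `ms` milliseconds.
(defn slow-call [ms value]
  (Thread/sleep ms)
  value)

(defn with-retry
  "Runs f on a real thread, allowing timeout-ms per attempt and up to
  max-tries attempts. Returns a channel yielding f's result or :gave-up."
  [f timeout-ms max-tries]
  (a/go-loop [attempt 1]
    (let [work     (a/thread (f))
          ;; alts! parks until either the work or the timeout completes.
          [v port] (a/alts! [work (a/timeout timeout-ms)])]
      (cond
        (= port work)         v
        (< attempt max-tries) (recur (inc attempt))
        :else                 :gave-up))))
```

Note that an attempt that times out is only abandoned, not cancelled: its thread keeps running until `f` returns. A real system would need to decide whether that is acceptable.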
every pr that adds core.async or maybe just every pr
haha
Perhaps future is all I need for now 🙂 keep it simple 🙂
yeah, core.async, in my view is all about communication between logical threads of execution, in your example you don't have any of that
there's also claypoole which has more flexibility than future but works in the same basic paradigm
It's definitely an area I need to understand a whole lot more
for personal stuff if I am playing around with flow control or a consensus algorithm, core.async ends up being a way to experiment with that stuff without having to re-invent how communication happens
for work we use core.async pretty heavily for our chat system, which lends itself to a sort of agent view of the world, lots of process loops exchanging messages
there are some cases where you might use something like pipeline-blocking from core.async without explicitly having a model of multiple communicating processes, but for the most part core.async is for communication between multiple things
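A sketch of the `pipeline-blocking` case mentioned above, assuming core.async 1.2 or later (for `to-chan!`). `fetch-detail` and `fetch-details` are hypothetical names for a blocking per-item call and its parallel wrapper.

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical blocking call, e.g. one HTTP request per id.
(defn fetch-detail [id]
  (Thread/sleep 20)
  {:id id})

(defn fetch-details
  "Applies fetch-detail to every id using up to 4 real threads at once.
  Results come back in input order."
  [ids]
  (let [in  (a/to-chan! ids) ; channel that yields each id, then closes
        out (a/chan)]
    ;; pipeline-blocking runs the transducer on dedicated threads and
    ;; closes `out` once `in` is exhausted.
    (a/pipeline-blocking 4 out (map fetch-detail) in)
    ;; a/into collects everything from `out` into a vector.
    (a/<!! (a/into [] out))))
```

Here the channels act purely as plumbing for bounded parallelism; there is no go block and no explicit model of communicating processes.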
thank you all! 🙂