When setting up my little machinery of threads running infinite loops and communicating over channels, I ran into the issue that the entire system lacks supervision. If an exception occurs in one thread, I need a way to signal to the thread which started the whole mechanism that it failed. Maybe even throw an exception. The only way I can think of to achieve this is to do something like:
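(require '[clojure.core.async :as async])

(defn <!?
  ;; Watch p for up to `timeout` ms; rethrow a Throwable if one arrives.
  [p timeout]
  (let [t (async/timeout timeout)]
    (async/alt!!
      ;; `err` (not `t`) so the binding does not shadow the timeout channel;
      ;; the instance? guard avoids (throw nil) when p is closed.
      p ([err] (if (instance? Throwable err) (throw err) err))
      t :probably-ok)))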
With which I can sample the process after a grace period in which I believe it should be okay.
Unlike multi-host distsys, you can prove whether a peer is making progress in core.async vs. failed / locked, if you design your communication patterns well. I'd love to see an authoritative guide to what those patterns should look like (though I can sort of work it out ad hoc currently)
you'd generally want to check the return value of all channel takes / puts, and a cascading shutdown (having a channel you can close to signal wrap up when your source closes) tends to work well and is supported by the built in functions
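A minimal sketch of the "check every take / put and cascade the shutdown" idea, assuming core.async is on the classpath. `start-stage` is a hypothetical helper name, not something from core.async: it stops when its input closes (a take returns nil) or when its output has been closed (a put returns false), and it always closes its output on the way out so the next stage sees the shutdown too.

```clojure
(require '[clojure.core.async :as async])

(defn start-stage
  "Hypothetical pipeline stage: applies f to each value taken from `in`
  on a real thread and puts the result on `out`. Closes `out` when it
  stops, so the shutdown cascades downstream."
  [f in out]
  (async/thread
    (try
      (loop []
        (let [v (async/<!! in)]
          ;; nil from a take means `in` is closed and drained
          (when (some? v)
            ;; false from a put means `out` is already closed
            (when (async/>!! out (f v))
              (recur)))))
      (finally
        ;; runs on normal exit and on exceptions alike
        (async/close! out)))))

;; usage: closing the source is enough to wind the stage down
(def in   (async/chan 10))
(def out  (async/chan 10))
(def done (start-stage inc in out))
(async/>!! in 1)
(async/>!! in 2)
(async/close! in)
(async/<!! done) ; returns once the stage's loop has exited
```

For plain forwarding without a transformation, `async/pipe` already closes the destination when the source closes by default, which is one of the built-ins that supports this pattern.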
that said, timeout based assertion / failure works across the network, so it will also work in one process - it's just not always the most elegant option
one jvm can't have a "netsplit" etc.
I'm familiar with cascading shutdowns but I'm still stumped regarding starting and monitoring. I've been so stumped by it that I took a break to read Armstrong's thesis (that and handling errors between processes in general)
Eventually I feel I'd implement a poor-man's-otp anyway, and otplike already exists, although it's its own idiom.
I think the Erlang people were onto something, though. Perhaps designing the Erlang behaviors in terms of protocols is the right solution
thinking much more widely about the issue, maybe brainstorming a bit, you could make blocks that toggle or increment/decrement an observable state when entered / exited. Ideally you could use try/finally for that but that doesn't play nicely with the core.async state machine transforms
@ben.sless honestly I don't think this is a solved problem in the JVM, not to mention in clojure, but if you are early in your design process you might want to also look at ztellman/manifold
Well, I found that I can register a watch on a shared ref, which I can pass to the running process and it can reset! it with a failure reason if it fails to start. I can even get it to throw an error in the starting thread. Another option is having an ok channel to signal a process started successfully which I can wait on.
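One way the watch-on-a-shared-ref idea could look, as a sketch in plain Clojure. Everything here is an assumption for illustration: the name `start-process`, the `init-fn` / `loop-fn` split, and the use of a promise to hand the first status change back to the starting thread. The process does `reset!` on the atom with either `:ok` or the Throwable, and the starter rethrows on failure.

```clojure
(defn start-process
  "Hypothetical starter: runs init-fn and then loop-fn on a new thread.
  Blocks the caller until the process reports :ok or a failure through
  the shared atom, or until timeout-ms elapses."
  [init-fn loop-fn timeout-ms]
  (let [status  (atom :starting)
        started (promise)]
    ;; The watch forwards the first status change to the starting thread.
    (add-watch status ::startup
               (fn [_ _ _ new-state] (deliver started new-state)))
    (doto (Thread. ^Runnable
                   (fn []
                     (try
                       (init-fn)
                       (reset! status :ok)
                       (loop-fn)
                       (catch Throwable t
                         ;; failure reason stays visible in the atom
                         (reset! status t)))))
      (.setDaemon true)
      (.start))
    (let [result (deref started timeout-ms ::timeout)]
      (remove-watch status ::startup)
      (cond
        (instance? Throwable result)
        (throw (ex-info "process failed to start" {} result))

        (= ::timeout result)
        (throw (ex-info "process did not report in time"
                        {:timeout-ms timeout-ms}))

        :else result))))
```

Calling `(start-process #(throw (RuntimeException. "boom")) (fn []) 1000)` throws in the calling thread with the original exception as the cause, while a successful start returns `:ok`.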
Regarding try/finally I'm running my processes on real threads not in go blocks, so I can knock myself out using those. I developed an intuition regarding go blocks (which may be incorrect) that I should keep them for conveyance/ioc, not for "work".
I never looked too deep into manifold, and from what I understand all of Zach's projects are practically abandoned now. Do you recommend it for how it handles errors? I liked promesa's take on it, manifold always looked like it did both queues and monadic delayed computations. I wonder if I should add another abstraction vs. just alt between success and error channels :thinking_face:
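For comparison, the "just alt between success and error channels" option can be sketched in a few lines, assuming core.async. `await-result` and the shape of its return map are made up for this example; the error value is put in the ex-data rather than used as the cause, so it works whether or not the process sends a Throwable.

```clojure
(require '[clojure.core.async :as async])

(defn await-result
  "Hypothetical helper: blocks until ok-ch or err-ch yields a value,
  or until timeout-ms elapses, whichever happens first."
  [ok-ch err-ch timeout-ms]
  (async/alt!!
    ok-ch  ([v] {:status :ok :value v})
    err-ch ([e] (throw (ex-info "process failed" {:error e})))
    (async/timeout timeout-ms) {:status :timeout}))
```

Note that a closed `ok-ch` shows up as `{:status :ok :value nil}` here, so a caller that closes channels on shutdown would need to treat a nil value separately.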
my idea with the toggle/increment was basically (try (swap! status update-in [:task-a] (fnil inc 0)) .... (finally (swap! status update-in [:task-a] dec))) - like a monitor but clojure flavored, and you can use this to see which parts of the system are running at what level
but that's just a brainstorm, totally untested
if one of them keeps going up, you have a lock, if one keeps going up and down but you aren't seeing results you have a logic error, etc.
a watcher on status could create those higher level metrics about churn rate / maximums etc.
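A rough sketch of the counter-plus-watcher brainstorm in plain Clojure, meant for code running on real threads where try/finally behaves normally. The names `tracked` and `peaks` are invented here, and `(fnil inc 0)` is an assumption added so the first increment of a missing key does not fail.

```clojure
;; task -> number of threads currently inside that task
(def status (atom {}))

;; task -> highest concurrency level seen so far
(def peaks (atom {}))

;; The watcher derives a higher-level metric (maximums) from raw counts.
(add-watch status ::peaks
           (fn [_ _ _ new-state]
             (swap! peaks #(merge-with max % new-state))))

(defn tracked
  "Hypothetical wrapper: increments the counter for `task` while the
  zero-argument function f runs, and decrements it on the way out,
  even if f throws."
  [task f]
  (swap! status update task (fnil inc 0))
  (try
    (f)
    (finally
      (swap! status update task dec))))

;; usage
(tracked :task-a (fn [] (Thread/sleep 10) :done))
```

A counter that only climbs suggests something is stuck inside that block; churn rate could be derived the same way by counting transitions in the watch function.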