core-async

Ben Sless 2020-08-07T15:37:58.108400Z

When setting up my little machinery with threads running infinite loops communicating over channels, I ran into the issue that the entire system lacks supervision. If an exception occurs in one thread, I need a way to signal the failure to the thread which started the whole mechanism. Maybe even throw an exception there. The only way I can think of to achieve this is to do something like:

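(require '[clojure.core.async :as async])

(defn <!?
  "Samples p for up to timeout ms; rethrows a Throwable taken from p."
  [p timeout]
  (let [t (async/timeout timeout)]
    (async/alt!!
      p ([v] (if (instance? Throwable v) (throw v) v)) ; nil means p was closed
      t :probably-ok)))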
With this I can sample the process after a grace period, after which I assume it is okay.
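
A minimal sketch of how the process channel sampled above might be produced, assuming the process runs on a real thread via `async/thread`. The names `start-process` and `sample` are hypothetical and not from the discussion. `async/thread` returns a channel that receives the body's result, so catching the Throwable and returning it turns a failure into a channel value, while a normal exit simply closes the channel.

```clojure
(require '[clojure.core.async :as async])

(defn start-process
  "Hypothetical helper: runs f on a real thread and returns a channel that
  yields the Throwable if f throws, or just closes when f returns normally."
  [f]
  (async/thread
    (try
      (f)
      nil                      ; normal exit: nothing to report, channel closes
      (catch Throwable t t)))) ; failure: the Throwable becomes the channel value

(defn sample
  "Waits up to grace-ms. Returns the Throwable if the process failed,
  :exited if it already finished, :probably-ok if it is still running."
  [p grace-ms]
  (let [t (async/timeout grace-ms)]
    (async/alt!!
      p ([v] (if (instance? Throwable v) v :exited))
      t :probably-ok)))
```

Returning the Throwable instead of throwing it leaves the decision to the caller, which can still `(throw v)` in the starting thread if that is what it wants.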

2020-08-07T18:14:21.110200Z

Unlike multi-host distsys you can prove whether a peer is making progress in core.async vs. failed / locked, if you design your communication patterns well. I'd love to see an authoritative guide to what those patterns should look like (though I can sort of work it out ad-hoc currently)

2020-08-07T18:15:45.111800Z

you'd generally want to check the return value of all channel takes / puts, and a cascading shutdown (having a channel you can close to signal wrap up when your source closes) tends to work well and is supported by the built in functions
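
A rough sketch of what checking return values and cascading shutdown could look like for a single stage. The `worker` function is a hypothetical name, and it assumes `f` never returns nil, since nil cannot be put on a channel. A take that returns nil means the source closed, and `>!!` returning false means the downstream channel is closed, so both ends give the loop a reason to stop.

```clojure
(require '[clojure.core.async :as async])

(defn worker
  "Hypothetical stage: applies f to each value from in and puts the result
  on out. Returns a channel that closes when the loop exits."
  [in out f]
  (async/thread
    (loop []
      (let [v (async/<!! in)]
        (if (nil? v)
          (async/close! out)           ; source closed: cascade downstream
          (when (async/>!! out (f v))  ; false means out is closed: stop
            (recur)))))))
```

Built-ins such as `async/pipe` follow the same convention: by default they close the destination channel when the source closes.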

2020-08-07T18:16:30.112600Z

that said, timeout based assertion / failure works across the network, so it will also work in one process - it's just not always the most elegant option

2020-08-07T18:16:54.112900Z

one jvm can't have a "netsplit" etc.

Ben Sless 2020-08-07T19:50:01.115600Z

I'm familiar with cascading shutdowns but I'm still stumped regarding starting and monitoring. I've been so stumped by it that I took a break to read Armstrong's thesis (for that, and for handling errors between processes in general). I feel I'd eventually implement a poor man's OTP anyway, and otplike already exists, although it's its own idiom. I think the Erlang people were onto something, though. Perhaps designing the Erlang behaviors in terms of protocols is the right solution.

2020-08-07T19:55:31.117300Z

thinking much more widely about the issue, maybe brainstorming a bit, you could make blocks that toggle or increment/decrement an observable state when entered / exited. Ideally you could use try/finally for that but that doesn't play nicely with the core.async state machine transforms

2020-08-07T19:56:25.118200Z

@ben.sless honestly I don't think this is a solved problem in the JVM, not to mention in clojure, but if you are early in your design process you might want to also look at ztellman/manifold

Ben Sless 2020-08-07T20:09:44.123500Z

Well, I found that I can register a watch on a shared ref, which I can pass to the running process, and it can reset! it with a failure reason if it fails to start. I can even get it to throw an error in the starting thread. Another option is having an ok channel to signal that a process started successfully, which I can wait on. Regarding try/finally, I'm running my processes on real threads, not in go blocks, so I can knock myself out using those. I developed an intuition regarding go blocks (which may be incorrect) that I should keep them for conveyance/ioc, not for "work". I never looked too deep into manifold, and from what I understand all of Zach's projects are practically abandoned now. Do you recommend it for how it handles errors? I liked promesa's take on it; manifold always looked like it did both queues and monadic delayed computations. I wonder if I should add another abstraction vs. just alt between success and error channels :thinking_face:
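
One way the "ok channel" idea might look, combined with alt-ing between success and error channels. This is only a sketch: `start!`, `init`, and `run-loop` are hypothetical names, and promise channels are an assumption made here so that the process thread never blocks when reporting its outcome.

```clojure
(require '[clojure.core.async :as async])

(defn start!
  "Hypothetical starter: runs (init) and then (run-loop state) on a real
  thread. Blocks the caller until init succeeds, fails, or timeout-ms elapses."
  [init run-loop timeout-ms]
  (let [ok  (async/promise-chan)
        err (async/promise-chan)
        t   (async/timeout timeout-ms)]
    (async/thread
      (try
        (let [state (init)]
          (async/>!! ok :started)   ; signal a successful start
          (run-loop state))
        (catch Throwable e
          (async/>!! err e)))       ; report the failure reason
      nil)
    (async/alt!!
      ok  :started
      err ([e] (throw (ex-info "process failed to start" {} e)))
      t   (throw (ex-info "process start timed out" {})))))
```

Because the exception is rethrown from `alt!!`, it surfaces in the starting thread, which is the behaviour asked for at the top of the thread. A failure after startup still lands on `err`, so a longer-lived version could return that channel for later monitoring.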

2020-08-07T20:14:11.125100Z

my idea with the toggle/increment was basically (try (swap! status update-in [:task-a] (fnil inc 0)) .... (finally (swap! status update-in [:task-a] dec))) - like a monitor but clojure flavored, and you can use this to see which parts of the system are running at what level

2020-08-07T20:14:20.125400Z

but that's just a brainstorm, totally untested

2020-08-07T20:14:57.126100Z

if one of them keeps going up, you have a lock, if one keeps going up and down but you aren't seeing results you have a logic error, etc.

2020-08-07T20:15:22.126600Z

a watcher on status could create those higher level metrics about churn rate / maximums etc. etc.
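
A small sketch of the counter-plus-watcher brainstorm, assuming the work runs on real threads where try/finally behaves normally. The names `status`, `peaks`, and `tracked` are illustrative only. The watcher derives one higher-level metric, the maximum concurrency seen per task.

```clojure
(def status (atom {}))  ; task key -> number of blocks currently running
(def peaks  (atom {}))  ; task key -> highest count observed so far

;; Derive a metric from every status change: per-task maximum.
(add-watch status ::peaks
  (fn [_ _ _ new-state]
    (swap! peaks #(merge-with max % new-state))))

(defn tracked
  "Hypothetical wrapper: increments the counter for task-key while f runs
  and decrements it again even if f throws."
  [task-key f]
  (swap! status update task-key (fnil inc 0))
  (try
    (f)
    (finally
      (swap! status update task-key dec))))
```

For example, `(tracked :task-a #(do-work))` with a hypothetical `do-work` leaves `(:task-a @status)` back at 0 afterwards, while `(:task-a @peaks)` keeps the high-water mark.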