Yup. Term buffer needs to be a small fraction of the amount of available shm space
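A rough way to sanity-check that sizing, as a sketch only: the three-terms-per-stream multiplier, the 50% threshold, and the function names `shm_budget` and `fits` are illustrative assumptions, not Onyx or Aeron settings.

```python
import os
import shutil


def shm_budget(term_buffer_bytes, streams, shm_total_bytes, terms_per_log=3):
    """Return (bytes needed, fraction of shm used) for the given streams.

    Assumption: each stream's log buffer holds `terms_per_log` term buffers.
    Metadata overhead is ignored, so this is a lower bound.
    """
    needed = term_buffer_bytes * terms_per_log * streams
    return needed, needed / shm_total_bytes


def fits(term_buffer_bytes, streams, shm_total_bytes, max_fraction=0.5):
    """True if the estimated usage stays under a hypothetical safety fraction."""
    _, fraction = shm_budget(term_buffer_bytes, streams, shm_total_bytes)
    return fraction <= max_fraction


if __name__ == "__main__":
    # /dev/shm is the usual shared-memory mount on Linux; skip if absent.
    if os.path.exists("/dev/shm"):
        total = shutil.disk_usage("/dev/shm").total
        needed, fraction = shm_budget(16 * 1024 * 1024, 4, total)
        print(f"need {needed} bytes, {fraction:.1%} of /dev/shm")
```

If the fraction comes out anywhere near 1.0, either shrink the term buffer or give the container more shm space.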
Fair enough @lucasbradstreet, @michaeldrogalis - it was worth a polite question at least 😉
@sreekanth thanks
(At least we know it CAN be done)
That’s true :)
There’s still Onyx! ;)
@dave.dixon I’m guessing the log gc dropped the job from the killed-jobs vector, so it was probably just misleading us
@dave.dixon I think the most likely situation is that there’s a kill-job log entry being emitted as a result of an exception (not one that I’ve seen a log message for, since those have hit the other code path thus far), or maybe there’s a bug in the log gc
@kenny I covered the whole Aeron term buffer/shm-size thing in a talk at ClojureX last year. Looks like the website is a bit poorly right now. https://skillsmatter.com/skillscasts/10939-how-i-bled-all-over-onyx
Hello, @jasonbell! You pointed out in this video that huge (i.e., MB-sized) messages are not good for Onyx. I’ve been thinking about processing images with Onyx. Is that not a good idea?
I don't see any problem with processing images. And remember that video is old now and things have moved on. I'd try it first and then make a call.
@lucasbradstreet Had the same thing, saved all the peer logs this time, let me know if there's something I should look for. I'll restart the cluster at some point and remove the job GC stuff.
Actually, this time, looking through the logs, it does appear that onyx recovered from a transient S3 DNS issue, after a flurry of exceptions. The shutdown was due to aeron timeout, so my health check isn't working right.