onyx

FYI: alternative Onyx :onyx: chat is at <https://gitter.im/onyx-platform/onyx> ; log can be found at <https://clojurians-log.clojureverse.org/onyx/index.html>
sparkofreason 2018-07-11T22:17:26.000238Z

I have a batch job driven by a custom input task which generates segments. Those segments are aggregated in time windows, the aggregated values are emitted, and those are further aggregated in a larger window. So, for example, the first aggregation uses one-hour windows and the next aggregates those results in a one-day window, ultimately feeding a dashboard. The last one or two hourly bins always seem to come up short, but only when running in the production environment, not locally. I've tried various triggers, including firing on every segment, and I'm not sure where else to look. Any suggestions appreciated.

lucasbradstreet 2018-07-11T22:19:10.000258Z

What kind of trigger are you using? With batch jobs it’s possible it’s not triggering on final job completion.

lucasbradstreet 2018-07-11T22:19:22.000117Z

Are you using trigger/emit or trigger/sync?

sparkofreason 2018-07-11T22:26:19.000126Z

I've tried everything I could think of, from my own hybrid segment-timer trigger, to timer, to triggering on every segment. Using trigger/emit. Batch size was 100, if that matters.

lucasbradstreet 2018-07-11T22:29:06.000021Z

I believe you’re hitting this https://github.com/onyx-platform/onyx/issues/779, assuming you’re getting a job-completed signal. I apologise if it’s missing from the trigger/emit docs. I’ll fix the docs if so. I really want to fix this but it will require an extra signalling phase to safely complete a job.

sparkofreason 2018-07-11T22:33:33.000021Z

Thanks, not sure about a job-completed signal per se, but I assume that's occurring since the job status shows as completed. Lifecycles to the rescue again :the_horns:

lucasbradstreet 2018-07-11T22:49:06.000001Z

Right. This might not apply in your case, but one option is to emit a sentinel record which triggers the final emit before the job finishes. If you’re using Kafka, the plugin supports emitting one when it hits end offsets that you supply.
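
[Illustration, not part of the original thread] A toy Python model of the sentinel idea described above. It is not Onyx code and does not reproduce the exact mechanics of issue 779; the function `hourly_totals`, the `emit_every` parameter, and the `end_of_input` key are all made up for this sketch. It only shows how a trigger that fires every N segments can leave the last bin short, and how a trailing sentinel segment forces one more emit.

```python
from collections import defaultdict

HOUR = 3600
# Hypothetical marker segment; the key name is an assumption for this sketch.
SENTINEL = {"end_of_input": True}


def hourly_totals(segments, emit_every=5):
    """Sum `value` per one-hour bin and return what downstream last saw.

    A snapshot of the window state is "emitted" after every `emit_every`
    data segments. A sentinel segment forces one extra, final emit.
    """
    state = defaultdict(int)  # bin start (seconds) -> running sum
    emitted = {}              # last snapshot visible downstream
    seen = 0
    for seg in segments:
        if seg.get("end_of_input"):
            emitted = dict(state)  # final flush triggered by the sentinel
            continue
        bin_start = seg["t"] // HOUR * HOUR
        state[bin_start] += seg["value"]
        seen += 1
        if seen % emit_every == 0:
            emitted = dict(state)  # periodic emit
    return emitted


# 12 segments, one every 20 minutes, so 3 per hourly bin over 4 hours.
data = [{"t": i * 1200, "value": 1} for i in range(12)]

without_sentinel = hourly_totals(data)
with_sentinel = hourly_totals(data + [SENTINEL])
```

In this toy run the last hourly bin reads 1 without the sentinel and 3 with it, because the final two segments arrive after the last periodic emit.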

sparkofreason 2018-07-11T22:53:20.000006Z

Input is my own, so I presume I emit the sentinel. Is that just :done?

lucasbradstreet 2018-07-11T22:56:07.000209Z

Yep. That’ll work, except it gets a little complicated if you end up partitioning the work somehow, e.g. with groups.

lucasbradstreet 2018-07-11T22:56:47.000013Z

It’d probably be best if I fixed the multiple phase completion

lucasbradstreet 2018-07-11T22:57:06.000027Z

If you could confirm that’s what’s going on it would help push me over the edge. It can get complicated to fix it otherwise

lucasbradstreet 2018-07-11T22:57:47.000222Z

Since you need to signal to each group that it’s finished and to flush

sparkofreason 2018-07-11T22:58:51.000194Z

No groups. Let me give it a shot.

lucasbradstreet 2018-07-11T22:59:05.000266Z

That should make things easier then

lucasbradstreet 2018-07-11T23:23:02.000083Z

Hmm. Did you change anything about the code that implements the window protocols that you’re using? That’s a weird one as the window record should never be nil

lucasbradstreet 2018-07-11T23:23:32.000106Z

Oh it might just be on the coerce

lucasbradstreet 2018-07-11T23:23:53.000219Z

Maybe it’s passing a nil value for the time unit in? I’m on my phone so it’s hard to check the code that’s calling it

lucasbradstreet 2018-07-11T23:47:58.000188Z

Ah. :done / sentinel support was removed, and :done is not a map, so the keyword is probably getting passed through as if it were a segment. If you return a valid segment (a map) instead and then check for that segment in the trigger, you should be good.
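
[Illustration, not part of the original thread] A small Python sketch of the difference between a bare marker and a map-shaped sentinel. Dicts stand in for Onyx segments; `is_sentinel`, `window_key`, and the `sentinel` / `event_time` keys are hypothetical names, and giving the sentinel a timestamp rests on the assumption that the window reads its key from every segment it sees.

```python
# Hypothetical map-shaped sentinel. It carries the same timestamp key as
# ordinary segments (an assumption) so window-key lookup still works.
SENTINEL = {"sentinel": True, "event_time": 0}


def is_sentinel(segment):
    """Stand-in for the check a trigger predicate could make."""
    return isinstance(segment, dict) and segment.get("sentinel") is True


def window_key(segment):
    """Stand-in for the windowing code reading a segment's timestamp."""
    return segment["event_time"]


ordinary = {"event_time": 5, "value": 1}

# A map-shaped sentinel passes through key lookup and is still recognisable.
assert window_key(SENTINEL) == 0
assert is_sentinel(SENTINEL)
assert not is_sentinel(ordinary)

# A bare marker is not a map, so key lookup on it fails.
try:
    window_key("done")
except TypeError as err:
    bare_marker_error = err
```

The point of the design is that the sentinel looks like any other segment to everything except the one predicate that checks for it.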

sparkofreason 2018-07-11T23:52:29.000110Z

Ok, thanks.