onyx

FYI: alternative Onyx :onyx: chat is at <https://gitter.im/onyx-platform/onyx> ; log can be found at <https://clojurians-log.clojureverse.org/onyx/index.html>
twashing 2017-12-23T00:08:55.000123Z

When firing up i) onyx, ii) with onyx-kafka, iii) on a single machine, onyx.log spits out this message.

17-12-22 23:39:14 6c4ec751622d INFO [onyx.peer.virtual-peer:17] - Starting Virtual Peer a8fad2b9-f0d9-e015-1c76-62ab461c076d
17-12-22 23:39:14 6c4ec751622d INFO [onyx.log.zookeeper:151] - Stopping ZooKeeper client connection
17-12-22 23:39:16 6c4ec751622d INFO [onyx.log.commands.submit-job:91] - Job ID 76d33ae1-79df-8d07-daac-1d556c3346a0 has been submitted with tenancy ID dev, but received no virtual peers to start its execution.
                     Tasks each require at least one peer to be started, and may require more if :onyx/n-peers or :onyx/min-peers is set.
                     If you were expecting your job to start now, either start more virtual peers or choose a different job scheduler, or wait for existing jobs to finish.

twashing 2017-12-23T00:09:48.000054Z

I have logs for the app, kafka and zk if needed.

twashing 2017-12-23T00:10:08.000191Z

… this is all in docker, btw.

lucasbradstreet 2017-12-23T00:10:10.000130Z

Seems like the job was either submitted on a different tenancy id, or the peers haven’t started up fully (possibly because they can’t connect to Aeron - it will warn you about this)

twashing 2017-12-23T00:13:04.000058Z

@lucasbradstreet This is my :peer-config, which is what I use when calling onyx.api/start-peer-group then onyx.api/start-peers, then onyx.api/submit-job.

{:onyx/tenancy-id "dev"
  :zookeeper/address "zookeeper:2181"
  :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
  :onyx.peer/zookeeper-timeout 5000
  :onyx.messaging/impl :aeron
  :onyx.messaging/bind-addr "0.0.0.0"
  :onyx.messaging/external-addr "0.0.0.0"
  :onyx.messaging/peer-port 40200
  :onyx.messaging.aeron/embedded-driver? true}

twashing 2017-12-23T00:13:21.000018Z

Do I need to wait before calling onyx.api/submit-job?

lucasbradstreet 2017-12-23T00:14:02.000006Z

Hmm, that looks fine. No, you don’t need to wait, but it may give that warning if the peers haven’t started up or you don’t have enough yet. Did you start enough peers for the job?

twashing 2017-12-23T00:15:06.000124Z

There’s one box, and I set the peer count to 1.

twashing 2017-12-23T00:15:14.000152Z

Lemme double check…

lucasbradstreet 2017-12-23T00:16:05.000136Z

Well, that’ll be it. It needs enough for all the tasks as described in the warning

twashing 2017-12-23T00:17:56.000063Z

So I need more than 1 machine, in other words?

lucasbradstreet 2017-12-23T00:18:55.000090Z

Nope, but the onyx.api/start-peers call needs to start enough to run the job.

lucasbradstreet 2017-12-23T00:19:07.000038Z

I mean virtual-peers not peers, sorry

twashing 2017-12-23T00:19:27.000015Z

np

twashing 2017-12-23T00:22:40.000027Z

So a peer-count of 1 is not enough then. B/c for the moment, I’m just calling onyx.api/submit-job once.

(let [{:keys [zookeeper-url] :as config} (read-config (io/resource config-file-string))
        peer-config (:peer-config config)
        peer-group (onyx.api/start-peer-group peer-config)
        peer-count 1
        v-peers (onyx.api/start-peers peer-count peer-group)]

    (for [[the-workflow the-lifecycles the-catalog] [[psc/workflow
                                                      (psc/lifecycles :kafka)
                                                      (psc/catalog zookeeper-url "scanner-command"
                                                                   peer-count :kafka)]]]

      (do (println "the-catalog: " the-catalog)
          (let [job {:workflow the-workflow
                     :catalog the-catalog
                     :lifecycles the-lifecycles
                     :task-scheduler :onyx.task-scheduler/balanced}
                {:keys [job-id task-ids] :as submitted-job} (onyx.api/submit-job peer-config job)]

            submitted-job))))

lucasbradstreet 2017-12-23T00:23:26.000009Z

You only need one submit job, but yes, you need a higher peer count there.

lucasbradstreet 2017-12-23T00:23:46.000028Z

You just need to add up max-peers/n-peers for all the tasks

twashing 2017-12-23T00:26:24.000014Z

Ah ha, so if my catalog has 4 “things” (with :onyx/max-peers) in it, then the peer count is 4.

lucasbradstreet 2017-12-23T00:27:31.000060Z

If one is min-peers 2, one is n-peers 1, and another max-peers 2, it will need 2 + 1 + 1

twashing 2017-12-23T00:27:49.000021Z

Right…
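
[Editor's note] The arithmetic above can be sketched in plain Clojure. This is only an illustration of the rule as stated in the log warning and in the 2 + 1 + 1 example: a task counts as its :onyx/n-peers or :onyx/min-peers if set, and as one peer otherwise (including when only :onyx/max-peers is set). The function names and the example catalog are hypothetical and are not part of the Onyx API.

```clojure
;; Hypothetical helpers, not part of Onyx: estimate the minimum number of
;; virtual peers a job needs before it can start.

(defn min-peers-for-task
  "Peers one catalog entry needs to start: :onyx/n-peers or :onyx/min-peers
  if present, otherwise 1 (a task with only :onyx/max-peers still needs one)."
  [task]
  (or (:onyx/n-peers task)
      (:onyx/min-peers task)
      1))

(defn min-peers-for-job
  "Sum of the per-task minimums across the whole catalog."
  [catalog]
  (reduce + (map min-peers-for-task catalog)))

;; Made-up catalog mirroring the example in the conversation.
(def example-catalog
  [{:onyx/name :read-commands :onyx/min-peers 2}
   {:onyx/name :process       :onyx/n-peers 1}
   {:onyx/name :write-results :onyx/max-peers 2}])

(min-peers-for-job example-catalog) ;; => 4, i.e. 2 + 1 + 1
```

The result is the smallest count that could be passed to onyx.api/start-peers for this job to be scheduled; starting more peers lets tasks with a higher :onyx/max-peers scale up.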

twashing 2017-12-23T01:13:19.000044Z

Hmm, using onyx / onyx-kafka, I’m not seeing any output from my workflow (onyx.logs here: https://pastebin.com/ktQrFDdJ).

twashing 2017-12-23T01:13:45.000076Z

I have the workflow and catalog if needed.

twashing 2017-12-23T01:17:08.000059Z

Besides setting :onyx.log/config what other ways are there to troubleshoot / inspect job execution?

lucasbradstreet 2017-12-23T01:18:53.000011Z

If you’re using 0.12 you can try out “onyx.api/job-state. Plays back the log for a given tenancy-id and job-id and returns the current state of the job.”

lucasbradstreet 2017-12-23T01:22:08.000080Z

Looks like it did start though. If you use onyx-peer-http-query you can also query /metrics and find out what it’s up to

twashing 2017-12-23T01:26:01.000011Z

Ok yeah, I don’t think onyx.api/job-state or onyx-peer-http-query have made it into the cheatsheet yet… Let me take a look at the source.

lucasbradstreet 2017-12-23T01:26:46.000036Z

Http query requires https://github.com/onyx-platform/onyx-peer-http-query

twashing 2017-12-23T01:27:08.000064Z

Ah ha

lucasbradstreet 2017-12-23T01:28:31.000050Z

Gotta run - Xmas travel. Good luck

lucasbradstreet 2017-12-23T01:29:14.000035Z

The thing I’d most be interested in is the epoch_Value metrics - those should be increasing over time.

twashing 2017-12-23T01:29:22.000071Z

Oooh, Merry Christmas :)

twashing 2017-12-23T01:30:06.000113Z

Hmm, yeah both approaches look interesting.

twashing 2017-12-23T01:30:18.000042Z

I’ll start with onyx.api/job-state and see how far I get. Thanks !

twashing 2017-12-23T02:29:09.000057Z

@lucasbradstreet Have a basic workflow happening with onyx and kafka.

twashing 2017-12-23T02:29:21.000104Z

Also working for multiple jobs and workflows. So 2 birds with 1 late night in. Many thanks :)

lucasbradstreet 2017-12-23T02:32:14.000030Z

Any idea what the problem was?

twashing 2017-12-23T03:21:54.000026Z

@lucasbradstreet After I got i) a correct peer-count, ii) I realized that my :kafka/key-deserializer-fn had the wrong arity.

twashing 2017-12-23T03:22:31.000038Z

I also upgraded to org.onyxplatform/onyx "0.12.0" and org.onyxplatform/onyx-kafka "0.12.0.0". But that’s probably tangential.

twashing 2017-12-23T03:28:16.000054Z

I figured out both errors while watching onyx.log. That was the breakthrough, after getting stuck watching console logs for zookeeper, kafka, and my app. I think what threw me off were those NoNode exceptions in zk. I spent a lot of time digging into that, which was a non-issue.

lucasbradstreet 2017-12-23T04:28:02.000049Z

Cool, thanks

schmee 2017-12-23T22:22:58.000091Z

I’m using the onyx-seq plugin to read a file from disk. how can I fire a trigger when I’ve reached the end of the seq? I want to do a batch computation over all the data and only emit once, when all the data is processed

schmee 2017-12-23T23:10:17.000018Z

figured out a solution: use a punctuation trigger with a pred checking if (= event-type :job-completed)
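
[Editor's note] A minimal sketch of the predicate described above. It assumes the punctuation predicate is called with two arguments, the trigger map and a state-event map carrying an :event-type key; that arity is an assumption and should be checked against the Onyx version in use. The window id :collect-all and the sync function ::write-result! are hypothetical placeholders.

```clojure
;; Predicate for a punctuation trigger: fire only once the job has completed.
;; Assumed signature: (fn [trigger state-event]).
(defn job-completed?
  [trigger state-event]
  (= :job-completed (:event-type state-event)))

;; Trigger entry referencing the predicate by namespaced keyword.
;; :collect-all and ::write-result! are placeholders for illustration only.
(def emit-on-completion-trigger
  {:trigger/window-id :collect-all
   :trigger/on        :onyx.triggers/punctuation
   :trigger/pred      ::job-completed?
   :trigger/sync      ::write-result!})
```

With this shape the predicate returns false for every other state event, so the window contents are only synced once, at job completion.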

schmee 2017-12-23T23:18:46.000012Z

Am I correct that these segments cannot be used by downstream tasks due to https://github.com/onyx-platform/onyx/issues/779 ?