@j0ni that’s odd. Is that purely with with-test-env, no external zookeeper?
hmm, doesn’t look like an incompatibility issue
I wonder if you’re starting up two of them, or getting some bad gc weirdness or something else like that. It’s a strange one to get from purely with-test-env + a job
@lucasbradstreet Onyx dashboard docker images are being pushed to https://hub.docker.com/r/onyx/onyx-dashboard/
But official docker repo seems to be this. https://hub.docker.com/u/onyxplatform/
is this intentional?
@lucasbradstreet I have an external zookeeper with a configuration tweaked accordingly. Config:
dev> (clojure.pprint/pprint (read-config (io/resource "config.edn") {:profile :local}))
{:env-config
{:onyx/tenancy-id "2",
:zookeeper/address "localhost:2181",
:zookeeper/server? false,
:zookeeper.server/port 2188,
:onyx.log/config {:level :debug}},
:peer-config
{:onyx.peer/storage.zk.insanely-allow-windowing? true,
:onyx/tenancy-id "2",
:onyx.messaging/allow-short-circuit? false,
:onyx.messaging/impl :aeron,
:onyx.log/config {:level :debug},
:onyx.messaging/peer-port 40200,
:zookeeper/address "localhost:2181",
:onyx.peer/job-scheduler :onyx.job-scheduler/balanced,
:onyx.messaging.aeron/embedded-driver? true,
:onyx.peer/zookeeper-timeout 60000,
:onyx.messaging/bind-addr "localhost"}}
=> nil
and my test code:
(defn run-test
"Requires (a) peer(s) to be up and running"
[]
(init-queue)
(let [{:keys [env-config peer-config]}
(aero.core/read-config (io/resource "config.edn") {:profile :local})
job (inspect (as/action-service-job {:onyx/batch-size 10
:onyx/batch-timeout 1000}))
{:keys [out]} (get-core-async-channels job)
in-segments (take 100 (test-data-gen))]
(with-test-env [alpha [5 env-config peer-config]]
(validate-enough-peers! (inspect alpha) job)
(let [job (inspect (api/submit-job (inspect peer-config) job))]
(assert (:success? job))
(send-data in-segments)
;; wait for the job to finish
(feedback-exception! peer-config (:job-id job))))
(assert (= (set (take-segments! out 100))
(set in-segments)))))
(`send-data` is code which pushes data into a sqs queue, which I've confirmed gets the data, though it is never removed by the onyx sqs plugin)
The code hangs in the call to feedback-exception!
and then starts throwing the BadVersion error after about a minute
at a high level that seems fine. I’m not sure how much data you’re sending through, but maybe reduce it to a single message and check whether it ends up processing it.
I had some progress actually - I'm now seeing data arriving at my function from the sqs input task - which is awesome. Now I'm trying to figure out how to make the job end after all the data is through
I suspect in the test that I used as a template for this, the closing of the input channel has that effect
I see the test you guys have in the sqs plugin seems to wait for some value (`:epoch`) to change and then kills the job
is that the best approach?