onyx

FYI: alternative Onyx :onyx: chat is at <https://gitter.im/onyx-platform/onyx> ; log can be found at <https://clojurians-log.clojureverse.org/onyx/index.html>
lucasbradstreet 2018-08-15T02:47:41.000100Z

That’s definitely my overall impression of onyx-sql

lucasbradstreet 2018-08-15T05:05:36.000100Z

onyx-kafka, onyx-seq, onyx-core-async (for testing), onyx-amazon-s3, onyx-amazon-sqs, and onyx-http are the most popular plugins overall

lucasbradstreet 2018-08-15T05:07:32.000100Z

Ranking between those is tricky

lucasbradstreet 2018-08-15T05:08:00.000100Z

onyx-kafka is definitely at the top, followed by sqs and s3 probably

2018-08-15T05:28:42.000100Z

@rustam.gilaztdinov not sure whether it fits in your architecture, but perhaps it's a better idea to use Kafka Connect or something like that to source data from PostgreSQL into Kafka, and then read from Kafka directly in Onyx

lucasbradstreet 2018-08-15T05:43:24.000100Z

What about https://debezium.io

lucasbradstreet 2018-08-15T05:43:57.000100Z

I’ve heard a lot of good things for CDC with debezium. Then onyx-kafka

lucasbradstreet 2018-08-15T05:44:43.000100Z

The CDC case is pretty big and hasn’t been paid enough attention to imo

2018-08-15T05:55:42.000100Z

nice, didn't know about debezium

2018-08-15T05:56:43.000100Z

that's pretty sweet

lucasbradstreet 2018-08-15T05:58:29.000100Z

The CDC direction to streaming is a pretty low risk way to start integrating systems into the streaming model

2018-08-15T06:01:20.000100Z

yes, and it makes a lot of sense -- i see that debezium uses actual database xlogs to capture changes

2018-08-15T06:01:38.000100Z

i still remember the insane old days of PostgreSQL + Slony

jasonbell 2018-08-15T12:08:17.000100Z

I should really dust off the twitter plugin work that I did.

rustam.gilaztdinov 2018-08-15T12:19:43.000100Z

this is very frustrating behavior from the sql plugin. I still haven’t solved it

2018-08-15T12:54:28.000100Z

@jasonbell actually i've brought it up to date until 0.12.7, i think it's compatible with 0.14 as well

2018-08-15T12:54:57.000100Z

@rustam.gilaztdinov you mean the NPE?

rustam.gilaztdinov 2018-08-20T13:54:11.000100Z

@lmergen hello, sorry for the delay, can we pls try to fix this? how can I find out the slot-id? the ranges look right, I guess -- [0 99] [100 199] [200 299]...

rustam.gilaztdinov 2018-08-20T13:56:17.000100Z

(take-nth n-peers (drop slot-id ...)) -- in my case, n-peers = 3, so, I choose every 3rd value in sequence

rustam.gilaztdinov 2018-08-15T12:55:12.000100Z

yep

2018-08-15T12:55:25.000100Z

do you know which line of code is triggering the NPE? that would help

rustam.gilaztdinov 2018-08-15T12:56:57.000100Z

Handling uncaught exception thrown inside task lifecycle :lifecycle/read-batch. Killing the job.


java.lang.Thread.run              Thread.java:  748
java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  624
 java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1149
                                               ...
                 clojure.core.async/thread-call/fn                async.clj:  441
 onyx.peer.task-lifecycle/start-task-lifecycle!/fn       task_lifecycle.clj: 1155
      onyx.peer.task-lifecycle/run-task-lifecycle!       task_lifecycle.clj:  550
    onyx.peer.task-lifecycle.TaskStateMachine/exec       task_lifecycle.clj: 1070
onyx.peer.task-lifecycle/wrap-lifecycle-metrics/fn       task_lifecycle.clj: 1097
      onyx.peer.task-lifecycle/build-read-batch/fn       task_lifecycle.clj:  651
             onyx.peer.read-batch/read-input-batch           read_batch.clj:   49
          onyx.peer.read-batch/read-input-batch/fn           read_batch.clj:   54
              onyx.plugin.sql.SqlPartitioner/poll!                  sql.clj:  114
                                clojure.core/first                 core.clj:   55
                                               ...
                          clojure.core/take-nth/fn                 core.clj: 4271
                                  clojure.core/seq                 core.clj:  137
                                               ...
                              clojure.core/drop/fn                 core.clj: 2924
                            clojure.core/drop/step                 core.clj: 2921

2018-08-15T12:58:34.000100Z

great, that helps, let me see

2018-08-15T12:59:53.000100Z

ohhh, i see what's going on already

jasonbell 2018-08-15T13:02:24.000100Z

Excellent news @lmergen thanks for letting me know

2018-08-15T13:07:42.000100Z

https://github.com/solatis/onyx-sql/tree/master i've just pushed a fix for this issue there, could you try it out ?

rustam.gilaztdinov 2018-08-15T13:17:59.000100Z

no, that didn’t help :(

2018-08-15T13:20:55.000100Z

are you sure you're using the correct version? is the error still exactly the same ?

rustam.gilaztdinov 2018-08-15T13:23:40.000100Z

yep, I’m sure, and the error is the same

rustam.gilaztdinov 2018-08-15T13:24:29.000100Z

this is weird, lein clean doesn’t help

rustam.gilaztdinov 2018-08-15T13:26:16.000100Z

error on onyx.plugin.sql.SqlPartitioner/poll! sql.clj: 115

2018-08-15T13:57:18.000100Z

ok, it could be that partition-table returns nil

2018-08-15T14:01:19.000100Z

could you try to change the if-let at line 115 to this:

(if-let [part (and rst
                   (first @rst))]

rustam.gilaztdinov 2018-08-15T14:14:24.000100Z

no, doesn’t help 😞

2018-08-15T14:23:12.000100Z

this makes no sense at all

rustam.gilaztdinov 2018-08-15T14:28:17.000100Z

:thisisfine:

2018-08-15T14:29:38.000100Z

wait! :thinking_face:

2018-08-15T14:29:52.000100Z

could it be that there is a sequence with lazy side-effects here

2018-08-15T14:29:59.000100Z

which is triggered by the first...
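
A minimal sketch of the laziness hypothesis above, not the plugin's actual code: `ranges`, `rst`, `guard-passed?` and `outcome` are hypothetical stand-ins, and a nil count is only one assumed way to make `drop` fail. It illustrates why an `(and rst ...)` guard would not help if the failure happens while the lazy sequence is being realized by `first`.

```clojure
;; Hypothetical stand-ins for the plugin's state; names are illustrative only.
(def ranges (map vector [0 100 200] [99 199 299])) ; lazy seq of [low high] pairs

;; nil stands in for a missing slot-id (an assumption, not a confirmed cause).
(def rst (atom (take-nth 3 (drop nil ranges))))

;; Building the sequence and checking it for nil both succeed,
;; because nothing has been realized yet.
(def guard-passed? (some? (and rst @rst)))

;; Only asking for the first element runs drop's step function,
;; which is where (pos? nil) throws.
(def outcome
  (try
    (first @rst)
    (catch NullPointerException _ :npe)))
```

With the Clojure core of that era, `guard-passed?` should be true and `outcome` should be `:npe`, which matches a trace that ends in `clojure.core/drop/step` underneath `first`.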

2018-08-15T14:30:01.000100Z

hmm

2018-08-15T14:31:45.000100Z

ok i don't have the time to debug this issue atm, but i'm fairly sure the issue is with the drop function inside the partition-table function

rustam.gilaztdinov 2018-08-15T14:33:07.000100Z

👀

rustam.gilaztdinov 2018-08-15T14:33:13.000100Z

i will check it

2018-08-15T14:47:20.000100Z

there's probably an incorrect slot-id, or an empty ranges, or something like that

2018-08-15T14:48:24.000100Z

what happens at that function is that onyx takes your total input table, and divides it over the number of peers you have assigned to the input task

2018-08-15T14:48:33.000100Z

that way, you get automatic partitioning

2018-08-15T14:49:20.000100Z

it does this by creating N partitions, and then dropping the first M partitions (where M == our own peer id)
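
The partitioning described above can be sketched roughly as follows. This is an illustration built around the `(take-nth n-peers (drop slot-id ...))` expression quoted in this log, not the plugin's source: `make-ranges` and `ranges-for-slot` are hypothetical helpers, and the table bounds and range size are made-up values.

```clojure
(defn make-ranges
  "Split ids low..high (inclusive) into [start end] pairs of at most `size` ids.
   Hypothetical helper; the plugin may compute its ranges differently."
  [low high size]
  (for [start (range low (inc high) size)]
    [start (min high (+ start size -1))]))

(defn ranges-for-slot
  "Ranges that the peer with the given slot-id would read:
   skip slot-id ranges, then take every n-peers-th one."
  [ranges n-peers slot-id]
  (take-nth n-peers (drop slot-id ranges)))

(def ranges (make-ranges 0 599 100))
;; ranges is ([0 99] [100 199] [200 299] [300 399] [400 499] [500 599])

(ranges-for-slot ranges 3 0) ;; => ([0 99] [300 399])
(ranges-for-slot ranges 3 1) ;; => ([100 199] [400 499])
(ranges-for-slot ranges 3 2) ;; => ([200 299] [500 599])
```

With three peers, each slot ends up with a disjoint third of the ranges, and together the slots cover the whole table.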

2018-08-15T14:53:47.000100Z

there is probably something off in one of those things

2018-08-15T14:54:20.000100Z

if you could identify what the inputs of that function are, specifically slot-id and the ranges, that would be very useful
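
One way to capture those inputs is a small check that can be called with whatever values reach the partitioning code. This is a sketch under assumptions: `check-partition-inputs` is a hypothetical helper, the map keys are illustrative, and the rules (non-negative integer `slot-id`, positive `n-peers`, non-empty `ranges`) are guesses at what the function expects rather than documented requirements.

```clojure
(defn check-partition-inputs
  "Return a vector of problem descriptions for the given inputs.
   An empty vector means nothing suspicious was found."
  [{:keys [slot-id n-peers ranges]}]
  (cond-> []
    (not (nat-int? slot-id))
    (conj (str "slot-id should be a non-negative integer, got " (pr-str slot-id)))

    (not (pos-int? n-peers))
    (conj (str "n-peers should be a positive integer, got " (pr-str n-peers)))

    (empty? ranges)
    (conj "ranges is nil or empty")

    (and (nat-int? slot-id) (pos-int? n-peers) (>= slot-id n-peers))
    (conj (str "slot-id " slot-id " is not below n-peers " n-peers))))

;; Example calls with made-up values:
(check-partition-inputs {:slot-id 0 :n-peers 3 :ranges [[0 99] [100 199]]})
(check-partition-inputs {:slot-id nil :n-peers 3 :ranges [[0 99] [100 199]]})
```

Printing the result of such a check just before the `take-nth` call would show whether the slot-id or the ranges are the odd one out.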