onyx

FYI: alternative Onyx :onyx: chat is at <https://gitter.im/onyx-platform/onyx> ; log can be found at <https://clojurians-log.clojureverse.org/onyx/index.html>
halcyon 2017-11-30T03:54:37.000052Z

I'm attempting to use the results of a JDBC query as an input for my workflow. What is the recommended way of accomplishing that? The onyx-sql plugin looks like it would require massive amounts of unnecessary data to come in over the wire, and I lose my connection to ZooKeeper when using the onyx-seq plugin if I don't severely limit the results from my query.

michaeldrogalis 2017-11-30T04:31:38.000198Z

@halcyon Are you looking to execute a single query and pump those results through the workflow once? Or to query repeatedly and continuously feed the results in?

michaeldrogalis 2017-11-30T04:32:21.000058Z

With respect to losing your ZK connection, you probably need to switch your checkpoint storage to :s3 instead of :zookeeper and you'll be fine

halcyon 2017-11-30T04:45:21.000004Z

single query and pump those results through the workflow once

michaeldrogalis 2017-11-30T04:58:17.000133Z

Probably onyx-seq or onyx-kafka. If the results fit in memory use onyx-seq, if not spool them into a Kafka topic. Either way definitely make sure checkpointing is using S3

michaeldrogalis 2017-11-30T04:58:26.000174Z

ZK is only for development

halcyon 2017-11-30T04:59:31.000094Z

thank you!

michaeldrogalis 2017-11-30T05:05:39.000065Z

Anytime! Let us know if you hit any trouble.

halcyon 2017-11-30T05:13:04.000138Z

After switching to S3 for checkpointing, will onyx still require ZK for anything else?

michaeldrogalis 2017-11-30T05:24:41.000180Z

Yeah, it will for coordination

michaeldrogalis 2017-11-30T05:24:56.000073Z

It just won't be storing large amounts of data inside of it, which is why you were hitting those problems
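
For reference, a minimal sketch of a peer config that follows the advice above: ZooKeeper stays in place for coordination, while checkpoints go to S3. The `:onyx.peer/storage*` keys are the ones I recall from the Onyx peer-config documentation of this era, so check them against your Onyx version. The tenancy id, addresses, bucket name, and region are placeholder values, not anything from this conversation.

```clojure
;; Sketch only: every value below is a placeholder assumption.
(def peer-config
  {:onyx/tenancy-id "my-tenancy"                      ; hypothetical tenancy id
   :zookeeper/address "127.0.0.1:2181"                ; ZK is still needed for coordination
   :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
   :onyx.messaging/impl :aeron
   :onyx.messaging/bind-addr "localhost"
   :onyx.messaging/peer-port 40200
   ;; Store checkpoints in S3 instead of ZooKeeper.
   :onyx.peer/storage :s3
   :onyx.peer/storage.s3.bucket "my-checkpoint-bucket" ; hypothetical bucket
   :onyx.peer/storage.s3.region "us-east-1"})          ; hypothetical region
```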

michaeldrogalis 2017-11-30T05:24:59.000132Z

Off for the night now. 🙂

halcyon 2017-11-30T05:31:13.000153Z

Got it, thanks again - good night!

eelke 2017-11-30T10:08:25.000307Z

Hey, I was wondering if it is a possibility for the onyx-kafka plugin to have the number of peers per task scale with cluster size up to :n-partitions. Currently it is required that :onyx/min-peers must equal :onyx/max-peers, or :onyx/n-peers must be set, and :onyx/min-peers and :onyx/max-peers must not be set. Most desired functionality is that the n-peers scale automatically up and down with increasing and decreasing amounts of instances. Not sure if that is feasible, but I am thinking it will require a complete reconfiguration of the assignment of partitions per peer with each scale action. If it is possible and fits in the spec I am willing to help out developing it.
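
To illustrate the constraint described above, here is a rough sketch of an onyx-kafka input catalog entry with a fixed peer count. The key names are as I remember them from the onyx-kafka README around this time and may differ in other versions. The task name, topic, ZooKeeper address, and deserializer function are hypothetical.

```clojure
;; Sketch only: names and values are illustrative assumptions.
(def read-messages-entry
  {:onyx/name :read-messages
   :onyx/plugin :onyx.plugin.kafka/read-messages
   :onyx/type :input
   :onyx/medium :kafka
   :kafka/topic "my-topic"                         ; hypothetical topic
   :kafka/zookeeper "127.0.0.1:2181"               ; hypothetical address
   :kafka/offset-reset :earliest
   :kafka/deserializer-fn :my.ns/deserialize-message ; hypothetical function
   ;; Fixed peer count: set :onyx/n-peers (at most the number of partitions),
   ;; or set :onyx/min-peers and :onyx/max-peers to the same value instead.
   :onyx/n-peers 4
   :onyx/batch-size 100
   :onyx/doc "Reads messages from a Kafka topic"})
```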

jasonbell 2017-11-30T10:12:31.000144Z

That would be a nice addition @eelke I’ve always had to bring the running job down, reconfigure and bring it back up again.

eelke 2017-11-30T10:19:27.000067Z

Yeah, I think it is nice to have if you want to use autoscaling. Our use case is that the throughput range is quite big, so at moments of low throughput we would like to have fewer instances, and more at moments of high throughput.

jasonbell 2017-11-30T10:36:33.000235Z

agreed

michaeldrogalis 2017-11-30T15:58:21.000393Z

@eelke Onyx would need to support repartitioning state. Each peer is set up with some checkpoint state about each partition's offset. Changing the way partitions and peers are reassigned means that we'd need a better way to use resume points which translate between the before/after configuration.

michaeldrogalis 2017-11-30T15:58:30.000354Z

It's completely possible, but not on our short term roadmap.

eriktjacobsen 2017-11-30T20:06:17.000206Z

Are there any open source projects using Onyx in production (rather than reference implementations) that are known for following best practices and having a clean structure? Would love to read through some codebases to see how they've structured Onyx, even better if they've been vetted by Distributed Masonry.

lucasbradstreet 2017-11-30T22:51:01.000077Z

It wouldn’t be so hard to repartition the kafka offset state, so it’s certainly more possible than general rescaling of window state.
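
As a rough illustration of what repartitioning Kafka offset state could look like, the sketch below flattens per-peer offsets into a single partition-to-offset map and redistributes it across a new number of peers. This is not how Onyx stores or resumes checkpoints. The data shape (peer slot to `{partition offset}`), the function name, and the modulo assignment are all assumptions made for the example.

```clojure
;; Hypothetical helper, not part of Onyx or onyx-kafka.
(defn repartition-offsets
  "Takes a map of peer slot -> {partition offset} and returns the same
  offsets regrouped for new-n-peers, assigning each partition to the
  slot (mod partition new-n-peers)."
  [slot->offsets new-n-peers]
  (let [;; Offsets belong to partitions, not peers, so flatten them first.
        partition->offset (apply merge (vals slot->offsets))]
    (reduce-kv (fn [acc partition offset]
                 (assoc-in acc [(mod partition new-n-peers) partition] offset))
               {}
               partition->offset)))

;; Example: scale a 4-partition topic from 2 peers to 4 peers.
(def old-state {0 {0 100, 2 50}
                1 {1 70, 3 20}})

(def new-state (repartition-offsets old-state 4))
```

Because Kafka offsets are keyed by partition, nothing has to be split or merged per record. That is different from window state, where the contents themselves would need to be regrouped.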

eelke 2017-12-04T09:02:59.000398Z

Ok sounds good. Let's do it 😉